Not completed in overlap_collect step #425

@kjyunm

Hello,
I am trying to build a pangenome from pig assemblies of multiple breeds. The input data is 4.5 GB when compressed, and I am using the Docker image ghcr.io/pangenome/pggb:latest.

I ran the following command:
docker run -it -v ${PWD}:/data -u $(id -u):$(id -g) ghcr.io/pangenome/pggb:latest pggb -i /data/pigs.assembly.fa.gz -o /data/out -t 20

However, the run gets stuck at the following step:
[seqwish::transclosure] 25082.955 81.34% 12280027975-12290027975 overlap_collect
It has not progressed for over a week.

Could you please advise whether any additional options or preprocessing steps are needed? I would appreciate any suggestions on what causes this hang and how to resolve it.

Thank you!

AndreaGuarracino commented on Nov 16, 2024

Member

@kjyunm Maybe you caught a seqwish bug? You could try increasing -k/--min-match-len in pggb (23 by default) to make seqwish's life a bit easier.
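
A minimal sketch of that suggestion, reusing the command from the original post with -k appended. The value 50 is only an illustrative assumption, and the K and DRY_RUN variables are hypothetical helpers introduced for this example. By default the script only prints the command; set DRY_RUN=0 to actually run it.

```shell
# Sketch: rerun pggb with a larger minimum match length (-k).
# K=50 is an assumed example value; the default mentioned above is 23.
K="${K:-50}"

# Build the command as positional parameters so quoting is preserved.
set -- docker run -it -v "${PWD}:/data" -u "$(id -u):$(id -g)" \
  ghcr.io/pangenome/pggb:latest \
  pggb -i /data/pigs.assembly.fa.gz -o /data/out -t 20 -k "$K"

if [ "${DRY_RUN:-1}" = "1" ]; then
  # Dry run (default): print the command instead of executing it.
  echo "$@"
else
  "$@"
fi
```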

kjyunm commented on Nov 20, 2024

Author

Thank you!
I adjusted the -k parameter, and the run progressed to the next step. However, after seqwish finished the graph construction, it failed with the following error in the log:

[seqwish::gfa] 31934.154 writing graph
Command terminated by signal 8
seqwish -s /data/pigs.assembly_only.fa.gz -p /data/out/pigs.assembly_only.fa.gz.bf3285f.alignments.wfmash.paf -k 50 -f 0 -g /data/out/pigs.assembly_only.fa.gz.bf3285f.e81ee67.seqwish.gfa -B 10M -t 20 --temp-dir /data/out -P
250262.86s user 18723.53s system 840% cpu 31984.70s total 43804008Kb max memory

I would like to know why I am getting "Command terminated by signal 8" and whether there is a way to fix the problem.
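
For context, signal 8 on Linux is SIGFPE, an arithmetic exception such as an integer division by zero. That points toward an arithmetic fault inside seqwish rather than, for example, the out-of-memory killer, which sends signal 9 (SIGKILL). This is only a reading of the signal number, not a confirmed diagnosis. The mapping can be checked in a shell:

```shell
# Print the name of signal number 8 (FPE on Linux, i.e. SIGFPE).
kill -l 8
```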

Not completed in overlap_collect step · Issue #425 · pangenome/pggb