Example command lines
Hannes Hauswedell edited this page Mar 28, 2016
- Download the pre-formatted UniprotSprot database from Pre-built-Database-Indexes and unpack it.
- Select your query file or take this example.
- Run
bin/lambda -q /path/to/1k_long_reads.fna -d /path/to/uniprot_sprot.fasta.gz
You will see something like this:
LAMBDA - the Local Aligner for Massive Biological DatA
======================================================
Version 0.9.4
Loading Subj Sequences... done.
Loading Subj Ids... done.
Loading Database Index... done.
Loading Database Masking file... done.
Loading Query Sequences and Ids...translating... done.
Searching and extending hits on-line...progress:
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
|....:....:....:....:....:....:....:....:....:....|
Number of valid hits: 77763
Number of Queries with at least one valid hit: 709
Since you did not specify an output file, the default output file name and format were used: output.m8.
Browse the output file with your editor of choice, or with `less` on the command line:
% less output.m8
SHAA001TF sp|Q00593|ALKJ_PSEOL 31.03 290 191 7 1 849 84 371 2e-34 147
SHAA001TF sp|Q9WWW2|ALKJ_PSEPU 31.03 290 191 6 1 849 84 371 1e-32 141
SHAA001TF sp|Q8BJ64|CHDH_MOUSE 24.28 276 205 3 1 819 132 406 1e-20 101
SHAA001TF sp|Q6UPE0|CHDH_RAT 24.01 279 202 3 1 819 135 409 5e-20 99.0
SHAA001TF sp|Q8NE62|CHDH_HUMAN 24.28 276 205 3 1 819 130 404 6e-19 95.5
SHAA002TR sp|A7MD48|SRRM4_HUMAN 27.78 216 144 5 25 657 403 611 1e-06 54.7
SHAA002TR sp|O13547|CCW14_YEAST 28.95 114 76 1 382 723 105 213 0.070 38.9
SHAA004TF sp|P0A916|OMPW_SHIFL 50.43 115 49 2 389 733 1 107 8e-23 108
SHAA004TF sp|P0A915|OMPW_ECOLI 50.43 115 49 2 389 733 1 107 8e-23 108
SHAA004TF sp|P17266|OMPW_VIBCH 52.21 113 45 2 410 733 4 112 2e-22 107
SHAA004TF sp|Q8ZP50|OMPW_SALTY 50.43 115 49 2 389 733 1 107 7e-21 102
SHAA004TF sp|Q8Z7E2|OMPW_SALTI 50.43 115 49 2 389 733 1 107 7e-21 102
[...]
NOTE: Because Lambda uses multiple threads by default, the order of query sequences in the output is not guaranteed to be the same between runs. However, the matches of one query sequence always appear en bloc and are sorted by e-value.
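Because the matches of each query are grouped and sorted by e-value, the first line of each group is that query's best hit. The following is a minimal sketch of how this could be used, assuming the file follows the standard 12-column BLAST tabular layout (query ID, subject ID, % identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score). The temporary sample file is a hypothetical stand-in that holds the twelve hits shown above; in practice you would point `awk` at `output.m8`.

```shell
# Hypothetical sample input: the twelve hits from the listing above.
m8=$(mktemp)
cat > "$m8" <<'EOF'
SHAA001TF sp|Q00593|ALKJ_PSEOL 31.03 290 191 7 1 849 84 371 2e-34 147
SHAA001TF sp|Q9WWW2|ALKJ_PSEPU 31.03 290 191 6 1 849 84 371 1e-32 141
SHAA001TF sp|Q8BJ64|CHDH_MOUSE 24.28 276 205 3 1 819 132 406 1e-20 101
SHAA001TF sp|Q6UPE0|CHDH_RAT 24.01 279 202 3 1 819 135 409 5e-20 99.0
SHAA001TF sp|Q8NE62|CHDH_HUMAN 24.28 276 205 3 1 819 130 404 6e-19 95.5
SHAA002TR sp|A7MD48|SRRM4_HUMAN 27.78 216 144 5 25 657 403 611 1e-06 54.7
SHAA002TR sp|O13547|CCW14_YEAST 28.95 114 76 1 382 723 105 213 0.070 38.9
SHAA004TF sp|P0A916|OMPW_SHIFL 50.43 115 49 2 389 733 1 107 8e-23 108
SHAA004TF sp|P0A915|OMPW_ECOLI 50.43 115 49 2 389 733 1 107 8e-23 108
SHAA004TF sp|P17266|OMPW_VIBCH 52.21 113 45 2 410 733 4 112 2e-22 107
SHAA004TF sp|Q8ZP50|OMPW_SALTY 50.43 115 49 2 389 733 1 107 7e-21 102
SHAA004TF sp|Q8Z7E2|OMPW_SALTI 50.43 115 49 2 389 733 1 107 7e-21 102
EOF

# Keep only the first line seen for each query ID (column 1).
# Since each query's block is sorted by e-value, this is its best hit.
best_hits=$(awk '!seen[$1]++ { print $1, $2, $11 }' "$m8")

# Number of distinct queries with at least one hit.
n_queries=$(printf '%s\n' "$best_hits" | wc -l | tr -d ' ')

echo "queries with hits: $n_queries"
echo "$best_hits"
rm -f "$m8"
```

The `awk` default field splitting works for both tabs and spaces, so the same command should apply whether the file is tab-separated or copied from the listing above.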
Follow the instructions above, but choose SAM (`.sam`) as the output format and use an e-value cutoff of 1e-4.
How would the command line look?
`bin/lambda -q /path/to/1k_long_reads.fna -d /path/to/uniprot_sprot.fasta.gz -o output.sam -e 1e-4`
The program will now print:
LAMBDA - the Local Aligner for Massive Biological DatA
======================================================
Version 0.9.4
Loading Subj Sequences... done.
Loading Subj Ids... done.
Loading Database Index... done.
Loading Database Masking file... done.
Loading Query Sequences and Ids...translating... done.
Searching and extending hits on-line...progress:
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
|....:....:....:....:....:....:....:....:....:....|
Number of valid hits: 73294
Number of Queries with at least one valid hit: 666
As you can see, the number of hits has been reduced slightly due to the more stringent cutoff.
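To get a feeling for which hits such a cutoff removes, an existing tabular output file can be filtered on its e-value column after the fact. This is only a rough sketch under the assumption that column 11 holds the e-value (standard BLAST tabular layout); it does not claim to reproduce how Lambda applies `-e` internally. The sample input is a hypothetical four-line excerpt taken from the first run's listing, and `max_e` is an illustrative variable name.

```shell
# Hypothetical sample input: four hits from the first run's output.
m8=$(mktemp)
cat > "$m8" <<'EOF'
SHAA002TR sp|A7MD48|SRRM4_HUMAN 27.78 216 144 5 25 657 403 611 1e-06 54.7
SHAA002TR sp|O13547|CCW14_YEAST 28.95 114 76 1 382 723 105 213 0.070 38.9
SHAA004TF sp|P0A916|OMPW_SHIFL 50.43 115 49 2 389 733 1 107 8e-23 108
SHAA004TF sp|P0A915|OMPW_ECOLI 50.43 115 49 2 389 733 1 107 8e-23 108
EOF

# Count all hits, then only those whose e-value is at most the cutoff.
# Adding 0 forces awk to compare the values as numbers, not as strings.
total=$(awk 'END { print NR }' "$m8")
kept=$(awk -v max_e=1e-4 '($11 + 0) <= (max_e + 0)' "$m8" | wc -l | tr -d ' ')

# Subject IDs of the hits that the cutoff would drop.
dropped=$(awk -v max_e=1e-4 '($11 + 0) > (max_e + 0) { print $2 }' "$m8")

echo "hits: $total, hits with e-value <= 1e-4: $kept"
echo "dropped: $dropped"
rm -f "$m8"
```

In this excerpt only the hit with e-value 0.070 falls above the cutoff, so one of the four hits is dropped.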
View the output again to verify that it is beautiful SAM:
@HD VN:1.4 GO:query
@PG ID:lambda PN:lambda VN:0.9.4 CL:bin/lambda -d /home/h4nn3s/sequences/uniprot_sprot.fasta.gz -q /home/h4nn3s/sequences/1k_long_reads.fasta -o output.sam -e 1e-4
@CO Lambda is a high performance BLAST compatible local aligner, please see http://seqan.de/lambda for more information.
@CO SAM/BAM dialect documentation is available here: https://github.com/seqan/lambda/wiki/Output-Formats
@CO If you use any results found by Lambda, please cite Hauswedell et al. (2014) doi: 10.1093/bioinformatics/btu439
@CO Optional tags as follow ZE:expect value AS:bit score ZI:% identity (in protein space unless BLASTN) ZF:query frame NM:edit distance (in protein space unless BLASTN)
SHAA001TF 0 sp|Q00593|ALKJ_PSEOL 84 255 63M6D57M6D60M3I270M3D42M3D153M3D171M3I27M35H * 0 0 GGCTCTGGCTCAATAAATGCAATGGTCTATGCAAGAGGATTAGAAACAGATTATGAGAATTGGGGCACCAATAAGGAATGGAGTTTTGAAAATATAAAAAAAATATACAGATCTATGGAGCAACAAATAAATGATGATAAAGAATTTCTTACAAAAGAAAAGATTCCAGTAAATAATGTAAGTAAGCATCATCATCCAATTTTAGAATATTTTTTTAATGCTAGTAATGAAATTGGTATTAAAAAAAATACAAATTTAACTACATCAATCGAAAATCAAGTAGGTCATTATAATATTAATACTTACAACGGTACTAGACATTCATCATCAAAAGTATTTTTAAAGCCTGTATTAAAAAATCCTAGGCTAACTATACTAGACAATACTCAAGTTAAAAACTTAATAATTAAAGATAAAAAGATTACTGGTATAAAAATTCAAAATAAATCAATAGAACAAATCATACACTTATCTCATGGAGCAATTTTATGTTCAGGTTCTATAATGACACCTTATTTGTTAATGCATTCTGGTATTGGTGATAAAGAGCATTTAAAACAATTTGATAAAGAAATAATAATTGATAATACTAATGTTGGAAGAAATCTTCAGGATCATCTTGGGTTAGATTATTTATTTAAAACTCATCATCATTCACTAAATAAATCACTAGGAACTTGGCCAGGTAGAATTACTTCTGTATTAAAATATATTTATAATAGGAAAGGACCATTATCACTTAGTATTAATCAGTCTGGGGGATATGTAAATTGGAATTCAAAACATCATTATCCCAACTTACAGATATATTTCAATCCGTTGACTTATTCTATCACTCATAAAAATAAA * ZE:f:2.29386e-34 AS:i:146 ZI:i:31 ZF:i:1 NM:i:200
SHAA001TF 256 sp|Q9WWW2|ALKJ_PSEPU 84 255 63M6D87M6D30M3I288M6D186M3D162M3I27M35H * 0 0 * * ZE:f:1.25869e-32 AS:i:140 ZI:i:31 ZF:i:1 NM:i:200
SHAA001TF 256 sp|Q8BJ64|CHDH_MOUSE 132 255 72M6D555M3I72M3D117M65H * 0 0 GGCTCTGGCTCAATAAATGCAATGGTCTATGCAAGAGGATTAGAAACAGATTATGAGAATTGGGGCACCAATAAGGAATGGAGTTTTGAAAATATAAAAAAAATATACAGATCTATGGAGCAACAAATAAATGATGATAAAGAATTTCTTACAAAAGAAAAGATTCCAGTAAATAATGTAAGTAAGCATCATCATCCAATTTTAGAATATTTTTTTAATGCTAGTAATGAAATTGGTATTAAAAAAAATACAAATTTAACTACATCAATCGAAAATCAAGTAGGTCATTATAATATTAATACTTACAACGGTACTAGACATTCATCATCAAAAGTATTTTTAAAGCCTGTATTAAAAAATCCTAGGCTAACTATACTAGACAATACTCAAGTTAAAAACTTAATAATTAAAGATAAAAAGATTACTGGTATAAAAATTCAAAATAAATCAATAGAACAAATCATACACTTATCTCATGGAGCAATTTTATGTTCAGGTTCTATAATGACACCTTATTTGTTAATGCATTCTGGTATTGGTGATAAAGAGCATTTAAAACAATTTGATAAAGAAATAATAATTGATAATACTAATGTTGGAAGAAATCTTCAGGATCATCTTGGGTTAGATTATTTATTTAAAACTCATCATCATTCACTAAATAAATCACTAGGAACTTGGCCAGGTAGAATTACTTCTGTATTAAAATATATTTATAATAGGAAAGGACCATTATCACTTAGTATTAATCAGTCTGGGGGATATGTAAATTGGAATTCAAAACATCATTATCCCAACTTACAGATATATTTCAATCCG * ZE:f:1.44347e-20 AS:i:100 ZI:i:24 ZF:i:1 NM:i:209
SHAA001TF 256 sp|Q6UPE0|CHDH_RAT 135 255 72M6D579M12D21M12I135M65H * 0 0 * * ZE:f:5.48516e-20 AS:i:98 ZI:i:24 ZF:i:1 NM:i:212
SHAA001TF 256 sp|Q8NE62|CHDH_HUMAN 130 255 72M6D555M3I72M3D117M65H * 0 0 * * ZE:f:6.06456e-19 AS:i:95 ZI:i:24 ZF:i:1 NM:i:209
SHAA002TR 0 sp|A7MD48|SRRM4_HUMAN 403 255 24H84M9I183M9D57M12I189M3D33M3D66M249H * 0 0 TACACCGCACGGAAGGCGCGCGTCCGCGCGTGCTCGGCAAGCACGACGCCGTCGGCGGCTGCCTGCTGTCGGAGCTCGGCGAGCTGCGTCCGTCGCGCATCCTGCCGGTGTTCGCCGACTGGCTCGCGCGCCACAAGCCGGCGCTCGACCGCCGCGAGCGCGTGGTCGACCTCGTCGCGCCGCAGATCCTGTCGAACGAGGCGGACGCGGTGAAGCGCACGCCGTACTTCTGCTCGGGCTGCCCGCACAACACGTCGACGAAGGTGCCGGAAGGCTCGATCGCGCAGGCCGGCATCGGCTGCCACTTCATGGCGTCGTGGATGGAGCGCGACACCACTGGCCTGATCCAGATGGGTGGCGAAGGCGTCGACTGGGCCGCGCACGCGATGTTCACGAACACGAAGCACGTGTTCCAGAACCTCGGCGACGGCACCTACTTCCACTCGGGCATCCTCGCGATCCGCCAGGCGGTCGCCGCGAAAGCGAACATCACGTACAAGATCCTCTACAACGACGCGGTCGCGATGACGGGCGGCCAGCCGGTCGACGGCAGCATTTCGGTGCCGCAGATCGCGCGGCAGGTCGAGGCGGAGGGCGTGTCGCGCTTCGTGGTCGTGTCCGACGAGCCGGAGA * ZE:f:1.23377e-06 AS:i:54 ZI:i:27 ZF:i:1 NM:i:156
SHAA004TF 0 sp|P0A916|OMPW_SHIFL 1 255 388H45M12I93M12I183M70H * 0 0 ATGAATAAAACTACTGTTTCTACACTGATCGCCGCCACCCTGTTAGCCGCTGGTTTCTCTGCTTCTGTTTCTGCCCATCAAGCGGGCGATATCATTGTTCGTGCTGGTGCTGTGGTTGTCGCACCCAATGAATCAAGTGATGATGTTGTAATTCCTGGGGTAGGTAATTTAGGTGAGTTTAAAGTCAGTAACGATACTCAACTTGGCTTAAATTTCGGGTATATGTTGACCGATAACATTGGTATTGAGCTATTAGCAGCGACTCCATTTAGCCATGATGTATCTCTAGCGGGTGTTGGTAAAATTGCGGAGACTAAGCATTTACCACCAACCTTAGTTGCACGG * ZE:f:7.69801e-23 AS:i:108 ZI:i:50 ZF:i:2 NM:i:57
SHAA004TF 256 sp|P0A915|OMPW_ECOLI 1 255 388H45M12I93M12I183M70H * 0 0 * * ZE:f:7.69801e-23 AS:i:108 ZI:i:50 ZF:i:2 NM:i:57
SHAA004TF 256 sp|P17266|OMPW_VIBCH 4 255 409H129M12I132M15D51M70H * 0 0 ACACTGATCGCCGCCACCCTGTTAGCCGCTGGTTTCTCTGCTTCTGTTTCTGCCCATCAAGCGGGCGATATCATTGTTCGTGCTGGTGCTGTGGTTGTCGCACCCAATGAATCAAGTGATGATGTTGTAATTCCTGGGGTAGGTAATTTAGGTGAGTTTAAAGTCAGTAACGATACTCAACTTGGCTTAAATTTCGGGTATATGTTGACCGATAACATTGGTATTGAGCTATTAGCAGCGACTCCATTTAGCCATGATGTATCTCTAGCGGGTGTTGGTAAAATTGCGGAGACTAAGCATTTACCACCAACCTTAGTTGCACGG * ZE:f:2.23978e-22 AS:i:106 ZI:i:52 ZF:i:2 NM:i:54
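The optional tags described in the `@CO` header line (`ZE` for the expect value, `AS` for the bit score) can be pulled out of the records with standard tools. The following is a minimal sketch, assuming the usual SAM layout with eleven mandatory fields followed by optional tags; the sample input is a hypothetical excerpt consisting of the `@HD` line and the four sequence-less records (flag 256) from the listing above, and the variable names are illustrative.

```shell
# Hypothetical sample input: header line plus four records from the listing.
sam=$(mktemp)
cat > "$sam" <<'EOF'
@HD VN:1.4 GO:query
SHAA001TF 256 sp|Q9WWW2|ALKJ_PSEPU 84 255 63M6D87M6D30M3I288M6D186M3D162M3I27M35H * 0 0 * * ZE:f:1.25869e-32 AS:i:140 ZI:i:31 ZF:i:1 NM:i:200
SHAA001TF 256 sp|Q6UPE0|CHDH_RAT 135 255 72M6D579M12D21M12I135M65H * 0 0 * * ZE:f:5.48516e-20 AS:i:98 ZI:i:24 ZF:i:1 NM:i:212
SHAA001TF 256 sp|Q8NE62|CHDH_HUMAN 130 255 72M6D555M3I72M3D117M65H * 0 0 * * ZE:f:6.06456e-19 AS:i:95 ZI:i:24 ZF:i:1 NM:i:209
SHAA004TF 256 sp|P0A915|OMPW_ECOLI 1 255 388H45M12I93M12I183M70H * 0 0 * * ZE:f:7.69801e-23 AS:i:108 ZI:i:50 ZF:i:2 NM:i:57
EOF

# Skip header lines (starting with @), then scan the optional fields
# (column 12 onwards) for the ZE and AS tags and print them next to
# the query name (column 1) and subject name (column 3).
tags=$(awk '
  /^@/ { next }
  {
    evalue = ""; score = ""
    for (i = 12; i <= NF; i++) {
      if ($i ~ /^ZE:f:/) evalue = substr($i, 6)
      if ($i ~ /^AS:i:/) score  = substr($i, 6)
    }
    print $1, $3, evalue, score
  }' "$sam")

echo "$tags"
rm -f "$sam"
```

Looking tags up by name rather than by position keeps the sketch working even if the order of the optional fields differs between records.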
If anything is unclear, don't hesitate to contact me.