
Example command lines

Hannes Hauswedell edited this page Mar 28, 2016 · 7 revisions

Simple BlastX-like run

  1. Download the pre-formatted UniprotSprot database from Pre-built-Database-Indexes and unpack it.
  2. Select your own query file or use the example file.
  3. Run `bin/lambda -q /path/to/1k_long_reads.fna -d /path/to/uniprot_sprot.fasta.gz`

You will see something like this:

LAMBDA - the Local Aligner for Massive Biological DatA
======================================================
Version 0.9.4

Loading Subj Sequences... done.
Loading Subj Ids... done.
Loading Database Index... done.
Loading Database Masking file... done.
Loading Query Sequences and Ids...translating... done.
Searching and extending hits on-line...progress:
0%  10%  20%  30%  40%  50%  60%  70%  80%  90%  100%
|....:....:....:....:....:....:....:....:....:....|
Number of valid hits:                           77763
Number of Queries with at least one valid hit:    709

Since you did not specify them, the default output file name and format were used: output.m8. Browse the output file with your editor of choice, or with less on the command line:

% less output.m8
SHAA001TF       sp|Q00593|ALKJ_PSEOL    31.03   290     191     7       1       849     84      371     2e-34    147
SHAA001TF       sp|Q9WWW2|ALKJ_PSEPU    31.03   290     191     6       1       849     84      371     1e-32    141
SHAA001TF       sp|Q8BJ64|CHDH_MOUSE    24.28   276     205     3       1       819     132     406     1e-20    101
SHAA001TF       sp|Q6UPE0|CHDH_RAT      24.01   279     202     3       1       819     135     409     5e-20   99.0
SHAA001TF       sp|Q8NE62|CHDH_HUMAN    24.28   276     205     3       1       819     130     404     6e-19   95.5
SHAA002TR       sp|A7MD48|SRRM4_HUMAN   27.78   216     144     5       25      657     403     611     1e-06   54.7
SHAA002TR       sp|O13547|CCW14_YEAST   28.95   114     76      1       382     723     105     213     0.070   38.9
SHAA004TF       sp|P0A916|OMPW_SHIFL    50.43   115     49      2       389     733     1       107     8e-23    108
SHAA004TF       sp|P0A915|OMPW_ECOLI    50.43   115     49      2       389     733     1       107     8e-23    108
SHAA004TF       sp|P17266|OMPW_VIBCH    52.21   113     45      2       410     733     4       112     2e-22    107
SHAA004TF       sp|Q8ZP50|OMPW_SALTY    50.43   115     49      2       389     733     1       107     7e-21    102
SHAA004TF       sp|Q8Z7E2|OMPW_SALTI    50.43   115     49      2       389     733     1       107     7e-21    102
[...]
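
The two counters printed by Lambda can be re-derived from the tabular file. The following is a minimal sketch, assuming the file is tab-separated and uses the usual twelve-column BLAST tabular layout that the listing above appears to follow. The `Hit` field names and the `parse_m8` helper are our own labels for illustration, not part of Lambda.

```python
import io
from typing import NamedTuple


class Hit(NamedTuple):
    # Column order of BLAST tabular (m8) output; the names are illustrative.
    query: str
    subject: str
    pct_identity: float
    aln_length: int
    mismatches: int
    gap_opens: int
    q_start: int
    q_end: int
    s_start: int
    s_end: int
    evalue: float
    bit_score: float


def parse_m8(handle):
    """Yield one Hit per tab-separated line that has all twelve columns."""
    for line in handle:
        row = line.rstrip("\n").split("\t")
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip blank, comment, or truncated lines
        yield Hit(row[0], row[1], float(row[2]), *map(int, row[3:10]),
                  float(row[10]), float(row[11]))


# Three lines taken from the listing above, with real tab characters.
SAMPLE = "\n".join([
    "SHAA001TF\tsp|Q00593|ALKJ_PSEOL\t31.03\t290\t191\t7\t1\t849\t84\t371\t2e-34\t147",
    "SHAA002TR\tsp|A7MD48|SRRM4_HUMAN\t27.78\t216\t144\t5\t25\t657\t403\t611\t1e-06\t54.7",
    "SHAA002TR\tsp|O13547|CCW14_YEAST\t28.95\t114\t76\t1\t382\t723\t105\t213\t0.070\t38.9",
])

hits = list(parse_m8(io.StringIO(SAMPLE)))
print("hits:", len(hits))
print("queries with at least one hit:", len({h.query for h in hits}))
```

For a real run, replace the `io.StringIO(SAMPLE)` handle with `open("output.m8")`.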

NOTE: Because Lambda uses multiple threads by default, your output is not guaranteed to be in the same order as shown above. However, the matches of one query sequence always appear en bloc and are sorted by e-value.
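
The ordering property from the note can be checked on an output file. The sketch below assumes tab-separated m8 lines with the query ID in column 1 and the e-value in column 11. Both helper names are hypothetical. Since the file stores rounded e-values, ties are accepted.

```python
def query_evalue_pairs(lines):
    """Extract (query id, e-value) from tab-separated m8 lines."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 12:
            yield fields[0], float(fields[10])


def is_blockwise_sorted(pairs):
    """True if each query forms one contiguous block with non-decreasing e-values."""
    finished = set()      # queries whose block has already ended
    current = None
    last_evalue = None
    for query, evalue in pairs:
        if query != current:
            if query in finished:
                return False          # query shows up again after its block ended
            if current is not None:
                finished.add(current)
            current, last_evalue = query, evalue
        else:
            if evalue < last_evalue:
                return False          # e-values within a block must not decrease
            last_evalue = evalue
    return True


GOOD = [
    "SHAA001TF\tsp|Q00593|ALKJ_PSEOL\t31.03\t290\t191\t7\t1\t849\t84\t371\t2e-34\t147",
    "SHAA001TF\tsp|Q9WWW2|ALKJ_PSEPU\t31.03\t290\t191\t6\t1\t849\t84\t371\t1e-32\t141",
    "SHAA002TR\tsp|A7MD48|SRRM4_HUMAN\t27.78\t216\t144\t5\t25\t657\t403\t611\t1e-06\t54.7",
    "SHAA002TR\tsp|O13547|CCW14_YEAST\t28.95\t114\t76\t1\t382\t723\t105\t213\t0.070\t38.9",
]
INTERLEAVED = [GOOD[0], GOOD[2], GOOD[1]]   # SHAA001TF split into two blocks
UNSORTED = [GOOD[1], GOOD[0]]               # larger e-value listed first

print(is_blockwise_sorted(query_evalue_pairs(GOOD)))
print(is_blockwise_sorted(query_evalue_pairs(INTERLEAVED)))
print(is_blockwise_sorted(query_evalue_pairs(UNSORTED)))
```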

SAM output and E-Value cutoff

Follow the instructions above, but choose SAM (.sam) as the output format and use an e-value cutoff of 1e-4.

The command line then looks like this: `bin/lambda -q /path/to/1k_long_reads.fna -d /path/to/uniprot_sprot.fasta.gz -o output.sam -e 1e-4`
(Scientific notation is supported for the e-value.)
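
When Lambda is driven from a script, the same command line can be assembled as an argument list. This is only a sketch: `lambda_command` is a hypothetical helper, and it uses nothing beyond the `-q`, `-d`, `-o` and `-e` options shown on this page. The e-value is passed as a string so that the scientific notation is kept as typed.

```python
import shlex


def lambda_command(query, database, output=None, evalue=None, binary="bin/lambda"):
    """Build the argument list for a Lambda run (hypothetical helper)."""
    argv = [binary, "-q", query, "-d", database]
    if output is not None:
        argv += ["-o", output]       # the file extension selects the format
    if evalue is not None:
        argv += ["-e", str(evalue)]  # e.g. "1e-4"
    return argv


argv = lambda_command(
    "/path/to/1k_long_reads.fna",
    "/path/to/uniprot_sprot.fasta.gz",
    output="output.sam",
    evalue="1e-4",
)
print(shlex.join(argv))
# To actually run it: subprocess.run(argv, check=True)
```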

The program will now print:

LAMBDA - the Local Aligner for Massive Biological DatA
======================================================
Version 0.9.4

Loading Subj Sequences... done.
Loading Subj Ids... done.
Loading Database Index... done.
Loading Database Masking file... done.
Loading Query Sequences and Ids...translating... done.
Searching and extending hits on-line...progress:
0%  10%  20%  30%  40%  50%  60%  70%  80%  90%  100%
|....:....:....:....:....:....:....:....:....:....|
Number of valid hits:                           73294
Number of Queries with at least one valid hit:    666

As you can see, the number of hits has been reduced slightly due to the more stringent cutoff.
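
To get a feel for which hits disappear, the same cutoff can be applied after the fact to the tabular output of the first run. The sketch below assumes tab-separated m8 lines with the e-value in column 11. `filter_by_evalue` is a hypothetical helper and is not claimed to reproduce Lambda's internal filtering, so the counts it gives on a full file need not match the run above exactly.

```python
def filter_by_evalue(lines, max_evalue):
    """Keep tab-separated m8 lines whose e-value (column 11) is at most max_evalue."""
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        if float(fields[10]) <= max_evalue:
            kept.append(line)
    return kept


# Three lines from the first example's output, with real tab characters.
SAMPLE = [
    "SHAA002TR\tsp|A7MD48|SRRM4_HUMAN\t27.78\t216\t144\t5\t25\t657\t403\t611\t1e-06\t54.7",
    "SHAA002TR\tsp|O13547|CCW14_YEAST\t28.95\t114\t76\t1\t382\t723\t105\t213\t0.070\t38.9",
    "SHAA004TF\tsp|P0A916|OMPW_SHIFL\t50.43\t115\t49\t2\t389\t733\t1\t107\t8e-23\t108",
]

kept = filter_by_evalue(SAMPLE, 1e-4)
for line in kept:
    print(line.split("\t")[1])   # subject IDs that survive the cutoff
```

In this sample only the CCW14_YEAST hit, with an e-value of 0.070, is above the cutoff.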

View the output again to verify that it is beautiful SAM:

@HD     VN:1.4  GO:query
@PG     ID:lambda       PN:lambda       VN:0.9.4        CL:bin/lambda -d /home/h4nn3s/sequences/uniprot_sprot.fasta.gz -q /home/h4nn3s/sequences/1k_long_reads.fasta -o output.sam -e 1e-4
@CO     Lambda is a high performance BLAST compatible local aligner, please see http://seqan.de/lambda for more information.
@CO     SAM/BAM dialect documentation is available here: https://github.com/seqan/lambda/wiki/Output-Formats
@CO     If you use any results found by Lambda, please cite Hauswedell et al. (2014) doi: 10.1093/bioinformatics/btu439
@CO     Optional tags as follow ZE:expect value AS:bit score    ZI:% identity (in protein space unless BLASTN)  ZF:query frame  NM:edit distance (in protein space unless BLASTN)
SHAA001TF       0       sp|Q00593|ALKJ_PSEOL    84      255     63M6D57M6D60M3I270M3D42M3D153M3D171M3I27M35H    *       0       0       GGCTCTGGCTCAATAAATGCAATGGTCTATGCAAGAGGATTAGAAACAGATTATGAGAATTGGGGCACCAATAAGGAATGGAGTTTTGAAAATATAAAAAAAATATACAGATCTATGGAGCAACAAATAAATGATGATAAAGAATTTCTTACAAAAGAAAAGATTCCAGTAAATAATGTAAGTAAGCATCATCATCCAATTTTAGAATATTTTTTTAATGCTAGTAATGAAATTGGTATTAAAAAAAATACAAATTTAACTACATCAATCGAAAATCAAGTAGGTCATTATAATATTAATACTTACAACGGTACTAGACATTCATCATCAAAAGTATTTTTAAAGCCTGTATTAAAAAATCCTAGGCTAACTATACTAGACAATACTCAAGTTAAAAACTTAATAATTAAAGATAAAAAGATTACTGGTATAAAAATTCAAAATAAATCAATAGAACAAATCATACACTTATCTCATGGAGCAATTTTATGTTCAGGTTCTATAATGACACCTTATTTGTTAATGCATTCTGGTATTGGTGATAAAGAGCATTTAAAACAATTTGATAAAGAAATAATAATTGATAATACTAATGTTGGAAGAAATCTTCAGGATCATCTTGGGTTAGATTATTTATTTAAAACTCATCATCATTCACTAAATAAATCACTAGGAACTTGGCCAGGTAGAATTACTTCTGTATTAAAATATATTTATAATAGGAAAGGACCATTATCACTTAGTATTAATCAGTCTGGGGGATATGTAAATTGGAATTCAAAACATCATTATCCCAACTTACAGATATATTTCAATCCGTTGACTTATTCTATCACTCATAAAAATAAA       *       ZE:f:2.29386e-34        AS:i:146        ZI:i:31 ZF:i:1  NM:i:200
SHAA001TF       256     sp|Q9WWW2|ALKJ_PSEPU    84      255     63M6D87M6D30M3I288M6D186M3D162M3I27M35H *       0       0       *       *       ZE:f:1.25869e-32        AS:i:140        ZI:i:31 ZF:i:1  NM:i:200
SHAA001TF       256     sp|Q8BJ64|CHDH_MOUSE    132     255     72M6D555M3I72M3D117M65H *       0       0       GGCTCTGGCTCAATAAATGCAATGGTCTATGCAAGAGGATTAGAAACAGATTATGAGAATTGGGGCACCAATAAGGAATGGAGTTTTGAAAATATAAAAAAAATATACAGATCTATGGAGCAACAAATAAATGATGATAAAGAATTTCTTACAAAAGAAAAGATTCCAGTAAATAATGTAAGTAAGCATCATCATCCAATTTTAGAATATTTTTTTAATGCTAGTAATGAAATTGGTATTAAAAAAAATACAAATTTAACTACATCAATCGAAAATCAAGTAGGTCATTATAATATTAATACTTACAACGGTACTAGACATTCATCATCAAAAGTATTTTTAAAGCCTGTATTAAAAAATCCTAGGCTAACTATACTAGACAATACTCAAGTTAAAAACTTAATAATTAAAGATAAAAAGATTACTGGTATAAAAATTCAAAATAAATCAATAGAACAAATCATACACTTATCTCATGGAGCAATTTTATGTTCAGGTTCTATAATGACACCTTATTTGTTAATGCATTCTGGTATTGGTGATAAAGAGCATTTAAAACAATTTGATAAAGAAATAATAATTGATAATACTAATGTTGGAAGAAATCTTCAGGATCATCTTGGGTTAGATTATTTATTTAAAACTCATCATCATTCACTAAATAAATCACTAGGAACTTGGCCAGGTAGAATTACTTCTGTATTAAAATATATTTATAATAGGAAAGGACCATTATCACTTAGTATTAATCAGTCTGGGGGATATGTAAATTGGAATTCAAAACATCATTATCCCAACTTACAGATATATTTCAATCCG     *       ZE:f:1.44347e-20        AS:i:100        ZI:i:24 ZF:i:1  NM:i:209
SHAA001TF       256     sp|Q6UPE0|CHDH_RAT      135     255     72M6D579M12D21M12I135M65H       *       0       0       *       *       ZE:f:5.48516e-20        AS:i:98 ZI:i:24 ZF:i:1  NM:i:212
SHAA001TF       256     sp|Q8NE62|CHDH_HUMAN    130     255     72M6D555M3I72M3D117M65H *       0       0       *       *       ZE:f:6.06456e-19        AS:i:95 ZI:i:24 ZF:i:1  NM:i:209
SHAA002TR       0       sp|A7MD48|SRRM4_HUMAN   403     255     24H84M9I183M9D57M12I189M3D33M3D66M249H  *       0       0       TACACCGCACGGAAGGCGCGCGTCCGCGCGTGCTCGGCAAGCACGACGCCGTCGGCGGCTGCCTGCTGTCGGAGCTCGGCGAGCTGCGTCCGTCGCGCATCCTGCCGGTGTTCGCCGACTGGCTCGCGCGCCACAAGCCGGCGCTCGACCGCCGCGAGCGCGTGGTCGACCTCGTCGCGCCGCAGATCCTGTCGAACGAGGCGGACGCGGTGAAGCGCACGCCGTACTTCTGCTCGGGCTGCCCGCACAACACGTCGACGAAGGTGCCGGAAGGCTCGATCGCGCAGGCCGGCATCGGCTGCCACTTCATGGCGTCGTGGATGGAGCGCGACACCACTGGCCTGATCCAGATGGGTGGCGAAGGCGTCGACTGGGCCGCGCACGCGATGTTCACGAACACGAAGCACGTGTTCCAGAACCTCGGCGACGGCACCTACTTCCACTCGGGCATCCTCGCGATCCGCCAGGCGGTCGCCGCGAAAGCGAACATCACGTACAAGATCCTCTACAACGACGCGGTCGCGATGACGGGCGGCCAGCCGGTCGACGGCAGCATTTCGGTGCCGCAGATCGCGCGGCAGGTCGAGGCGGAGGGCGTGTCGCGCTTCGTGGTCGTGTCCGACGAGCCGGAGA       *       ZE:f:1.23377e-06        AS:i:54 ZI:i:27 ZF:i:1  NM:i:156
SHAA004TF       0       sp|P0A916|OMPW_SHIFL    1       255     388H45M12I93M12I183M70H *       0       0       ATGAATAAAACTACTGTTTCTACACTGATCGCCGCCACCCTGTTAGCCGCTGGTTTCTCTGCTTCTGTTTCTGCCCATCAAGCGGGCGATATCATTGTTCGTGCTGGTGCTGTGGTTGTCGCACCCAATGAATCAAGTGATGATGTTGTAATTCCTGGGGTAGGTAATTTAGGTGAGTTTAAAGTCAGTAACGATACTCAACTTGGCTTAAATTTCGGGTATATGTTGACCGATAACATTGGTATTGAGCTATTAGCAGCGACTCCATTTAGCCATGATGTATCTCTAGCGGGTGTTGGTAAAATTGCGGAGACTAAGCATTTACCACCAACCTTAGTTGCACGG       *       ZE:f:7.69801e-23        AS:i:108        ZI:i:50 ZF:i:2  NM:i:57
SHAA004TF       256     sp|P0A915|OMPW_ECOLI    1       255     388H45M12I93M12I183M70H *       0       0       *       *       ZE:f:7.69801e-23        AS:i:108        ZI:i:50 ZF:i:2  NM:i:57
SHAA004TF       256     sp|P17266|OMPW_VIBCH    4       255     409H129M12I132M15D51M70H        *       0       0       ACACTGATCGCCGCCACCCTGTTAGCCGCTGGTTTCTCTGCTTCTGTTTCTGCCCATCAAGCGGGCGATATCATTGTTCGTGCTGGTGCTGTGGTTGTCGCACCCAATGAATCAAGTGATGATGTTGTAATTCCTGGGGTAGGTAATTTAGGTGAGTTTAAAGTCAGTAACGATACTCAACTTGGCTTAAATTTCGGGTATATGTTGACCGATAACATTGGTATTGAGCTATTAGCAGCGACTCCATTTAGCCATGATGTATCTCTAGCGGGTGTTGGTAAAATTGCGGAGACTAAGCATTTACCACCAACCTTAGTTGCACGG    *       ZE:f:2.23978e-22        AS:i:106        ZI:i:52 ZF:i:2  NM:i:54
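
The optional tags listed in the @CO header line can be pulled out of each record with a few lines of code. This is a minimal sketch, assuming tab-separated records with the eleven mandatory SAM columns followed by TAG:TYPE:VALUE fields. `parse_sam_line` is a hypothetical helper and the dictionary keys are our own. Flag bit 0x100 marks a secondary alignment in SAM, which matches the value 256 on all but the first record of each query above.

```python
def parse_sam_line(line):
    """Split one SAM alignment line into a few core fields plus its optional tags."""
    fields = line.rstrip("\n").split("\t")
    tags = {}
    for item in fields[11:]:              # optional fields follow the 11 mandatory ones
        tag, typ, value = item.split(":", 2)
        if typ == "i":
            value = int(value)
        elif typ == "f":
            value = float(value)
        tags[tag] = value
    flag = int(fields[1])
    return {
        "query": fields[0],
        "flag": flag,
        "secondary": bool(flag & 0x100),  # 0x100 = secondary alignment
        "subject": fields[2],
        "pos": int(fields[3]),
        "cigar": fields[5],
        "tags": tags,
    }


# Second record from the listing above, with real tab characters.
LINE = "\t".join([
    "SHAA001TF", "256", "sp|Q9WWW2|ALKJ_PSEPU", "84", "255",
    "63M6D87M6D30M3I288M6D186M3D162M3I27M35H", "*", "0", "0", "*", "*",
    "ZE:f:1.25869e-32", "AS:i:140", "ZI:i:31", "ZF:i:1", "NM:i:200",
])

record = parse_sam_line(LINE)
print(record["subject"], record["secondary"])
print(record["tags"])
```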
