| 1 | +--- |
| 2 | +layout: default |
| 3 | +parent: FAQ |
| 4 | +title: Accuracy vs. passes |
| 5 | +--- |
| 6 | + |
| 7 | +## What impacts the number and quality of HiFi reads that are generated? |
| 8 | +The longer the polymerase read gets, the more passes of the SMRTbell |
| 9 | +are produced, and consequently the more evidence is accumulated per molecule. |
| 10 | +This increase in evidence translates into higher consensus accuracy, as |
| 11 | +depicted in the following plot: |
| 12 | + |
| 13 | +<p align="center"><img width="600px" src="../img/ccs-acc.png"/></p> |
| 14 | + |
| 15 | +## How is the number of passes computed? |
| 16 | +Each read is annotated with an `np` tag that contains the number of |
| 17 | +full-length subreads used for polishing. Full-length subreads are flanked by |
| 18 | +adapters and thus cover the full insert. |
| 19 | +Since the first version of _ccs_, the number of passes has only counted |
| 20 | +full-length subreads. Version v3.3.0 added windowing, which reports the |
| 21 | +minimum number of full-length subreads across all windows. |
| 22 | +Starting with version v4.0.0, the minimum was replaced with the mode to better |
| 23 | +represent all windows. Only subreads that pass the subread |
| 24 | +length filter (please see next FAQ about filters) and were not dropped during |
| 25 | +polishing are counted. |
| 26 | + |
| 27 | +Similarly, the tag `ec` reports effective coverage, the average subread coverage |
| 28 | +across all windows. This metric includes all subreads, whether full-length |
| 29 | +or partial, that pass the length filters and did not fail |
| 30 | +during polishing. In most cases, `ec` will be roughly `np + 1`. |
| 31 | + |
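As a rough illustration of the two tags, the sketch below computes `np` as the mode (or, as in v3.3.0, the minimum) of per-window full-length subread counts, and `ec` as the mean per-window coverage over all subreads. The function names and the per-window input lists are hypothetical and this is not the _ccs_ implementation. How ties between several modes are resolved is not stated above, so taking the smallest mode is an assumption.

```python
from statistics import mean, multimode


def num_passes(full_length_per_window, use_mode=True):
    """Hypothetical helper: summarise per-window full-length subread counts."""
    if use_mode:
        # v4.0.0 and later, as described above: the mode across windows.
        # Assumption: if several values tie, take the smallest one.
        return min(multimode(full_length_per_window))
    # v3.3.0, as described above: the minimum across windows.
    return min(full_length_per_window)


def effective_coverage(all_subreads_per_window):
    """Hypothetical helper: average coverage of all subreads across windows."""
    return mean(all_subreads_per_window)


# Made-up counts for one ZMW split into five windows.
full_length = [8, 8, 7, 8, 5]   # full-length subreads covering each window
all_subreads = [9, 9, 8, 9, 6]  # full- and partial-length subreads per window

np_mode = num_passes(full_length)
np_min = num_passes(full_length, use_mode=False)
ec = effective_coverage(all_subreads)
```

With these made-up counts, a single poorly covered window pulls the minimum down to 5, while the mode stays at 8, which is why the mode is the better summary across windows.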
| 32 | +## Why do I get more yield if I increase `--min-passes`? |
| 33 | +For versions newer than 3.0.0 and older than 4.2.0, we required that, after |
| 34 | +draft generation, at least `--min-passes` subreads map back to the draft. |
| 35 | +Imagine the following scenario: a ZMW with 10 subreads generates a draft to which |
| 36 | +only a single subread aligns. This draft is of low quality and does not |
| 37 | +represent the ZMW, yet if you ask for `--min-passes 1`, this low-quality draft |
| 38 | +is used. Starting with version 4.2.0, we additionally require that more than |
| 39 | +50% of the subreads align to the draft, which avoids this problem. |
| 40 | +This fixes the majority of discrepancies for fewer than three passes. |
| 41 | + |
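The scenario above can be sketched as a small acceptance check. This is only an illustration under stated assumptions: `draft_accepted` and its parameters are hypothetical names, and the way the absolute `--min-passes` count and the 50% fraction are combined here is a guess at the described behaviour, not the _ccs_ code.

```python
def draft_accepted(num_aligned, num_subreads, min_passes, require_majority=True):
    """Hypothetical check: is a draft supported by enough subreads?"""
    if num_subreads == 0:
        return False
    # Rule described for versions before 4.2.0: an absolute count only.
    if num_aligned < min_passes:
        return False
    # Additional rule described for 4.2.0 and later (assumed combination):
    # strictly more than 50% of the subreads must align to the draft.
    if require_majority and num_aligned / num_subreads <= 0.5:
        return False
    return True


# The example from the text: 10 subreads, only one aligns, --min-passes 1.
old_rule = draft_accepted(1, 10, min_passes=1, require_majority=False)
new_rule = draft_accepted(1, 10, min_passes=1)
```

Under the count-only rule the low-quality draft is kept, while the added fraction check rejects it.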
| 42 | +Why do we have this problem at all? Shouldn't the draft stage be robust enough? |
| 43 | +Robustness comes with inherent speed trade-offs. We have a cascade of different draft |
| 44 | +generators, from very fast and unstable to slow and robust. If a ZMW fails |
| 45 | +to generate a draft with a fast generator, it falls back, possibly several times, until it |
| 46 | +reaches the slowest and most robust generator. This approach is still much faster |
| 47 | +than always relying on the robust generator. |
| 48 | + |
| 49 | +## Is there an upper limit on the number of passes used? |
| 50 | +By default, _ccs_ uses at most the top 60 full-length passes after sorting |
| 51 | +by median length. |
| 52 | +Beyond this threshold, quality has been shown not to improve. |
| 53 | +You can change this limit with `--top-passes`, where `0` means unlimited. |
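A minimal sketch of such a cap follows. It assumes that "sorting by median length" means ranking passes by how close their length is to the median pass length; that reading, the function name `select_top_passes`, and the example lengths are all assumptions for illustration, not the _ccs_ implementation.

```python
from statistics import median


def select_top_passes(lengths, top_passes=60):
    """Hypothetical helper: keep at most `top_passes` pass lengths."""
    if not lengths or top_passes == 0:
        # 0 means unlimited, mirroring the `--top-passes 0` convention above.
        return list(lengths)
    med = median(lengths)
    # Assumption: passes closest to the median length rank first.
    ranked = sorted(lengths, key=lambda length: abs(length - med))
    return ranked[:top_passes]


# Made-up full-length pass lengths for one ZMW.
lengths = [1000, 1010, 990, 400, 2000]
top_three = select_top_passes(lengths, top_passes=3)
unlimited = select_top_passes(lengths, top_passes=0)
```

In this sketch the outlier passes (400 and 2000) are the first to be dropped once the cap applies.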