salmon seems to remove transcripts #1500

Open
jedi-bogate opened this issue Jan 31, 2025 · 0 comments


Hi,

I'm running a pipeline that uses unique mapping to map TEs to the reference genome, and I'm quantifying by transcript ID. When I run the pipeline, salmon seems to drop some of the transcripts that are present in the annotation file during quantification.

I read that this happens because salmon removes duplicate transcripts; however, I can't find the duplicate_clusters.tsv file that is supposed to be generated when that happens. I have also tried supplying the --keepDuplicates option to both salmon index and salmon quant, but that didn't help either.
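One way to check whether duplicate removal could explain the missing transcripts is to look for sequence-identical entries in the transcriptome FASTA used to build the index. The following is a minimal sketch, not part of salmon or the pipeline: the function names are hypothetical, the demo input is made up, and comparing sequences case-insensitively is an assumption of this sketch.

```python
from collections import defaultdict
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from an open FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first whitespace-delimited token as the ID.
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def duplicate_groups(handle):
    """Return lists of transcript IDs that share an identical sequence."""
    by_seq = defaultdict(list)
    for name, seq in read_fasta(handle):
        # Case-insensitive comparison is a choice made for this sketch.
        by_seq[seq.upper()].append(name)
    return [names for names in by_seq.values() if len(names) > 1]


# Made-up demo input; for real data, open the transcriptome FASTA instead.
demo = io.StringIO(">tx1\nACGT\nACGT\n>tx2\nacgtACGT\n>tx3\nTTTT\n")
for group in duplicate_groups(demo):
    print("\t".join(group))
```

If this reports no groups for the real FASTA, sequence-level duplicates are probably not the reason the transcripts disappear.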

I also tried running the pipeline with star_rsem instead of star_salmon. The same thing seems to happen there, so perhaps it isn't a quantification problem.

What are my options here? Should I just remove the transcripts that don't show up in salmon.merged.transcript_counts.tsv from the annotation file?
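Before removing anything from the annotation, it may help to list exactly which transcript IDs are absent from the counts table. Below is a rough sketch under stated assumptions: the annotation is a GTF with `transcript_id "..."` attributes, and the counts file is tab-separated with a header row and the transcript ID in the first column. The function names and the demo data are hypothetical.

```python
import csv
import io
import re

# Matches the transcript_id attribute in a GTF attribute column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def gtf_transcript_ids(handle):
    """Collect every transcript_id found in a GTF file handle."""
    ids = set()
    for line in handle:
        if line.startswith("#"):
            continue
        match = TRANSCRIPT_ID.search(line)
        if match:
            ids.add(match.group(1))
    return ids


def quantified_ids(handle):
    """Collect IDs from the first column of a tab-separated counts table."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header row (assumed to be present)
    return {row[0] for row in reader if row}


def missing_transcripts(gtf_handle, counts_handle):
    """Return annotation transcript IDs that never appear in the counts."""
    return sorted(gtf_transcript_ids(gtf_handle) - quantified_ids(counts_handle))


# Made-up demo data; for real files, pass open() handles instead.
gtf = io.StringIO(
    'chr1\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "g1"; transcript_id "tx1";\n'
    'chr1\tsrc\texon\t200\t300\t.\t+\t.\tgene_id "g2"; transcript_id "tx2";\n'
)
counts = io.StringIO("tx\tgene_id\tsample1\ntx1\tg1\t12\n")
print(missing_transcripts(gtf, counts))
```

Comparing the resulting list against the annotation (for example, checking feature types or chromosome names of the missing entries) could show whether they are dropped before quantification.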

Thank you in advance for your answer! I'm not sure if this issue has already been raised or resolved (sorry if it has); I haven't been able to find a solution online.

These are the line counts for the annotation and each of the quantification output files:
[Three screenshots of the line counts; images not preserved]