Interpreting aBSREL results #1817

Open
00-kelvin opened this issue Mar 10, 2025 · 9 comments

Comments

@00-kelvin

Hi there Sergei,

I was wondering if you could help me interpret what I think might be somewhat strange results from aBSREL.

For context, I first ran BUSTED-PH on 4756 orthogroups generated from transcriptomes of ~100 spiders. 105 genes came up as hits for positive selection associated with my phenotype of interest (orb-weaving).

I then ran aBSREL on the hits, hoping to determine which branches/species from each of the genes were contributing to the positive BUSTED results.

Not all of the orthogroups identified as positively selected by BUSTED-PH showed evidence of positive selection in aBSREL; this didn't surprise me too much, since I understand aBSREL applies more multiple-testing correction and might therefore have less power. What did surprise me when I began parsing the results was the dN and dS (and omega) values for the branches/nodes that aBSREL identified as being under positive selection.

Image

Most of the omega values (taken from the JSON field 'branch attributes'>'0'>GENE/NODE>'Baseline MG94xREV omega ratio') for the branches identified as positively selected are recorded as 10000000000.0, resulting from their dS values being equal to 1e-10. This struck me as odd -- is it normal for a gene to have essentially 0 synonymous substitutions per site? I also noticed that of the few branches that showed up as positively selected which did not have omega=10000000000.0, the omega value was actually <1, which makes me question why they would come up as positively selected at all. I wondered whether these results could be due to alignment errors or issues with transcriptome quality, though my understanding of the algorithms behind these tests is limited enough that perhaps they are normal after all.
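For readers following along, here is a minimal Python sketch of pulling those omega values out of a parsed aBSREL JSON and listing the branches that sit at the 10000000000.0 value. The JSON path is the one quoted above; the function names and the toy data are made up for illustration and are not real output.

```python
import json

# Value reported in the thread for branches whose dS sits at 1e-10.
OMEGA_CEILING = 1e10


def baseline_omegas(results):
    """Return {branch name: baseline omega} from a parsed aBSREL JSON dict."""
    branches = results["branch attributes"]["0"]
    return {
        name: attrs["Baseline MG94xREV omega ratio"]
        for name, attrs in branches.items()
        if "Baseline MG94xREV omega ratio" in attrs
    }


def at_ceiling(omegas, ceiling=OMEGA_CEILING):
    """Sorted names of branches whose omega is at or above the ceiling."""
    return sorted(name for name, omega in omegas.items() if omega >= ceiling)


# Toy stand-in for a results file (hypothetical branch names and values).
toy = json.loads("""
{"branch attributes": {"0": {
    "Node12": {"Baseline MG94xREV omega ratio": 10000000000.0},
    "SpeciesA_gene1": {"Baseline MG94xREV omega ratio": 0.42}
}}}
""")

omegas = baseline_omegas(toy)
print(at_ceiling(omegas))
```

With a real file, `json.load(open(path))` would replace the toy string.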

I have attached the table screenshotted above as well as the alignments, tree files and json files for a few of the orthogroups with positive results in case that is helpful. Thank you in advance for any advice!

Calvin

absrel_res.zip

@00-kelvin
Author

After reading a bit more, I now understand that the Baseline MG94xREV omega ratio does not correspond to the "Full adaptive model (synonymous subs/site)" and "Full adaptive model (non-synonymous subs/site)" values, since these come from two different model fits. It also seems that the "Full adaptive model ([non-]synonymous subs/site)" values reported for each branch are not actually dN and dS per se, but rather the synonymous and non-synonymous components of the branch length under the adaptive model fit (right?).

With this in mind, I am now looking instead at the "Rate Distributions" result in the json, and still the dN/dS ratios for most hits are equal to 10000000000 (revised csv attached). So I suppose my question still stands -- is this a normal-looking result? Thank you again!
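As a rough sketch of how the "Rate Distributions" entry for one branch might be summarised in Python: this assumes each entry is a list of [omega, weight] pairs, which is my understanding of the aBSREL JSON layout rather than something stated in this thread. The function name and the toy values are hypothetical.

```python
def top_rate_class(attrs, ceiling=1e10):
    """Summarise the highest-omega class of one branch.

    Assumes attrs["Rate Distributions"] is a list of [omega, weight] pairs.
    """
    classes = attrs["Rate Distributions"]
    omega, weight = max(classes, key=lambda pair: pair[0])
    return {
        "omega": omega,
        "weight": weight,
        "classes": len(classes),
        "at_ceiling": omega >= ceiling,
    }


# Toy branch entry (made-up numbers, not taken from the attached files).
toy_branch = {"Rate Distributions": [[0.12, 0.97], [1e10, 0.03]]}
summary = top_rate_class(toy_branch)
print(summary)
```

Reporting the weight next to the omega value helps show how much of the branch the extreme class is meant to describe.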

absrel_res.csv

@stevenweaver
Member

Dear @00-kelvin,

I wouldn't say this is a normal result. Such a low synonymous rate indicates that the model is struggling to estimate synonymous rates accurately. When I looked at your tree, I saw some very long terminal branches, which may be contributing to the estimation issues (Uloborus_diversus_jg27513_t1_p1, Tekellina sadamotoi, etc.). I would double-check the alignment and make sure there aren't any misaligned codon regions or sequencing artifacts. You may also want to screen for recombinants if no other issues can be found.

Best,
Steven

@spond
Member

spond commented Mar 11, 2025

Dear @00-kelvin,

Let me take a closer look. LRT of >1,000,000 is a convergence / underflow issue for sure. That's not normal.

Best,
Sergei

@00-kelvin
Author

Thank you both! Let me know if it would be helpful to share more of the alignments or json files.

@spond
Member

spond commented Mar 11, 2025

Dear @00-kelvin,

It appears that in both of the examples you provided (thanks, very helpful!), there's a convergence issue which makes HyPhy produce nonsensical results. A "tell" is the presence of infinite branch lengths in the standard MG94 tree.

Image
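A minimal Python sketch of screening a results file for the two tells mentioned in this thread (very long baseline branches and an LRT above 1,000,000). It assumes the per-branch keys "Baseline MG94xREV" (branch length under the baseline fit) and "LRT" exist in the JSON; those key names, the length cutoff of 10, and the toy data are assumptions for illustration, not confirmed here.

```python
import math


def convergence_warnings(results, max_length=10.0, max_lrt=1e6):
    """Return {branch: [reasons]} for branches that look like convergence failures.

    max_length is an arbitrary illustrative cutoff in substitutions per site.
    """
    warnings = {}
    for name, attrs in results["branch attributes"]["0"].items():
        reasons = []
        length = attrs.get("Baseline MG94xREV")  # assumed key name
        lrt = attrs.get("LRT")                   # assumed key name
        if isinstance(length, (int, float)) and (
            not math.isfinite(length) or length > max_length
        ):
            reasons.append("long baseline branch")
        if isinstance(lrt, (int, float)) and (
            not math.isfinite(lrt) or lrt > max_lrt
        ):
            reasons.append("implausible LRT")
        if reasons:
            warnings[name] = reasons
    return warnings


# Toy data with hypothetical branch names and values.
toy = {"branch attributes": {"0": {
    "Node3": {"Baseline MG94xREV": 0.02, "LRT": 4.1},
    "Taxon_X": {"Baseline MG94xREV": 8500.0, "LRT": 2.3e6},
}}}

flagged = convergence_warnings(toy)
print(flagged)
```

Any orthogroup with flagged branches would be a candidate for re-running or for a closer look at the alignment.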

Which version of HyPhy are you running? I am unable to replicate the behavior you described with 2.5.69. I am attaching the results files from runs with --branches TEST --srv Yes --multiple-hits Double+Triple.
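For anyone wanting to re-run with the same settings, a small Python sketch that assembles such a command line. Only the three options quoted above come from this thread; the `--alignment`, `--tree` and `--output` arguments are the usual HyPhy keyword arguments as far as I know, and the file names are hypothetical placeholders.

```python
import shlex


def absrel_command(alignment, tree, output):
    """Build an argv list for an aBSREL run with the options quoted in the thread."""
    return [
        "hyphy", "absrel",
        "--alignment", alignment,
        "--tree", tree,
        "--branches", "TEST",            # test only branches labelled {TEST} in the tree
        "--srv", "Yes",                  # allow synonymous rate variation
        "--multiple-hits", "Double+Triple",
        "--output", output,
    ]


# Hypothetical file names; substitute your own paths.
cmd = absrel_command(
    "N5.HOG0022135.fasta",
    "N5.HOG0022135.nwk",
    "N5.HOG0022135.ABSREL.json",
)
print(shlex.join(cmd))
```

The list can be passed directly to `subprocess.run(cmd, check=True)` on a system where `hyphy` is on the PATH.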

For example, on N5.HOG0022135 there's a single branch with evidence of selection and a reasonable p-value.

Image

The signal derives from these two codons

Image

which seems sensible seeing how this is a short branch with no apparent synonymous variation.

When you have short branches with no synonymous variation and some non-synonymous substitutions, aBSREL will often flag them as selected. This is expected behavior.
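To make that pattern concrete, here is a crude Python illustration that counts synonymous versus non-synonymous codon differences between two aligned sequences. This is plain counting under the standard genetic code, not the likelihood calculation aBSREL performs, and the sequences are made up.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codon_changes(ancestor, descendant):
    """Return (synonymous, non_synonymous) counts of differing codons.

    Codons with gaps or ambiguous bases are skipped.
    """
    ancestor, descendant = ancestor.upper(), descendant.upper()
    if len(ancestor) != len(descendant) or len(ancestor) % 3:
        raise ValueError("sequences must be aligned, in frame, and of equal length")
    syn = nonsyn = 0
    for i in range(0, len(ancestor), 3):
        a, d = ancestor[i:i + 3], descendant[i:i + 3]
        if a == d or a not in CODON_TABLE or d not in CODON_TABLE:
            continue
        if CODON_TABLE[a] == CODON_TABLE[d]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


# Made-up short branch: two amino acid changes, no synonymous changes.
counts = count_codon_changes("ATGGCTAAAGGT", "ATGGTTAAAGAT")
print(counts)
```

A branch like this carries no information about the synonymous rate, so the estimated dS can collapse toward zero while the non-synonymous changes still support a high omega.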

HTH,
Sergei

@00-kelvin
Author

00-kelvin commented Mar 12, 2025

Hi Sergei, I have been using 2.5.65, since that is the latest version available through Anaconda and I am not able to build from source on the system I am using. Would you be able to upload the newest version of HyPhy to Anaconda? I'm not very familiar with how that works from the developer end.

@stevenweaver
Member

Dear @00-kelvin,

I think BioConda's bot stopped working. The last PR was bioconda/bioconda-recipes#52892; before then, it used to update (somewhat) automatically.

Best,
Steven

@stevenweaver
Member

stevenweaver commented Mar 13, 2025

Dear @00-kelvin,

There's a new PR for an updated version of hyphy in BioConda. Please follow its progress here.

Best,
Steven

@00-kelvin
Author

@stevenweaver Thank you!
