-
Notifications
You must be signed in to change notification settings - Fork 77
Fix mutation ordering #3257
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix mutation ordering #3257
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3257 +/- ##
==========================================
+ Coverage 89.72% 89.74% +0.01%
==========================================
Files 29 29
Lines 32500 32456 -44
Branches 5949 5940 -9
==========================================
- Hits 29160 29127 -33
+ Misses 1894 1888 -6
+ Partials 1446 1441 -5
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
What's the difference now between sort_mutations and sort_canonical? Would a simpler way of thinking about this be to say that we're changing things by making |
The only difference now is that canonical has a sort by node id, which I think is still needed in the case of nodes with equal times? |
Let's just use sort_canonical for now, and see how we get on with test suite breakage? I don't see much point in having another stricter sort if we're already changing the semantics. |
Sorting is now identical to canonical - only a couple of python tests failing. Will fix those and add some new tests for the new sort order. |
I've fixed up the tests - the only one I wasn't sure about was |
Looks great - I think we need @petrelharp's input on the haplotype extender stuff |
Whoops - I missed two topology tests. These got a bit more involved - I've had to switch from checking for exact mutational equality to mutational functional equivalence in the simplify tests. |
b5df665
to
189e790
Compare
I think this is ready for final review. I've added @hyanwong's initial issue failing code as a test. |
Addressing #3253
This PR changes the mutation sort order from
site->time->ID
tosite->time->node_time->num_descendants->ID
This is required as if mutations have unknown times, and no parent info, we need to use the node time to have parents before children. For mutations at the same node, we assume the existing order is parents before children. We need to use num descendants to use parent info if it is present for mutations at the same node.
I've updated a few of the breaking Python tests to give a flavour of the impact. This PR still needs tests for the new order, and the rest of the broken tests fixing, but thought this was a good point for feedback.