analysis: re-run civic/moa variant analysis + add zip for moa features to reuse #25
@wesleygoar One reason the normalization rate dropped is that our regex is case-sensitive in the refactor. Here are some examples from CIViC that fail due to case sensitivity:
We added case sensitivity to make it clear which variant was passed. For example, if
@wesleygoar The normalization rate dropped for both CIViC and MOA, but not by much (see the analysis files for percent changes), so let me know if you want me to merge these changes or if you'd like us to come up with solutions in the normalizer refactor.
@korikuzma The VRS ID changes were due to the variant being valid in both builds?
@wesleygoar Yup
Yeah, we should fix that.
I know, we discussed me solving the linked issue before via Slack.
I know. ; ) Just doing my part to keep provenance via GitHub like you wanted.
I think we can resolve this without case delineation. For any given expression of this form, there can only be: For example, If a string has no valid interpretation, then it is simply invalid and we move on.
@ahwagner What if we had a case like
Yeah. Wes and I were discussing this down here while you typed up this example. This is part of the narrow class of n-3 length expressions where we have this challenge. I would say we just attempt to interpret it both ways, and if the result is ambiguous, we say we can't normalize due to ambiguity.
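A minimal sketch of the "interpret both ways" rule described above, under stated assumptions. The thread does not show the actual expression form, so this toy example assumes a residue string that could be read either as one-letter or as three-letter amino acid codes. The names `as_one_letter`, `as_three_letter`, and `resolve` are hypothetical and are not part of the variation normalizer.

```python
# Hypothetical illustration only; not the variation normalizer's implementation.
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}
ONE_LETTER = set(THREE_TO_ONE.values())


def as_one_letter(seq):
    """Read seq as one-letter codes, ignoring case; None if not valid."""
    s = seq.upper()
    if s and all(ch in ONE_LETTER for ch in s):
        return list(s)
    return None


def as_three_letter(seq):
    """Read seq as three-letter codes, ignoring case; None if not valid."""
    s = seq.upper()
    if not s or len(s) % 3 != 0:
        return None
    chunks = [s[i:i + 3] for i in range(0, len(s), 3)]
    if all(chunk in THREE_TO_ONE for chunk in chunks):
        return [THREE_TO_ONE[chunk] for chunk in chunks]
    return None


def resolve(seq):
    """Try every interpretation and classify the outcome."""
    candidates = [
        result
        for result in (as_one_letter(seq), as_three_letter(seq))
        if result is not None
    ]
    if not candidates:
        return ("invalid", [])            # no valid interpretation: move on
    if len(candidates) > 1:
        return ("ambiguous", candidates)  # refuse to normalize
    return ("ok", candidates[0])          # exactly one reading


print(resolve("glu"))  # only the three-letter reading is valid
print(resolve("ASP"))  # both readings are valid
print(resolve("xyz"))  # neither reading is valid
```

In this toy setup, casing is never consulted: a string is accepted only when exactly one interpretation is valid, and it is reported as ambiguous rather than guessed when more than one is.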
@wesleygoar PR is made here.
0ebb708 to 111432f
@wesleygoar re-ran with the input_assembly parameter for CIViC.
@korikuzma thanks. I'll take a gander at the data.
There are some VRS ID changes. Here's an issue I made for gnomAD VCF queries in the variation normalizer. I'm going to keep this as a draft PR for now to investigate why the normalization rate went down.