Description
Related to #103
GWASpoly uses the DS field for posterior mean dosage. This is essentially posterior mean allele frequency * ploidy. However, the DS field (an informal standard) doesn't report a value for the reference allele. This is fine when DS is derived from the full posterior distribution, because the dose of the reference allele can be imputed as `ploidy - sum(alts)`. But it is an issue for `mchap assemble`, because we don't necessarily report all alleles (infrequent alleles are excluded), which results in an incomplete posterior distribution. This means the reference allele dosage can't be imputed. More importantly, the doses of the alternate alleles can't be normalized without the reference allele value. Using the results from `mchap assemble` without normalization may bias downstream analysis.
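A minimal sketch of the problem, using made-up numbers for a tetraploid sample. The frequencies, variable names, and helper functions are hypothetical and are not part of MCHap; the point is only the arithmetic of `ploidy - sum(alts)` with and without the full posterior.

```python
PLOIDY = 4

# Hypothetical posterior mean allele frequencies; index 0 is the reference allele.
full_posterior = [0.5, 0.25, 0.125, 0.125]


def alt_dosages(freqs, ploidy):
    """DS-style values: frequency * ploidy for each alternate allele only."""
    return [f * ploidy for f in freqs[1:]]


def impute_ref_dosage(alt_ds, ploidy):
    """Recover the reference dosage as ploidy - sum(alts)."""
    return ploidy - sum(alt_ds)


# Full posterior: the imputed reference dosage matches the true value.
true_ref = full_posterior[0] * PLOIDY
ds_full = alt_dosages(full_posterior, PLOIDY)
ref_from_full = impute_ref_dosage(ds_full, PLOIDY)

# Incomplete posterior: the last (infrequent) allele is not reported,
# so its dose is silently attributed to the reference allele.
reported = full_posterior[:-1]
ds_reported = alt_dosages(reported, PLOIDY)
ref_from_reported = impute_ref_dosage(ds_reported, PLOIDY)

print(true_ref, ref_from_full, ref_from_reported)
```

With these numbers the reference dosage imputed from the incomplete set is larger than the true value, because the dose of the unreported allele is absorbed into it.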
There are a few options:
- Make DS an option for `mchap call` but not `mchap assemble`, because `mchap call` always reports the full posterior
- Normalise dosage for reported alleles in `mchap assemble` before discarding the reference allele: the resulting values will no longer match the AFP field
- Break the 'standard' and report the reference allele value: may cause downstream issues
- Don't make DS an option: may lose ease of compatibility with downstream tools
The first option is probably the best.
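For comparison, the normalisation option could look roughly like the sketch below. It assumes that AFP holds the posterior allele frequencies of the reported alleles; the input values and the function name are hypothetical, not MCHap's implementation.

```python
PLOIDY = 4

# Hypothetical AFP-style values for the reported alleles only (reference first).
# They sum to less than 1 because infrequent alleles were excluded.
reported_afp = [0.4, 0.3, 0.1]


def normalised_alt_dosages(freqs, ploidy):
    """Rescale reported frequencies to sum to 1, convert them to dosages,
    then drop the reference allele (index 0)."""
    total = sum(freqs)
    dosages = [f / total * ploidy for f in freqs]
    return dosages[1:]


ds_normalised = normalised_alt_dosages(reported_afp, PLOIDY)
ds_unnormalised = [f * PLOIDY for f in reported_afp[1:]]

# After normalisation, ploidy - sum(alts) recovers a consistent reference
# dosage, but the DS values no longer equal AFP * ploidy.
ref_imputed = PLOIDY - sum(ds_normalised)
print(ds_normalised, ds_unnormalised, ref_imputed)
```

The normalised dosages are internally consistent (reference plus alternates sum to the ploidy), which is the benefit of this option, while the mismatch with AFP is its cost.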