The count of rows in metadata_for_augur_build_v3 is greater than the count returned by select count(*) from warehouse.sample where identifier is not null. As of now, they're 34557 and 34468, respectively.
The two counts match if you count distinct strains in the view instead:
select count(distinct(strain)) from shipping.metadata_for_augur_build_v3
There should probably not be duplicates for this view. The following join likely introduces duplicates:
left join shipping.incidence_model_observation_v2 on sample.identifier = incidence_model_observation_v2.sample
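One way to confirm which strains are duplicated is to group the view by strain. This is only a sketch using the view and column named above; it assumes strain is meant to be unique per row.

```sql
-- List strains that appear more than once in the view, most duplicated first.
select strain, count(*) as n
from shipping.metadata_for_augur_build_v3
group by strain
having count(*) > 1
order by n desc;
```

If the join above is the cause, each strain returned here should correspond to a sample with more than one row in shipping.incidence_model_observation_v2.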
Did some more digging into this.
The incidence model observation views seem to have duplicates for encounters that are linked to multiple locations (i.e. both residence and lodging locations) due to this bit:
select encounter_id, hierarchy->'tract' as residence_census_tract
from warehouse.encounter_location
left join warehouse.location using (location_id)
where relation = 'residence'
or relation = 'lodging'
Ah, nice digging. I think the appropriate thing is to prefer residences but fall back to lodging, so conceptually a coalesce on it (but it could be a reducing aggregation in practice).
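A rough sketch of the "prefer residence, fall back to lodging" idea, using PostgreSQL's distinct on to keep one row per encounter. It assumes relation is a column on warehouse.encounter_location, as the unqualified reference in the snippet above suggests, and it has not been run against the warehouse.

```sql
-- Keep exactly one location row per encounter.
-- The order by sorts residence rows ahead of lodging rows,
-- so distinct on picks the residence when both exist.
select distinct on (encounter_id)
    encounter_id,
    hierarchy->'tract' as residence_census_tract
from warehouse.encounter_location
left join warehouse.location using (location_id)
where relation in ('residence', 'lodging')
order by
    encounter_id,
    case relation when 'residence' then 0 else 1 end;
```

Note that this is not a true coalesce: if an encounter has a residence row whose tract is null, that row still wins over a lodging row with a tract. If falling back in that case is wanted, a null check on the tract could be added to the order by ahead of the relation ranking.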
(See original Slack conversation for context)