Develop an N3C mapping metadata schema to support downstream reproducibility and analyses #37
Labels
Harmonization & Analytics
Issues which involve both Data Ingestion & Harmonization & Analytics workstreams
Per @hlehmann17 in a team meeting: Just a thought that this problem has the flavor of the derived data we discussed for the OMOP mapping, where I thought we agreed to bump derivations to the Palantir phase. The “indices” that Harold mentioned, as well as computable phenotypes and cohort definitions, need a place between DI&H and analysis.
Could we record a mapping type, version, or method along with the source data, i.e. metadata about the maps? I am advocating for making all these definitions explicit and for helping the analysts organize themselves for transparency, reproducibility, and reuse. The goal is not to build more data but to preserve traceability. We are doing a data harmonization step, and what we have done must be transparent and reproducible. So perhaps that is the key goal: can someone else verify and reproduce whatever we do to the data to get it ready for analysis? This is a subset of the derived data problem.
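To make the proposal concrete, here is a minimal sketch of what one mapping metadata record might look like. Nothing here is an agreed N3C schema: the class name `MappingMetadata`, every field name, and the example values are hypothetical placeholders, chosen only to show a mapping type, version, and method stored next to a pointer to the source data. The fingerprint is one possible way to let someone else check that they are reproducing against the same map definition.

```python
from dataclasses import asdict, dataclass, replace
import hashlib
import json


@dataclass(frozen=True)
class MappingMetadata:
    """Hypothetical record describing one source-to-target mapping."""

    source_field: str      # where the value came from (placeholder naming)
    target_field: str      # where the harmonized value lands
    mapping_type: str      # e.g. "direct" or "derived" (illustrative values)
    mapping_version: str   # version of the map definition that was applied
    method: str            # free-text or coded description of how it was done

    def fingerprint(self) -> str:
        """Return a SHA-256 hex digest of the record's canonical JSON form."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


# Illustrative values only; these are not real N3C tables or fields.
example = MappingMetadata(
    source_field="site_a.labs.result_code",
    target_field="harmonized.measurement.concept",
    mapping_type="derived",
    mapping_version="1.0.0",
    method="lookup table plus unit conversion",
)

# Bumping the version yields a different fingerprint, so any analysis that
# stores the fingerprint can tell which map definition it was run against.
bumped = replace(example, mapping_version="1.1.0")
print(example.fingerprint() != bumped.fingerprint())
```

Storing a record like this alongside each harmonized dataset would not create new data; it would only describe what was done, which matches the traceability goal above.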