Meeting minutes

Michael Kallfelz edited this page Jun 26, 2023 · 92 revisions

This is a collection of meeting minutes. The latest entries are at the top.

Monday, June 26, 2023 - wrap up meeting

  • Use sofa.sql and rebuild for the OMOP CDM dataset
  • Finalize PhysioNet project
  • Create a dump or, if that is not possible, find another means of transfer to PhysioNet

Monday, Apr 3, 2023 - Bi-Weekly requirements meeting

Attendees - Andrew, Mik, Tom

  • MIMIC @ Tufts also to combine with work around SSSOM for (custom) mapping
  • Tufts planning to convert the MIMIC ETL to work on MS SQLServer (OHDSI SQL)
  • Mik to review Polina's finding of unmapped items against the latest metrics
  • find a time for UAT - May 4th 8am ET
  • compare with Rishi/Alistair work on MIMIC IV

Monday, Mar 20, 2023 - Bi-Weekly requirements meeting

Attendees - Andrew, Mik

  • ETL for MIMIC v2.2 has been run, DQD (v1.4.1, the json is not readable by newer versions) and Achilles were run, the GCP BigQuery schema is connected to Atlas => the UAT can commence again
  • we hear reports that there are huge gaps in mapping ditems to concepts even for high frequency concepts

we will run metrics to check this! if a large number of new ditems had been added, this would explain it (as no new mapping was introduced)

  • The OMOP CDM can be transferred to PhysioNet after UAT, and the publication for the full version can then be finalized
  • We are asking Anna to
  1. refresh github with her latest changes (code and mapping)
  2. run the metrics
  3. maybe adjust the instructions to reflect the latest version of the logic / procedure for running the ETL

Monday, Jan 23, 2023 - Bi-Weekly requirements meeting

  • UAT still stalled because another ETL has not been run
  • if another ETL run is not successful, the entire operation will be shifted to the Tufts environment including UAT
  • access to the OMOP'ed MIMIC dataset within the Tufts environment must be governed according to PhysioNet access restrictions

Monday, Dec 9, 2022 - Bi-Weekly requirements meeting

Attendees - Gigi, Andrew, Mik

  • UAT document being revised
  • Polina (and Kyle) to support in the UAT, get 80% adherence to the IAC publication
  • current obstacle: Ventilation mode value "Standby" has not been transferred into the OMOP CDM
  • possible cause: ETL has been run on an outdated source
  • re-run the ETL and check the results
  • in parallel Tufts gets access to the MIMIC IV source data and transfers it to their environment, adopts the existing ETL
  • adjustments to the ETL should go into a separate branch / folder to reflect changes in a non-GCP environment

Monday, Oct 31, 2022 - Bi-Weekly requirements meeting

Attendees - Tom, Andrew, Mik, Abdulrahman, Manlik

  • Atlas is finally up again and UAT can be concluded
  • CHORUS initiative would like to have the OMOP CDM dataset for demo versions
  • waveform representation for CHORUS - learn from Manlik's work
  • phenotype development for QA, start with a MIMICinOMOP (at Tufts?) instance for building those?
  • access restrictions to OMOP'ed MIMIC content as of PhysioNet
  • conclude project (and meetings) and finalize v1.0 PhysioNet publication

Monday, Oct 3, 2022 - Bi-Weekly requirements meeting

Attendees - Mik, Abdulrahman, Manlik

  • Latest ETL is now in Atlas, but the Atlas instance was outdated and had to be upgraded (still pending)
  • Manlik has converted MIMIC IV waveform to an OMOP CDM, applied data compression techniques, data reduction, created a local vocabulary, used source value field to keep information about waveform file
  • add to PhysioNet publication v1.0: mappings, new findings in waveform integration

Monday, Sep 19, 2022 - Bi-Weekly requirements meeting

Attendees - Mik, Abdulrahman, Andrew

  • ETL with MIMIC IV v2.0 and new mappings has been performed
  • DQD results look pretty acceptable, spot checks of failed messages still look like the results should be okay
  • inputevents are still missing to enrich the drug exposure
  • care site / location: have we fixed the problem recorded in November 2021? (probably not, would require an enhancement in Atlas => are there vocabulary / mapping related improvements?) Research question: can we identify patients we can calculate SOFA scores for that are on ICU?
  • Reproducing the IAC study might now have become easier with better measurement mappings, inferring that a patient is on a ventilator from the presence of a ventilator-related measurement (15 possible values for that in the data):
concept_id | concept_name | measurement_source_value
---------- | ------------ | ------------------------
21490854 | Tidal volume Ventilator --on ventilator | 224684
3022875 | Positive end expiratory pressure setting Ventilator | 50819
3017594 | Tidal volume.spontaneous --on ventilator | 224421
40760766 | Tidal volume.inspired maximum setting Ventilator alarm | 223874
3004921 | Ventilation mode Ventilator | 223849
3012410 | Tidal volume setting Ventilator | 50826
3004921 | Ventilation mode Ventilator | 50828
21490854 | Tidal volume Ventilator --on ventilator | 224685
40762903 | Apnea interval Ventilator alarm | 223876
3025439 | Airway pressure Ventilator --at peak inspiratory flow maximum setting | 223873
3022875 | Positive end expiratory pressure setting Ventilator | 220339
3007469 | Breath rate setting Ventilator | 224688
3017594 | Tidal volume.spontaneous --on ventilator | 224686
36303946 | Pressure.plateau Respiratory system airway --on ventilator | 224696
3000461 | Pressure support setting Ventilator | 224701
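
As an illustration of the ventilator inference described above, a minimal Python sketch could flag every person who has at least one measurement with one of the listed concept IDs. The concept IDs are copied from the list above; the function name, the in-memory row format and the sample rows are assumptions made for this example and are not part of the project's ETL.

```python
# Concept IDs copied from the ventilator-related list in these minutes
# (15 source values collapse to 11 distinct target concepts).
VENTILATOR_CONCEPT_IDS = frozenset({
    21490854, 3022875, 3017594, 40760766, 3004921, 3012410,
    40762903, 3025439, 3007469, 36303946, 3000461,
})


def persons_on_ventilator(measurements):
    """Return person_ids with at least one ventilator-related measurement.

    `measurements` is an iterable of dicts that carry the OMOP MEASUREMENT
    columns person_id and measurement_concept_id (hypothetical in-memory form).
    """
    return {
        row["person_id"]
        for row in measurements
        if row["measurement_concept_id"] in VENTILATOR_CONCEPT_IDS
    }


# Made-up sample rows; 999 stands for any concept outside the set.
rows = [
    {"person_id": 1, "measurement_concept_id": 3004921},
    {"person_id": 2, "measurement_concept_id": 999},
    {"person_id": 3, "measurement_concept_id": 21490854},
]
print(sorted(persons_on_ventilator(rows)))
```

In practice the same filter would more likely be expressed as a concept set in Atlas or as a SQL predicate on the measurement table; the sketch only shows the logic.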

Monday, June 27, 2022 - Bi-Weekly requirements meeting

Attendees - Mik, Abdulrahman, Tom, Dana, Xavi

  • MIMIC IV v2.0 is out and can be accessed from BQ: mimiciv_hosp and mimiciv_icu
  • Information / release notes is here
  • waveform (numeric measurement) mappings have been added to the shared folder => determine if they can be used for the current MIMIC IV v2.0 setup or how they are used together with waveforms (consult with Manlik)
  • merging mappings with existing mapped concepts to prepare for a new ETL run
  • inputevents still to be decided if it can be added
  • after finalizing, fix and publish the fulldata publication including all the other contributors (see attendee list)

Monday, June 13, 2022 - Bi-Weekly requirements meeting

Attendees - Mik, Abdulrahman, Tom, Manlik, Dana

  • OHDSI vocabularies (latest release) are available in ODS BQ
  • V2 of MIMIC IV has been released, but not yet published to PhysioNet BQ
  • once available, please add Mik and Anna as users
  • Anna and Mik to work on a new ETL run (small adjustment for core > hosp schema) with the new vocabularies and mappings
  • inputevents to be reviewed if they can be added / what has been changed
  • target a paper to illustrate what has been done and how it can be used, recognition of all participants (e.g. mapping process, building of concept sets)

Monday, May 16, 2022 - Bi-Weekly requirements meeting

Attendees - Mik, Abdulrahman, Tom, Dana, Anna

  • ODS has the new vocabularies (including new LOINC) and will make them available in BQ
  • PN (Tom) provides access to public release of MIMIC IV 2.0 in BQ (add users for Anna and Mik)
  • ODS converts mappings and adds them to github and BQ, then processes ETL with new vocabularies, mappings and MIMIC IV v2.0
  • timeline: vocabularies and mappings until end of this week, ETL until end of next week

Monday, May 2, 2022 - Bi-Weekly requirements meeting

Attendees - Mik, Abdulrahman, Tom, Dana

  • version 2.0 still in review
  • mappings are finalized and will be shared by G-Drive
  • Mik reviews the final mappings
  • Anna to include mappings in repository and BigQuery, then fire up ETL once version 2.0 is available and use the improved mappings
  • labitems: LOINC codes 99603-3 & 99604-1 are from v 2.72 (not yet OMOP'ed)

Monday, April 4, 2022 - Bi-Weekly requirements meeting

Attendees - Mik, Abdulrahman, Xavier, Dana

  • version 2.0 is now in final review
  • add to remaining mappings: procedures (mainly mapped to SNOMED), chartevents (mapped as measurements to LOINC/SNOMED)
  • eventually look into "Values Unmapped for chartevents" in G-Drive for "observation style" items
  • outputevents are currently ignored: is there a use case? if so, define priority, map to possible targets (measurements?); maybe determine a couple of valid targets like "urine output", "blood collection output", "surgical drain output", "CSF drainage output", "GI fluids output"...
  • inputevents are not part of the ETL yet, but mapping to medication concepts exists and should be included in the next ETL

Monday, March 21, 2022 - Bi-Weekly requirements meeting

Attendees - Mik, Abdulrahman, Tom, Xavier, Anna, Sicheng

  • MIMIC v2.0 ETA unchanged, work in progress
  • lab item mapping in progress, unmapped items of interest to be identified and mapping to be attempted; later, counts could be generated to rule out that we have missed any high-count item
  • refresh vocabulary with Synthea script (preserve 2 billion custom mapping in separate step)

Monday, March 7, 2022 - Bi-Weekly requirements meeting

Attendees - Mik, Abdulrahman, Tom, Dana, Xavier, Manlik

  • mapping with correct item_ids is now available
  • this mapping however is aligned with MIMIC IV v2.0 which is due for release within the next 2 months (Tom to reach out once available)
  • demo and full release may happen at different times
  • A new ETL run only makes sense if both the mapping and the new MIMIC version are available.
  • a new version of bio-signal / waveform data is being developed, now linkable to clinical data (by header files with subject id)
  • mimic_derived schema: add more derived mappings inspired from the OMOP conversion such as lab_item to LOINC, medication items to RxNorm
  • eventually include the concepts created by Manlik's waveform extraction and annotation process (Manlik and Dana)
  • consider SSSOM as a mapping format
  • github repository for mappings => action items: ODS - fine tune ETL logic, ingest new mappings, vocabulary refresh, wait for new version / MIT - release new version, collaborate with Manlik on waveform annotation concepts

more input by Andrew:

  1. multi site study (CRITICAL) with ICU data mapped to OMOP CDM, gaps detected that "cannot be mapped" => being collected to inform the community about possible shortcomings, how to bridge both efforts (Andrew / Tom)?
  2. create more metadata (and a place for it), e.g. about annotation process to make a difference between automated and manual human annotation
  3. identification of "relevant" parts of waveform data (epochs) by either other clinical event (e.g. drug exposure) or particular waveform changes (e.g. VT) to help segmenting and eventually reducing size, allowing for higher resolution for interesting parts

Monday, February 21, 2022 - Bi-Weekly requirements meeting

Attendees - Mik, Gigi, Abdulrahman, Anna, Xavier (BCN Hospital Clinic)

  • item_id's have been shifted again with the latest MIMIC update
  • Abdulrahman clarifies how this can be prevented and provides a fixed list that could be used for another ETL run

Monday, February 7, 2022 - Bi-Weekly requirements meeting

Attendees - Mik, Manlik, Abdulrahman, Anna, Dana

  • LOINC mapping by Abdulrahman is complete, however this is still based on the shifted item_id's
  • Dana and Abdulrahman work together to fix the LOINC mapping table to reflect correct itemids
  • Manlik's waveform signal processing is still ongoing to produce annotations of waveforms
  • Manlik could provide a description of the requirements (hardware, processing time) around the entire process

Monday, January 24, 2022 - Bi-Weekly requirements meeting

Attendees - Mik, Gigi, Abdulrahman, Anna, Tom, Dana

  • run one more ETL with completed / fixed mappings? upload mappings to github after that?
  • Mik would run DQD, Konstantin Achilles
  • mapping efforts: remove LOINC codes from next MIMIC release but limit to d_item / lab_item to target concept mapping... LOINC mappings will be preserved in a "derived" table, LOINC codes coming from the source should be looked at in particular (and preserved with a special flag?)

Monday, January 10, 2022 - Bi-Weekly requirements meeting

Attendees - Mik, Dana, Gigi, Abdulrahman, Anna, Sicheng, Andrew

  • shift in d_items: Dana keeps on analyzing
  • admission / discharge locations: check against domain = visit, also check gcpt_vis_admission for proper domains (some are observation, class location: are they used for care_site?)
  • measurements: Abdulrahman continued reviewing and fixing the mappings, stores results in working folder!

Monday, December 13, 2021 - Bi-Weekly requirements meeting

Attendees - Mik, Tom, Dana, Gigi, Abdulrahman, Anna

  • inputevent-medications look good, store in mappings in github repository and address code extension to include inputevents (Anna: best approach?, Dana offers help)
  • labevents show a mismatch / shift in d_items: issue #17
  • admission / discharge location: mapped to SNOMED? no, but rather other "OHDSI" standards (review CSV files)

Monday, November 29, 2021 - Bi-Weekly requirements meeting

Attendees - Mik, Tom, Dana, Gigi, Abdulrahman, Anna

  • ventilator start / stop time coding is in the MIMIC code repository and could be used to identify durations of ventilation
  • try another UAT with Andrew and Gigi: possible timeslot Monday 6th between 10-14:00 ET
  • Abdulrahman worked on inputevent-medications => great progress, Mik to review
  • Dana and Abdulrahman to review lab event mappings

Monday, November 15, 2021 - Bi-Weekly requirements meeting

Attendees - Mik, Tom, Andrew, Dana, Sicheng, Abdulrahman, Manlik, Brian

  • visit occurrence can only hold one care site: does source data hold multiple care sites per one "visit" / how does MIMIC hadm_id translate into OMOP visits => it becomes one visit occurrence, a transfer would lead to a new visit detail entry (Atlas cannot deal with visit detail)
  • use mimic_core.transfers and the transfer_id to generate visit occurrence entries (with a care site)?
  • MIMIC team provides clearer explanation of hadm_id vs. transfer_id => stick with current ETL logic, hope for a better ATLAS that can deal with visit_details!

Monday, November 1, 2021 - Bi-Weekly requirements meeting

Attendees - Mik, Tom, Gigi, Andrew

  • Dana Moukheiber reached out to Anna: source of concept mappings / suggestions for mapping remaining concepts => continue mapping and fine tuning (related to NW University project)
  • UAT continued, stuck a little bit on identifying ICU patients as the first step => Mik reach out to Anna - create guideline about representation of facts in OMOP (e.g. not necessarily use a CPT code but rather a specific visit occurrence)
    • identify patients on a ventilator
  • Brian Gow to take over responsibility and management of MIMIC at PhysioNet
  • Manlik: annotation process is still ongoing
  • Andrew follows up with Tom: CHORUS group - cross institutional project (waveform related, maybe OMOP transformation), high resolution data "V"

Monday, October 18, 2021 - Bi-Weekly requirements meeting

Attendees - Mik, Sicheng, Brian

  • UAT had to be cancelled (sigh) due to scheduling conflicts, new time and date to be agreed on (Mik to reach out to Andrew and Gigi)
  • Gigi worked on concept sets to support UAT (fixed an issue with wrong ICD9 Proc mappings in the original)
  • Manlik to complete github waveform tools section
  • Manlik starting the annotation run today => identify every beat / QRS in EKG (and BP) data, generate images
  • still no real information exchange between Ben and Manlik established
  • inputevents - what is missing and what has to be done: Mik and Anna write it up!

Monday, October 4, 2021 - Bi-Weekly requirements meeting

Attendees - Mik, Manlik, Andrew

  • change to bi-weekly (starting with this one)
  • UAT testing had to be postponed to this week
  • full PN publication still to be moved forward (Mik)
  • any follow-up project opportunities? maybe in context of U01?
  • OMOP on FHIR part of U01 (part of ETL as well as querying OMOP from the outside?)
  • CDM proposal for linkage of objects (such as waveforms)
    • new datatype accommodation beyond the capacity for an OMOP CDM
    • Manlik: use a two step approach, first identify the cohort patients, then extract the waveform information into a separate (temporary) OMOP CDM for these patients (Andrew does not approve of this approach :-) )
  • CHORUS - what is the benefit of OMOP'ing waveforms?
    • how can we preserve feature vector information and keep that information in an OMOP digestible format? (without storing the entire vector raw data)

Monday, September 27, 2021 - Weekly requirements meeting

Attendees - Gigi, Mik, Brian, Manlik, Tom, Andrew, Dana, Sicheng

  • UAT testing 2 hours Sep 30 - 3 pm ET (Gigi and Andrew)
  • Publication for full MIMIC to be addressed (Mik & Vojtech)
  • ETL additional documentation to be uploaded
  • Combine Tufts & MIMIC data into an OMOP CDM? => in context of U01 grant "critical"
  • after UAT, switch to bi-weekly or monthly touchpoints to discuss upcoming topics
  • after publication of project, PhysioNet provides access to data (consistency check needed before)
  • other PhysioNet ICU datasets to be transformed to OMOP CDM?
  • Tom to help Ben finalizing the waveform related datasets (date shifts, DST time change events)
  • Andrew shared - critical care data exchange format: CCDEF / paper / tooling, influenced by CHORUS group

Monday, September 20, 2021 - Weekly requirements meeting

Attendees - Gigi, Mik, Brian, Manlik, Tom, Dana

  • Tom has transferred datasets to PhysioNet BQ (TADA!)
  • Ben and Manlik to discuss possible time shift issues in wfdb files, Manlik's approach
  • UAT testing 2 hours Sep 29 - 9:30-11:30 am ET (Gigi and Andrew, hopefully)

Monday, September 13, 2021 - Weekly requirements meeting

Attendees - Gigi, Mik, Brian, Manlik

  • Symposium poster due the following two days
  • UAT testing to be reinitiated week of Sep 27 (Gigi, Andrew)
  • project to be concluded, Atlas available roughly until Christmas
  • demo and full cdm to be transferred to PhysioNet:
    • option 1: activate Data Transfer Service and allow access (admin rights) => creates costs
    • option 2: package each table of the dataset into an AVRO file, re-import into a dataset at PhysioNet
    • other options? Brian will check if we have other easier options
  • Brian / Tom to reach out to Ben and ask about his requirements in regard to collaboration with Manlik
  • Manlik is close to reaching phase 2: chunks of 10 minutes in XML format, to be fed to the annotation logic, identify segments of data, extract more finely granular data to a separate place / dataset / files?
  • Define OMOP CDM - waveform practical use / approach: patient has experienced VTach during a period of time (indicated by very high frequency), link to respective waveform, get to the episode in question through trending data that has been consumed as measurements in the OMOP CDM (with a given resolution), dynamic filter for trending data to increase / decrease resolution depending on variation of values - Manlik researches around that request

Monday, August 30, 2021 - Weekly requirements meeting

Attendees - Gigi, Mik, Brian, Jeff, Tom, Vojtech

  • Ben and Manlik to meet in the near future to discuss collaboration opportunities
  • Mik checked mapping gaps from Sicheng's files: apart from known missing input/outputevents not too many missing
  • UAT: should be continued, Andrew would be useful in this
  • issue list in progress (github issues as a starting point to collect)
  • ETL documentation to go to github
  • final version of full content on PhysioNet - AVRO files to storage bucket / direct access to Odysseus dataset => import to PhysioNet
  • Vojtech: SQLite/Andromeda era date problem on import (future date offset?), Vojtech to explore possible solutions

Monday, August 23, 2021 - Weekly requirements meeting

  • Ben Moody is preparing a public release of the MIMIC IV waveforms; additional different format, adopting Manlik's approach (xml) => timeline unknown at this point
  • Ben and Manlik to meet to discuss possible collaboration steps / save time and effort? => Manlik reaches out to Ben (cc Tom)
  • inputevents can unfortunately not be processed at that time because of lack of resources
  • put together open issue list for continuation / follow-up project / effort estimation
  • Mik to double check mapping status of itemids as of Sicheng's files => prepare for additional UAT steps

Monday, August 16, 2021 - Weekly requirements meeting

Attendees - Gigi, Manlik, Mik, Sicheng, Anna

  • Atlas back online
  • UAT to be scheduled / continued: Wednesday Aug 18, 10-12
  • OHDSI poster to be started / reviewed
  • drug representation (e.g. for sympathomimetics) low because inputevents not used for that. However, there is only little overlap with the prescriptions / pharmacy tables from the hosp module. We are therefore missing the majority of medication events for ICU patients => focus on converting the 4 top sympathomimetics from inputevents, later extend to more medication...
  • 340000 patients in DB, only some of them are ICU patients => find number of patients with actual ICU stay (Anna) / how is it represented in the current OMOP CDM? visits?
  • Sicheng: in MIMIC look up table icustay to identify ICU patients (hadmid)
  • waveform processing is ongoing, amount of data is massive and transformation of extracted values to actual measurements would most probably exceed the OMOP CDM capabilities
  • POC in waveform extraction is done for 4 patients and varying numbers of actual measurements
  • data reduction / aggregation algorithms to process resolution of measurements (e.g. besides a regular period such as 1 hour, only produce a new measurement if there is a substantial value difference, maybe if > 20%)
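
The data reduction rule in the last bullet could be sketched as follows. This is only one possible reading of it: a sample is kept if the regular period has elapsed since the last kept sample, or if its value differs from the last kept value by more than the relative threshold. The function name, the tuple format and the sample values are assumptions for illustration; the 1 hour period and the 20% threshold are the examples mentioned in the minutes.

```python
def reduce_measurements(samples, period_s=3600, rel_change=0.20):
    """Thin out a time-ordered series of (timestamp_seconds, value) samples.

    A sample is kept when at least `period_s` seconds have passed since the
    last kept sample, or when its value deviates from the last kept value by
    more than `rel_change` (relative difference).
    """
    kept = []
    for ts, value in samples:
        if not kept:
            kept.append((ts, value))  # always keep the first sample
            continue
        last_ts, last_value = kept[-1]
        period_elapsed = ts - last_ts >= period_s
        if last_value != 0:
            changed = abs(value - last_value) / abs(last_value) > rel_change
        else:
            changed = value != 0  # any move away from zero counts as a change
        if period_elapsed or changed:
            kept.append((ts, value))
    return kept


# Made-up heart-rate-like samples: (seconds since start, value)
samples = [(0, 80), (600, 82), (1200, 100), (1800, 101), (4800, 102)]
print(reduce_measurements(samples))
# → [(0, 80), (1200, 100), (4800, 102)]
```

Comparing against the last kept value (rather than the previous raw sample) means a slow drift is still recorded once it accumulates past the threshold.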

Monday, August 9, 2021 - Weekly requirements meeting

Attendees - Gigi, Manlik, Mik, Jeff

  • funding can probably be provided soon (administrative steps)
  • let's try re-starting UAT this week
  • no waveforms for ED patients
  • start preparing poster for OHDSI symposium based on UAT results and overall findings

Gigi away 22-26.08.

Monday, August 2, 2021 - Weekly requirements meeting

Attendees - Tom, Gigi, Manlik, Brian, Mik, Jeff

  • MIMIC poster for OHDSI symposium has been accepted
  • Funding for ATLAS (4 months) can be provided by Jeff
  • last meeting was about opportunities for more OHDSI / PhysioNet collaboration (Tom provides minutes)
  • Gregory Mason from UCLA inquired about representation of pulse oximetry waveforms in the ED portion of MIMIC IV => should be able to extract this information from a well processed OMOP CDM (visits in ED and visit details with pulse oximetry measurements).
  • UAT reloaded: Mik to check with Greg on ATLAS availability timeline, then schedule time with Gigi, Tom, Jeff, Andrew to re-start the UAT as planned earlier, separate step to review waveform representation with Manlik, compile a report (including Achilles, DQD and actual ATLAS UAT results).
  • Manlik has processed around 40% of the waveform data (10 minute segments), normalizing values. Up to now the size is about 8 TB of data. This does not include png images of the waveform (which would double the size). => On the fly rendering from existing XML format? Manlik could provide a java library for building a (web-)tool.
  • Trending data (converted to mapped OMOP csv format) processing is completed. Tools exist to create averaging values out of the now high-frequency trending data (e.g. measurement every other minute).
  • MJ to contribute his improvements to a separate branch. Anna to review and merge.

on leave

  • Jeff: 16 + 23 August

Monday, July 12, 2021 - Weekly requirements meeting

Attendees - Tom, Gigi, Manlik, Brian, Sicheng, Vojtech

  • Vojtech: downloaded data from the PhysioNet demo project. Had some difficulty importing data into SQLite with "Andromeda" due to handling of dates. Looking into it.
  • Tom: should make a note on the MIT-LCP MIMIC-III OMOP GitHub repository to let people know about the new MIMIC-IV transform.

Monday, June 28, 2021 - Weekly requirements meeting

Attendees - Mik, Gigi, Tom, Andrew, Jeff, Brian

  • next meeting to be skipped (national holiday)
  • 2K USD would keep Odysseus ATLAS running for 4 months: Andrew and Jeff look for respective funding / maybe split
  • vacation period:
    • Mik: July 12 - July 30
    • Andrew: Aug 1 - Aug 17
    • Tom: (some parental leave in September)
    • Gigi: mid August
    • Jeff: a good part of July, one week offsite in August
  • encourage people to run tests (!) on the PhysioNet Demo OMOP CDM?
  • highlight the "beta" status of the current publication => news items / associated with project - Tom "early development version, comments / feedback welcome, testing is underway, new version is planned"
  • commence UAT once ATLAS is back: perform UAT during regular weekly meeting
  • once UAT is completed, switch the meeting to biweekly
  • NIH/Clem McDonald & Regenstrief: derived waveform features to become new LOINC concepts
    • start with already known WF features without a proper target in LOINC to prepare a submission to Regenstrief? => Manlik to investigate for more candidates, maybe 15-20 to start with?
    • establish a process for new waveform extracted features to become standardized terms (e.g. LOINC), e.g. what are the criteria supporting a selection as a candidate?
    • investigate the role of Machine Learning in the discovery of unknown features? => compare waveforms for patients with similar clinical data? sample use case: Patients having developed Afib => review/compare all previous EKGs to find possible predictors...
  • github collaboration for outside users: conventions / desired interaction to be defined - reviewer / actual person to be named, time required for this is obviously depending on frequency of use and number of requests

Monday, June 21, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Gigi, Tom, Andrew, Jeff, Sicheng, Brian (welcome!)

  • submission, working copy has been made for the OHDSI symposium
  • UAT: resuscitate Odysseus ATLAS with a little bit of funding and / or host local ATLAS: transfer entire CDM content to PhysioNet BigQuery or create csv directly from Odysseus GCP / download content from PhysioNet / observe access restrictions locally
    • Mik to create access for Tom to Odysseus dataset, Tom can pull the data to the PhysioNet instance
    • Mik to check with Anna about current waveform representation (do we need more tables?)
  • after UAT is closed, prepare full availability of ETL logic (so that the whole process can be reproduced elsewhere) as well as content
  • waveform: currently only selected samples, trending data is around 1TB of data (csv format) => convert to a separate table in BigQuery / PhysioNet, actual signal information much more (around 16TB, in 10min chunks, XML format, gzipped)
    • check with Anna about trending data format previously used
    • move trending data to a server at Tufts
  • Clem McDonald: vocabulary changes, maybe waveform related data? => Tom facilitates meeting, Jun-25, 9am ET?
  • discuss next time: how to involve non-MIMIC ICU data holders and discuss their particular questions / invite them to work with the OMOP CDM variant?

Monday, June 14, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Gigi, Tom, MJ, Jeff, Andrew

  • MJ presenting his work (the recording is here, pw = k$fR4BL^) => partition (on person id, how would this be applied to other databases such as postgre?, queries can become smaller and cheaper), outputevents 2 measurements, automatic separation between staging and final dataset
  • MJ reproduced (and improved) the full cycle of MIMIC ETL => obstacles: Python coding needs more structure, jumping between files was unclear (can we sort the files by name to indicate sequence?, put a readme in every directory such as scripts)
  • MJ tries to point out possible options where to improve guidance / make suggestions how to improve coding:
    • making things more modular
    • implement error handling
  • Existing BigQuery performance improvements in standard MIMIC?
  • pip install the python scripts? (requires creating a real package to be hosted at pypi)
  • Jeff showed automatic generation of function specific documentation from comments into SphinxDocs, could maybe be done in a similar way with generating documentation in readthedocs
  • How could the Rabbit-in-Hat ETL test framework (R) be employed for unit testing? => Andrew

Monday, June 7, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Andrew, Sicheng, Gigi, Tom, Vojtech

  • PhysioNet publication in progress
  • OHDSI Symposium submission? (focus: collaboration between research communities; advantages / disadvantages of respective approach)
  • Funding sources to continue working on completion of UAT...
    • standing up an ATLAS at Tufts or UF and host a local copy (create a csv export from BQ)
  • pull MIMIC IV full (and Demo) to PhysioNet
  • UAT @ PhysioNet: invite Leo Celi (Tom)
  • inputevents still need to be made accessible for the OMOP CDM conversion of drug exposures
    • possible duplication with prescription / pharmacy / emar tables
  • MIMIC IV web site: provide more background about source of data in the MIMIC IV structures
  • DQUEEN - data source description, information loss
  • documentation to be completed for current status of MIMIC2OMOP:
    • ETL description, add latest changes (e.g. itemid preservation)
    • waveform alignment
  • paired team approach: one PhysioNet, one OHDSI user - trying to use the OMOP CDM for "a regular scientific task" they have been doing before
  • invite Melanie P. to assess the completeness, usefulness and source to target approaches?
  • release v.0.9 (with the known issue of missing inputevents)?
  • next meeting: Jeff introduces MJ Gellada to present some of his work around MIMIC IV to OMOP
  • run the csv import of the Demo CDM (Tom + Mik)

Monday, May 17, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Andrew, Sicheng, Gigi, Anna, Vojtech

  • mapping improvements still in progress, now with better material! (Mik)
  • where to maintain more user-facing documentation:
    • wiki
    • readme.md files (Markdown)
    • github pages (Markdown)
    • read the docs
  • PhysioNet submission v.1 re-submitted
  • Create a draft document for best practice guideline
  • UAT (next try) on the following Wednesday
  • Publication strategy:
    • OHDSI symposium: prepare a submission
    • papers targeting technical approach / feasibility (ICU dataset, OHDSI research tools, BigQuery as platform...)?
  • Explore PhysioNet's motivation for OMOP CDM (Tom to send a brief definition)
  • return to waveform integration implementation: CDM extension, plans / research opportunities (Elsevier's BioInformatics?)

Monday, May 10, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Andrew, Tom, Sicheng, Gigi

  • mapping improvements in progress
  • extended mapping information for chartevents and labevents => build a best practice guideline for approaching a mapping exercise around measurements...
  • submission progress - Mik checks again with Vojtech
  • UAT tentatively next week: Monday, Tuesday, Wednesday (review Tom's example) / Sicheng to join the UAT
  • Revise documentation artifacts for use on github => Mik
    • Tailor documentation for OHDSI community as well as MIMIC community
    • Introduce people interested in ETL to OMOP and analysis of Intensive care data to the model
  • Later to be defined: Examples for Characterization, Prediction, Population Level Estimation => point out benefits as well as limits
  • PhysioNet plans to extend into multi-center studies? (via OMOP-CDM?)

Monday, May 3, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Andrew, Anna, Vojtech, Tom

  • mapping approach:
    • phase 1: use what we have now (curated mappings)
    • phase 2: check against itemids that had been previously used in research (Sicheng's file) and add what is missing
    • phase 3: create 2-billion concepts (non standard) waiting for volunteers to pick up and continue our mapping work => mark / leave out items that are definitely out of scope?
      • use additional information such as datatype (flag, string, int...) to identify meaning of MIMIC concepts
      • get context / permissible values for each itemid of interest => add 5 top ranked entries as additional columns
      • provide flags like "ambiguous" / "irrelevant" / ...
  • provide extended mapping information for chartevents and labevents
    • include artifacts and process definition in helping define a "best practice" mapping step
  • demo and full MIMIC ETL done, some issues with Achilles (SQL-Renderer)
  • demo / full submission draft: Mik and Andrew take one last look at it for the demo submission, Vojtech presses the submit button tomorrow
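
A minimal sketch of phase 3, assuming the usual OMOP convention that concept IDs above 2,000,000,000 are reserved for local (custom) concepts. The function name, the vocabulary label `mimiciv_custom` and the `target_concept_id` field are hypothetical placeholders for illustration, not the project's actual ETL code:

```python
# OMOP convention: concept_ids above 2,000,000,000 are reserved for local concepts.
CUSTOM_CONCEPT_BASE = 2_000_000_000


def build_custom_concepts(unmapped_items, start=CUSTOM_CONCEPT_BASE + 1):
    """Assign one non-standard custom concept per unmapped source itemid.

    unmapped_items: dict of {itemid: label} for items without a curated mapping.
    """
    concepts = []
    # Sort by itemid so that repeated runs assign the same IDs in the same order.
    for offset, (itemid, label) in enumerate(sorted(unmapped_items.items())):
        concepts.append({
            "concept_id": start + offset,
            "concept_name": label,
            "concept_code": str(itemid),        # keep the MIMIC itemid as the code
            "vocabulary_id": "mimiciv_custom",  # hypothetical vocabulary name
            "standard_concept": None,           # None = non-standard
            "target_concept_id": 0,             # hypothetical field: 0 = not yet mapped
        })
    return concepts


if __name__ == "__main__":
    for row in build_custom_concepts({220045: "Heart Rate", 223761: "Temperature Fahrenheit"}):
        print(row["concept_id"], row["concept_code"], row["concept_name"])
```

Volunteers could then replace the 0 target with a standard concept as mappings are curated, without changing the custom concept IDs.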

Monday, April 26, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Andrew, Anna, Tom, Gigi, Vojtech

  • include unknown mappings by mapping to 0 / itself => create 2 billion concepts for itemid entries for all source concepts?
    • create a full representation of MIMIC content in OMOP CDM?
    • create non mapped entries as non-standard?
  • discuss submission for full conversion
  • DQD vs. "metrics"? (Mik to discuss with Anna a format for publishing the metrics report)
  • go into more detailed explanation of mapping approach (2 billion for itemids, mapped to standard / percentage of non mapped entries)
  • Jeff provided input/outputevent itemids (of importance) => Mik to review possibility to include fluid balance information
  • body temperature in Fahrenheit and Celsius: convert to unit "Fahrenheit" or "Celsius"? => reserve the conversion step for this and more units by making use of a future OHDSI infrastructure for unit harmonization... / align with NIH (Clem McDonald) in LOINC & Unit harmonization
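
For reference, a sketch of what a later unit harmonization step for body temperature could look like. The conversion was deliberately deferred in this meeting, so this only illustrates the arithmetic. The accepted unit spellings and the function names are assumptions, not values taken from MIMIC:

```python
def fahrenheit_to_celsius(value_f):
    """Standard conversion: C = (F - 32) * 5 / 9."""
    return (value_f - 32.0) * 5.0 / 9.0


def harmonize_temperature(value, unit):
    """Return (value, unit) expressed in Celsius.

    The unit spellings checked here are hypothetical source strings.
    """
    unit_norm = unit.strip().lower()
    if unit_norm in ("f", "°f", "fahrenheit"):
        return round(fahrenheit_to_celsius(value), 2), "Celsius"
    if unit_norm in ("c", "°c", "celsius"):
        return value, "Celsius"
    # Refuse to guess for anything else, so unknown units surface in a review.
    raise ValueError(f"unknown temperature unit: {unit!r}")


if __name__ == "__main__":
    print(harmonize_temperature(98.6, "F"))
    print(harmonize_temperature(37.2, "Celsius"))
```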

Monday, April 19, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Andrew, Anna, Tom, Gigi

  • chartevents had been omitted in earlier logic, will now be included and itemid's preserved in the process
  • duplicates between chartevents and other source tables
    • vitals (e.g. Blood pressure), lab results (POC vs. central lab) => Tom provided link
  • include nursing documentation? (e.g. device exposure, assessments?) => is there demand?
  • previously used concepts of interest => Tom suggests to look at the code repository and pull itemids that are in use -> Tom tries to pull itemids
  • input / outputevents: priority itemids can be provided by Tom in the near future to build fluid balance information
  • immediate plan: provide a new ETL including chartevents, resume UAT (tentatively 10am ET, Thursday 29th)

Monday, April 12, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Andrew, Anna, Tom, Jeff

  • Exploring improvements around custom mapping: increase amount of unmapped items (chartevents)
  • create crosswalk for itemid and concept ID mappings (versioned) including human readable names => Anna
  • include input / outputevents: Jeff's Team provides itemid to Concept IDs mappings from previous work around Sepsis; Tom to look up relevant itemids for fluid balance <> make sure to identify the best place for iv meds and flow rates combined with eMAR/detail
  • Team up with Clem McDonald to improve itemid / LOINC mapping between both projects
  • strategy for submission of Demo and Full MIMIC IV conversion: keep them together, proceed to viable version asap

Monday, March 29, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Andrew, Anna, Tom, Jeff

  • drive future replacement of the visit detail solution for linking objects (waveforms, images, ...)
    • meet with Christian to find common ground
    • additional linking information in each measurement source value field?
  • visit types to best indicate specialties in (intensive) care / provider / care site
    • discussion with Christian / Claire to define best approach
  • dose era discussion: how to represent flow rates and administered dose over time
    • how to observe existing conventions vs. conventional multi-day use of dose era?
    • type concepts: more refined source types for drug administration and waveform data (trending vs. high res)
  • make use of qualifiers to indicate MEAN / AVERAGE as data reduction method - how to use observations vs. measurements (> CDM) here?
  • TODO: Mik/Andrew to complete submission work, Vojtech to finalize next week
  • Wednesday March 31st, 9am: reproduce IAC paper in ATLAS (record session!)

Monday, March 22, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Andrew, Vojtech, Gigi, Anna

  • Keeping the MIMIC to OMOP initiative alive:
    • Waveform related efforts: e.g. Roux Inst., OHDSI Cardiology WG (maybe led by Chan / Korea)? => initiate CDM extension?
    • Post Acute CoViD sequelae, possible grant
    • Engage with PhysioNet community (revive the OMOP lab), Northeastern, provide an ongoing platform for people to work with OMOP'ed MIMIC data (let's not forget: MIMIC authorizations limited by PhysioNet credentialing), educate PN about a possible extension of reach for their data use
  • finalizing documentation: review ETL specs, then publish in github
  • add to the readme, how (future) contributions will be handled
  • present the results at an OHDSI community call, invite PhysioNet to participate

Monday, March 15, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Andrew, Tom, Gigi, Anna

  • Manlik to provide delta for vitals mapping
  • Reproduce IAC setup and reconnect with Alistair (Mik to team up with Andrew and Gigi + student), probably calendar week 13 => sql code can be found here
  • try to sign on to the Atlas instance: after first successful sign on, the authorizations can be granted
  • Mik to drive submission with Vojtech
  • discuss future processes (maintenance, WG-style continuation of the project) next week

Monday, March 8, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Andrew, Tom, Jeff

  • schedule time with Vojtech to complete submission => Mik
  • Manlik's SignalMethod Document: send a list of vitals codes of interest to Tom for clear definitions => Manlik (@manlik-brownsrdr)
  • eventually use "news panel" at PhysioNet to collect community feedback
  • provide access to Odysseus ATLAS: Tom, Alistair, Manlik
  • Albert Einstein São Paulo - Adriano Perreira? Convert ICU data to OMOP - can share 100 patients (Cerner Millennium EHR?), review MIMIC github, maybe try the EHR workgroup / Melanie Philofsky, include Andrew & Robert Miller

Monday, March 1, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Gigi, Andrew, Tom, Jeff

  • Remove Waveform discussion from Demo submission v1.0 and add it back with v1.1
  • add to the Demo submission as indicated in the central notes document
  • Atlas session, rescheduling
  • review Manlik's SignalMethod document (add comments to PDF): Mik, Vojtech, Tom - deadline March 8th!
  • pick up conversation with the CDM team about: visit concepts, linking to an external object (file, binary), dose_era information derived from flow rate information / administered dose or volume => use cases
  • Age calculation: difference between MIMIC III and IV? See also.

Monday, February 22, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Gigi, Andrew, Tom, Vojtech

  • Christian Reich opposes use of visit detail for waveform linking
    1. For the POC remain with the chosen model
    2. introduce a new model that represents a link to an external object (biosignal data, imaging data) and also connects to derived information, e.g. measurements
    3. use fact relationship to represent links between waveform / image source (e.g. as observation or procedure) and other objects
  • standard concepts in visit domain underrepresent department information (e.g. Neurology) => a) workaround by capturing information separately b) introduce (new) standard visit concepts [Forum] -> Mik
  • age calculation could be wrong (age at first observation between -1 and +1) - anchor age at first admission, then compute age at later admissions
  • Number of persons: 340.02k - taken from core, but some may not have passed through the ICU (only hosp, ED) => we have probably taken all patients, not only ICU (see condition: single live birth, medications: newborn vitamins) => Tom follows up on documentation of what set of patients is in MIMIC (and where)
  • schedule Atlas session to create concept sets and analyze cohorts, Mik sends doodle
  • once UAT passed, give demo / presentation to PhysioNet lab staff
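
The age fix noted above (anchor the age at the first admission, then derive the age at later admissions) could be sketched as follows. This assumes the age at first admission is already known and counts whole years elapsed since then. Because the birthday itself is not used, the result can differ from the true age by up to one year. All names are hypothetical:

```python
from datetime import date


def ages_at_admissions(age_at_first, admission_dates):
    """Return {admission_date: age}, anchored at the earliest admission.

    Age grows by one for every full year elapsed since the first admission.
    """
    ordered = sorted(admission_dates)
    first = ordered[0]
    ages = {}
    for admitted in ordered:
        # Subtract one if the anniversary of the first admission
        # has not yet been reached in the admission year.
        before_anniversary = (admitted.month, admitted.day) < (first.month, first.day)
        ages[admitted] = age_at_first + (admitted.year - first.year) - before_anniversary
    return ages


if __name__ == "__main__":
    visits = [date(2130, 5, 1), date(2132, 4, 30), date(2132, 5, 1)]
    for admitted, age in ages_at_admissions(60, visits).items():
        print(admitted, age)
```

Anchoring once per person keeps the age non-decreasing across visits, which avoids the -1 / +1 spread observed at first observation.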

Monday, February 15, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Anna, Gigi, Andrew, Vojtech

  • Waveforms - 3 sample implementations:
    1. mass data with 8k rows from 11 richest days in Manlik's big file,
    2. the summarized 5-min intervals data,
    3. tiny portion of trending data, where approximation method is 'None'
  • waveforms - Manlik sent 5 more processed waveform measurements - to be used in ETL; Manlik provides some matching sample XMLs (not too large)
  • waveforms - use observations to store metadata per individual waveform / visit_detail? => ask Christian / Claire (Mik)
  • postulate extension of the OMOP CDM to accommodate large objects and metadata for those (publication)
  • higher frequency "lost drugs" added (e.g. send vial)
  • ETL and custom mapping approximates 97% completeness
  • Vojtech: Gigi to be added to PhysioNet project, github release to support the Demo PhysioNet submission
  • definition of release:
    1. github - represents a milestone, accompanied by providing material (e.g. Demo OMOP CDM as csv) - Mik evaluates the github release function
    2. documentation - ETL description, waveform processing and model, point out current status (e.g. delta to a full MIMIC IV Database)
  • release the current status (as built for the Demo submission) as v.1.0
  • what is missing for v.1.1: finalized ETL, ETL documentation, Waveform POC and documentation, Test Plan => declare
  • how to organize the MIMIC github going forward (e.g. control merging branches etc.)?
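
The second sample implementation above (summarized 5-min intervals) amounts to a windowed mean. A minimal sketch, assuming samples arrive as (timestamp, value) pairs with naive timestamps. The function name and the output tuple layout are illustrative only, not Manlik's actual processing:

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Fixed origin so that windows align to clock boundaries (10:00, 10:05, ...).
ORIGIN = datetime(1970, 1, 1)


def summarize_by_window(samples, window=timedelta(minutes=5)):
    """Reduce (timestamp, value) samples to one mean per time window.

    Returns a sorted list of (window_start, mean_value, sample_count).
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        index = (ts - ORIGIN) // window  # integer number of whole windows since ORIGIN
        buckets[ORIGIN + index * window].append(value)
    return [
        (start, sum(values) / len(values), len(values))
        for start, values in sorted(buckets.items())
    ]


if __name__ == "__main__":
    demo = [
        (datetime(2021, 2, 15, 10, 0, 0), 80.0),
        (datetime(2021, 2, 15, 10, 4, 59), 82.0),
        (datetime(2021, 2, 15, 10, 5, 0), 90.0),
    ]
    for row in summarize_by_window(demo):
        print(row)
```

Keeping the sample count next to each mean lets a reviewer see how much raw data each reduced row stands for.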

Monday, February 08, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Anna, Gigi, Andrew, Tom

  • waveforms => Trending data creates huge dataload (one patient, one waveform from bedside monitor: 112MB of data resulting in > 1 million lines)
  • define waveform strategies in more detail: when and what to filter, how to reduce data, extract only selected meaningful data
  • waveform samples:
  • do one trial case and create measurement table entries from a recording with a limited timeframe, to be able to test / demonstrate a "high resolution" data replication into OMOP
  • link one regular 3 lead ECG wfdb file with extraction of some structured information (e.g. QT interval)
  • link one Respiration waveform wfdb file with extraction of some structured information (e.g. RR)
  • where to keep the resolution rate / "chunk size" / "reduction to mean" information: keep in visit_detail_source_value?
  • "123" for regular wfdb link, "123.csv" for trending data file, "123|25.6"
  • alternative: create observation record (linked to visit detail) to hold this metadata information => is this within the conventions currently in place for OMOP? Andrew: discuss best practice around waveform reduction / extraction with Rishi, Ben, Rai Winslow
  • Tom to follow up on the csv import of the demo CDM (following publication)
  • Tom to clarify the interaction between pharmacy and prescription table and the meaning of "Send Vial"
  • EMA table (electronic Medication Administration record) was not yet processed (too hard to parse / records might need aggregation / mostly infusion drugs?), more detailed documentation from Alistair?
  • Mik highlights source tables (and the omission of them) in the specifications document.
  • PhysioNet OMOP CDM will have to be accessed through a separate "billing project / account" (CoLab?)
  • full MIMIC IV OMOP CDM to be pulled from Odysseus to PhysioNet (Mik to clarify possible ways to do that)
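
The proposed visit_detail_source_value convention ("123", "123.csv", "123|25.6") could be decoded as in the sketch below. Reading the number after "|" as the resolution / chunk-size information is an assumption based on the bullet above it, and the function name and returned keys are hypothetical:

```python
def parse_waveform_source_value(source_value):
    """Split a visit_detail_source_value into its assumed parts.

    "123"      -> regular wfdb link
    "123.csv"  -> trending data file
    "123|25.6" -> wfdb link plus a numeric qualifier (assumed: resolution / chunk size)
    """
    record, separator, qualifier = source_value.partition("|")
    kind = "trending_csv" if record.endswith(".csv") else "wfdb"
    return {
        "record": record,
        "kind": kind,
        # Only present when the "|" separator was used.
        "resolution": float(qualifier) if separator else None,
    }


if __name__ == "__main__":
    for example in ("123", "123.csv", "123|25.6"):
        print(example, "->", parse_waveform_source_value(example))
```

Packing several facts into one string field is compact but needs parsing on every read. The alternative raised in the minutes, a linked observation record, would keep each fact queryable on its own.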

Monday, February 01, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Anna, Gigi, Andrew, Tom, Jeff

  • latest ETL: better mapping rate for measurements, visits for events without hadm_id, visit detail entries for (added) waveform entries: 4 random patients, created folders, assigned waveforms and waveform measurements => identification of patient from waveform to be investigated (Manlik)
  • plan: introduce improved mappings from Gigi (thanks!), improve some visit concepts and other minor issues, extract more information from pharmacy tables (call Jeff for help if difficult), produce documentation (specifications, github), work on publication
  • PhysioNet publication: introduce Waveforms or reserve for a separate publication? "MIMIC IV linked waveforms"?
  • trending data now available in MIMIC IV waveforms, potentially easier to convert to OMOP measurements? Manlik investigates options.
  • Andrew calls for inviting Paul Nagy (Johns Hopkins Precision Medicine) to Wednesday waveform call - eager to see current process
  • Tom to import Demo OMOP CDM to PhysioNet
  • Previous / current use of MIMIC for educational and research purposes?

Monday, January 25, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Anna, Gigi, Andrew

  • some mismappings to be fixed (wrong fluid), Gigi helps us!
  • specimen id with labevents - figure out logic and where it links to (to support even better mapping) Mik
  • Achilles results preferably to be exported to JSON so it can be used with AchillesWeb
  • Achilles csv export still required
  • 4th iteration of ETL in process, after fixing above mentioned mismappings
  • Waveform ETL integration in process
  • Waveform processing documentation and quality assessment approaches have to be discussed (and added to github)
  • Publication: Andrew and Mik support Vojtech, Manlik's waveform processing chapter would have to be added => deadline for feedback (Demo version): February 4th - submission date: February 5th (placeholders [link to wiki page] for incomplete parts)
  • Co-Authors for PhysioNet publication, possible future publication outside PhysioNet (Symposium / Applied Medical Informatics)?
  • separate publication with a focus on the Waveform and its integration?
  • third publication with a focus on the actual POC with MIMIC IV?

Thursday, January 21, 2021 - PhysioNet publication discussion

  • Consult with Nicolas Paris, Adrien Parrot and Alistair Johnson for authorship in the PN publication
  • Central notes / draft document will be used to compile all additional sections
  • as a rule: explain all acronyms in the documentation (e.g. Google Storage (GS) bucket)
  • Export Achilles results tables to csv (demo)

Monday, January 18, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Anna, Gigi, Andrew

  • Mik organizes a meeting with Vojtech and Andrew to discuss the publication
  • Waveform Harddisk still not shipped to Manlik
  • Tom imports the MIMIC Demo OMOP CDM to the PhysioNet BigQuery
  • The 3rd iteration of the MIMIC full ETL is reviewed for areas that need fine tuning
  • Odysseus might be able to host a full MIMIC IV OMOP CDM with Atlas access as long as Google credits are available. Users will be allowed following PhysioNet certification and only by specific request.

Monday, January 4, 2021 - Weekly requirements meeting

Attendees - Manlik, Mik, Anna, Gigi, Jeff, Tom, Andrew, Vojtech

  • Submission: Vojtech started draft; check on sequence (MIMIC IV Demo dataset available, conversion performed, submission uploaded)
  • DQD: reduce load and limit checks omitting "Atemporal" subcategories
  • Waveform Harddisk waiting to be shipped to Manlik
  • Future Space for converted Waveforms to be determined
  • PhysioNet CDM funding: Greg & Andrew to discuss with Google people possible credit donation

Monday, December 21, 2020 - Weekly requirements meeting

Attendees - Manlik, Mik, Anna, Gigi, Jeff, Tom, Andrew

Initiate submission to PhysioNet => Mik, Andrew collaborate with Vojtech to create submission, watch quality and requirements (Tom provides standard requirements, guidelines exist on PhysioNet)

  • guide Tom for implementation of local PhysioNet Demo OMOP CDM

  • describe method for creating csv extracts for delivery

  • create snapshot of github repository (to preserve the version that has been used for the final ETL)

  • usage notes: mention limitations, challenges experienced,

  • introduce reduced POC waveform data into CDM

  • disk with waveforms ready to be sent to Manlik => Manlik continues to analyze

  • DQD results: in the documentation explain FAILed steps that cannot be fixed

  • continue exploring possibilities supporting PhysioNet (e.g with Google credits / grant) so that an ATLAS and other OHDSI tools can be run inside the environment

  • User Acceptance Test: Gigi and Andrew to participate - together define the purpose & approach of the Tests.

Monday, December 14, 2020 - Weekly requirements meeting

Attendees - Manlik, Mik, Anna, Gigi, Jeff, Tom

Anna/Mik/Tom: Move to PhysioNet (Demo & full separately)

  • extract to storage bucket, grant access to Tom for pulling it from there
  • DDL to create OMOP CDM
  • Local Vocabulary import scripts (Athena vocab import in github)
  • CDM import scripts => next meeting to cover the Demo import, then maybe kick off the full import after

Manlik to provide reduced / small sample of waveform derived measurements for POC

Atlas pointing to PhysioNet with google authentication - presumably requires to be in one environment

  • costs occurring for Atlas hosting
  • costs occurring for queries => how can cost be borne by accessing user?

PhysioNet prerequisites: no additional costs other than data hosting / no additional user maintenance other than central IAM for BigQuery

Jeff to research a solution for the above.

=> Mik: can we grant Jeff access to the Demo Dataset in Odysseus GCP for pointing an external Atlas to it?

Mik to create WhiteRabbit / RiaH extract to support the specifications documentation

meeting recording ( pw: .%aT%0^e )

Monday, December 07, 2020 - Weekly requirements meeting

Attendees - Manlik, Mik, Anna, Gigi, Andrew, Tom, Vojtech

MIMIC full ETL in advanced stage - mapping gaps: drugs 84% - place of service - measurements (chartevents) - value & value_num: how to deal with entries in both columns (when non matching after string to number conversion) - run SQL to detect such cases
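
The check for entries in both columns that do not match after string-to-number conversion could look like the sketch below. The minutes call for SQL. This is a plain Python equivalent over rows shaped like chartevents records, assuming the columns are named `value` (text) and `valuenum` (numeric). The tolerance is an arbitrary illustrative choice:

```python
def find_value_mismatches(rows, tolerance=1e-9):
    """Return rows where both columns are filled but disagree.

    A row is flagged when `value` cannot be parsed as a number while
    `valuenum` is populated, or when the parsed number differs from `valuenum`.
    """
    mismatches = []
    for row in rows:
        text, number = row["value"], row["valuenum"]
        if text is None or number is None:
            continue  # only rows with entries in both columns are of interest
        try:
            parsed = float(text)
        except ValueError:
            mismatches.append(row)  # free text next to a numeric value
            continue
        if abs(parsed - number) > tolerance:
            mismatches.append(row)
    return mismatches


if __name__ == "__main__":
    sample = [
        {"itemid": 1, "value": "98.6", "valuenum": 98.6},
        {"itemid": 2, "value": "120", "valuenum": 12.0},
        {"itemid": 3, "value": "Normal", "valuenum": 1.0},
        {"itemid": 4, "value": "Normal", "valuenum": None},
    ]
    for row in find_value_mismatches(sample):
        print(row)
```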

Move data to PhysioNet:

  • file formats tested for export: avro / csv
  • preferred import format by PhysioNet: single csv for each table, instructions / ddl for GBQ, (PostgreS)
  • move GBQ project over to PhysioNet?

Next steps: provide metrics report for full MIMIC ETL (SQL for this will go to the github), package the OMOP Demo dataset (avro for PhysioNet, csv for Vojtech to run DQD)

Monday, November 30, 2020 - Weekly requirements meeting

Attendees - Manlik, Mik, Anna, Gigi, Andrew, Jeff, Tom, Vojtech

Jeff researches how to make ATLAS traffic payable to the requestor instead of the data holder

For waveform storage: define Base-URL at Physionet GCP Storage, define folder / filename conventions

next steps: complete / fix / enrich MIMIC IV full dataset, add waveforms, move to PhysioNet (submit files in PhysioNet review process => Tom & Vojtech clarify the means to do this)

Unit Testing vs. integration tests:

  • basic logic / functional / data consistency test scripts (take some inspiration from MIMIC unit tests, adopt WhiteRabbit/RiaH approach) how much of what can be tested has been tested? => create test data input?
  • Coverage rate (tables / fields) => metrics
  • Achilles / DQD (Vojtech)

Release Achilles data in the restricted PhysioNet schemas

Future: set up a process for sending feedback, contributions and creating issues => keep the github alive, enrich existing ETL and documentation,

Monday, November 23, 2020 - Weekly requirements meeting

Attendees - Manlik, Mik, Anna, Gigi, Dima, Tom, Andrew, Jeff, Vojtech

PhysioNet environment is prepared to host data but not the application. To use Atlas with Google login (IAM), apparently the Atlas should be running inside the same project as the data.

Benjamin ready to copy MIMIC IV waveforms to hard disk for Manlik. Tom gets the address!

Waveform POC with reduced coverage: cover Demo Dataset (ECG, RR) / 10 patients in MIMIC IV (ECG, RR)

Refining Custom Mapping in Demo Dataset to increase Mapping Rate

Transition to MIMIC IV full conversion now.

Compare results between selected MIMIC IV unit tests and respective OMOP Unit Tests

Extract MIMIC OMOP CDM for Vojtech (csv, rds) to support adding this to the PhysioNet data release.

Work in github issue tracker, publish current version (as 0.5) and start discussion in Forum, advocate for collecting contributions in the github.

Establish a (desktop) connection by R to the OMOP CDM @ BQ (google SDK / console?)

Community involvement for (User Acceptance) Testing:

  • Concept prevalence testing?
  • Run Tests as of Test Plan (reproducing 2 literature based studies)
  • obesity study
  • Vasopressin + Lactate
  • Review Custom Mapping Spreadsheets
  • histogram checking: statistical distribution (Jeff)

Next step: actual study?

Monday, November 16, 2020 - Weekly requirements meeting

Attendees - Manlik, Mik, Anna, Gigi, Dima, Tom

employ Achilles / DQD for initial checks

acceptance criteria for Unit Testing (UT):

  • general OMOP consistency check
  • adopt some of the MIMIC unit tests

acceptance criteria for User Acceptance Testing (UAT):

  • reproduce two literature results (Tom wants to ask for code associated with that)
  • for waveform testing - compare charted values (e.g. Blood pressure, Resp Rate) against waveform extracted data by approximate timestamp
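
The waveform acceptance check above (charted values vs. waveform extracted data by approximate timestamp) could be sketched as a nearest-timestamp match. The 5-minute tolerance, the function name and the tuple layout are assumptions for illustration:

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def match_nearest(charted, waveform, max_gap=timedelta(minutes=5)):
    """Pair each charted (timestamp, value) with the closest waveform value.

    Returns (charted_ts, charted_value, waveform_value, difference) tuples.
    Charted values with no waveform sample within max_gap are skipped.
    """
    ordered = sorted(waveform)
    times = [ts for ts, _ in ordered]
    pairs = []
    for ts, value in charted:
        i = bisect_left(times, ts)
        # Only the neighbours just before and just after ts can be the nearest.
        candidates = ordered[max(i - 1, 0):i + 1]
        if not candidates:
            continue
        nearest_ts, nearest_value = min(candidates, key=lambda c: abs(c[0] - ts))
        if abs(nearest_ts - ts) <= max_gap:
            pairs.append((ts, value, nearest_value, value - nearest_value))
    return pairs


if __name__ == "__main__":
    wf = [(datetime(2020, 11, 16, 10, 0), 80.0), (datetime(2020, 11, 16, 10, 5), 82.0)]
    chart = [(datetime(2020, 11, 16, 10, 4), 81.0)]
    print(match_nearest(chart, wf))
```

The differences can then be summarized per signal (e.g. Blood pressure, Resp Rate) to decide whether the extraction passes the UAT.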

=> produce documentation based on UT / UAT results (including mapping rates etc.)

ToDo:

  • Jeff might be able to host an Atlas instance
  • verify how billing can be correctly addressed (needs a project for each 3rd party?)
  • MIMIC IV waveform data not yet available (Tom pings Benjamin)

plan: start next week with MIMIC IV full dataset / prepare transition to PhysioNet

Monday, November 9, 2020 - Weekly requirements meeting

Attendees - Manlik, Mik, Anna, Gigi, Andrew, Jeff, Juan, Tom

Waveform processing:

  • de-duplication
  • annotation of values with algorithm / provenance

Andrew checks with N3C how to provide / update DOIs; PhysioNet might be able to provide DOIs for the project: maps to a URN / URL - PhysioNet would take responsibility for maintaining the landing page (alternatively submit to zenodo.org and link to github repository)

Current state:

  • explain specimen (Sepsis use case) and dose era (pick the algorithm) impediments

Next meeting:

  • Define together the tests that would satisfy as acceptance criteria for the project

Monday, November 2, 2020 - Weekly requirements meeting

Attendees - Manlik, Dima, Mik, Anna, Gigi, Andrew, Jeff, Connor, Tom

  • moving on to PhysioNet: deliver full load as csv for starters, provide ETL going forward

  • ATLAS in a different place than PhysioNet? Probably no resources available for Maintenance. OHDSI community? => Andrew / Gigi

  • no news around MIMIC IV waveforms, Tom contacts Benjamin

  • timeline change - mid of December

  • how to proceed after end of current project? => First Author drives

  • benefit over existing MIMIC IV BigQuery? new dataset, tools / platform, community, waveform

  • Other MIMIC III to MIMIC IV transition projects / publications? in PhysioNet? reproducibility of previous MIMIC III analyses on MIMIC IV OMOP

  • Gigi provides papers about MIMIC III

  • Tom to provide unit tests and use cases from MIMIC IV for testing the OMOP CDM

  • Andrew investigates battery of tests for synthetic data potentially to be used on MIMIC IV source and OMOP CDM target

  • Manlik: while processing waveforms, discovered that there are no LOINC standard concepts for a number of EKG leads; For arterial blood pressure no matching segment designations (peak systolic pressure) => what to implement in first phase? identify gaps to 100% coverage and prioritize

  • Next meeting: discuss gaps and priorities, present unit test procedures (involve RiaH?)

=> meeting recording (6Vp#MQr2)

Monday, October 26, 2020 - Weekly requirements meeting

Attendees - Manlik, Dima, Mik, Anna, Gigi, Vojtech, Andrew, Jeff, Juan

  • check out the first preliminary representation of our MIMIC IV OMOP CDM (Demo dataset at this time) as viewed through an ATLAS.
  • discussion on how we can facilitate moving on to the PhysioNet environment for Demo as well as full dataset: Mik addresses this with Tom
  • explore how we could establish a Data Quality Dashboard for these PhysioNet instances: Andrew/Vojtech (maybe file based approach)
  • how can we make the waveform data processing workflow a reality: Manlik to provide material; Data Quality process needed (e.g. spot checks of output, mass checks? - Manlik to define approach)

Andrew: Documentation generated by White Rabbit / RiaH? - useful to define unit tests

Vojtech: Order of parsing source tables?; Documentation of current ETL steps / specs?; Instructions and set of scripts to be performed by outside person including creating OMOP CDM csv files (aka an export script)

Apache 2.0 license for the ETL code

Vojtech is responsible for uploading material to PhysioNet publication. First Authorship can still be claimed :-)

meeting recording (nb+b8aR&)

Monday, October 12, 2020 - Weekly requirements meeting

Attendees - Manlik, Dima, Mik, Anna, Gigi, Jeff

Manlik reached out to Benjamin Moody for MIMIC IV waveform samples / files

Define process to access individual waveforms out of a collection built in OHDSI tool-set => brainstorm meeting in our next regular call (26th)

=> Gigi collects possible pitfalls they encountered in their project

Monday, October 5, 2020 - Weekly requirements meeting

Attendees - Manlik, Andrew, Dima, Mik, Anna, Juan Banda, Vojtech, Gigi, Derek, Tom, Jeff, Chris

Clarification around qualifier values: keep with basic standard concepts for phase 1, preserve information that could be covered in CDM v6.0, so we can return to it later.

Collaboration opportunities around other MIMIC to OMOP projects, waveform processing => Gigi Lipori, Derek Merck, Christopher Harle - join the meetings

Here’s a link to the Rhode Island Hospital Emergency Med telemetry data set.

Request by Tom: COVID19 related concepts that have no mapping to LOINC - please report these to Tom for follow up (e.g. test vs. results); Hua Xu's UT site - (Andrew can provide direct contact address if needed)

Request by Vojtech: contact him to discuss unit-testing (once we get there)

Monday, September 28, 2020 - Weekly requirements meeting

Attendees - Manlik, Andrew, Dima, Mik, Anna, Juan Banda, Vojtech, Tom, Jeff, Ben Moody

Manlik showed waveform extraction progress; recording to be found here ( Passcode: %7J2h9Xx )

Monday, September 21, 2020 - Weekly requirements meeting

Attendees - Manlik, Andrew, Dima, Mik, Anna, Juan Banda, Vojtech

We initiated a discussion around Waveform embedding into the OMOP CDM. Visit_detail table has been brought up as a possibility to create waveform instances and link them to derived data via the visit_detail_id. The table gives us visit_detail_source_value to keep the link to the actual waveform data and ways to classify the waveform type (to be discussed).

Anna has made her first commit to a preliminary new branch of the project to show some of the ETL work currently in progress.

Deliverables: Production OMOP CDMs in BQ @ PhysioNet GCP, one for the demo dataset, a second one for the full MIMIC IV dataset. Additionally a downloadable zip file with OMOP CDM csv files formatted for a PostgreSQL DB / SQLite (Vojtech) - how to deal with waveforms?

Task for Vojtech: Investigate data elements for features derived from waveforms

Next meeting (September 28) Andrew and Manlik will give insights into their discussions around waveform and waveform formats with the CHOruS group (including the AtriumDB solution). We will continue the design and decision discussion around waveform representation / infrastructure.

Monday, September 14, 2020 - Weekly requirements meeting

Attendees - Manlik, Andrew, Dima, Mik, Anna, Juan Banda, Greg, Dave

Odysseus GCP Development Environment live, started on ETL for dimension tables, Anna shows first results from demo dataset. Training for PhysioNet credentialing done - credentials pending. Manlik provides summary about meeting with the CHoRus waveform group, more interaction for the waveform area with PhysioNet experts would be helpful => Tom provides contact into the group

Monday, August 31, 2020 - Weekly requirements meeting

Attendees - Manlik, Andrew, Tom Pollard, Dima, Benjamin, Mik, Anna, Juan Banda, Greg, Vojtech, Dave

MIMIC IV demo dataset - approval pending, maybe delivery of preliminary version to Dev team

Odysseus Team is performing trainings -

Waveform:

  • explore alternative formats (10 different flavors present in MIMIC DB)
  • keep existing storage location for waveforms? => avoid duplicating waveform data

Manlik is currently exploring wfdb conversion to xml; the existing wfdb tools are partially outdated, the existing python library is still maintained

Considerations for waveform processing:

  • import of waveform (raw) data
  • accessibility for research / analysis
  • size (storage)
  • normalization (device independent) [normalizing against 1mV for ECGs, unit conversion for individual values]
  • extraction of metadata / clinical findings (source: automatic findings vs. actual confirmed clinical finding) / structured values => annotate WF (during ETL or before?)
  • waveform capturing event as base information

Andrew to lead waveform sub-group:

Manlik co-leads

Approach Benjamin Moody to explore possibilities regarding waveform formats and storage location

Next steps: Odysseus takes on core tables to be represented in GCP OMOP CDM (v5.3.1, v6 tables optionally to be filled?). Vojtech would like to contribute by consulting on the ETL logic for core tables (visit detail!). Volunteers can approach the ED and CXR sections to come up with ETL logic: ask Nicolas Paris, Adrien Parrot (maybe reservations against GCP), Jose Posada

Friday, August 28, 2020 - OMOP CDM Environment discussion

Attendees - Tom Pollard, Dima, Anna, Konstantin, Mik

GCP instance for development cannot be provided by PhysioNet; Odysseus needs to allocate funding for this in the Odysseus GCP. Dev instance can be run with manual user maintenance (access to MIMIC full dataset). Authorization / restriction for production instance provided by PhysioNet has to take into account an automatic credential check for PhysioNet/MIMIC users (e.g. in ATLAS). Means to facilitate that have to be explored (ATLAS using PhysioNet API?).

Monday, August 24, 2020 - Weekly requirements meeting

Attendees - Manlik, Vojtech, Andrew, Tom Pollard, Mik, Anna, Juan Banda

Dev Environment will likely be GCP => Odysseus to clarify – 1:1 Tom/Odysseus. Development of ETL logic will probably need to be done from scratch, using the learnings we can take from previous work. How can the Google ETL Framework be employed for this?

Platform considerations must take into account openness of access and restriction by PhysioNet rules at the same time

Waveform: Use cases: predictive analysis / multimodal research. WFDB vs. other waveform formats: Manlik to provide input about alternative waveform formats, collect information on github

previous information in google drive

  • Former minutes
  • prioritization tasks (results will go to project schedule)

Monday, August 17, 2020 - Weekly requirements meeting

Attendees - Manlik, Vojtech, Andrew, Dave, Tom Pollard, Mik, Melanie, Anna, Benjamin

Scope - MIMIC-IV, waveform data - Discuss how to prioritize modules

  • Dataset is de-identified per HIPAA, but there is sensitive information possible in the notes data
  • Demo MIMIC-IV is 100 patients => Tom to provide 1st version of the generated data set

Collaboration approach for items out of original scope to be taken over by other volunteers

Create new git repo in OHDSI

Waveform – meeting on Wednesday 08/19 to discuss use cases

Volunteer tasks – RiaH => create in depth expert knowledge

Liaise with Tom for tricky questions

Particular interests / focus of group members:

  • MIMIC in a common data model so it can be used by the OMOP community and can be compared with other sources => validation of MIMIC data (Tom)
  • Expand the kinds of data that are available for OMOP (e.g. waveform) (Andrew)
  • Use in National CoViD collaborative
  • Device based Trial platforms
  • Uniting Physionet and OHDSI
  • Mapping Tufts ICU data
  • Historic interest for high quality rich dataset (Vojtech)
  • Advancing the OMOP CDM
  • Show worth of ICU data use cases (especially waveform) in OMOP (Benjamin)
  • Rescue / salvage waveform data (Manlik)

Results: Working ETL’ed MIMIC IV dataset in OMOP CDM; documentation on how to apply learnings to your own environment with ICU data / work with the PhysioNet OMOP environment
