Dfp pclr zoo #594

Open
wants to merge 4 commits into
base: master
1 change: 1 addition & 0 deletions model_zoo/PCLR/.gitattributes
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
*.h5 filter=lfs diff=lfs merge=lfs -text
20 changes: 19 additions & 1 deletion model_zoo/PCLR/README.md
@@ -22,7 +22,7 @@ python -i get_representations.py # test the setup worked
You can get ECG representations using [get_representations.py](./get_representations.py).
`get_representations.get_representations` builds `N x 320` ECG representations from `N` ECGs.

-The model expects 10s 12-lead ECGs with a specific lead order and interpolated to be 4,096 samples long.
+The model expects 10s 12-lead ECGs measured in milli-volts with a specific lead order and interpolated to be 4,096 samples long.
[preprocess_ecg.py](./preprocess_ecg.py) shows how to do the pre-processing.

### Use git LFS to localize the model file
@@ -103,6 +103,24 @@ the model only takes lead I of the ECG as input.
## Lead II PCLR
[Lead II PCLR](./PCLR_lead_II.h5) is like lead I PCLR except it was trained with all ECGs sampled to 250Hz.

## C3PO PCLR and AUG C3PO PCLR
We also provide PCLR models trained using subjects from the C3PO cohort, with and without augmentation.
The model files are available via:

`git lfs pull --include model_zoo/PCLR/c3po_pclr.h5`

`git lfs pull --include model_zoo/PCLR/aug_c3po_pclr.h5`
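
If `git lfs pull` has not been run, the `.h5` path holds a small text pointer rather than the model weights, and loading it will fail. The following is a minimal sketch of a check for that case. It assumes only that Git LFS pointer files begin with the standard `version https://git-lfs.github.com/spec/v1` line; the helper name `is_lfs_pointer` is hypothetical and not part of this repository.

```python
# First line of a Git LFS pointer file (Git LFS spec v1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: str) -> bool:
    """Return True if `path` still holds a Git LFS pointer instead of the real file.

    Hypothetical helper for illustration; not part of model_zoo/PCLR.
    """
    with open(path, "rb") as f:
        # Only the first bytes are needed to recognise a pointer file.
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX
```

Calling `is_lfs_pointer("c3po_pclr.h5")` before loading the model gives an early, readable signal that the `git lfs pull` step is still needed.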

You can get ECG representations with [get_representations.py](./get_representations.py), for example by calling `get_representations(ecgs, model_name='c3po_pclr')`.
`get_representations.get_representations` builds `N x 320` ECG representations from `N` ECGs.

The model expects 10s 12-lead ECGs measured in milli-volts with a specific lead order and interpolated to be 2,500 samples long. Note that this differs from the standard PCLR model, which expects 4,096 samples.
[preprocess_ecg.py](./preprocess_ecg.py) shows how to do the pre-processing; when calling it remember to set `ecg_samples=2500`.

The code snippet above showing example inference with UKB ECGs is also appropriate for these models. Remember to:
1. Load `c3po_pclr.h5` or `aug_c3po_pclr.h5` instead of `PCLR.h5`.
2. Interpolate to 2500 instead of 4096.
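
To make the 2,500-sample requirement concrete, here is a minimal NumPy sketch of resampling each lead by linear interpolation and stacking the result. It is an illustration under stated assumptions, not the repository's `preprocess_ecg.py`: the lead order in `EXAMPLE_LEADS`, the `(samples, leads)` output layout, and the helper names are all hypothetical, and the real pre-processing may differ.

```python
import numpy as np

# Hypothetical lead order for illustration only; the real order is
# defined by LEADS in preprocess_ecg.py.
EXAMPLE_LEADS = ["I", "II", "III", "aVR", "aVL", "aVF",
                 "V1", "V2", "V3", "V4", "V5", "V6"]


def resample_lead(values: np.ndarray, n_samples: int = 2500) -> np.ndarray:
    """Linearly interpolate a 1-D lead onto n_samples evenly spaced points."""
    old_x = np.linspace(0.0, 1.0, num=len(values))
    new_x = np.linspace(0.0, 1.0, num=n_samples)
    return np.interp(new_x, old_x, values)


def stack_ecg(ecg: dict, n_samples: int = 2500) -> np.ndarray:
    """Build an (n_samples, n_leads) array from a lead-name -> values mapping."""
    leads = [
        resample_lead(np.asarray(ecg[lead], dtype=float), n_samples)
        for lead in EXAMPLE_LEADS
    ]
    # Assumed layout: samples along axis 0, leads along axis 1.
    return np.stack(leads, axis=1)


# Example: a synthetic 10 s recording sampled at 500 Hz (5,000 samples per lead).
synthetic = {lead: np.zeros(5000) for lead in EXAMPLE_LEADS}
resampled = stack_ecg(synthetic, n_samples=2500)
```

Passing `n_samples=4096` to the same sketch would give the input length that the standard PCLR model expects.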

## Alternative save format
The newer keras saved model format is available for the 12-lead and single lead models at [PCLR](./PCLR)
and [PCLR_lead_I](./PCLR_lead_I) and [PCLR_lead_II](./PCLR_lead_II).
3 changes: 3 additions & 0 deletions model_zoo/PCLR/aug_c3po_pclr.h5
Git LFS file not shown
3 changes: 3 additions & 0 deletions model_zoo/PCLR/c3po_pclr.h5
Git LFS file not shown
15 changes: 11 additions & 4 deletions model_zoo/PCLR/get_representations.py
@@ -6,20 +6,27 @@
from preprocess_ecg import process_ecg, LEADS


-def get_model() -> Model:
+def get_model(model_name = 'pclr') -> Model:
     """Get PCLR embedding model"""
-    return load_model("./PCLR.h5")
+    if model_name == 'pclr':
+        return load_model("./PCLR.h5")
+    elif model_name == 'c3po_pclr':
+        return load_model("./c3po_pclr.h5")
+    elif model_name == 'aug_c3po_pclr':
+        return load_model("./aug_c3po_pclr.h5")
 
 
-def get_representations(ecgs: List[Dict[str, np.ndarray]]) -> np.ndarray:
+def get_representations(ecgs: List[Dict[str, np.ndarray]], model_name:str = 'pclr') -> np.ndarray:
     """
     Uses PCLR trained model to build representations of ECGs
     :param ecgs: A list of dictionaries mapping lead name to lead values.
                  The lead values should be measured in milli-volts.
                  Each lead should represent 10s of samples.
+    :param model_name: Specifies the model to use: either 'pclr', 'c3po_pclr' or 'aug_c3po_pclr'.
+                       Default is 'pclr'
     :return:
     """
-    model = get_model()
+    model = get_model(model_name)
     ecgs = np.stack(list(map(process_ecg, ecgs)))
     return model.predict(ecgs)
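
One thing to note about the `if`/`elif` chain in `get_model`: an unrecognised `model_name` falls through and returns `None`, so the failure only surfaces later at `model.predict`. A possible alternative, sketched below, maps names to files in a dictionary and raises early. The names `MODEL_FILES` and `resolve_model_path` are hypothetical and not part of this PR; the file names are the ones used in the diff above.

```python
from pathlib import Path
from typing import Dict

# Model name -> weight file, taken from the branches of get_model in the diff.
MODEL_FILES: Dict[str, str] = {
    "pclr": "PCLR.h5",
    "c3po_pclr": "c3po_pclr.h5",
    "aug_c3po_pclr": "aug_c3po_pclr.h5",
}


def resolve_model_path(model_name: str = "pclr", model_dir: str = ".") -> Path:
    """Return the weight file for `model_name`, or raise ValueError if unknown.

    Hypothetical helper for illustration; not part of this PR.
    """
    try:
        filename = MODEL_FILES[model_name]
    except KeyError:
        known = ", ".join(sorted(MODEL_FILES))
        raise ValueError(
            f"Unknown model_name {model_name!r}; expected one of: {known}"
        ) from None
    return Path(model_dir) / filename
```

With this in place, `get_model` could reduce to `return load_model(resolve_model_path(model_name))`, and adding another model would only need a new dictionary entry.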
