@dalloliogm
Contributor

Hi,
I was working on the VAE bounty. That bounty is based on the COVID-19 dataset.

While looking at it, I wondered how users could customize the dataset so that they could use images other than those in this specific dataset.

So I took the liberty of creating a BaseImageDataset class and used it as the base for the COVID19 dataset class.

I've also converted examples/covid19cxr_conformal.py into a notebook, as I thought that would be easier to read and would fit better in an examples/ folder.

There are also a couple of test files for the new dataset class.

I've also improved the error message shown when the data is not found, so that it tells users where they can download it.

I've added openpyxl as a dependency because it is needed to read the metadata files.
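To illustrate the kind of error message meant here, a minimal sketch follows. It is not the code from this PR: `check_dataset_root` and `DOWNLOAD_HINT` are hypothetical names, and the download location is left as a placeholder.

```python
from pathlib import Path

# Placeholder only: the real download location is not given in this comment.
DOWNLOAD_HINT = "<dataset download page>"


def check_dataset_root(root: str) -> Path:
    """Return the dataset directory as a Path, or fail with a helpful message.

    Hypothetical helper: it shows the idea of pointing users to the
    download location instead of raising a bare "not found" error.
    """
    path = Path(root)
    if not path.is_dir():
        raise FileNotFoundError(
            f"Dataset directory not found: {path}. "
            f"Download the data from {DOWNLOAD_HINT} "
            "and pass its location as `root`."
        )
    return path
```

A caller would run `check_dataset_root(root)` before reading any metadata files, so a missing download surfaces immediately with instructions rather than later as an opaque file error.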

Collaborator

Hey, I think you might be misunderstanding how these BaseDatasets work here.

Namely, if you're training a VAE and you assume you're going from an image to an image, that's a Task.

Datasets are technically distinct from ML tasks, since a single dataset can often support many different ML tasks.

Please see this notebook: https://colab.research.google.com/drive/1kKkkBVS_GclHoYTbnOtjyYnSee79hsyT?usp=sharing

For more tutorials, see https://pyhealth.readthedocs.io/en/latest/tutorials.html
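A rough sketch of that separation, in plain Python rather than the actual PyHealth API. Every name here (`ImageRecord`, `ToyImageDataset`, `apply_task`, and the two task functions) is hypothetical and only meant to show one dataset feeding two different tasks:

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class ImageRecord:
    """One raw entry in a (hypothetical) image dataset."""
    path: str
    label: str


class ToyImageDataset:
    """Holds raw records only; it knows nothing about any ML task."""

    def __init__(self, records: Iterable[ImageRecord]) -> None:
        self.records: List[ImageRecord] = list(records)

    def apply_task(
        self, task_fn: Callable[[ImageRecord], Dict[str, str]]
    ) -> List[Dict[str, str]]:
        # The task function decides what counts as input and target.
        return [task_fn(record) for record in self.records]


def classification_task(record: ImageRecord) -> Dict[str, str]:
    # Image -> label, e.g. for a disease classifier.
    return {"input": record.path, "target": record.label}


def reconstruction_task(record: ImageRecord) -> Dict[str, str]:
    # Image -> image, e.g. for a VAE: the target is the input itself.
    return {"input": record.path, "target": record.path}


dataset = ToyImageDataset(
    [ImageRecord("a.png", "COVID"), ImageRecord("b.png", "Normal")]
)
classification_samples = dataset.apply_task(classification_task)
reconstruction_samples = dataset.apply_task(reconstruction_task)
```

The dataset class stays the same in both cases; only the task function changes, which is why an image-to-image VAE setup belongs in a task rather than in a new dataset base class.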

Let me know if that makes sense.
