Quick attempt at model interpretation #7

Closed
wants to merge 6 commits into from

Collaborator

@metazool metazool commented Jul 5, 2024

I considered leaving this on a branch because the results were so inconclusive, but there are some useful utils and improved test coverage in here that seem worth a merge. I would appreciate eyes on it.

What's in here

  • A small refactor to make it easier to load the scivision model either with or without its final (classification) layer
  • Associated tests, increased overall coverage
  • A notebook which leans heavily on the Captum tutorial walkthroughs, applying a range of attribution methods to produce heatmaps of the areas of plankton images that influenced the predicted class
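
As an illustration of the kind of heatmap the notebook is after, here is a minimal sketch of occlusion-based attribution in plain Python: blank out one patch at a time and record how far the class score drops. This is not the notebook's code and does not use Captum; `occlusion_map`, `toy_score` and the 4x4 image are hypothetical stand-ins for a real model and a real plankton image.

```python
from typing import Callable, List

Image = List[List[float]]


def occlusion_map(
    image: Image,
    score_fn: Callable[[Image], float],
    patch: int = 1,
    baseline: float = 0.0,
) -> Image:
    """Heatmap of how much the score drops when each patch is set to `baseline`."""
    height, width = len(image), len(image[0])
    reference = score_fn(image)
    heatmap = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            # Copy the image and blank out a single patch.
            occluded = [row[:] for row in image]
            for y in rows:
                for x in cols:
                    occluded[y][x] = baseline
            drop = reference - score_fn(occluded)
            # Every pixel in the patch receives the same attribution.
            for y in rows:
                for x in cols:
                    heatmap[y][x] = drop
    return heatmap


def toy_score(image: Image) -> float:
    """Hypothetical class score: mean brightness of the central 2x2 pixels."""
    return sum(image[y][x] for y in (1, 2) for x in (1, 2)) / 4.0


# Assumed input: a uniform 4x4 "image"; a real model would score a tensor instead.
image = [[1.0] * 4 for _ in range(4)]
heatmap = occlusion_map(image, toy_score, patch=1)
```

With this toy score only the four central pixels get a non-zero attribution, which is the behaviour a meaningful heatmap should show: high values where the score actually depends on the input.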

The visualisations don't look very meaningful to me; I wrote a bit about the probable reasons why, and what's worth doing next, in #6.

To test

  • Run the Python tests as per the README (minimum: export PYTHONPATH=.; py.test)
  • If you've run through the outstanding PR #5 (Proof of concept of similarity search with the scivision model), you'll have the chromadb index with the images and embeddings in it
  • The notebook should be a matter of Run all. It needs the location of the s3 object store set in .env (see also the README)
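
For anyone unsure what "set in .env" amounts to, here is a rough sketch of how KEY=value lines in such a file can be read. The variable name `S3_ENDPOINT_URL` and its value are hypothetical placeholders; the README has the name the project actually expects, and the project may well load the file with a library rather than a hand-rolled parser like this one.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Hypothetical .env contents; the real variable name is defined in the README.
example = '# object store location\nS3_ENDPOINT_URL="https://example.invalid"\n'
settings = parse_env(example)
```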

@metazool metazool requested review from albags and a team July 5, 2024 06:32

github-actions bot commented Jul 5, 2024

☂️ Python Coverage

current status: ✅

Overall Coverage

| Lines | Covered | Coverage | Threshold | Status |
|-------|---------|----------|-----------|--------|
| 115   | 101     | 88%      | 0%        | 🟢     |

New Files

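| File                                   | Coverage | Status |
|----------------------------------------|----------|--------|
| cyto_ml/data/s3.py                     | 0%       | 🟢     |
| cyto_ml/tests/test_image_embeddings.py | 100%     | 🟢     |
| cyto_ml/tests/test_intake_utils.py     | 100%     | 🟢     |
| cyto_ml/tests/test_vector_store.py     | 100%     | 🟢     |
| TOTAL                                  | 75%      | 🟢     |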

Modified Files

| File                        | Coverage | Status |
|-----------------------------|----------|--------|
| cyto_ml/data/intake.py      | 67%      | 🟢     |
| cyto_ml/data/vectorstore.py | 81%      | 🟢     |
| cyto_ml/models/scivision.py | 96%      | 🟢     |
| TOTAL                       | 81%      | 🟢     |

updated for commit: c278b48 by action🐍

@metazool metazool changed the base branch from main to more_data July 5, 2024 06:37
@metazool metazool marked this pull request as draft July 9, 2024 12:35
Collaborator Author

metazool commented Jul 9, 2024

Switched this to draft as I have a small set of changes testing fallback approaches which didn't look right to me either!

@metazool metazool marked this pull request as ready for review July 10, 2024 08:37
Collaborator Author

metazool commented Jul 10, 2024

I've switched this back out of draft status and added a few final touches with non-plankton images and a generic ImageNet set of weights. The outcome's still really open, but there are higher priority things to work on - the fact that I can't easily connect this line of work to anything active on the RSE group discussions forum is a helpful indicator of that!

[Image: wasp_occ]

@metazool
Collaborator Author

See also #6

@metazool
Collaborator Author

Abandoning this: the outcome wasn't very illuminating and can easily be reconstructed, and the underlying changes (dropping scivision, shifting the project layout) would make reconstruction less effort than merging.

@metazool metazool closed this Sep 16, 2024