Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding iSIM as a trackable value during training #192

Merged
merged 9 commits into from
Mar 11, 2025

Conversation

khuddzu
Copy link
Contributor

@khuddzu khuddzu commented Jan 28, 2025

Sorry it took me so long to get to this.
I have made the changes needed in the REINVENT code to track the iSIM average similarity across each training step, and add this value to the tensorboard output.

Below is a list of the changes I made:

  1. In learning.py I have imported iSIM, computed binary RDKIT fingerprints from the smilies array, and computed isim from the generated fingerprints. I then added isim as a variable to the report_data dictionary.
  2. In data.py I have added the isim variable to RLReportData.
  3. In tensorboard.py I have added iSIM:Average Similarity to the report as a scalar per each step.

I want to mention that I started with the absolute barebones steps for this implementation. I want to make sure this is what you all are looking for.

Below are other changes I can make:

  1. I have coded this by importing iSIM functions from the installable package. This now makes the iSIM package a dependency for REINVENT. If this is not what you all want I can either hard code the iSIM functions I used into the script, or add some variable so that iSIM is only imported if the user sets this to True.
  2. For the fingerprint generation I am currently computing binary fingerprints from RDKIT. We support different fingerprint types in this function, I can also make this optional for the user
  3. With iSIM, we can compute different types of average pairwise similarities. In this current implementation I have used Tanimoto. I can also make it optional for the user to choose the other similarity types we support.

Please let me know if this was not what you were looking for, or if you are interested in any of the above additions. Also, let me know if there was a place in the REINVENT code that I missed adding the iSIM variable. With the test code I ran, this successfully adds the iSIM value per step in the tensorboard event file.

Best,
Kate Huddleston

khuddzu and others added 2 commits January 28, 2025 14:45

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
@halx
Copy link
Contributor

halx commented Jan 29, 2025

Many thanks for this Kate! The code so far looks very good.

The iSIM functionality should be optional i.e. a switch in the TOML file should control whether the user wants to see this or not. As for fingerprint/distance choice, I would leave that open for the moment.

One problem though is that iSIM does not seem to have a proper installation procedure. The specific issue is that a pip install does not copy the necessary files into the installation location. The assumption is being made that the package will be installed in editable mode. That is not really an option for the end user. Eventually it should be possible to install via pip install git+ssh://[email protected]/mqcomplab/iSIM.git because that we can than add to the REINVENT installation setup.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
@khuddzu
Copy link
Contributor Author

khuddzu commented Feb 3, 2025

I have made iSIM installable via:
pip install git+ssh://[email protected]/mqcomplab/iSIM.git

I am trying to fix the issue with making isim tracking an optional variable in the TOML file. I am tracking back from learning.py and data.py and trying to find how the config file variables are accessed and treated. I am running into a bit of an issue locating this.
Any advice you could give would be greatly appreciated.

Thanks for your patience.

@halx
Copy link
Contributor

halx commented Feb 4, 2025

Many thanks.

Best would be to add a parameter tb_isim to runmodes/RL/validation.py in class SectionParameters. You would the need to pass this in from run_staged_learning.py (via config.tb_isim) via the call to model_learning (almost at the end of the file).

@khuddzu
Copy link
Contributor Author

khuddzu commented Feb 11, 2025

I now have implemented isim tracking as optional. I had to make changes in a few more places than you mentioned.
For the toml file, the config variable tb_isim must be under [parameters].
Let me know if you have any questions or concerns about the changes I have made.

@@ -17,6 +17,10 @@
from torch.utils.tensorboard import SummaryWriter
import numpy as np

#ISIM imports
from iSIM.comp import calculate_isim
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but we would also need to update the requirements file and the pyproject.toml file

@@ -219,7 +219,7 @@ def run_staged_learning(
)

parameters = config.parameters

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, set up your IDE or editor to not add spurious whitespace. This distracts from actual code review. Thanks.

@@ -15,6 +15,7 @@ class ReinventConfig(GlobalConfig):
use_cuda: Optional[bool] = Field(True, deprecated="use 'device' instead")
tb_logdir: Optional[str] = None
json_out_config: Optional[str] = None
tb_isim: Optional[bool] = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not needed as global parameter because it is only relevant in the RL parameter section.

halx added 2 commits March 11, 2025 10:13

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
@halx halx merged commit 1f8c5da into MolecularAI:main Mar 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants