Tools (Contrib)
Description:
Converts fingerprints generated with RDKit to gfp format.
Author/owner: C3/Eli Lilly and Co
Sample 1:
rdkitfp2gfp -dlm ' ' -p 100 -pfn output.gfp -gfp 1k_rand_pubchem.fps
Explanation:
Convert the RDKit-generated fingerprint file 1k_rand_pubchem.fps to gfp format (-gfp), writing the result to output.gfp (-pfn output.gfp). Report progress every 100 lines (-p 100) and use ' ' as the delimiter in the input file (-dlm ' ').
Help command:
rdkitfp2gfp -h
Description:
For an input SMILES file (molecule SMILES and numeric ID), generate a CSV of Matched Molecular Pairs (MMPs).
Author/owner: C3/Eli Lilly and Co
Sample 1:
getMMPfromSMI.py -i input.smi -o output.csv
Explanation:
Reads the input SMILES from input.smi, applies the MMP algorithm and writes all MMPs to output.csv.
Help command:
getMMPfromSMI.py -h
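The input that getMMPfromSMI.py expects (one molecule per line, SMILES followed by a numeric ID) can be prepared with a few lines of Python. This is a minimal sketch and not part of the toolkit: the helper name `format_smi_lines`, the single-space separator and the example molecules are assumptions made for illustration.

```python
def format_smi_lines(records):
    """Return .smi text: one 'SMILES ID' line per (smiles, numeric_id) pair."""
    lines = []
    for smiles, mol_id in records:
        # The tool description asks for a numeric ID, so reject anything else.
        if not str(mol_id).isdigit():
            raise ValueError(f"ID must be numeric, got {mol_id!r}")
        lines.append(f"{smiles} {mol_id}")
    return "\n".join(lines) + "\n"


# Hypothetical example molecules: toluene, chlorobenzene, phenol.
molecules = [("Cc1ccccc1", 1), ("Clc1ccccc1", 2), ("Oc1ccccc1", 3)]
smi_text = format_smi_lines(molecules)

# To create the file read by the sample command:
# with open("input.smi", "w") as handle:
#     handle.write(smi_text)
```

Validating the ID column before running the tool makes it easier to trace a bad record back to its source line.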
Description:
For an input CSV file (Molecule SMILES, numeric ID and numerical data column) generate a CSV of Matched Molecular Pairs with associated delta for the given data column(s).
Author/owner: C3/Eli Lilly and Co
Sample 1:
getMMPStatsfromCSV.py -i input.csv -o output.csv -s SMILES -n ID -c SINGLE
Explanation:
For the input CSV, with columns SMILES and ID, apply the MMP algorithm and write all MMPs to output.csv. If numerical data column(s) are present in the CSV, additional column(s) are added to the output CSV containing the data delta between the two molecules of each MMP identified.
Help command:
getMMPStatsfromCSV.py -h
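The "data delta" that getMMPStatsfromCSV.py reports for each pair can be pictured as a simple difference between the data values of the two molecules. The sketch below is illustrative only: the function name `pair_deltas`, the sign convention (right minus left) and the activity values are assumptions, and the script's own convention should be checked against its help output.

```python
def pair_deltas(pairs, data):
    """Return (id_left, id_right, delta) for each pair of molecule IDs.

    Assumed convention: delta = data[id_right] - data[id_left].
    """
    deltas = []
    for id_left, id_right in pairs:
        delta = data[id_right] - data[id_left]
        # Round to hide floating-point noise in the printed difference.
        deltas.append((id_left, id_right, round(delta, 6)))
    return deltas


# Hypothetical activity values keyed by molecule ID.
activity = {1: 6.2, 2: 7.0, 3: 5.5}
# Hypothetical matched pairs, as (left ID, right ID).
pairs = [(1, 2), (1, 3)]
deltas = pair_deltas(pairs, activity)
```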
Description:
For an input SMILES file (Molecule SMILES and numeric ID) generate a CSV of the MMP that has the smallest change between the two molecules (by atom count). This is roughly equivalent to the Maximum Common Substructure (MCSS) between the two molecules but is based on SMILES.
Author/owner: C3/Eli Lilly and Co
Sample 1:
getMMPbasedMCSSfromSMI.py -i input.smi -o output.csv
Explanation:
Reads the input SMILES from input.smi, applies the MMP algorithm and, for each pair of molecules, writes only the MMP with the smallest change (and therefore the largest shared structure) to output.csv. This is an approximation to the MCSS.
Help command:
getMMPbasedMCSSfromSMI.py -h
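The selection step in getMMPbasedMCSSfromSMI.py, keeping the pair with the smallest change by atom count, can be sketched as a minimum over candidate pairs. The records below are hand-written and hypothetical, including the `changed_atoms` counts; the script derives its own fragments and counts.

```python
def smallest_change(candidates):
    """Return the candidate MMP whose changing part has the fewest atoms."""
    return min(candidates, key=lambda pair: pair["changed_atoms"])


# Two hypothetical ways to describe the same molecule pair as an MMP.
candidates = [
    {"context": "[1*]c1ccccc1", "frag_left": "[1*]C",
     "frag_right": "[1*]CC", "changed_atoms": 2},
    {"context": "[1*]Cc1ccccc1", "frag_left": "[1*][H]",
     "frag_right": "[1*]C", "changed_atoms": 1},
]
best = smallest_change(candidates)
```

The candidate with the smaller changing part has the larger shared context, which is why it approximates the MCSS.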
Description:
For an input SMI file (Molecule SMILES and numeric ID) mutate the molecule to create new idea molecules based on a prebuilt or custom set of Transforms.
Author/owner: C3/Eli Lilly and Co
Sample 1:
getMMPEnumeratedNewMols.py -i input_smi.csv -p input_transforms.csv -o output_eunm.csv --frag_left_col FRAG_L --frag_right_col FRAG_R
Explanation:
Reads the input SMILES in input_smi.csv and fragments them using dicer. Reads the transforms in input_transforms.csv from the columns FRAG_L and FRAG_R. Searches for FRAG_L in the fragmented input molecules and, where it matches, replaces it with FRAG_R. Writes the fully enumerated new molecules to output_eunm.csv.
Help command:
getMMPEnumeratedNewMols.py -h
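The lookup-and-replace step of getMMPEnumeratedNewMols.py can be sketched as a table lookup keyed on FRAG_L. This is a toy illustration under stated assumptions: the function names, the transform rows and the pre-fragmented input are all hypothetical, and the sketch neither runs dicer nor rejoins fragments into full SMILES as the script does.

```python
import csv
import io

# Hypothetical transforms file content with the FRAG_L / FRAG_R columns.
TRANSFORMS_CSV = """FRAG_L,FRAG_R
[1*]C,[1*]CC
[1*]C,[1*]Cl
[1*]O,[1*]N
"""


def load_transforms(text, left_col="FRAG_L", right_col="FRAG_R"):
    """Map each left-hand fragment to the list of its replacements."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        table.setdefault(row[left_col], []).append(row[right_col])
    return table


def enumerate_new(fragmented, transforms):
    """Yield (context, old_fragment, new_fragment) for every matching transform."""
    results = []
    for context, fragment in fragmented:
        for replacement in transforms.get(fragment, []):
            results.append((context, fragment, replacement))
    return results


transforms = load_transforms(TRANSFORMS_CSV)
# Hypothetical fragmentation of toluene into a context and a fragment.
fragmented = [("[1*]c1ccccc1", "[1*]C")]
suggestions = enumerate_new(fragmented, transforms)
```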
Description:
For an input CSV file (Molecule SMILES, numeric ID and numerical data column) generate a CSV of any Matched Molecular Series.
Author/owner: C3/Eli Lilly and Co
Sample 1:
getMMPSeriesfromCSV.py -i input.csv -o output.csv -s SMILES -n ID -a PIC50
Explanation:
For the input CSV file with columns SMILES, ID and PIC50, generate all matched series ordered by PIC50. Write the results to output.csv.
Help command:
getMMPSeriesfromCSV.py -h
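A matched series, as produced by getMMPSeriesfromCSV.py, can be thought of as a group of molecules that share a context and differ in one fragment, ordered by activity. The sketch below illustrates that grouping and ordering only. The record layout, the `min_size` threshold of 3 and the ascending order are assumptions, not the script's documented behaviour.

```python
from collections import defaultdict


def matched_series(records, min_size=3):
    """Group (context, fragment, mol_id, activity) records into ordered series."""
    by_context = defaultdict(list)
    for context, fragment, mol_id, activity in records:
        by_context[context].append((fragment, mol_id, activity))

    series = {}
    for context, members in by_context.items():
        # Keep only contexts shared by enough molecules to form a series.
        if len(members) >= min_size:
            series[context] = sorted(members, key=lambda m: m[2])
    return series


# Hypothetical fragmented molecules with pIC50-like activity values.
records = [
    ("[1*]c1ccccc1", "[1*]C", 1, 6.2),
    ("[1*]c1ccccc1", "[1*]Cl", 2, 7.0),
    ("[1*]c1ccccc1", "[1*]O", 3, 5.5),
    ("[1*]C1CCCCC1", "[1*]C", 4, 4.9),
]
series = matched_series(records)
```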
Description:
For an input CSV file, generate all matched series, then search for any series that can be extended by an equivalent series in the same set. This is an automated approach to SAR transfer and to deriving new idea compounds with improved activity. Alternatively, a pregenerated series file or a directory of such files can be used.
Author/owner: C3/Eli Lilly and Co
Sample 1:
getMMPSeriesSuggestfromCSV.py -i input_smi.csv -o output.csv -s SMILES -n CHEMBL_ID -a hERG_pIC50 -y
Explanation:
For the input CSV with columns SMILES, CHEMBL_ID and hERG_pIC50, generate matched molecular series. The series are ordered by the potency data in the hERG_pIC50 column (assumed to be in log form).
Help command:
getMMPSeriesSuggestfromCSV.py -h
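The idea of extending one series with an equivalent series, which getMMPSeriesSuggestfromCSV.py automates, can be sketched with plain lists of fragments ordered by increasing activity. This is a simplified illustration, not the script's matching logic: the function name, the requirement of a contiguous exact match and the fragment lists are all assumptions.

```python
def suggest_extensions(query, reference):
    """Return the reference fragments that follow the query's ordered run.

    Both series are lists of fragments ordered by increasing activity.
    An empty list means the query does not occur in the reference.
    """
    size = len(query)
    for start in range(len(reference) - size + 1):
        if reference[start:start + size] == query:
            # Everything after the matched run is a candidate with higher activity.
            return reference[start + size:]
    return []


# Hypothetical series observed on two different contexts.
query_series = ["[1*]O", "[1*]C", "[1*]Cl"]
reference_series = ["[1*]O", "[1*]C", "[1*]Cl", "[1*]Br", "[1*]C(F)(F)F"]
ideas = suggest_extensions(query_series, reference_series)
```

Here the reference series ranks the same three fragments in the same order and then continues, so its remaining fragments become suggestions for the query context.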