Fitting an off rate series
You ran an off-rate experiment! Starting with fluorescent ligand bound to clusters, you allowed dissociation over time, and you now have an image series representing the fraction of ligand bound to clusters at increasing times. This page walks through an example of how to quantify the off rate of the ligand, starting with quantified images.
All of the files generated or used in this example are in /lab/sarah/RNAarray/example/ on greendragon.
This pipeline requires the following input files:
- Fluorescent values.
- A CPseries file: a pandas DataFrame, indexed on the clusterId, with columns corresponding to the fluorescent values for different times.
- Ideally this file only includes the clusters you would like to fit.
- Time series per tile.
- A dictionary (save as a pickle with extension ".p") giving the per-tile times associated with the columns of the CPseries file.
- Tile series
- A CPtiles file: a pandas Series, indexed on the clusterId, with values corresponding to the tile that cluster corresponds to.
- Library members.
- A CPannot file: a pandas DataFrame, indexed on the clusterID, with a single column 'variant' that indicates which library member is at that cluster.
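The time dict is the one input you have to assemble by hand. Here is a minimal sketch of saving and reloading one with the standard `pickle` module. The output path is hypothetical, only two tiles with three time points each are shown, and it assumes a plain pickled dict is what the pipeline expects.

```python
import os
import pickle
import tempfile

# Per-tile acquisition times in seconds: one list per tile, one entry per
# column of the CPseries file. Keys are the tile names as strings.
time_dict = {
    '001': [0.0, 236.843, 467.984],
    '002': [6.546, 243.375, 476.046],
}

# Hypothetical output location in a temporary directory; note the ".p" extension.
out_path = os.path.join(tempfile.mkdtemp(), 'rates.timeDict.p')

with open(out_path, 'wb') as handle:
    pickle.dump(time_dict, handle)

# Read the file back to confirm the round trip.
with open(out_path, 'rb') as handle:
    reloaded = pickle.load(handle)

print(sorted(reloaded))
```

Each list should have as many entries as the CPseries file has columns, otherwise the times cannot be matched to the fluorescence values.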
If you don't yet have these files, refer to the pages on processing CPfluor files ([data preprocessing](Processing quantified images into an experimental series)) and on mapping barcodes (sequence preprocessing).
from fittinglibs import fileio
timeDict = fileio.loadFile('offRates/rates.timeDict.p')
print(timeDict)
Produces the following mess:
{'003': [14.046, 251.281, 481.828, 718.75, 951.187, 1180.671, 1423.156, 1774.828, 2006.906, 2245.14, 2770.093, 3603.078, 4437.843, 5676.765, 6507.281, 7334.359, 8176.375, 9011.875, 9846.234, 14248.453], '005': [30.171, 265.515, 496.578, 732.031, 965.734, 1195.265, 1437.921, 1789.187, 2022.031, 2258.734, 2785.328, 3618.671, 4451.609, 5690.953, 6521.984, 7349.703, 8191.953, 9026.859, 9860.515, 14275.656], '002': [6.546, 243.375, 476.046, 711.281, 945.625, 1171.562, 1415.921, 1767.656, 2000.187, 2237.375, 2761.953, 3594.781, 4430.765, 5669.468, 6500.828, 7325.796, 8170.765, 9003.375, 9838.875, 14237.156], '001': [0.0, 236.843, 467.984, 703.843, 938.812, 1165.984, 1407.875, 1758.843, 1992.937, 2229.937, 2755.375, 3588.078, 4423.875, 5662.453, 6492.375, 7318.14, 8163.687, 8995.953, 9831.593, 14224.593], '010': [65.656, 300.39, 532.296, 767.046, 999.843, 1234.968, 1471.828, 1824.453, 2054.718, 2293.015, 2823.046, 3656.265, 4486.765, 5725.921, 6554.687, 7388.39, 8226.671, 9059.859, 9899.031, 14337.0], '011': [71.859, 308.093, 539.812, 775.046, 1006.562, 1242.078, 1478.281, 1831.421, 2061.781, 2299.937, 2829.843, 3664.0, 4493.843, 5733.515, 6561.953, 7396.281, 8232.953, 9066.64, 9906.656, 14350.328], '012': [78.906, 315.25, 546.843, 781.796, 1012.25, 1249.421, 1485.421, 1838.031, 2067.875, 2307.296, 2836.875, 3670.75, 4500.0, 5741.281, 6568.328, 7403.953, 8240.343, 9073.531, 9913.109, 14361.734], '013': [86.828, 322.265, 553.656, 789.265, 1019.328, 1257.265, 1492.562, 1844.421, 2074.453, 2313.828, 2843.453, 3677.281, 4506.796, 5748.343, 6575.703, 7411.718, 8248.171, 9080.328, 9921.25, 14375.359], '007': [45.687, 280.296, 510.968, 745.89, 978.515, 1210.562, 1451.671, 1804.218, 2033.953, 2274.515, 2799.984, 3632.515, 4465.14, 5704.671, 6535.671, 7365.625, 8206.328, 9039.437, 9875.109, 14300.093], '006': [38.265, 272.765, 503.468, 737.968, 970.984, 1202.656, 1445.156, 1796.625, 2027.375, 2266.953, 2792.453, 3625.968, 4459.14, 5698.828, 6529.484, 7357.734, 8198.828, 9032.875, 
9867.765, 14287.156], '016': [110.453, 343.5, 574.796, 811.984, 1039.265, 1278.39, 1513.906, 1866.125, 2094.656, 2333.343, 2864.25, 3698.046, 4526.859, 5769.968, 6595.468, 7433.609, 8270.781, 9104.218, 9944.468, 14412.234], '004': [21.859, 258.515, 489.703, 724.578, 957.984, 1187.843, 1430.234, 1782.187, 2014.343, 2251.078, 2778.25, 3611.046, 4444.546, 5684.031, 6514.328, 7342.187, 8184.046, 9019.609, 9852.843, 14262.281], '014': [94.953, 328.046, 561.39, 796.546, 1026.593, 1262.812, 1498.812, 1732.921, 1968.625, 2204.171, 2432.859, 3267.234, 4100.296, 4925.562, 5754.89, 6582.375, 7418.484, 8255.968, 9088.578, 10826.546], '009': [58.265, 293.953, 525.468, 760.765, 993.484, 1226.468, 1465.156, 1699.468, 1934.734, 2167.296, 2398.968, 3229.437, 4063.75, 4889.875, 5720.14, 6547.843, 7380.828, 8219.515, 9053.453, 10764.75], '008': [52.593, 286.812, 518.437, 753.234, 986.031, 1218.046, 1457.281, 1692.734, 1926.812, 2160.234, 2391.125, 3223.203, 4056.187, 4882.671, 5711.859, 6541.671, 7373.046, 8212.687, 9046.843, 10753.296], '015': [102.593, 336.546, 568.078, 803.89, 1032.828, 1270.921, 1506.562, 1741.046, 1975.687, 2211.093, 2439.546, 3273.109, 4107.265, 4933.656, 5762.375, 6588.984, 7425.765, 8263.437, 9096.593, 10837.093]}
The CPtiles file, in turn, gives the tile for each cluster:
clusterID
M00653:72:000000000-AKPP5:1:2101:19526:15124 001
M00653:72:000000000-AKPP5:1:2102:4005:7276 002
M00653:72:000000000-AKPP5:1:2102:14626:8517 002
M00653:72:000000000-AKPP5:1:2102:19935:11152 002
M00653:72:000000000-AKPP5:1:2102:23710:15626 002
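Because each tile is imaged at slightly different times, every cluster has to be paired with the time vector of its own tile. A minimal sketch of that lookup, using plain dicts in place of the pandas objects and only the first three time points of tiles '001' and '002'; the helper name `times_for_cluster` is hypothetical and is not part of fittinglibs.

```python
# Tile assignment per cluster (two rows from the cluster-to-tile listing above).
tiles = {
    'M00653:72:000000000-AKPP5:1:2101:19526:15124': '001',
    'M00653:72:000000000-AKPP5:1:2102:4005:7276': '002',
}

# First three time points (seconds) for each of those tiles.
time_dict = {
    '001': [0.0, 236.843, 467.984],
    '002': [6.546, 243.375, 476.046],
}


def times_for_cluster(cluster_id, tiles, time_dict):
    """Return the time points (s) of the tile that a cluster sits on."""
    return time_dict[tiles[cluster_id]]


for cluster_id in tiles:
    print(cluster_id, times_for_cluster(cluster_id, tiles, time_dict))
```

This is why the tile names in the CPtiles file must match the keys of the time dict exactly, including the zero padding.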
I normalize my data by a single value from the red channel.
# normalize green channel CPseries file by red channel CPseries file.
basename=offRates/AKPP5_ALL_Bottom_filtered_reduced
python -m normalizeSeries -b $basename.CPseries.pkl -a $basename"_red.CPseries.pkl"
This produces a CPseries file in which the fluorescence values have been normalized (see [normalization](Normalization of cluster fluorescence intensities) for more info).
Fit single clusters to either an off-rate curve or an on-rate curve (default is off).
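To make the idea concrete, here is a rough sketch of normalizing by a single red-channel value per cluster. It is an assumption about the general approach, not a description of what normalizeSeries does internally: the function name, the toy cluster names and values, and the choice to skip clusters without a positive red value are all illustrative.

```python
def normalize_series(green, red):
    """Divide each cluster's green time series by its single red-channel value."""
    normalized = {}
    for cluster_id, values in green.items():
        red_value = red.get(cluster_id)
        if red_value is None or red_value <= 0:
            # No usable reference signal: skip rather than divide by zero.
            continue
        normalized[cluster_id] = [v / red_value for v in values]
    return normalized


# Toy data (made up): green-channel series and one red-channel value per cluster.
green = {'clusterA': [200.0, 100.0, 50.0], 'clusterB': [80.0, 40.0, 20.0]}
red = {'clusterA': 100.0, 'clusterB': 0.0}

result = normalize_series(green, red)
print(result)
```

Dividing by the red signal puts clusters of different sizes on a comparable scale, which is what lets one set of fit bounds be shared across all clusters.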
# fit single clusters
basename=offRates/AKPP5_ALL_Bottom_filtered_reduced
normbasename=$basename"_normalized"
python -m fitRatesPerCluster -cs $normbasename.CPseries.pkl -t $basename.CPtiles.pkl -td offRates/rates.timeDict.p
Note: if you'd like to fit only a subset to make sure the script is working, use the --subset flag, which should take less than a minute to run.
This produces two outputs:
- Single cluster fits.
- A CPfitted file: a pandas DataFrame, indexed on the clusterId, with columns corresponding to the fit parameters (fmax, koff, fmin) and their associated standard errors (fmax_stde, etc.), as well as the coefficient of determination (rsq), the exit flag (from lmfit), and the root mean squared error (rmse).
- koff is in 1/second.
- Fit parameters.
- A fitParameters file giving the initial guesses and the upper and lower bounds on the three fit parameters (fmax, koff, fmin). Note: these are the parameters that are shared across all clusters. The initial guess for fmax is estimated per cluster (from the maximum value of the fluorescence), and so is set to NaN.
fmax koff fmin fmax_stde koff_stde fmin_stde rsq exit_flag rmse
M00653:72:000000000-AKPP5:1:2110:15278:23194 0.155019 0.000183 3.101868e-02 0.017777 0.000151 0.044081 0.587036 1 0.133084
M00653:72:000000000-AKPP5:1:2110:13233:12911 2.781173 0.000667 9.853043e-02 0.078231 0.000042 0.037590 0.987463 3 0.399757
M00653:72:000000000-AKPP5:1:2110:8320:10862 1.855716 0.000155 2.997809e-01 0.070028 0.000049 0.232296 0.922999 1 0.544643
M00653:72:000000000-AKPP5:1:2110:16526:20332 1.493321 0.000225 1.262401e-10 0.000000 0.000000 0.000000 0.966848 1 0.362984
M00653:72:000000000-AKPP5:1:2110:12234:20330 1.756226 0.000397 9.702027e-03 0.072196 0.000047 0.056499 0.969505 1 0.434580
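To read these numbers, it helps to turn koff into a half-life. The sketch below assumes the off-rate curve is a single exponential decaying from fmax toward fmin; that functional form is an assumption here, so check the fitting code for the exact definition. The parameter values are taken from the second row of the table above, and the function names are hypothetical.

```python
import math


def off_rate_curve(t, fmax, koff, fmin):
    """Assumed model: single-exponential decay from fmax toward fmin.

    t is in seconds and koff in 1/second.
    """
    return fmin + (fmax - fmin) * math.exp(-koff * t)


def half_life(koff):
    """Time (s) for the decaying part of the signal to drop by half."""
    return math.log(2) / koff


# Fit parameters from the second cluster in the table above.
fmax, koff, fmin = 2.781173, 0.000667, 0.09853043

t_half = half_life(koff)
print('half-life: %.0f s' % t_half)
print('signal at half-life: %.3f' % off_rate_curve(t_half, fmax, koff, fmin))
```

A half-life much longer than the last time point in the time dict means the decay was barely sampled, and the fitted koff for that cluster should be treated with caution.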
- This script can also be used to fit on rates: simply set --fittype on.
- This script can accommodate a photobleaching correction.
- Photobleaching correction requires setting the fraction of signal remaining after a round of imaging (e.g. --pb_correct 0.985 will assume a 1.5% loss per image).
- It also requires some indication of how the columns in the CPseries file correlate to the number of images taken. This is assumed to be in a dict format, similar to the time dict input, but has to be manually saved and then specified with the -id or --image_n_dict input.
- If the image n dict is not supplied but pb_correct is, it assumes that the columns each represent a sequential image.
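The arithmetic behind a per-image photobleaching correction can be sketched as follows. This is one plausible reading of the option, not the script's actual implementation: it assumes each prior image multiplies the signal by pb_correct, and that the first column has had zero prior exposures when no image counts are given. The function name and the toy values are hypothetical.

```python
def photobleach_correct(values, pb_correct, image_n=None):
    """Undo an assumed multiplicative signal loss of pb_correct per prior image."""
    if image_n is None:
        # Assumption: columns are sequential images, the first one unbleached.
        image_n = range(len(values))
    return [v / pb_correct ** n for v, n in zip(values, image_n)]


observed = [1.0, 0.9, 0.8]  # toy fluorescence values, made up

# Sequential imaging: 0, 1 and 2 prior exposures.
corrected = photobleach_correct(observed, 0.985)

# Explicit image counts, e.g. if extra images were taken between columns.
corrected_sparse = photobleach_correct(observed, 0.985, image_n=[0, 3, 6])

print(corrected)
print(corrected_sparse)
```

Without a correction, signal lost to bleaching is indistinguishable from dissociation, so the fitted koff would be biased upward.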
For single clusters associated with a variant, bootstrap the fit koff (or kobs) to obtain 95% confidence intervals.
normbasename=offRates/AKPP5_ALL_Bottom_filtered_reduced_normalized
python -m bootStrapFitFile -cf $normbasename.CPfitted.pkl -a anyRNA.CPannot.pkl -p koff
Note:
- You can change the bootstrapped parameter to be any column in the CPfitted file (e.g. -p kobs for on rates, -p dG for a CPfitted file from a binding series).
- Default is to bootstrap the medians for 1000 samples.
Note: this one was done on a subset, so there is only 1 cluster at most per variant.
fmax_init koff_init fmin_init numTests fitFraction pvalue numClusters fmax_lb fmax fmax_ub koff_lb koff koff_ub fmin_lb fmin fmin_ub rsq numIter flag
variant_number
0 0.6103634 0.1077054 0.03575844 1 NaN NaN NaN NaN 0.6103634 NaN NaN 0.1077054 NaN NaN 0.03575844 NaN 0.994281 1000 NaN
77 0.3455767 0.001434442 0.07060204 1 1 0.25 NaN NaN 0.3455767 NaN NaN 0.001434442 NaN NaN 0.07060204 NaN 0.6348465 1000 NaN
230 0.7296927 0.02572861 0.02724182 1 NaN NaN NaN NaN 0.7296927 NaN NaN 0.02572861 NaN NaN 0.02724182 NaN 0.99382 1000 NaN
300 2.677065 0.0008283294 0.1461111 1 1 0.25 NaN NaN 2.677065 NaN NaN 0.0008283294 NaN NaN 0.1461111 NaN 0.9773668 1000 NaN
352 0.02609393 0.0003054129 0.2038353 1 NaN NaN NaN NaN 0.02609393 NaN NaN 0.0003054129 NaN NaN 0.2038353 NaN 0.3989996 1000 NaN
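The bootstrap itself can be sketched with the standard library alone. This is a minimal illustration of a percentile bootstrap on the median, not the code in bootStrapFitFile. The koff values are the five single-cluster fits shown earlier on this page, treated purely for illustration as if they all belonged to one variant; the function name is hypothetical.

```python
import random
import statistics


def bootstrap_median_ci(values, n_samples=1000, seed=0):
    """Return (lower, median, upper): a 95% percentile bootstrap CI on the median."""
    rng = random.Random(seed)
    medians = sorted(
        # Resample the clusters with replacement and take the median each time.
        statistics.median(rng.choices(values, k=len(values)))
        for _ in range(n_samples)
    )
    lower = medians[n_samples * 25 // 1000]       # 2.5th percentile
    upper = medians[n_samples * 975 // 1000 - 1]  # 97.5th percentile
    return lower, statistics.median(values), upper


# koff values (1/s) from the single-cluster fits, pretending they share a variant.
koffs = [0.000183, 0.000667, 0.000155, 0.000225, 0.000397]

koff_lb, koff_med, koff_ub = bootstrap_median_ci(koffs)
print(koff_lb, koff_med, koff_ub)
```

With a single cluster every resample is identical, so the resampling carries no information about spread; a full (non-subset) run with several clusters per variant is needed for meaningful intervals.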