sjkdenny edited this page Aug 18, 2016 · 28 revisions

When you are done with the binding curve or offrate pipelines, you may wish to plot your fit curves to evaluate how good they look.

Two scripts are available in the array_fitting_tools/plotscripts folder that should help you plot binding curves or offrate curves (plotBindingCurves and plotOffrateCurves, respectively). These scripts will plot the per-cluster or per-variant fits, depending on the user inputs.

Note

These scripts allow you to plot from the command line. In practice, you may prefer to load the files in an IPython notebook, for example, and plot interactively. Each script essentially just loads the files and instantiates a class that has the plotting function. Feel free to use this class in whatever context you find more appropriate. Even better, document here how you did it so that other people can do the same!

Plot description

For all plots shown below, the black dots represent the median fluorescence across single clusters at that x, and the error bars represent the 95% confidence intervals on the median fluorescence. The red line indicates the fit. The gray shaded area represents the 95% confidence interval on the fit, including the bounds on fmin and fmax.
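The plotting scripts compute these quantities for you, and this page does not describe how. Purely as an illustration of what "median with a 95% confidence interval" can mean, here is a minimal sketch that uses a percentile bootstrap over the clusters at one x value. The function name, the fluorescence values, and the choice of a percentile bootstrap are all assumptions made for this example, not the package's implementation.

```python
import random
import statistics

def median_with_ci(values, n_boot=1000, alpha=0.05, seed=0):
    """Return (median, lower, upper) from a percentile bootstrap (assumed method)."""
    rng = random.Random(seed)
    values = list(values)
    # Resample the clusters with replacement and record each resample's median.
    boot_medians = sorted(
        statistics.median(rng.choices(values, k=len(values)))
        for _ in range(n_boot)
    )
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = int((1 - alpha / 2) * n_boot) - 1
    return statistics.median(values), boot_medians[lo_idx], boot_medians[hi_idx]

# Made-up fluorescence values for the clusters of one variant at one concentration.
fluorescence = [0.42, 0.51, 0.47, 0.39, 0.55, 0.48, 0.44, 0.50]
median, lower, upper = median_with_ci(fluorescence)
print(median, lower, upper)
```

In the plots, the first value would correspond to a black dot and the other two to the ends of its error bar.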

Plot binding curves

Plot fits per variant

Let's say you have a CPannot file that looks like the following:

                                             variant_number
clusterID
M00653:72:000000000-AKPP5:1:2101:19526:15124              0
M00653:72:000000000-AKPP5:1:2102:4005:7276                0
M00653:72:000000000-AKPP5:1:2102:14626:8517               0
M00653:72:000000000-AKPP5:1:2102:19935:11152              0
M00653:72:000000000-AKPP5:1:2102:23710:15626              0

You would like to plot the final fit of variant 0, after the bootstrapping has occurred. Enter:

basename=bindingCurves/AKPP5_ALL_Bottom_filtered_reduced_normalized
python -m plotBindingCurves -f $basename.CPvariant -cs $basename.CPseries.pkl  -v 0 -an anyRNA.CPannot.pkl -c concentrations.txt -out fitsPlotted --annotate

In the directory fitsPlotted, you should have a file called binding_curve.0.pdf that looks like:

If you want to plot multiple variants, e.g. variants 55 and 176 in addition to variant 0 (chosen here to show a range of Kds), enter them space-separated in the -v input:

basename=bindingCurves/AKPP5_ALL_Bottom_filtered_reduced_normalized
python -m plotBindingCurves -f $basename.CPvariant -cs $basename.CPseries.pkl  -v 0 55 176 -an anyRNA.CPannot.pkl -c concentrations.txt -out fitsPlotted --annotate

More plots are in the fitsPlotted directory:

[Plots: variant 55 (left) and variant 176 (right)]

Note that the fit on the left (variant 55) indicates that fmax was enforced from the distribution of fmaxes.
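If you drive the script from Python (from a notebook, say), the per-variant command above can be assembled programmatically. This is a minimal sketch: the helper name binding_curve_command is hypothetical, and it only uses the flags and file names that appear in the commands above.

```python
import shlex

def binding_curve_command(basename, variants,
                          annot="anyRNA.CPannot.pkl",
                          concentrations="concentrations.txt",
                          outdir="fitsPlotted"):
    """Assemble the per-variant plotBindingCurves call (hypothetical helper)."""
    return [
        "python", "-m", "plotBindingCurves",
        "-f", basename + ".CPvariant",
        "-cs", basename + ".CPseries.pkl",
        "-v", *[str(v) for v in variants],  # space-separated variant numbers
        "-an", annot,
        "-c", concentrations,
        "-out", outdir,
        "--annotate",
    ]

basename = "bindingCurves/AKPP5_ALL_Bottom_filtered_reduced_normalized"
cmd = binding_curve_command(basename, [0, 55, 176])
print(shlex.join(cmd))  # shows the command without running it
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems if a path contains spaces.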

Plot fits per single cluster

If instead you would like to plot each of the individual cluster fits, you can run the same command but with slightly different inputs:

  • The -f or --variant_file input should be changed to the CPfitted file that was the output of the singleClusterFits script.
  • The -v or --variant_number input should be changed to the clusterID(s) that you want to plot.
  • The -an or --annotated_clusters input should not be provided.
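The clusterIDs to pass to -v can be collected from the CPannot table. The sketch below uses a plain dict holding the rows from the CPannot example above, plus one clearly hypothetical row for a different variant so that the filter has something to exclude. How you load the real CPannot file is up to you; the helper name clusters_for_variant is an assumption for this example.

```python
# clusterID -> variant_number, copied from the CPannot example above.
annotations = {
    "M00653:72:000000000-AKPP5:1:2101:19526:15124": 0,
    "M00653:72:000000000-AKPP5:1:2102:4005:7276": 0,
    "M00653:72:000000000-AKPP5:1:2102:14626:8517": 0,
    "M00653:72:000000000-AKPP5:1:2102:19935:11152": 0,
    "M00653:72:000000000-AKPP5:1:2102:23710:15626": 0,
    "hypothetical_cluster_of_variant_1": 1,  # placeholder, not real data
}

def clusters_for_variant(annotations, variant_number, limit=None):
    """Return the clusterIDs annotated with variant_number, in table order."""
    ids = [cid for cid, v in annotations.items() if v == variant_number]
    return ids if limit is None else ids[:limit]

cluster_ids = clusters_for_variant(annotations, 0, limit=4)
# Space-separated string, ready to paste after -v on the command line.
v_arguments = " ".join(cluster_ids)
print(v_arguments)
```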

To plot four of the single cluster fits of variant 0:

basename=bindingCurves/AKPP5_ALL_Bottom_filtered_reduced_normalized
python -m plotBindingCurves -f $basename.CPfitted.pkl -cs $basename.CPseries.pkl  -v M00653:72:000000000-AKPP5:1:2101:19526:15124 M00653:72:000000000-AKPP5:1:2102:4005:7276 M00653:72:000000000-AKPP5:1:2102:14626:8517 M00653:72:000000000-AKPP5:1:2102:19935:11152 -c concentrations.txt -out fitsPlotted --annotate

Produces the following four plots:

Plot off rate curves

Plot fits per variant

Suppose you would like to plot the off rates of variants 0 and 176. Use the plotOffrateCurves script:

basename=offRates/AKPP5_ALL_Bottom_filtered_reduced
normbasename=offRates/AKPP5_ALL_Bottom_filtered_reduced_normalized
python -m plotOffrateCurves -f $normbasename.CPvariant -cs $normbasename.CPseries.pkl  -v 0 176 -an anyRNA.CPannot.pkl -td offRates/rates.timeDict.p -ts $basename.CPtiles.pkl -out fitsPlotted --annotate

Produces the following output:

By default, the script plots all of the tiles on the same plot, each with its own times and error bars. This can be difficult to look at and interpret, so there are options to restrict which tiles are shown.

  • You can restrict the plot to the N tiles with the most clusters (-n N or --numtiles N).
  • You can specify to only look at particular tiles (-t 001 002 or --tile 001 002 to plot tiles 1 and 2).

Plotting top 2 tiles with the most clusters (-n 2):

Plotting just tile 001 (-t 001):

Note: all of the above have the same fit (i.e. red line) regardless of what subset of the data is plotted.
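The tile options are simply appended to the off-rate command shown earlier. As a minimal sketch, a hypothetical helper (offrate_command is not part of the package) can add -n or -t to the argument list when requested; every flag and file name is taken from the commands and options described above.

```python
import shlex

def offrate_command(basename, normbasename, variants,
                    numtiles=None, tiles=None,
                    annot="anyRNA.CPannot.pkl",
                    timedict="offRates/rates.timeDict.p",
                    outdir="fitsPlotted"):
    """Assemble a per-variant plotOffrateCurves call (hypothetical helper)."""
    cmd = [
        "python", "-m", "plotOffrateCurves",
        "-f", normbasename + ".CPvariant",
        "-cs", normbasename + ".CPseries.pkl",
        "-v", *[str(v) for v in variants],
        "-an", annot,
        "-td", timedict,
        "-ts", basename + ".CPtiles.pkl",
        "-out", outdir,
        "--annotate",
    ]
    if numtiles is not None:
        cmd += ["-n", str(numtiles)]  # only the N tiles with the most clusters
    if tiles:
        cmd += ["-t", *tiles]         # only the named tiles, e.g. "001" "002"
    return cmd

basename = "offRates/AKPP5_ALL_Bottom_filtered_reduced"
normbasename = basename + "_normalized"
top_two = offrate_command(basename, normbasename, [0, 176], numtiles=2)
tile_one = offrate_command(basename, normbasename, [0, 176], tiles=["001"])
print(shlex.join(top_two))
print(shlex.join(tile_one))
```

Tile names are kept as strings so that the leading zeros in "001" survive.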

Plot fits per single cluster

Analogously to the binding curve data, you can plot single clusters with the same script by passing clusterIDs to -v and omitting -an:

basename=offRates/AKPP5_ALL_Bottom_filtered_reduced
normbasename=offRates/AKPP5_ALL_Bottom_filtered_reduced_normalized
python -m plotOffrateCurves -f $normbasename.CPvariant -cs $normbasename.CPseries.pkl  -v M00653:72:000000000-AKPP5:1:2101:23567:6964 M00653:72:000000000-AKPP5:1:2111:29470:12239 M00653:72:000000000-AKPP5:1:2116:2661:17489 M00653:72:000000000-AKPP5:1:2104:24780:13535 -td offRates/rates.timeDict.p -ts $basename.CPtiles.pkl -out fitsPlotted --annotate