Add CDRP dataset as BBBC047 #61
Comments
@N3llz can you let @MarziehHaghighi know the BBBC dataset id for this dataset? I know it isn't going to be up for a while, but we just need a number for now so that she can use it in her Rosetta report. |
Yes, let's go with BBBC047. @MarziehHaghighi @shntnu |
Question from Gabriel Musso
During our investigation, we wanted to look for any systemic biases and evaluate options to normalize accordingly. We first began looking at plate specific effects, and then progressed towards the plate groups identified ( We were hoping that you or someone on your team might be able to help us with the following questions:
|
Here's a summary of the prefixes of all the platemaps:
# https://raw.githubusercontent.com/broadinstitute/2015_Bray_GigaScience/master/barcode_platemap_25412.csv
library(tidyverse)  # provides read_csv, rename, mutate, count, str_extract, %>%
plateid_platemap <-
  read_csv("barcode_platemap_25412.csv") %>%
  rename(platemap = Plate_Map_Name,
         plateid = Assay_Plate_Barcode)
# Extract the leading prefix of each platemap name and count platemaps per prefix
plateid_platemap %>%
  mutate(platemap_prefix = str_extract(platemap, "(^[A-Z]-[A-Z0-9]+)")) %>%
  count(platemap_prefix, name = "num_platemaps", sort = TRUE) %>%
  knitr::kable()
Here's a summary of the categories from the data resource paper
Also available is the file cdrp_runs.txt, which contains the run id of each assay plate and the date on which it was run. The data is in this format:
This info was obtained today from Broad's CBIP database via this link (project = |
Thank you @shntnu for looking into this, and further, for finding the date information. It seems like beyond the batch ID alone, the date is highly correlated with the average similarity between controls across plates. For clarity, here’s a brief summary of steps to generate the figure posted below:
|
@nishanthmerwin Thank you for posting your results on this! One thing worth testing is how severe this effect is in the treatment signatures. Systematic effects of this sort (plate-to-plate variation or well position effects) affect DMSO signatures much more than treatment because the former is typically weaker. Here’s what I would test if I suspected that date was driving the effect:
And then test whether the difference in dates between pairs of wells correlates with/predicts the similarity between the pairs. (There are other ways to do this e.g. using random/fixed effects models) |
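A minimal sketch of the pairwise test described above, in Python with only the standard library. The well records, run dates, and profile values are made up for illustration. `cosine_similarity`, `pearson`, and `date_gap_vs_similarity` are hypothetical helper names, and cosine similarity is only an assumed choice of similarity measure. In practice the pairs would presumably be restricted to wells that received the same treatment.

```python
from datetime import date
from itertools import combinations
from math import sqrt


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length profile vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def date_gap_vs_similarity(wells):
    """For every pair of wells, collect (days apart, profile similarity)."""
    gaps, sims = [], []
    for a, b in combinations(wells, 2):
        gaps.append(abs((a["run_date"] - b["run_date"]).days))
        sims.append(cosine_similarity(a["profile"], b["profile"]))
    return gaps, sims


# Hypothetical toy data: three wells with made-up dates and 3-feature profiles.
wells = [
    {"run_date": date(2013, 1, 1), "profile": [1.0, 0.0, 0.0]},
    {"run_date": date(2013, 1, 2), "profile": [0.9, 0.1, 0.0]},
    {"run_date": date(2013, 3, 1), "profile": [0.0, 1.0, 0.0]},
]
gaps, sims = date_gap_vs_similarity(wells)
r = pearson(gaps, sims)
print(f"Pearson r between date gap and similarity: {r:.3f}")
```

A strongly negative correlation would be consistent with similarity dropping as the runs get further apart in time. Because every well appears in many pairs, the pairs are not independent, so a permutation test (shuffling dates across wells) would be a safer way to judge significance than a standard p-value.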
@shntnu @MarziehHaghighi Are the data complete for this dataset? If so, I'll just need access to the files and for someone to fill out the info on this form. |
@MarziehHaghighi – please go ahead and sort this out over the next couple of weeks because you reference this dataset in the NeurIPS paper |
@shntnu Could you please clarify what I should sort out here? |
Ah, whatever @bledford87 said |
@MarziehHaghighi I saw you submitted the form for this, so now I just need a zip file with the images and ground truth. |
The images actually live elsewhere: no actual data will be uploaded for this dataset. The form has the details.
|
Oh interesting, okay thanks! I don't think we've done a BBBC entry like this in the past, linking to the images in another spot, but I'll sort it out next week. |
Indeed, this dataset is unique in that sense.
|
We might reprocess this dataset sometime this year to make the profiles compatible with the feature set being used in https://jump-cellpainting.broadinstitute.org/
If/when we do, we will use this issue to document our progress.
Internal ref: https://broadinstitute.slack.com/archives/C01AF25CQLT/p1643838553629459?thread_ts=1634213872.002500&cid=C01AF25CQLT |
Some notes from the slack msg: |
@AnneCarpenter Per our new protocol, we will make notes here broadinstitute/cellpainting-gallery#13 (comment) |
In case someone is wondering: won't it be confusing that this dataset has an entry in both BBBC and Cell Painting Gallery? Yes! :) But thankfully our Airtable (and Erin/Beth will decide how to make that info public)
Going forward, new profiling datasets will exist only in cellpainting-gallery. See #52 (comment) |
I think this is done, yes? |
|
The description can be copied from the abstract of
https://academic.oup.com/gigascience/article/6/12/giw014/2865213
and then link to that page
Additionally, add these notes