merge Orr's changes into base #12

Open: wants to merge 3 commits into base: master
Binary file added .DS_Store
Binary file not shown.
5 changes: 5 additions & 0 deletions .gitignore
@@ -0,0 +1,5 @@
+.Rproj.user
+.Rhistory
+.RData
+.Ruserdata
+*.Rproj
25 changes: 16 additions & 9 deletions quality_control/QualityControl.Rmd
@@ -7,19 +7,25 @@
 title: "Quality Control"
 output: html_document
 params:
-  PROJECT_ID: "PROJECT_ID"
-  DATASET_DESCRIPTION: "Brief description of the single-cell dataset."
+  PROJECT_ID: "scalable-analysis-playground"
+  DATASET_DESCRIPTION: "Retinal Bipolar Neurons http://www.cell.com/cell/pdf/S0092-8674(16)31007-8.pdf"
   # This table must exist.
-  RAW_DATA_TABLE: "PROJECT_ID_THE_DATA_IS_IN.DATASET_NAME.TABLE_NAME"
+  RAW_DATA_TABLE: "scalable-analysis-playground.mouse_retinal_bipolar.scrna_seq"
   # These tables will be created.
-  CELL_METRICS_TABLE: "DESTINATION_DATASET_NAME.TABLE_NAME"
-  PASSING_CELLS_TABLE: "DESTINATION_DATASET_NAME.TABLE_NAME"
-  GENE_METRICS_TABLE: "DESTINATION_DATASET_NAME.TABLE_NAME"
-  PASSING_GENES_TABLE: "DESTINATION_DATASET_NAME.TABLE_NAME"
+  CELL_METRICS_TABLE: "orr_bp.cell_metrics"
+  PASSING_CELLS_TABLE: "orr_bp.passing_cells"
+  GENE_METRICS_TABLE: "orr_bp.gene_metrics"
+  PASSING_GENES_TABLE: "orr_bp.passing_genes"
+  MT_GENE_TABLE: "orr_bp.gene_table"
+  PASSING_MT_FRACTION: 0.1
+  MIN_GENES: 501 # SQL BETWEEN operator is inclusive for range, want > 500 genes/cell
+  MAX_GENES: 23000
+  MIN_CELLS: 30
+  MIN_COUNTS: 60
   # Only create the tables if they do not already exist. For other options, see
   # https://cloud.google.com/dataflow/model/bigquery-io#writing-to-bigquery
-  WRITE_DISPOSITION: "WRITE_EMPTY"
-
+  CREATE_DISPOSITION: "CREATE_IF_NEEDED"
+  WRITE_DISPOSITION: "WRITE_TRUNCATE" # CHANGE THIS TO WRITE_EMPTY SO USERS DO NOT OVERWRITE TABLE
 # This RMarkdown is a parameterized report. See
 # http://rmarkdown.rstudio.com/developer_parameterized_reports.html
 # for more detail.
@@ -191,4 +197,5 @@ perform_bqquery(sql_path = "passing_genes.sql",

 ```{r passing_counts, comment=NA}
 perform_bqquery(sql_path = "passing_data_counts.sql")
+sessionInfo()
 ```
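Note on the new parameters: the SQL files in this pull request refer to them through `{{ NAME }}` placeholders. How `perform_bqquery` fills these in is not shown in this diff, so the following is only a minimal base-R sketch of that kind of substitution. `render_sql` is a hypothetical helper introduced for illustration, not the repository's implementation.

```r
# Hypothetical helper: replace each {{ NAME }} placeholder with the matching
# entry of a named list. This is an assumption about how templating could
# work, not the repository's perform_bqquery code.
render_sql <- function(template, params) {
  for (name in names(params)) {
    placeholder <- paste0("{{ ", name, " }}")
    # scientific = FALSE keeps numbers such as 23000 in plain notation.
    value <- format(params[[name]], scientific = FALSE)
    template <- gsub(placeholder, value, template, fixed = TRUE)
  }
  template
}

# Values copied from the params block in this pull request.
params <- list(PASSING_MT_FRACTION = 0.1, MIN_GENES = 501, MAX_GENES = 23000)

# WHERE clause as it appears in passing_cells.sql after this change.
template <- paste(
  "WHERE",
  "{{ PASSING_MT_FRACTION }} > mttrans/alltrans",
  "AND gene_cnt BETWEEN {{ MIN_GENES }} AND {{ MAX_GENES }}",
  sep = "\n"
)

rendered <- render_sql(template, params)
cat(rendered, "\n")
```

Plain string substitution like this does no escaping, so it is only reasonable for trusted values such as the report's own parameters.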
516 changes: 516 additions & 0 deletions quality_control/QualityControl.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion quality_control/cell_metrics.sql
@@ -5,7 +5,7 @@
 SELECT
 cell,
 SUM(trans_cnt) AS alltrans,
-SUM(IF(gene LIKE "mt-%",
+SUM(IF(gene IN (SELECT gene FROM `{{ MT_GENE_TABLE }}`),
 trans_cnt,
 0)) AS mttrans,
 COUNT(DISTINCT gene) AS gene_cnt
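Note on this change: mitochondrial transcripts are now identified by membership in an explicit gene table instead of the `mt-` prefix. The base-R sketch below mirrors the per-cell aggregation on a tiny in-memory table. The counts are made up for illustration, and `mt_genes` holds only three of the genes listed in gene_table.csv.

```r
# Three genes taken from gene_table.csv; the real table lists 29.
mt_genes <- c("mt-Nd1", "mt-Co1", "mt-Cytb")

# Hypothetical transcript counts, one row per (cell, gene).
counts <- data.frame(
  cell      = c("c1", "c1", "c1", "c2", "c2"),
  gene      = c("mt-Nd1", "Actb", "mt-Co1", "Actb", "mt-Cytb"),
  trans_cnt = c(4, 92, 4, 80, 20)
)

# Equivalent of: gene IN (SELECT gene FROM MT_GENE_TABLE)
is_mt <- counts$gene %in% mt_genes

# Per-cell totals, as in SUM(trans_cnt) and SUM(IF(..., trans_cnt, 0)).
alltrans <- vapply(split(counts$trans_cnt, counts$cell), sum, numeric(1))
mttrans  <- vapply(split(ifelse(is_mt, counts$trans_cnt, 0), counts$cell),
                   sum, numeric(1))

mt_fraction <- mttrans / alltrans

# Same comparison as passing_cells.sql with PASSING_MT_FRACTION = 0.1.
passing <- 0.1 > mt_fraction
print(data.frame(alltrans, mttrans, mt_fraction, passing))
```

With an explicit list, a gene whose name merely starts with `mt-` but is absent from the table no longer counts toward `mttrans`.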
30 changes: 30 additions & 0 deletions quality_control/gene_table.csv
@@ -0,0 +1,30 @@
+gene
+mt-Nd3
+mt-Tp
+mt-Tn
+mt-Atp6
+mt-Co1
+mt-Co2
+mt-Tf
+mt-Te
+mt-Co3
+mt-Tl2
+mt-Tl1
+mt-Tm
+mt-Ti
+mt-Rnr1
+mt-Tw
+mt-Ts2
+mt-Tv
+mt-Nd4
+mt-Tc
+mt-Nd1
+mt-Ty
+mt-Rnr2
+mt-Ta
+mt-Nd6
+mt-Tt
+mt-Nd2
+mt-Cytb
+mt-Tq
+mt-Nd5
4 changes: 2 additions & 2 deletions quality_control/passing_cells.sql
@@ -9,5 +9,5 @@ SELECT
 FROM
 `{{ CELL_METRICS_TABLE }}`
 WHERE
-.10 > mttrans/alltrans
-AND 500 < gene_cnt
+{{ PASSING_MT_FRACTION }} > mttrans/alltrans
+AND gene_cnt BETWEEN {{ MIN_GENES }} AND {{ MAX_GENES }}
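Note on the threshold change: the old filter used a strict `500 < gene_cnt`, while SQL `BETWEEN` includes both endpoints, which is why `MIN_GENES` is set to 501. The base-R sketch below compares the two rules on a few hypothetical gene counts chosen to sit on either side of each bound.

```r
# Hypothetical gene counts per cell around both thresholds.
gene_cnt <- c(499L, 500L, 501L, 23000L, 23001L)

min_genes <- 501L    # MIN_GENES from the report params
max_genes <- 23000L  # MAX_GENES from the report params

# Old filter: strict lower bound only.
old_rule <- 500L < gene_cnt

# New filter: SQL BETWEEN includes both endpoints.
new_rule <- gene_cnt >= min_genes & gene_cnt <= max_genes

print(data.frame(gene_cnt, old_rule, new_rule))
```

For integer counts the two rules agree at the lower bound. The only behavioral difference is the new upper bound, which drops cells with more than 23000 detected genes.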
4 changes: 2 additions & 2 deletions quality_control/passing_genes.sql
@@ -9,5 +9,5 @@ SELECT
 FROM
 `{{ GENE_METRICS_TABLE }}`
 WHERE
-30 < cell_cnt
-AND 60 < alltrans
+{{ MIN_CELLS }} < cell_cnt
+AND {{ MIN_COUNTS }} < alltrans
13 changes: 13 additions & 0 deletions scalable_analytics.Rproj
@@ -0,0 +1,13 @@
+Version: 1.0
+
+RestoreWorkspace: Default
+SaveWorkspace: Default
+AlwaysSaveHistory: Default
+
+EnableCodeIndexing: Yes
+UseSpacesForTab: Yes
+NumSpacesForTab: 2
+Encoding: UTF-8
+
+RnwWeave: Sweave
+LaTeX: pdfLaTeX