Commit 054c01e

Add files via upload
1 parent 490fa6e commit 054c01e

2 files changed

+68
-64
lines changed

README.Rmd

+21 -14
@@ -14,7 +14,7 @@ knitr::opts_chunk$set(
 options(tibble.print_min = 5, tibble.print_max = 5)
 ```
 
-# SmCCNet: A Comprehensive Tool for Multi-Omics Network Inference <a href="https://liux4283.github.io/SmCCNet"><img src="vignettes/figures/logo.jpg" align="right" height="138" /></a>
+# SmCCNet: A Comprehensive Tool for Multi-Omics Network Inference <a href=""><img src="vignettes/figures/logo.jpg" align="right" height="98" /></a>
 
 <!-- badges: start -->
 [![CRAN status](https://www.r-pkg.org/badges/version/SmCCNet)](https://cran.r-project.org/web/packages/SmCCNet/index.html)
@@ -31,29 +31,37 @@ options(tibble.print_min = 5, tibble.print_max = 5)
 
 
 
-SmCCNet is a framework that adeptly integrates single or multiple omics data types along with a quantitative or binary phenotype of interest. It offers a streamlined setup process that can be tailored manually or configured automatically, ensuring a flexible and user-friendly experience. The algorithm is based on sparse multiple canonical analysis (SmCCA) and is designed for \(T\) omics data types \(X_1, X_2, ..., X_T\) along with a quantitative phenotype \(Y\). SmCCA identifies canonical weights \(w_1, w_2, ..., w_T\) that maximize the sum of pairwise canonical correlations between the omics data and \(Y\), subject to certain constraints. In SmCCNet, LASSO (Least Absolute Shrinkage and Selection Operator) is used as the sparsity constraint function.
+SmCCNet is a framework designed for integrating one or multiple types of omics data with a quantitative or binary phenotype. It's based on the concept of sparse multiple canonical analysis (SmCCA) and sparse partial least squared discriminant analysis (SPLSDA) and aims to find relationships between omics data and a specific phenotype. The framework uses LASSO (Least Absolute Shrinkage and Selection Operator) for sparsity constraints, allowing it to identify significant features within the data.
 
-The algorithm can operate in both weighted and unweighted modes, depending on whether \(a_{i,j}\) and \(b_i\) (scaling factors) are equal or not. When \(a_{i,j}\) and \(b_i\) are not all equal, it corresponds to the weighted version; otherwise, it corresponds to the unweighted version, where \(a_{i,j} = b_i = 1\) for all \(i\) and \(j\).
+The algorithm has two modes: weighted and unweighted. In the weighted mode, it uses different scaling factors for each data type, while in the unweighted mode, all scaling factors are equal. The choice of mode affects how the data is analyzed and interpreted.
 
-The sparsity penalties \(c_t\) determine the number of features included in each subnetwork. SmCCNet follows a workflow that involves creating a network similarity matrix using SmCCA canonical weights from repeated subsampled omics data and the phenotype. It then identifies multi-omics modules relevant to the phenotype. The subsampling scheme enhances network robustness by analyzing a subset of omics features multiple times and aggregating results from each subsampling step.Below are the four steps of SmCCNet workflow
+SmCCNet's workflow consists of four main steps:
 
+**Determine Sparsity Penalties**: The user selects sparsity penalties for omics feature selection, either based on study needs, prior knowledge, or through a K-fold cross-validation procedure. This step ensures the selection of features is generalizable and avoids overfitting.
 
-- Step I: Determine SmCCA sparsity penalties $c_t$. The user can select the penalties for omics feature selection based on the study purpose and/or prior knowledge. Alternatively, one can pick sparsity penalties based on a K-fold cross validation (CV) procedure that minimizes the total prediction error. The K-fold CV procedure ensures selected penalties to be generalizable to similar independent data sets and prevents over-fitting.
-- Step II: Randomly subsample omics features without replacement, apply SmCCA with chosen penalties, and compute a feature relationship matrix for each subset. Repeat the process many times and define the similarity matrix to be the average of all feature relationship matrices.
-- Step III: Apply hierarchical tree cutting to the similarity matrix to find the multi-omics networks. This step simultaneously identifies multiple subnetworks.
-- Step Iv: Prune and summarize each network with our network pruning algorithm.
+**Subsample and Apply SmCCA**: Omics features are randomly subsampled and analyzed using SmCCA with the chosen penalties. This process is repeated multiple times to create a feature relationship matrix, which is then averaged to form a similarity matrix.
+
+**Identify Multi-Omics Networks**: The similarity matrix is analyzed using hierarchical tree cutting to identify multiple subnetworks that are relevant to the phenotype.
+
+**Prune and Summarize Networks**: Finally, the identified networks are pruned and summarized using a network pruning algorithm, refining the results to highlight the most significant findings.
 
 # SmCCNet Key Features
 
+There are three major computational algorithms that are used for different number of datasets and phenotype modalities:
+
+- Sparse Multiple Canonical Correlation Analysis (SmCCA)
+- Sparse Partial Least Squared Discriminant Analysis (SPLSDA)
+- Sparse Canonical Correlation Analysis (SCCA)
+
 Unlock the Power of SmCCNet with These Key Features:
 
 - 🧬 **Multi-Omics Network Inference**
-- With Quantitative Phenotype
-- With Binary Phenotype
+- With Quantitative Phenotype (SmCCA)
+- With Binary Phenotype (SmCCA + SPLSDA)
 
 - 📊 **Single-Omics Network Inference**
-- With Quantitative Phenotype
-- With Binary Phenotype
+- With Quantitative Phenotype (SCCA)
+- With Binary Phenotype (SPLSDA)
 
 - 🚀 **Automation Simplified**
 - Automated SmCCNet with a Single Line of Code
@@ -69,11 +77,10 @@ The final network generated from SmCCNet can be visualized in two ways:
 
 
 
-
 ## General Workflow
 
 
-```{r,echo = FALSE,out.width='90%'}
+```{r,echo = FALSE,out.width='100%'}
 knitr::include_graphics("vignettes/figures/smccnetworkflow.jpg")
 ```

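Reviewer note: the subsample-and-average step described in the README text above (subsample features without replacement, build a feature relationship matrix per subsample, average the matrices) can be sketched as follows. This is a minimal illustration in Python, not the package's R API. The function names are hypothetical, and the sparse weights are a simple stand-in (soft-thresholded correlation with the phenotype) rather than SmCCA itself. It also assumes that a feature left out of a round contributes zero to that round's matrix.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def toy_sparse_weights(columns, y, penalty):
    """Stand-in for SmCCA (assumption): soft-threshold each feature's
    correlation with the phenotype, so weak features get weight zero."""
    weights = []
    for col in columns:
        r = pearson(col, y)
        shrunk = max(abs(r) - penalty, 0.0)
        weights.append(shrunk if r >= 0 else -shrunk)
    return weights


def similarity_matrix(columns, y, penalty=0.2, subsample_frac=0.7,
                      n_rounds=50, seed=0):
    """Average |w w^T| over repeated feature subsamples.

    columns: list of feature columns (each a list of sample values)
    y:       phenotype values, one per sample
    """
    rng = random.Random(seed)
    p = len(columns)
    k = min(p, max(1, int(round(subsample_frac * p))))
    total = [[0.0] * p for _ in range(p)]
    for _ in range(n_rounds):
        idx = rng.sample(range(p), k)  # subsample features without replacement
        w = toy_sparse_weights([columns[j] for j in idx], y, penalty)
        # Relationship matrix for this round: |w_a * w_b| on the sampled features.
        for a, ja in enumerate(idx):
            for b, jb in enumerate(idx):
                total[ja][jb] += abs(w[a] * w[b])
    return [[v / n_rounds for v in row] for row in total]
```

The resulting matrix would then be passed to a hierarchical clustering routine and cut into modules, which corresponds to the "Identify Multi-Omics Networks" step; that part is omitted here.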
README.md

+47 -50
@@ -1,7 +1,7 @@
 
 <!-- README.md is generated from README.Rmd. Please edit that file -->
 
-# SmCCNet: A Comprehensive Tool for Multi-Omics Network Inference <a href="https://liux4283.github.io/SmCCNet"><img src="vignettes/figures/logo.jpg" align="right" height="138" /></a>
+# SmCCNet: A Comprehensive Tool for Multi-Omics Network Inference <a href=""><img src="vignettes/figures/logo.jpg" align="right" height="98" /></a>
 
 <!-- badges: start -->
 
@@ -22,61 +22,58 @@ status](https://www.r-pkg.org/badges/version/SmCCNet)](https://cran.r-project.or
 
 ## Overview
 
-SmCCNet is a framework that adeptly integrates single or multiple omics
-data types along with a quantitative or binary phenotype of interest. It
-offers a streamlined setup process that can be tailored manually or
-configured automatically, ensuring a flexible and user-friendly
-experience. The algorithm is based on sparse multiple canonical analysis
-(SmCCA) and is designed for $T$ omics data types $X_1, X_2, ..., X_T$
-along with a quantitative phenotype $Y$. SmCCA identifies canonical
-weights $w_1, w_2, ..., w_T$ that maximize the sum of pairwise canonical
-correlations between the omics data and $Y$, subject to certain
-constraints. In SmCCNet, LASSO (Least Absolute Shrinkage and Selection
-Operator) is used as the sparsity constraint function.
-
-The algorithm can operate in both weighted and unweighted modes,
-depending on whether $a_{i,j}$ and $b_i$ (scaling factors) are equal or
-not. When $a_{i,j}$ and $b_i$ are not all equal, it corresponds to the
-weighted version; otherwise, it corresponds to the unweighted version,
-where $a_{i,j} = b_i = 1$ for all $i$ and $j$.
-
-The sparsity penalties $c_t$ determine the number of features included
-in each subnetwork. SmCCNet follows a workflow that involves creating a
-network similarity matrix using SmCCA canonical weights from repeated
-subsampled omics data and the phenotype. It then identifies multi-omics
-modules relevant to the phenotype. The subsampling scheme enhances
-network robustness by analyzing a subset of omics features multiple
-times and aggregating results from each subsampling step.Below are the
-four steps of SmCCNet workflow
-
-- Step I: Determine SmCCA sparsity penalties $c_t$. The user can select
-the penalties for omics feature selection based on the study purpose
-and/or prior knowledge. Alternatively, one can pick sparsity penalties
-based on a K-fold cross validation (CV) procedure that minimizes the
-total prediction error. The K-fold CV procedure ensures selected
-penalties to be generalizable to similar independent data sets and
-prevents over-fitting.
-- Step II: Randomly subsample omics features without replacement, apply
-SmCCA with chosen penalties, and compute a feature relationship matrix
-for each subset. Repeat the process many times and define the
-similarity matrix to be the average of all feature relationship
-matrices.
-- Step III: Apply hierarchical tree cutting to the similarity matrix to
-find the multi-omics networks. This step simultaneously identifies
-multiple subnetworks.
-- Step Iv: Prune and summarize each network with our network pruning
-algorithm.
+SmCCNet is a framework designed for integrating one or multiple types of
+omics data with a quantitative or binary phenotype. It’s based on the
+concept of sparse multiple canonical analysis (SmCCA) and sparse partial
+least squared discriminant analysis (SPLSDA) and aims to find
+relationships between omics data and a specific phenotype. The framework
+uses LASSO (Least Absolute Shrinkage and Selection Operator) for
+sparsity constraints, allowing it to identify significant features
+within the data.
+
+The algorithm has two modes: weighted and unweighted. In the weighted
+mode, it uses different scaling factors for each data type, while in the
+unweighted mode, all scaling factors are equal. The choice of mode
+affects how the data is analyzed and interpreted.
+
+SmCCNet’s workflow consists of four main steps:
+
+**Determine Sparsity Penalties**: The user selects sparsity penalties
+for omics feature selection, either based on study needs, prior
+knowledge, or through a K-fold cross-validation procedure. This step
+ensures the selection of features is generalizable and avoids
+overfitting.
+
+**Subsample and Apply SmCCA**: Omics features are randomly subsampled
+and analyzed using SmCCA with the chosen penalties. This process is
+repeated multiple times to create a feature relationship matrix, which
+is then averaged to form a similarity matrix.
+
+**Identify Multi-Omics Networks**: The similarity matrix is analyzed
+using hierarchical tree cutting to identify multiple subnetworks that
+are relevant to the phenotype.
+
+**Prune and Summarize Networks**: Finally, the identified networks are
+pruned and summarized using a network pruning algorithm, refining the
+results to highlight the most significant findings.
 
 # SmCCNet Key Features
 
+There are three major computational algorithms that are used for
+different number of datasets and phenotype modalities:
+
+- Sparse Multiple Canonical Correlation Analysis (SmCCA)
+- Sparse Partial Least Squared Discriminant Analysis (SPLSDA)
+- Sparse Canonical Correlation Analysis (SCCA)
+
 Unlock the Power of SmCCNet with These Key Features:
 
 - 🧬 **Multi-Omics Network Inference**
-- With Quantitative Phenotype
-- With Binary Phenotype
+- With Quantitative Phenotype (SmCCA)
+- With Binary Phenotype (SmCCA + SPLSDA)
 - 📊 **Single-Omics Network Inference**
-- With Quantitative Phenotype
-- With Binary Phenotype
+- With Quantitative Phenotype (SCCA)
+- With Binary Phenotype (SPLSDA)
 - 🚀 **Automation Simplified**
 - Automated SmCCNet with a Single Line of Code
 
@@ -95,7 +92,7 @@ The final network generated from SmCCNet can be visualized in two ways:
 
 ## General Workflow
 
-<img src="vignettes/figures/smccnetworkflow.jpg" width="90%" />
+<img src="vignettes/figures/smccnetworkflow.jpg" width="100%" />
 
 ## Multi-Omics SmCCNet with Quantitative Phenotype

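Reviewer note: the overview text removed in this commit described the SmCCA objective only in words (canonical weights $w_t$, scaling factors $a_{i,j}$ and $b_i$, sparsity penalties $c_t$). For reference, one way to write that objective is sketched below. The unit-norm constraint and the use of the $\ell_1$ norm as the LASSO penalty are assumptions about the exact normalization and are not stated in this diff.

```latex
\[
(w_1, \dots, w_T) \;=\;
\arg\max_{\tilde{w}_1, \dots, \tilde{w}_T}
\left(
  \sum_{i < j} a_{i,j}\, \tilde{w}_i^{\top} X_i^{\top} X_j \tilde{w}_j
  \;+\;
  \sum_{i=1}^{T} b_i\, \tilde{w}_i^{\top} X_i^{\top} Y
\right)
\quad \text{subject to} \quad
\lVert \tilde{w}_t \rVert_2 = 1, \;\;
\lVert \tilde{w}_t \rVert_1 \le c_t, \;\; t = 1, \dots, T.
\]
```

Setting $a_{i,j} = b_i = 1$ for all $i$ and $j$ gives the unweighted version described in the removed text; any other choice of scaling factors gives the weighted version.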