Script for running colonic gene expression and clustering analysis. It clusters differentially expressed genes, then performs enrichment analysis to identify the function of each cluster.
Differentially expressed genes are identified using DESeq2. Comparisons are made across all groups, and genes with log2 fold changes > 0.5 and adjusted p-values < 0.05 are recorded. The significant genes are then selected and their counts scaled. Using the scaled counts, Spearman correlation is computed to identify pairwise relationships between genes and samples. These are visualized via hierarchical clustering. Genes are clustered via k-means using the scaled expression level of each gene. The number of clusters is chosen using the within-cluster sum of squares (WSS) and the elbow method, which resulted in 3 clusters of genes. The function of each cluster is determined using enrichment analysis with clusterProfiler.
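The scaling, Spearman correlation, and WSS steps described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the script's own R code: the function names (zscore_rows, spearman, kmeans_wss) and the toy_counts matrix are hypothetical, scaling is assumed to mean per-gene z-scoring with the sample standard deviation, and k-means is a basic Lloyd's iteration with a fixed seed rather than whatever implementation the script calls.

```python
import math
import random


def zscore_rows(matrix):
    """Scale each gene (row) to mean 0 and sample standard deviation 1."""
    scaled = []
    for row in matrix:
        mean = sum(row) / len(row)
        sd = math.sqrt(sum((x - mean) ** 2 for x in row) / (len(row) - 1))
        scaled.append([(x - mean) / sd if sd > 0 else 0.0 for x in row])
    return scaled


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant vectors."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def sqdist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans_wss(points, k, n_iter=50, seed=0):
    """Run Lloyd's k-means; return (labels, within-cluster sum of squares)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]

    def assign():
        return [min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points]

    for _ in range(n_iter):
        labels = assign()
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    labels = assign()
    wss = sum(sqdist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, wss


# Hypothetical toy matrix: 6 genes (rows) x 4 samples (columns).
toy_counts = [
    [1, 2, 3, 4], [2, 4, 6, 8],   # rising
    [4, 3, 2, 1], [8, 6, 4, 2],   # falling
    [1, 5, 5, 1], [2, 9, 9, 2],   # peaking
]
scaled = zscore_rows(toy_counts)

# Gene-by-gene Spearman correlation matrix on the scaled values.
gene_cor = [[spearman(a, b) for b in scaled] for a in scaled]

# WSS for a range of k; the "elbow" is where the decrease levels off.
wss_by_k = {k: kmeans_wss(scaled, k)[1] for k in range(1, 5)}
print(wss_by_k)
```

Plotting wss_by_k against k and looking for the point where the curve flattens is the elbow heuristic. With a single random start the WSS values can reflect a local optimum, so in practice several starts per k are usually compared.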
The script also produces several handy visualizations: the temporal expression of the clusters, the functions of the clusters, and a heatmap with the gene and sample clusters annotated along the margins.