code cleanup & small changes in stats calc
riasc committed Jun 17, 2024
1 parent 060ff80 commit 6f82fc3
Showing 14 changed files with 411 additions and 75 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/docker.yml
@@ -32,7 +32,7 @@ jobs:
- name: extract version
id: extract_version
run: |
VERSION=$(grep 'ARG VERSION=' Dockerfile | cut -d'=' -f2)
VERSION=${GITHUB_REF#refs/tags/}
echo "::set-output name=VERSION::$VERSION"
- name: build and push
12 changes: 12 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,18 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

# [0.2.2]

## Features

- add test data

## Fix

- Code cleanup
- Fix writing to stats.txt, which caused overwriting across different subcalls


# [0.2.1]

## Features
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -1,5 +1,5 @@
cmake_minimum_required(VERSION 3.22.1)
project(RNAnue VERSION 0.2.1)
project(RNAnue VERSION 0.2.2)
set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED True)
set(CMAKE_CXX_FLAGS -fopenmp)
21 changes: 10 additions & 11 deletions README.md
@@ -1,6 +1,6 @@
[![docker-release](https://github.com/Ibvt/RNAnue/actions/workflows/docker.yml/badge.svg)](https://github.com/Ibvt/RNAnue/actions/workflows/docker.yml)

# RNAnue - 0.2.1
# RNAnue - 0.2.2

## About
RNAnue is a comprehensive analysis to detect RNA-RNA interactions from Direct-Duplex-Detection (DDD) data.
@@ -126,9 +126,8 @@ columns are defined in the following:

### Interaction table

The `analysis` procedure generates `_interactions` files for each library in
which each line represents an annotated split read that is mapped to a
transcript interaction. The fields are defined as follows:
The `analysis` procedure generates `_interactions` files for each library in which each line represents an annotated
split read that is mapped to a transcript interaction. The fields are defined as follows:

| Field | Description |
| ----- | ----------- |
@@ -157,11 +156,9 @@ transcript interaction. The fields are defined as follows:
| mfe | Hybridisation energy of the interaction |
| mfe_struc | Minimum free energy (MFE) structure of interaction in dot-bracket notation |

The main result of an RNAnue analysis are transcript interactions.
They are stored in the file `allints.txt` in the same directory.
Its entries are structured as described in the following where
columns with prefix <sample> are given for each sample specified in
the analysis (within the same file).
The main results of an RNAnue analysis are transcript interactions. They are stored in the file `allints.txt` in the
same directory. Its entries are structured as described below, where columns with the prefix `<sample>` are given
for each sample specified in the analysis (within the same file).

| Field | Description |
|-----------------------| ----------- |
@@ -182,10 +179,12 @@ in JSON graph format. Finally, –stats set to 1 creates a `stats.txt` file that
each step of the analysis.

### Docker
In addition, we provide a ready-to-use Docker container that has RNAnue preconfigured.
https://hub.docker.com/repository/docker/cobirna/rnanue
In addition, we provide a ready-to-use [Docker container](https://hub.docker.com/repository/docker/cobirna/rnanue) that
has RNAnue preconfigured.

### Testing

We provide a test dataset in the [test](./test/data/) folder that can be used to test the installation.

# Troubleshooting
contact [email protected] or create an issue
21 changes: 17 additions & 4 deletions include/Analysis.hpp
@@ -13,24 +13,27 @@
#include <boost/program_options.hpp>
#include <boost/filesystem.hpp>
#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>
#include <boost/accumulators/accumulators.hpp>
#include <boost/accumulators/statistics.hpp>
#include <boost/math/distributions/binomial.hpp>

#include <boost/math/distributions/chi_squared.hpp>

// SeqAn3
#include <seqan3/io/sam_file/all.hpp>
#include <seqan3/core/debug_stream.hpp>

// Class
#include "IBPTree.hpp"
#include "Stats.hpp"

// define tags
using seqan3::operator""_tag;

namespace po = boost::program_options;
namespace fs = boost::filesystem;
namespace pt = boost::property_tree;
namespace jp = boost::property_tree::json_parser;
namespace ma = boost::math;

class Analysis {
@@ -44,10 +47,12 @@ class Analysis {
void normalize(); // normalize the frequencies to 1

// write output files (of the analysis)
void writeStats();
void writeInteractionsHeader(std::ofstream& fout);
void writeAllIntsHeader(std::ofstream& fout);
void addToAllIntsHeader(std::ofstream& fout, std::string key);
void writeAllIntsHeader(std::vector<int> condLastFlag, std::ofstream& fout);
void writeAllInts();
void writeAllIntsCounts();
void writeAllIntsJGF();

// other operations
void addToFreqMap(std::pair<std::string,std::string> key, double value);
@@ -59,19 +64,27 @@ class Analysis {
double calcGCS(std::vector<double>& complementarities);
double calcGHS(std::vector<double>& hybenergies);
double calcStat(dtp::IntKey key, int x);
double calcAdjusted(std::vector<double>& values);

private:
po::variables_map params;
IBPTree features;
std::map<std::pair<std::string,std::string>,double> freq; // strand, name
std::string condition; // buffers the current condition
// maps for storing filters and suppreads
std::vector<std::string> conditions; // buffers all conditions

// maps for storing filters and suppreads (and other information)
std::map<dtp::IntKey, std::vector<double>> suppreads;
std::map<dtp::IntKey, std::vector<std::vector<double>>> complementarities;
std::map<dtp::IntKey, std::vector<std::vector<double>>> hybenergies;

// Stats
std::shared_ptr<Stats> stats;

int repcount; // counter for current replicate
int readcount; // total number of reads
std::vector<int> repcountCond; // number of replicates per condition

};

#endif //RNANUE_ANALYSIS_HPP
8 changes: 6 additions & 2 deletions include/DataTypes.hpp
@@ -75,9 +75,13 @@ namespace dtp {
int alignedCount;
int splitsCount;
int multSplitsCount;
int nSurvivedCount;
int interactionsCount;
StatsFields() : readsCount(0), alignedCount(0), splitsCount(0), multSplitsCount(0), interactionsCount(0) {}
StatsFields(int readsCount, int alignedCount, int splitsCount, int multSplitsCount) :
readsCount(readsCount), alignedCount(alignedCount), splitsCount(splitsCount),
multSplitsCount(multSplitsCount), interactionsCount(0) {} // constructor for analysis class
};
using StatsMap = std::map<std::string, StatsFields>;
using StatsMap = std::map<std::string, std::vector<StatsFields>>;
using SpliceJunctions = std::map<std::string, std::vector<std::pair<size_t,size_t>>>;

// Analysis
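
The change of `StatsMap` from one `StatsFields` per condition to a `std::vector<StatsFields>` per condition implies that counters are now kept per replicate. A minimal sketch of how such a map might be filled is given below. The free functions `reserveStats` and `addReads` are hypothetical stand-ins for the `Stats` member functions declared in this commit, whose bodies are not part of the diff. Note that, as shown in the diff, neither `StatsFields` constructor initialises `nSurvivedCount`; the sketch uses default member initialisers so that no counter starts uninitialised.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Same field names as in the diff; default member initialisers zero every counter.
struct StatsFields {
    int readsCount = 0;
    int alignedCount = 0;
    int splitsCount = 0;
    int multSplitsCount = 0;
    int nSurvivedCount = 0;
    int interactionsCount = 0;
};

// condition -> one StatsFields entry per replicate
using StatsMap = std::map<std::string, std::vector<StatsFields>>;

// Hypothetical helper: make sure the slot for replicate `repl` exists.
void reserveStats(StatsMap& stats, const std::string& condition, int repl) {
    assert(repl >= 0);
    auto& replicates = stats[condition];
    const auto needed = static_cast<std::size_t>(repl) + 1;
    if (replicates.size() < needed) {
        replicates.resize(needed);  // new entries are zero-initialised
    }
}

// Hypothetical helper: add `increment` reads to one replicate of one condition.
void addReads(StatsMap& stats, const std::string& condition, int repl, int increment) {
    reserveStats(stats, condition, repl);
    stats[condition][static_cast<std::size_t>(repl)].readsCount += increment;
}
```

Calling `addReads(stats, "treatment", 1, 5)` on an empty map creates two zeroed entries for `"treatment"` and increments only the second one, so replicates that have not been seen yet keep all counters at zero.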
3 changes: 2 additions & 1 deletion include/SplitReadCalling.hpp
@@ -111,13 +111,14 @@ class SplitReadCalling {
void addComplementarityToSamRecord(SAMrecord &rec1, SAMrecord &rec2, TracebackResult &res);
void addHybEnergyToSamRecord(SAMrecord &rec1, SAMrecord &rec2, double &hyb);
void writeSAMrecordToBAM(auto& bamfile, std::vector<std::pair<SAMrecord, SAMrecord>>& records);
void writeStats();


private:
po::variables_map params;
IBPTree features;
//Stats stats;
std::shared_ptr<Stats> stats;
int replPerCond; // number of replicates per condition
std::string condition; // stores the current condition
std::deque<std::string> refIds; // stores the reference ids
FilterScores filterScores;
14 changes: 9 additions & 5 deletions include/Stats.hpp
@@ -18,6 +18,7 @@ namespace fs = boost::filesystem;
class Stats {
public:
Stats();
Stats(std::string statsFile);

// Move constructor (not needed because it's unique)
Stats(Stats&& other) noexcept : stats(other.stats) {}
@@ -36,13 +37,16 @@ class Stats {
Stats& operator=(const Stats&) = delete;

// getter & setter
void setReadsCount(std::string condition, int increment);
void setAlignedCount(std::string condition, int increment);
void setSplitsCount(std::string condition, int increment);
void setMultSplitsCount(std::string condition, int increment);
void setReadsCount(std::string condition, int repl, int increment);
void setAlignedCount(std::string condition, int repl, int increment);
void setSplitsCount(std::string condition, int repl, int increment);
void setMultSplitsCount(std::string condition, int repl, int increment);
void setInteractionsCount(std::string condition, int repl, int increment);

void reserveStats(std::string condition, int repl); // creates new entry for replicate

// write stats back to file
void writeStats(fs::path outdir);
void writeStats(fs::path outdir, std::string subcall);

private:
dtp::StatsMap stats;
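
The new `subcall` parameter of `writeStats` matches the changelog entry about `stats.txt` being overwritten by different subcalls. How RNAnue resolves this is not visible in the header diff; one plausible approach, sketched below, is to open the file in append mode and write one labelled section per subcall. The sketch is an assumption throughout: it uses `std::filesystem` (C++17) instead of `boost::filesystem`, a reduced `Counts` map instead of `dtp::StatsMap`, and an invented section layout.

```cpp
#include <filesystem>
#include <fstream>
#include <map>
#include <string>

namespace fs = std::filesystem;

// Hypothetical, reduced stand-in for the per-condition counters.
using Counts = std::map<std::string, int>;

// Append one labelled section per subcall instead of truncating stats.txt.
void writeStats(const fs::path& outdir, const std::string& subcall, const Counts& counts) {
    fs::create_directories(outdir);
    // std::ios::app keeps whatever earlier subcalls have already written.
    std::ofstream fout(outdir / "stats.txt", std::ios::app);
    fout << "# " << subcall << '\n';
    for (const auto& [condition, n] : counts) {
        fout << condition << '\t' << n << '\n';
    }
}
```

With this layout, two consecutive calls with different `subcall` labels leave two sections in the same file. The trade-off is that rerunning the same subcall appends a duplicate section, so a real implementation would probably need to replace an existing section rather than append blindly.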
5 changes: 5 additions & 0 deletions include/Utility.hpp
@@ -5,6 +5,8 @@
#include <iostream>
#include <iomanip>
#include <chrono>
#include <random>


// Boost
#include <boost/filesystem.hpp>
@@ -42,10 +44,13 @@ namespace helper {
bool withinRange(int a, int b, int range);
std::string removeNonPrintable(const std::string str);
std::string getTime(); // reports the current time

std::vector<int> lastOccFlag(std::vector<std::string>& vec);
}

namespace stats {
double median(std::vector<double>& values);
double randNum(double min, double max);
}

// sequence input/output
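
The header only declares `lastOccFlag`, `median` and `randNum`; their definitions are not part of this diff. The sketch below shows one way they could behave, under stated assumptions: `lastOccFlag` is assumed to return 1 at the position of the last occurrence of each distinct string and 0 elsewhere (which would fit the `condLastFlag` argument of `writeAllIntsHeader`), `median` is assumed to average the two middle values for even-sized input, and `randNum` is assumed to draw uniformly from `[min, max)`. Unlike the declarations in the diff, the sketch takes its inputs by const reference or by value so that the caller's data is not reordered.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <unordered_set>
#include <vector>

// Assumed meaning: flags[i] == 1 if vec[i] is the last occurrence of its value.
std::vector<int> lastOccFlag(const std::vector<std::string>& vec) {
    std::vector<int> flags(vec.size(), 0);
    std::unordered_set<std::string> seen;
    // Walk backwards: the first time a value is seen is its last occurrence.
    for (std::size_t i = vec.size(); i-- > 0;) {
        if (seen.insert(vec[i]).second) {
            flags[i] = 1;
        }
    }
    return flags;
}

// Median of a non-empty vector; taken by value so sorting does not affect the caller.
double median(std::vector<double> values) {
    assert(!values.empty());
    std::sort(values.begin(), values.end());
    const std::size_t mid = values.size() / 2;
    if (values.size() % 2 == 1) {
        return values[mid];
    }
    return (values[mid - 1] + values[mid]) / 2.0;
}

// Uniform random number in [min, max), seeded once from std::random_device.
double randNum(double min, double max) {
    static std::mt19937 gen{std::random_device{}()};
    std::uniform_real_distribution<double> dist(min, max);
    return dist(gen);
}
```

For the condition list `{"a", "a", "b", "a", "b"}` this version of `lastOccFlag` yields `{0, 0, 0, 1, 1}`, marking the final replicate of each condition.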