added filters in alignment results
Richard A. Schäfer authored and Richard A. Schäfer committed Sep 24, 2020
1 parent 22f6dc6 commit 404b25a
Showing 7 changed files with 661 additions and 72 deletions.
39 changes: 19 additions & 20 deletions README.md
@@ -1,40 +1,32 @@
# RNAnue - 0.0.1

## About

RNAnue is a comprehensive analysis to detect RNA-RNA interactions from Direct-Duplex-Detection (DDD) data.

## Install

### Dependencies

RNAnue has the following dependencies; the versions in brackets are those RNAnue has
been built and tested on. Make sure the requirements are satisfied by your system.

* [Boost C++ Libraries](https://www.boost.org/)
* [Boost C++ Libraries](https://www.boost.org/) (v1.7.2)
* [SeqAn](https://github.com/seqan/seqan3) (v3.0.2)
* [Segemehl](http://www.bioinf.uni-leipzig.de/Software/segemehl/) (v0.3.4)


### CMake

CMake is a cross-platform Makefile generator. For that, we provide the [CMakeLists](./CMakeLists.txt)
to simplify the build process. In particular, it utilizes the instructions given in the CMakeLists

to simplify the build process. In particular, it utilizes the instructions given in the CMakeLists.
It is recommended to create an "out-of-source build". For that, create a build folder (e.g., ./bin)
and run cmake from it, pointing at the root directory.

```
cmake ..\
cmake ../source/
```
This should be sufficient if the dependencies are located in $PATH. Calling `make` builds RNAnue.

## Usage

In principle, the parameters of RNAnue can be specified on the command line.
However

In principle, the parameters of RNAnue can be specified on the command line. However
### Positional Arguments
RNAnue provides different functional arguments for individual procedures.
RNAnue provides different functional arguments for individual procedures. These include `RNAnue preproc`,
`RNAnue align`, `RNAnue clustering`, `RNAnue analysis`.

## Parameters
RNAnue accepts parameter settings both from the command line and through a configuration file.
@@ -44,13 +36,20 @@ is reduced to the following call.
```
RNAnue subcall --config /path/to/params.cfg
```
However,
In any case, parameters specified on the command line take precedence over the config file.
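
The precedence rule stated above can be illustrated with a small, standard-library-only sketch. This is not RNAnue's actual option handling (the commit does not show it); `Params`, `mergeParams`, and the keys `threads` and `outdir` are hypothetical names chosen for illustration.

```cpp
#include <map>
#include <string>

// Hypothetical parameter store: option name -> value (not RNAnue's real type).
using Params = std::map<std::string, std::string>;

// Merge both sources so that values given on the command line win.
Params mergeParams(const Params& cmdline, const Params& config) {
    Params merged = cmdline;                      // start from the command-line values
    merged.insert(config.begin(), config.end());  // insert() skips keys that already exist
    return merged;
}
```

With `cmdline = {{"threads", "8"}}` and `config = {{"threads", "2"}, {"outdir", "/tmp/out"}}`, the merged set keeps `threads` from the command line and takes `outdir` from the config file. Boost.Program_options behaves in a comparable way when the command line is stored before the config file, since `store` does not overwrite values that are already present.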

### Availability
### Docker
In addition, we provide a ready-to-use Docker container that has RNAnue preconfigured.
https://hub.docker.com/repository/docker/cobirna/rnanue

### Preproc

## Usage
### Align

### Clustering

### Analysis

### Configuration
RNAnue provides several possi
## Output

# Troubleshooting
77 changes: 77 additions & 0 deletions include/Align.hpp
@@ -0,0 +1,77 @@
// boost
#include <boost/program_options.hpp>
#include <boost/property_tree/ptree.hpp>
#include <boost/filesystem.hpp>

#include <iostream>
#include <sstream>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <regex>

#include <seqan3/io/alignment_file/all.hpp>
#include <seqan3/std/filesystem>
#include <seqan3/io/alignment_file/sam_tag_dictionary.hpp>
#include <seqan3/alphabet/nucleotide/dna5.hpp>

#include <seqan3/alphabet/cigar/cigar.hpp>
#include <seqan3/alphabet/cigar/cigar_op.hpp>

#include <bitset>

namespace pt = boost::property_tree;
namespace po = boost::program_options;
namespace fs = boost::filesystem;

using seqan3::operator""_tag;
using seqan3::operator""_cigar_op;
using seqan3::operator""_dna5;
using seqan3::get;

// specialize sam_tag_type to declare the value type of the custom SAM tags
template <> struct seqan3::sam_tag_type<"XX"_tag> { using type = int32_t; };
template <> struct seqan3::sam_tag_type<"XY"_tag> { using type = int32_t; };
template <> struct seqan3::sam_tag_type<"XJ"_tag> { using type = int32_t; };
template <> struct seqan3::sam_tag_type<"XH"_tag> { using type = int32_t; };

typedef std::pair<uint32_t,uint32_t> ReadPos;
typedef std::pair<uint64_t,uint64_t> GenomePos;
typedef std::vector<seqan3::cigar> CigarSplt;


typedef std::vector<
std::tuple<
std::string,
seqan3::sam_flag,
std::optional<int32_t>,
std::optional<int32_t>,
std::vector<seqan3::cigar>,
seqan3::dna5_vector,
seqan3::sam_tag_dictionary>> Splts;

class Align {
private:
po::variables_map params;
std::string index;


public:
// constructor
Align(po::variables_map params);
Align();

void buildIndex();
void alignReads(std::string query, std::string matched, std::string splits);
void detSplits(std::string matched, std::string splits);

double hybridize(std::string rna1, std::string rna2);
double complementarity(std::string rna1, std::string rna2);

void processSplits(auto &splitrecords, auto &splitsfile);
std::vector<seqan3::dna5> spanToVec(std::span<seqan3::dna5,-1> seq);

void constructIndex();
void start(pt::ptree sample);
};

4 changes: 4 additions & 0 deletions include/Data.hpp
@@ -11,7 +11,9 @@
#include <boost/property_tree/json_parser.hpp>
#include <boost/foreach.hpp>


#include "SeqRickshaw.hpp"
#include "Align.hpp"

namespace po = boost::program_options;
namespace fs = boost::filesystem;
@@ -40,6 +42,7 @@ class Data {
Data(po::variables_map _params);

// getter & setter
//
pt::ptree getDataStructure();

//
@@ -69,6 +72,7 @@ class Data {
void bla(Callable f);

void preproc();
void align();
};

#endif // DATA_HPP
13 changes: 5 additions & 8 deletions include/SeqRickshaw.hpp
@@ -28,6 +28,8 @@
#include <seqan3/range/views/char_to.hpp>
#include <seqan3/range/views/slice.hpp>

#include <seqan3/std/ranges>

#include <seqan3/range/views/all.hpp>
#include <seqan3/std/ranges> // std::ranges::copy

@@ -85,32 +87,27 @@ class SeqRickshaw {
std::size_t calcReadPos(auto& sequence, std::size_t& left, std::pair<std::size_t,std::size_t>& right);


void panic(State state, std::string pattern, char mismatch, int readPos);

void smallestShift(std::string pattern, std::string suffix, int left);

int transition(std::string pattern, std::string suffix, int readPos, std::size_t& left, std::pair<std::size_t,std::size_t>& right);

int extendblock(std::string pattern, std::string suffix, int& left);

void merging(auto fwd, auto rev);

int reoccurrence(std::string pattern, std::string suffix, std::size_t& left);

std::string longestCommonSubstr(std::string forward, std::string reverse);




// helper
int addState(States &states, State state, States::size_type &size);
int nextReadPos(std::string state, int currReadPos);
std::pair<int,int> countConsecutiveMatches(std::string stateSubstr, int readPos);
// finds all occurrences of substring in string
void findAllOcc(std::vector<std::size_t>& fnd, std::string str, std::string substr);
void writeLookupTable(std::ofstream &os);

void preprocPattern();


void writeLookupTable(std::ofstream &os);


std::size_t boyermoore(auto& read, LookupTable tab, int patlen);
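
The header documents `findAllOcc` only through its comment ("finds all occurrences of substring in string"); the definition is not part of this diff. A minimal sketch of what such a helper could look like is given here as a free function. The name `findAllOccSketch`, the const-reference parameters, the handling of an empty pattern, and the choice to report overlapping matches are all assumptions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Append the start index of every (possibly overlapping) occurrence of
// substr in str to fnd. Hypothetical stand-in for SeqRickshaw::findAllOcc.
void findAllOccSketch(std::vector<std::size_t>& fnd,
                      const std::string& str,
                      const std::string& substr) {
    if (substr.empty()) {
        return;  // an empty pattern would match at every position; report nothing
    }
    std::size_t pos = str.find(substr);
    while (pos != std::string::npos) {
        fnd.push_back(pos);
        pos = str.find(substr, pos + 1);  // advance by one so overlapping hits are found
    }
}
```

Advancing by one position rather than by the pattern length keeps overlapping occurrences, which seems the safer default when the positions feed an adapter-matching step; whether the real implementation does this is not visible from the header.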