A toolset for inferring transcriptional rules from a group of co-regulated genes using Bayesian networks
BBNet: (B)eer's (B)ayesian (Net)work
GBNet: (G)ibbs sampling enhanced (B)ayesian (Net)work
--
Li Shen, UCSD, Feb. 11, 2008
This toolset consists of four programs: func, bayescor, bbnet and gbnet.
bbnet and gbnet take the OUTPUT of bayescor as their INPUT.
bayescor, bbnet and gbnet all depend on the functional depth files from func.
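The dependency order described above can be sketched as a short Python script that only assembles and prints the three command lines. It is an illustration, not part of the toolset: it assumes the programs are on PATH, that a single folder (here "func_folder") is passed as -f to every step, and that the same node gene list is used for bayescor and bbnet. All file names are the example values used in this README.

```python
import shlex

# Assumption: one functional depth folder is shared by func, bayescor and bbnet.
FUNC_DIR = "func_folder"


def build_commands(func_dir=FUNC_DIR, scores="scores.list"):
    """Return the command lines in the order they have to be run."""
    # Step 1: func writes the functional depth files into func_dir.
    func = ["func", "-m", "motif.list", "-w", "pwm", "-g", "genome",
            "-f", func_dir, "-n", "norm.txt"]
    # Step 2: bayescor reads func_dir and writes the motif score list.
    bayescor = ["bayescor", "-m", "motif.list", "-n", "cluster.list",
                "-b", "bkg.list", "-f", func_dir, "-o", scores]
    # Step 3: bbnet takes bayescor's score list as its -s input.
    bbnet = ["bbnet", "-s", scores, "-n", "cluster.list", "-b", "bkg.list",
             "-f", func_dir, "-k", "6.5", "-o", "results_6.5.txt", "-c", "50"]
    return [func, bayescor, bbnet]


if __name__ == "__main__":
    for cmd in build_commands():
        print(shlex.join(cmd))
```

The script does not execute anything; the printed lines can be checked first and then run by hand or passed to subprocess.run.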
******************************************************************************
* func: Prepare the functional depth files for a motif list on all sequences.*
******************************************************************************
INPUT: motif list, PWM folder, genomic sequences file, functional depth folder
and (optionally) normalization constant file
OUTPUT: functional depth files for all motifs, written to one folder
Arguments:
-m motif list
-w PWM folder containing all matrix files
-g a single file containing all promoter sequences (TAB delimited)
-f folder where all functional depth files will be written
-n [optional] a file containing the normalization constants of all motifs
Example: func -m motif.list -w pwm -g genome -f func_folder -n norm.txt
************************************************************************************
* bayescor: Calculate the score of each single TF's presence on a cluster of genes.*
************************************************************************************
INPUT: motif list, node gene list, background gene list and folder to store binding information
OUTPUT: single motif score list
Arguments:
-m motif list
-n node gene list
-b background gene list
-f folder to store binding information
-o motif score list
Example: bayescor -m motif.list -n cluster.list -b bkg.list -f folder -o scores.list
*************************************************************************
* bbnet: Learn transcriptional regulatory rules from a cluster of genes.*
*************************************************************************
INPUT: motif score list, node gene list, background gene list, folder to store binding information and logK value
OUTPUT: results and parameter settings of the Bayesian network run
Arguments:
-s motif score list
-n node gene list
-b background gene list
-f folder to store binding information
-o results output file
Optional:
-k logK value (network complexity penalty, default = 5.0)
-c number of candidate motifs (default = 50)
-d positive negative (for prediction, default = NULL)
This parameter supplies two files containing known positive and negative cases. bbnet will
use the regulatory rules learned by the Bayesian network to predict these genes' categories and output TP, FP, TN and FN.
-l information on which rules each training gene satisfies (default = no output)
-t TAB-delimited file of all genes' translational/transcriptional start site locations (default = right end)
-rb bit-string that determines which rules to include (default = 111110)
This parameter can be used to "mask" out rules that do not make sense in your situation.
Rule order: TSS Orientation Second copy Spacing Order Loop
Default= 1 1 1 1 1 0
-i use mutual information instead of the Bayesian score (default = off)
Example: bbnet -s scores.list -n node.list -b bkg.list -f func -k 6.5 -o results_6.5.txt -c 50
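To make the -rb bit-string easier to read, here is a small Python sketch that maps each bit to the rule name in the order given above. The function name decode_rule_mask is hypothetical and not part of bbnet; the sketch only illustrates how a mask such as 111110 is read.

```python
# Rule order as listed for the -rb option.
RULE_NAMES = ["TSS", "Orientation", "Second copy", "Spacing", "Order", "Loop"]


def decode_rule_mask(bits):
    """Map a 6-character bit-string to {rule name: included?}."""
    if len(bits) != len(RULE_NAMES) or set(bits) - {"0", "1"}:
        raise ValueError("expected %d characters of 0/1, got %r"
                         % (len(RULE_NAMES), bits))
    return {name: bit == "1" for name, bit in zip(RULE_NAMES, bits)}


if __name__ == "__main__":
    # The default mask includes every rule except Loop.
    for name, included in decode_rule_mask("111110").items():
        print("%-12s %s" % (name, "on" if included else "off"))
```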
*******************************************
* Explanation of motif binding file format*
*******************************************
1. File name: motif.func, e.g. YY1.func
Motif functional depth varies from 0.05 to 0.95 in steps of 0.05. Each motif's binding
information must be stored in a file named as above so that the programs can find it.
2. Each line of a binding information file must follow this format:
Gene name\t number of binding sites\t binding site 1\t binding site 2\t ... \t binding site n
Fields are separated by TAB. Example:
Hs.106529 2 F,0.054,772 R,0.97,230
3. Each binding site must follow this format:
orientation,binding score,distance to TSS
Fields are separated by commas. Example:
F,0.054,772
This means the motif binds in the forward orientation with matrix score 0.054, 772 bp upstream of the TSS.
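The file format above can be read with a few lines of Python. This is a minimal sketch, not code from the toolset: the names BindingSite and parse_binding_line are made up for illustration, and reading "R" as the reverse orientation is an assumption (the README only spells out "F").

```python
from typing import List, NamedTuple, Tuple


class BindingSite(NamedTuple):
    orientation: str  # "F" = forward; "R" is assumed to mean reverse
    score: float      # matrix score
    distance: int     # distance to TSS in bp


def parse_binding_line(line: str) -> Tuple[str, List[BindingSite]]:
    """Parse one TAB-separated line: gene, site count, then the sites."""
    fields = line.rstrip("\n").split("\t")
    gene, count = fields[0], int(fields[1])
    sites = []
    for field in fields[2:]:
        if not field:  # tolerate a trailing TAB
            continue
        orientation, score, distance = field.split(",")
        sites.append(BindingSite(orientation, float(score), int(distance)))
    if len(sites) != count:
        raise ValueError("%s: expected %d sites, found %d"
                         % (gene, count, len(sites)))
    return gene, sites


if __name__ == "__main__":
    gene, sites = parse_binding_line("Hs.106529\t2\tF,0.054,772\tR,0.97,230")
    print(gene, sites)
```

Checking the declared site count against the number of parsed sites catches lines where TABs were lost or replaced by spaces.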
***************************************************
* gbnet: Gibbs sampler enhanced Bayesian networks.*
***************************************************
GBNet combines Gibbs sampling and simulated annealing (SA) to search the space of sequence
constraints for the transcriptional module that is best supported by the data.
Pros:
- Less prone to getting stuck in local optima than greedy search
- Searches exhaustively in a stochastic fashion
- Increases the Bayesian score and finds more meaningful rules
Cons:
- Large computational cost
The usage of GBNet is almost the same as that of BBNet. The only difference is that
GBNet uses simulated annealing, so you can specify the SA parameters;
otherwise GBNet uses the default values.
To specify the parameters for SA:
Use: -sa repeats iterations max_changes alpha initial_temperature
E.g. -sa 30 10 300 0.9 10.0 sets SA to start from an initial temperature of 10.0
and cool exponentially with alpha = 0.9. The temperature is lowered at most
30 times. At each temperature, SA runs 10 iterations or makes 300 changes to
the Bayesian network structure, whichever comes first.
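As an illustration of the cooling described above, the following Python sketch lists the temperatures SA would visit for -sa 30 10 300 0.9 10.0. It assumes that "exponential fashion" means the temperature is multiplied by alpha after each repeat; the README does not give the exact update rule, and the function name is hypothetical.

```python
def temperature_schedule(initial_temperature, alpha, repeats):
    """Temperatures for each repeat, assuming T_k = T_0 * alpha**k."""
    return [initial_temperature * alpha ** k for k in range(repeats)]


if __name__ == "__main__":
    # Values taken from the example: -sa 30 10 300 0.9 10.0
    for k, t in enumerate(temperature_schedule(10.0, 0.9, 30)):
        print("repeat %2d  temperature %.4f" % (k + 1, t))
```

Under this assumption a smaller alpha cools faster, so fewer repeats are spent at temperatures high enough to accept changes that lower the score.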
Some default parameters:
Starting Temp:  5.0   Temperature at the initial point
Repeat:         20    Number of times that temperature changes
Iteration:      20    Number of iterations at each temperature
Changes:        500   Number of required changes to Bayesian network before
                      moving to the next temperature
Alpha:          0.9   Temperature change rate
If the Bayesian network does not change at a given temperature after enough iterations,
the process stops, on the assumption that the ground state has been reached.