Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'master' into sandbox-oplatek
Conflicts:
    .gitignore
    INSTALL
    README.txt
    egs/babel/s5/local/generate_proxy_keywords.sh
    egs/wsj/s5/steps/train_nnet_cpu.sh
    egs/wsj/s5/utils/nnet-cpu/make_nnet_config_preconditioned.pl
    src/Makefile
    src/configure
    src/lat/Makefile
    src/makefiles/cygwin.mk
    src/makefiles/darwin_10_5.mk
    src/makefiles/darwin_10_6.mk
    src/makefiles/darwin_10_7.mk
    src/makefiles/darwin_10_8.mk
    src/makefiles/linux_atlas.mk
    src/makefiles/linux_atlas_64bit.mk
    src/makefiles/linux_clapack.mk
    src/makefiles/linux_openblas.mk
    src/nnet-cpu/mixup-nnet.cc
    src/nnet-cpu/nnet-component-test.cc
    src/nnet-cpu/nnet-component.cc
    src/nnet-cpu/nnet-component.h
    src/nnet-cpu/nnet-nnet.cc
    src/nnet-cpu/nnet-nnet.h
    src/nnet-cpu/nnet-update-parallel.cc
    src/nnet-cpu/nnet-update-parallel.h
    src/nnet-cpubin/nnet-train-parallel.cc
    src/nnet/nnet-pdf-prior.h
    src/nnetbin/nnet-forward.cc
    tools/Makefile
    tools/extras/install_portaudio.sh

git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/oplatek@2520 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
Showing 155 changed files with 14,774 additions and 8 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,168 @@ | ||
Installation TIPS for KALDI and installation INSTRUCTIONS for my additional repositories | ||
================================================================================= | ||
Intro | ||
----- | ||
Kaldi has very good instructions and a tutorial | |
for building it from source. The process is easy and straightforward. | |
However, I also needed to build shared libraries, | |
and you may run into some of the same problems. | |
That is why I wrote my build procedure down. | |
|
||
Installing external dependencies | ||
================================ | ||
See `kaldi-trunk/tools/INSTALL` for info. | ||
Basically, it tells you to use `kaldi-trunk/tools/Makefile`, which I also used. | |
|
||
How have I installed OpenBlas? | ||
---------------------- | ||
Simple enough: | ||
```bash | ||
make openblas | ||
``` | ||
|
||
How have I installed Openfst? | ||
---------------------- | ||
To also build the shared libraries, | |
I changed line 37 of | |
`kaldi-trunk/tools/Makefile`: | |
|
||
```sh | ||
*** Makefile | ||
************ | ||
*** 34,38 **** | ||
|
||
openfst-1.3.2/Makefile: openfst-1.3.2/.patched | ||
cd openfst-1.3.2/; \ | ||
! ./configure --prefix=`pwd` --enable-static --disable-shared --enable-far --enable-ngram-fsts | ||
|
||
--- 34,38 ---- | ||
|
||
openfst-1.3.2/Makefile: openfst-1.3.2/.patched | ||
cd openfst-1.3.2/; \ | ||
! ./configure --prefix=`pwd` --enable-static --enable-shared --enable-far --enable-ngram-fsts | ||
|
||
``` | ||
Then I ran | |
```bash | ||
make openfst_tgt | ||
``` | ||
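The one-flag change above can also be scripted instead of edited by hand. This is only a sketch: the function name `enable_openfst_shared` is made up for illustration, and it assumes the `./configure` line in `tools/Makefile` still contains `--disable-shared` exactly as in the diff above.

```bash
# enable_openfst_shared MAKEFILE
# Replace --disable-shared with --enable-shared on lines that call ./configure.
# (Hypothetical helper; not part of Kaldi.)
enable_openfst_shared() {
    makefile=$1
    tmp=$(mktemp) || return 1
    # Only lines containing a ./configure call are touched.
    sed '/\.\/configure/ s/--disable-shared/--enable-shared/' "$makefile" > "$tmp" &&
        cat "$tmp" > "$makefile"    # overwrite contents, keep the file's permissions
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage, from kaldi-trunk/tools and before `make openfst_tgt`:
#   enable_openfst_shared Makefile
```

Writing through a temporary file avoids `sed -i`, whose syntax differs between GNU and BSD sed.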
|
||
How have I installed PortAudio? | ||
-------------------------- | ||
NOTE: This is necessary only for the Kaldi online decoder. | |
|
||
In `kaldi-trunk/tools/extras/install_portaudio.sh` | |
I changed the line | |
``` | ||
./configure --prefix=`pwd`/install | ||
``` | ||
to | ||
``` | ||
./configure --prefix=`pwd`/install --with-pic | ||
``` | ||
|
||
Then I ran | ||
```bash | ||
extras/install_portaudio.sh | ||
``` | ||
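The PortAudio edit can be made repeatable in the same spirit. The sketch below is an assumption-laden helper (`add_with_pic` is a hypothetical name) that appends `--with-pic` to the `./configure --prefix=` line only when the flag is not already present, so running it twice is harmless.

```bash
# add_with_pic SCRIPT
# Append --with-pic to the ./configure --prefix=... line, once.
# (Hypothetical helper; not part of Kaldi.)
add_with_pic() {
    script=$1
    # Nothing to do if the flag is already there (keeps the edit idempotent).
    if grep -q -- '--with-pic' "$script"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # Append the flag to the end of the line that sets the install prefix.
    sed '/\.\/configure --prefix=/ s/$/ --with-pic/' "$script" > "$tmp" &&
        cat "$tmp" > "$script"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage, from kaldi-trunk/tools:
#   add_with_pic extras/install_portaudio.sh
```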
|
||
|
||
How have I built Kaldi? | ||
------------------ | ||
```bash | ||
./configure --openblas-root=`pwd`/../tools/OpenBLAS/install --fst-root=`pwd`/../tools/openfst --static-math=no | ||
``` | ||
|
||
Edit `kaldi.mk` and add the `-fPIC` flag. | |
TODO: It would be nice to do something like | |
```bash | ||
EXTRA_CXXFLAGS=-fPIC make | ||
EXTRA_CXXFLAGS=-fPIC make ext | ||
``` | ||
But the local makefiles override `EXTRA_CXXFLAGS`. | |
|
||
If you updated from the svn repository, do not forget to run `make depend`, | |
since *by default it is turned off! I always forget about that!* | |
``` | ||
# DO NOT FORGET TO CHANGE kaldi.mk TODO SCRIPT IT! | ||
# make depend and make ext_depend are necessary only if dependencies changed | ||
make depend && make ext_depend && make && make ext | ||
``` | ||
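The "TODO SCRIPT IT" note for `kaldi.mk` could be handled along these lines. This is a sketch under assumptions, not the author's method: it assumes the generated `kaldi.mk` sets `CXXFLAGS` as an ordinary make variable, so that a trailing `CXXFLAGS += -fPIC` line is picked up, and `add_fpic` is a hypothetical name. Because `./configure` regenerates `kaldi.mk`, the helper would have to be rerun after each configure.

```bash
# add_fpic KALDI_MK
# Append "CXXFLAGS += -fPIC" to the given kaldi.mk unless -fPIC already appears in it.
# (Hypothetical helper; not part of Kaldi.)
add_fpic() {
    kaldi_mk=$1
    # Skip if an earlier run (or a manual edit) already added the flag.
    if grep -q -- '-fPIC' "$kaldi_mk"; then
        return 0
    fi
    # In GNU make, "+=" appends to the value CXXFLAGS was given earlier in the file.
    printf '\n# Added for building shared libraries\nCXXFLAGS += -fPIC\n' >> "$kaldi_mk"
}

# Usage, from kaldi-trunk/src after ./configure:
#   add_fpic kaldi.mk && make depend && make ext_depend && make && make ext
```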
|
||
How have I updated Kaldi src code? | ||
---------------------------- | ||
I check out the kaldi-trunk version. | |
|
||
[Kaldi install instructions](http://kaldi.sourceforge.net/install.html) | ||
|
||
Note: If you checked out Kaldi before March 2013, you need to relocate your svn working copy. See the instructions in the link above! | |
|
||
|
||
What setup did I use? | ||
-------------------- | ||
To use the Kaldi binaries from anywhere, I add them to `PATH`. | |
In addition, because I compiled Kaldi dynamically linked against `openfst`, I needed to add the `openfst` library directory to `LD_LIBRARY_PATH`. To sum up, I added the following lines to my `.bashrc`: | |
```bash | ||
############# Kaldi ########### | ||
kaldisrc=/net/work/people/oplatek/kaldi/src | ||
export PATH="$PATH":$kaldisrc/bin:$kaldisrc/fgmmbin:$kaldisrc/gmmbin:$kaldisrc/nnetbin:$kaldisrc/sgmm2bin:$kaldisrc/tiedbin:$kaldisrc/featbin:$kaldisrc/fstbin:$kaldisrc/latbin:$kaldisrc/onlinebin:$kaldisrc/sgmmbin | ||
|
||
### Openfst ### | ||
openfst=/ha/home/oplatek/50GBmax/kaldi/tools/openfst | ||
export PATH="$PATH":$openfst/bin | ||
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH":$openfst/lib | ||
``` | ||
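One caveat with the `.bashrc` lines above: if `LD_LIBRARY_PATH` is empty, `"$LD_LIBRARY_PATH":$openfst/lib` begins with a colon, and the dynamic linker treats an empty entry as the current directory. Re-sourcing `.bashrc` also appends the same directories again. A small helper can avoid both; this is a sketch, `append_dir` is a hypothetical name, and the `openfst` path is a placeholder.

```bash
# append_dir VALUE DIR
# Print VALUE with DIR appended, without creating empty or duplicate entries.
# (Hypothetical helper.)
append_dir() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;       # DIR is already listed
        "::")     printf '%s\n' "$2" ;;       # VALUE was empty
        *)        printf '%s\n' "$1:$2" ;;
    esac
}

openfst="$HOME/kaldi/tools/openfst"   # placeholder path, adjust to your checkout
export PATH="$(append_dir "${PATH:-}" "$openfst/bin")"
export LD_LIBRARY_PATH="$(append_dir "${LD_LIBRARY_PATH:-}" "$openfst/lib")"
```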
|
||
Which tool for building a Language Model (LM) have I used? | ||
--------------------------------------------------------- | ||
None. I received a prebuilt LM in ARPA format. | |
|
||
NOTE: I should probably build my own LM. | |
|
||
|
||
How have I installed Atlas? | ||
-------------------- | ||
NOTE: I decided NOT to use ATLAS; I use OpenBLAS INSTEAD. It is open source and lets me build both shared and static libraries in one run. | |
|
||
Nevertheless, this is how I installed ATLAS: | |
|
||
* I installed version atlas3.10.1.tar.bz2 (available at SourceForge). | |
* I unpacked it under `kaldi-trunk/tools`, which created `kaldi-trunk/tools/ATLAS`. | |
* For me, the main problem with building ATLAS was disabling CPU throttling. | |
* I solved it by | ||
|
||
```bash | ||
# I ran the following command as root on my Ubuntu 12.10 machine. | |
# It does not actually turn off CPU throttling, but I do not need an optimized build on my local machine. | |
# I ran it for all 4 of my cores: | |
# for n in 0 1 2 3 ; do echo 'performance' > /sys/devices/system/cpu/cpu${n}/cpufreq/scaling_governor ; done | ||
``` | ||
|
||
* Then I needed to install a Fortran compiler (the error from configure was somewhat hidden by subsequent errors): | |
|
||
```bash | ||
sudo apt-get install gfortran | ||
``` | ||
|
||
* On Ubuntu 12.04 I had an issue with | |
|
||
```bash | ||
/usr/include/features.h:323:26: fatal error: bits/predefs.h | ||
``` | ||
|
||
which I solved by | |
|
||
```bash | ||
sudo apt-get install --reinstall libc6-dev | ||
``` | ||
|
||
* Finally, in `kaldi-trunk/tools/ATLAS` I ran: | |
|
||
```bash | ||
mkdir build | ||
mkdir ../atlas_install | ||
cd build | ||
../configure --shared --incdir=`pwd`/../../atlas_install | ||
make | ||
make install | ||
``` |
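The CPU governor step in the ATLAS list above hardcodes cores 0 to 3. A sketch that covers however many cores the machine exposes is shown below. `set_governor` is a hypothetical helper, and the optional second argument exists only so the function can be tried on a scratch directory. As the original comment says, this changes the frequency governor and does not necessarily disable throttling.

```bash
# set_governor GOVERNOR [CPU_ROOT]
# Write GOVERNOR to the scaling_governor file of every core under CPU_ROOT.
# (Hypothetical helper; CPU_ROOT defaults to the usual Linux sysfs location.)
set_governor() {
    governor=$1
    cpu_root=${2:-/sys/devices/system/cpu}
    for f in "$cpu_root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$f" ] || continue      # glob matched nothing, or the core has no cpufreq
        echo "$governor" > "$f"
    done
}

# Needs root on a real machine:
#   set_governor performance
```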
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
ABOUT | ||
===== | ||
* This is a Git mirror of [Svn trunk of Kaldi project](http://sourceforge.net/projects/kaldi/) | ||
`svn://svn.code.sf.net/p/kaldi/code/trunk` | ||
* In the branch `master` I commit my work. In the branch `svn_mirror` I mirror `svn://svn.code.sf.net/p/kaldi/code/trunk`. In the branch `sandbox-oplatek` I am developing changes which I would like to check in back to Kaldi. | ||
* Currently, I mirror the repository manually, as often as I need to. | |
* The main purpose of the mirror is that I want to build my own decoder and train my decoding models based on an up-to-date Kaldi version. | |
* The recipe for training the models can be found at `egs/kaldi-vystadial-recipe`. | |
* The source code of the Python wrapper for the online decoder is at `src/python-kaldi-decoding`. | |
* Remarks about the new decoder are located at `src/vystadial-decoder`. | |
* I use the `Fake submodules` approach to merge the 3 subprojects into this repository. More about `Fake submodules` [at this blog](http://debuggable.com/posts/git-fake-submodules:4b563ee4-f3cc-4061-967e-0e48cbdd56cb). | |
* I mirror the svn via `git svn`. [Nice intro to git svn](http://viget.com/extend/effectively-using-git-with-subversion), [Walk through](http://blog.shinetech.com/2009/02/17/my-git-svn-workflow/) and [Multiple svn-remotes](http://blog.shuningbian.net/2011/05/git-with-multiple-svn-remotes.html) | ||
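
The manual mirroring described in the list above might look roughly like the following. This is a guess at the workflow, not the author's actual procedure: it assumes `svn_mirror` is a `git svn` tracking branch and that new revisions are merged into `master` afterwards. The `run` wrapper and `DRY_RUN` variable are illustrative additions that let the commands be printed without being executed.

```bash
# run CMD...
# Execute the command, or only print it when DRY_RUN is set. (Hypothetical helper.)
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# update_mirror
# Pull new svn revisions into svn_mirror, then merge them into master.
update_mirror() {
    run git checkout svn_mirror &&
    run git svn rebase &&
    run git checkout master &&
    run git merge svn_mirror
}

# Preview the commands without touching the repository:
#   DRY_RUN=1; update_mirror
```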
|
||
OTHER INFO | ||
---------- | ||
* Read `INSTALL.md` and `INSTALL` first! | ||
* For training models read `egs/kaldi-vystadial-recipe/s5/README.md` | ||
* For building and developing a decoder callable from Python, read `src/python-kaldi-decoding/README.md` | |
* For information about the new decoder, read `src/vystadial-decoder/README.md` | |
* This work is done under [Vystadial project](https://sites.google.com/site/filipjurcicek/projects/vystadial). | ||
|
||
LICENSE | ||
-------- | ||
* We release all the changes in pyKaldi under the `Apache 2.0` license. (Kaldi also uses the `Apache 2.0` license.) | |
* We also want to publicly release the training data in autumn 2013. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,124 @@ | ||
#!/usr/bin/perl | ||
|
||
# Copyright 2012 Johns Hopkins University (Author: Guoguo Chen) | ||
# Apache 2.0. | ||
# | ||
|
||
use strict; | ||
use warnings; | ||
use Getopt::Long; | ||
|
||
my $Usage = <<EOU; | ||
Usage: annotatedKwlist2KWs.pl [options] <kwlist.annot.xml|-> <keywords|-> [category] | ||
e.g.: annotatedKwlist2KWs.pl kwlist.annot.list keywords.list "NGram Order:2,3,4" | ||
This script reads an annotated kwlist xml file and writes a list of keywords, according | ||
to the given categories. The "category" is a "key:value" pair in the annotated kwlist xml | ||
file. For example | ||
1. "NGram Order:2,3,4" | ||
2. "NGram Order:2" | ||
3. "NGram Order:-" | ||
where "NGram Order" is the category name. The first line means print keywords that are | ||
bigram, trigram and 4gram; The second line means print keywords only for bigram; The last | ||
line means print all possible ngram keywords. | ||
If no "category" is specified, the script will print out the possible categories. | ||
Allowed options: | ||
EOU | ||
|
||
GetOptions(); | ||
|
||
@ARGV >= 2 || die $Usage; | ||
|
||
# Work out the input/output sources | |
my $kwlist_filename = shift @ARGV; | ||
my $kws_filename = shift @ARGV; | ||
|
||
my $source = "STDIN"; | ||
if ($kwlist_filename ne "-") { | ||
  open(KWLIST, "<$kwlist_filename") || die "Fail to open kwlist file: $kwlist_filename\n"; | |
  $source = "KWLIST"; | |
} | ||
|
||
# Process kwlist.annot.xml | ||
my %attr; | ||
my %attr_kws; | ||
my $kwid=""; | ||
my $name=""; | ||
my $value=""; | ||
while (<$source>) { | ||
  chomp; | |
  if (m/<kw kwid=/) {($kwid) = /kwid="(\S+)"/; next;} | |
  if (m/<name>/) {($name) = /<name>(.*)<\/name>/; next;} | |
  if (m/<value>/) { | |
    ($value) = /<value>(.*)<\/value>/; | |
    if (defined($attr{$name})) { | |
      $attr{"$name"}->{"$value"} = 1; | |
    } else { | |
      $attr{"$name"} = {"$value", 1}; | |
    } | |
    if (defined($attr_kws{"${name}_$value"})) { | |
      $attr_kws{"${name}_$value"}->{"$kwid"} = 1; | |
    } else { | |
      $attr_kws{"${name}_$value"} = {"$kwid", 1}; | |
    } | |
  } | |
} | ||
|
||
my $output = ""; | ||
if (@ARGV == 0) { | ||
  # If no category provided, print out the possible categories | |
  $output .= "Possible categories are:\n\n"; | |
  foreach my $name (keys %attr) { | |
    $output .= "$name:"; | |
    my $count = 0; | |
    foreach my $value (keys %{$attr{$name}}) { | |
      if ($value eq "") {$value = "\"\"";} | |
      if ($count == 0) { | |
        $output .= "$value"; | |
        $count ++; next; | |
      } | |
      if ($count == 6) { | |
        $output .= ", ..."; | |
        last; | |
      } | |
      $output .= ",$value"; $count ++; | |
    } | |
    $output .= "\n"; | |
  } | |
  print STDERR $output; | |
  $output = ""; | |
} else { | |
  my %keywords; | |
  while (@ARGV > 0) { | |
    my $category = shift @ARGV; | |
    my @col = split(/:/, $category); | |
    @col == 2 || die "Bad category \"$category\"\n"; | |
    $name = $col[0]; | |
    if ($col[1] eq "-") { | |
      foreach my $value (keys %{$attr{$name}}) { | |
        foreach my $kw (keys %{$attr_kws{"${name}_$value"}}) { | |
          $keywords{$kw} = 1; | |
        } | |
      } | |
    } else { | |
      my @col1 = split(/,/, $col[1]); | |
      foreach my $value (@col1) { | |
        foreach my $kw (keys %{$attr_kws{"${name}_$value"}}) { | |
          $keywords{$kw} = 1; | |
        } | |
      } | |
    } | |
  } | |
  foreach my $kw (keys %keywords) { | |
    $output .= "$kw\n"; | |
  } | |
} | ||
|
||
if ($kwlist_filename ne "-") {close(KWLIST);} | ||
if ($kws_filename eq "-") { print $output;} | ||
else { | ||
  open(O, ">$kws_filename") || die "Fail to open file $kws_filename\n"; | |
  print O $output; | |
  close(O); | |
} |
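
To try the script above, a tiny input file is enough. The layout below is inferred only from the regular expressions in the script (a `<kw kwid="...">` line, then `<name>` and `<value>` lines, each on its own line). The surrounding `kwlist`, `kwtext`, `kwinfo`, and `attr` elements, the keyword IDs, and the helper name `write_sample_kwlist` are assumptions for illustration.

```bash
# write_sample_kwlist FILE
# Write a two-keyword annotated list in the assumed layout. (Hypothetical helper.)
write_sample_kwlist() {
    cat > "$1" <<'EOF'
<kwlist>
  <kw kwid="KW-0001">
    <kwtext>hello</kwtext>
    <kwinfo><attr>
      <name>NGram Order</name>
      <value>1</value>
    </attr></kwinfo>
  </kw>
  <kw kwid="KW-0002">
    <kwtext>hello world</kwtext>
    <kwinfo><attr>
      <name>NGram Order</name>
      <value>2</value>
    </attr></kwinfo>
  </kw>
</kwlist>
EOF
}

# Usage, with the script saved under the name shown in its usage message:
#   write_sample_kwlist sample.annot.xml
#   perl annotatedKwlist2KWs.pl sample.annot.xml - "NGram Order:2"
```

Note that the script writes keyword IDs (the `kwid` values), not the keyword text, so the second command would be expected to select the bigram entry by its ID.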