Commit b9248e0

JannikStroetgen authored and jzell committed
adaptations for HeidelTime 2.0
1 parent 63c6b16 commit b9248e0

1 file changed, +7 -5 lines changed

doc/readme.txt

Lines changed: 7 additions & 5 deletions
@@ -198,8 +198,8 @@ set the environment variables.
 To process English, German, Dutch, Spanish, Italian, French, Chinese or Russian documents,
 the TreeTaggerWrapper can be used for pre-processing:
 * Download the TreeTagger and its tagging scripts, installation scripts, as well as
-English, German, and Dutch (or any other) parameter files into one directory from:
-http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/
+English, German, and Dutch (and all required) parameter files into one directory from:
+http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/
 - mkdir treetagger
 - cd treetagger
 - wget http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/tree-tagger-linux-3.2.tar.gz
@@ -211,6 +211,8 @@ set the environment variables.
 - wget http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/italian-par-linux-3.2-utf8.bin.gz
 - wget http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/spanish-par-linux-3.2-utf8.bin.gz
 - wget http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/french-par-linux-3.2-utf8.bin.gz
+- wget http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/portuguese-par-linux-3.2-utf8.bin.gz
+- wget http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/estonian-par-linux-3.2-utf8.bin.gz
 Attention: If you do not use Linux, please download all TreeTagger files directly from
 http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/
 * (OPTIONAL) For Chinese documents, please get the Tokenizer and TreeTagger parameter file
@@ -279,9 +281,9 @@ set the environment variables.
 You will need to enter the full path of the hunpos-1.0-linux directory in the
 HunPosTaggerWrapper.

-To process any of the automatically create, you can use the AllLanguagesTokenizer
-which is part of the heideltime kit. It is a simple (whitespace-based) yet generic
-tool and creaetes sentence and token annotation.
+To process any of the language with automatically created resources, you can use
+the AllLanguagesTokenizer, which is part of the heideltime kit. It is a simple
+(whitespace-based) yet generic tool and creaetes sentence and token annotation.


 For sample UIMA workflows for any of the supported languages, please take a look
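
The per-language wget steps in the readme all follow one naming pattern, so they can be scripted. The following is a minimal sketch, not part of the commit: the function name param_url, the variables BASE_URL, LANGUAGES and DOWNLOAD, and the dry-run behaviour are all assumptions made for illustration. It covers only the UTF-8 parameter files that appear in the hunk above and assumes they keep the <language>-par-linux-3.2-utf8.bin.gz naming.

```shell
#!/bin/sh
# Base URL as it appears in the readme diff above.
BASE_URL="http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data"

# param_url LANGUAGE (hypothetical helper): print the download URL of the
# UTF-8 Linux parameter file for LANGUAGE, assuming the naming pattern
# <language>-par-linux-3.2-utf8.bin.gz seen in the readme.
param_url() {
    printf '%s/%s-par-linux-3.2-utf8.bin.gz\n' "$BASE_URL" "$1"
}

# Languages whose UTF-8 parameter files are listed in the hunk above.
LANGUAGES="italian spanish french portuguese estonian"

# Dry run by default: only print each URL.
# Set DOWNLOAD=1 (an assumed switch) to actually fetch the files with wget.
for lang in $LANGUAGES; do
    url=$(param_url "$lang")
    if [ "${DOWNLOAD:-0}" = "1" ]; then
        wget "$url"
    else
        echo "$url"
    fi
done
```

Run it once without DOWNLOAD set to check the printed URLs against the readme, then rerun it inside the treetagger directory with DOWNLOAD=1. As the readme notes, users on systems other than Linux should download the files directly from the TreeTagger page instead.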
