Commit
Moved SMA to the new infra. Most files in $GTHOME/gt/sma/src/ have been moved, a few remain, but all the core analyzers should be there.
Showing 75 changed files with 71,061 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
Authors of gtlangs-sma package. | ||
|
||
The following people have legal copyright to files and software in this | ||
directory and its subdirectories: | ||
|
||
__FIXME__ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
It is of extreme importance that you fill in this file with copyright | ||
information about the morphology contained in this directory. | ||
|
||
__FIXME__ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
## Process this file with automake to produce Makefile.in | ||
## Copyright: Sámediggi/Divvun/UiT | ||
## Licence: GPL v3+ | ||
|
||
ACLOCAL_AMFLAGS = -I m4 | ||
SUBDIRS = . src tools doc test | ||
|
||
EXTRA_DIST = und.timestamp | ||
|
||
configure: und.timestamp | ||
|
||
und.timestamp: ${GTCORE}/templates/und/und.timestamp | ||
@echo | ||
@echo " The build templates are newer than this language directory" | ||
@echo " To get new build rules and conventions, run: " | ||
@echo | ||
@echo "${GTCORE}/scripts/merge-templates.sh" | ||
@echo | ||
@echo " The build will die now, but if you do not want to update your" | ||
@echo " templates, touch $@ and run make again." | ||
@exit 1 |
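The `und.timestamp` rule above relies on make's ordinary modification-time comparison: the recipe fires only when the template timestamp under `${GTCORE}` is newer than the local copy, or when the local copy is missing. A minimal shell sketch of the same check, assuming a shell whose `test` supports `-nt` (bash, dash and ksh do). The function name `templates_newer` and the demo files are hypothetical and not part of the build system:

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when the template timestamp is
# newer than the local one, i.e. when make would run the recipe above.
# usage: templates_newer TEMPLATE_TIMESTAMP LOCAL_TIMESTAMP
templates_newer() {
    template=$1
    local_copy=$2
    # A missing local copy also counts as out of date, as it does for make.
    [ ! -e "$local_copy" ] || [ "$template" -nt "$local_copy" ]
}

# Demo with throwaway files; the dates are arbitrary.
demo_dir=$(mktemp -d)
touch -t 202001010000 "$demo_dir/local.timestamp"
touch -t 202101010000 "$demo_dir/template.timestamp"

if templates_newer "$demo_dir/template.timestamp" "$demo_dir/local.timestamp"; then
    echo "templates are newer: run merge-templates.sh or touch the local timestamp"
fi

rm -rf "$demo_dir"
```

Touching the local `und.timestamp`, as the message suggests, makes it the newer file, so the check stops firing until the templates change again.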
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,97 @@ | ||
.. -*- mode: rst -*- | ||
======================================== | ||
The Southern Sami morphology and tools | ||
======================================== | ||
|
||
This directory contains source files for the Southern Sami language morphology and | ||
dictionary. The data and implementation are licenced under the __LICENCE__ licence, | ||
which is also detailed in the LICENCE file of this directory. The authors named in the | ||
AUTHORS file are available to grant other licencing choices. | ||
|
||
Installation and compilation, and a short note on usage, are documented in the | ||
file INSTALL. | ||
|
||
Documentation is scattered across Giellatekno's pages, e.g.: | ||
|
||
* http://giellatekno.uit.no/smadoc/index.html | ||
* http://giellatekno.uit.no/doc/tools/docu-sma-manual.html | ||
|
||
Requirements | ||
------------ | ||
|
||
In order to compile and use the Southern Sami language morphology and dictionaries, | ||
you need: | ||
|
||
* Xerox Finite-State Morphology tools, or | ||
|
||
* Helsinki Finite-State Technology library and tools, version 3.3.2 or newer | ||
|
||
Optionally: | ||
|
||
* VislCG3 Constraint Grammar tools | ||
|
||
Downloading | ||
----------- | ||
|
||
The Southern Sami language sources can be acquired from the `Giellatekno SVN | ||
repository <http://divvun.no/doc/infra/anonymous-svn.html>`_, in the | ||
language-specific directory, after the core has been downloaded and the initial | ||
setup has been performed. | ||
|
||
Installation | ||
------------ | ||
|
||
INSTALL describes the GNU build system in detail, but for most users the usual:: | ||
|
||
./configure | ||
make | ||
(as root) make install | ||
|
||
should result in a local installation and:: | ||
|
||
(as root) make uninstall | ||
|
||
in its uninstallation. | ||
|
||
If you would rather install in e.g. your home directory | ||
(or aren't the system administrator), you can tell ./configure:: | ||
|
||
./configure --prefix=$HOME | ||
|
||
If you are checking out the development versions from SVN, you must first create | ||
and install the necessary autotools files from the host system:: | ||
|
||
autoreconf -i | ||
|
||
It is common practice to keep `generated files out of version control | ||
<http://www.gnu.org/software/automake/manual/automake.html#CVS>`_. | ||
|
||
VPATH builds | ||
------------ | ||
|
||
If you want to keep the source code tree clean, a VPATH build is the solution. | ||
The idea is to create a build dir somewhere outside of the source code tree, | ||
and call ``configure`` from there. Here is one VPATH variant of the standard | ||
procedure:: | ||
|
||
mkdir build && cd build | ||
../configure | ||
make | ||
(as root) make install | ||
|
||
This will keep all the generated files within the build/ dir, and keep the src/ | ||
dir (mostly) free of generated files. | ||
|
||
WARNING!!! Presently there is a bug in Xerox LexC that limits the size of the | ||
input argument to 1064 chars. As VPATH builds tend to increase the character | ||
count in the input string (all filenames are prefixed with a relative path from | ||
the VPATH build dir to the source dir), it is easy to break this limit. If you | ||
follow the simple example above, you should be fine (it has been tested), but | ||
if you try to build from a more distant location you will be out of luck (this | ||
is also tested and confirmed). A bug report will be sent to the Xerox tools' | ||
authors. | ||
|
||
For further installation instructions, refer to the file ``INSTALL``, which contains | ||
the standard installation instructions for GNU autoconf-based software. | ||
|
||
.. vim: set ft=rst: |
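The length limit described in the README warning above can be estimated before starting a long VPATH build. The following is a rough sketch, not part of the build system: it joins the lexc file names with single spaces, each prefixed with the relative path from the build dir to the source dir, and compares the length with the 1064-character limit quoted in the warning. The function name `lexc_arg_length` and the sample paths are hypothetical, and how Xerox LexC actually counts its input argument is an assumption.

```shell
#!/bin/sh
# Limit quoted in the README warning above.
LEXC_ARG_LIMIT=1064

# Hypothetical helper: print the length of the space-separated argument
# string that results from prefixing every file name with a path prefix.
# usage: lexc_arg_length PREFIX FILE...
lexc_arg_length() {
    prefix=$1
    shift
    args=""
    for f in "$@"; do
        # Separate entries with one space, as make does when expanding $^.
        args="${args:+$args }$prefix$f"
    done
    printf '%s' "${#args}"
}

# Sample prefix and file names, for illustration only.
len=$(lexc_arg_length ../src/morphology/ root.lexc stems/nouns.lexc stems/verbs.lexc)
if [ "$len" -gt "$LEXC_ARG_LIMIT" ]; then
    echo "argument string is $len chars: over the limit"
else
    echo "argument string is $len chars: within the limit"
fi
```

Because the prefix is repeated once per file, a longer relative path grows the total by the number of lexc files times the extra prefix length, which is why distant build dirs hit the limit quickly.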
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
## Process this file with automake to produce Makefile.in | ||
|
||
## Copyright (C) 2011 Samediggi | ||
|
||
## This program is free software: you can redistribute it and/or modify | ||
## it under the terms of the GNU General Public License as published by | ||
## the Free Software Foundation, either version 3 of the License, or | ||
## (at your option) any later version. | ||
|
||
## This program is distributed in the hope that it will be useful, | ||
## but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
## GNU General Public License for more details. | ||
|
||
## You should have received a copy of the GNU General Public License | ||
## along with this program. If not, see <http://www.gnu.org/licenses/>. | ||
|
||
####### Source file defs: ######## | ||
|
||
# need to check way to list build targets like automake | ||
#! @param GT_FILTER_SRCS required, list all regular expressions used | ||
EXTRA_DIST=$(GT_FILTER_SRCS) | ||
|
||
####### Automake targets: ######## | ||
|
||
# @param GT_FILTER_TARGETS required | ||
noinst_DATA=$(GT_FILTER_TARGETS) | ||
|
||
####### HFST build rules: ######## | ||
|
||
# $@.tmp is an assumed name for the temporary script file. | ||
.regex.hfst: | ||
$(PRINTF) \ | ||
"read regex @re\"$<\";\nsave stack $@ \n" > $@.tmp | ||
$(HFST_XFST) -f $@.tmp | ||
-rm -f $@.tmp | ||
|
||
%.hfst: $(GTCORE)/gtshared/src/filters/%.regex | ||
$(HFST_REGEXP2FST) $(HFSTFLAGS) -i $< -o $@ | ||
|
||
####### Xerox build rules: ####### | ||
|
||
.regex.xfst: | ||
$(PRINTF) "read regex @re\"$<\";\nsave stack $@\nquit\n" |\ | ||
$(XFST) $(XFSTFLAGS) | ||
|
||
%.xfst: $(GTCORE)/gtshared/src/filters/%.regex | ||
$(PRINTF) "read regex @re\"$<\";\nsave stack $@\nquit\n" |\ | ||
$(XFST) $(XFSTFLAGS) | ||
|
||
####### Other targets: ########### | ||
clean-local: | ||
-rm -f *.hfst *.xfst *.ol | ||
|
||
# vim: set ft=automake: |
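Both regex rules above generate a small xfst script with `$(PRINTF)` and hand it to the compiler. A minimal sketch of the text the Xerox variant pipes into `$(XFST)`, which can help when inspecting what the tool receives. The function name `xfst_regex_script` and the sample file names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the xfst commands that the Xerox rule
# generates for one source file and one target file.
# usage: xfst_regex_script SOURCE.regex TARGET
xfst_regex_script() {
    # %s keeps the file names literal; the quoting matches the rule's
    # read regex @re"FILE"; form.
    printf 'read regex @re"%s";\nsave stack %s\nquit\n' "$1" "$2"
}

# Dry run with made-up file names: show the script instead of piping it to xfst.
xfst_regex_script example-filter.regex example-filter.xfst
```

Piping the function's output into the real tool would mirror the rule, but the dry run is enough to confirm the quoting around the source file name.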
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
## Process this file with automake to produce Makefile.in | ||
|
||
## Copyright (C) 2011 Samediggi | ||
|
||
## This program is free software: you can redistribute it and/or modify | ||
## it under the terms of the GNU General Public License as published by | ||
## the Free Software Foundation, either version 3 of the License, or | ||
## (at your option) any later version. | ||
|
||
## This program is distributed in the hope that it will be useful, | ||
## but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
## GNU General Public License for more details. | ||
|
||
## You should have received a copy of the GNU General Public License | ||
## along with this program. If not, see <http://www.gnu.org/licenses/>. | ||
|
||
GT_DICTIONARY_HFST=dictionary.default.hfst | ||
|
||
####### Automake targets: ######## | ||
|
||
GT_ERRMODELS= | ||
if WANT_HFST | ||
#GT_ERRMODELS+=errmodel.edit-distance-1.hfst | ||
GT_ERRMODELS+=errmodel.default.hfst | ||
GT_SPELLING_HFST=speller.zhfst | ||
endif | ||
|
||
if WANT_HFST | ||
voikkosharedir=$(libdir)/voikko/2/mor-hfst-$(GTLANG2)/ | ||
#! @param GT_VOIKKO optional, set to spell checker automata names if | ||
#! installable | ||
voikkoshare_DATA=$(GT_SPELLING_HFST) voikko-fi_FI.pro | ||
endif | ||
|
||
noinst_DATA=$(GT_ERRMODELS) | ||
|
||
####### HFST build rules: ######## | ||
|
||
# Error model building: | ||
%.hfst: %.txt $(top_srcdir)/$(GT_DICTIONARY_HFST) | ||
$(GTCORE)/scripts/editdist.py -v -s -d 2 -e '@0@' -i $< \ | ||
-a $(top_srcdir)/$(GT_DICTIONARY_HFST) | \ | ||
$(HFST_TXT2FST) $(HFST_FLAGS) -e '@0@' | \ | ||
$(HFST_FST2FST) $(HFST_FLAGS) -f olw -o $@ | ||
|
||
acceptor.default.hfst: $(top_srcdir)/$(GT_DICTIONARY_HFST) | ||
$(HFST_FST2FST) $(HFST_FLAGS) -f olw $< -o $@ | ||
|
||
$(GT_SPELLING_HFST): acceptor.default.hfst \ | ||
$(GT_ERRMODELS) \ | ||
$(GT_OTHER_VOIKKO_FILES) \ | ||
index.xml | ||
$(ZIP) $(ZIPFLAGS) $@ $^ | ||
|
||
####### Other targets: ########### | ||
clean-local: | ||
-rm -f *.hfst *.xfst | ||
|
||
# vim: set ft=automake: |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
## Process this file with automake to produce Makefile.in | ||
|
||
|
||
# vim: set ft=automake: |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
## Include this file in the lexc directory to build lexical automata | ||
|
||
## Copyright (C) 2011 Samediggi | ||
|
||
## This program is free software: you can redistribute it and/or modify | ||
## it under the terms of the GNU General Public License as published by | ||
## the Free Software Foundation, either version 3 of the License, or | ||
## (at your option) any later version. | ||
|
||
## This program is distributed in the hope that it will be useful, | ||
## but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
## GNU General Public License for more details. | ||
|
||
## You should have received a copy of the GNU General Public License | ||
## along with this program. If not, see <http://www.gnu.org/licenses/>. | ||
|
||
####### Source file defs: ######## | ||
|
||
#! @param GT_LEXC_ROOT required, define name of file holding root lexicon | ||
#! @param GT_LEXC_SRCS optional, space separated list of file names holding | ||
#! supplementary lexicon data | ||
#! @param GENERATED_LEXC_SRCS optional, space separated list of file names | ||
#! holding supplementary lexicon data generated from xml files. | ||
#! @param GT_XML_SRCS optional, space separated list of xml source file names | ||
GT_LEXC_ALLSRC=$(GT_LEXC_ROOT) $(GT_LEXC_SRCS) \ | ||
$(GENERATED_LEXC_SRCS) $(GT_XML_SRCS) | ||
|
||
# All sources need to be included in the tarball | ||
EXTRA_DIST=$(GT_LEXC_ALLSRC) | ||
|
||
####### Automake targets: ######## | ||
|
||
# The transducers we build and don't distribute depend on the configuration: | ||
GT_LEXICAL= | ||
if WANT_HFST | ||
GT_LEXICAL+=$(GTLANG).lexc.hfst | ||
endif | ||
if WANT_XFST | ||
GT_LEXICAL+=$(GTLANG).lexc.xfst | ||
endif | ||
noinst_DATA=$(GT_LEXICAL) | ||
|
||
####### XML2LEXC build rules: #### | ||
JV = java | ||
MF = -Xmx1024m | ||
#EF = -it main # Saxon-B compatible version | ||
EF = -it:main # Saxon-HE compatible version | ||
XSLPROC = net.sf.saxon.Transform | ||
XSLDIR = $(GTCORE)/scripts/xsl | ||
curdir := $(shell pwd) | ||
|
||
# This target will convert each individual xml file to lexc: | ||
stems/%.lexc: stems/%.xml $(XSLDIR)/generate_lex-fileVM.xsl | ||
$(JV) $(MF) $(XSLPROC) $(EF) $(XSLDIR)/generate_lex-fileVM.xsl \ | ||
inFile=$(curdir)/$< > $@ | ||
|
||
####### HFST build rules: ######## | ||
|
||
# lexical transducer building rules: | ||
# for HFST ($@.tmp is an assumed name for the intermediate foma-format file): | ||
$(GTLANG).lexc.hfst: $(GT_LEXC_ROOT) $(GT_LEXC_SRCS) $(GENERATED_LEXC_SRCS) | ||
$(HFST_LEXC) -f foma -o $@.tmp $^ | ||
$(HFST_FST2FST) -f openfst-tropical -i $@.tmp -o $@ | ||
-rm -f $@.tmp | ||
|
||
####### Xerox build rules: ####### | ||
|
||
$(GTLANG).lexc.xfst: $(GT_LEXC_ROOT) $(GT_LEXC_SRCS) $(GENERATED_LEXC_SRCS) | ||
$(PRINTF) "compile-source $^\nsource-to-result\nsave-source $@\nquit\n" |\ | ||
$(LEXC) | ||
|
||
####### Other targets: ########### | ||
clean-local: | ||
-rm -f $(GTLANG).lexc.hfst $(GTLANG).lexc.xfst | ||
|
||
maintainer-clean-local: | ||
-rm -f $(GENERATED_LEXC_SRCS) |
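The XML2LEXC rule above expands to one Saxon invocation per stem file. A dry-run sketch that prints the command line the rule would produce, assuming the variable values shown in the file. The function name `xml2lexc_command` and the sample directories are hypothetical, and nothing is executed:

```shell
#!/bin/sh
# Values copied from the variable definitions in the file above.
JV=java
MF=-Xmx1024m
EF=-it:main                      # Saxon-HE compatible form
XSLPROC=net.sf.saxon.Transform

# Hypothetical helper: print the Saxon command for one stem file.
# usage: xml2lexc_command XSLDIR SRCDIR STEM
xml2lexc_command() {
    xsldir=$1
    srcdir=$2
    stem=$3
    printf '%s %s %s %s %s/generate_lex-fileVM.xsl inFile=%s/stems/%s.xml > stems/%s.lexc\n' \
        "$JV" "$MF" "$XSLPROC" "$EF" "$xsldir" "$srcdir" "$stem" "$stem"
}

# Dry run with made-up directories and stem name.
xml2lexc_command /opt/gtcore/scripts/xsl /home/user/sma/src/morphology nouns
```

The absolute `inFile=` path corresponds to the rule's use of `$(curdir)`, which is set from `pwd` when make starts.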