Moved SMA to the new infra. Most files in $GTHOME/gt/sma/src/ have been moved, a few remain, but all the core analyzers should be there.
snomos committed Aug 28, 2012
1 parent 32b8af2 commit 2fc88e7
Showing 75 changed files with 71,061 additions and 0 deletions.
6 changes: 6 additions & 0 deletions AUTHORS
@@ -0,0 +1,6 @@
Authors of gtlangs-sma package.

The following people have legal copyright to files and software in this
directory and its subdirectories:

__FIXME__
365 changes: 365 additions & 0 deletions INSTALL

Large diffs are not rendered by default.

4 changes: 4 additions & 0 deletions LICENCE
@@ -0,0 +1,4 @@
It is of extreme importance that you fill in this file with copyright
information about the morphology contained in this directory.

__FIXME__
21 changes: 21 additions & 0 deletions Makefile.am
@@ -0,0 +1,21 @@
## Process this file with automake to produce Makefile.in
## Copyright: Sámediggi/Divvun/UiT
## Licence: GPL v3+

ACLOCAL_AMFLAGS = -I m4
SUBDIRS = . src tools doc test

EXTRA_DIST = und.timestamp

configure: und.timestamp

und.timestamp: ${GTCORE}/templates/und/und.timestamp
	@echo
	@echo " The build templates are newer than this language directory"
	@echo " To get new build rules and conventions, run: "
	@echo
	@echo "${GTCORE}/scripts/merge-templates.sh"
	@echo
	@echo " The build will die now, but if you do not want to update your"
	@echo " templates, touch $@ and run make again."
	@exit 1
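
The `und.timestamp` rule above relies on make's ordinary freshness check: the recipe runs when the template timestamp in `${GTCORE}` is newer than the local copy. A minimal shell sketch of that comparison follows. The function name `templates_are_newer` and the temporary file names are hypothetical and exist only for illustration; the real check is done by make itself.

```shell
# Sketch only: models the "prerequisite is newer than target" case that
# triggers the und.timestamp recipe. In the Makefile the two files are
# ${GTCORE}/templates/und/und.timestamp and ./und.timestamp.
templates_are_newer() {
    # $1 = template timestamp file, $2 = local timestamp file
    [ "$1" -nt "$2" ]
}

tmp=$(mktemp -d)
touch -t 201201010000 "$tmp/local.timestamp"      # older local copy
touch -t 201208280000 "$tmp/template.timestamp"   # newer template
if templates_are_newer "$tmp/template.timestamp" "$tmp/local.timestamp"; then
    echo "templates are newer: run merge-templates.sh or touch und.timestamp"
fi
rm -rf "$tmp"
```

Touching the local `und.timestamp`, as the message suggests, makes it the newer of the two files, so the rule no longer fires.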
97 changes: 97 additions & 0 deletions README
@@ -0,0 +1,97 @@
.. -*- mode: rst -*-
========================================
The Southern Sami morphology and tools
========================================

This directory contains the source files for the Southern Sami morphology and
dictionary. The data and implementation are licensed under the __LICENCE__
licence, which is also detailed in the LICENCE file in this directory. The
authors named in the AUTHORS file can grant other licensing terms on request.

Installation, compilation, and a short note on usage are documented in the
file INSTALL.

Documentation is spread across Giellatekno's pages, e.g.:

* http://giellatekno.uit.no/smadoc/index.html
* http://giellatekno.uit.no/doc/tools/docu-sma-manual.html

Requirements
------------

To compile and use the Southern Sami morphology and dictionaries, you need:

* Xerox Finite-State Morphology tools, or

* Helsinki Finite-State Technology library and tools, version 3.3.2 or newer

Optionally:

* VislCG3 Constraint Grammar tools

Downloading
-----------

The Southern Sami language sources can be checked out from the `Giellatekno SVN
repository <http://divvun.no/doc/infra/anonymous-svn.html>`_, in the
language-specific directory, once the core has been downloaded and the initial
setup has been performed.

Installation
------------

INSTALL describes the GNU build system in detail, but for most users the usual::

    ./configure
    make
    (as root) make install

should result in a local installation and::

    (as root) make uninstall

in its uninstallation.

If you would rather install in e.g. your home directory
(or aren't the system administrator), you can tell ./configure::

    ./configure --prefix=$HOME

If you are checking out the development version from SVN, you must first
generate and install the necessary autotools files on the host system::

    autoreconf -i

It is common practice to keep `generated files out of version control
<http://www.gnu.org/software/automake/manual/automake.html#CVS>`_.

VPATH builds
------------

If you want to keep the source code tree clean, a VPATH build is the solution.
The idea is to create a build dir somewhere outside of the source code tree,
and call ``configure`` from there. Here is one VPATH variant of the standard
procedure::

    mkdir build && cd build
    ../configure
    make
    (as root) make install

This will keep all the generated files within the build/ dir, and keep the src/
dir (mostly) free of generated files.

WARNING: Xerox LexC currently has a bug that limits the size of the input
argument to 1064 characters. A VPATH build prefixes every filename with the
relative path from the build dir to the source dir, which lengthens the input
string, so the limit is easy to exceed. The simple example above has been
tested and works, but building from a more distant location fails (this has
also been tested and confirmed). A bug report will be sent to the authors of
the Xerox tools.
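
To get a rough idea of how close a given build location comes to the 1064-character limit, the length of the prefixed file list can be estimated before building. The sketch below is only an illustration: the function `lexc_arg_length` and the file names are made up, and it assumes the file names are joined with single spaces, as make does when it expands the prerequisite list.

```shell
# Hypothetical helper: estimate the length of the file-list argument that
# lexc would receive when every source name carries the relative path from
# the build dir to the source dir as a prefix.
lexc_arg_length() {
    prefix=$1
    shift
    arg=""
    for f in "$@"; do
        # Join the prefixed names with single spaces.
        arg="${arg:+$arg }${prefix}${f}"
    done
    printf '%s\n' "${#arg}"
}

# Made-up file names: a nearby build dir versus a more distant one.
lexc_arg_length ../src/ root.lexc noun-stems.lexc verb-stems.lexc
lexc_arg_length ../../../../builds/sma/src/ root.lexc noun-stems.lexc verb-stems.lexc
```

Each extra path component is added once per source file, so the total grows with both the distance to the source dir and the number of lexc files.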

For further installation instructions, refer to the file ``INSTALL``, which
contains the standard installation instructions for GNU autoconf based software.

.. vim: set ft=rst:
54 changes: 54 additions & 0 deletions am-shared/filters-include.am
@@ -0,0 +1,54 @@
## Process this file with automake to produce Makefile.in

## Copyright (C) 2011 Samediggi

## This program is free software: you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 3 of the License, or
## (at your option) any later version.

## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.

## You should have received a copy of the GNU General Public License
## along with this program. If not, see <http://www.gnu.org/licenses/>.

####### Source file defs: ########

# need to check way to list build targets like automake
#! @param GT_FILTER_SRCS required, list of all regular expression source files used
EXTRA_DIST=$(GT_FILTER_SRCS)

####### Automake targets: ########

#! @param GT_FILTER_TARGETS required
noinst_DATA=$(GT_FILTER_TARGETS)

####### HFST build rules: ########

.regex.hfst:
	$(PRINTF) \
		"read regex @re\"$<\";\nsave stack $@ \n" > [email protected]
	$(HFST_XFST) -f [email protected]
	-rm -f [email protected]

%.hfst: $(GTCORE)/gtshared/src/filters/%.regex
	$(HFST_REGEXP2FST) $(HFSTFLAGS) -i $< -o $@

####### Xerox build rules: #######

.regex.xfst:
	$(PRINTF) "read regex @re\"$<\";\nsave stack $@\nquit\n" |\
		$(XFST) $(XFSTFLAGS)

%.xfst: $(GTCORE)/gtshared/src/filters/%.regex
	$(PRINTF) "read regex @re\"$<\";\nsave stack $@\nquit\n" |\
		$(XFST) $(XFSTFLAGS)

####### Other targets: ###########
clean-local:
	-rm -f *.hfst *.xfst *.ol

# vim: set ft=automake:
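
The Xerox rules above do not write a script file; they build a three-line xfst script with `$(PRINTF)` and pipe it straight into `$(XFST)`. The shell sketch below prints that script for a given source and target so the expansion is easier to see. The function name `xfst_script` and the file names are hypothetical; `$1` and `$2` stand in for make's `$<` and `$@`.

```shell
# Sketch: print the xfst script that the .regex.xfst rule pipes into $(XFST).
# $1 plays the role of $< (the .regex source), $2 the role of $@ (the target).
xfst_script() {
    printf 'read regex @re"%s";\nsave stack %s\nquit\n' "$1" "$2"
}

# Hypothetical file names, for illustration only.
xfst_script remove-derivation-tags.regex remove-derivation-tags.xfst
```

The script reads the regular expression from the source file, saves the resulting network to the target, and quits so that the tool does not wait for further input.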
60 changes: 60 additions & 0 deletions am-shared/hfst-spellchecker-include.am
@@ -0,0 +1,60 @@
## Process this file with automake to produce Makefile.in

## Copyright (C) 2011 Samediggi

## This program is free software: you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 3 of the License, or
## (at your option) any later version.

## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.

## You should have received a copy of the GNU General Public License
## along with this program. If not, see <http://www.gnu.org/licenses/>.

GT_DICTIONARY_HFST=dictionary.default.hfst

####### Automake targets: ########

GT_ERRMODELS=
if WANT_HFST
#GT_ERRMODELS+=errmodel.edit-distance-1.hfst
GT_ERRMODELS+=errmodel.default.hfst
GT_SPELLING_HFST=speller.zhfst
endif

if WANT_HFST
voikkosharedir=$(libdir)/voikko/2/mor-hfst-$(GTLANG2)/
#! @param GT_VOIKKO optional, set to spell checker automata names if
#! installable
voikkoshare_DATA=$(GT_SPELLING_HFST) voikko-fi_FI.pro
endif

noinst_DATA=$(GT_ERRMODELS)

####### HFST build rules: ########

# Error model building:
%.hfst: %.txt $(top_srcdir)/$(GT_DICTIONARY_HFST)
	$(GTCORE)/scripts/editdist.py -v -s -d 2 -e '@0@' -i $< \
		-a $(top_srcdir)/$(GT_DICTIONARY_HFST) | \
	$(HFST_TXT2FST) $(HFST_FLAGS) -e '@0@' | \
	$(HFST_FST2FST) $(HFST_FLAGS) -f olw -o $@

acceptor.default.hfst: $(top_srcdir)/$(GT_DICTIONARY_HFST)
	$(HFST_FST2FST) $(HFST_FLAGS) -f olw $< -o $@

$(GT_SPELLING_HFST): acceptor.default.hfst \
		$(GT_ERRMODELS) \
		$(GT_OTHER_VOIKKO_FILES) \
		index.xml
	$(ZIP) $(ZIPFLAGS) $@ $^

####### Other targets: ###########
clean-local:
	-rm -f *.hfst *.xfst

# vim: set ft=automake:
4 changes: 4 additions & 0 deletions am-shared/hfstlanginstall-include.am
@@ -0,0 +1,4 @@
## Process this file with automake to produce Makefile.in


# vim: set ft=automake:
78 changes: 78 additions & 0 deletions am-shared/lexc-include.am
@@ -0,0 +1,78 @@
## Include this file in the lexc directory to build lexical automata

## Copyright (C) 2011 Samediggi

## This program is free software: you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 3 of the License, or
## (at your option) any later version.

## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.

## You should have received a copy of the GNU General Public License
## along with this program. If not, see <http://www.gnu.org/licenses/>.

####### Source file defs: ########

#! @param GT_LEXC_ROOT required, define name of file holding root lexicon
#! @param GT_LEXC_SRCS optional, space separated list of file names holding
#! supplementary lexicon data
#! @param GENERATED_LEXC_SRCS optional, space separated list of file names
#! holding supplementary lexicon data generated from xml files.
#! @param GT_XML_SRCS optional, space separated list of xml source file names
GT_LEXC_ALLSRC=$(GT_LEXC_ROOT) $(GT_LEXC_SRCS) \
$(GENERATED_LEXC_SRCS) $(GT_XML_SRCS)

# All sources need to be included in the tarball
EXTRA_DIST=$(GT_LEXC_ALLSRC)

####### Automake targets: ########

# The transducers we build and don't distribute depend on the configuration:
GT_LEXICAL=
if WANT_HFST
GT_LEXICAL+=$(GTLANG).lexc.hfst
endif
if WANT_XFST
GT_LEXICAL+=$(GTLANG).lexc.xfst
endif
noinst_DATA=$(GT_LEXICAL)

####### XML2LEXC build rules: ####
JV = java
MF = -Xmx1024m
#EF = -it main # Saxon-B compatible version
EF = -it:main # Saxon-HE compatible version
XSLPROC = net.sf.saxon.Transform
XSLDIR = $(GTCORE)/scripts/xsl
curdir := $(shell pwd)

# This target will convert each individual xml file to lexc:
stems/%.lexc: stems/%.xml $(XSLDIR)/generate_lex-fileVM.xsl
	$(JV) $(MF) $(XSLPROC) $(EF) $(XSLDIR)/generate_lex-fileVM.xsl \
		inFile=$(curdir)/$< > $@

####### HFST build rules: ########

# lexical transducer building rules:
# for HFST:
$(GTLANG).lexc.hfst: $(GT_LEXC_ROOT) $(GT_LEXC_SRCS) $(GENERATED_LEXC_SRCS)
	$(HFST_LEXC) -f foma -o [email protected] $^
	$(HFST_FST2FST) -f openfst-tropical -i [email protected] -o $@
	-rm -f [email protected]

####### Xerox build rules: #######

$(GTLANG).lexc.xfst: $(GT_LEXC_ROOT) $(GT_LEXC_SRCS) $(GENERATED_LEXC_SRCS)
	$(PRINTF) "compile-source $^\nsource-to-result\nsave-source $@\nquit\n" |\
		$(LEXC)

####### Other targets: ###########
clean-local:
	-rm -f $(GTLANG).lexc.hfst $(GTLANG).lexc.xfst

maintainer-clean-local:
	-rm -f $(GENERATED_LEXC_SRCS)