@trond: Important question: where do these data sets (fin_* and smn_* files) stem from?
==> fin_smn from fin_smn dict (incoming), smn_fin from smn_fin dict (incoming), as
==> described in letters from Taarna Valtonen.
==> cf. finsmn/inc/2015/00_readme.txt
==> cf. smnfin/inc/2015/00_readme.txt
Work plan for improving the finsmn*) dictionary.
PROGRAMMER WORK:
Move all entries with SPACE in lemma to MWE_finsmn.xml
======================================================
There are 67 of them; almost all are fixed expressions.
These may simply be moved to MWE_finsmn.xml at once.
==> DONE
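A minimal sketch of how such a move could be scripted, not the script that was actually used.
The entry layout (<r>/<e>/<lg>/<l>) is an assumption about the dictionary XML, and the
translations in the sample are placeholders, not real smn words.

```python
import xml.etree.ElementTree as ET

# Assumed layout: <r><e><lg><l>lemma</l></lg><mg><tg><t>translation</t></tg></mg></e></r>
# The translations "smn-1" and "smn-2" are placeholders.
SAMPLE = """<r>
  <e><lg><l>hyvää päivää</l></lg><mg><tg><t>smn-1</t></tg></mg></e>
  <e><lg><l>talo</l></lg><mg><tg><t>smn-2</t></tg></mg></e>
</r>"""

def split_mwe(root):
    """Return (single, mwe): entries whose lemma contains a space go to mwe."""
    single = ET.Element(root.tag)
    mwe = ET.Element(root.tag)
    for entry in root.findall("e"):
        lemma = (entry.findtext("lg/l") or "").strip()
        target = mwe if " " in lemma else single
        target.append(entry)
    return single, mwe

single, mwe = split_mwe(ET.fromstring(SAMPLE))
print("single-word entries:", [e.findtext("lg/l") for e in single])
print("multiword entries:  ", [e.findtext("lg/l") for e in mwe])
```

In a real run, the two trees would be written back with ET.ElementTree(...).write(), the
multiword tree going to MWE_finsmn.xml.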
Add part of speech to lemma and translation:
============================================
==> DONE
Split the all_finsmn.xml according to fin POS.
==============================================
... after the fin POS tags have been added
==> DONE
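A sketch of the split, assuming the POS is stored as a pos attribute on the <l> element
(an assumption about the XML layout). The output file names printed here are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumed layout: the fin POS sits in <l pos="...">.
SAMPLE = """<r>
  <e><lg><l pos="N">talo</l></lg></e>
  <e><lg><l pos="V">juosta</l></lg></e>
  <e><lg><l pos="N">kala</l></lg></e>
</r>"""

def split_by_pos(root):
    """Group entries into one new root element per POS value."""
    groups = defaultdict(lambda: ET.Element(root.tag))
    for entry in root.findall("e"):
        l = entry.find("lg/l")
        # Entries without a POS are collected separately instead of being dropped.
        pos = l.get("pos", "UNKNOWN") if l is not None else "UNKNOWN"
        groups[pos].append(entry)
    return dict(groups)

groups = split_by_pos(ET.fromstring(SAMPLE))
for pos, sub in sorted(groups.items()):
    # Hypothetical naming scheme for the per-POS files.
    print(f"{pos}_finsmn.xml: {len(sub)} entries")
```

Collecting entries without a POS under UNKNOWN makes it easy to see whether the previous
step (adding POS) was complete.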
Unify the lemma_pos entries to improve the presentation of matches in NDS
=========================================================================
==> DONE
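One possible reading of this step is that entries sharing the same lemma and POS are merged
into a single entry that carries all meaning groups. The sketch below illustrates that
reading only; the element names are assumptions and the translations are placeholders.

```python
import xml.etree.ElementTree as ET

# Assumed layout; "smn-a", "smn-b", "smn-c" are placeholder translations.
SAMPLE = """<r>
  <e><lg><l pos="N">kieli</l></lg><mg><tg><t>smn-a</t></tg></mg></e>
  <e><lg><l pos="N">kieli</l></lg><mg><tg><t>smn-b</t></tg></mg></e>
  <e><lg><l pos="V">juosta</l></lg><mg><tg><t>smn-c</t></tg></mg></e>
</r>"""

def unify(root):
    """Merge entries with identical (lemma, pos) into the first such entry."""
    merged = ET.Element(root.tag)
    first_seen = {}
    for entry in root.findall("e"):
        l = entry.find("lg/l")
        if l is None:
            merged.append(entry)  # keep malformed entries untouched
            continue
        key = ((l.text or "").strip(), l.get("pos"))
        if key in first_seen:
            # Attach this entry's meaning groups to the earlier entry.
            for mg in entry.findall("mg"):
                first_seen[key].append(mg)
        else:
            first_seen[key] = entry
            merged.append(entry)
    return merged

merged = unify(ET.fromstring(SAMPLE))
for e in merged:
    l = e.find("lg/l")
    print(l.text, l.get("pos"), [t.text for t in e.iter("t")])
```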
LINGUIST WORK:
Dictionary translations missing in fst:
=================================
Some of the smn words are wrong but have a correct version in the fst
(they are typos or outside the norm). These should be corrected in the dictionary.
Some of the smn words are simply missing from the fst.
They should be added to the fst.
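To help sort the two cases, a rough triage could look like the sketch below. It assumes the
lemmas accepted by the fst have already been dumped to a set (how to obtain that list is not
covered here), and it uses string similarity as a heuristic of our own choosing; the final
decision stays with the linguist. All words in the sample are placeholders.

```python
import difflib

def classify(translations, fst_lemmas, cutoff=0.75):
    """Split unknown translations into likely typos and plain gaps.

    likely_typos maps a dictionary word to the closest fst lemma;
    missing lists words with no close fst lemma.
    """
    known = sorted(fst_lemmas)
    likely_typos, missing = {}, []
    for word in translations:
        if word in fst_lemmas:
            continue  # the fst already knows this word
        close = difflib.get_close_matches(word, known, n=1, cutoff=cutoff)
        if close:
            likely_typos[word] = close[0]
        else:
            missing.append(word)
    return likely_typos, missing

# Placeholder data, not real smn words.
fst_lemmas = {"abcde", "klmno"}
translations = ["abcde", "abcdf", "vwxyz"]
likely_typos, missing = classify(translations, fst_lemmas)
print("check in dictionary:", likely_typos)
print("candidates for the fst:", missing)
```

Words in the first group are candidates for correction in the dictionary; words in the
second group are candidates for addition to the fst.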
Dictionary lemmas missing in fst:
=================================
Assume Finnish words are written correctly
Add missing Finnish words to langs/fin/src/morphology/stems (not high priority)
----
*)
The original source file is described in
finsmn/inc/2015/00_readme.txt
Later additions have come via Giellatekno / Giellagáldu work with FST and dictionary.