Linter fixes
albbas committed Sep 2, 2024
1 parent c0bdfeb commit d17d4d2
Showing 606 changed files with 16,285 additions and 26,469 deletions.
10 changes: 1 addition & 9 deletions AboutGiellaLT.md
@@ -6,34 +6,26 @@ It is an open source website providing analysers and tools for
[a wide range of languages](LanguageModels.html), as well as
[a ready-made setup](infra/HowToAddANewLanguage.md) for adding more languages.




# The possibility to make computer tools for your language


Computer tools supported by our infrastructure include:


- linguistic analysers (morphology, syntax)
- spell checkers and grammar checkers
- morphologically enabeled e-dictionaries
- machine translation


# ... by using the following linguistic technology

We write our morphologies as [finite state transducers](https://en.wikipedia.org/wiki/Finite_state_transducer)
in the formalisms *lexc*, *twolc* and *xfst rewrite rules*, and compile them into computer programs for language analysis with the compilers [xfst](http://fsmbook.com),
in the formalisms _lexc_, _twolc_ and _xfst rewrite rules_, and compile them into computer programs for language analysis with the compilers [xfst](http://fsmbook.com),
[hfst](http://www.ling.helsinki.fi/kieliteknologia/tutkimus/hfst/) or [foma](https://github.com/mhulden/foma).
Our syntaxes we write in [constraint grammar](https://en.wikipedia.org/wiki/Constraint_grammar),
and we compile our constraint grammars with [vislcg3](http://beta.visl.sdu.dk/cg3.html).
The installation of these compilers is documented on the [Getting Started](infra/GettingStarted.html) page.


# Source code, licensing and cooperation


All our resources, infrastructure and linguistic content alike, are available under dual licenses, CC-by-SA and GPL. You may thus take whatever resource you find useful with you and go, as long as you refer to us when you use it.

The linguistic source code is found in the present git repository ([giellalt](https://github.com/giellalt)). In addition to that, we maintain the following git repositories (all on github), mostly with more technical content: [borealium](https://github.com/borealium), [divvun](https://github.com/divvun), [divvungiellatekno](https://github.com/divvungiellatekno), [giellatekno](https://github.com/giellatekno). Another relevant git repository (also on github) is [apertium](https://github.com/apertium).
2 changes: 1 addition & 1 deletion CorpusResources.md
@@ -1,7 +1,7 @@
# Corpus Resources

![Warning](images/Warning.svg)
__*Under construction.*__
**_Under construction._**

This page contains a dynamically built list of all corpus repositories. Private repositories are not listed.

12 changes: 6 additions & 6 deletions DocumentationGuide.md
@@ -3,12 +3,12 @@
The documentation is organised as follows:

- language specific documentation is organised in separate subdomains, links to each language can be found as follows:
- [keyboards and locales](keyboards/KeyboardLayouts.md)
- [morphology, syntax, text processing, proofing tools](LanguageModels.md)
- [speech technology resources](SpeechTechnologyResources.md)
- [keyboards and locales](keyboards/KeyboardLayouts.md)
- [morphology, syntax, text processing, proofing tools](LanguageModels.md)
- [speech technology resources](SpeechTechnologyResources.md)
- general technical & language independent documentation: [this site](/index.md)
- Documentation specific to Divvun, Giellatekno and Tromsø:
- [old site](https://giellalt.uit.no)
- [new site](https://divvungiellatekno.github.io/giellalt.uit.no/) (will be moved to the old site URL when it is fully converted)
- [old site](https://giellalt.uit.no)
- [new site](https://divvungiellatekno.github.io/giellalt.uit.no/) (will be moved to the old site URL when it is fully converted)

Documentation on how to *write* and *publish* documentation [can be found here](infra/docinfra.md).
Documentation on how to _write_ and _publish_ documentation [can be found here](infra/docinfra.md).
3 changes: 1 addition & 2 deletions Games.md
@@ -6,11 +6,10 @@ The languages are grouped according to game.

# Word guessing game

Simple word guessing game in the tradition of [MasterMind](https://en.wikipedia.org/wiki/Mastermind_(board_game)). For more information on the source code, see [this repo](https://github.com/giellalt/template-wordguess-und).
Simple word guessing game in the tradition of [MasterMind](<https://en.wikipedia.org/wiki/Mastermind_(board_game)>). For more information on the source code, see [this repo](https://github.com/giellalt/template-wordguess-und).

<div id="wordguess" ></div>


<script src="/assets/js/langtable.js"></script>
<script>
const domWordGames = document.querySelector('#wordguess');
2 changes: 1 addition & 1 deletion KeyboardLayouts.md
@@ -2,7 +2,7 @@

Beware that the documentation pages for most Experimental repos have little or no content, and that documentation for other keyboards probably is out-of-date. Writing documentation is an ongoing effort, and part of the development process. Automatically generated SVG layouts is presently not working.

The languages are grouped in three different ways, according to *maturity, geography* and *language family*. [Private repositories](https://github.com/divvun/private-registry) are not listed.
The languages are grouped in three different ways, according to _maturity, geography_ and _language family_. [Private repositories](https://github.com/divvun/private-registry) are not listed.

# Grouped according to maturity of the keyboards

4 changes: 2 additions & 2 deletions LanguageModels.md
@@ -2,11 +2,11 @@

Beware that the documentation pages for most Experimental repos have little or no content, and that documentation for other languages probably is out-of-date. Writing documentation for each language repository is an ongoing effort, and part of the development process.

The languages are grouped in three different ways, according to *maturity, geography* and *language family*. [Private repositories](https://github.com/divvun/private-registry) are not listed.
The languages are grouped in three different ways, according to _maturity, geography_ and _language family_. [Private repositories](https://github.com/divvun/private-registry) are not listed.

# Grouped according to maturity of the resources

The [maturity levels](MaturityClassification.md) are *production, beta, alpha* and *experimental*. Some of the beta language models are used in practical applications.
The [maturity levels](MaturityClassification.md) are _production, beta, alpha_ and _experimental_. Some of the beta language models are used in practical applications.

Being in the **Production** group does not necessarily mean a language model is in production for all purposes, it could be for one only. See the documentation for each language for further details.

163 changes: 87 additions & 76 deletions MaturityClassification.md
@@ -1,24 +1,21 @@
# Language resource maturity classification

This page *presents* and *defines* the maturity classification system of this site. At the bottom of the page comes a description of how to add and change maturity tags.

This page _presents_ and _defines_ the maturity classification system of this site. At the bottom of the page comes a description of how to add and change maturity tags.

# Maturity classes

In the GielllaLT infrastructure we use a five-step classification to broadly describe the quality and development level of various linguistic resources. These categories are used as labels in README files, on the documentation front page for each resource, as well as in the overview pages for [language models](LanguageModels.md), [dictionaries](https://giellalt.github.io/dicts/DictionarySources.html), [keyboards](KeyboardLayouts.md) and [spell checkers](proof/index.md) (the maturity level of grammar checkers, machine translation applications and speech technology are still undefined). The labels look like the following:

| No. | Label | Type | Colour |
| --- |:----- |:---- | ------ |
| 1.| ![Maturity: Production](https://img.shields.io/badge/Maturity-Production-brightgreen.svg)| Production | green |
| 2.| ![Maturity: Beta](https://img.shields.io/badge/Maturity-Beta-yellow.svg) | Beta | yellow |
| 3.| ![Maturity: Alpha](https://img.shields.io/badge/Maturity-Alpha-red.svg) | Alpha | red |
| 4.| ![Maturity: Experiment](https://img.shields.io/badge/Maturity-Experiment-black.svg) | Experiment / student exercise | black |
| 5.| ![Maturity: Undefined](https://img.shields.io/badge/Maturity-Undefined-lightgrey.svg) | Undefined | grey |

| No. | Label | Type | Colour |
| --- | :---------------------------------------------------------------------------------------- | :---------------------------- | ------ |
| 1. | ![Maturity: Production](https://img.shields.io/badge/Maturity-Production-brightgreen.svg) | Production | green |
| 2. | ![Maturity: Beta](https://img.shields.io/badge/Maturity-Beta-yellow.svg) | Beta | yellow |
| 3. | ![Maturity: Alpha](https://img.shields.io/badge/Maturity-Alpha-red.svg) | Alpha | red |
| 4. | ![Maturity: Experiment](https://img.shields.io/badge/Maturity-Experiment-black.svg) | Experiment / student exercise | black |
| 5. | ![Maturity: Undefined](https://img.shields.io/badge/Maturity-Undefined-lightgrey.svg) | Undefined | grey |

# Maturity class definitions (in reverse order)


Some of the criterias for the various levels are common for all resource pages and listed under **General criteria**. Other criteria are application specific:

## Undefined ![Maturity: Undefined](https://img.shields.io/badge/Maturity-Undefined-lightgrey.svg)
@@ -31,121 +28,135 @@ This category also covers student exercises (published with permission). The poi

### General criteria

* license not required, but is nice
* may not build at all
* Divvun Manager:
* might not be available
* if available: only available in the nightly channel
* rule of thumb: may not work at all
- license not required, but is nice
- may not build at all
- Divvun Manager:
- might not be available
- if available: only available in the nightly channel
- rule of thumb: may not work at all

### Application specific criteria

#### Language model
* fragmentary grammar
* less than 1k lexical entries

- fragmentary grammar
- less than 1k lexical entries

#### Dictionary
* less than 1k lexical entries

- less than 1k lexical entries

#### Keyboard
* all letters may not be included
* layout experimental, will change

- all letters may not be included
- layout experimental, will change

#### Spell checker
* see language model above
* no adaptation of error model
* no weighting corpus

- see language model above
- no adaptation of error model
- no weighting corpus

## Alpha ![Maturity: Production](https://img.shields.io/badge/Maturity-Alpha-red.svg)

### General criteria

* license highly recommended
* Divvun Manager:
* is available
* only available in the nightly channel
* rule of thumb: it can be built locally and used for something
- license highly recommended
- Divvun Manager:
- is available
- only available in the nightly channel
- rule of thumb: it can be built locally and used for something

### Application specific criteria

#### Language model
* grammar model mostly complete
* lexicon between 1k and 10k entries

- grammar model mostly complete
- lexicon between 1k and 10k entries

#### Dictionary
* entries from different parts of speech
* lexicon between 1k and 10k entries

- entries from different parts of speech
- lexicon between 1k and 10k entries

#### Keyboard
* layout mostly done, may still change
* all letters in alphabet included

- layout mostly done, may still change
- all letters in alphabet included

#### Spell checker
* Program works, corrects text, and is of some use

- Program works, corrects text, and is of some use

## Beta ![Maturity: Production](https://img.shields.io/badge/Maturity-Beta-yellow.svg)

### General criteria
* there **should** be a proper license
* CI/CD working for the tools being provided
* Divvun Manager:
* is available
* is available in the stable channel
* **NOT** visible on the front page, only via the `All languages` view
* rule of thumb: it can easily be installed via Divvun Manager - it must be testable by the user community

- there **should** be a proper license
- CI/CD working for the tools being provided
- Divvun Manager:
- is available
- is available in the stable channel
- **NOT** visible on the front page, only via the `All languages` view
- rule of thumb: it can easily be installed via Divvun Manager - it must be testable by the user community

### Application specific criteria

#### Language model
* grammar model complete
* lexicon has more than 10k entries
* running text coverage above 80 %

- grammar model complete
- lexicon has more than 10k entries
- running text coverage above 80 %

#### Dictionary
* different parts of speech treated differently
* lexicon has more than 10k entries

- different parts of speech treated differently
- lexicon has more than 10k entries

#### Keyboard
* layout complete for all levels and input methods

- layout complete for all levels and input methods

#### Spell checker
* The number of false positives is below 20 %
* Correction mechanism gives relevant connection in top-5 in most cases


- The number of false positives is below 20 %
- Correction mechanism gives relevant connection in top-5 in most cases

## Production ![Maturity: Production](https://img.shields.io/badge/Maturity-Production-rightgreen.svg)

### General criteria
* there **must** be a proper license
* at least one contact person in the language community that is willing to or being payed to be a first line support person and language resource maintainer, public contact email or other contact info
* CI/CD working for the tools being provided
* Divvun Manager:
* is available
* is available in the stable channel
* **IS** visible on the front page
* Release `1.0.0` or higher of either speller or analyser/`giella-XXX` package
* rule of thumb: it is easily installable via the One-click installer or Divvun Manager front page

- there **must** be a proper license
- at least one contact person in the language community that is willing to or being payed to be a first line support person and language resource maintainer, public contact email or other contact info
- CI/CD working for the tools being provided
- Divvun Manager:
- is available
- is available in the stable channel
- **IS** visible on the front page
- Release `1.0.0` or higher of either speller or analyser/`giella-XXX` package
- rule of thumb: it is easily installable via the One-click installer or Divvun Manager front page

### Application specific criteria

#### Language model

* grammar/model/layout complete
* lexicon has more than 30k entries (but subject to realworld realities & limits)
* running text coverage above 90 %
- grammar/model/layout complete
- lexicon has more than 30k entries (but subject to realworld realities & limits)
- running text coverage above 90 %

#### Dictionary
* lexicon has more than 20k entries
* lemma articles are structured according to lemma type

#### Keyboard
* layout complete and evaluated for all levels and input methods
- lexicon has more than 20k entries
- lemma articles are structured according to lemma type

#### Spell checker
* The number of false positives is below 5 %
* Correction mechanism gives relevant connection in top-5 in almost all cases, in top position in most cases
#### Keyboard

- layout complete and evaluated for all levels and input methods

#### Spell checker

- The number of false positives is below 5 %
- Correction mechanism gives relevant connection in top-5 in almost all cases, in top position in most cases

# Registering maturity

@@ -159,10 +170,10 @@ Adding maturity tags is done via [GitHub topics](https://docs.github.com/en/gith

The topic tags corresponding to the labels above are as follows:

* `maturity-prod` - ![Maturity: Production](https://img.shields.io/badge/Maturity-Production-brightgreen.svg)
* `maturity-beta` - ![Maturity: Beta ](https://img.shields.io/badge/Maturity-Beta-yellow.svg)
* `maturity-alpha` - ![Maturity: Alpha ](https://img.shields.io/badge/Maturity-Alpha-red.svg)
* `maturity-exper` - ![Maturity: Experiment](https://img.shields.io/badge/Maturity-Experiment-black.svg)
- `maturity-prod` - ![Maturity: Production](https://img.shields.io/badge/Maturity-Production-brightgreen.svg)
- `maturity-beta` - ![Maturity: Beta ](https://img.shields.io/badge/Maturity-Beta-yellow.svg)
- `maturity-alpha` - ![Maturity: Alpha ](https://img.shields.io/badge/Maturity-Alpha-red.svg)
- `maturity-exper` - ![Maturity: Experiment](https://img.shields.io/badge/Maturity-Experiment-black.svg)

The ![Maturity: Undefined ](https://img.shields.io/badge/Maturity-Undefined-lightgrey.svg) category does of course not have a topic - that is the definition of the category. In the lists and tables linked to above it should ideally be empty, but it is listed in any case to easily spot repositories that do not yet have a defined maturity class.

16 changes: 8 additions & 8 deletions Personvern.md
@@ -18,23 +18,23 @@ If data is publicly available, including data on linguistic errors, and discussions about standardisation -

Moreover, there are always errors in our code, as in all other software. We naturally strive to have as few errors as possible, but the code is too complex to be entirely error-free. And in some cases it is unclear what is right and wrong, or the standard is undefined.


# Guidelines

# Practical tips

About GitHub:

- all personal accounts are public, but you can keep the information there minimal, with nothing that identifies you.

- create a GitHub username that cannot be linked to you
- it is possible [to change the username of an existing account](https://github.com/settings/admin) - "Change username" (but see the page on possible negative consequences)
- it is possible [to change the username of an existing account](https://github.com/settings/admin) - "Change username" (but see the page on possible negative consequences)
-[the profile page](https://github.com/settings/profile):
- do not use a profile picture that can be linked to you
- do not specify where you live
- no name
- no URL
- no employer
- do not have a visible email address - "Public email" (leave the field unspecified)
- do not use a profile picture that can be linked to you
- do not specify where you live
- no name
- no URL
- no employer
- do not have a visible email address - "Public email" (leave the field unspecified)
- [keep your email address private](https://github.com/settings/emails) - "Keep my email addresses private"
- [do not show that you are a member of an "organisation"](https://docs.github.com/en/free-pro-team@latest/github/setting-up-and-managing-your-github-user-account/publicizing-or-hiding-organization-membership), e.g. giellalt or divvun
(GitHub organisations are like an umbrella over all the repositories that belong together)