
TypeScript utilities for working with ATF, a text format describing cuneiform tablet contents

GhentCDH/Cune-iiif-orm-ATF-utils

atf-cuniform-utilities

This repository contains TypeScript functions that help when working with ATF, a semi-standardized text markup format used by the Cuneiform Digital Library Initiative (CDLI) to transcribe the contents of cuneiform tablets.

More specifically, it contains a tokenizer that splits ATF contents into separate characters. For similar projects, see https://github.com/cdli-gh

https://cdli.mpiwg-berlin.mpg.de/search?f[provenience][]=Sippar-Amnanum%20(mod.%20Tell%20ed-Der)

Screenshot of the atf viewer

ATF files

The Cuneiform Digital Library Initiative is a great resource and hosts a collection of ATF files. To export the ATF data into a single file, run the following using the CDLI API client:

npm install -g https://github.com/cdli-gh/framework-api-client
npx cdli export \
  --host https://cdli.mpiwg-berlin.mpg.de/ \
  --entities inscriptions \
  --format atf \
  --output-file artifacts.atf
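Once exported, the single file has to be cut back into individual texts before tokenizing. The following is a minimal sketch, not part of this repository: it assumes that each text in the export starts with a header line beginning with `&` (for example `&P000001 = ...`), as is conventional in CDLI ATF. The names `AtfArtifact` and `splitArtifacts` are hypothetical, and the sample data in the usage note is made up.

```typescript
// Hypothetical record type for one text in the exported file.
interface AtfArtifact {
  id: string;      // identifier taken from the "&" header line, e.g. "P000001"
  header: string;  // the full header line
  lines: string[]; // all non-empty lines that follow the header
}

// Split the content of an exported ATF file into one record per text.
// Assumption: every text starts with a line beginning with "&".
function splitArtifacts(content: string): AtfArtifact[] {
  const artifacts: AtfArtifact[] = [];
  let current: AtfArtifact | undefined;

  for (const rawLine of content.split(/\r?\n/)) {
    const line = rawLine.trimEnd();
    if (line.startsWith("&")) {
      // "&P000001 = Some designation" -> "P000001"
      const id = line.slice(1).split(/\s|=/)[0] ?? "";
      current = { id, header: line, lines: [] };
      artifacts.push(current);
    } else if (current && line.length > 0) {
      current.lines.push(line);
    }
  }
  return artifacts;
}
```

For instance, calling `splitArtifacts` on a string holding two invented texts headed `&P000001 = Example A` and `&P000002 = Example B` yields two records with the ids `P000001` and `P000002`. Lines that appear before the first header are dropped.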

Recommended IDE Setup

VSCode + Volar (and disable Vetur).

Type Support for .vue Imports in TS

TypeScript cannot handle type information for .vue imports by default, so we replace the tsc CLI with vue-tsc for type checking. In editors, we need Volar to make the TypeScript language service aware of .vue types.

Customize configuration

See Vite Configuration Reference.

Project Setup

npm install

Compile and Hot-Reload for Development

npm run dev

Type-Check, Compile and Minify for Production

npm run build

Run Unit Tests with Vitest

npm run test:unit

Questions

If a line is given but not all signs on the line are annotated, what are the indices of the signs? Are gaps possible? E.g. signs 1, 2, 4 and 5 are annotated but sign 3 is not.

/ = we don't know which of the following signs to read, but it should be one of these. How are these indexed? Is this a single index or more than one index? Can three signs share the same index?

Examples of special ATF instances

  1. Example of a character divider: in na-bi-{d}EN.ZU, the combination -{ counts as a single character divider.
  2. Example of compound verbs: PA3(|IGI.RU|) or E3(|UD.DU|); note the pipes within the parentheses. We have to annotate both because the annotations need a target and we still need to refer to the correct reading of a sign.
  3. Example of missing or superfluous signs: <x> marks sign(s) we think are missing and <<x>> marks sign(s) we think were wrongly included. Missing signs are not annotated but typically still have an index, whereas superfluous signs are annotated and also have an index.
  4. Example of word mixing: the rule is that hyphens split syllables, or words within proper nouns, whereas dots split the different signs that make up a word in logographic writing. Syllables are written in lower case, logograms in upper case. One typical source of confusion is that, in proper nouns, two words in logographic writing can be split by a hyphen if each represents a separate word, e.g. {d}EN.ZU for the god Sîn, but {d}EN.ZU-ZI for the personal name Sîn-napišti (ZI = napišti).
  5. Example of ambiguity for upper case: most signs written in upper case are logograms, but some are signs with an uncertain reading, i.e. we can see which sign it is, but we don't know how to understand it.
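The divider and case rules above can be illustrated with a small sketch. It is not the tokenizer shipped in this repository; the names `SignKind`, `SignToken` and `tokenizeWord` are hypothetical. It treats `-`, `.` and the braces of a determinative as dividers, so that `-{` acts as a single divider, and it classifies signs by letter case only. Compound signs such as PA3(|IGI.RU|) are deliberately not handled: the dot inside the pipes would be split like any other dot.

```typescript
// "uppercase" covers both logograms and signs with an uncertain reading,
// since the two cannot be told apart by case alone.
type SignKind = "determinative" | "uppercase" | "syllable" | "other";

interface SignToken {
  text: string;
  kind: SignKind;
}

// Split a single ATF word into signs.
// Assumption: the word contains no compound signs (|...|) and no flags.
function tokenizeWord(word: string): SignToken[] {
  // Either a determinative in braces, or a run of non-divider characters.
  const pattern = /\{([^}]*)\}|([^-.{}]+)/g;
  const tokens: SignToken[] = [];
  let match: RegExpExecArray | null;

  while ((match = pattern.exec(word)) !== null) {
    if (match[1] !== undefined) {
      tokens.push({ text: match[1], kind: "determinative" });
      continue;
    }
    const text = match[2] ?? "";
    const hasUpper = text !== text.toLowerCase();
    const hasLower = text !== text.toUpperCase();
    const kind: SignKind =
      hasUpper && !hasLower ? "uppercase" :
      hasLower && !hasUpper ? "syllable" :
      "other"; // digits only, or mixed case
    tokens.push({ text, kind });
  }
  return tokens;
}
```

With this sketch, `tokenizeWord("na-bi-{d}EN.ZU")` yields the five signs na, bi, d, EN and ZU, where d is marked as a determinative and EN and ZU as upper case.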

Important ATF flags

  • # = partial breakage. Consecutive signs that are each followed by # are rendered in classical publications with a single pair of upper-half square brackets around the whole run, e.g. ITI NE#.NE#.GAR# U4.5(disz).KAM is rendered as ITI ⸢NE.NE.GAR⸣ U4.5(disz).KAM. These signs are usually annotated as well, but sometimes they are not, depending on how bad the breakage is.
  • ! = corrected reading, often followed by parentheses containing the original, wrong reading, e.g. na!(u4): the value we read is 'na' but on the tablet we see 'u4'.
  • ? = uncertain identification of a sign
  • [] = completely broken-off section. It can contain signs, e.g. [IGI {d}]EN.ZU-ZI DUB.SAR; the signs within square brackets are never annotated. It can also be just [...] to indicate that the break contains content we cannot estimate, and sometimes it contains a number of x's indicating how many signs we assume were there, e.g. [IGI x-x-x-x] DUB.SAR.
  • <> = signs that should be added, see above
  • <<>> = signs that should be removed, see above
  • / = we don't know which of the following signs to read, but it should be one of these.
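As an illustration of the #, ! and ? flags, here is a minimal sketch under simplifying assumptions; it is not this repository's implementation. The names `FlaggedSign`, `parseSignFlags` and `renderPartialBreakage` are hypothetical. It assumes flags directly follow the sign value and that parentheses after ! hold the original reading, while parentheses without ! (as in 5(disz)) belong to the sign itself. The brackets [], <>, <<>> and the / alternative are not covered.

```typescript
interface FlaggedSign {
  value: string;      // sign value without flags
  damaged: boolean;   // "#" partial breakage
  corrected: boolean; // "!" corrected reading
  uncertain: boolean; // "?" uncertain identification
  original?: string;  // reading on the tablet, only set for corrections
}

// Separate a single sign token from its trailing flags.
function parseSignFlags(token: string): FlaggedSign {
  const match = /^(.+?)([#!?]*)(?:\(([^)]*)\))?$/.exec(token);
  if (!match) {
    return { value: token, damaged: false, corrected: false, uncertain: false };
  }
  const base = match[1] ?? "";
  const flags = match[2] ?? "";
  const inParens: string | undefined = match[3];
  const corrected = flags.includes("!");

  // Without "!", a parenthesised part such as "(disz)" stays part of the value.
  const value = inParens !== undefined && !corrected ? `${base}(${inParens})` : base;
  const sign: FlaggedSign = {
    value,
    damaged: flags.includes("#"),
    corrected,
    uncertain: flags.includes("?"),
  };
  if (corrected && inParens !== undefined) {
    sign.original = inParens;
  }
  return sign;
}

// Wrap each run of consecutive "#"-flagged signs in upper-half brackets.
function renderPartialBreakage(line: string): string {
  // One flagged sign, followed by any number of "divider + flagged sign".
  const damagedRun = /[^\s.\-#]+#(?:[.\-][^\s.\-#]+#)*/g;
  return line.replace(damagedRun, (run) => `⸢${run.replace(/#/g, "")}⸣`);
}
```

Applied to the examples from the list, `parseSignFlags("na!(u4)")` gives the value na with u4 as the original reading, and `renderPartialBreakage("ITI NE#.NE#.GAR# U4.5(disz).KAM")` gives ITI ⸢NE.NE.GAR⸣ U4.5(disz).KAM.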

Credits

Development by GhentCDH - Ghent University.

Funded by the GhentCDH research projects.
