Creating a Text File

Aidan Sawyer edited this page Jun 3, 2017 · 10 revisions

Steps

  1. Create a new, empty text file following your naming convention (using the same name as the file you intend to upload, with the .txt extension, is highly recommended)

  2. Add header information according to the agreed-upon standard (with a new line after it for convention/readability)

  3. Construct Body

  4. Add comments

    • If you have comments about the item (something out of the ordinary, date changes, explained typos/corrections, etc.), begin the line with two forward slashes (//).
    • Type out the comment. Whatever follows these characters will be added to the dc.description.note field.
    • Each separate line will be added as a distinct comment.
  5. Add Contributors [a]

    • Mark the beginning of the contributors section by adding -CONTRIBUTORS-. The parser looks for the - character, then for 'contribut' or 'collaborator' in the text following it.
    • Set the type of contributor by writing out the job title in all caps (e.g. “EXECUTIVE EDITOR”, “ILLUSTRATORS”). Don’t worry about pluralization.
    • Enter (or copy/paste) the full name (in the form: “firstName middleName/s lastName”) of all the associated contributors within that class.
  6. Add Articles [b]

    • Mark the beginning of the articles section by adding -ARTICLES-. The parser treats the - character as a signal to change the adding mode, then looks for the text 'article' to select the new mode.
    • Enter the full title of each article, one title per line.
  7. Add Subject/s [c]

    • Mark the beginning of the subject section with -SUBJECT-.
    • As with contributors, optionally add an inner classification line naming the encoding scheme (e.g. 'LCSH', 'KEYWORD' or 'TAG', 'UDC', etc.)
    • Add the subjects themselves, following the rules of the selected classification.
  8. Quality check

    • Check for any unexpected characters introduced by the OCR [d,e]
    • Check for common typos or naming errors [f]
    • Check the header and body against the standard set in the config.
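
The comment and section-marker rules in steps 4 to 7 can be sketched in a few lines of Python. This is only an illustration of the rules as written on this page, not the project's actual parser: the function name `classify_line` and the returned labels are assumptions made for the example.

```python
def classify_line(line):
    """Classify one line of a cataloguing text file.

    Returns a (kind, value) tuple. The kinds and section names are
    hypothetical labels chosen for this sketch.
    """
    text = line.strip()
    if not text:
        return ("blank", "")
    # Step 4: anything after the two slashes is the comment text.
    if text.startswith("//"):
        return ("comment", text[2:].strip())
    # Steps 5 to 7: a leading '-' signals a section change; the
    # keyword inside the marker decides which section begins.
    if text.startswith("-"):
        label = text.strip("-").strip().lower()
        if "contribut" in label or "collaborator" in label:
            return ("section", "contributors")
        if "article" in label:
            return ("section", "articles")
        if "subject" in label:
            return ("section", "subjects")
    # Everything else belongs to whichever section is currently open.
    return ("content", text)


# Each comment line becomes its own note, as described in step 4.
sample = ["// date on cover corrected", "-ARTICLES-", "Article 1 Name"]
notes = [value for kind, value in map(classify_line, sample) if kind == "comment"]
print(notes)
```

Matching on a keyword inside the marker, rather than on the exact marker text, is what lets variants such as -CONTRIBUTOR- or -COLLABORATORS- open the same section.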

Tips and Tricks

  • Edit the documents in a text editor that has spelling suggestions to reduce typos and spelling errors
  • Create new text files from completed old ones:
    • reduce duplication (sections of the file, names, recurring columns, etc.)
  • Name, Combine, and OCR the finalized .pdf at the same time you're cataloging it
    • you're already combining the pages, checking the OCR, and flipping through pages anyway
    • write the filename of the .pdf in the text file first and just copy it over
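
If you follow the recommendation in step 1 (same name as the upload, with a .txt extension), the text file's name can be derived from the .pdf name rather than typed twice. A minimal sketch, where the helper name `text_file_for` and the sample path are made up for illustration:

```python
from pathlib import Path


def text_file_for(upload_path):
    """Return the matching .txt path for a file you intend to upload."""
    # with_suffix swaps only the final extension and keeps the folder.
    return Path(upload_path).with_suffix(".txt")


# Hypothetical file name, used only to show the result.
print(text_file_for("issues/1971-03-12.pdf"))
```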

Notes

Uploading Original Files

If you'd like to upload the original files as well as cataloguing them, include each filename with its tag and make sure the files are in the same folder as the digital item you are uploading.

Collection/Grouping Determination

How you batch the uploads is entirely up to you. For our purposes, I plan to batch by calendar year (~15-25 issues), but organizing by volume would work too.


[a] Well-formed 'Contributors' segment

-CONTRIBUTORS-
NEWS EDITOR
Edward R. Murrow
NEWS WRITERS
James Agee
Christiane Amanpour
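
To show how a segment like this could be read, here is a rough sketch that groups names under the job title above them. It assumes that any line written entirely in capitals is a job title, which follows the convention in step 5 but is not necessarily how the real parser decides. The function name `parse_contributors` is hypothetical.

```python
def parse_contributors(lines):
    """Group contributor names under the all-caps job title above them."""
    roles = {}
    current = None
    for raw in lines:
        text = raw.strip()
        # Skip blank lines and the -CONTRIBUTORS- marker itself.
        if not text or text.startswith("-"):
            continue
        if text.isupper():
            # Assumption: an all-caps line starts a new job title.
            current = text
            roles.setdefault(current, [])
        elif current is not None:
            roles[current].append(text)
    return roles


segment = """-CONTRIBUTORS-
NEWS EDITOR
Edward R. Murrow
NEWS WRITERS
James Agee
Christiane Amanpour"""
print(parse_contributors(segment.splitlines()))
```

A name typed entirely in capitals would be mistaken for a job title under this rule, which is one more reason to keep names in normal case.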

[b] Well-formed 'Articles' segment

-ARTICLES-
Article 1 Name
This is the Second Article
The Third Article is One Which has a Very Long Title: And a Subtitle to Boot

[c] Well-formed 'Subjects' segment

-SUBJECT-
KEYWORDS
Open Source Software
foss
library automation
LCSH
Z 678.892 Cf.Z699
FAST
Library science--Automation
997921

[d] Unexpected Characters (OCR Errors)

-CONTRIBUTORS-
P3RF0RME~R
Marlc Wahtborg

[e] Common OCR Errors

Seen in OCR output    May actually be
rn, m                 m, rn
lc                    k
c, e, o               c, e, o
l, 1, I               l, 1, I
B, R                  R, B
O, 0                  0, O
...                   ...
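
Part of this check can be automated for the contributors section, where job titles and names should contain only letters and a little punctuation. The sketch below flags digits and stray symbols such as those in P3RF0RME~R. The allowed character set and both function names are assumptions, and letter-for-letter confusions (lc for k, rn for m) will pass unnoticed, so a read-through is still needed.

```python
def suspicious_chars(text):
    """Return characters that are unlikely in a job title or personal name."""
    # Assumption: letters, spaces, periods, apostrophes and hyphens are fine.
    return [ch for ch in text if not (ch.isalpha() or ch in " .'-")]


def flag_ocr_debris(lines):
    """Return (line number, text, odd characters) for each doubtful line."""
    report = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        # Section markers and comments are not checked.
        if text.startswith("-") or text.startswith("//"):
            continue
        bad = suspicious_chars(text)
        if bad:
            report.append((number, text, bad))
    return report


print(flag_ocr_debris(["-CONTRIBUTORS-", "P3RF0RME~R", "Marlc Wahtborg"]))
```

Article titles can legitimately contain digits and colons, so this check is only meant for contributor lines.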

[f] Spelling Mistakes/Typos

-CONTRIBUTORS-
PERFORMER
Marc Wallbergg
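
One way to catch near-miss names is to compare each typed name against names already entered in completed files, using `difflib` from the Python standard library. This is a sketch only: the helper name `suggest_name`, the 0.8 cutoff, and the misspelled sample are assumptions, and the known names are taken from example [a].

```python
import difflib


def suggest_name(typed, known_names, cutoff=0.8):
    """Return the closest known name if `typed` looks like a near miss.

    Returns None when the name matches exactly or nothing is close.
    """
    if typed in known_names:
        return None
    matches = difflib.get_close_matches(typed, known_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None


known = ["Edward R. Murrow", "James Agee", "Christiane Amanpour"]
# A hypothetical misspelling with one letter dropped.
print(suggest_name("Cristiane Amanpour", known))
```

A suggestion is only a prompt to look at the source again; a new contributor with a similar name would be flagged in the same way.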