Commit 37887a5

Committed Jul 31, 2024
Draft v8.0 replacement structure for personal names.
This draft was created based on conversation with members of the names working group in fisharebest/gedcom-name#27 and #473. The text is mostly new, though, and may have failed to capture some elements of those conversations. A separate v7.1 draft is anticipated once conversation on this draft stabilizes.
1 parent f461257 commit 37887a5

4 files changed

+168 -83 lines changed
 

specification/gedcom-2-data-types.md

-18
Original file line number | Diff line number | Diff line change
@@ -271,24 +271,6 @@ The URI for the `List:Text` data type is `g7:type-List#Text`.
271271

272272
The URI for the `List:Enum` data type is `g7:type-List#Enum`.
273273

274-
275-
## Personal Name
276-
277-
A personal name is mostly free-text. It should be the name as written in the culture of the individual and should not contain line breaks, repeated spaces, or characters not part of the written form of a name (except for U+002F as explained below).
278-
279-
```abnf
280-
PersonalName = nameStr
281-
/ [nameStr] "/" [nameStr] "/" [nameStr]
282-
283-
nameChar = %x20-2E / %x30-10FFFF ; any but '/' and '\t'
284-
nameStr = 1*nameChar
285-
```
286-
287-
The character U+002F (`/`, slash or solidus) has special meaning in a personal name, being used to delimit the portion of the name that most closely matches the concept of a surname, family name, or the like.
288-
This specification does not provide any standard way of representing names that contain U+002F.
289-
290-
The URI for the `PersonalName` data type is `g7:type-Name`.
291-
292274
## Language
293275

294276
The language data type represents a human language or family of related languages, as defined in [BCP 47](https://www.rfc-editor.org/info/bcp47).

specification/gedcom-3-structures-1-organization.md

+67-40
@@ -1048,64 +1048,91 @@ See `SHARED_NOTE_RECORD` for advice on choosing between `NOTE` and `SNOTE`.
10481048

10491049
A `NOTE_STRUCTURE` can contain a `SOURCE_CITATION`, which in turn can contain a `NOTE_STRUCTURE`, allowing potentially unbounded nesting of structures. Because each dataset is finite, this nesting is also guaranteed to be finite.
10501050

1051-
1052-
1053-
#### `PERSONAL_NAME_PIECES` :=
1051+
#### `PERSONAL_NAME_STRUCTURE` :=
10541052

10551053
```gedstruct
1056-
n NPFX <Text> {0:M} g7:NPFX
1057-
n GIVN <Text> {0:M} g7:GIVN
1058-
n NICK <Text> {0:M} g7:NICK
1059-
n SPFX <Text> {0:M} g7:SPFX
1060-
n SURN <Text> {0:M} g7:SURN
1061-
n NSFX <Text> {0:M} g7:NSFX
1054+
n NAME {1:1} g8:INDI-NAME
1055+
+1 TYPE <List:Enum> {0:1} g8:NAME-TYPE
1056+
+2 PHRASE <Text> {0:1} g7:PHRASE
1057+
+1 PART <Text> {0:M} g8:NAME-PART
1058+
+2 TYPE <List:Enum> {1:1} g8:NAME-PART-TYPE
1059+
+2 LANG <Language> {0:1} g7:LANG
1060+
+2 TRAN <Text> {0:M} g8:TRAN
1061+
+3 LANG <Language> {1:1} g7:LANG
1062+
+2 DATE <DateValue> {0:M} g7:DATE
1063+
+2 <<SOURCE_CITATION>> {0:M}
1064+
+1 FORM <Text> {1:M} g8:NAME-FORM
1065+
+2 TYPE <List:Enum> {0:1} g8:NAME-FORM-TYPE
1066+
+2 LANG <Language> {0:1} g7:LANG
1067+
+2 TRAN <Text> {0:M} g8:TRAN
1068+
+3 LANG <Language> {1:1} g7:LANG
1069+
+2 DATE <DateValue> {0:M} g7:DATE
1070+
+2 <<SOURCE_CITATION>> {0:M}
1071+
+1 <<NOTE_STRUCTURE>> {0:M}
10621072
```
10631073

1064-
Optional isolated name parts; see `PERSONAL_NAME_STRUCTURE` for more details.
1074+
A name identifying an individual, which may have multiple forms and be composed of multiple parts.
1075+
Both name forms and name parts are called "names" in some situations, but may be distinguished as follows:
1076+
1077+
- A `g8:NAME-FORM` stores a string used to identify the individual by name; for example "`John Farmer`".
1078+
- A `g8:NAME-PART` stores a distinct component of a name; for example, "`John`".
1079+
- A `g8:INDI-NAME` stores all the variants and parts of an individual's name that are considered part of a single name.
10651080

10661081
:::example
1067-
"Lt. Cmndr. Joseph Allen jr.” might be presented as
1082+
Leonardo da Vinci might have a name structure like this:
10681083

10691084
```gedcom
1070-
1 NAME Lt. Cmndr. Joseph /Allen/ jr.
1071-
2 NPFX Lt. Cmndr.
1072-
2 GIVN Joseph
1073-
2 SURN Allen
1074-
2 NSFX jr.
1085+
1 NAME
1086+
2 FORM Leonardo da Vinci
1087+
2 FORM Leonardo di ser Piero da Vinci
1088+
2 PART Leonardo
1089+
3 TYPE GIVN
1090+
2 PART di ser Piero
1091+
3 TYPE PATRONYMIC
1092+
2 PART da Vinci
1093+
3 TYPE LOCATION
1094+
2 PART da
1095+
3 TYPE PARTICLE
1096+
2 PART Vinci
1097+
3 TYPE LOCATION
10751098
```
1099+
1100+
There are other ways this could be encoded; how many parts and forms to add is up to the user.
10761101
:::
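
The following is a minimal sketch, not part of the draft, of how an application might read the example above into forms and typed parts. It assumes simple `LEVEL TAG [PAYLOAD]` lines with no cross-reference identifiers or `CONT` lines, and that `List:Enum` payloads are comma-separated. The names `Node`, `parse_lines`, and `split_name` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    level: int
    tag: str
    payload: str = ""
    children: List["Node"] = field(default_factory=list)


def parse_lines(text: str) -> List[Node]:
    """Build a tree from 'LEVEL TAG [PAYLOAD]' lines (assumed simplified syntax)."""
    roots: List[Node] = []
    stack: List[Node] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        level_str, _, rest = line.partition(" ")
        tag, _, payload = rest.partition(" ")
        node = Node(int(level_str), tag, payload)
        # Pop until the top of the stack is this node's parent.
        while stack and stack[-1].level >= node.level:
            stack.pop()
        (stack[-1].children if stack else roots).append(node)
        stack.append(node)
    return roots


def split_name(name: Node) -> Tuple[List[str], List[Tuple[str, List[str]]]]:
    """Return (form payloads, [(part payload, part types)]) for one NAME node."""
    forms = [c.payload for c in name.children if c.tag == "FORM"]
    parts = []
    for child in name.children:
        if child.tag != "PART":
            continue
        types = [
            value.strip()
            for sub in child.children if sub.tag == "TYPE"
            for value in sub.payload.split(",")
        ]
        parts.append((child.payload, types))
    return forms, parts


SAMPLE = """\
1 NAME
2 FORM Leonardo da Vinci
2 FORM Leonardo di ser Piero da Vinci
2 PART Leonardo
3 TYPE GIVN
2 PART di ser Piero
3 TYPE PATRONYMIC
2 PART da Vinci
3 TYPE LOCATION
2 PART da
3 TYPE PARTICLE
2 PART Vinci
3 TYPE LOCATION
"""

forms, parts = split_name(parse_lines(SAMPLE)[0])
print(forms)
print(parts)
```

Because the draft states that the order of name parts is not significant, a consumer built this way should not infer display order from the `parts` list; only the `forms` carry a rendered ordering.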
10771102

1078-
This specification does not define how the meaning of multiple parts with the same tag differs from the meaning of a single part with a concatenated larger payload.
1079-
However, some applications allow the user to chose whether to combine or split name parts, meaning the tag quantity should be treated as expressing at least a user preference.
1080-
Even when multiple `SURN` tags are used, the `PersonalName` data type identifies a single surname substring between its slashes.
1103+
The decision of whether two name forms count as variants of a single name or as distinct names varies by culture and individual.
10811104

1082-
#### `PERSONAL_NAME_STRUCTURE` :=
1105+
It is common for much of each name form to be identified in a name part,
1106+
but there may be components of a name with no identified name part
1107+
and name parts that do not appear in any name form.
10831108

1084-
```gedstruct
1085-
n NAME <PersonalName> {1:1} g7:INDI-NAME
1086-
+1 TYPE <Enum> {0:1} g7:NAME-TYPE
1087-
+2 PHRASE <Text> {0:1} g7:PHRASE
1088-
+1 <<PERSONAL_NAME_PIECES>> {0:1}
1089-
+1 TRAN <PersonalName> {0:M} g7:NAME-TRAN
1090-
+2 LANG <Language> {1:1} g7:LANG
1091-
+2 <<PERSONAL_NAME_PIECES>> {0:1}
1092-
+1 <<NOTE_STRUCTURE>> {0:M}
1093-
+1 <<SOURCE_CITATION>> {0:M}
1109+
:::example
1110+
The Polish family name `Kowalski` has a feminine variant `Kowalska` and plural variant `Kowalscy`.
1111+
Including all three variants as name parts even though only one appears in any name form may facilitate searching and indexing in some applications.
1112+
1113+
```gedcom
1114+
1 NAME
1115+
2 FORM Alfred Jan Maksymillian Kowalski
1116+
2 PART Kowalski
1117+
3 TYPE SURN
1118+
2 PART Kowalska
1119+
3 TYPE SURN, HIDDEN
1120+
2 PART Kowalscy
1121+
3 TYPE SURN, HIDDEN
10941122
```
1123+
:::
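
As a rough illustration of the indexing use case described in this example, the sketch below collects every `FORM` and `PART` payload of a name as a case-insensitive search key. It is an assumption about how an application might use these parts, not something the draft prescribes; `search_keys` is a hypothetical helper and the `NAME` is assumed to sit at level 1.

```python
from typing import Iterable, Set


def search_keys(name_lines: Iterable[str]) -> Set[str]:
    """Collect case-folded FORM and PART payloads found at level 2."""
    keys: Set[str] = set()
    for raw in name_lines:
        level_str, _, rest = raw.strip().partition(" ")
        tag, _, payload = rest.partition(" ")
        if level_str == "2" and tag in ("FORM", "PART") and payload:
            keys.add(payload.casefold())
    return keys


NAME_LINES = [
    "1 NAME",
    "2 FORM Alfred Jan Maksymillian Kowalski",
    "2 PART Kowalski",
    "3 TYPE SURN",
    "2 PART Kowalska",
    "3 TYPE SURN, HIDDEN",
    "2 PART Kowalscy",
    "3 TYPE SURN, HIDDEN",
]

keys = search_keys(NAME_LINES)
print("kowalska" in keys)
```

With this approach a search for the feminine or plural variant finds the individual even though neither variant appears in the stored name form.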
10951124

1096-
Names of individuals are represented in the manner the name is normally spoken, with the family name, surname, or nearest cultural parallel thereunto separated by slashes (U+002F `/`). Based on the dynamic nature or unknown compositions of naming conventions, it is difficult to provide a more detailed name piece structure to handle every case. The `PERSONAL_NAME_PIECES` are provided optionally for systems that cannot operate effectively with less structured information. The Personal Name payload shall be seen as the primary name representation, with name pieces as optional auxiliary information; in particular it is recommended that all name parts in `PERSONAL_NAME_PIECES` appear within the `PersonalName` payload in some form, possibly adjusted for gender-specific suffixes or the like.
1097-
It is permitted for the payload to contain information not present in any name piece substructure.
1125+
As with other structures, the first `NAME` in an `INDI` provides the most-preferred name
1126+
and its first `FORM` structure provides the most-preferred form of that name.
1127+
It is recommended that the first form of the first name be used to label individuals in a user interface or report when a single name string is desired.
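
The labeling recommendation above could be sketched as follows. This is an illustrative reading only: `preferred_label` is a hypothetical function, the input is assumed to be the already-split lines of one `INDI` record, and returning `None` for a `NAME` without a `FORM` is a choice made here for malformed data (the draft requires at least one `FORM`).

```python
from typing import Iterable, Optional


def preferred_label(indi_lines: Iterable[str]) -> Optional[str]:
    """Return the payload of the first FORM under the first level-1 NAME."""
    in_first_name = False
    for raw in indi_lines:
        level_str, _, rest = raw.strip().partition(" ")
        tag, _, payload = rest.partition(" ")
        level = int(level_str)
        if in_first_name:
            if level <= 1:
                return None  # first NAME ended without any FORM
            if level == 2 and tag == "FORM":
                return payload
        elif level == 1 and tag == "NAME":
            in_first_name = True
    return None


INDI_LINES = [
    "0 @I1@ INDI",
    "1 NAME",
    "2 PART Leonardo",
    "3 TYPE GIVN",
    "2 FORM Leonardo da Vinci",
    "2 FORM Leonardo di ser Piero da Vinci",
    "1 NAME",
    "2 FORM Another Name",
]

label = preferred_label(INDI_LINES)
print(label)
```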
10981128

1099-
The name may be translated or transliterated into different languages or scripts using the `TRAN` substructure.
1100-
It is recommended, but not required, that if the name pieces are used, the same pieces are used in each translation and transliteration.
1129+
The order of name parts is not significant; name parts may be reorganized within a name without any change in meaning.
11011130

1102-
A `TYPE` is used to specify the particular variation that this name is.
1103-
For example; it could indicate that this name is a name taken at immigration or that it could be an ‘also known as’ name.
1104-
See `g7:enumset-NAME-TYPE` for more details.
1131+
The name may be translated or transliterated into different languages or scripts using `TRAN` substructures.
11051132

1106-
:::note
1107-
Alternative approaches to representing names are being considered for future versions of this specification.
1108-
:::
1133+
A `TYPE` is used to specify the particular variation that this name, name part, or name form is.
1134+
For example, it could indicate that this name is a name taken at immigration or that it could be an ‘also known as’ name.
1135+
See `g8:enumset-NAME-TYPE`, `g8:enumset-NAME-PART-TYPE`, and `g8:enumset-NAME-FORM-TYPE` for more details.
11091136

11101137
#### `PLACE_STRUCTURE` :=
11111138

specification/gedcom-3-structures-3-meaning.md

+30-23
@@ -647,6 +647,11 @@ See also `INDIVIDUAL_EVENT_STRUCTURE`.
647647
A reference to an external file.
648648
See the [File Path datatype](#file-path) for more details.
649649
650+
#### `FORM` (Form) `g8:NAME-FORM`
651+
652+
A string representation of a personal name.
653+
See also `PERSONAL_NAME_STRUCTURE`.
654+
650655
#### `FORM` (Format) `g7:FORM`
651656
652657
The [media type](#media-type) of the file referenced by the superstructure.
@@ -934,9 +939,9 @@ If needed, `text/html` can be converted to `text/plain` using the following step
934939
935940
The name of the superstructure's subject, represented as a simple string.
936941
937-
#### `NAME` (Name) `g7:INDI-NAME`
942+
#### `NAME` (Name) `g8:INDI-NAME`
938943
939-
A `PERSONAL_NAME_STRUCTURE` with parts, translations, sources, and so forth.
944+
A `PERSONAL_NAME_STRUCTURE` with parts, forms, translations, sources, and so forth.
940945
941946
#### `NATI` (Nationality) `g7:NATI`
942947
@@ -1039,6 +1044,12 @@ and the `PAGE` may describe the entire source.
10391044
```
10401045
:::
10411046
1047+
#### `PART` (Name Part) `g8:NAME-PART`
1048+
1049+
A portion of a personal name, isolated to facilitate identifying its type.
1050+
See also `PERSONAL_NAME_STRUCTURE`.
1051+
1052+
10421053
#### `PEDI` (Pedigree) `g7:PEDI`
10431054
10441055
An enumerated value from set `g7:enumset-PEDI` indicating the type of child-to-family relationship represented by the superstructure.
@@ -1430,25 +1441,9 @@ Each `TRAN` structure must differ from its superstructure
14301441
and from every other `TRAN` substructure of its superstructure
14311442
in either its language tag or its media type or both.
14321443

1433-
#### `TRAN` (Translation) `g7:NAME-TRAN`
1434-
1435-
A type of `TRAN` substructure specific to [Personal Names](#personal-name).
1436-
Each `NAME`.`TRAN` must have a `LANG` substructure.
1437-
See also `INDI`.`NAME`.
1438-
1439-
:::example
1440-
The following presents a name in Mandarin, transliterated using Pinyin
1444+
#### `TRAN` (Translation) `g8:TRAN`
14411445

1442-
```gedcom
1443-
1 NAME /孔/德庸
1444-
2 GIVN 德庸
1445-
2 SURN 孔
1446-
2 TRAN /Kǒng/ Déyōng
1447-
3 GIVN Déyōng
1448-
3 SURN Kǒng
1449-
3 LANG zh-pinyin
1450-
```
1451-
:::
1446+
A type of `TRAN` substructure for structures with a human-language [Text](#text) payload.
14521447

14531448
#### `TRAN` (Translation) `g7:PLAC-TRAN`
14541449

@@ -1476,7 +1471,7 @@ and English translation
14761471

14771472
#### `TRAN` (Translation) `g7:NOTE-TRAN`
14781473

1479-
A type of `TRAN` for unstructured human-readable text,
1474+
A type of `TRAN` for unstructured human-readable text with a media type,
14801475
such as is found in `NOTE` and `SNOTE` payloads.
14811476
Each `g7:NOTE-TRAN` must have either a `LANG` substructure or a `MIME` substructure or both.
14821477
If either is missing, it is assumed to have the same value as the superstructure.
@@ -1572,9 +1567,21 @@ Other descriptor values might include, for example,
15721567
See also `FACT` and `EVEN` for additional examples.
15731568
:::
15741569

1575-
#### `TYPE` (Type) `g7:NAME-TYPE`
1570+
#### `TYPE` (Type) `g8:NAME-TYPE`
1571+
1572+
A list of enumerated values from set `g8:enumset-NAME-TYPE` indicating the types of the name.
1573+
The order of values in the list is not significant.
1574+
1575+
#### `TYPE` (Type) `g8:NAME-FORM-TYPE`
1576+
1577+
A list of enumerated values from set `g8:enumset-NAME-FORM-TYPE` indicating the types of the name form.
1578+
The order of values in the list is not significant.
1579+
1580+
#### `TYPE` (Type) `g8:NAME-PART-TYPE`
1581+
1582+
A list of enumerated values from set `g8:enumset-NAME-PART-TYPE` indicating the types of the name part.
1583+
The order of values in the list is not significant.
15761584

1577-
An enumerated value from set `g7:enumset-NAME-TYPE` indicating the type of the name.
15781585

15791586
#### `TYPE` (Type) `g7:EXID-TYPE`
15801587

specification/gedcom-3-structures-4-enumerations.md

+71-2
@@ -230,14 +230,83 @@ and applications should be prepared to encounter non-current values.
230230
| `SUBMITTED` | All | Ordinance was previously submitted. | Deprecated. This status was defined for use with TempleReady which is no longer in use. |
231231
| `UNCLEARED` | All | Data for clearing the ordinance request was insufficient. | Deprecated. This status was defined for use with TempleReady which is no longer in use. |
232232

233-
### `g7:enumset-NAME-TYPE`
233+
### `g8:enumset-NAME-TYPE`
234234

235235
| Value | Meaning |
236236
| ----- | :---------------------------- |
237+
| `ADOPTED` | Given as part of being adopted into a family. |
237238
| `AKA` | Also known as, alias, etc. |
238239
| `BIRTH` | Name given at or near birth. |
240+
| `DIVORCED` | Name used after a divorce. |
241+
| `FORMAL` | A name used only in official, formal settings. |
242+
| `GENERAL` | A name used in a wide variety of settings, both formal and informal. |
243+
| `NICK` | A descriptive or familiar name that is used instead of, or in addition to, one’s official or legal name. Some cultures use this for any name that is not used in legal documents, others only for names that would be inappropriate in formal settings. |
239244
| `IMMIGRANT` | Name assumed at the time of immigration. |
245+
| `INFORMAL` | A name only used in casual, intimate, or informal settings. |
246+
| `LEGAL` | A name used for legal and official documents, but not in daily use. |
240247
| `MAIDEN` | Maiden name, name before first marriage. |
241248
| `MARRIED` | Married name, assumed as part of marriage. |
242249
| `PROFESSIONAL` | Name used professionally (pen, screen, stage name). |
243-
| `OTHER` | A value not listed here; should have a `PHRASE` substructure |
250+
| `RELIGIOUS` | Religious name, name adopted when joining a religious order. |
251+
| `VARIANT` | Different spelling for a name, including spellings based on other languages such as Latin or French. |
252+
| `OTHER` | A value not listed here; should have a `PHRASE` substructure. |
253+
254+
Five of these types deserve additional comparison:
255+
256+
- A `LEGAL` name would be used on a contract but not in formal or informal settings.
257+
- A `FORMAL` name would be used in formal settings but not informal ones; it is generally also used on contracts unless a different `LEGAL` name is present.
258+
- A `GENERAL` name is used in both formal and informal settings, and on contracts unless a different `LEGAL` name is present.
259+
- An `INFORMAL` name is used in informal settings but not in formal ones.
260+
- A `NICK` is in some way unofficial, though exactly how varies by culture and individual, and may have any of the other types listed here.
261+
262+
### `g8:enumset-NAME-FORM-TYPE`
263+
264+
| Value | Meaning |
265+
| ----- | :---------------------------- |
266+
| `FULL` | How a name is displayed when written out in full. Incompatible with `SHORT`. |
267+
| `SHORT` | An abbreviated version of a name. Incompatible with `FULL`. |
268+
| `INFERRED` | A form not found in a source, but inferred from what was in the source and the local naming patterns. |
269+
| `OTHER` | A value not listed here; should have a `PHRASE` substructure. |
270+
271+
It is expected that many name forms will have no `TYPE`.
272+
The researcher-preferred name form is indicated by its being the first `FORM` of the `NAME`, not by any `TYPE` value.
273+
274+
### `g8:enumset-NAME-PART-TYPE`
275+
276+
| Value | Meaning |
277+
| ----- | :---------------------------- |
278+
| `ADOPTED` | Given as part of being adopted into a family. |
279+
| `DIVORCED` | Name used after a divorce. |
280+
| `ESTATE` | House name, farm name, or name after moving into or marrying into a house/farm. Implies `LOCATION`. Incompatible with `SURN`. |
281+
| `FORMAL` | A name used only in official, formal settings. |
282+
| `GENERAL` | A name used in a wide variety of settings, both formal and informal. |
283+
| `GENERATIONAL` | A name part shared by a particular generation of a family (i.e. siblings or first cousins, but not their parents or children). Implies a cultural pattern of sharing this part, not just a particular family's aesthetic naming patterns. |
284+
| `GIVN` | A name given to an individual by someone's choice, rather than dictated by the rules of the culture, often to be used to identify that individual and differentiate them from other members of the same family or community. Incompatible with `SURN`. |
285+
| `HONORIFIC` | A word or phrase attached to a name in formal or polite context to indicate station, such as "Miss", "Doctor", "さん", "様", "mademoiselle", and so on. |
286+
| `IMMIGRANT` | Name assumed at the time of immigration. |
287+
| `INFORMAL` | A name only used in casual, intimate, or informal settings. |
288+
| `LEGAL` | A name used for legal and official documents, but not in daily use. |
289+
| `LOCATION` | A name indicating a location of note, such as a city associated with the person. Often includes "of" or "from" type particles. Incompatible with `SURN`. |
290+
| `MAIDEN` | Maiden name, name before first marriage. |
291+
| `MARRIED` | Married name, assumed as part of marriage. |
292+
| `MATERNAL` | A name inherited from the individual's mother's family. Implies `SURN`. |
293+
| `MATRONYMIC` | A name of the individual's mother, possibly with a matronymic modifier. |
294+
| `NICK` | A descriptive or familiar name that is used instead of, or in addition to, one’s official or legal name. Some cultures use this for any name that is not used in legal documents, others only for names that would be inappropriate in formal settings. |
295+
| `NPFX` | Text that appears on a name line before the given and surname parts of a name. Implies that the person attaches this part to their name, but does not consider it part of the name itself. |
296+
| `NSFX` | Text which appears on a name line after or behind the given and surname parts of a name. Implies that the person attaches this part to their name, but does not consider it part of the name itself. |
297+
| `PARTICLE` | A name part that connects or modifies other name parts but is not itself considered a name, like "of" or "son of". |
298+
| `PATERNAL` | A name inherited from the individual's father's family. Implies `SURN`. |
299+
| `PATRONYMIC` | A name of the individual's father, possibly with a patronymic modifier like prefix "bar" or "di ser" or suffix "sen" or "dotter". |
300+
| `PRIMARY` | The name most prominent in importance among the names of that type. Requires `GIVN`, `SURN`, `NPFX`, or `NSFX`. |
301+
| `PROFESSIONAL` | Name used professionally (pen, screen, stage name). |
302+
| `RANK` | A designation of rank or position, for example in a military ("private first class"), nobility ("viscount de Spoelberch"), or educational ("Ph.D.") system. |
303+
| `RELIGIOUS` | Religious name, name adopted when joining a religious order. |
304+
| `ROEPNAAM` | A name provided at birth for use in all situations except legal documents. Implies `GIVN` and `BIRTH`. The tag of this value comes from Dutch instead of English because no suitable English word was found; the value does not imply Dutch culture or ancestry. |
305+
| `RUFNAME` | A given name underlined or otherwise indicated on documents as one not to be omitted when only one given name is used. Implies `GIVN` and `PRIMARY`. The tag of this value comes from German instead of English because no suitable English word was found; the value does not imply German culture or ancestry. |
306+
| `SPFX` | A name piece used as a non-indexing pre-part of a surname. Should be displayed as part of surname, but ignored when sorting by surname. |
307+
| `SURN` | A family name passed on or used by members of a family. Because `SURN` was part of GEDCOM before most other non-`GIVN` name part types, some existing data labels name parts as `SURN` that are more correctly labeled as `LOCATION` or `PATRONYMIC`; that use of `SURN` is not recommended for new data. Incompatible with `GIVN`. |
308+
| `UNIFIED` | Unified spelling for a name part. Usually, though not always, paired with `VARIANT` and `SURN`. |
309+
| `VARIANT` | Different spelling for a name, such as an alternative spelling or gendered form; generally used for variants that are not part of the name's written forms but may be useful for indexing or searching. |
310+
| `OTHER` | A value not listed here; should have a `PHRASE` substructure. |
311+
312+
See also `g8:enumset-NAME-TYPE` for comparisons of some of these values.
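
The "Implies", "Incompatible with", and "Requires" notes in the name part table could be checked mechanically. The sketch below is one possible reading, not a normative algorithm: it assumes `List:Enum` payloads are comma-separated, that implied values are added before incompatibilities are checked, and that "Requires" means at least one of the listed values must be present. The tables and the function `check_part_types` are hypothetical and transcribe only the constraints stated above.

```python
from typing import Dict, List, Set, Tuple

# Constraints transcribed from the g8:enumset-NAME-PART-TYPE table.
IMPLIES: Dict[str, Set[str]] = {
    "ESTATE": {"LOCATION"},
    "MATERNAL": {"SURN"},
    "PATERNAL": {"SURN"},
    "ROEPNAAM": {"GIVN", "BIRTH"},
    "RUFNAME": {"GIVN", "PRIMARY"},
}
INCOMPATIBLE: List[Tuple[str, str]] = [
    ("ESTATE", "SURN"),
    ("GIVN", "SURN"),
    ("LOCATION", "SURN"),
]
PRIMARY_NEEDS: Set[str] = {"GIVN", "SURN", "NPFX", "NSFX"}


def parse_list_enum(payload: str) -> List[str]:
    """Split a comma-separated List:Enum payload into trimmed values."""
    return [value.strip() for value in payload.split(",") if value.strip()]


def expand(types: List[str]) -> Set[str]:
    """Add every value implied, directly or transitively, by the given ones."""
    result = set(types)
    pending = list(types)
    while pending:
        for implied in IMPLIES.get(pending.pop(), ()):
            if implied not in result:
                result.add(implied)
                pending.append(implied)
    return result


def check_part_types(payload: str) -> List[str]:
    """Return human-readable problems found in one PART.TYPE payload."""
    types = expand(parse_list_enum(payload))
    problems = []
    for first, second in INCOMPATIBLE:
        if first in types and second in types:
            problems.append(f"{first} is incompatible with {second}")
    if "PRIMARY" in types and not (types & PRIMARY_NEEDS):
        problems.append("PRIMARY requires GIVN, SURN, NPFX, or NSFX")
    return problems


print(check_part_types("SURN, PATERNAL"))
print(check_part_types("GIVN, SURN"))
```

Treating implied values as present means, for instance, that `MATERNAL, GIVN` is reported as a conflict because `MATERNAL` implies `SURN`; whether the draft intends that interaction is an open question for the working group.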
