This repository was archived by the owner on Dec 2, 2021. It is now read-only.

Commit 625b971

cmharlow authored and mjgiarlo committed
Mappings docs (#14)
* add giarlos start
* add revised/clean CAP org fixture & git-ignored sensitive-fixtures dir
* mapping in progress
* second revision of mappings for cap orgs
* added VIVO example output in mapping
* minor repairs to org sample to match docs
* adding start of sample VIVO output to fixtures
* split CAP specific docs to docs, mappings to RIALTO all
1 parent f587c81 commit 625b971

5 files changed

+1029
-16
lines changed

.gitignore

+1
@@ -7,3 +7,4 @@
7 7
/pkg/
8 8
/spec/reports/
9 9
/tmp/
10+
/spec/sensitive-fixtures/

docs/CAP-organizations.md

+258
@@ -0,0 +1,258 @@
1+
# CAP Organizations to VIVO / RIALTO Stanford Organizations Mapping
2+
3+
This is mapping documentation for taking CAP API Organizations data (`http://api.stanford.edu/cap/v1/orgs/org-path-name`) and mapping it to our RIALTO model (based on the VIVO-ISF Ontology) for `Organizations` (a subclass of `Agents`).
4+
5+
## Mapping
6+
7+
Reused Ontologies List (to be vetted):
8+
- "bibo": "http://purl.org/ontology/bibo/"
9+
- "c4o": "http://purl.org/spar/c4o/"
10+
- "cito": "http://purl.org/spar/cito/"
11+
- "dbpedia": "http://dbpedia.org/resource/"
12+
- "dbo": "http://dbpedia.org/ontology/"
13+
- "event": "http://purl.org/NET/c4dm/event.owl#"
14+
- "fabio": "http://purl.org/spar/fabio/"
15+
- "foaf": "http://xmlns.com/foaf/0.1/"
16+
- "geo": "http://aims.fao.org/aos/geopolitical.owl#"
17+
- "obo": "http://purl.obolibrary.org/obo/"
18+
- "ocrer": "http://purl.org/net/OCRe/research.owl#"
19+
- "ocresd": "http://purl.org/net/OCRe/study_design.owl#"
20+
- "owl": "http://www.w3.org/2002/07/owl#"
21+
- "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
22+
- "rdfs": "http://www.w3.org/2000/01/rdf-schema#"
23+
- "scires": "http://vivoweb.org/ontology/scientific-research#"
24+
- "skos": "http://www.w3.org/2004/02/skos/core#"
25+
- "vcard": "http://www.w3.org/2006/vcard/ns#"
26+
- "vitro": "http://vitro.mannlib.cornell.edu/ns/vitro/0.7#"
27+
- "vitro-public": "http://vitro.mannlib.cornell.edu/ns/vitro/public#"
28+
- "vivo": "http://vivoweb.org/ontology/core#"
29+
- "xsd": "http://www.w3.org/2001/XMLSchema#"
30+
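The prefixes listed above are what turn a short name such as `vivo:School` into a full IRI. As a minimal sketch (the helper name `expand_curie` and the choice of a four-prefix subset are illustrative assumptions, not part of the mapping itself), the expansion could look like this:

```python
# Hypothetical helper: expand a prefixed name using a subset of the
# prefixes listed in this document.
PREFIXES = {
    "dbo": "http://dbpedia.org/ontology/",
    "obo": "http://purl.obolibrary.org/obo/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "vivo": "http://vivoweb.org/ontology/core#",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Return the full IRI for a prefixed name such as 'vivo:School'."""
    prefix, sep, local = curie.partition(":")
    # Anything without a known prefix (including full IRIs) is rejected.
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + local
```

For example, `expand_curie("obo:BFO_0000050")` yields the full OBO IRI for *part of*. A JSON-LD processor would normally do this expansion from the `@context`; the sketch only shows what that step amounts to.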
31+
For a given organization hash:
32+
33+
| CAP key | RIALTO entry | Notes |
34+
| ------------ | --------------------------------------------------------- | ----- |
35+
| 'type' | `rdf:type` / `@type` for given organization at RIALTO URI | See mapping below. |
36+
| 'alias'      | `@id` `http://rialto.stanford.edu/individual/{alias}`     | Domain may change. We want to confirm that the alias is consistent enough to use for minting resource URIs that will be fed by all data sources. |
37+
| 'alias' | `dbo:alias` then value as string | Capture the alias also in the metadata explicitly. |
38+
| 'name' | `rdfs:label` then value as string | any alt labels? repeated labels? need to check. |
39+
| 'orgCodes' | for each value, `dbo:code` then value as string | alternate identifiers? where will we look for later matching? |
40+
| 'children'   | `obo:BFO_0000051` (*has part*) then child's alias value as RIALTO URI | capture each presumed URI from the alias, but get the data for that specific organization from separate API calls...? See the question in the 'children' row of the keys table under Sample Input. |
41+
| 'children'   | for each child, `obo:BFO_0000050` (*part of*) then parent's RIALTO URI | how to make sure this adds data to the child's graph without removing data the parent won't know about? Or just use Stanford / ROOT and add all data for all Organizations from that? |
42+
| 'url' | `rdfs:seeAlso` then value as IRI | |
43+
| 'browsable' | n/a | Ignore. |
44+
| 'onboarding' | n/a | Ignore. |
45+
46+
| CAP Organization Type | RIALTO / VIVO Entity Type | Notes |
47+
| --------------------- | ------------------------------------ | ----- |
48+
| ROOT | vivo:University << foaf:Organization | |
49+
| SCHOOL | vivo:School << foaf:Organization | |
50+
| DEPARTMENT | vivo:Department << foaf:Organization | From VIVO for Department: "Use for any non-academic department" so this may not fit long-term. Seems like vivo:AcademicDepartment could be better, but departments in CAP are not consistently academic or other. |
51+
| DIVISION | vivo:Division << foaf:Organization, vivo:ExtensionUnit | From VIVO: subclass of Extension Unit, "A unit devoted primarily to extension activities, whether for outreach or research", so this may not fit long term. |
52+
| SUB_DIVISION | vivo:Division << foaf:Organization, vivo:ExtensionUnit | See note above. No requirement to distinguish sub-ness in RIALTO. |
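
The two tables above can be read as a small lookup plus a per-key transformation. The following is a minimal sketch of that reading for a single organization hash, not the project's implementation: the names `RIALTO_BASE`, `TYPE_MAP`, and `map_org` are hypothetical, and `children` are deliberately left out here.

```python
# Assumption: base URI taken from the mapping table; the docs note the
# domain may change.
RIALTO_BASE = "http://rialto.stanford.edu/individual/"

# CAP organization type -> RIALTO / VIVO entity type, per the table above.
TYPE_MAP = {
    "ROOT": "vivo:University",
    "SCHOOL": "vivo:School",
    "DEPARTMENT": "vivo:Department",
    "DIVISION": "vivo:Division",
    "SUB_DIVISION": "vivo:Division",
}


def map_org(org):
    """Map one CAP organization hash to a JSON-LD node (children not followed)."""
    node = {
        "@id": RIALTO_BASE + org["alias"],
        "@type": TYPE_MAP[org["type"]],
        "dbo:alias": org["alias"],
        "rdfs:label": org["name"],
        "dbo:code": list(org.get("orgCodes", [])),
    }
    # 'url' is optional in the CAP data, so only emit rdfs:seeAlso when present.
    if "url" in org:
        node["rdfs:seeAlso"] = org["url"]
    # 'browsable' and 'onboarding' are ignored, as the table specifies.
    return node
```

An unknown `type` raises `KeyError` in this sketch, which makes unmapped CAP types visible rather than silently defaulting them.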
53+
54+
## Sample Input
55+
56+
Sample source CAP data for a provided Organization is in [our fixtures (this has been shortened and any real values replaced)](../spec/fixtures/cap/organization.json). See a simplified example of the JSON output below:
57+
58+
```JSON
59+
{
60+
"alias": "stanford-test",
61+
"browsable": false,
62+
"children": [{
63+
"alias": "department-of-funny-walks",
64+
"browsable": false,
65+
"children": [{
66+
"alias": "department-of-funny-walks/intercollegiate-walks",
67+
"browsable": false,
68+
"name": "Intercollegiate Walks",
69+
"onboarding": true,
70+
"orgCodes": [
71+
"WALK",
72+
"WALZ"
73+
],
74+
"type": "DEPARTMENT"
75+
},
76+
{
77+
"alias": "department-of-funny-walks/walks-education",
78+
"browsable": false,
79+
"children": [{
80+
"alias": "department-of-funny-walks/walks-education/adventure-walks",
81+
"browsable": false,
82+
"name": "Adventure Walks",
83+
"onboarding": true,
84+
"orgCodes": [
85+
"ADVE"
86+
],
87+
"type": "DIVISION"
88+
}
89+
],
90+
"name": "Walks Education",
91+
"onboarding": true,
92+
"orgCodes": [
93+
"EDUC",
94+
"WEDU",
95+
"EDUW",
96+
"WAED",
97+
"EDWA"
98+
],
99+
"type": "DEPARTMENT"
100+
}
101+
],
102+
"name": "Department of Funny Walks",
103+
"onboarding": false,
104+
"orgCodes": [
105+
"HAAA"
106+
],
107+
"type": "SCHOOL"
108+
},
109+
{
110+
"alias": "graduate-school-of-parrots",
111+
"browsable": false,
112+
"name": "Graduate School of Parrots",
113+
"onboarding": true,
114+
"orgCodes": [
115+
"PARR"
116+
],
117+
"type": "SCHOOL",
118+
"url": "http://parrots.python.pizza/"
119+
}],
120+
"name": "Stanford Test",
121+
"onboarding": false,
122+
"orgCodes": [
123+
"STAN"
124+
],
125+
"type": "ROOT",
126+
"url": "http://python.pizza/"
127+
}
128+
```
129+
130+
For any given Organization, these keys / fields appear in its Organization object:
131+
132+
| Key | Expectation | Definition | Notes |
133+
| ------------ | ----------------------------- | ---------- | ----- |
134+
| 'orgCodes'   | Array of 4-letter strings     | the Stanford-specific (for ... HR?) organization code or identifier | Previous projects suggest these can be helpful, but they may also reflect previous / no longer extant departments or relationships |
135+
| 'type'       | String, 1 of following values: `ROOT`, `SCHOOL`, `DEPARTMENT`, `DIVISION`, `SUB_DIVISION` | The type of organization within the University (aka the `ROOT`) | See mappings to RIALTO / VIVO types above |
136+
| 'name' | String | Name or label for the organization represented by the present JSON Object | n/a |
137+
| 'children' | Array of Organization Objects | Any organizations that are children of the organization represented by the present JSON Object | Should we iterate on these for data or just to know what orgs are children, then call their own API response separately? |
138+
| 'browsable'  | Boolean                       | Uncertain. Possibly indicates whether the data is public? | n/a |
139+
| 'alias' | String, API query path value | The API URL path value for the organization. | Is this used for anything other than the API? |
140+
| 'url' | String, HTTP URL | URL provided for the given organization. | n/a |
141+
| 'onboarding' | Boolean                       | Uncertain. Possibly indicates whether onboarding exists for the organization? | n/a |
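
The expectations in this table can be checked mechanically before any mapping is attempted. Below is a minimal sketch of such a check; the function name `check_org` is hypothetical, and the upper-case-letters rule for `orgCodes` is an assumption drawn from the sample data rather than a documented constraint.

```python
import re

ORG_TYPES = {"ROOT", "SCHOOL", "DEPARTMENT", "DIVISION", "SUB_DIVISION"}
# Assumption: codes are four upper-case letters, as in the sample data.
ORG_CODE = re.compile(r"[A-Z]{4}")


def check_org(org):
    """Return a list of problems found in an organization and its descendants."""
    label = org.get("alias", "(no alias)")
    problems = []
    for key in ("alias", "name"):
        if not isinstance(org.get(key), str):
            problems.append(f"{label}: '{key}' should be a string")
    if org.get("type") not in ORG_TYPES:
        problems.append(f"{label}: unexpected type {org.get('type')!r}")
    for code in org.get("orgCodes", []):
        if not (isinstance(code, str) and ORG_CODE.fullmatch(code)):
            problems.append(f"{label}: unexpected org code {code!r}")
    # Children are full Organization objects, so the same check recurses.
    for child in org.get("children", []):
        problems.extend(check_org(child))
    return problems
```

An empty list means the object matched every expectation checked here; `browsable`, `onboarding`, and `url` are not validated because their meaning is still uncertain or they are optional.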
142+
143+
## Sample Output
144+
145+
Sample output VIVO JSON-LD data for a provided Organization is in [our fixtures (this has been shortened and any real values replaced)](../spec/fixtures/vivo/org-out.json). See a simplified example of the JSON output below:
146+
147+
```JSON
148+
{
149+
"@context": {
150+
"dbpedia": "http://dbpedia.org/resource/",
151+
"dbo": "http://dbpedia.org/ontology/",
152+
"obo": "http://purl.obolibrary.org/obo/",
153+
"rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
154+
"rdfs": "http://www.w3.org/2000/01/rdf-schema#",
155+
"vivo": "http://vivoweb.org/ontology/core#"
156+
},
157+
"@graph": [
158+
{
159+
"@id": "http://rialto.stanford.edu/individual/stanford-test",
160+
"@type": "vivo:University",
161+
"dbo:alias": "stanford-test",
162+
"rdfs:label": "Stanford Test",
163+
"rdfs:seeAlso": "http://python.pizza/",
164+
"dbo:code": [
165+
"STAN"
166+
],
167+
"obo:BFO_0000051": [
168+
{
169+
"@id": "http://rialto.stanford.edu/individual/department-of-funny-walks"
170+
},
171+
{
172+
"@id": "http://rialto.stanford.edu/individual/graduate-school-of-parrots"
173+
}
174+
],
175+
"obo:RO_0001025": {
176+
"@id": "dbpedia:Palo_Alto,_California"
177+
}
178+
},
179+
{
180+
"@id": "http://rialto.stanford.edu/individual/graduate-school-of-parrots",
181+
"@type": "vivo:School",
182+
"dbo:alias": "graduate-school-of-parrots",
183+
"rdfs:label": "Graduate School of Parrots",
184+
"rdfs:seeAlso": "http://parrots.python.pizza/",
185+
"dbo:code": [
186+
"PARR"
187+
],
188+
"obo:BFO_0000050": {
189+
"@id": "http://rialto.stanford.edu/individual/stanford-test"
190+
}
191+
},
192+
{
193+
"@id": "http://rialto.stanford.edu/individual/department-of-funny-walks",
194+
"@type": "vivo:School",
195+
"dbo:alias": "department-of-funny-walks",
196+
"rdfs:label": "Department of Funny Walks",
197+
"dbo:code": [
198+
"HAAA"
199+
],
200+
"obo:BFO_0000050": {
201+
"@id": "http://rialto.stanford.edu/individual/stanford-test"
202+
},
203+
"obo:BFO_0000051": [
204+
{
205+
"@id": "http://rialto.stanford.edu/individual/department-of-funny-walks/intercollegiate-walks"
206+
},
207+
{
208+
"@id": "http://rialto.stanford.edu/individual/department-of-funny-walks/walks-education"
209+
}
210+
]
211+
},
212+
{
213+
"@id": "http://rialto.stanford.edu/individual/department-of-funny-walks/intercollegiate-walks",
214+
"@type": "vivo:Department",
215+
"dbo:alias": "department-of-funny-walks/intercollegiate-walks",
216+
"rdfs:label": "Intercollegiate Walks",
217+
"dbo:code": [
218+
"WALK",
219+
"WALZ"
220+
],
221+
"obo:BFO_0000051": {
222+
"@id": "http://rialto.stanford.edu/individual/department-of-funny-walks"
223+
}
224+
},
225+
{
226+
"@id": "http://rialto.stanford.edu/individual/department-of-funny-walks/walks-education",
227+
"@type": "vivo:Department",
228+
"dbo:alias": "department-of-funny-walks/walks-education",
229+
"rdfs:label": "Walks Education",
230+
"dbo:code": [
231+
"EDUC",
232+
"WEDU",
233+
"EDUW",
234+
"WAED",
235+
"EDWA"
236+
],
237+
"obo:BFO_0000051": {
238+
"@id": "http://rialto.stanford.edu/individual/department-of-funny-walks"
239+
},
240+
"obo:BFO_0000050": {
241+
"@id": "http://rialto.stanford.edu/individual/department-of-funny-walks/walks-education/adventure-walks"
242+
}
243+
},
244+
{
245+
"@id": "http://rialto.stanford.edu/individual/department-of-funny-walks/walks-education/adventure-walks",
246+
"@type": "vivo:Division",
247+
"dbo:alias": "department-of-funny-walks/walks-education/adventure-walks",
248+
"rdfs:label": "Adventure Walks",
249+
"dbo:code": [
250+
"ADVE"
251+
],
252+
"obo:BFO_0000051": {
253+
"@id": "http://rialto.stanford.edu/individual/department-of-funny-walks/walks-education"
254+
}
255+
}
256+
]
257+
}
258+
```
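
The `children` rows of the mapping table imply a walk over the nested CAP tree that emits *has part* (`obo:BFO_0000051`) from parent to child and *part of* (`obo:BFO_0000050`) from child to parent. As a minimal sketch, assuming all relationships are derived from a single ROOT response (one of the options raised in the notes, not a settled decision), the walk could look like this; `org_relations` is a hypothetical name:

```python
# Assumption: base URI taken from the mapping table; the docs note the
# domain may change.
RIALTO_BASE = "http://rialto.stanford.edu/individual/"
HAS_PART = "obo:BFO_0000051"
PART_OF = "obo:BFO_0000050"


def org_relations(org, parent_id=None):
    """Yield one relationship-only JSON-LD node per organization in the tree."""
    org_id = RIALTO_BASE + org["alias"]
    node = {"@id": org_id}
    # Every organization except the root points back at its parent.
    if parent_id is not None:
        node[PART_OF] = {"@id": parent_id}
    children = org.get("children", [])
    if children:
        node[HAS_PART] = [{"@id": RIALTO_BASE + c["alias"]} for c in children]
    yield node
    # Depth-first: descend into each child with this organization as parent.
    for child in children:
        yield from org_relations(child, parent_id=org_id)
```

Calling `list(org_relations(root))` on the ROOT object gives nodes that could be merged with the per-organization properties to build the `@graph` array. Whether child data should instead come from separate API calls remains an open question in the notes.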

mapping.md

+43-16
@@ -1,22 +1,49 @@
1-
TODO: wrap below mappings in @graph => {}
1+
# RIALTO / VIVO Mapping & Mapping Target
2 2

3-
each mapping should include: @id, @type, rdfs:label, one or more obo:BFO\_0000050 or obo:BFO\_0000051
3+
This is mapping documentation for the end result of mapping our selected sources to RIALTO / VIVO models. See more information in our [docs folder](docs). This will be iterated on as sources and types are mapped.
4 4

5-
alias (string) => @id http://authorities.stanford.edu/orgs#{alias}
6-
browsable (boolean) => ignore
7-
children (array) => keep track of parent, iterate over values (for each obo:BFO\_0000050/partOf) and map, keep track of children for obo:BFO\_0000051/hasPart
8-
name (string) => rdfs:label
9-
onboarding (boolean) => ignore
10-
orgCodes (array) => vivo:abbreviation
11-
type (string) => see type mappings
12-
url (string) => rdfs:seeAlso
5+
## Reused Ontologies List (to be further vetted)
13 6

14-
type mappings
7+
- "dbpedia": "http://dbpedia.org/resource/"
8+
- "dbo": "http://dbpedia.org/ontology/"
9+
- "foaf": "http://xmlns.com/foaf/0.1/"
10+
- "obo": "http://purl.obolibrary.org/obo/"
11+
- "owl": "http://www.w3.org/2002/07/owl#"
12+
- "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
13+
- "rdfs": "http://www.w3.org/2000/01/rdf-schema#"
14+
- "skos": "http://www.w3.org/2004/02/skos/core#"
15+
- "vivo": "http://vivoweb.org/ontology/core#"
16+
- "xsd": "http://www.w3.org/2001/XMLSchema#"
15 17

16-
DIVISION @type: http://vivoweb.org/ontology/core#Division
17-
SUB_DIVISION @type: http://vivoweb.org/ontology/core#Division
18-
ROOT @type: http://vivoweb.org/ontology/core#University
19-
SCHOOL @type: http://vivoweb.org/ontology/core#School
20-
DEPARTMENT @type: http://vivoweb.org/ontology/core#Department
18+
## Overarching RIALTO Model
21 19

20+
TBD
22 21

22+
## Mappings to RIALTO
23+
24+
### For Organizations
25+
26+
| Source & key | RIALTO entry | Notes |
27+
| -------------- | --------------------------------------------------------- | ----- |
28+
| CAP 'type' | `rdf:type` / `@type` for given organization at RIALTO URI | See mapping below. |
29+
| CAP 'alias'    | `@id` `http://rialto.stanford.edu/individual/{alias}`     | Domain may change. We want to confirm that the alias is consistent enough to use for minting resource URIs that will be fed by all data sources. |
30+
| CAP 'alias' | `dbo:alias` then value as string | Capture the alias also in the metadata explicitly. |
31+
| CAP 'name' | `rdfs:label` then value as string | any alt labels? repeated labels? need to check. |
32+
| CAP 'orgCodes' | for each value, `dbo:code` then value as string | alternate identifiers? where will we look for later matching? |
33+
| CAP 'children' | `obo:BFO_0000051` (*has part*) then child's alias value as RIALTO URI | capture each presumed URI from the alias, but get the data for that specific organization from separate API calls...? See the related question in [docs/CAP-organizations.md](docs/CAP-organizations.md). |
34+
| CAP 'children' | for each child, `obo:BFO_0000050` (*part of*) then parent's RIALTO URI | how to make sure this adds data to the child's graph without removing data the parent won't know about? Or just use Stanford / ROOT and add all data for all Organizations from that? |
35+
| CAP 'url' | `rdfs:seeAlso` then value as IRI | |
36+
37+
### RIALTO Organization Types Mapping
38+
39+
| Source & Type | RIALTO / VIVO Entity Type | Notes |
40+
| --------------------- | ------------------------------------ | ----- |
41+
| CAP@type ROOT | vivo:University << foaf:Organization | |
42+
| CAP@type SCHOOL | vivo:School << foaf:Organization | |
43+
| CAP@type DEPARTMENT | vivo:Department << foaf:Organization | From VIVO for Department: "Use for any non-academic department" so this may not fit long-term. Seems like vivo:AcademicDepartment could be better, but departments in CAP are not consistently academic or other. |
44+
| CAP@type DIVISION | vivo:Division << foaf:Organization, vivo:ExtensionUnit | From VIVO: subclass of Extension Unit, "A unit devoted primarily to extension activities, whether for outreach or research", so this may not fit long term. |
45+
| CAP@type SUB_DIVISION | vivo:Division << foaf:Organization, vivo:ExtensionUnit | See note above. No requirement to distinguish sub-ness in RIALTO. |
46+
47+
## Sample RIALTO Graph
48+
49+
Sample output VIVO JSON-LD data for a provided Organization is in [our fixtures (this has been shortened and any real values replaced)](spec/fixtures/vivo/org-out.json). A larger file with a fuller graph generated from multiple sources will be added in the near future.
