Description
In #266 (comment), @candleindark identified an oddity in our metadata records: Affiliation records include fields which are not part of the Affiliation model. For example,
https://api.dandiarchive.org/api/dandisets/000029/versions/draft/info/ currently has
"affiliation": [
{
"name": "An Institution",
"roleName": [],
"schemaKey": "Affiliation",
"contactPoint": [],
"includeInCitation": false
}
],
After an archaeological expedition through the metadata, we figured that it is 99% likely due to
where affiliations got their own Affiliation class. However, the migrate()
function was not adjusted to filter these fields out. Here, though, we do not even need an explicit migration, since pydantic is likely to do the right thing:
In [10]: Affiliation.model_construct(**{
...: "name": "An Institution",
...: "roleName": [],
...: "schemaKey": "Organization",
...: "contactPoint": [],
...: "includeInCitation": False
...: }).model_dump()
Out[10]:
{'id': None,
'schemaKey': 'Organization',
'identifier': None,
'name': 'An Institution'}
and here it is with full validation:
In [11]: Affiliation(**{
...: "name": "An Institution",
...: "roleName": [],
...: "contactPoint": [],
...: "includeInCitation": False
...: }).model_dump()
Out[11]:
{'id': None,
'schemaKey': 'Affiliation',
'identifier': None,
'name': 'An Institution'}
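To make the difference concrete, here is a small stdlib-only sketch that lists the keys a stored record carries beyond what the model declares. The field set is an assumption read off the `model_dump()` output above, not taken from the dandi-schema source, and `extra_keys` is a hypothetical helper.

```python
# Field names assumed from the model_dump() output above; the real
# Affiliation model in dandi-schema may declare a different set.
ASSUMED_AFFILIATION_FIELDS = frozenset({"id", "schemaKey", "identifier", "name"})


def extra_keys(record: dict, allowed: frozenset = ASSUMED_AFFILIATION_FIELDS) -> list:
    """Return the sorted keys of `record` that are not in `allowed`."""
    return sorted(set(record) - allowed)


# The affiliation record as currently served for dandiset 000029.
stored = {
    "name": "An Institution",
    "roleName": [],
    "schemaKey": "Affiliation",
    "contactPoint": [],
    "includeInCitation": False,
}
print(extra_keys(stored))  # ['contactPoint', 'includeInCitation', 'roleName']
```

These three keys are exactly the ones pydantic silently drops in the sessions above.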
So the hypothesis is that the absence of metadata migration on the dandi-archive side, ref:
keeps old metadata versions around, and that is indeed the case:
dandi@drogon:/mnt/backup/dandi/dandisets$ grep -h schemaVersion */dandiset.yaml | sort | uniq -c
8 schemaVersion: 0.4.4
139 schemaVersion: 0.6.0
26 schemaVersion: 0.6.2
85 schemaVersion: 0.6.3
311 schemaVersion: 0.6.4
12 schemaVersion: 0.6.6
111 schemaVersion: 0.6.7
109 schemaVersion: 0.6.8
Keeping these old records would prevent us from validating against stricter models, such as ones that disallow extra fields, and could also simply cause "bugs" because the migration was never carried out at all.
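The per-version tally above can also be reproduced in Python, which may be handier if we later want to group dandisets by version. This is only a sketch: `count_schema_versions` is a hypothetical helper, and it uses a regular expression rather than a YAML parser, so it only approximates the `grep` pipeline (it takes the first top-level `schemaVersion:` line per file).

```python
import re
from collections import Counter
from pathlib import Path

# Matches a top-level "schemaVersion: X" line. A real YAML parser would be
# more robust; this only approximates the grep pipeline above.
_VERSION_RE = re.compile(r"^schemaVersion:\s*(\S+)", re.MULTILINE)


def count_schema_versions(root: Path) -> Counter:
    """Tally schemaVersion values found in <root>/*/dandiset.yaml."""
    counts = Counter()
    for path in sorted(root.glob("*/dandiset.yaml")):
        match = _VERSION_RE.search(path.read_text(encoding="utf-8"))
        if match:
            counts[match.group(1)] += 1
    return counts
```

Calling `count_schema_versions(Path("/mnt/backup/dandi/dandisets"))` on the backup host should give counts comparable to the listing above, provided each file has a single top-level `schemaVersion` line.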
On the dandi-schema side, I would like us to check what happens if we .migrate()
the metadata records for dandisets: would the migrations succeed or fail, and would they get rid of those irrelevant values?
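One possible shape for that check is sketched below. It is not tied to the actual dandi-schema API: `try_migrate_all` is a hypothetical helper, and the migration function is passed in as `migrate_fn` (assumed to be a thin wrapper around dandi-schema's `migrate()`), since its exact signature is not shown in this issue.

```python
from typing import Callable, Iterable


def try_migrate_all(
    records: Iterable[tuple[str, dict]],
    migrate_fn: Callable[[dict], dict],
) -> tuple[dict[str, dict], dict[str, str]]:
    """Apply migrate_fn to each (identifier, metadata) pair.

    Returns (migrated, failures): `migrated` maps identifier to the migrated
    record, `failures` maps identifier to a short error description.
    """
    migrated: dict[str, dict] = {}
    failures: dict[str, str] = {}
    for identifier, metadata in records:
        try:
            # Shallow copy, in case migrate_fn mutates top-level keys.
            migrated[identifier] = migrate_fn(dict(metadata))
        except Exception as exc:  # broad on purpose: we want a full report
            failures[identifier] = f"{type(exc).__name__}: {exc}"
    return migrated, failures
```

The `migrated` records could then be inspected for leftover keys such as `roleName` or `contactPoint` under `affiliation`, and `failures` would tell us which schema versions cannot be migrated as-is.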