
Add datalab support #136

Merged: SteffenBrinckmann merged 1 commit into TheELNConsortium:master from ml-evs:ml-evs/datalab on Feb 25, 2026

@ml-evs
Contributor

@ml-evs ml-evs commented Oct 20, 2025

We are working on .eln export for datalab and have a minimal working version (datalab-org/datalab#1371 -- cc @BenjaminCharmes). This PR begins to add our example (copied from the other projects for now) and will let us iterate on support for all of our fields.

I had a couple of Q's:

  • How are different platforms claiming .eln file import support, given how broadly it can be interpreted? Is it enough to simply store the file, or is there an expectation that entry types will be mapped to the underlying standard of the receiving ELN?
  • Is it possible in the current standard to link to (potentially very large) raw data files by reference rather than including them in the .eln archive, or is that out of scope?

Cheers!

@NicolasCARPi
Contributor

> is there an expectation that entry types will be mapped to the underlying standard of the receiving ELN?

Yes, things are mapped, see for example:
https://github.com/TheELNConsortium/TheELNFileFormat/tree/master/examples/elabftw#concepts-used

So for instance, in eLabFTW there are tags. They become keywords in the .eln, so every time there is some content in the keywords attribute of a dataset, the import code interprets it as tags.
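
As a rough illustration of that mapping, here is a minimal Python sketch. It is not eLabFTW's actual import code; the function name `tags_from_dataset` and the sample entity are hypothetical, and it assumes `keywords` may arrive either as a comma-separated string or as a list.

```python
def tags_from_dataset(entity: dict) -> list[str]:
    """Return the tags implied by a Dataset's `keywords` attribute.

    Hypothetical helper: assumes keywords is either a comma-separated
    string or a list of strings.
    """
    keywords = entity.get("keywords", [])
    if isinstance(keywords, str):
        keywords = keywords.split(",")
    # Drop surrounding whitespace and empty items.
    return [k.strip() for k in keywords if k.strip()]


# Hypothetical Dataset entity as it might appear in ro-crate-metadata.json.
dataset = {
    "@id": "./some-experiment/",
    "@type": "Dataset",
    "keywords": "xrd, battery, sodium",
}
print(tags_from_dataset(dataset))  # ['xrd', 'battery', 'sodium']
```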

> Is it possible in the current standard to link to (potentially very large) raw data files by reference rather than including them in the .eln archive, or is that out of scope?

Yes, very much possible, but we decided not to import external data (for example, fetching a 4 GB file over the network).
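
One way to picture this, as a sketch only: it assumes the file is referenced by using an absolute URL as the entity's `@id`, and that an importer simply skips such entries rather than downloading them. The URL, the graph contents and the helper `is_remote` are all hypothetical.

```python
from urllib.parse import urlparse


def is_remote(entity_id: str) -> bool:
    """True if the @id is an absolute http(s) URL rather than a path in the archive."""
    return urlparse(entity_id).scheme in ("http", "https")


# Hypothetical graph: one file shipped inside the archive, one linked by reference.
graph = [
    {
        "@id": "./sample/",
        "@type": "Dataset",
        "hasPart": [
            {"@id": "./sample/notes.txt"},
            {"@id": "https://example.org/raw/scan.xrdml"},
        ],
    },
    {"@id": "./sample/notes.txt", "@type": "File"},
    {"@id": "https://example.org/raw/scan.xrdml", "@type": "File"},
]

files = [e["@id"] for e in graph if e.get("@type") == "File"]
to_import = [f for f in files if not is_remote(f)]  # read from the archive
skipped = [f for f in files if is_remote(f)]        # kept as links, never fetched

print(to_import)  # ['./sample/notes.txt']
print(skipped)    # ['https://example.org/raw/scan.xrdml']
```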

@NicolasCARPi NicolasCARPi self-requested a review October 20, 2025 20:13
@ml-evs
Contributor Author

ml-evs commented Oct 20, 2025

Hi @NicolasCARPi, thanks for the quick response. I'll look into that mapping for fields. I guess my issue with import into datalab is that we would treat every ELN "entry" as a physical sample at the moment rather than something generic -- I'll probably leave import for another day and focus on export for now.

@NicolasCARPi
Contributor

@ml-evs Hello Matthew, think you can update this PR? Otherwise we will close it.

@ml-evs ml-evs marked this pull request as ready for review February 9, 2026 14:00
@ml-evs ml-evs changed the title [WIP] Add datalab support Add datalab support Feb 9, 2026
@ml-evs
Contributor Author

ml-evs commented Feb 9, 2026

> @ml-evs Hello Matthew, think you can update this PR? Otherwise we will close it.

Thanks for the reminder -- I've just pushed an example from our latest release, hopefully this is ready-for-review now!

@ml-evs
Contributor Author

ml-evs commented Feb 9, 2026

CI failures look like an intermittent GitHub issue, but there are still some failing tests locally -- will fix now.

@ml-evs
Contributor Author

ml-evs commented Feb 9, 2026

Looks like the only outstanding issue is how we are handling nesting of samples -> files. The generic ro-crate tools all seem to work fine and show each sample as a sub-crate, does .eln have some harsher requirements than this?

@SteffenBrinckmann
Collaborator

Yes, there are some more stringent tests: see tests/check.py and the function checkParamMetadataJson, which verifies that certain keys exist.

The issue seems to be:
**ERROR: all entries must only occur once in crate. check: ./jdb2/CG20474_jdb11-2a.xrdml
**ERROR: all entries must only occur once in crate. check: ./jdb2_e1_c1/jdb11-1_c3_gcpl_5cycles_2V-3p8V_C-24_data_C09.mpr

Does that help, or should I look into your .eln file?

@ml-evs
Contributor Author

ml-evs commented Feb 11, 2026

Yeah, I've replicated that error locally but I'm not sure I understand it -- each file is only there once in the crate and once in the metadata.

@SteffenBrinckmann
Collaborator

In ro-crate-metadata.json:64, you state that there is a hasPart with an @id, but that @id is not defined as an entity anywhere in the file.

    {
      "@id": "./jdb2/",
      "@type": "Dataset",
      "name": "sodium cobalt oxide made by solid state synthesis (2nd attempt)",
      "identifier": "jdb2",
      "dateCreated": "2024-02-22T07:19:00",
      "hasPart": [
        {
          "@id": "./jdb2/CG20474_jdb11-2a.xrdml"
        }
      ]
    },

Same issue a few lines down (line 83): you reference an @id, but that @id is not defined in the file.

So the issue is not that an item is duplicated, but that some of the ids you reference do not exist.
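
A minimal sketch of that kind of check, for illustration only: this is not the actual logic in tests/check.py, and the function name `dangling_references` is hypothetical. It collects every `@id` defined at the top level of the `@graph` and reports any `hasPart` target that has no matching entity.

```python
def dangling_references(graph: list[dict]) -> list[str]:
    """Return hasPart targets that are not defined as entities in the graph."""
    defined = {entity["@id"] for entity in graph if "@id" in entity}
    missing = []
    for entity in graph:
        parts = entity.get("hasPart", [])
        if isinstance(parts, dict):  # tolerate a single reference instead of a list
            parts = [parts]
        for part in parts:
            if part["@id"] not in defined:
                missing.append(part["@id"])
    return missing


# The Dataset from the excerpt above, with no entity for the referenced file.
graph = [
    {
        "@id": "./jdb2/",
        "@type": "Dataset",
        "hasPart": [{"@id": "./jdb2/CG20474_jdb11-2a.xrdml"}],
    },
]
print(dangling_references(graph))  # ['./jdb2/CG20474_jdb11-2a.xrdml']

# Adding the File entity to the graph resolves the reference.
graph.append({"@id": "./jdb2/CG20474_jdb11-2a.xrdml", "@type": "File"})
print(dangling_references(graph))  # []
```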

@ml-evs
Contributor Author

ml-evs commented Feb 11, 2026

> In ro-crate-metadata.json:64, you state that there is a hasPart with an @id, but that @id is not defined as an entity anywhere in the file.
>
>     {
>       "@id": "./jdb2/",
>       "@type": "Dataset",
>       "name": "sodium cobalt oxide made by solid state synthesis (2nd attempt)",
>       "identifier": "jdb2",
>       "dateCreated": "2024-02-22T07:19:00",
>       "hasPart": [
>         {
>           "@id": "./jdb2/CG20474_jdb11-2a.xrdml"
>         }
>       ]
>     },
>
> Same issue a few lines down (line 83): you reference an @id, but that @id is not defined in the file.
>
> So the issue is not that an item is duplicated, but that some of the ids you reference do not exist.

Ahhh, that makes more sense -- the error message is a bit of a red herring but I can fix that on our end this week.

@ml-evs
Contributor Author

ml-evs commented Feb 12, 2026

Down to two final issues in local testing:

- [{'@id': './people/65d6e50050726b088d328499'}, {'@id': './people/6574f788aabb227db8d1b14e'}] is not of type 'object'

How should I specify multiple authors? I'm trying to set authors to the above but the validator isn't allowing it.

Detected issue of severity REQUIRED with check "ro-crate-1.1_2.2": RO-Crate file descriptor "ro-crate-metadata.json" is not fully flattened at entity "./"

Assume this is something straightforward I'm missing -- can look at the validator code for it directly.

@SteffenBrinckmann
Collaborator

Hey,

  • author: according to schema.org, this is a single person or organization, i.e. it is a restricted entity. Using "authors" avoids that restriction and is grammatically correct for multiple people anyway. I always use 'authors' instead of 'author' to avoid that trap.
  • In './' you have a license entry, which is itself a node and should not be nested hierarchically but listed separately, according to the RO-Crate standard. If I remove the 'name', it works flawlessly; the @id is sufficient as a URL.
  • In the .eln format we have agreed that the filename is the same as the folder name when extracted. An earlier version of your PR followed that guideline. Currently you use "demo:IBPDKL", which does not.
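
The flattening point in the second bullet can be sketched as follows. This is an illustrative assumption about one possible fix, not the change that was actually made in the PR (the simpler route described above is to drop the 'name' and keep only the @id). The helper `flatten_root_license`, the license URL and the `CreativeWork` default type are all choices made for this example.

```python
def flatten_root_license(graph: list[dict]) -> list[dict]:
    """Move a nested license node out of the root entity './' (in place)."""
    root = next(entity for entity in graph if entity.get("@id") == "./")
    nested = root.get("license")
    # Only act if the license carries more than a bare {"@id": ...} reference.
    if isinstance(nested, dict) and set(nested) - {"@id"}:
        root["license"] = {"@id": nested["@id"]}
        if not any(entity.get("@id") == nested["@id"] for entity in graph):
            nested.setdefault("@type", "CreativeWork")  # assumed default type
            graph.append(nested)
    return graph


# Hypothetical root entity with a nested license node.
graph = [
    {
        "@id": "./",
        "@type": "Dataset",
        "license": {
            "@id": "https://creativecommons.org/licenses/by/4.0/",
            "name": "CC BY 4.0",
        },
    },
]
flatten_root_license(graph)
print(graph[0]["license"])  # {'@id': 'https://creativecommons.org/licenses/by/4.0/'}
print(graph[1]["name"])     # CC BY 4.0
```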

- Add screenshot and note about relationships

- Add example ELN export from demo server

- Update datalab example with version and license info, plus remove null dates

- Fix several issues (but not all) in datalab example

- Fix eln zip naming and other tweaks
@ml-evs
Contributor Author

ml-evs commented Feb 16, 2026

Thanks @SteffenBrinckmann, that's super helpful -- I did look for plural authors in schema.org and was surprised not to find something similar -- I've made your other suggested fixes and believe the tests should now pass... I get 7 warnings locally, but these are also present if I remove the datalab folder, so hopefully I haven't made anything worse!

@SteffenBrinckmann SteffenBrinckmann merged commit 55176a5 into TheELNConsortium:master Feb 25, 2026
1 check passed