Query regarding model licensing #444

Open
ThibautGoldsborough opened this issue Dec 13, 2024 · 5 comments

Comments

@ThibautGoldsborough

Hello,

Please feel free to redirect me to the Image.sc forum if this is not the right place to discuss licensing.

We are in the process of uploading a number of instanseg models and have a question about how to license them. The instanseg method is released under Apache-2.0, but the weights were trained on a number of datasets, each with its own licensing terms. So far we have kept to CC-BY or CC-0 datasets, as these are compatible with the Apache-2.0 terms. We would now like to release more models trained on datasets with less permissive licenses (e.g. the cellpose, tissuenet, and livecell datasets), some of which (cellpose and tissuenet) have confusing, strictly non-commercial custom licenses.

I noticed that some of the popular models hosted on bioimageio (e.g. https://bioimage.io/#/?id=10.5281%2Fzenodo.5869899&type=model) were released under a fully open license (CC-BY-4.0), yet the training data (livecell) is released under the more restrictive CC-BY-NC license. Does bioimageio assume that the license of a model is not tied to the license of the training dataset?

I'm aware this is not your direct responsibility, but there seems to be a lack of guidelines for model developers which could lead to unintended downstream risks for model users.

@FynnBe
Member

FynnBe commented Dec 13, 2024

Thank you @ThibautGoldsborough for raising this issue.
It is my current understanding that the license of an RDF applies to the metadata it holds only.
I will forward this to our weekly discussions and report back here (and hopefully update the documentation to make this clear(er)).

@FynnBe
Member

FynnBe commented Dec 20, 2024

Short update: we did not find time to discuss this this week, and we are taking a holiday break until January.

@FynnBe
Member

FynnBe commented Jan 8, 2025

Short update: this was raised in today's meeting. We are now working on defining the scope of the license field more precisely.

@fjug

fjug commented Jan 29, 2025

Hi @ThibautGoldsborough,

we have (finally) discussed the important issue you raised.

We would like to propose the following:

  • We update the documentation to make it clear that the license file only applies to the metadata of the model.
  • The only place where the license of the training data can/might affect us is the sample image we show in the model card. Here we have two options:
    • If the data license allows it, we associate and display this license in the model card, or
    • We use a sample image that is NOT taken from said training data (or at least not from the part that has the more restrictive license attached).

The next version of our model format should additionally be changed such that listed datasets also make the associated license explicit. This is not currently possible and a workaround would be to use any of the optional comment fields.
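
To illustrate what explicit per-dataset licenses could enable, here is a minimal sketch. It assumes a purely hypothetical metadata layout: the `training_data` key, its `name` and `license` entries, and the `restrictive_datasets` helper are illustrative names only and are not part of the current bioimageio model format.

```python
# Hypothetical model metadata. The field names below are assumptions made for
# illustration; they do not reflect the actual bioimageio specification.
example_metadata = {
    "name": "example-model",
    "license": "CC-BY-4.0",
    "training_data": [
        {"name": "dataset-a", "license": "CC-BY-4.0"},
        {"name": "dataset-b", "license": "CC-BY-NC-4.0"},
        {"name": "dataset-c"},  # license not stated
    ],
}


def restrictive_datasets(metadata):
    """Return names of listed datasets with a non-commercial or missing license."""
    flagged = []
    for dataset in metadata.get("training_data", []):
        license_id = dataset.get("license")
        # Treat an unstated license as needing review, and match "-NC"
        # identifiers such as CC-BY-NC-4.0 case-insensitively.
        if license_id is None or "-NC" in license_id.upper():
            flagged.append(dataset["name"])
    return flagged


print(restrictive_datasets(example_metadata))
```

A check like this only surfaces datasets that deserve a closer look; it does not decide whether a given model license is legally compatible with its training data.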

Please let us know if you have diverging opinions or if you want to discuss any other aspects regarding this topic.

Thanks a lot,
Florian (and Anna)

@alanocallaghan

alanocallaghan commented Feb 5, 2025

For datasets that require derivative works to be shared under a particular type of license, wouldn't this preclude sharing in the bioimageio format?

It's also unclear, if the license applies only to the metadata, which license all the other files distributed by bioimageio fall under. The use of any part of a bioimageio model therefore becomes legally unclear.
