TypeError while loading Mondo #142

Closed
jsstevenson opened this issue Sep 26, 2023 · 6 comments

@jsstevenson
Member

Loading Mondo...
Traceback (most recent call last):
  File "/Users/jamesstevenson/code/therapy-normalization/venv/bin/therapy_norm_update", line 8, in <module>
    sys.exit(update_normalizer_db())
  File "/Users/jamesstevenson/code/therapy-normalization/venv/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/jamesstevenson/code/therapy-normalization/venv/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/jamesstevenson/code/therapy-normalization/venv/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/jamesstevenson/code/therapy-normalization/venv/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/jamesstevenson/code/therapy-normalization/therapy/cli.py", line 294, in update_normalizer_db
    _check_disease_normalizer(sources, db, from_local)
  File "/Users/jamesstevenson/code/therapy-normalization/therapy/cli.py", line 130, in _check_disease_normalizer
    update_disease_db(
  File "/Users/jamesstevenson/code/therapy-normalization/venv/lib/python3.10/site-packages/disease/cli.py", line 121, in _update_normalizers
    _load_source(n, db, delete_time, processed_ids, from_local)
  File "/Users/jamesstevenson/code/therapy-normalization/venv/lib/python3.10/site-packages/disease/cli.py", line 168, in _load_source
    processed_ids += source.perform_etl(use_existing=from_local)
  File "/Users/jamesstevenson/code/therapy-normalization/venv/lib/python3.10/site-packages/disease/etl/base.py", line 46, in perform_etl
    self._transform_data()
  File "/Users/jamesstevenson/code/therapy-normalization/venv/lib/python3.10/site-packages/disease/etl/mondo.py", line 154, in _transform_data
    mondo = owl.get_ontology(self._data_file.absolute().as_uri()).load()
  File "/Users/jamesstevenson/code/therapy-normalization/venv/lib/python3.10/site-packages/owlready2/namespace.py", line 1121, in load
    if self.world.graph.indexed: self._load_properties()
  File "/Users/jamesstevenson/code/therapy-normalization/venv/lib/python3.10/site-packages/owlready2/namespace.py", line 1156, in _load_properties
    raise TypeError("'%s' belongs to more than one entity types (cannot be both a property and a class/an individual)!" % Prop.iri)
TypeError: 'http://purl.obolibrary.org/obo/PATO_0000051' belongs to more than one entity types (cannot be both a property and a class/an individual)!
@jsstevenson added the "bug" and "priority:high" labels Sep 26, 2023
@jsstevenson self-assigned this Sep 26, 2023
@jsstevenson
Member Author

Asked about this in the Mondo repo: monarch-initiative/mondo#6712

They may or may not make a change that reverts this -- it's legal OWL either way so they technically don't need to. In the meantime, I've been using the previous version of Mondo locally.

We probably should look into alternatives to OwlReady2, unfortunately.

@jsstevenson added the "technical debt" label Oct 10, 2023
@korikuzma
Member

@jsstevenson do you plan on making a fix for this soon? I'm testing out the automatic ddb updates and forgot about this for ETL

@jsstevenson
Member Author

Ugh, I was hoping they would just make a new release and we could slide by... I can prioritize a fix

@korikuzma
Member

@jsstevenson thank you!

@jsstevenson
Member Author

Update: I have tried out a few different OWL and OBO libraries. Some thoughts:

  • It'd be nice to switch over to the OBO format. OWL is probably more powerful than we need. OBO is much lighter and seems to include all the data that we'd like to extract. fastobo seems great. The OBO format of our test fixture input data is waaaay smaller than the OWL and much less complex to generate and use.
  • In working on this, I noticed that our current extraction process might be a little too greedy. We're picking up xrefs/assoc_with cases that we really shouldn't be ingesting. For example, the Mondo term for Hirschsprung-associated ganglioneuroblastoma lists the orphanet term for "neuroblastoma" as a "related" concept. My preference would be to not include this as an assoc_with reference, but we don't really have a way to exclude it right now.
  • Unfortunately, the qualifier in the OBO format of Mondo is provided in a format that isn't picked up by either of the OBO reading libraries that I tried. In other words, they can pick up everything we currently ingest, but they can't help us be pickier than we are now (even though we probably should be pickier).

tldr we can probably make the switch over to OBO, it's pretty nice. However, limitations with OBO readers prevent us from tightening how we ingest xrefs (tbf, I also can't figure out how to do this with the OWL libraries, even if the Mondo.owl import was working).
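
For reference, a minimal sketch of what "pickier" xref ingestion could look like if the qualifier were read directly from the raw OBO text instead of through a reader library. This is not what the normalizer does today. It assumes the qualifier appears as a trailing `{source="..."}` block on each `xref:` line and that values such as `MONDO:equivalentTo` and `MONDO:relatedTo` mark the mapping type; the sample stanza, its identifiers, and the helper names `parse_xref` and `strict_xrefs` are made up for illustration.

```python
import re

# An OBO "xref:" line: an ID, an optional quoted description, and an optional
# trailing qualifier block in braces (the brace format is an assumption here).
XREF_LINE = re.compile(
    r'^xref:\s*(?P<id>[^\s{]+)(?:\s+"[^"]*")?(?:\s*\{(?P<quals>.*)\})?\s*$'
)
# key="value" pairs inside the qualifier block.
QUALIFIER = re.compile(r'(\w+)="([^"]*)"')


def parse_xref(line):
    """Return (xref_id, {key: [values]}), or None if this is not an xref line."""
    match = XREF_LINE.match(line.strip())
    if match is None:
        return None
    quals = {}
    for key, value in QUALIFIER.findall(match.group("quals") or ""):
        quals.setdefault(key, []).append(value)
    return match.group("id"), quals


def strict_xrefs(lines, accepted=("MONDO:equivalentTo",)):
    """Keep only xrefs whose 'source' qualifier is one of the accepted values."""
    kept = []
    for line in lines:
        parsed = parse_xref(line)
        if parsed is None:
            continue
        xref_id, quals = parsed
        if any(source in accepted for source in quals.get("source", [])):
            kept.append(xref_id)
    return kept


# Invented stanza for illustration only; these are not real Mondo records.
SAMPLE = """\
[Term]
id: EXAMPLE:0001
name: hypothetical disease
xref: SRC_A:100 {source="MONDO:equivalentTo"}
xref: SRC_B:200 {source="MONDO:relatedTo"}
xref: SRC_C:300
"""

print(strict_xrefs(SAMPLE.splitlines()))
```

With this filter, "related" mappings like the neuroblastoma case above would be dropped, and so would xrefs that carry no qualifier at all. Whether unqualified xrefs should be kept is a separate decision, and a line-based pass like this would have to run alongside (or instead of) the OBO reader.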

@jsstevenson
Member Author

Closed in #171
