Enable occurrence search by IUCN redlist category #257
@timrobertson100 I assume that this is the ws we should use (match by name): https://apiv3.iucnredlist.org/api/v3/docs#species-individual-name |
@andrewrodrigues will confirm. He is our IUCN liaison and requested this (and the other IUCN ones I just logged while we were on a call) I believe that is the correct one, so extracting the |
When establishing the IUCN API Token please use [email protected] or similar (@MattBlissett - your preference?) and not a personal email address. |
Yes, that would be the ws. There may be some taxonomic issues, as we do not have an up-to-date Red List published, but I am working on that. |
Is there a reason why the threat status is coming from Wikidata but the source link takes you through to the Red List page? Can you confirm that we will be getting the threat statuses through their API? There was a suggestion from IUCN to automate the process of IUCN checklist update from our side through their - Marie had said this was possible - and at the same time retrieve the threat statuses of the species. |
The scale we use for showing the Red List category is from Wikidata and is not the correct branding that IUCN uses. This will need to be updated. |
(@andrewrodrigues, please try and keep the discussion on this issue, and open new issues as required -- it's very confusing tracing a topic through 3-4 different issues! https://github.com/search?type=Issues&state=open&q=org%3Agbif+IUCN is all GBIF issues with "IUCN" mentioned, and https://github.com/issues/mentioned those where you are mentioned, and https://github.com/issues/assigned those where you have been assigned. The latter two come via "Issues" at the top of the page.)
I have made an issue for this, gbif/data-mobilization#175.
Their API wasn't fast / reliable enough. See gbif/portal16#1322 (comment)
Receiving it as a published dataset is preferable, based on our past experience with their API. Building it from their API, and "publishing" it ourselves, is probably still better than using an unreliable API -- it's fast, and the origin of the data is easier to trace. |
The IUCN Red List data is now available on GBIF: https://doi.org/10.15468/0qnb58 We can now use GBIF APIs to interpret the Red List status of occurrences in GBIF. (However, I think there was another discussion on whether we should enable a filter for (e.g.) all endangered species seen within the last week, and I do not remember the conclusion.) |
With this now in place, I think we can proceed as follows:
This does have the caveat that this is an assignment of the category based on global and not local red lists i.e. all occurrence records will carry the global red list category regardless of any local red list process. @andrewrodrigues - can you please confirm this is as agreed? |
Yes, this is correct. Would be good to set this up in UAT first before gbif.org to test the functionality and capture any concerns before it is rolled out. |
@mdoering, @fmendezh, @andrewrodrigues, @ahahn-gbif I had a quick look at the suggested change to Checklistbank, but I think we need a bit more logic around synonymy -- or at least, to discuss whether we need to handle it, and work out where we explain any differences between the Red List and the GBIF Backbone. For example, our occurrences of Loxodonta cyclotis (African elephant) should be marked as vulnerable, but the IUCN name is a synonym (without a threat status). It's IUCN Loxodonta africana (African elephant) that is accepted in the IUCN Red List, and has the threat status. There's also the situation where the occurrence matches to a synonym backbone name but the accepted backbone name has an IUCN name with a RL status. |
Solving the synonym issues properly is difficult, as this is exactly the reason why we need to work with taxon concepts, not just names as we do now. It makes a real difference which threat status a species has if the name was split and there is a broader and a narrower concept, i.e. more or fewer included individuals. Nevertheless, knowing we don't show the right status in some cases, how about:
|
…UCN synonyms to their accepted name, see gbif/pipelines#257
* Adding the IUCN RedList Category to the name match cache gbif/pipelines#257 * Checking possible null values in the NameUsageMatch response and in WS responses
#257 Adding IUCN RedList Category to taxon record, hdfs record, and Elasticsearch record and schema
Aside: @andrewrodrigues verified with the IUCN that the name of this field internally, in downloads, in the API, and on the web should be |
Here's a real example: Goniastrea deformis is a synonym in GBIF's backbone: https://www.gbif.org/species/2260144 (and has occurrences) But in the IUCN Red List, it is an accepted name with Vulnerable status: https://www.gbif.org/species/176597529 |
I agree with @mdoering that the best approach is to use the nubKey to assess the threat status and not use the accepted name. If we follow this approach the JSON response can stay as it is with only the threat status name and code. |
Using the nubKey to get the IUCN Red List category; the accepted name is not used, to avoid misleading results after mixing multiple checklists
We could return the accepted name according to IUCN - that might be informative to users. |
Thinking through what information you would want to retrieve from a search: if you searched a country for all globally critically endangered species, would you
Would this be the case with the approach above? One problem I can see with this approach is where species have been lumped and there have been assessments of all the previous species but not of the new species. In this instance we might have several assessments associated to one accepted taxon in GBIF. Not sure how to get around that. |
I would ignore the GBIF taxonomy and try to look up the GBIF name in IUCN and evaluate the Red List category according to IUCN's taxonomy, as this is what the assessment is based on. In order to show that the assessed IUCN name was a different one, we should return the accepted IUCN name from the API, as well as the exact scientific name spelling used in IUCN and possibly the taxonomic status & rank, as we are not only dealing with species.
For anything more accurate we would need a better handle on taxon concepts. By including the redlist in the backbone sources we also make sure that all names in IUCN are included in the backbone and we can do the name matching from occurrences in all cases. Right now we get 91% overlap only, i.e. there are some IUCN names that we miss. |
This sounds like a good approach, and I am assuming that with the Red List in the backbone it would then also be able to pick up examples like this? |
If we followed this approach, then in this case all 14 records are provided with a scientific name of G. deformis, and so yes, they would pick up vulnerable. If they had been supplied with P. deformis as the scientific name, then no, they would not get a category. The nice thing about ignoring the GBIF Backbone would be that it is a defensible approach, but it may in some cases miss things that arguably could be inferred. |
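A minimal sketch of the lookup described above, assuming the category is resolved purely from the scientific name as supplied on the occurrence and never via the GBIF Backbone. The `IUCN_NAMES` table, the `iucn_category` function and the `"VU"` code are hypothetical stand-ins for illustration; the real implementation lives in the pipelines and name-match code referenced in this issue.

```python
from typing import Optional

# Hypothetical, hand-written stand-in for the IUCN checklist (name -> record).
# The single entry mirrors the Goniastrea deformis example from this thread.
IUCN_NAMES = {
    "Goniastrea deformis": {"status": "ACCEPTED", "category": "VU"},
}


def iucn_category(scientific_name: str) -> Optional[str]:
    """Return the IUCN category for the name exactly as supplied.

    The GBIF Backbone is deliberately not consulted: a name that IUCN does
    not list gets no category, even if the backbone treats it as a synonym
    of an assessed species.
    """
    record = IUCN_NAMES.get(scientific_name.strip())
    if record is None:
        return None
    return record["category"]
```

With this table, a record supplied as `Goniastrea deformis` picks up the category, while a record supplied under any name that IUCN does not list returns `None`.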
Thanks Tim for explaining. This definitely seems like the best approach. |
Not sure why the filter was taken off the UAT environment. Can this filter be introduced again with a view to making it available outside of the UAT environment? |
It was removed from the UAT website because UAT is used for many other tests than IUCN. Do you need it to be public or would a private test environment be sufficient? |
It would be good to make the filter public. |
Note also the related issue here: #495. There's also a more recent version of the IUCN Red List. I will update our imported version. |
Is it possible to automate the redlist dwca generation and build it regularly? |
There are limitations on the size of the calls we can make on the API, which may limit the automation of Red List updates; I will leave it with Matt for more details. Regarding the periodicity of updates, there is no fixed schedule: the list is updated on a continual basis. I would suggest updates every 6 months, unless there are large updates that we know are coming up, such as for the Congress, where we might want additional updates. |
We weren't able to use the API, as the bulk download we need to use isn't available that way, and downloading the whole list through the API isn't practical. I think they mark a release every 6 months or so as a new version. The current version is 2021-2, which is shown in the API: https://apiv3.iucnredlist.org/api/v3/version . I can set up a monitor to detect when that number increases, and prompt someone (me) to download the new data. |
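A rough sketch of such a monitor, under two assumptions that are not confirmed in this thread: that version labels always have the `YEAR-RELEASE` form of `2021-2`, and that the version endpoint returns JSON with a `version` key. The function names are made up for illustration.

```python
import json
import urllib.request

VERSION_URL = "https://apiv3.iucnredlist.org/api/v3/version"


def parse_version(version: str) -> tuple:
    """Turn a label like '2021-2' into (2021, 2) so versions compare numerically."""
    year, release = version.split("-")
    return int(year), int(release)


def is_newer(latest: str, imported: str) -> bool:
    """True when the published version is later than the one we imported."""
    return parse_version(latest) > parse_version(imported)


def fetch_latest_version(url: str = VERSION_URL) -> str:
    # Assumption: the response body is JSON containing a "version" key.
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)["version"]
```

A scheduled job could call `is_newer(fetch_latest_version(), "2021-2")` and notify someone when it returns `True`; the download itself would still be manual.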
How do you download the data, can that be scripted? For ITIS or NCBI we also check for newly available db dumps and then process them into a DwCA. Maybe we can integrate IUCN also into https://github.com/mdoering/checklist_builder? |
It is already there: https://github.com/mdoering/checklist_builder/blob/master/src/main/java/de/doering/dwca/iucn/ArchiveBuilder.java#L82-L104 (and I just requested new downloads). Requesting a download from the IUCN site requires several steps in the browser, agreeing to various terms and conditions etc. |
I've added the 2021-2 version to https://hosted-datasets.gbif.org/datasets/iucn/ and updated the endpoint in the registry. We'll need to reinterpret all occurrences before we launch the feature, but it probably doesn't make much difference while we have the other issues around synonymy and data security. |
The other pending issue is the #496 interpretation of aff. species. |
The Red List has just been updated and we will have to update the categories via the IUCN API. The IUCN filter should apply an NE - Not Evaluated - category to those species that have not been assessed by IUCN. This category is not applied systematically by IUCN and thus is not available through their API, so we will need to apply this category to all species with no IUCN assessment @MattBlissett |
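One way to picture the NE request, as a sketch only: treat a missing category as Not Evaluated at the point where the field is written. The function name and the use of the bare code `"NE"` are assumptions for illustration, not the actual field values.

```python
from typing import Optional

NOT_EVALUATED = "NE"  # assumed code for "Not Evaluated"


def category_or_not_evaluated(category: Optional[str]) -> str:
    """Fall back to NE when no IUCN assessment was found for the species."""
    return category if category else NOT_EVALUATED
```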
The categories within the filter should be ordered and capitalised as per the following diagram https://www.iucnredlist.org/about/faqs#What%20are%20the%20Red%20List%20Categories%20and%20Criteria |
I've moved the NE change to a new issue, so we can close this one. The ordering is already as on that diagram, and the capitalization is consistent with the other filters (GBIF style) -- but that's up to @kcopas / @dnoesgaard anyway. |
I'm looking for those categories' strings in CrowdIn without success. Can someone direct my attention to their whereabouts? |
wow, just couldn't see the tree for the forest—thanks! |
Use the IUCN Red List API to add an IUCN Red List category field to every record.
Enable this as a search filter and include this in both simple and DwC-A download files.