wrong matchType in species/match api call #129
Comments
Matching is always a very delicate balancing act. In this case the species http://api.gbif.org/v1/species/match?verbose=true&name=Brunella
To avoid matching to species further away in the classification, you should provide some taxonomic context. For example, if you give the family, it matches your expectations:
In general, matching on the name alone should be avoided. Especially with genera, there are too many homonyms and closely spelled alternatives.
It seems as if the matching treats the emendation part of the authorship as an infraspecific epithet. Hence you get higher-rank matches, even to the species in your link above. You get more alternatives to consider when the "em" part is removed:
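The advice above about supplying taxonomic context can be sketched in a few lines of Python. This is only an illustration, not an official client: `build_match_url` is a hypothetical helper name, and it assumes the `family` query parameter discussed in this thread alongside the `name` and `verbose` parameters seen in the URLs above.

```python
from urllib.parse import urlencode

MATCH_ENDPOINT = "https://api.gbif.org/v1/species/match"


def build_match_url(name, verbose=True, **context):
    """Build a species/match URL (hypothetical helper, not part of any GBIF library).

    `context` carries optional higher-taxon hints, e.g. family="Lamiaceae".
    """
    params = {"name": name, "verbose": str(verbose).lower()}
    # Skip hints that were not supplied (None or empty string).
    params.update({key: value for key, value in context.items() if value})
    return MATCH_ENDPOINT + "?" + urlencode(params)


# Genus name alone versus genus name plus family context.
print(build_match_url("Brunella"))
print(build_match_url("Brunella", family="Lamiaceae"))
```

Comparing the responses for the two URLs should show how much the family hint changes the candidate list.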
https://www.gbif.org/species/176820144 This is a rather rare case where apparently only the autonym existed and no other subspecies. In that case we remove the autonym from the backbone and keep only the species.
Thanks for your answers @mdoering But yes. In this particular case I was testing names from a database which also has some family information. As a result the family I can pass may or may not match the one in the backbone.
Should I first try to generate an updated list of families? I wonder if the same species/matching api could be used for this.
So, in step 2, this would be the query for the wrong Brunella genus, extracted from the above species string:
An example with one of those APG-changed families would be Veronica and many others formerly placed in the family Scrophulariaceae. If I pass family in this request:
OK, so when I try to match any species of Brunella (according to our labels) I should pass in "Lamiaceae", and for Veronica species I should pass "Plantaginaceae". Or is this (a bit overcomplicated for me) entirely unnecessary, and do you think I can safely pass the old families and the api will recognize them?
BTW, I am confused by these two api request parameters. What's the difference between them, and is it useful to pass either of them in the above requests:
The species/match api documentation doesn't say anything about the data type for the second one.
If For the The The Hope this helps. We should add the data type for |
Thanks for all your helpful comments @djtfmartin
The name I tried to match is "Dryopteris borreri Newman"
If I use a search without providing authorship, I get the expected result:
Isn't it a bit odd that the 1st request (providing an "imperfect but not so bad" authorship) returns a "worse" response than the 2nd request (which only provides a canonical name)?
When there is no good EXACT or FUZZY match using the provided authorship, wouldn't it be more appropriate to fall back to the canonical name before giving up and returning a HIGHERRANK match? Maybe this is difficult to achieve for some technical reason (e.g. having to make a 2nd database call, which slows things down a lot, or whatever)?
you can add the query parameter https://api.gbif.org/v1/species/match?name=Dryopteris%20borreri%20Newman&verbose=true You now see the list of considered potential matches as alternatives. This list contains Additionally there is also:
I agree this is unfortunate behavior though.
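Until the service offers such a fallback itself, the retry on the canonical name suggested above could be done on the client side. The sketch below is one possible way, under stated assumptions: `fetch_match` and `match_with_fallback` are hypothetical helper names, and the code relies only on the `matchType` field and the EXACT, FUZZY and HIGHERRANK values mentioned in this thread.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

MATCH_ENDPOINT = "https://api.gbif.org/v1/species/match"
GOOD_MATCH_TYPES = ("EXACT", "FUZZY")


def fetch_match(name):
    """Call species/match for one name and return the decoded JSON response."""
    url = MATCH_ENDPOINT + "?" + urlencode({"name": name, "verbose": "true"})
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def match_with_fallback(full_name, canonical_name, fetch=fetch_match):
    """Try the name with authorship first; retry without it on a poor match."""
    result = fetch(full_name)
    if result.get("matchType") in GOOD_MATCH_TYPES:
        return result
    retry = fetch(canonical_name)
    # Keep the retry only if it is actually better than the first answer.
    if retry.get("matchType") in GOOD_MATCH_TYPES:
        return retry
    return result
```

A call such as `match_with_fallback("Dryopteris borreri Newman", "Dryopteris borreri")` would issue at most two requests. The `fetch` argument is injectable so the logic can be tried with canned responses instead of the live service.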
Thanks @mdoering, the pro parte explanation makes sense. I caught a new possible issue, or at least I cannot explain why this happens.
Why does this happen? The 2nd one is not an exact match, is it?
Another unfortunate match not being reported as Linaria polygalifolia Hoffmanns. & Link subsp. aguillonensis (García Mart.) Castrov. & Lago
I would expect this to match the backbone's accepted subspecies (or at least to rank it very high):
I've been trying api matches against the GBIF backbone taxonomy during the last few days and I have a question about one particular case:
I tried to match this name, which appears in several of our specimens:
Brunella grandiflora (L.) Jacq. em Moench.
http://api.gbif.org/v1/species/match?name=Brunella%20grandiflora%20(L.)%20Jacq.%20em%20Moench.
Brunella Mill. is an unaccepted genus in the backbone (a synonym and orthographic variant of Prunella L.).
So I was expecting this api call to return some kind of matchType:FUZZY, or perhaps a matchType:HIGHERRANK ... but not like this:
For some reason, the api is not taking that synonym relationship into account, nor considering that both genera are in the same family (Lamiaceae). Instead, the api matches a species of the genus Brunellia (family Brunelliaceae, not even in the same taxonomic order or class).
Shouldn't the genus synonym relationship and the higher taxonomic ranks help the api give a taxonomically closer match here?
Especially when "Brunella" spelling is not closer to "Brunellia" than to "Prunella" (one letter difference in both cases).
Why did the api take the worst taxonomic option here?
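The one-letter claim above can be checked with a plain Levenshtein (edit) distance. This is only a worked example of the spelling argument; nothing here confirms which string metric the GBIF matcher actually uses, and `edit_distance` is a hypothetical helper name.

```python
def edit_distance(a, b):
    """Levenshtein distance, computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


print(edit_distance("Brunella", "Brunellia"))  # 1 (one inserted letter)
print(edit_distance("Brunella", "Prunella"))   # 1 (one substituted letter)
```

Both candidates are a single edit away, so spelling distance alone cannot separate them; the taxonomic context would have to break the tie.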
I passed a species name, and the api returns a name match at species level too. But it is flagged as matchType: HIGHERRANK. I'd say that is a bug. If not, what's the logic behind it?
Regarding the wrongly matched "species": in the web species interface it is called Brunellia grandiflora (no authorship, and no children taxa in the left panel). It also says:
The api response does not give exactly that information. I see these flags:
IMPLICIT_NAME is the one, isn't it? But I am not sure I understand the implications:
At first glance I'd say this name was created in the backbone because there are some other infraspecific names which need to be put somewhere in the tree. But as I said, there are no children names on the species page, so I don't really understand what is going on.
Thanks in advance for possible explanations
@abubelinha