feedback on clusters view #263
Comments
also, spotted a small typo in the legend
|
Thanks @ymgan ! To add to your feedback (so @MortenHofft has it all in one place). Yes the feature is super cool! Highlighting the occurrence (and cluster) that was clicked on would also be helpful. In addition to that:
I think that if those points were addressed, the feature could be in production. I have some ideas of things I would like to see if they were possible. I don't know if those could be implemented at all. Feel free to discard.
|
I've looked at the above and implemented what I could. Some things I cannot do at this point.
|
I revisited this functionality today and it has not lost its coolness :-) I checked back in on it because I was interested in demonstrating the data journey of data collected on a NOAA ship (essentially the clustering of samples at the Smithsonian with the ROV photos in the NOAA Deep Sea Coral database). One question, one note:
|
Hi Steve, we don’t currently have a cluster download function, but we will add it. Would it help if I made a custom export for you? If so, which filter are you interested in: clusters that include a Smithsonian to NOAA connection? I've fixed the link in the blog post. The code moved to here |
Tim, thanks for the quick answer. I hope you had a happy new year! Yes, a Smithsonian to NOAA connection is exactly what I'm after. In short: which occurrences from the DSCRTP dataset (https://www.gbif.org/dataset/df8e3fb8-3da7-4104-a866-748f6da20a3c) are clustered with occurrences from Smithsonian datasets (especially https://www.gbif.org/dataset/821cc27a-e3bb-4bc5-ac34-89ada245069d or https://www.gbif.org/dataset/26098c25-8f7f-4c71-97ac-1d3db181c65e)? It doesn't need to be foolproof. I'm just looking to demonstrate that we can QAQC the data across these institutions and improve collection metadata and publishing workflows. If I'm able to connect the dots, then I'll pass it back to y'all as a use case for developing a download of clusters. |
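For anyone wanting to try the same filter on an export of cluster links, here is a minimal Python sketch. It assumes a TSV with one row per pair of clustered occurrences; the column names (`occurrence_id_1`, `dataset_key_1`, `occurrence_id_2`, `dataset_key_2`) and the sample rows are hypothetical and not taken from the actual export, so adjust them to whatever the real file contains. Only the three dataset keys come from the links above.

```python
import csv
import io

# Dataset keys taken from the dataset URLs quoted in this thread.
DSCRTP = "df8e3fb8-3da7-4104-a866-748f6da20a3c"
SMITHSONIAN = {
    "821cc27a-e3bb-4bc5-ac34-89ada245069d",
    "26098c25-8f7f-4c71-97ac-1d3db181c65e",
}


def cross_institution_links(tsv_text):
    """Return the rows that link a DSCRTP occurrence to a Smithsonian one.

    Assumes hypothetical columns dataset_key_1 and dataset_key_2; a link is
    kept regardless of which side of the pair the DSCRTP record is on.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    matches = []
    for row in reader:
        a, b = row["dataset_key_1"], row["dataset_key_2"]
        if (a == DSCRTP and b in SMITHSONIAN) or (b == DSCRTP and a in SMITHSONIAN):
            matches.append(row)
    return matches


# Made-up sample rows, for illustration only.
OTHER = "00000000-0000-0000-0000-000000000000"
SAMPLE = "\n".join(
    "\t".join(fields)
    for fields in [
        ["occurrence_id_1", "dataset_key_1", "occurrence_id_2", "dataset_key_2"],
        ["1", DSCRTP, "2", "821cc27a-e3bb-4bc5-ac34-89ada245069d"],
        ["3", "26098c25-8f7f-4c71-97ac-1d3db181c65e", "4", DSCRTP],
        ["5", "821cc27a-e3bb-4bc5-ac34-89ada245069d", "6", "26098c25-8f7f-4c71-97ac-1d3db181c65e"],
        ["7", DSCRTP, "8", OTHER],
    ]
)

if __name__ == "__main__":
    for link in cross_institution_links(SAMPLE):
        print(link["occurrence_id_1"], "<->", link["occurrence_id_2"])
```

For a real export, read the file contents into a string (or adapt the function to take an open file) before calling `cross_institution_links`.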
As a first pass please see this TSV created using this SQL. Please let me know if you'd like any changes. The format is:
|
Maybe the link to the blogpost should be replaced by the link to the documentation? https://techdocs.gbif.org/en/data-processing/clustering-occurrences (apparently, the source code links should be updated there as well) |
After noticing that the links didn't work any more, see gbif/hosted-portals#263
Thanks for the help; it's tough to identify a strategy for efficiently investigating these in a pairwise fashion (and I'm guessing you already know this). I suspect it might take me a while to make progress on this. |
Thanks @sformel-usgs - that is pretty accurate. Generally I load into a DB (e.g. clickhouse), start with those I am interested in and then |
I appreciate the offer, but unfortunately, I'll have to put this to the side for the next couple of weeks. One realization I had while I was combing through the data is that the DSCRTP data might be discarding useful (e.g. |
Hi @MortenHofft
I remember you asked for feedback on the clusters view in the hosted portal community call. @Antonarctica and I finally sat together and had a look at it. Here's our feedback:
cool Cool COOL!!!!!
Anton exclaimed "COOL!!" a million times while clicking through the nodes today!! We think that it's really well done and we appreciate the effort very much!! The feedback below is just for icing on the cake!
Include entity selected in the clusters tab
https://www.biodiversity.aq/occurrence/search?entity=1434623051&from=30&view=CLUSTERS
We think it would help to include the selected entity (indicated by the red arrow) in the clusters tab (the overlay in the screenshot) and highlight it in a different colour, so that users can compare it with the other nodes in the same cluster. To us it is a cluster of 4, but we saw only 3 entities in the clusters tab, so we were a little confused at first.
Meaning of node with dark outline
We aren't sure what it means when a node has a darker outline (indicated by the red arrow in the screenshot above). We guessed that it marks an occurrence record within the scope of our hosted portal, but it wasn't clear to us.
How are clusters sorted?
We are curious how clusters are sorted. Knowing this would help users decide whether to browse through the pages, since there can sometimes be hundreds or thousands of them.
Pagination for the clusters
Depending on how clusters are sorted, users may want to jump to the last page or enter a page number. This would be more helpful than repeatedly clicking the next button.
What does the puzzle icon mean?
We are not sure what the puzzle icon means (indicated by the red arrow). We believe it stands for the data from all of the extensions associated with the record, but we aren't sure.
Not sure where to start when first exploring
Our initial response when we saw the view was that we weren't sure where to start. I'm not sure how constructive this comment is, because I don't know how to present this information better. But as we drilled down into the nodes and edges, we understood what was happening, and we appreciate the work and thought that went into this.
Thank you so much for working on this!!