Genbank downloading problems #2179

Closed
pandan74 opened this issue Aug 4, 2022 · 6 comments · Fixed by #2255

Comments

@pandan74

pandan74 commented Aug 4, 2022

I'm having difficulty downloading the GenBank databases. I was able to download GTDB and GenBank viral k31. Is there any other place I can find these files? Please suggest how I can download them. I use curl and usually get this error: curl: (33) HTTP server doesn't seem to support byte ranges, or the download just times out.

@gabridinosauro

Hi sourmash creators! Thanks a lot for your work! I am having the exact same problem described above.
Do you have any idea how to solve it? It seems that the server we are downloading from times out after a certain amount of time.

Thanks in advance! Gabri

@ctb
Contributor

ctb commented Aug 6, 2022

hi! sorry about this, it's been difficult to find good places to store these files ;(.

these problems occur when using the dweb.link URLs at https://sourmash.readthedocs.io/en/latest/databases.html, right? If so there are some options discussed here but it is not simple at the moment... I'll see if I can document it more clearly today.

@taylorreiter
Contributor

yes -- they've been plaguing me for days. another potential alternative -- the OSF links are super fast. are you trying to move away from google drive to OSF for the large files? in the short term can we update the documentation so that it downloads from the links on OSF until we fix the dweb links?

@luizirber
Member

some workarounds until we move everything to R2:

remove https://dweb.link/ipfs/ from the download URL to get the bare IPFS content ID:

  • for genbank-2022.03-viral-k21.zip: https://dweb.link/ipfs/bafybeicjyx6qkhdtw6q4cxs6fyl46gqfhd4q5eqje5lkswf2npljnyytzi -> bafybeicjyx6qkhdtw6q4cxs6fyl46gqfhd4q5eqje5lkswf2npljnyytzi

with the cloudflare gateway:

  • wget -O genbank-2022.03-viral-k21.zip https://cloudflare-ipfs.com/ipfs/bafybeicjyx6qkhdtw6q4cxs6fyl46gqfhd4q5eqje5lkswf2npljnyytzi

with ipget:

  • grab ipget from https://dist.ipfs.io/#ipget
  • ipget -o genbank-2022.03-viral-k21.zip bafybeicjyx6qkhdtw6q4cxs6fyl46gqfhd4q5eqje5lkswf2npljnyytzi
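
The URL rewriting in the steps above can be sketched as two small shell helpers. This is only an illustration: `cid_from_url` and `gateway_url` are hypothetical names, not part of sourmash, ipget, or any IPFS tooling, and the sketch assumes the download URL starts with exactly https://dweb.link/ipfs/. The script only prints the rewritten values and does not contact any server.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of sourmash or IPFS tooling.

# Strip the dweb.link gateway prefix, leaving only the content ID.
# Assumes the URL begins with https://dweb.link/ipfs/ (otherwise it is returned unchanged).
cid_from_url() {
    printf '%s\n' "${1#https://dweb.link/ipfs/}"
}

# Rebuild the URL on another gateway.
# $1 = original dweb.link URL, $2 = gateway base (defaults to the Cloudflare gateway used above).
gateway_url() {
    cid=$(cid_from_url "$1")
    printf '%s/ipfs/%s\n' "${2:-https://cloudflare-ipfs.com}" "$cid"
}

url=https://dweb.link/ipfs/bafybeicjyx6qkhdtw6q4cxs6fyl46gqfhd4q5eqje5lkswf2npljnyytzi
echo "content ID:  $(cid_from_url "$url")"
echo "gateway URL: $(gateway_url "$url")"
```

The printed gateway URL can then be passed to wget as in the Cloudflare example, and the content ID to ipget.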

@ctb
Contributor

ctb commented Sep 3, 2022

Hello! I am pleased to report that our databases may now be Robustly Available via the local UC Davis infrastructure of the dib-lab ;).

Please see #2255 for the PR; you can view the databases file directly here until that PR is merged, at which point it will show up here.

I will update this issue once the PR is merged (which should be fairly quickly).

🎉

ctb closed this as completed in #2255 Sep 3, 2022
@ctb
Contributor

ctb commented Sep 3, 2022

Merged & docs updated: prepared databases page here now has robustified farm links.
