Unable to download OakInk2 dataset #4
Comments
Does the download error only happen on one specific tar file?
@kelvin34501 Thank you for your response. It is difficult to tell, because the download script processes all the files as a whole.
@kelvin34501 I have downloaded the file, but the error persists for a different file.
I have a guess (which might not be the case): the HuggingFace download tool uses a cache system, where files are by default first downloaded to a cache directory.
I use the following script to validate the size of the tar, and it seems OK:
import requests
from huggingface_hub import get_hf_file_metadata, hf_hub_url
# Resolve the download URL of one tar in the dataset repo.
url = hf_hub_url(repo_id="kelvin34501/OakInk-v2", repo_type="dataset", filename="data/scene_01__O001++seq__97fc3dab0aa577c392a5__2023-04-13-20-53-02.tar")
# Size recorded in the Hub metadata.
metadata = get_hf_file_metadata(url)
print(metadata.size)
# Size reported by the server that actually hosts the file.
response = requests.head(metadata.location)
print(response.headers["Content-Length"])
I am currently trying to reproduce the download issue, and it will take some time.
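To test the cache guess on the local side, one could also compare the expected size with what is actually on disk and with the free space available. This is only a minimal sketch using the standard library; the helper names `has_room_for` and `size_matches` are hypothetical and not part of huggingface_hub, and the directory to check is whatever location the download tool writes to on your machine.

```python
import os
import shutil


def has_room_for(expected_size: int, directory: str) -> bool:
    """Return True if the filesystem holding `directory` has at least
    `expected_size` free bytes."""
    return shutil.disk_usage(directory).free >= expected_size


def size_matches(path: str, expected_size: int) -> bool:
    """Return True if `path` is an existing file of exactly `expected_size` bytes."""
    return os.path.isfile(path) and os.path.getsize(path) == expected_size
```

For example, `has_room_for(metadata.size, some_directory)` would show whether the tar checked by the script above could fit at all, and `size_matches(local_path, metadata.size)` would show whether a finished download is complete.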
I tried to download the dataset from Hugging Face with the following script:
from huggingface_hub import snapshot_download
snapshot_download(repo_id="kelvin34501/OakInk-v2", repo_type="dataset")
However, I consistently get this error:
  File "<stdin>", line 2, in <module>
  File "/home/danieljung0121/anaconda3/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/danieljung0121/anaconda3/lib/python3.9/site-packages/huggingface_hub/_snapshot_download.py", line 308, in snapshot_download
    thread_map(
  File "/home/danieljung0121/anaconda3/lib/python3.9/site-packages/tqdm/contrib/concurrent.py", line 94, in thread_map
    return _executor_map(ThreadPoolExecutor, fn, *iterables, **tqdm_kwargs)
  File "/home/danieljung0121/anaconda3/lib/python3.9/site-packages/tqdm/contrib/concurrent.py", line 76, in _executor_map
    return list(tqdm_class(ex.map(fn, *iterables, **map_args), **kwargs))
  File "/home/danieljung0121/anaconda3/lib/python3.9/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/home/danieljung0121/anaconda3/lib/python3.9/concurrent/futures/_base.py", line 609, in result_iterator
    yield fs.pop().result()
  File "/home/danieljung0121/anaconda3/lib/python3.9/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/home/danieljung0121/anaconda3/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/home/danieljung0121/anaconda3/lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/danieljung0121/anaconda3/lib/python3.9/site-packages/huggingface_hub/_snapshot_download.py", line 283, in _inner_hf_hub_download
    return hf_hub_download(
  File "/home/danieljung0121/anaconda3/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/danieljung0121/anaconda3/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1457, in hf_hub_download
    http_get(
  File "/home/danieljung0121/anaconda3/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 552, in http_get
    raise EnvironmentError(
OSError: Consistency check failed: file should be of size 13168824320 but has size 2971966929 ((…)dab0aa577c392a5__2023-04-13-20-53-02.tar).
We are sorry for the inconvenience. Please retry download and pass force_download=True, resume_download=False as argument.
If the issue persists, please let us know by opening an issue on https://github.com/huggingface/huggingface_hub.
FYI, I have tried all four combinations of the arguments recommended by the error message:
force_download=True, resume_download=True
force_download=True, resume_download=False
force_download=False, resume_download=True
force_download=False, resume_download=False
None of them worked, so I assume that the problem stems from either the download itself or Hugging Face. May I ask for help with this problem?
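Since the snapshot download handles every file in one run, one way to narrow down which tars fail is to fetch them one at a time and retry on error. The following is only a sketch under stated assumptions, not a confirmed fix: `download_each` and `fetch_from_hub` are hypothetical helper names, the retry count and wait time are arbitrary, and it relies on the consistency-check failure being raised as an `OSError`, as the traceback above shows.

```python
import time
from typing import Callable, Iterable, List


def download_each(filenames: Iterable[str], fetch: Callable[[str], str],
                  max_attempts: int = 3, wait_seconds: float = 5.0) -> List[str]:
    """Call fetch(name) for each file, retrying on OSError.

    Returns the names that still failed after max_attempts tries.
    """
    failed = []
    for name in filenames:
        for attempt in range(1, max_attempts + 1):
            try:
                fetch(name)
                break  # this file succeeded, move on to the next one
            except OSError as err:
                print(f"{name}: attempt {attempt} failed: {err}")
                if attempt < max_attempts:
                    time.sleep(wait_seconds)
        else:
            # The inner loop never hit `break`, so every attempt failed.
            failed.append(name)
    return failed


def fetch_from_hub(filename: str) -> str:
    """Download a single file of the dataset repo and return its local path."""
    # Imported here so the retry helper can be used without the package installed.
    from huggingface_hub import hf_hub_download
    return hf_hub_download(repo_id="kelvin34501/OakInk-v2",
                           repo_type="dataset", filename=filename)
```

A possible call is `download_each(list_repo_files("kelvin34501/OakInk-v2", repo_type="dataset"), fetch_from_hub)`, with `list_repo_files` imported from `huggingface_hub`. The returned list then names exactly the files that keep failing, which answers the question of whether the error is tied to specific tars.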