Convert `labkey.get_data` and `quilt.get_data` to be single `download_file` calls to allow for mapped requests
#68
Labels: enhancement
Currently, dask workers have only two cores and four threads by default. Because quilt uses a `ThreadPoolExecutor` to download files, running this function inside a single dask worker severely limits how quickly files can be downloaded.
If we reconfigure the combination of `{loader}.get_data` and the actual fetching so that `{loader}.get_data` returns a list of partial functions to call, and we then map out those partial functions, the dask workers can each take a file to download instead of a single worker being used to download everything.

In pseudo-code with a lot of metadata handling removed:
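A minimal sketch of this idea, written under assumptions rather than taken from the loaders themselves: `download_file`, `get_data`, `call`, and `run_all` below are hypothetical stand-ins with made-up signatures, the download is simulated by building a destination path, and a standard-library `ThreadPoolExecutor` stands in for the dask cluster so the example runs without one.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from typing import Callable, List


def download_file(source: str, dest_dir: str) -> str:
    """Hypothetical single-file download.

    A real loader would fetch the bytes here; this stand-in only
    computes and returns the path the file would be saved to.
    """
    filename = source.rsplit("/", 1)[-1]
    return f"{dest_dir}/{filename}"


def get_data(sources: List[str], dest_dir: str) -> List[Callable[[], str]]:
    """Return one deferred download per file instead of downloading eagerly."""
    return [partial(download_file, source, dest_dir) for source in sources]


def call(task: Callable[[], str]) -> str:
    """Run one deferred download; this is the function that gets mapped."""
    return task()


def run_all(tasks: List[Callable[[], str]], max_workers: int = 4) -> List[str]:
    """Map the deferred downloads over a pool, one file per task.

    With dask.distributed, the equivalent would be roughly
    `futures = client.map(call, tasks)` followed by `client.gather(futures)`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(call, tasks))


if __name__ == "__main__":
    tasks = get_data(["s3://bucket/a.tiff", "s3://bucket/b.tiff"], "/tmp/out")
    print(run_all(tasks))
```

Because each partial carries everything needed for one file, the scheduler is free to spread the downloads across workers, rather than having one worker's thread pool handle the whole batch.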