Open
Description
Hello,
I have just waited for several minutes as jobsub_submit was
re-compressing an already compressed code tarball. It was specified
with the
--tar_file_name dropbox:///pnfs/mu2e/resilient/users/gandr/gridexport/tmp.9I7Gv1adwT/Code.tar.bz
option, and then I saw a large file named Code.tar.bz2473.tbz2 appear
in my working directory as I was waiting for the submission to
complete.
Maybe the compression step should be delegated to the user, and
jobsub_submit should not try to re-pack the user-provided file. Just
upload it as is from its original location.
Andrei
Activity
marcmengel commented on Feb 28, 2024
What jobsub_lite is doing is rewriting the tarfile with the permissions modified, to prevent people putting things into cvmfs that they cannot read.
See: https://github.com/fermitools/jobsub_lite/blob/master/lib/tarfiles.py#L83
The generated tarfile is compressed just to minimize the disk required.
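
For readers who want to see what such a rewrite involves, here is a minimal sketch using Python's standard `tarfile` module. It is not the code in `lib/tarfiles.py`; the function name `rewrite_with_readable_perms` and the exact permission policy (add read for everyone, add execute on directories and on files the owner can already execute) are assumptions made for illustration.

```python
import tarfile


def rewrite_with_readable_perms(src_path, dst_path):
    """Copy a tarball member by member, opening up read permissions.

    Hypothetical helper: the permission policy here is an assumption,
    not necessarily what jobsub_lite applies.
    """
    # "r:*" lets tarfile detect the input compression (gzip, bz2, xz, none).
    with tarfile.open(src_path, "r:*") as src, \
            tarfile.open(dst_path, "w:bz2") as dst:
        for member in src:
            # Make the member readable by user, group, and other.
            member.mode |= 0o444
            # Directories, and files already executable by the owner,
            # also get execute/search permission for everyone.
            if member.isdir() or member.mode & 0o100:
                member.mode |= 0o111
            # Only regular files carry a data payload to copy.
            fileobj = src.extractfile(member) if member.isreg() else None
            dst.addfile(member, fileobj)
```

Note that every member has to be decompressed and then recompressed, which is why the step takes a noticeable amount of time on a large archive.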
gaponenko commented on Feb 28, 2024
marcmengel commented on Feb 29, 2024
We tried that, but users complained they were
and they found that behavior unacceptable.
Also, decompressing and reading the whole tarfile to check the permissions on everything is not significantly faster than copying it and modifying it.
Just how big is this tarfile you're sending?
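
To illustrate the point about checking permissions, here is a rough sketch of a check-only pass, again with the standard `tarfile` module. The function name `unreadable_members` and the choice of the world-read bit as the criterion are assumptions for the example, not jobsub_lite behavior.

```python
import tarfile


def unreadable_members(path):
    """Return the names of members that lack the world-read bit.

    Hypothetical helper; the criterion is an assumption for illustration.
    """
    bad = []
    with tarfile.open(path, "r:*") as tar:
        # Headers are interleaved with file data, so iterating over the
        # members of a compressed archive decompresses the whole stream.
        for member in tar:
            if not member.mode & 0o004:
                bad.append(member.name)
    return bad
```

Even though nothing is written, the scan still has to read and decompress the full archive, which is the cost being compared with the copy-and-modify approach.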
marcmengel commented on Feb 29, 2024
Also, why are you asking that a file already in /pnfs/mu2e/resilient be re-copied to a dropbox: location? Just leave the dropbox: prefix off the front and use the file where it is.
gaponenko commented on Feb 29, 2024