Futures Executor fails when using Condor #3989

Open
RamenMode opened this issue Nov 19, 2024 · 1 comment
Labels: bug, TaskVine

RamenMode (Contributor) commented Nov 19, 2024

Running the following code for a basic future fails when batch_type is set to "condor":

import ndcctools.taskvine as vine

def my_sum(x, y):
    return x + y

m = vine.FuturesExecutor(manager_name='my_manager', batch_type="condor")

a = m.submit(my_sum, 3, 4)
b = m.submit(my_sum, 5, 2)
c = m.submit(my_sum, a, b)  # note that the futures a and b are
                            # passed as any other argument.

print(c.result())

This worked on the local machine, but with batch_type="condor" it did not complete execution on the crcfe01 machine. According to vine_status, a and b completed but c never did. The same happened with a reduction I tested: the 10 input tasks completed, but the final task, which combines the 10 results, did not.

The following code also failed nondeterministically, in one of two ways:

import ndcctools.taskvine as vine

def my_sum(x, y):
    return x + y

m = vine.FuturesExecutor(manager_name='my_manager', batch_type="condor")

a = m.submit(my_sum, 3, 4)
print(a.result())
(cctools-dev) [kxue2@crcfe02 taskvine_tests]$ python3 single_future.py 
b''
Traceback (most recent call last):
  File "/afs/crc.nd.edu/user/k/kxue2/taskvine_tests/single_future.py", line 11, in <module>
    print(a.result())
          ^^^^^^^^^^
  File "/afs/crc.nd.edu/user/k/kxue2/miniconda3/envs/cctools-dev/lib/python3.12/site-packages/ndcctools/taskvine/futures.py", line 399, in result
    raise result
  File "/afs/crc.nd.edu/user/k/kxue2/miniconda3/envs/cctools-dev/lib/python3.12/site-packages/ndcctools/taskvine/futures.py", line 523, in output
    self._output = cloudpickle.loads(self._output_file.contents())
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
EOFError: Ran out of input

In other runs, the script does not terminate.
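The b'' printed just before the traceback above suggests that the task's output file was empty when it was read back. Unpickling an empty byte string raises exactly this EOFError, which the standard-library pickle module reproduces below. This is only a sketch: load_task_output is a hypothetical helper, not part of ndcctools, and it assumes cloudpickle.loads behaves like pickle.loads on empty input.

```python
import pickle


def load_task_output(raw: bytes):
    """Hypothetical helper (not part of ndcctools): unpickle a task's
    output, but report an empty payload with a clearer error."""
    if not raw:
        raise RuntimeError(
            "task output is empty; the task may not have written a result"
        )
    return pickle.loads(raw)


# Unpickling zero bytes reproduces the error from the traceback.
try:
    pickle.loads(b"")
except EOFError as exc:
    print(f"EOFError: {exc}")  # prints "EOFError: Ran out of input"
```

If this reading is right, the open question is why the output file arrives empty under Condor, not the unpickling step itself.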

RamenMode added the bug label Nov 19, 2024
dthain (Member) commented Nov 22, 2024

@RamenMode please attach the TaskVine debug log for the run; that will illuminate what is going on.
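For anyone looking for that log, here is a minimal sketch of a helper that finds the newest one. It assumes the manager wrote its logs under a vine-run-info directory in the working directory, with one subdirectory per run holding vine-logs/debug. That layout and the name find_latest_debug_log are assumptions, so adjust the path if your setup differs.

```python
from pathlib import Path
from typing import Optional


def find_latest_debug_log(run_info_root: str = "vine-run-info") -> Optional[Path]:
    """Return the most recently modified <run>/vine-logs/debug file under
    run_info_root, or None if none exists. The directory layout is an
    assumption about a default TaskVine setup."""
    root = Path(run_info_root)
    # Resolve paths so a symlinked run directory is not counted twice.
    candidates = {p.resolve() for p in root.glob("*/vine-logs/debug") if p.is_file()}
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    log = find_latest_debug_log()
    print(log if log is not None else "no debug log found")
```

Running this from the directory where single_future.py was launched would print the path of the file to attach, if the assumed layout holds.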
