The analysis appears to fail because of an out-of-memory (OOM) kill. The clean-up that follows raises some unusual errors, which suggests something is not being handled correctly:
botocore.exceptions.ClientError: An error occurred (MalformedXML) (possibly raised when deleting temporarily stored data?)
find: ‘output’: No such file or directory (from the failed bash script)
Version / Environment information
Worker 2.3.6
Example data / logs
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/worker/.local/lib/python3.10/site-packages/celery/utils/dispatch/signal.py", line 276, in send
response = receiver(signal=self, sender=sender, **named)
File "/home/worker/src/model_execution_worker/distributed_tasks.py", line 1033, in handle_task_failure
filestore.delete_dir(dir_remote_data)
File "/home/worker/src/model_execution_worker/backends/aws_storage.py", line 347, in delete_dir
rsp = self.bucket.delete_objects(Delete=del_request)
File "/home/worker/.local/lib/python3.10/site-packages/boto3/resources/factory.py", line 581, in do_action
response = action(self, *args, **kwargs)
File "/home/worker/.local/lib/python3.10/site-packages/boto3/resources/action.py", line 88, in __call__
response = getattr(parent.meta.client, operation_name)(*args, **params)
File "/home/worker/.local/lib/python3.10/site-packages/botocore/client.py", line 565, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/home/worker/.local/lib/python3.10/site-packages/botocore/client.py", line 1021, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (MalformedXML) when calling the DeleteObjects operation: The XML you provided was not well-formed or did not validate against our published schema
[2024-07-11 00:18:34,411: ERROR/ForkPoolWorker-13] Task generate_losses_chunk[6ec3cac5-2219-4707-890b-d9dee361582c] raised unexpected: OasisException('Ktools run Error: non-zero exit code or error/warning messages detected in STDERR output.\nKilling all processes. To disable this automated check run with `--ktools-disable-guard`.\nLogs stored in: /tmp/run/analysis-1782_losses-b7c033e1f07d4b6c9572691fabcddd83/run-data/log/46')
[2024-07-11 00:18:32,151: INFO/ForkPoolWorker-6] generate_losses_chunk[452036cf-c668-474a-8fa1-2a3b4e3d019f]: WARNING: task requeue detected - retry 2
[2024-07-11 00:18:32,177: INFO/ForkPoolWorker-6] RUNNING: oasislmf.manager.interface
[2024-07-11 00:18:32,179: INFO/ForkPoolWorker-6] Generated loss Chunk 63 of 64 in, /tmp/run/analysis-1781_losses-bce6b4453c234bafa7d047b69613e25e/run-data
[2024-07-11 00:18:32,179: INFO/ForkPoolWorker-6] RUNNING: oasislmf.execution.runner.run_analysis
find: ‘output’: No such file or directory
[2024-07-11 00:18:32,241: INFO/ForkPoolWorker-6]
KTOOLS_STDERR:
[2024-07-11 00:18:32,241: INFO/ForkPoolWorker-6]
[2024-07-11 00:18:32,241: ERROR/ForkPoolWorker-6] generate_losses_chunk[452036cf-c668-474a-8fa1-2a3b4e3d019f]: Error occured in 'loss_generation_task':
Traceback (most recent call last):
File "/home/worker/.local/lib/python3.10/site-packages/oasislmf/computation/generate/losses.py", line 520, in run
return model_runner_module.run_analysis(**bash_params)
File "/home/worker/.local/lib/python3.10/site-packages/oasislmf/utils/log.py", line 123, in wrapper
result = func(*args, **kwargs)
File "/home/worker/.local/lib/python3.10/site-packages/oasislmf/execution/runner.py", line 119, in run_analysis
bash_trace = subprocess.check_output(['bash', params['filename']]).decode('utf-8')
File "/usr/lib/python3.10/subprocess.py", line 421, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/usr/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['bash', '/tmp/run/analysis-1781_losses-bce6b4453c234bafa7d047b69613e25e/run-data/63.run_analysis.sh']' died with <Signals.SIGKILL: 9>.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/worker/src/model_execution_worker/distributed_tasks.py", line 861, in run
return fn(self, params, *args, analysis_id=analysis_id, **kwargs)
File "/home/worker/src/model_execution_worker/distributed_tasks.py", line 943, in generate_losses_chunk
OasisManager().generate_losses_partial(**chunk_params)
File "/home/worker/.local/lib/python3.10/site-packages/oasislmf/utils/log.py", line 123, in wrapper
result = func(*args, **kwargs)
File "/home/worker/.local/lib/python3.10/site-packages/oasislmf/manager.py", line 94, in interface
return computation_cls(**kwargs).run()
File "/home/worker/.local/lib/python3.10/site-packages/oasislmf/computation/generate/losses.py", line 523, in run
self._print_error_logs(log_fp, e)
File "/home/worker/.local/lib/python3.10/site-packages/oasislmf/computation/generate/losses.py", line 158, in _print_error_logs
raise OasisException(
oasis_data_manager.errors.OasisException: Ktools run Error: non-zero exit code or error/warning messages detected in STDERR output.
Killing all processes. To disable this automated check run with `--ktools-disable-guard`.
Logs stored in: /tmp/run/analysis-1781_losses-bce6b4453c234bafa7d047b69613e25e/run-data/log/63
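Possible cause of the MalformedXML error (an assumption, not confirmed by these logs): S3's DeleteObjects operation rejects a request whose Objects list is empty or holds more than 1000 keys, and it reports both cases as MalformedXML. If the task was killed before anything was written to dir_remote_data, the del_request passed to self.bucket.delete_objects in delete_dir may have been empty. The sketch below shows one way to guard against both cases. The helper names batched_delete_requests and safe_delete_dir are hypothetical and are not part of the worker code; bucket only needs a delete_objects(Delete=...) method, as a boto3 Bucket resource has.

```python
from typing import Iterable, Iterator, List

# S3 DeleteObjects accepts at most 1000 keys per request.
S3_DELETE_BATCH_LIMIT = 1000


def batched_delete_requests(
    keys: Iterable[str], batch_size: int = S3_DELETE_BATCH_LIMIT
) -> Iterator[dict]:
    """Yield DeleteObjects payloads that are never empty and never exceed batch_size keys."""
    batch: List[dict] = []
    for key in keys:
        batch.append({"Key": key})
        if len(batch) == batch_size:
            yield {"Objects": batch, "Quiet": True}
            batch = []
    # Emit the final partial batch only if it contains something.
    if batch:
        yield {"Objects": batch, "Quiet": True}


def safe_delete_dir(bucket, keys: Iterable[str]) -> int:
    """Hypothetical helper: delete keys in batches and return the number of requests sent.

    With no keys, no request is sent at all, which avoids an empty
    Objects list reaching S3.
    """
    sent = 0
    for del_request in batched_delete_requests(keys):
        bucket.delete_objects(Delete=del_request)
        sent += 1
    return sent
```

With a guard like this, a clean-up that runs against an empty remote directory becomes a no-op instead of raising inside handle_task_failure. Whether that matches what happened here would need checking against the contents of del_request at the time of the failure.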