Grid engine support for terabyte (T) MEMTOT output from qhost, and cpu specifications #41
Thank you for the pull request. jobTree is now Toil and is maintained in a different repository. We are working to integrate your changes to Toil.
@@ -69,7 +71,7 @@ def prepareQsub(cpu, mem):
"LD_LIBRARY_PATH=%s" % os.environ["LD_LIBRARY_PATH"]]
reqline = list()
if cpu is not None:
reqline.append("p="+str(cpu))
qsubline.extend(["-pe", "shm", str(int(cpu))])
Toil already switched to -pe, but it uses -pe smp instead of -pe shm. Do you think they are equivalent?
Hi Hannes,
I don't think "-pe shm" and "-pe smp" are equivalent, but I'm not entirely
sure.
For example, on the computer I use, the results of the command qconf -spl
are:
orte
shm
So I don't think I can use the smp parallel environment on this particular
computer without making some changes.
I can't think of an elegant solution at the moment, but maybe you could
prompt the user for the name of the appropriate parallel environment, just
once, during package setup.
cheers,
Tom
On Tue, Oct 6, 2015 at 6:05 PM, Hannes Schmidt wrote:
In batchSystems/gridengine.py, #41 (comment):
Toil already switched to -pe but it uses -pe smp instead of -pe shm. Do you think they are equivalent?
Great, thank you.
The function obtainSystemConstants() in the GridEngineBatchSystem class in batchSystems/gridengine.py threw the error "ValueError: invalid literal for float(): 1.5T" when I tried to run it on a system that has 1.5T of available memory. I modified the MemoryString class to handle qhost output in the terabyte (T) range.
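A minimal sketch of the kind of suffix handling described above. It is not the actual MemoryString class: the helper name parse_memtot is hypothetical, and it assumes that qhost prints upper-case K/M/G/T suffixes and that they denote binary multiples (powers of 1024).

```python
# Hypothetical helper for illustration; not the MemoryString class from jobTree.
# Assumption: qhost prints upper-case suffixes that denote powers of 1024.
_SUFFIX_BYTES = {
    "K": 1024,
    "M": 1024 ** 2,
    "G": 1024 ** 3,
    "T": 1024 ** 4,  # the terabyte case that triggered the ValueError
}


def parse_memtot(text):
    """Convert a qhost-style memory string such as '1.5T' to a byte count."""
    text = text.strip()
    if not text:
        raise ValueError("empty memory string")
    suffix = text[-1]
    if suffix in _SUFFIX_BYTES:
        # Split the numeric part from the unit before calling float().
        return int(float(text[:-1]) * _SUFFIX_BYTES[suffix])
    # No recognised suffix: treat the value as a plain number of bytes.
    return int(float(text))


if __name__ == "__main__":
    print(parse_memtot("1.5T"))
```

Calling float() only on the numeric part is what avoids the "invalid literal for float(): 1.5T" error; any other unrecognised suffix still raises ValueError rather than being silently misread.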
jobTree then worked fine, but the jobs it submitted to SGE sat in the queued "qw" state indefinitely. The reason was that it requested a single processor per node via "qsub -l num_proc=1", but none of the nodes on my system has exactly one processor (they all have more). I modified the prepareQsub(cpu, mem) function to use "qsub -pe shm 1" instead. This now works on my system, but the function might have to be generalized to work on others (if a parallel environment other than shm is in use).
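One possible way to generalize this, sketched under assumptions: the parallel environment name becomes a parameter instead of being hard-coded, and the names reported by qconf -spl are parsed so a configured value can be checked. The helpers build_pe_args and parse_pe_names are hypothetical and are not part of jobTree or Toil.

```python
# Hypothetical helpers for illustration; not part of jobTree or Toil.

def build_pe_args(cpu, pe_name="shm"):
    """Return the qsub arguments that request `cpu` slots in `pe_name`.

    The default "shm" mirrors this pull request; other sites may need
    "smp" or another name, so the caller is expected to pass it in.
    """
    if cpu is None:
        return []
    return ["-pe", pe_name, str(int(cpu))]


def parse_pe_names(qconf_output):
    """Split the text printed by `qconf -spl` into a list of PE names."""
    return [line.strip() for line in qconf_output.splitlines() if line.strip()]


if __name__ == "__main__":
    available = parse_pe_names("orte\nshm\n")
    pe_name = "shm" if "shm" in available else available[0]
    print(build_pe_args(1, pe_name))
```

Inside prepareQsub this would replace the hard-coded list with something like qsubline.extend(build_pe_args(cpu, pe_name)), where pe_name comes from whatever configuration mechanism the project settles on.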