Skip to content

Conversation

cmeesters
Copy link
Member

@cmeesters cmeesters commented Sep 4, 2025

We need to perform the account testing with sacctmgr, first, then with sshare. sshare might not provide all accounts, as SLURM in a multicluster environment is only queried on one cluster. This is an observation from my cluster, where sshare did only show accounts for cluster A, whereas sacctmgr displayed for clusters A and B.

The reason is, that sshare queries the slurmctld of a given cluster, whereas sacctmgr queries the overlooking SLURM database of all clusters in a multi cluster environment.

Summary by CodeRabbit

  • New Features
    • Support specifying multiple SLURM accounts in a single argument (comma- or whitespace-separated), with validation of each account.
  • Bug Fixes
    • More robust account validation in multicluster environments by checking accounts via sacctmgr first and falling back to sshare.
    • Clearer, more actionable error and log messages, including deduplicated account listings and explicit reporting when requested accounts are not available.

Copy link
Contributor

coderabbitai bot commented Sep 4, 2025

Walkthrough

Prefers sacctmgr when validating SLURM accounts, falls back to sshare if needed, and adds support for validating multiple comma/whitespace-separated accounts with deduplication and clearer sacctmgr/sshare-specific reporting; no public API signature changes.

Changes

Cohort / File(s) Summary
SLURM account parsing & validation
snakemake_executor_plugin_slurm/__init__.py
- get_account_arg accepts multiple accounts (split on commas/whitespace) and validates each via test_account while returning the original combined argument.
- test_account now attempts sacctmgr first, falls back to sshare if needed; collects distinct sacctmgr_report / sshare_report results, deduplicates discovered accounts, and raises WorkflowError with combined reporting when both tools fail.
- Updated log/error messages and added comment noting sacctmgr-first for multicluster environments.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant User
  participant Snakemake
  participant Plugin as SLURM Plugin
  participant sacctmgr
  participant sshare

  User->>Snakemake: provide --cluster/account
  Snakemake->>Plugin: get_account_arg(account_str)
  Note over Plugin: split on commas/whitespace\nvalidate each account via test_account
  loop per account
    Plugin->>sacctmgr: query accounts (primary)
    alt sacctmgr returns accounts
      sacctmgr-->>Plugin: accounts
    else sacctmgr fails or empty
      Plugin->>sshare: query accounts (fallback)
      sshare-->>Plugin: accounts or error
    end
    alt account present in discovered set
      Plugin-->>Plugin: mark valid
    else
      Plugin-->>Plugin: raise WorkflowError (list available)
    end
  end
  Plugin-->>Snakemake: return original combined account arg
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • johanneskoester
  • DrYak

Pre-merge checks (2 passed, 1 inconclusive)

❌ Failed Checks (1 inconclusive)
Check Name Status Explanation Resolution
Description Check ❓ Inconclusive The current description outlines the change ordering and rationale for using sacctmgr before sshare in multicluster setups, but it omits details about multi-account splitting and the enhanced error handling logic, and the repository lacks a PR template to ensure consistent, comprehensive submissions. Please expand the description to include all implemented features—such as support for multiple accounts, updated error messages, and fallback logic—and consider adding a pull request template to this repository to guide contributors in providing complete context.
✅ Passed Checks (2 passed)
Check Name Status Explanation
Title Check ✅ Passed The title accurately and concisely reflects the primary change of altering account lookup and testing behavior for multicluster environments, making it clear to reviewers what this PR addresses.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Poem

I hop through logs with eager ears,
sacctmgr first to calm the fears.
Accounts split, then tidy-fit,
Deduped lists — we never quit.
If both fall short, I point the way,
“Try these instead,” I humbly say. 🐇

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  - Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.
  - Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/multi_account_cluster_support

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
snakemake_executor_plugin_slurm/__init__.py (1)

756-771: Normalize case for validation to avoid false negatives; stabilize error output.

You compare account.lower() against raw entries, which can fail if the cluster returns mixed-case accounts. Normalize both sides and sort the list when displaying.

-        accounts = set(_.strip() for _ in accounts.split("\n") if _)
+        accounts = {a.strip().lower() for a in accounts.splitlines() if a.strip()}
@@
-        if account.lower() not in accounts:
+        if account.strip().lower() not in accounts:
             raise WorkflowError(
-                f"The given account {account} appears to be invalid. Available "
-                f"accounts:\n{', '.join(accounts)}"
+                f"The given account {account} appears to be invalid. Available "
+                f"accounts:\n{', '.join(sorted(accounts))}"
             )
🧹 Nitpick comments (1)
snakemake_executor_plugin_slurm/__init__.py (1)

730-733: Prefer shell=False for subprocess calls (ruff S602).

Avoid shell=True where not needed; pass argv list. This reduces injection risk and aligns with best practices. Using shlex.split here is a minimal-change approach.

-            accounts = subprocess.check_output(
-                cmd, shell=True, text=True, stderr=subprocess.PIPE
-            )
+            accounts = subprocess.check_output(
+                shlex.split(cmd), text=True, stderr=subprocess.PIPE
+            )
@@
-                accounts = subprocess.check_output(
-                    cmd, shell=True, text=True, stderr=subprocess.PIPE
-                )
+                accounts = subprocess.check_output(
+                    shlex.split(cmd), text=True, stderr=subprocess.PIPE
+                )

Also applies to: 743-745

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 28924a7 and 77a8fa1.

📒 Files selected for processing (1)
  • snakemake_executor_plugin_slurm/__init__.py (3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-03-31T11:52:05.373Z
Learnt from: cmeesters
PR: snakemake/snakemake-executor-plugin-slurm#249
File: snakemake_executor_plugin_slurm/submit_string.py:21-29
Timestamp: 2025-03-31T11:52:05.373Z
Learning: In the snakemake-executor-plugin-slurm project, account and partition parameters are validated in the submit function before reaching the get_submit_command function. By design, these parameters are either empty strings or already properly formatted with SLURM parameter tags, making additional validation in get_submit_command unnecessary.

Applied to files:

  • snakemake_executor_plugin_slurm/__init__.py
📚 Learning: 2024-12-11T14:17:08.749Z
Learnt from: cmeesters
PR: snakemake/snakemake-executor-plugin-slurm#178
File: snakemake_executor_plugin_slurm/__init__.py:0-0
Timestamp: 2024-12-11T14:17:08.749Z
Learning: In the `snakemake-executor-plugin-slurm` project, when handling exceptions in `snakemake_executor_plugin_slurm/__init__.py`, prefer concise error messages and avoid unnecessary verbosity or exception chaining when it's not necessary.

Applied to files:

  • snakemake_executor_plugin_slurm/__init__.py
🪛 Ruff (0.12.2)
snakemake_executor_plugin_slurm/__init__.py

730-730: subprocess call with shell=True identified, security issue

(S602)


743-743: subprocess call with shell=True identified, security issue

(S602)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: testing
🔇 Additional comments (1)
snakemake_executor_plugin_slurm/__init__.py (1)

660-667: Enforce single Slurm account, filter empties, and safely quote

In get_submit_command (snakemake_executor_plugin_slurm/init.py around line 660):

  • Split and strip job.resources.slurm_account, filter out any empty tokens.
  • If the resulting list’s length is not exactly one, raise a WorkflowError.
  • Call self.test_account on that single account and return -A {shlex.quote(account)}.
⛔ Skipped due to learnings
Learnt from: cmeesters
PR: snakemake/snakemake-executor-plugin-slurm#249
File: snakemake_executor_plugin_slurm/submit_string.py:21-29
Timestamp: 2025-03-31T11:52:05.373Z
Learning: In the snakemake-executor-plugin-slurm project, account and partition parameters are validated in the submit function before reaching the get_submit_command function. By design, these parameters are either empty strings or already properly formatted with SLURM parameter tags, making additional validation in get_submit_command unnecessary.
Learnt from: cmeesters
PR: snakemake/snakemake-executor-plugin-slurm#178
File: snakemake_executor_plugin_slurm/__init__.py:0-0
Timestamp: 2024-12-11T14:17:08.749Z
Learning: In the `snakemake-executor-plugin-slurm` project, when handling exceptions in `snakemake_executor_plugin_slurm/__init__.py`, prefer concise error messages and avoid unnecessary verbosity or exception chaining when it's not necessary.

@cmeesters
Copy link
Member Author

@johanneskoester this is rel. urgent, as I cannot proceed well without this fix.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
snakemake_executor_plugin_slurm/__init__.py (1)

757-772: Case-normalization mismatch can cause false negatives

account.lower() is compared against a case-sensitive set. Normalize both sides; also sort for stable messaging.

-        accounts = set(_.strip() for _ in accounts.split("\n") if _)
+        accounts = {_.strip() for _ in accounts.splitlines() if _.strip()}
+        accounts_lower = {a.lower() for a in accounts}
@@
-        if account.lower() not in accounts:
+        if account.lower() not in accounts_lower:
             raise WorkflowError(
-                f"The given account {account} appears to be invalid. Available "
-                f"accounts:\n{', '.join(accounts)}"
+                f"The given account {account} appears to be invalid. Available "
+                f"accounts:\n{', '.join(sorted(accounts))}"
             )
♻️ Duplicate comments (1)
snakemake_executor_plugin_slurm/__init__.py (1)

729-729: Nice: Initialization prevents UnboundLocalError on error paths

Initializing sacctmgr_report/sshare_report resolves the earlier “referenced before assignment” issue.

🧹 Nitpick comments (4)
snakemake_executor_plugin_slurm/__init__.py (4)

726-734: Avoid shell=True; pass argv list and quote USER safely (sacctmgr call)

Eliminates S602 and quoting pitfalls if $USER contains unexpected characters.

-        cmd = f'sacctmgr -n -s list user "{os.environ["USER"]}" format=account%256'
-        sacctmgr_report = sshare_report = ""
-        try:
-            accounts = subprocess.check_output(
-                cmd, shell=True, text=True, stderr=subprocess.PIPE
-            )
+        user = os.environ.get("USER", "")
+        cmd = ["sacctmgr", "-n", "-s", "list", "user", user, "format=account%256"]
+        sacctmgr_report = sshare_report = ""
+        try:
+            accounts = subprocess.check_output(cmd, text=True, stderr=subprocess.PIPE)

741-755: Record sacctmgr ‘empty output’ and drop shell=True in sshare fallback

Improves diagnostics and security; ensures WorkflowError always includes informative sacctmgr_report.

-        if not accounts.strip():
-            cmd = "sshare -U --format Account%256 --noheader"
+        if not accounts.strip():
+            if not sacctmgr_report:
+                sacctmgr_report = (
+                    f"sacctmgr returned an empty account list for user "
+                    f"'{os.environ.get('USER','')}'."
+                )
+            cmd = ["sshare", "-U", "--format", "Account%256", "--noheader"]
             try:
-                accounts = subprocess.check_output(
-                    cmd, shell=True, text=True, stderr=subprocess.PIPE
-                )
+                accounts = subprocess.check_output(cmd, text=True, stderr=subprocess.PIPE)
             except subprocess.CalledProcessError as e:
                 sshare_report = (
                     "Unable to test the validity of the given or guessed "
                     f"SLURM account '{account}' with sshare: {e.stderr}."
                 )
                 raise WorkflowError(
                     f"The 'sacctmgr' reported: '{sacctmgr_report}' "
                     f"and likewise 'sshare' reported: '{sshare_report}'."
                 )

763-765: Clarify warning: tools returned empty list for the user, not “for account ”

Avoids confusion; the commands list user accounts, not a specific account.

-            self.logger.warning(
-                f"Both 'sacctmgr' and 'sshare' returned empty results for account "
-                f"'{account}'. Proceeding without account validation."
-            )
+            self.logger.warning(
+                "Both 'sacctmgr' and 'sshare' returned an empty account list for "
+                f"user '{os.environ.get('USER','')}'. Proceeding without account validation."
+            )

660-667: Reduce repeated cluster queries when multiple accounts are provided

Current approach invokes sacctmgr/sshare for each token via test_account. Consider fetching the user’s account set once and validating all tokens against it to cut latency and load.

Happy to propose a small helper (e.g., _fetch_user_accounts()) and refactor test_account/get_account_arg accordingly.

Also applies to: 726-755

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 77a8fa1 and e5759b4.

📒 Files selected for processing (1)
  • snakemake_executor_plugin_slurm/__init__.py (3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-03-31T11:52:05.373Z
Learnt from: cmeesters
PR: snakemake/snakemake-executor-plugin-slurm#249
File: snakemake_executor_plugin_slurm/submit_string.py:21-29
Timestamp: 2025-03-31T11:52:05.373Z
Learning: In the snakemake-executor-plugin-slurm project, account and partition parameters are validated in the submit function before reaching the get_submit_command function. By design, these parameters are either empty strings or already properly formatted with SLURM parameter tags, making additional validation in get_submit_command unnecessary.

Applied to files:

  • snakemake_executor_plugin_slurm/__init__.py
📚 Learning: 2024-12-11T14:17:08.749Z
Learnt from: cmeesters
PR: snakemake/snakemake-executor-plugin-slurm#178
File: snakemake_executor_plugin_slurm/__init__.py:0-0
Timestamp: 2024-12-11T14:17:08.749Z
Learning: In the `snakemake-executor-plugin-slurm` project, when handling exceptions in `snakemake_executor_plugin_slurm/__init__.py`, prefer concise error messages and avoid unnecessary verbosity or exception chaining when it's not necessary.

Applied to files:

  • snakemake_executor_plugin_slurm/__init__.py
🪛 Ruff (0.12.2)
snakemake_executor_plugin_slurm/__init__.py

731-731: subprocess call with shell=True identified, security issue

(S602)


744-744: subprocess call with shell=True identified, security issue

(S602)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: testing

Comment on lines +660 to 667
# split the account upon ',' and whitespace, to allow
# multiple accounts being given
accounts = re.split(r"[,\s]+", job.resources.slurm_account)
for account in accounts:
# here, we check whether the given or guessed account is valid
# if not, a WorkflowError is raised
self.test_account(account)
return f" -A '{job.resources.slurm_account}'"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Submit only one account; filter tokens and avoid passing a comma/space-joined string to -A

sbatch --account expects a single account. Passing the original combined string (e.g., "acc1,acc2") will likely cause submission failure. Also filter out empty tokens.

Apply:

-            accounts = re.split(r"[,\s]+", job.resources.slurm_account)
-            for account in accounts:
+            accounts = [a for a in re.split(r"[,\s]+", job.resources.slurm_account) if a]
+            for account in accounts:
                 # here, we check whether the given or guessed account is valid
                 # if not, a WorkflowError is raised
                 self.test_account(account)
-            return f" -A '{job.resources.slurm_account}'"
+            # sbatch accepts a single --account; use the first valid token
+            selected = accounts[0]
+            if len(accounts) > 1:
+                self.logger.info(
+                    f"Multiple accounts provided ({accounts}); submitting with '{selected}'."
+                )
+            return f" -A {shlex.quote(selected)}"

🌐 Web query:

Does Slurm "sbatch --account" accept multiple accounts (e.g., comma- or space-separated), or only a single account?

💡 Result:

Only a single account. sbatch's --account/-A takes one account string; it does not accept a comma- or space-separated list. See the sbatch option description. [1] For contrast, Slurm commands that list/filter accounts (e.g., sacct/sacctmgr) accept comma-separated account lists. [2][3]

Sources:
[1] sbatch --account option (Slurm docs).
[2] sacct --accounts accepts comma-separated lists (Slurm docs).
[3] sacctmgr/accounting docs mentioning comma-separated names (Slurm docs).


Only pass a single account to sbatch
sbatch’s --account/-A option accepts exactly one account. Split and filter the provided string, validate each token, then submit using only the first valid account (logging if extras were given).

-            accounts = re.split(r"[,\s]+", job.resources.slurm_account)
-            for account in accounts:
+            accounts = [a for a in re.split(r"[,\s]+", job.resources.slurm_account) if a]
+            for account in accounts:
                 # here, we check whether the given or guessed account is valid
                 # if not, a WorkflowError is raised
                 self.test_account(account)
-            return f" -A '{job.resources.slurm_account}'"
+            # sbatch accepts a single --account; use the first valid token
+            selected = accounts[0]
+            if len(accounts) > 1:
+                self.logger.info(
+                    f"Multiple accounts provided ({accounts}); submitting with '{selected}'."
+                )
+            return f" -A {shlex.quote(selected)}"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# split the account upon ',' and whitespace, to allow
# multiple accounts being given
accounts = re.split(r"[,\s]+", job.resources.slurm_account)
for account in accounts:
# here, we check whether the given or guessed account is valid
# if not, a WorkflowError is raised
self.test_account(account)
return f" -A '{job.resources.slurm_account}'"
# split the account upon ',' and whitespace, to allow
# multiple accounts being given
accounts = [a for a in re.split(r"[,\s]+", job.resources.slurm_account) if a]
for account in accounts:
# here, we check whether the given or guessed account is valid
# if not, a WorkflowError is raised
self.test_account(account)
# sbatch accepts a single --account; use the first valid token
selected = accounts[0]
if len(accounts) > 1:
self.logger.info(
f"Multiple accounts provided ({accounts}); submitting with '{selected}'."
)
return f" -A {shlex.quote(selected)}"
🤖 Prompt for AI Agents
In snakemake_executor_plugin_slurm/__init__.py around lines 660 to 667, the code
currently validates all comma/whitespace-separated accounts but passes the
entire original string to sbatch; sbatch accepts only one account. Split and
filter job.resources.slurm_account into tokens (re.split r"[,\s]+"), validate
each token with self.test_account, select the first valid token as the account
to submit, and return f" -A '{first_valid_account}'"; if more valid tokens exist
beyond the first, log a warning/info that only the first will be used and list
the ignored extras.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant