Partial fixes for multisite ensembles #3654

infotroph · 2025-10-20T17:57:49Z

Description

A couple of small fixes and a questionable hack for running ensemble and uncertainty analyses without database access.

- Make sure SA run directories contain site IDs, so that e.g. SA-median/ isn't overwritten for each site in turn
- Only try to run SA on the PFTs actually present at a given site
- If Bety isn't available and no ensemble ID is present in the settings object, assign a new one by hashing the settings object.

Note that the last step is done independently for each call to write.*.configs, so in a multisite run it effectively sets up a separate ensemble/SA for each site. This is what I wanted today, but I suspect most people will want outputs aggregated across sites, which this PR does not implement.
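A minimal sketch of the ID fallback described above. The helper name `ensure_ensemble_id` and the `settings$ensemble$ensemble.id` location are assumptions for illustration and may not match the actual change; only the use of `rlang::hash()` is taken from this PR.

```r
# Hypothetical helper; not the code from this PR.
# Assumes the ensemble ID lives at settings$ensemble$ensemble.id.
ensure_ensemble_id <- function(settings) {
  if (is.null(settings$ensemble$ensemble.id)) {
    # rlang::hash() returns a string digest of an arbitrary R object
    settings$ensemble$ensemble.id <- rlang::hash(settings)
  }
  settings
}

# Two single-site settings lists that differ only in site ID
site_a <- list(run = list(site = list(id = "site-A")), ensemble = list(size = 10))
site_b <- list(run = list(site = list(id = "site-B")), ensemble = list(size = 10))

id_a <- ensure_ensemble_id(site_a)$ensemble$ensemble.id
id_b <- ensure_ensemble_id(site_b)$ensemble$ensemble.id

# An ID that is already set is left untouched
kept <- ensure_ensemble_id(
  list(ensemble = list(ensemble.id = "existing"))
)$ensemble$ensemble.id
```

Because each site's settings differ, each site gets its own ID, which is why this approach yields one ensemble per site rather than one aggregated across sites.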

Motivation and Context

For the MAGiC project I wanted to quickly evaluate AGB timeseries from many sites. The timeseries plots from the ensemble analysis would be perfect for this, except that I'm running with no Bety access and the existing code sets the ensemble ID to NOENSEMBLEID, so each site overwrites the outputs from the previous one.

Since the issue applies to both ensemble and sensitivity analyses, I tried to implement a fix for both, but note that I focused on avoiding collisions between distinct ensembles -- there are still places where two sites with the same ensemble ID will overwrite each other.

I'm pasting my working notes below -- @divine7022 and @dlebauer will likely want to consider the unresolved issues in their work on multisite sensitivity.

write.configs fails if SA is requested in a settings object with ensemble size > 1
- Workaround: run SA and full ensemble in separate settings files
  => Unresolved
(minor): README.txt does not specify which met/IC/soil/event inputs were used
  => Unresolved
rundir SA-<pft>-<var>-<quantile> contents are overwritten by each site in turn
  => Resolved by adding site ID to the get.run.id call
rundir SA-median- tries to run analysis for "ALL PFT", fails on NAs from PFTs not present at that site
  => Resolved by having run.sensitivity.analysis subset PFTs to those in run$site$site.pft. PFT doesn't show up in rundir names, but since there is only one per site it works.
each site's call to run.write.configs overwrites sensitivity.samples
- Since run.write.configs only sees one site at a time, we need to choose one of:
  - write a separate samples file for each site, combine later
  - append samples to the existing samples file
  - move the entire SA sample generation to a step not wrapped in papply
  - possibly stop saving sensitivity.samples if it is not strictly needed after write.sa.configs is finished
    (but I think this is where run IDs are taken from)
  => Unresolved
runModule.run.sensitivity.analysis overwrites outputs as it runs for each site
- Affects ensemble analysis too
- The correct fix for this will probably parallel the fix for run.write.configs
- Hacky workaround: pass each site a different ensemble ID to get n_sites separate outputs, then manually combine post hoc
  => This workaround is implemented by setting null ensemble.ids to rlang::hash(settings),
  but we might consider settings$run$site$id instead. Are there cases where a multiSettings might contain multiple entries from the same site, or where the site ID would be unset?
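
The two resolved items in the notes above (site ID in SA run directory names, and restricting SA to the PFTs present at a site) can be sketched roughly as follows. `make_sa_run_id` and `pfts_at_site` are hypothetical stand-ins, not PEcAn's get.run.id or run.sensitivity.analysis, and the ID layout and PFT list structure are assumptions.

```r
# Hypothetical stand-in for a run-ID builder; the real get.run.id signature
# and ID layout may differ.
make_sa_run_id <- function(run_type, index, site_id = NULL) {
  # c() drops a NULL site_id, so the old (colliding) name is reproduced
  paste(c(run_type, index, site_id), collapse = "-")
}

# Without a site ID, every site maps to the same directory name
id_shared <- make_sa_run_id("SA", "median")            # "SA-median"
id_site_a <- make_sa_run_id("SA", "median", "site-A")  # "SA-median-site-A"
id_site_b <- make_sa_run_id("SA", "median", "site-B")  # "SA-median-site-B"

# Keep only the PFTs listed for this site. Assumes each PFT is a list with a
# `name` field and that site_pft_names is a character vector.
pfts_at_site <- function(pfts, site_pft_names) {
  pft_names <- vapply(pfts, function(p) p$name, character(1))
  pfts[pft_names %in% site_pft_names]
}

all_pfts <- list(list(name = "temperate.deciduous"), list(name = "grass"))
site_pfts <- pfts_at_site(all_pfts, "grass")
```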

Review Time Estimate

Immediately
Within one week
When possible

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

My change requires a change to the documentation.
My name is in the list of CITATION.cff
I agree that PEcAn Project may distribute my contribution under any or all of
- the same license as the existing code,
- and/or the BSD 3-clause license.
I have updated the CHANGELOG.md.
I have updated the documentation accordingly.
I have read the CONTRIBUTING document.
I have added tests to cover my changes.
All new and existing tests passed.

CHANGELOG.md

infotroph · 2025-10-29T21:46:29Z

@DongchenZ Will anything I propose here conflict with #3634?

divine7022 · 2025-10-31T16:03:33Z

LGTM. As discussed, my debugging shows that the settings get mutated between two calls of write.*.configs, so there won't be a conflict between the ensemble IDs of multisite settings when only ensembles are required.
Please add this missing hash generation to run.ensemble.analysis() and read.ensemble.ts as well.

infotroph · 2025-10-31T17:05:30Z

> add this missing hash generation to run.ensemble.analysis() and read.ensemble.ts

@divine7022 Good catch that those are potential failure points. On a quick look I think their current behavior (erroring if there is no ensemble ID) is correct -- recall that the model runs have already executed by the time run.ensemble.analysis and read.ensemble.ts are called, so if we can't tell what ensemble ID was used at write.configs time then I think it's too late to create one in these steps. Does that seem right to you?
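
To illustrate the fail-fast behavior described here, a minimal sketch of an analysis-time check that refuses to invent an ID. `require_ensemble_id` is a hypothetical name and the settings layout is assumed; the real functions may perform this check differently.

```r
# Hypothetical check; not the actual run.ensemble.analysis code.
# Assumes the ensemble ID lives at settings$ensemble$ensemble.id.
require_ensemble_id <- function(settings) {
  ens_id <- settings$ensemble$ensemble.id
  if (is.null(ens_id)) {
    # The runs already exist on disk under some ID chosen at write.configs
    # time, so a freshly generated ID could not point at them.
    stop("No ensemble ID in settings; cannot tell which runs belong to this ensemble.")
  }
  ens_id
}

ok <- require_ensemble_id(list(ensemble = list(ensemble.id = "abc")))
missing_result <- tryCatch(require_ensemble_id(list()), error = function(e) e)
```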

divine7022 · 2025-10-31T17:42:28Z

Sounds correct! I now see the different ensemble.id validation behaviors for sensitivity and ensemble runs, and the strict ensemble.id requirement for run.ensemble.analysis.

dlebauer

It seems prudent to merge this hack to keep work moving. But before merging, please file one or more follow-up issues to track the open threads that it leaves.

infotroph · 2025-11-03T06:28:06Z

> please file one or more follow-up issues to track the open threads that it leaves

@dlebauer this PR fixes the problem I was having, so as far as I'm concerned it's not leaving threads open 🤷

Like I said above, the dump of my working notes is specifically for you to take and run with, because the correct resolution for each "unresolved" behavior (or even the decision whether it's a bug at all) could depend greatly on which direction you decide to take the SA work.

divine7022 · 2025-11-03T12:17:25Z

Those fixes are addressed and will be pushed once this PR gets merged

dlebauer

LGTM

infotroph added 5 commits October 17, 2025 14:37

wording and whitespace

cc52beb

use settings hash as ensemble id if not provided

9ccf904

filter to pfts present at this site

91cc792

pass site id when naming rundirs

c56c125

changelog

ef6d42d

github-actions bot added Modules Base labels Oct 20, 2025

infotroph commented Oct 20, 2025

CHANGELOG.md (outdated, resolved)

Update CHANGELOG.md

159284a

infotroph requested a review from dlebauer October 31, 2025 18:07

mdietze reviewed Oct 31, 2025

modules/uncertainty/R/ensemble.R (resolved)

modules/uncertainty/R/run.sensitivity.analysis.R (resolved)

infotroph commented Oct 31, 2025

modules/uncertainty/R/run.sensitivity.analysis.R (outdated, resolved)

infotroph commented Oct 31, 2025

modules/uncertainty/R/ensemble.R (outdated, resolved)

dlebauer requested changes Nov 2, 2025

modules/uncertainty/R/ensemble.R (outdated, resolved)

infotroph commented Nov 3, 2025

CHANGELOG.md (outdated, resolved)

wording improvements from code review

804c932

Co-authored-by: David LeBauer <[email protected]>

infotroph added this to the 1.10.0 milestone Nov 3, 2025

Apply suggestion from @dlebauer

58b0c11

dlebauer approved these changes Nov 4, 2025

dlebauer enabled auto-merge November 4, 2025 22:04

dlebauer disabled auto-merge November 4, 2025 22:04

dlebauer enabled auto-merge November 4, 2025 22:15

Roxygen

bc9edba

dlebauer added this pull request to the merge queue Nov 4, 2025

Merged via the queue into PecanProject:develop with commit 1901bbe Nov 4, 2025
19 of 26 checks passed

infotroph deleted the sa-multisite-hack branch November 4, 2025 23:00
