Skip to content

Conversation

OriolMunoz-da
Copy link
Contributor

@OriolMunoz-da OriolMunoz-da commented Oct 6, 2025

Fixes https://github.com/DACH-NY/cn-test-failures/issues/5021 (hopefully, obviously)

Reverse of https://github.com/hyperledger-labs/splice/pull/1727/files:

  • All the fixes done there are untouched (so still there)
  • Newer nix version
  • Bumping cache versions again (assume 6 was broken, so 7) and add $HOME and the nix binary version to the cache key

Unfortunately keeping our caches (binary cache is fine) is proving to be very hard for me. Without them the timing impact is about 2m on average.

Latest version is 2.32
See this change done in 2.14 that is not documented but likely breaks part of our setup.

@OriolMunoz-da OriolMunoz-da changed the title Bump nix binary to 2.31.2 [wip] Bump nix binary to 2.31.2 Oct 6, 2025
@OriolMunoz-da OriolMunoz-da changed the title [wip] Bump nix binary to 2.31.2 [wip] Bump nix binary to $SOMETHING Oct 7, 2025
@OriolMunoz-da
Copy link
Contributor Author

image

expect(screen.getByText(/Submission failed/)).toBeDefined();
expect(screen.getByText(/Service Unavailable/)).toBeDefined();
},
{ timeout: 10000 }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bumping a timeout because it flaked

Signed-off-by: Oriol Muñoz <[email protected]>
Signed-off-by: Oriol Muñoz <[email protected]>
Signed-off-by: Oriol Muñoz <[email protected]>
Signed-off-by: Oriol Muñoz <[email protected]>
Signed-off-by: Oriol Muñoz <[email protected]>
Signed-off-by: Oriol Muñoz <[email protected]>
Signed-off-by: Oriol Muñoz <[email protected]>
@OriolMunoz-da OriolMunoz-da force-pushed the oriol/nix-electric-boogaloo branch from cda8f20 to 6814e44 Compare October 8, 2025 13:10
@OriolMunoz-da
Copy link
Contributor Author

OriolMunoz-da commented Oct 8, 2025

checkpoint: 6814e44 seems to work with a few cache-related things disabled
it's up to 2m slower than current main... let's see how far I get otherwise

Signed-off-by: Oriol Muñoz <[email protected]>
@nicu-da
Copy link
Contributor

nicu-da commented Oct 9, 2025

latest checkpoint: daf9b99 comparing times between here (there was a flake, but doesn't look related to nix) and some main run The avg time impact is of about 2s

but the setup is more than a minute in diff, right?

the diff is smaller because with the caches for some reason nix had a 10s time period where it did nothing, never managed to figure out why, but that's dependent on how many run with nix actions you have in a workflow. for the integration tests we usually have a lot less so the diff is a bit higher

@OriolMunoz-da
Copy link
Contributor Author

@nicu-da
working-late typo:

The avg time impact is of about 2s

was supposed to be 2m... 🤦
that includes everything done in a test job, not just the nix setup

The unfortunate thing is that most test runs don't show a separate step for setting up nix, but rather it's included in the "Run tests" step, so I cannot check for individual times from gh

@OriolMunoz-da OriolMunoz-da changed the title [wip] Bump nix binary to $SOMETHING [wip] Bump nix binary to 2.32.0 Oct 10, 2025
@OriolMunoz-da OriolMunoz-da changed the title [wip] Bump nix binary to 2.32.0 Bump nix binary to 2.32.0 Oct 10, 2025

- name: Check out repository code
uses: actions/checkout@v4
if: inputs.commit_sha != ''
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this and below are the only ones that had this... what for?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isegall-da is that leftover? I could see it only being set when you run from a fork but then it should be used for every checkout and it's definitely not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @isegall-da , don't want to merge without knowing this is 100% fine

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed with @OriolMunoz-da now, just copying the conclusion here as well:
In these tests we do:
Checkout the PR head commit to get the commit message first
check the static label
Checkout commit_sha

In tests that run split_tests first, we check the static label in split_tests which is why we don't have these separate steps there, and we just checkout commit_sha there.

(in ts_cli we flipped the order between (2) and (3) which is wrong, we'll fix that)


- name: Check out repository code
uses: actions/checkout@v4
if: inputs.commit_sha != ''
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also this

@OriolMunoz-da OriolMunoz-da marked this pull request as ready for review October 10, 2025 13:34
Copy link
Contributor

@moritzkiefer-da moritzkiefer-da left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice thanks! Please hold off on merging until Martin and Nicu weigh in.


- name: Check out repository code
uses: actions/checkout@v4
if: inputs.commit_sha != ''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isegall-da is that leftover? I could see it only being set when you run from a fork but then it should be used for every checkout and it's definitely not.

rsync -avi "/cache/nix/$cache_key/nix" $HOME/.config/
rsync -avi "/cache/nix/$cache_key/nix_store/var/" /nix/var
sudo mount --bind /cache/nix/$cache_key/nix_store/store /nix/store
# rsync -avi "/cache/nix/$cache_key/nix_store/var/" /nix/var
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do uncomment that let's create an issue for it documenting how it fails and what you tried.

@martinflorian-da @nicu-da wdyt? I'm leaning towards saying getting an upgrade in with the slowdown and then moving the fix for this to the top of our prio list is better than to keep pushing out that upgrade for another few months (the pressure from slow CI also felt more so chances are higher we'll actually tackle it).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(With my limited context:) Agreed to merging this without this caching part, a bit worried about "just moving the fix to the top of our prio list". Can we give assign it to a concrete person already that will realistically get to it soon?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @martinflorian-da those 2 minutes were really annoying in the past, though I guess with the current CI timing it might be less so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#2663 adding as TODOs

@OriolMunoz-da OriolMunoz-da changed the title Bump nix binary to 2.32.0 Bump nix binary to 2.32.0 without caching Oct 10, 2025
@OriolMunoz-da
Copy link
Contributor Author

Please hold off on merging until ...

...Monday, because nobody is dealing with any potential downfall today 😄

@isegall-da
Copy link
Contributor

@OriolMunoz-da

I cannot check for individual times from gh

Note that GH has a "show timestamps" when you look at its output.

@OriolMunoz-da OriolMunoz-da merged commit f333d10 into main Oct 14, 2025
57 checks passed
@OriolMunoz-da OriolMunoz-da deleted the oriol/nix-electric-boogaloo branch October 14, 2025 08:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants