fzakaria
Contributor

Motivation

Users of builtins.fetchGit may specify:

  • nothing (will use default branch)
  • rev
  • ref
  • rev & ref
  • all of the above with allRefs

Previously, if rev was specified, the ref attribute was completely ignored, which to some felt like a bug. The problem is that if the commit is valid but the ref is wrong, the git fetch still succeeds and people think they are on the matching ref.

This commit changes the behavior so that the ref, if specified, is always fetched, and the commit is then looked up within that ref.

This will help catch cases where the rev does not match the specified ref.

If no ref is specified then the behavior continues as normal by trying to fetch the commit.
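
As a rough illustration of the precedence described above, the choice of what to hand to git fetch might be sketched as follows. This is not the code from this PR: chooseFetchRefspec is a hypothetical helper, and expanding a short branch name to refs/heads/<name> is an assumption made for the sketch.

```cpp
#include <optional>
#include <string>

// Hypothetical helper, not the actual Nix implementation.
// Picks what to fetch, given the attributes a user passed to fetchGit.
std::string chooseFetchRefspec(
    const std::optional<std::string> & rev,
    const std::optional<std::string> & ref,
    bool allRefs)
{
    // allRefs: fetch every ref, so any reachable commit can be found.
    if (allRefs)
        return "refs/*:refs/*";

    // A ref, if given, is always fetched, even when rev is also given.
    // The rev is checked against the fetched ref afterwards.
    if (ref) {
        if (*ref == "HEAD")
            return *ref;
        // Assumption: short names are branches and live under refs/heads/.
        const bool qualified = ref->rfind("refs/", 0) == 0;
        const std::string full = qualified ? *ref : "refs/heads/" + *ref;
        return full + ":" + full;
    }

    // No ref: fall back to fetching the commit directly.
    if (rev)
        return *rev;

    // Nothing specified: use the remote's default branch.
    return "HEAD";
}
```

With rev and ref = "master" both set, this sketch returns the refspec for master rather than the commit, which is the behavioral change the description is about.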

fixes #12974


@fzakaria fzakaria requested a review from edolstra as a code owner July 10, 2025 05:10
@github-actions github-actions bot added the fetching Networking with the outside (non-Nix) world, input locking label Jul 10, 2025
Comment on lines +631 to +639
} else if (originalRef) {
    // FIXME: We should just use input.getRef() but it's modified above
    // If ref is specified try that next to fetch the latest ref
    // we will then check that the commit is part of it if given.
    if (ref.compare(0, 5, "refs/")) {
        fetchRef = fmt("%1%:%1%", ref);
    } else {
        fetchRef = *originalRef;
    }
Contributor

It looks like this just ignores rev when ref is specified, which is the original issue. So I don't understand how this is meant to fix #12974, unless rev is checked elsewhere.

Contributor Author

Please see #13440 (comment)

@fzakaria
Contributor Author

fzakaria commented Jul 10, 2025

@l0b0 it's a bit difficult to see what's going on in the change however here is an example.

Let's use 5a4aa7adeb087067cdb8539f6b6f486014c3b213 from my blog fzakaria/fzakaria.com

if we specify an invalid ref even with a valid commit, it fails.

> nix eval --expr 'builtins.fetchGit {url = "https://github.com/fzakaria/fzakaria.com"; rev = "5a4aa7adeb087067cdb8539f6b6f486014c3b213"; ref = "no-such-ref";}'

 error: Cannot find Git revision '5a4aa7adeb087067cdb8539f6b6f486014c3b213' in ref 'no-such-ref' of repository 'https://github.com/fzakaria/fzakaria.com'! Please make sure that the rev exists on the ref you've specified or add allRefs = true; to fetchGit.

if we specify a valid commit without a ref, it will fetch the commit and that will work if any of the advertised refs by the remote have that commit reachable.

> nix eval --expr 'builtins.fetchGit {url = "https://github.com/fzakaria/fzakaria.com"; rev = "5a4aa7adeb087067cdb8539f6b6f486014c3b213";}'

{ lastModified = 1751929943; lastModifiedDate = "20250707231223"; narHash = "sha256-rYwNDGotROs5vafWKXtBjFQhI6KbL0lIlYCH5Ui14pI="; outPath = "/nix/store/plcpv7vcvmppa2f33gik0d8amhjixn3l-source"; rev = "5a4aa7adeb087067cdb8539f6b6f486014c3b213"; revCount = 264; shortRev = "5a4aa7a"; submodules = false; }

I'll now use ref gh-pages which is just the compiled blog posts and is a ref with only ever 1 commit. This is what I believe your bug is about, the ref and rev don't match.

> nix eval --expr 'builtins.fetchGit {url = "https://github.com/fzakaria/fzakaria.com"; rev = "5a4aa7adeb087067cdb8539f6b6f486014c3b213"; ref = "gh-pages";}'

error: Cannot find Git revision '5a4aa7adeb087067cdb8539f6b6f486014c3b213' in ref 'gh-pages' of repository 'https://github.com/fzakaria/fzakaria.com'! Please make sure that the rev exists on the ref you've specified or add allRefs = true; to fetchGit.

If I specify a ref that does have the commit reachable everything works.

> nix eval --expr 'builtins.fetchGit {url = "https://github.com/fzakaria/fzakaria.com"; rev = "5a4aa7adeb087067cdb8539f6b6f486014c3b213"; ref = "master";}'

{ lastModified = 1751929943; lastModifiedDate = "20250707231223"; narHash = "sha256-rYwNDGotROs5vafWKXtBjFQhI6KbL0lIlYCH5Ui14pI="; outPath = "/nix/store/plcpv7vcvmppa2f33gik0d8amhjixn3l-source"; rev = "5a4aa7adeb087067cdb8539f6b6f486014c3b213"; revCount = 264; shortRev = "5a4aa7a"; submodules = false; }

@roberth
Member

roberth commented Jul 10, 2025

Feature branches may disappear on the remote, but it's useful to be able to fetch it even after a merge+delete.
However, that's indistinguishable from an attack given the limited amount of information; specifically we'd need to know which alternate refs are ok to fetch this commit from.
I don't feel like users would proactively add extra refs to their fetchGit/fetchTree call, so that's not much of a solution.
We do need a solution, because when this happens in a dependency deep down, especially with fetchGit where we don't have a generic overriding mechanism like the flake lock, then it's rather time consuming for a user to set up all the forks and branches to make it evaluate again.
So I'd like to propose adding a flag that turns this error into a warning, so that we don't regress the pinned feature branch UX too much.

@l0b0
Contributor

l0b0 commented Jul 10, 2025

@l0b0 it's a bit difficult to see what's going on in the change however here is an example.

Thanks for the thorough examples!

Let's use 5a4aa7adeb087067cdb8539f6b6f486014c3b213 from my blog fzakaria/fzakaria.com

if we specify an invalid ref even with a valid commit, it fails.

> nix eval --expr 'builtins.fetchGit {url = "https://github.com/fzakaria/fzakaria.com"; rev = "5a4aa7adeb087067cdb8539f6b6f486014c3b213"; ref = "no-such-ref";}'

 error: Cannot find Git revision '5a4aa7adeb087067cdb8539f6b6f486014c3b213' in ref 'no-such-ref' of repository 'https://github.com/fzakaria/fzakaria.com'! Please make sure that the rev exists on the ref you've specified or add allRefs = true; to fetchGit.

That's as expected.

if we specify a valid commit without a ref, it will fetch the commit and that will work if any of the advertised refs by the remote have that commit reachable.

> nix eval --expr 'builtins.fetchGit {url = "https://github.com/fzakaria/fzakaria.com"; rev = "5a4aa7adeb087067cdb8539f6b6f486014c3b213";}'

{ lastModified = 1751929943; lastModifiedDate = "20250707231223"; narHash = "sha256-rYwNDGotROs5vafWKXtBjFQhI6KbL0lIlYCH5Ui14pI="; outPath = "/nix/store/plcpv7vcvmppa2f33gik0d8amhjixn3l-source"; rev = "5a4aa7adeb087067cdb8539f6b6f486014c3b213"; revCount = 264; shortRev = "5a4aa7a"; submodules = false; }

Also as expected.

I'll now use ref gh-pages which is just the compiled blog posts and is a ref with only ever 1 commit. This is what I believe your bug is about, the ref and rev don't match.

> nix eval --expr 'builtins.fetchGit {url = "https://github.com/fzakaria/fzakaria.com"; rev = "5a4aa7adeb087067cdb8539f6b6f486014c3b213"; ref = "gh-pages";}'

error: Cannot find Git revision '5a4aa7adeb087067cdb8539f6b6f486014c3b213' in ref 'gh-pages' of repository 'https://github.com/fzakaria/fzakaria.com'! Please make sure that the rev exists on the ref you've specified or add allRefs = true; to fetchGit.

If I specify a ref that does have the commit reachable everything works.

In Git, a ref corresponds to a single commit (although for practical purposes it's often used to also refer to every ancestor of that commit), so this seems fishy. I guess some users would not consider that a bug. I'd rather it was an error, at least by default.

@fzakaria
Contributor Author

I'm not clear from your answer if you are saying you agree or not....

Basically, a commit has to be reachable from a ref. If you don't include the ref, then the commit only works if it's reachable from the advertised refs (as listed by git ls-remote) or if the server has allowAnySHA1InWant enabled.

So if you specify only a commit, you are kind of hoping that it's advertised or that the setting is enabled.

By adding a ref & a rev, you are being more explicit "Hey, I expect this commit to be an ancestor of this ref". This can be useful for tying a commit to a release branch.

The nice thing is, if you put a commit but a faulty ref, this should catch some cases of that (unless the faulty ref also has that as an ancestor).
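
To make "the commit is an ancestor of the ref" concrete, here is a small sketch of a reachability check over a toy commit graph. It is not how Nix or Git implement it; CommitGraph, isReachableFrom, exampleGraph and the commit ids c1, c2, c3 and p1 are made up for illustration. On a real repository, git merge-base --is-ancestor answers the same question.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Toy commit graph: commit id -> ids of its parents.
using CommitGraph = std::unordered_map<std::string, std::vector<std::string>>;

// True if `rev` is `tip` itself or can be reached from `tip` via parent links.
bool isReachableFrom(const CommitGraph & graph, const std::string & rev, const std::string & tip)
{
    std::vector<std::string> stack{tip};
    std::unordered_set<std::string> seen;
    while (!stack.empty()) {
        const std::string current = stack.back();
        stack.pop_back();
        if (current == rev)
            return true;
        if (!seen.insert(current).second)
            continue; // already visited
        const auto it = graph.find(current);
        if (it == graph.end())
            continue; // unknown commit, nothing to follow
        for (const auto & parent : it->second)
            stack.push_back(parent);
    }
    return false;
}

// Made-up history: "master" tip is c3 (c3 -> c2 -> c1),
// "gh-pages" tip is p1, a single commit with no parents.
CommitGraph exampleGraph()
{
    CommitGraph g;
    g["c3"] = std::vector<std::string>{"c2"};
    g["c2"] = std::vector<std::string>{"c1"};
    g["c1"] = std::vector<std::string>{};
    g["p1"] = std::vector<std::string>{};
    return g;
}
```

In this toy history, c2 is reachable from the master tip c3 but not from the gh-pages tip p1, which mirrors the master and gh-pages examples earlier in the thread.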

@roberth thanks for taking the time to look at it.
Does my explanation/investigation make sense? If so, I can wrap it with a flag to trigger previous logic and update the tests accordingly.

@l0b0
Contributor

l0b0 commented Jul 11, 2025

I'm not clear from your answer if you are saying you agree or not....

Sorry 😁. I consider the last case, where the rev is reachable but not equal to the ref, to be an error. Consider the possibility of a supply chain attack, for example. The attacker can change a ref to point to any commit. I'd rather see an implementation which returns an error if ref does not point to exactly rev.

@roberth
Member

roberth commented Jul 11, 2025

By adding a ref & a rev, you are being more explicit "Hey, I expect this commit to be an ancestor of this ref". This can be useful for tying a commit to a release branch.

I agree that that's a useful feature. My worry is that people will use it excessively for two reasons

  1. previous versions of Nix required a ref, and some users haven't unlearned that
  2. "the more info the better"
    • avoid having to fetch everything, brought up in discussions of (1)
    • document where it's from

So I believe it will be overused, and we need an escape hatch for the case when inevitable feature branches are merged and deleted. Doing the archaeology to figure out your chain of transitive reverse deps and making PRs for all of them is a bunch of work that shouldn't block your deployment.

I'd rather see an implementation which returns an error if ref does not point to exactly rev.

For tags I agree, but for pinning a branch it would be unusable.
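
One way to picture the distinction being drawn here (exact match for tags, ancestry for branches) is the sketch below. It is only an illustration of the idea under discussion, not something the PR implements; revAcceptableForRef is a hypothetical name, and it assumes fully qualified ref names such as refs/tags/v1.0.

```cpp
#include <string>

// Hypothetical policy sketch, not part of the PR:
// - a tag must point at exactly `rev`
// - any other ref (e.g. a branch) only needs to have `rev` in its history
bool revAcceptableForRef(
    const std::string & refName,
    bool tipEqualsRev,         // the ref currently points at rev
    bool revReachableFromTip)  // rev is the tip or one of its ancestors
{
    const bool isTag = refName.rfind("refs/tags/", 0) == 0;
    return isTag ? tipEqualsRev : revReachableFromTip;
}
```

Under this policy a branch that has moved past the pinned commit still passes, while a tag that has been moved to a different commit is rejected.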

@fzakaria
Contributor Author

@roberth it's funny to worry it'll be over-used -- based on my research when providing a commit (rev) only, you are still open to failure if the server stops publishing a ref that has it as an ancestor. Seems like there is no "perfect solution"

@l0b0 I consider the case where rev == ref (not just an ancestor) the most brittle. That means any committed flake to a branch, let's say release-25.05, will fail once the ref has moved forward. Maybe in that case a warning that the commit is no longer the tip of the ref is worthwhile.

so... where did we land on where we should take this change?
@roberth I want to defer to you here -- I am looking just to contribute so I am happy to set-aside my own ideology :)

@l0b0
Contributor

l0b0 commented Jul 11, 2025

@l0b0 I consider the case where rev == ref (not just an ancestor) the most brittle. That means any committed flake to a branch, let's say release-25.05, will fail once the ref has moved forward. Maybe in that case a warning that the commit is no longer the tip of the ref is worthwhile.

For me, brittle doesn't come into it. I would never use a ref which I should expect to just change at any time. I'd use a tag, and ask the maintainer whether anything is wrong if the tag ever moves.

@fzakaria
Contributor Author

@l0b0 super popular is to reference nixpkgs by branch name though:

inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.11";

These branches get security updates and change at any time.

@l0b0
Contributor

l0b0 commented Jul 12, 2025

@l0b0 super popular is to reference nixpkgs by branch name though:

inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.11";

These branches get security updates and change at any time.

Is that applicable? When used without a lock file, it's also used without a commit ID, so there's no issue (TOFU). When used with a lock file, whatever reads the lock file and performs the download surely just uses the commit ID without the (at that point) pointless ref.

Labels
fetching Networking with the outside (non-Nix) world, input locking
Development

Successfully merging this pull request may close these issues.

fetchGit should fail if rev and ref don't match
3 participants