Skip to content

New remotes save with unexpected metadata in the presence of global configuration #1951

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
emilazy opened this issue Apr 12, 2025 · 3 comments
Labels
acknowledged an issue is accepted as shortcoming to be fixed help wanted Extra attention is needed

Comments

@emilazy
Copy link
Contributor

emilazy commented Apr 12, 2025

Current behavior 😯

When a user has global configuration like remote.origin.prune = true, a repository configuration with no remote is opened, and a remote is created and saved with that name to the repository’s configuration, the new configuration is written to the existing section from the global configuration. This means that the metadata from the global configuration (including trust) is used instead of the file’s metadata, and an attempt to save the modified configuration with config.write_to_filter(&mut config_file, |section| section.meta() == config.meta()) will omit the new remote.

This differs from the behaviour when there is no configuration for that remote outside of the repository configuration file, or the behaviour where a separate remote section already exiets in the repository configuration.

Expected behavior 🤔

A new remote section is created if it does not already exist in the gix::config::File, even if there is configuration for that remote from other sources.

In general I’m not sure when you would ever want the behaviour of the current section_mut family of functions, and I suspect this is downstream of the fact that gix::config::File mixes together sections from all sources, which also makes editing and saving configuration somewhat inconvenient. I think it is likely that a better design would have a gix::config::File per configuration source, with an abstraction on top that layers them together for look‐up of values. Mutating an abstraction consisting of the combination of multiple cascading configurations doesn’t make that much sense; you want to use it to decide the effective values, but operate on one specific file for mutation and writing.

However, this could be worked around in the remote API by simply creating a new section if one with metadata that matches the actual file doesn’t already exist.

Git behavior

yuyuko:/v/f/y/m/T/tmp.vVfQbOHnTq
❭ git init
Initialized empty Git repository in /private/var/folders/yd/mh726b5d2vqfyp132jtzq9t80000gn/T/tmp.vVfQbOHnTq/.git/

yuyuko:/v/f/y/m/T/tmp.vVfQbOHnTq
❭ git -c remote.origin.prune=true remote add origin https://example.com/

yuyuko:/v/f/y/m/T/tmp.vVfQbOHnTq
❭ cat .git/config
[core]
        repositoryformatversion = 0
        filemode = true
        bare = false
        logallrefupdates = true
        ignorecase = true
        precomposeunicode = true
[remote "origin"]
        url = https://example.com/
        fetch = +refs/heads/*:refs/remotes/origin/*

Steps to reproduce 🕹

use std::fs::File;

use tempfile::TempDir;

type Error = Box<dyn std::error::Error>;

fn save_config(config: &gix::config::File) -> Result<(), Error> {
    let mut config_file = File::create(config.meta().path.as_ref().unwrap())?;
    config.write_to(&mut config_file)?;
    Ok(())
}

fn dump_remote_sections(header: &str, config: &gix::config::File) -> Result<(), Error> {
    println!("# {header}");
    println!("# {:?}", config.meta());
    for section in config.sections_by_name("remote").unwrap() {
        println!("## {:?}", section.meta());
        section.write_to(&mut std::io::stdout())?;
    }
    println!();
    Ok(())
}

fn open_repo(repo_dir: &TempDir) -> Result<gix::Repository, Error> {
    Ok(gix::open::Options::default()
        .with(gix::sec::Trust::Reduced)
        .config_overrides([b"remote.origin.prune = true"])
        .open(repo_dir.path())?
        .to_thread_local())
}

fn demonstrate_bug(repo_dir: &TempDir) -> Result<(), Error> {
    let repo = open_repo(repo_dir)?;
    let mut config = repo.config_snapshot().clone();
    dump_remote_sections("Initial state", &config)?;
    repo.remote_at("https://example.com/")?
        .save_as_to("origin", &mut config)?;
    dump_remote_sections("After saving remote", &config)?;
    save_config(&config)?;

    dump_remote_sections("After reload", &open_repo(repo_dir)?.config_snapshot())?;

    Ok(())
}

fn main() -> Result<(), Error> {
    let repo_dir = tempfile::tempdir()?;
    gix::init(&repo_dir)?;
    demonstrate_bug(&repo_dir)?;
    Ok(())
}

Output:

# Initial state
# Metadata { path: Some("/var/folders/yd/mh726b5d2vqfyp132jtzq9t80000gn/T/.tmpke1Cbq/.git/config"), source: Local, level: 0, trust: Reduced }
## Metadata { path: None, source: Api, level: 0, trust: Full }
[remote "origin"]
        prune = true

# After saving remote
# Metadata { path: Some("/var/folders/yd/mh726b5d2vqfyp132jtzq9t80000gn/T/.tmpke1Cbq/.git/config"), source: Local, level: 0, trust: Reduced }
## Metadata { path: None, source: Api, level: 0, trust: Full }
[remote "origin"]
        prune = true
        url = https://example.com/

# After reload
# Metadata { path: Some("/var/folders/yd/mh726b5d2vqfyp132jtzq9t80000gn/T/.tmpke1Cbq/.git/config"), source: Local, level: 0, trust: Reduced }
## Metadata { path: Some("/var/folders/yd/mh726b5d2vqfyp132jtzq9t80000gn/T/.tmpke1Cbq/.git/config"), source: Local, level: 0, trust: Reduced }
[remote "origin"]
        prune = true
        url = https://example.com/
## Metadata { path: None, source: Api, level: 0, trust: Full }
[remote "origin"]
        prune = true
@Byron Byron added help wanted Extra attention is needed acknowledged an issue is accepted as shortcoming to be fixed labels Apr 13, 2025
@Byron
Copy link
Member

Byron commented Apr 13, 2025

Apologies for the late response, thanks a lot for reporting and sorry for the hassle.

I understand that section_mut() as used in Remote::save_as_to() finds a global section and changes it instead. When saving that configuration, it writes everything to the local configuration due to the way this is implemented, but also code has no way of telling global and local modifications apart anymore.

After reloading, this is also what we see - global and local modifications were merged effectively, something that isn't happening when Git is performing the same operation.

However, this could be worked around in the remote API by simply creating a new section if one with metadata that matches the actual file doesn’t already exist.

I agree that a fix would be pass only the local config to begin with, which currently isn't super convenient. Repository::config_snapshot() will always provide the fully-merged result.

To me, a fix would be a documentation change to make clear that the passed configuration isn't filtered, and is expected to be filtered based on the user needs. Typically, this config should only contain local sections, it kind of overrides everything that is present already anyway.

As a second step, there could be a config_snapshot(_mut)() variant which returns pre-filtered configuration, maybe even with a closure, or add a method to filter sections to get a similar effect as git2::Config::open_level().

Do you think that would solve the problem decently? Maybe there are other solutions you imagined?
In any case, your help with this would be very welcome.
Thank you.

@emilazy
Copy link
Contributor Author

emilazy commented Apr 13, 2025

I understand that section_mut() as used in Remote::save_as_to() finds a global section and changes it instead. When saving that configuration, it writes everything to the local configuration due to the way this is implemented, but also code has no way of telling global and local modifications apart anymore.

After reloading, this is also what we see - global and local modifications were merged effectively, something that isn't happening when Git is performing the same operation.

Right. So when we actually write out configuration in Jujutsu, we do config.write_to_filter(&mut config_file, |section| section.meta() == config.meta()), and this basically works fine except in this edge case and a few others. The result in that edge case is actually kinda worse than in the reproducer program I’ve shown here, because the section merging can end up with us removing the remote from the repository configuration entirely. We have documented some of these edge cases in jj-vcs/jj@045b17e and worked around this one in jj-vcs/jj@af67d1d by creating a new empty section before writing out the remote.

I agree that a fix would be pass only the local config to begin with, which currently isn't super convenient. Repository::config_snapshot() will always provide the fully-merged result.

To me, a fix would be a documentation change to make clear that the passed configuration isn't filtered, and is expected to be filtered based on the user needs. Typically, this config should only contain local sections, it kind of overrides everything that is present already anyway.

As a second step, there could be a config_snapshot(_mut)() variant which returns pre-filtered configuration, maybe even with a closure, or add a method to filter sections to get a similar effect as git2::Config::open_level().

I am not sure this would be a great or at least complete fix. It would mean that you can’t really use the higher-level Remote API to do manipulation of remote configuration in files, since the Remote API operates on repositories rather than configurations. There’s also the existing problem where you can’t really make a new remote and save it to a repository’s mutable configuration, because both of them want to borrow the repository.

My feeling is that a more holistic fix might look like: gix::config::File represents one single file losslessly, i.e. it has a single Metadata, and gix::config::Layered (or something) stores a stack of gix::config::Files, and has lookup functions that resolves cascading overrides. You can borrow individual gix::config::Files from it to mutate them, but it does not directly offer mutation. APIs read from gix::config::Layered, but save to gix::config::Files. Remotes are created from gix::config::Layered, so you can construct one from a repository’s entire configuration in order to get the resolved configuration for a remote, or filter out a singleton gix::config::Layered with only the repository configuration for remote management operations, and write back modified configuration to the same. Since it keeps around a reference it can save directly to one of the Files backing its configuration rather than needing one passed in. That would also allow access to more elaborate remote‐related configuration, like the various other remote settings and branch remote config, which would help implement things like remote renaming.

Or something like that; it’s just a sketch and there are trade‐offs here. Unfortunately I don’t think I’ll have time to do a restructuring this big any time soon :( The workaround will probably work fine for us and we can do a more complicated one if our remote management needs become more complex.

@Byron
Copy link
Member

Byron commented Apr 14, 2025

I understand that section_mut() as used in Remote::save_as_to() finds a global section and changes it instead. When saving that configuration, it writes everything to the local configuration due to the way this is implemented, but also code has no way of telling global and local modifications apart anymore.
After reloading, this is also what we see - global and local modifications were merged effectively, something that isn't happening when Git is performing the same operation.

Right. So when we actually write out configuration in Jujutsu, we do config.write_to_filter(&mut config_file, |section| section.meta() == config.meta()), and this basically works fine except in this edge case and a few others. The result in that edge case is actually kinda worse than in the reproducer program I’ve shown here, because the section merging can end up with us removing the remote from the repository configuration entirely. We have documented some of these edge cases in jj-vcs/jj@045b17e and worked around this one in jj-vcs/jj@af67d1d by creating a new empty section before writing out the remote.

‼️

I am not sure this would be a great or at least complete fix. It would mean that you can’t really use the higher-level Remote API to do manipulation of remote configuration in files, since the Remote API operates on repositories rather than configurations. There’s also the existing problem where you can’t really make a new remote and save it to a repository’s mutable configuration, because both of them want to borrow the repository.

I see, thanks for pointing this out! The save_as_to() method was created to be able to perform a full clone, and as such it's more of a hack than anything that is fully thought out. And of course, that comes with (unanticipated) problems that you now have to deal with - I am sorry for that.

My feeling is that a more holistic fix might look like: gix::config::File represents one single file losslessly, i.e. it has a single Metadata, and gix::config::Layered (or something) stores a stack of gix::config::Files, and has lookup functions that resolves cascading overrides. You can borrow individual gix::config::Files from it to mutate them, but it does not directly offer mutation. APIs read from gix::config::Layered, but save to gix::config::Files.

I also think something like this is needed, but only when writing files back is desired. There are legitimate cases when one wants to do in-memory edits, and this is (seemingly) fine with the current system.
However, when writing back changes is desired, the current system definitely falls over and is hard to use, the save_as_to() method being a good example for that (I tried and failed).
However, I also think that git2::Config::open_level() can show a way towards writing files back, somewhat in the vein of what you are proposing.

Without going into any detail, it feels like gix can abstract over this when offering an API to make writing changes back to disk easier, specifically cross-checked with use-cases like they occur with remotes: create/update/rename/delete should all work intuitively with it. All that is assuming that the underlying system coming with gix-config is powerful enough.

Finally, once gix has found a way, there might be opportunities to push that API down to gix-config.

Remotes are created from gix::config::Layered, so you can construct one from a repository’s entire configuration in order to get the resolved configuration for a remote, or filter out a singleton gix::config::Layered with only the repository configuration for remote management operations, and write back modified configuration to the same. Since it keeps around a reference it can save directly to one of the Files backing its configuration rather than needing one passed in. That would also allow access to more elaborate remote‐related configuration, like the various other remote settings and branch remote config, which would help implement things like remote renaming.

That is an interesting thought, and I admittedly don't understand how it would work, but we will get to it when we get to it :).
No matter what, any update to how configuration files are written needs to be able to provide answers to the issues that came up when dealing with remote configuration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
acknowledged an issue is accepted as shortcoming to be fixed help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants