Object database should transitively include alternate directories #38

Open
@andrewbraxton

This is a bit of a follow-up to the previous issue I filed in the main git-lfs repo as part of my effort to get Git LFS working more smoothly with our AOSP codebase, which is essentially tied to the use of git-repo. (Thanks for the response and resolution by the way!)

In that issue, @chrisd8088 pointed out that git-repo has some preliminary support for the proper way of sharing Git objects, which is the alternates mechanism (in contrast to the hacky symlink business). I've been playing around with REPO_USE_ALTERNATES=1 and found that it mostly works fine, but when running certain Git LFS commands I occasionally saw errors about missing objects.

I believe I have tracked this down to the fact that gitobj does not follow alternates transitively, and a transitive chain of alternates is exactly what git-repo sets up when you create a repo mirror (repo init --mirror).

That is to say, if Git Repo A has Git Repo B as an alternate object store, and Git Repo B has Git Repo C as an alternate object store, gitobj will not include Git Repo C in its database, causing missing-object errors with certain Git LFS commands.

It's easy to reproduce this by starting with an LFS repo, cloning it with git clone --shared, then cloning another shared repo off that one.

Suppose that "$REMOTE_REPO" is an LFS repo:

git clone "$REMOTE_REPO" remote
git clone --shared remote mirror
git clone --shared mirror local
cd local

# Error due to missing object
git lfs prune --dry-run

# But Git itself can find the object
git show <missing object>

# If you specify the env var, there will be no error, because `gitobj` properly respects the env var
GIT_ALTERNATE_OBJECT_DIRECTORIES=../remote/.git/objects:../mirror/.git/objects git lfs prune --dry-run

# If you do a repack such that all the objects end up duplicated inside local's .git/objects store, the command will succeed
git repack -adf
git lfs prune --dry-run

Before creating this issue, I wanted to make sure that this transitive inheritance of object stores is actually an intended/supported feature by Git, and it looks like it is based on this commit: git/git@c2f493a

Thus, I believe this is the code in gitobj that should be updated to support transitive object store inheritance:

gitobj/backend.go

Lines 45 to 72 in e2a3f83

func findAllBackends(mainLoose *fileStorer, mainPacked *pack.Storage, root string, algo hash.Hash) ([]storage.Storage, error) {
	storage := make([]storage.Storage, 2)
	storage[0] = mainLoose
	storage[1] = mainPacked
	f, err := os.Open(path.Join(root, "info", "alternates"))
	if err != nil {
		// No alternates file, no problem.
		if err != os.ErrNotExist {
			return storage, nil
		}
		return nil, err
	}
	defer f.Close()
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		storage, err = addAlternateDirectory(storage, scanner.Text(), algo)
		if err != nil {
			return nil, err
		}
	}
	if err := scanner.Err(); err != nil {
		return nil, err
	}
	return storage, nil
}

Labels: enhancement, help-wanted
