Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Analyzer: Git downloader does not use latest commit when revision is not fixed #7725

Closed
nnobelis opened this issue Oct 20, 2023 · 8 comments
Closed
Labels
downloader About the downloader tool

Comments

@nnobelis
Copy link
Member

One of our customers wants to run the Analyzer on the master branch of his repository.

When the run has finished :

vcs_processed:
    type: "Git"
    url: "https://XXX.git"
    revision: "3f3c85c3c662aad724536c4a64e7588e93c9ab0c"
    path: ""

This revision is not the latest commit from master, this is some old commit from 6 months ago !

I have seen that #7556 introduced a FETCH_HEAD mechanism for non fixed revisions.
I have also found that JGit may not be setting FETCH_HEAD properly: https://stackoverflow.com/questions/24682310/fetching-with-jgit-doesnt-show-the-last-commit

Could we have an issue here ?
I don't know what I could give for extra information but I would like @mnonnenmacher opinion on this.

@sschuberth sschuberth added bug downloader About the downloader tool labels Oct 20, 2023
@mnonnenmacher mnonnenmacher changed the title Analyzer: Git package manager does no use latest commit when tag is not a fixed revision Analyzer: Git downloader does not use latest commit when revision is not fixed Oct 29, 2023
@mnonnenmacher
Copy link
Member

@nnobelis Have you verified that the issue was introduced by #7556?

@sschuberth sschuberth added the needs info An issue where further information is required label Oct 31, 2023
@nnobelis
Copy link
Member Author

nnobelis commented Nov 2, 2023

@mnonnenmacher I tested with and without 53e755d and this is the guilty bit.

@sschuberth
Copy link
Member

I'm having a look at this... meanwhile, would you be able to contribute a (for now failing) test case @nnobelis?

@nnobelis
Copy link
Member Author

nnobelis commented Nov 2, 2023

@sschuberth I will try but I imagine this is not an obvious issue to pinpoint: As far as I know, only one of our customers reported this behavior.

@sschuberth sschuberth removed the needs info An issue where further information is required label Nov 2, 2023
@nnobelis
Copy link
Member Author

nnobelis commented Dec 5, 2023

@mnonnenmacher, @sschuberth

Unfortunately, I didn't manage to get a working test, but I think I know where the error might lies.

In the test written by @mnonnenmacher , https://github.com/bosch-io/oss-review-toolkit/blob/400e9efc4b6dd8bfba6c4cad37ec85b11f693b50/plugins/version-control-systems/git/src/funTest/kotlin/GitFunTest.kt#L84-L92, the run is an "happy path" : The test managed to do the git fetchwith JGit during the first try, i.e. Trying to fetch only revision XXX.

In this case, the file .git/FETCH_HEAD contains

0c58ea81d5c8112affab7a9cd6308deb4bc51589	not-for-merge	branch 'branch1' of https://github.com/oss-review-toolkit/ort-test-data-git.git
0c58ea81d5c8112affab7a9cd6308deb4bc51589	not-for-merge	tag 'tag1' of https://github.com/oss-review-toolkit/ort-test-data-git.git

Which means only one branch is considered.

In the case of our customer, all the fetches with JGit fail with this error:

org.eclipse.jgit.api.errors.TransportException: Expected ACK/NAK, got: shallow 30ac3a2add0828fd8b3362d66fe059242da68356

Maybe due to https://bugs.eclipse.org/bugs/show_bug.cgi?id=581228.

Anyway, in the end Git CLI is used instead of JGit for the fetch:

14:25:44.322 [main] INFO  org.ossreviewtoolkit.plugins.versioncontrolsystems.git.Git - Falling back to Git CLI.
14:25:44.327 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running 'git fetch --unshallow --tags origin' in '/tmp/test/tmp'...

In this case, the file .git/FETCH_HEAD contains a lots of tags/hashes. The first hash in the file is a very old one and is used when doing:
git reset --hard FETCH_HEAD.

FYI: This interesting comment https://stackoverflow.com/a/50568481 mentions that FETCH_HEAD is not a single branch information.

@nnobelis
Copy link
Member Author

nnobelis commented Dec 7, 2023

I wrote this morning a test with MockK to force the usage of Git CLI. Unfortunately, it seems this usage is not the culprit here, as suggested by my previous comment.

I noticed that all jgit fetch commands done by ORT (even the failing ones because of the TransportException), create a file .git/FETCH_HEAD. This file contains several references.

Additionally, I compared the file .git/FETCH_HEAD in a separate manual clone after a git fetch to the one in the ORT temporary working directory after the jGit checkout: The former has an extra line at the top:

2c1fe7c8ca49f1a8b427574e14fe8e1492f66c29		branch 'master' of https://dev.azure.com/XXX

So I think in the case of ORT, this file is not generated correctly by JGit because we are missing the entry, hence the bug.

sschuberth added a commit that referenced this issue Jan 24, 2024
The `.git/FETCH_HEAD` file does not only contain a single ref for the
current branch, but all refs that have been fetched last. Do not assume
the entry for the current branch to be the first one, and use JGit's
`FetchResult` to search for the correct resolved revision to reset to.

Fixes #7725.

Signed-off-by: Sebastian Schuberth <[email protected]>
@sschuberth
Copy link
Member

@nnobelis can you give the code in #8170 a try to verify that it fixes the issue for you?

sschuberth added a commit that referenced this issue Jan 25, 2024
The `.git/FETCH_HEAD` file does not only contain a single ref for the
current branch, but all refs that have been fetched last. Do not assume
the entry for the current branch to be the first one, and use JGit's
`FetchResult` to search for the correct resolved revision to reset to.

Fixes #7725.

Signed-off-by: Sebastian Schuberth <[email protected]>
sschuberth added a commit that referenced this issue Jan 25, 2024
The `.git/FETCH_HEAD` file does not only contain a single ref for the
current branch, but all refs that have been fetched last. Do not assume
the entry for the current branch to be the first one, and use JGit's
`FetchResult` to search for the correct resolved revision to reset to.

Fixes #7725.

Signed-off-by: Sebastian Schuberth <[email protected]>
@nnobelis
Copy link
Member Author

@sschuberth Yes the fix did the trick. Thanks !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
downloader About the downloader tool
Projects
None yet
Development

No branches or pull requests

3 participants