cabal-install 3.14 linking fails with "shared object file not found" #10759

Closed
brandonchinn178 opened this issue Jan 19, 2025 · 19 comments · Fixed by #10828

Comments

@brandonchinn178
Collaborator

Describe the bug

In my project, CI fails with the latest cabal-install (3.14) but passes when cabal-install is pinned to 3.12. PR: brandonchinn178/aeson-schemas#99. The project is a fairly ordinary library + executable + benchmark setup, with nothing special like compiler plugins. CI runs a simple `cabal build && cabal exec -- cabal test`. I can't reproduce the failure locally.

CI logs:

One difference in 3.14 is that it reports "cabal_macros.h changed", which is presumably why it rebuilds. Printing cabal_macros.h before and after, I think the change is that happy is not included after `cabal build` but is included in the subsequent `cabal exec -- cabal test`. I don't use or declare happy anywhere in my cabal file, but a dependency, haskell-src-exts, appears to use it.

I do see this error message reported in other issues, including #8580 which I reported, but I'm reporting this separately because it looks like a possible regression between 3.12 and 3.14, instead of something I'm doing wrong.

To Reproduce
Steps to reproduce the behavior:

$ cabal build && cabal exec -- cabal test

Expected behavior
Should succeed

System information

  • Operating system
  • cabal, ghc versions

Additional context
Add any other context about the problem here.

@mpickering
Collaborator

I can reproduce this locally.

@mpickering
Collaborator

Is there a reason in particular you are running cabal exec -- cabal test rather than just cabal test?

@mpickering
Collaborator

The issue appears to be that when `cabal exec -- cabal test` runs, an environment file is in scope. This gives the .so more dependencies than it had initially (since more -package arguments are passed). cabal then sets the rpath, but only for the actual dependencies, not for everything in the environment file.

When the .so is then loaded, the extra shared library cannot be found because it is not on the rpath.
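
To check this diagnosis on a built artifact, one can compare the rpath entries recorded in the shared object with the libraries the loader fails to resolve. The following is a minimal sketch, assuming a Linux system with binutils' `readelf`. The helper name `rpath_entries` and the `SO` variable are hypothetical; `SO` has to be pointed at whichever .so or executable fails to load.

```shell
# Print the RPATH/RUNPATH entries found in `readelf -d` output (read from
# stdin), one directory per line. `rpath_entries` is a hypothetical helper.
rpath_entries() {
  sed -n 's/.*(R\(UN\)\{0,1\}PATH).*\[\(.*\)]/\2/p' | tr ':' '\n'
}

# Usage (SO is an assumed variable naming the failing shared object):
#   readelf -d "$SO" | rpath_entries    # directories the loader will search
#   ldd "$SO" | grep 'not found'        # dependencies it cannot resolve
```

If a library reported as `not found` lives in a directory that is missing from the rpath list, that would be consistent with the explanation above.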

@brandonchinn178
Collaborator Author

Yes, I believe this test needs cabal exec because it shells out to ghc:

https://github.com/brandonchinn178/aeson-schemas/blob/main/test%2FTests%2FGetQQ.hs#L414

There might be other tests too. Removing cabal exec should cause failures.

@mpickering
Collaborator

The cabal_macros.h file changes for the same reason: the ghc invocation picks up the environment file, and hence extra -package arguments are passed to ghc.

@mpickering
Collaborator

@brandonchinn178 I can't reproduce with cabal-install at commit 5a21af6, so this is possibly already fixed. With 3.14.1.0 I see "configuration changed" as the reason it decided to rebuild.

Perhaps another way to implement this would be cabal exec -- $(cabal list-bin aeson-schemas-test)
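
As a sketch of how that suggestion might look in a CI script: build the test suite first, then run its executable directly inside `cabal exec`, so that no nested `cabal test` runs under the environment file. The function name `run_suite_in_env` and the `CABAL` override variable are hypothetical; `cabal build`, `cabal list-bin` and `cabal exec` are the only cabal commands used.

```shell
# CABAL may be overridden to select a particular cabal-install binary
# (assumed variable, not something cabal itself reads).
CABAL="${CABAL:-cabal}"

# Build a test suite, then run its executable inside `cabal exec`, so the
# tests still see the package environment but cabal does not rebuild under it.
run_suite_in_env() {
  suite="$1"
  "$CABAL" build "$suite" || return 1
  bin="$("$CABAL" list-bin "$suite")" || return 1
  "$CABAL" exec -- "$bin"
}

# Usage:
#   run_suite_in_env aeson-schemas-test
```

Note that running the binary directly skips whatever `cabal test` would otherwise add (for example test options from the project configuration), so this is a workaround rather than a drop-in replacement.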

The configuration changed message is just a generic message which says that ElaboratedConfiguredPackage is different somehow.

@mpickering
Collaborator

Good news is I can reproduce with a locally built 3.14.1.1 cabal-install.

@mpickering
Collaborator

mpickering commented Mar 10, 2025

The reason it thinks the configuration has changed is that when cabal test runs inside cabal exec, the path to happy is configured.

"/nix/store/2b5jf6vy42z8jinw0n023fld1b9xf5xm-ghc-9.  "/nix/store/2b5jf6vy42z8jinw0n023fld1b9xf5xm-ghc-9. 
10.1/bin/ghc"),                                      10.1/bin/ghc"),                                     
("ghc-pkg",                                          ("ghc-pkg",                                         
"/nix/store/2b5jf6vy42z8jinw0n023fld1b9xf5xm-ghc-9.  "/nix/store/2b5jf6vy42z8jinw0n023fld1b9xf5xm-ghc-9. 
10.1/bin/ghc-pkg-9.10.1"),                           10.1/bin/ghc-pkg-9.10.1"),                          
("haddock",                                          ("haddock",                                         
"/nix/store/2b5jf6vy42z8jinw0n023fld1b9xf5xm-ghc-9.  "/nix/store/2b5jf6vy42z8jinw0n023fld1b9xf5xm-ghc-9. 
10.1/bin/haddock-ghc-9.10.1"),                       10.1/bin/haddock-ghc-9.10.1"),                      
("happy",                                                                                                
"/tmp/nix-shell-891311-0/tmp.8XPSHwULXW/aeson-schem                                                      
as/new_store/ghc-9.10.1-inplace/happy-2.1.5-e-happy                                                      
-c05fa464c1d118eac060d2bb66deb7ee67bdbead053d4a468e                                                      
f31979c7e04537/bin/happy"),                                                                              
("hpc",                                              ("hpc",                                             
"/nix/store/2b5jf6vy42z8jinw0n023fld1b9xf5xm-ghc-9.  "/nix/store/2b5jf6vy42z8jinw0n023fld1b9xf5xm-ghc-9. 
10.1/bin/hpc-ghc-9.10.1"),                           10.1/bin/hpc-ghc-9.10.1"),                          
("hsc2hs",                                           ("hsc2hs",                                          
"/nix/store/2b5jf6vy42z8jinw0n023fld1b9xf5xm-ghc-9.  "/nix/store/2b5jf6vy42z8jinw0n023fld1b9xf5xm-ghc-9. 
10.1/bin/hsc2hs-ghc-9.10.1"),                        10.1/bin/hsc2hs-ghc-9.10.1"),                       
("ld",                                               ("ld",                                              
---                                                  ---                                                 
UnitId "first-class-families-0.8.1.0-5dad8c3d5d9563  UnitId "first-class-families-0.8.1.0-5dad8c3d5d9563 
576e1f44ad47c8184ecbe7c5145b991652b7f6c9cfb29edd5c"  576e1f44ad47c8184ecbe7c5145b991652b7f6c9cfb29edd5c" 
,                                                    ,                                                   
UnitId "hashable-1.5.0.0-bbf1f845af13b94075c573458a  UnitId "hashable-1.5.0.0-bbf1f845af13b94075c573458a 
b31dee318bda01c596ac2f915a9df45de776d3",             b31dee318bda01c596ac2f915a9df45de776d3",            
UnitId "megaparsec-9.7.0-627834fd7888b070e48135ff5d  UnitId "megaparsec-9.7.0-627834fd7888b070e48135ff5d 
2f9de070a4a86da0683b0b419a2044853aeefd",             2f9de070a4a86da0683b0b419a2044853aeefd",            
UnitId "template-haskell-2.22.0.0-inplace",          UnitId "template-haskell-2.22.0.0-inplace",         
UnitId "text-2.1.1-inplace",                         UnitId "text-2.1.1-inplace",                        
UnitId "unordered-containers-0.2.20-514b90ab4cd1751  UnitId "unordered-containers-0.2.20-514b90ab4cd1751 
972e07befc98d14e821b236a1f1757c58e46c6fd38d46e31c"]  972e07befc98d14e821b236a1f1757c58e46c6fd38d46e31c"] 
})},                                                 })})    

@mpickering
Collaborator

Therefore I think this is the same issue as #10692 which was fixed by #10731.

However, the real issue is arguably that the GHC called by cabal-install is affected by the existence of environment files. cabal-install already has logic to avoid being affected by GHC_PACKAGE_PATH, so it should also be isolated from the existence of environment files.
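
To illustrate what such isolation could mean at the level of a single compiler call: GHC documents that `-package-env -` makes it read no package environment at all. The wrapper below is only a sketch of that idea and is not a description of how cabal-install or #10828 implements it; the function name `isolated_ghc` and the `GHC` override variable are hypothetical.

```shell
# Invoke GHC without reading any package environment file.
# GHC may be overridden to point at a specific compiler (assumed variable).
isolated_ghc() {
  "${GHC:-ghc}" -package-env - "$@"
}

# Usage:
#   isolated_ghc --make Main.hs
```

With a wrapper like this, the set of visible packages depends only on the explicit flags passed by the caller, regardless of whether `GHC_ENVIRONMENT` is set in the surrounding shell.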

@brandonchinn178
Collaborator Author

I think that makes sense, thanks for investigating! Are there any plans for 3.14.1.2?

mpickering added a commit that referenced this issue Mar 11, 2025
Issue #10759 highlighted the issue that we were not isolating the calls
to ghc from the existence of environment files.

This manifested in a terminal bug where extra arguments from the
environment file caused a link failure, in combination with #10692.

However, even before this bug the test executable was relinked due to
the extra flags from the environment file.

```
Building test suite 'aeson-schemas-test' for aeson-schemas-1.4.2.1...
Loaded package environment from /home/runner/work/aeson-schemas/aeson-schemas/dist-newstyle/tmp/environment.-69233/.ghc.environment.x86_64-linux-9.6.6
Loaded package environment from /home/runner/work/aeson-schemas/aeson-schemas/dist-newstyle/tmp/environment.-69233/.ghc.environment.x86_64-linux-9.6.6
[23 of 23] Linking /home/runner/work/aeson-schemas/aeson-schemas/dist-newstyle/build/x86_64-linux/ghc-9.6.6/aeson-schemas-1.4.2.1/t/aeson-schemas-test/build/aeson-schemas-test/aeson-schemas-test [Flags changed]
```

The correct solution is that calls to `ghc` made by `Cabal` should never
implicitly use an environment file. This is similar to how
`GHC_PACKAGE_PATH` is treated.

Fixes #10759
mpickering added a commit that referenced this issue Mar 11, 2025
The test tries to run `cabal` in an environment where `GHC_ENVIRONMENT`
is set, and checks that the compilation of a simple package isn't
affected by the variable being set.
@Mikolaj
Member

Mikolaj commented Mar 13, 2025

@brandonchinn178: we still have quite a lot of regressions not fixed or not backported or not reviewed, so we are thinking about how soon to try to release 3.14.2. What are your thoughts? Is a new release urgent for you?

BTW, is a fix for this ticket ready? If so, could the kind author of the fix prepare it (including a backport) for the minor release? Which PR is it?

@brandonchinn178
Collaborator Author

The PR that fixes this is #10731. @mpickering could you backport for 3.14.2?

New release is not urgent.

@mpickering
Collaborator

I have put a "proper fix", which I would not advise backporting, in #10828

@Mikolaj
Member

Mikolaj commented Mar 17, 2025

How about the non-proper fix? Is it a good idea to backport it for a bugfix 3.14 release? @mpickering @brandonchinn178

@mpickering
Collaborator

Yes, it is very important to backport #10731, as it fixes another reported regression.

@Mikolaj
Member

Mikolaj commented Mar 17, 2025

Got it. PR marked. I will leave the actual backporting to @Kleidukos, because order of backports matters.

@ulysses4ever
Collaborator

Is this bug considered fixed by #10731? In that case, I suggest closing this regression issue for PR (as in "public relations") reasons. The "proper" fix has already been submitted in #10828 and will hopefully make it into 3.16 regardless of the status of this issue (closed or open), so I don't see a reason to keep it open.

Also, it'd be good if someone double-checked that cabal-head no longer exhibits this issue (after #10731 was merged). @brandonchinn178, maybe?

@brandonchinn178
Collaborator Author

@ulysses4ever
Collaborator

@brandonchinn178 awesome, thank you! (That "Build cabal-head" in the CI hit me in the gut as the person who spearheaded the distribution of cabal-head binaries, but then I saw that it does download the binaries, not "build" them 😅)

Mikolaj pushed a commit that referenced this issue Mar 26, 2025
The test tries to run `cabal` in an environment where `GHC_ENVIRONMENT`
is set, and checks that the compilation of a simple package isn't
affected by the variable being set.
@mergify mergify bot closed this as completed in #10828 Mar 26, 2025
@mergify mergify bot closed this as completed in fb8bb05 Mar 26, 2025