Commit be67e91

tbidne authored and Bodigrim committed
Generate package set on-the-fly
Instead of a library containing a manually written build-depends corresponding to a stackage snapshot, we now have an executable that queries stackage directly, and then uses the response to generate the desired cabal file. The executable then builds that project.

The executable also includes the ability to split the package set into smaller groups, where each group is built sequentially. This allows for scenarios where building the entire set at once is not feasible, at the cost of performance.

We also add 'postgresql-libpq' to linux/osx (requires postgres dep), and 'hfsevents' to osx.
1 parent 0cb9711 commit be67e91
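
As a rough illustration of the batching described in the commit message, the sketch below splits a package list into groups of at most `N` elements and walks the groups in order. It is not the code from this commit: the `batches` function and the package names are hypothetical, and the real executable obtains its package list from stackage and runs a build per group rather than printing.

```haskell
module Main (main) where

import Data.Foldable (for_)

-- | Split a list into consecutive groups of at most n elements.
-- A non-positive n is treated as "no batching": everything lands in one group.
-- (Hypothetical helper, not taken from the clc-stackage sources.)
batches :: Int -> [a] -> [[a]]
batches n xs
  | null xs = []
  | n <= 0 = [xs]
  | otherwise =
      let (g, rest) = splitAt n xs
       in g : batches n rest

main :: IO ()
main = do
  -- Illustrative package names only; the real tool downloads the list.
  let pkgs = ["aeson", "bytestring", "containers", "text", "vector"]
  -- Groups are processed one after another, mirroring the sequential builds.
  for_ (zip [1 :: Int ..] (batches 2 pkgs)) $ \(i, g) ->
    putStrLn ("group " <> show i <> ": " <> unwords g)
```

With a batch size of 2, the five sample packages fall into three groups, the last of which holds a single package. Treating a non-positive size as one large group is a design choice of this sketch, loosely matching the README's note that omitting `--batch` puts all packages into the same group.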

42 files changed (+3409, -2731 lines changed)

.gitattributes

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+*.golden -text

.github/workflows/ci.yaml

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+name: ci
+on:
+  push:
+    branches:
+      - master
+
+  pull_request:
+    branches:
+      - master
+
+  workflow_dispatch:
+jobs:
+  cabal:
+    strategy:
+      fail-fast: false
+      matrix:
+        os:
+          - "macos-latest"
+          - "ubuntu-latest"
+    runs-on: ${{ matrix.os }}
+    steps:
+      - uses: actions/checkout@v4
+      - uses: haskell-actions/setup@v2
+        with:
+          ghc-version: "9.8.2"
+      - name: Configure
+        run: |
+          cabal configure --enable-tests --ghc-options -Werror
+
+      - name: Build executable
+        run: cabal build clc-stackage
+
+      - name: Unit Tests
+        id: unit
+        run: cabal test unit
+
+      - name: Print unit failures
+        if: ${{ failure() && steps.unit.conclusion == 'failure' }}
+        run: |
+          cd test/unit/goldens
+
+          for f in $(ls); do
+            echo "$f"
+            cat "$f"
+          done
+
+      - name: Functional Tests
+        id: functional
+        run: cabal test functional
+
+      - name: Print functional failures
+        if: ${{ failure() && steps.functional.conclusion == 'failure' }}
+        run: |
+          cd test/functional/goldens
+
+          for f in $(ls); do
+            echo "$f"
+            cat "$f"
+          done

.gitignore

Lines changed: 6 additions & 1 deletion
@@ -1 +1,6 @@
-/dist-newstyle
+/bin
+/dist-newstyle
+/generated/cabal.project.local
+/generated/dist-newstyle
+/generated/generated.cabal
+/output

README.md

Lines changed: 80 additions & 16 deletions
@@ -2,7 +2,7 @@
 
 ## How to?
 
-This is a meta-package to facilitate impact assessment for [CLC proposals](https://github.com/haskell/core-libraries-committee). The package `clc-stackage.cabal` lists almost entire Stackage as `build-depends`, so that `cabal build` transitively compiles them all.
+This is a meta-package to facilitate impact assessment for [CLC proposals](https://github.com/haskell/core-libraries-committee).
 
 An impact assessment is due when
 
@@ -13,32 +13,96 @@ An impact assessment is due when
 The procedure is as follows:
 
 1. Rebase changes, mandated by your proposal, atop of `ghc-9.8` branch.
+
 2. Compile a patched GHC, say, `~/ghc/_build/stage1/bin/ghc`.
-3. `git clone https://github.com/Bodigrim/clc-stackage`, then `cd clc-stackage`.
-4. Run `cabal build -w ~/ghc/_build/stage1/bin/ghc --keep-going` and wait for a long time.
-    * On a recent Macbook Air it takes around 12 hours, YMMV.
-    * You can interrupt `cabal` at any time and rerun again later.
-    * Consider setting `--jobs` to retain free CPU cores for other tasks.
-    * Full build requires roughly 7 Gb of free disk space.
-5. If any packages fail to compile:
-    * copy them locally using `cabal unpack`,
-    * patch to confirm with your proposal,
-    * link them from `packages` section of `cabal.project`,
-    * return to Step 4.
-6. When everything finally builds, get back to CLC with a list of packages affected and patches required.
+
+3. `git clone https://github.com/haskell/clc-stackage`, then `cd clc-stackage`.
+
+4. Build the exe: `cabal install clc-stackage --installdir=./bin`.
+
+    > :warning: **Warning:** Use a normal downloaded GHC for this step, **not** your custom built one. Why? Using the custom GHC can force a build of many dependencies you'd otherwise get for free e.g. `vector`.
+
+5. Uncomment and modify the `with-compiler` line in [generated/cabal.project](generated/cabal.project) e.g.
+
+    ```
+    with-compiler: /home/ghc/_build/stage1/bin/ghc
+    ```
+
+6. Run `./bin/clc-stackage` and wait for a long time. See [below](#the-clc-stackage-exe) for more details.
+
+    * On a recent Macbook Air it takes around 12 hours, YMMV.
+    * You can interrupt `cabal` at any time and rerun again later.
+    * Consider setting `--jobs` to retain free CPU cores for other tasks.
+    * Full build requires roughly 7 Gb of free disk space.
+
+    To get an idea of the current progress, we can run the following commands
+    on the log file:
+
+    ```sh
+    # prints completed / total packages in this group
+    $ grep -Eo 'Completed|^ -' output/logs/current-build/stdout.log | sort -r | uniq -c | awk '{print $1}'
+    110
+    182
+
+    # combine with watch
+    $ watch -n 10 "grep -Eo 'Completed|^ -' output/logs/current-build/stdout.log | sort -r | uniq -c | awk '{print \$1}'"
+    ```
+
+7. If any packages fail to compile:
+
+    * copy them locally using `cabal unpack`,
+    * patch to confirm with your proposal,
+    * link them from `packages` section of `cabal.project`,
+    * return to Step 6.
+
+8. When everything finally builds, get back to CLC with a list of packages affected and patches required.
+
+### The clc-stackage exe
+
+Previously, this project was just a single (massive) cabal file that had to be manually updated. Usage was fairly simple: `cabal build clc-stackage --keep-going` to build the project, `--keep-going` so that as many packages as possible are built.
+
+This has been updated so that `clc-stackage` is now an executable that will automatically generate the desired cabal file based on the results of querying stackage directly. This streamlines updates, provides a more flexible build process, and potentially has prettier output (with `--batch` arg):
+
+![demo](example_output.png)
+
+In particular, the `clc-stackage` exe allows for splitting the entire package set into subset groups of size `N` with the `--batch N` option. Each group is then built sequentially. Not only can this be useful for situations where building the entire package set in one go is infeasible, but it also provides a "cache" functionality, that allows us to interrupt the program at any point (e.g. `CTRL-C`), and pick up where we left off. For example:
+
+```
+$ ./bin/clc-stackage --batch 100
+```
+
+This will split the entire downloaded package set into groups of size 100. Each time a group finishes (success or failure), stdout/err will be updated, and then the next group will start. If the group failed to build and we have `--write-logs save-failures` (the default), then the logs and error output will be in `./output/logs/<pkg>/`, where `<pkg>` is the name of the first package in the group.
+
+See `./bin/clc-stackage --help` for more info.
+
+#### Optimal performance
+
+On the one hand, splitting the entire package set into `--batch` groups makes the output easier to understand and offers a nice workflow for interrupting/restarting the build. On the other hand, there is a question of what the best value of `N` is for `--batch N`, with respect to performance.
+
+In general, the smaller `N` is, the worse the performance. There are several reasons for this:
+
+- The smaller `N` is, the more `cabal build` processes, which adds overhead.
+- More packages increase the chances for concurrency gains.
+
+Thus for optimal performance, you want to take the largest group possible, with the upper limit being no `--batch` argument at all, as that puts all packages into the same group.
+
+> [!TIP]
+>
+> Additionally, the `./output/cache.json` file can be manipulated directly. For example, if you want to try building only `foo`, ensure `foo` is the only entry in the json file's `untested` field.
 
 ## Getting dependencies via `nix`
+
 For Linux based systems, there's a provided `flake.nix` and `shell.nix` to get a nix shell
 with an approximation of the required dependencies (cabal itself, C libs) to build `clc-stackage`.
 
 Note that it is not actively maintained, so it may require some tweaking to get working, and conversely, it may have some redundant dependencies.
 
 ## Misc
 
-* Your custom GHC will need to be on the PATH to build the `stack` library i.e.
+* Your custom GHC will need to be on the PATH to build the `stack` library e.g.
 
   ```
-  export PATH=/path/to/custom/ghc/stage1/bin/:$PATH
+  export PATH=/home/ghc/_build/stage1/bin/:$PATH
   ```
 
-  Nix users can uncomment (and modify) this line in the `flake.nix`.
+  Nix users can uncomment (and modify) this line in the `flake.nix`.
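
To make clearer what the `grep | sort | uniq | awk` progress pipeline in the README change computes, here is a minimal Haskell sketch of the same counts. The `progress` function and the sample log lines are assumptions made for illustration; the exact shape of cabal's log output may differ. The sketch counts lines rather than individual matches, so it only approximates `grep -o` when a keyword appears more than once on a line.

```haskell
module Main (main) where

import Data.List (isInfixOf, isPrefixOf)

-- | Returns (completed, total) for a build log:
--   completed = lines mentioning "Completed"
--   total     = lines starting with " -" (entries of the printed build plan)
-- (Hypothetical helper, not part of clc-stackage.)
progress :: String -> (Int, Int)
progress logText =
  (count ("Completed" `isInfixOf`), count (" -" `isPrefixOf`))
  where
    ls = lines logText
    count p = length (filter p ls)

main :: IO ()
main = do
  -- Made-up sample resembling a cabal build log.
  let sample =
        unlines
          [ "In order, the following will be built:",
            " - foo-1.0 (lib) (requires build)",
            " - bar-2.1 (lib) (requires build)",
            "Completed    foo-1.0 (lib)"
          ]
  print (progress sample)
```

On the sample above, one of the two planned packages has completed, which corresponds to the two numbers the shell pipeline prints on separate lines.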

app/Main.hs

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+module Main (main) where
+
+import CLC.Stackage.Runner qualified as Runner
+import CLC.Stackage.Utils.Logging qualified as Logging
+import Data.Text qualified as T
+import Data.Time.LocalTime qualified as Local
+import System.Console.Terminal.Size qualified as TermSize
+import System.IO (hPutStrLn, stderr)
+
+main :: IO ()
+main = do
+  mWidth <- (fmap . fmap) TermSize.width TermSize.size
+
+  case mWidth of
+    Just w -> Runner.run $ mkLogger w
+    Nothing -> do
+      let hLogger = mkLogger 80
+      Logging.putTimeInfoStr hLogger False "Failed detecting terminal width"
+      Runner.run hLogger
+  where
+    mkLogger w =
+      Logging.MkHandle
+        { Logging.getLocalTime = Local.zonedTimeToLocalTime <$> Local.getZonedTime,
+          Logging.logStrErrLn = hPutStrLn stderr . T.unpack,
+          Logging.logStrLn = putStrLn . T.unpack,
+          Logging.terminalWidth = w
+        }

cabal.project

Lines changed: 20 additions & 40 deletions
@@ -1,44 +1,24 @@
-index-state: 2024-03-27T00:32:46Z
+index-state: 2024-10-11T23:26:13Z
 
 packages: .
 
-constraints:
-  al < 0,
-  alsa-pcm < 0,
-  alsa-seq < 0,
-  ALUT < 0,
-  btrfs < 0,
-  fft < 0,
-  flac < 0,
-  glpk-headers < 0,
-  hmatrix-gsl < 0,
-  hopenssl < 0,
-  hpqtypes < 0,
-  hsdns < 0,
-  hsndfile < 0,
-  HsOpenSSL < 0,
-  hw-kafka-client < 0,
-  jack < 0,
-  lame < 0,
-  lapack-ffi < 0,
-  lmdb < 0,
-  magic < 0,
-  mysql < 0,
-  nfc < 0,
-  pcre-light < 0,
-  postgresql-libpq < 0,
-  primecount < 0,
-  pthread < 0,
-  pulse-simple < 0,
-  rdtsc < 0,
-  regex-pcre < 0,
-  re2 < 0,
-  text-icu < 0,
+program-options
+  ghc-options:
+    -Wall -Wcompat
+    -Widentities
+    -Wincomplete-record-updates
+    -Wincomplete-uni-patterns
+    -Wmissing-deriving-strategies
+    -Wmissing-export-lists
+    -Wmissing-exported-signatures
+    -Wmissing-home-modules
+    -Wmissing-import-lists
+    -Wpartial-fields
+    -Wprepositive-qualified-module
+    -Wredundant-constraints
+    -Wunused-binds
+    -Wunused-packages
+    -Wunused-type-patterns
+    -Wno-unticked-promoted-constructors
 
-allow-newer:
-  aura:bytestring,
-  aura:time
-
-constraints: hlint +ghc-lib
-constraints: ghc-lib-parser-ex -auto
-constraints: stylish-haskell +ghc-lib
+optimization: 2
