The monorepo rewrite story

A short summary of what changed in v1

Issue with initial proposal: https://github.com/apify/apify-shared-js/issues/131
PR with implementation: https://github.com/apify/apify-shared-js/pull/137
Additional (more practical) notes can also be found in CONTRIBUTING.md.
Example PRs of upgrade to v1:

Lerna with NPM 7 workspaces

we could have also gone with yarn, which has better support for workspaces and lock files (e.g. local packages are not part of the lock file)
lerna also has its own way of handling workspaces (lerna bootstrap), but that results in each package having its own lock file
using NPM workspaces means that NPM 7 is required for installing
the root package.json is marked as private and is used mostly for dev dependencies, as those are shared across all packages
child packages live in the packages folder, each having its own package.json and TS configs
lerna handles running commands in topological order (based on how the child packages depend on each other)
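
To make "topological order" concrete, here is a minimal sketch of how such an ordering can be computed from the dependencies between local packages. It is an illustration only, not lerna's actual implementation; the function name `topologicalOrder` and the package names are made up for this example.

```typescript
// Hypothetical sketch (not lerna's code): order packages so that each one
// comes after the local packages it depends on (Kahn's algorithm).
type DependencyGraph = Record<string, string[]>;

function topologicalOrder(graph: DependencyGraph): string[] {
    const names = Object.keys(graph);
    // Number of not-yet-processed local dependencies per package.
    const pending: Record<string, number> = {};
    // Reverse edges: dependency name -> packages that depend on it.
    const dependents: Record<string, string[]> = {};
    for (const name of names) dependents[name] = [];

    for (const name of names) {
        // Ignore external dependencies and duplicates; only local packages matter here.
        const localDeps = graph[name].filter(
            (dep, index, all) => names.indexOf(dep) !== -1 && all.indexOf(dep) === index,
        );
        pending[name] = localDeps.length;
        for (const dep of localDeps) dependents[dep].push(name);
    }

    // Packages without local dependencies can be processed first.
    const ready = names.filter((name) => pending[name] === 0).sort();
    const order: string[] = [];
    while (ready.length > 0) {
        const current = ready.shift() as string;
        order.push(current);
        for (const dependent of dependents[current]) {
            pending[dependent] -= 1;
            if (pending[dependent] === 0) ready.push(dependent);
        }
    }

    if (order.length !== names.length) {
        throw new Error('Dependency cycle detected between local packages');
    }
    return order;
}

// Made-up package names, for illustration only.
const exampleGraph: DependencyGraph = {
    '@apify/example-c': ['@apify/example-a', '@apify/example-b'],
    '@apify/example-b': ['@apify/example-a'],
    '@apify/example-a': [],
};
const buildOrder = topologicalOrder(exampleGraph);
```

With this graph, `example-a` is ordered before `example-b`, and both come before `example-c`, which is the order a build command would need to follow.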

TypeScript support

to allow requiring local packages without the need to compile, we need to set up paths mapping - this is done in the root tsconfig.json, which is then extended in all the packages

we have two TS configs for each package (and two of them at the root level), one for the general usage/development (e.g. IDE support), one for building

{
  "extends": "./tsconfig.build.json",
  "compilerOptions": {
    "baseUrl": ".",
    "paths": {
      "@apify/*": ["packages/*/src"] // <== here we let TS know that requires to `@apify/...` should be mapped to local files in `packages/*/src`
    }
  }
}

this is the main difference between the general tsconfig.json and tsconfig.build.json - in the build context we want to maintain the requires to @apify/... packages, only during development we want to use the paths mapping

Jest setup

similarly to how we need to let the TS compiler know about the paths mapping, we need to adjust how jest is configured too:

module.exports = {
    // ...
    preset: 'ts-jest',
    moduleNameMapper: {
        '@apify/(.*)': '<rootDir>/packages/$1/src',
    },
    globals: {
        'ts-jest': {
            tsconfig: 'test/tsconfig.json',
        },
    },
};

we use a custom tsconfig.json for tests, as the files inside the test folder do not belong to the root one
if any of the packages uses a nonstandard compiler option (in our case, the input_schema package uses resolveJsonModule: true), we need to enable that option in the test TS config too
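
The effect of the `moduleNameMapper` entry above can be illustrated with a small standalone function. This is a simplified sketch, not Jest's resolver (which also handles file extensions, index files and more); `mapModuleName` and the package name `@apify/example` are hypothetical.

```typescript
// Simplified illustration of a moduleNameMapper-style rewrite.
// Assumption: a matching request is replaced by the mapped target, with
// `<rootDir>` and `$1` substituted; everything else is left untouched.
function mapModuleName(request: string, rootDir: string): string {
    const pattern = /@apify\/(.*)/;
    const target = '<rootDir>/packages/$1/src';
    const match = pattern.exec(request);
    if (match === null) return request; // not a local package
    const packageName = match[1];
    return target
        .replace('<rootDir>', () => rootDir)
        .replace('$1', () => packageName);
}

// Hypothetical inputs, for illustration only.
const localModule = mapModuleName('@apify/example', '/repo');
const externalModule = mapModuleName('some-external-package', '/repo');
```

Here `localModule` points into `packages/example/src` under the root directory, while `externalModule` keeps its original name and is resolved from node_modules as usual.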

Lock files

due to some issues in NPM and GH actions, we needed to use lock files to be able to install in the CI environment
this allows for proper node_modules caching, which results in a fast install step if no dependencies changed
it also allows splitting the pipeline into multiple steps (e.g. build/test/lint) that share the installed dependencies

Reworked CI pipeline and Publishing

previously there were two very similar workflows, one for PRs and the other for commits to master that were automatically publishing beta releases
given the nature of this repository, we wanted to ship stable releases right away
to be able to generate changelogs and create GH releases automatically, we use conventional commits, so each commit message tells us what type of change it makes (e.g. fix -> patch bump, feat -> minor bump, breaking change -> major bump)
we use lerna publish to handle the orchestration of the release
- checks for changes in packages
- decides what version bump to use
- computes changelogs
- creates the GH releases
- publishes the packages
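
The mapping from commit type to version bump mentioned above can be sketched as follows. This is a hedged illustration of the three rules named in the text, not the logic lerna actually runs; the function name `inferBump` is made up, and treating every other commit type as "no bump" is an assumption of this sketch that the real tooling may handle differently.

```typescript
type Bump = 'major' | 'minor' | 'patch' | 'none';

// Hypothetical helper: infer a version bump from a conventional commit message.
function inferBump(commitMessage: string): Bump {
    const lines = commitMessage.split('\n');
    // Header shape: <type>(<optional scope>)<optional !>: <description>
    const match = /^(\w+)(\([^)]*\))?(!)?: .+/.exec(lines[0]);
    if (match === null) return 'none'; // not a conventional commit header

    // A breaking change is marked by `!` in the header or by a BREAKING CHANGE footer.
    const breaking = match[3] === '!'
        || lines.some((line) => /^BREAKING[ -]CHANGE: /.test(line));
    if (breaking) return 'major';
    if (match[1] === 'feat') return 'minor';
    if (match[1] === 'fix') return 'patch';
    return 'none'; // assumption: other types do not trigger a bump in this sketch
}

// Made-up commit messages, for illustration only.
const bumps = [
    inferBump('fix: handle empty input'),
    inferBump('feat(utils): add helper'),
    inferBump('feat!: drop old API'),
];
```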

Build step

we have an NPM script in the root package.json: npm run release
it first builds the packages via lerna run build, which calls npm run build in each package, in topological order
each package compiles the TS files from its src folder into its dist folder
afterwards the copy.ts script is executed; it copies package.json and other metafiles into the dist folder and fixes the paths inside them
we then publish only the contents of the dist folder
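
To illustrate what "fixing paths" can mean when package.json is moved into the dist folder, here is a minimal sketch. It is not the actual copy.ts script: the function `fixManifestPaths`, the choice of the `main` and `types` fields, and the sample manifest are all assumptions made for this example.

```typescript
// Hypothetical subset of a package.json manifest.
interface Manifest {
    name: string;
    version: string;
    main?: string;
    types?: string;
    [key: string]: unknown;
}

// Once package.json lives inside `dist`, entry points must no longer
// include the `dist/` prefix, so this helper strips it.
function fixManifestPaths(manifest: Manifest, outDir = 'dist'): Manifest {
    const prefix = new RegExp(`^(\\./)?${outDir}/`);
    const strip = (value: string): string => value.replace(prefix, './');
    const fixed: Manifest = { ...manifest }; // do not mutate the input
    if (typeof fixed.main === 'string') fixed.main = strip(fixed.main);
    if (typeof fixed.types === 'string') fixed.types = strip(fixed.types);
    return fixed;
}

// Made-up manifest, for illustration only.
const fixedManifest = fixManifestPaths({
    name: '@apify/example',
    version: '1.0.0',
    main: 'dist/index.js',
    types: './dist/index.d.ts',
});
```

After the rewrite, `main` and `types` are relative to the dist folder itself, which is what a consumer sees when only the contents of dist are published.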

Canary builds vs stable builds

usually the CI is used for shipping canary (dev) builds, but that means only publishing new versions to NPM
here we wanted to ship stable builds, which also involves committing to the repository from CI
to allow pushing new commits, we need to use a GH personal access token (plus, obviously, an NPM publishing token)
- the token needs to belong to a user with admin rights to the repository if we want to push to a protected branch (especially if we have required reviews enforced)

to allow lerna to compute correct changelogs, we need to fetch the whole repository with all tags:

-   uses: actions/checkout@v2
    with:
        token: ${{ secrets.GH_TOKEN }}
        fetch-depth: 0 # we need to pull everything to allow lerna to detect what packages changed
        ref: master

we need to make sure that the commit made from CI does not trigger another CI build, otherwise we would end up in an infinite loop
- this is done by checking the commit message for the [skip ci] fragment
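
The `[skip ci]` check described above boils down to a simple substring test. The sketch below expresses it as a function purely for illustration; where exactly the check lives in the workflow is not shown here, and the function name and commit messages are hypothetical.

```typescript
// Marker that the release commit created by CI carries in its message.
const SKIP_CI_MARKER = '[skip ci]';

// Returns true when a pipeline run should proceed for the given commit message.
function shouldRunPipeline(commitMessage: string): boolean {
    return commitMessage.indexOf(SKIP_CI_MARKER) === -1;
}

// Made-up commit messages, for illustration only.
const releaseCommit = `chore(release): publish ${SKIP_CI_MARKER}`;
const regularCommit = 'fix: handle empty input';
```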

Shared workflow

instead of two very similar workflows, we now have a single one that handles all of build/test/lint/publish jobs
the publish job is conditional (master branch only) and depends on all the previous jobs, which otherwise run in parallel
only in the publish job we fetch the whole git repository (fetch-depth: 0)

Root changelog

we decided to use independent versioning mode in lerna, which means that each package has its own independent version, and version bumps are calculated separately for each package
due to this, there is no single changelog with entries for everything, and there is not even a shared version to track
each package has its own changelog in its package folder
the root changelog is instead generated automatically after each successful publish
it contains a list of all packages, their versions, and links to their changelogs
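
A generated root changelog of this kind can be sketched as below. This is only an illustration of the idea (package name, version, link to the package changelog); the function name `renderRootChangelog`, the output layout, and the sample packages are assumptions, not the format the repository actually produces.

```typescript
// Hypothetical description of a published package.
interface PackageInfo {
    name: string; // package name from package.json
    version: string; // current version after publishing
    folder: string; // folder name under packages/
}

// Render an index that lists every package with its version and a link
// to the changelog in its package folder.
function renderRootChangelog(packages: PackageInfo[]): string {
    const rows = packages
        .slice() // keep the input array untouched
        .sort((a, b) => a.name.localeCompare(b.name))
        .map((pkg) => `- ${pkg.name} ${pkg.version}: [changelog](packages/${pkg.folder}/CHANGELOG.md)`);
    return ['# Changelog', '', ...rows, ''].join('\n');
}

// Made-up packages, for illustration only.
const rootChangelog = renderRootChangelog([
    { name: '@apify/example-b', version: '2.0.1', folder: 'example-b' },
    { name: '@apify/example-a', version: '1.2.3', folder: 'example-a' },
]);
```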

Root `package-lock.json`

unlike yarn, NPM includes the local packages in its lock file
unfortunately, lerna won't update the root lock file when publishing
we need to run npm install again after a successful publish and commit those changes, otherwise we would end up with an outdated lock file
we handle this in the very same commit as the one that updates the root changelog file

PR title check

we have merge commits disallowed and linear commit history enforced
PRs should be squash merged, resulting in a single commit in master
the commit message (which is used for inferring version bumps and changelogs) is taken from the PR title
to validate the commit message format, we therefore need to validate the PR title; this is done via the action-semantic-pull-request action, used in a separate workflow

Commit hooks to enforce conventional commits

we use husky to set up commit hooks
one hook checks the commit message format via the commitlint package
one hook runs the linter, but only on staged files/changes (git add)
