Release: develop -> main #178

Open
github-actions[bot] wants to merge 9 commits into main from develop

Conversation

github-actions[bot] commented on May 7, 2026

Automatic Release PR

This PR was automatically created after changes were pushed to develop.

Commits: 1 new commit(s)

Checklist

  • Review all changes
  • Verify CI passes
  • Approve and merge when ready for production

…m (ARM64) (#177)

* ci: add lds-api docker pipeline + switch lnbitsapi to lightningdotspacecom (ARM64)

Adds:
- Dockerfile + .dockerignore at repo root for the lds-api NestJS service
- lds-api-{dev,prd}.yaml workflows that build linux/arm64 and push
  lightningdotspacecom/lds-api:{beta,latest} on push to {develop,main}

Updates:
- lnbitsapi-{dev,prd}.yaml: image renamed from dfxswiss/lnbitsapi:{latest,main}
  to lightningdotspacecom/lnbitsapi:{beta,latest}; build pinned to linux/arm64
  via QEMU + buildx; Node bumped from 16.x (EOL) to 18.x to match Dockerfile

The pre-existing api-{dev,prd}.yaml Azure App Service workflows are kept
intact for the migration window — they will be removed once the dfxprd LDS
stack is live.

The Docker Hub credential secrets (DOCKER_USERNAME / DOCKER_PASSWORD) must
be set to a token with write access to the lightningdotspacecom org before
the first build runs.

* ci(lds): align workflows with DFX convention (ARM-native, deploy step)

Aligns the lds-api and lnbitsapi build pipelines with the DFX house
style (cf. juicedollarcom/api, deurocom/api):

- runs-on: ubuntu-24.04-arm (native ARM, no QEMU)
- platforms: linux/arm64 (single arch, matches DFX servers)
- Deploy step after build: install cloudflared, SSH via Cloudflare Tunnel
  to dfxdev/dfxprd, invoke deploy.sh with the canonical service name
  (lds-api / lds-lnbitsapi). Matches the case-block added in
  DFXServer/server commit ba6fdf6.

PR test workflows (api-pr.yaml, lnbitsapi-pr.yaml) bumped from Node 16.x
(EOL) to Node 18.x to match the production Dockerfile.

Required secrets per repo (set on top of DOCKER_USERNAME/PASSWORD):
- DEPLOY_DEV_SSH_KEY, DEPLOY_DEV_SSH_KNOWN_HOSTS, DEPLOY_DEV_HOST, DEPLOY_DEV_USER
- DEPLOY_PRD_SSH_KEY, DEPLOY_PRD_SSH_KNOWN_HOSTS, DEPLOY_PRD_HOST, DEPLOY_PRD_USER

The DEV deploy will only succeed once dfxdev:~/lds/docker-compose.yaml is
in place (the skeleton is currently committed without a compose file). Until
then the build step still pushes to Docker Hub and the deploy step fails,
which is acceptable since the image is published either way.

* fix(docker): install python3 + build toolchain for node-gyp

The CI build of lds-api fails on 'npm ci' with:
  npm ERR! gyp ERR! stack Error: Could not find any Python installation to use

Several deps (solana, eth-signing-related packages) have native modules
that node-gyp builds at install time. node:18-alpine ships without
Python or a C/C++ toolchain, so install python3 + make + g++ before
the npm step.

* fix(docker,lnbitsapi): same python3 + toolchain fix as lds-api

The lnbitsapi image uses the same node:18-alpine base as the new lds-api
image, and depends on sqlite3 which has a native binding compiled by
node-gyp. Add the same python3 + make + g++ install step proactively so
the next push under infrastructure/lnbitsapi/** doesn't hit the same
build failure.

* fix(docker): copy pruned node_modules from builder stage

Stage 2 was running 'npm ci --omit=dev' from scratch, which triggers
node-gyp on native deps (solana/eth signers) and fails the same way
stage 1 did before — the runtime image base also lacks python/g++.

Fix: do 'npm prune --omit=dev' in the builder (drop dev-only deps from
the existing node_modules tree, keeping the already-compiled native
binaries) and COPY node_modules across to the final stage.

This avoids ever re-running node-gyp at runtime-image-build time and
keeps the runtime base small (no python/toolchain there).

* fix(db): make postgres SSL opt-in via SQL_SSL=true

The hardcoded { rejectUnauthorized: false } SSL config forces a TLS
handshake against the postgres host even when the server doesn't speak
SSL — which breaks the dfxdev/dfxprd setup where lds-api talks to a
local api-postgres container without SSL.

  Error: The server does not support SSL connections

Make it opt-in: SSL only when SQL_SSL=true (the Azure-hosted PostgreSQL
Flexible Server expects it; the new container-postgres does not).
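
For illustration, a minimal sketch of what the opt-in could look like in a
TypeORM/NestJS config. Only SQL_SSL comes from this commit message; the
helper name and the other SQL_* variable names are assumptions, not the
service's actual code:

```typescript
// Illustrative sketch only: every env var name except SQL_SSL is an assumption.
import { TypeOrmModuleOptions } from '@nestjs/typeorm';

export function buildPostgresOptions(): TypeOrmModuleOptions {
  const useSsl = process.env.SQL_SSL === 'true';

  return {
    type: 'postgres',
    host: process.env.SQL_HOST,
    port: Number(process.env.SQL_PORT ?? 5432),
    username: process.env.SQL_USERNAME,
    password: process.env.SQL_PASSWORD,
    database: process.env.SQL_DATABASE,
    // SSL is opt-in: the Azure PostgreSQL Flexible Server sets SQL_SSL=true,
    // while the local api-postgres container on dfxdev/dfxprd leaves it
    // unset and gets a plain, non-TLS connection.
    ssl: useSsl ? { rejectUnauthorized: false } : false,
  };
}
```
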
Adds GET /v1/thunderhub/health which proxies an HTTP check to the
upstream ThunderHub container and returns 200 {"up": true} when
reachable, 503 with the error otherwise.

Why: Uptime Kuma already monitors all 14 LDS services on dev.lightning.space
via lds-api-forwarded paths (/v1/lndhub/getinfo, /v1/swap/v2/...). ThunderHub
was the one service without coverage because it only listens on the internal
nginx port 6123 (127.0.0.1 on dfxdev) and is reachable to operators only via
the SSH-tunneled `thunderhub-remote` script in the server repo. Several
times now a tunnel reconnect has been the only signal that the underlying
container has been down for hours. The Kuma-consistent fix is to expose a
narrow health probe through the same lds-api forwarding pattern the other
13 monitors already use.

Configuration:
- New env var LIGHTNING_THUNDERHUB_URL (default: empty → 503).
  In the DFX server repo's compose, this should be set to
  http://thunderhub:3000 (intra-stack docker DNS, no SSL — same pattern
  as the existing LIGHTNING_LNBITS_*_URL after the recent HTTP-intra-stack
  switch in DFXServer/server PR #133).

Endpoint behavior:
- 200 { "up": true } when ThunderHub responds with any 2xx/3xx/4xx
  (validateStatus rules out 5xx), so login-redirect is treated as up.
- 503 { "up": false, "error": "..." } on connect failure / 5xx / timeout.
- 5s timeout — short enough that Kuma sees DOWN within one probe interval.

No auth on the endpoint by design: it returns only a boolean, no internal
state, no upstream response body. Safe to be public.
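
A rough sketch of such a probe in a NestJS controller, assuming plain axios,
a global /v1 route prefix, and the LIGHTNING_THUNDERHUB_URL variable
described above; the class and method names are illustrative, only the
timeout, status handling, and response shape follow the commit message:

```typescript
// Sketch of GET /v1/thunderhub/health (assumes a global "v1" prefix).
import { Controller, Get, ServiceUnavailableException } from '@nestjs/common';
import axios from 'axios';

@Controller('thunderhub')
export class ThunderhubController {
  @Get('health')
  async health(): Promise<{ up: boolean }> {
    const url = process.env.LIGHTNING_THUNDERHUB_URL;
    if (!url) {
      // Unconfigured (default empty) behaves like an unreachable upstream.
      throw new ServiceUnavailableException({ up: false, error: 'LIGHTNING_THUNDERHUB_URL not set' });
    }

    try {
      // 2xx/3xx/4xx all count as "up" (a login redirect still proves the
      // container answers); 5xx, timeouts and connect errors reject.
      await axios.get(url, { timeout: 5000, validateStatus: (status) => status < 500 });
      return { up: true };
    } catch (e) {
      throw new ServiceUnavailableException({ up: false, error: `${e}` });
    }
  }
}
```
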
LNBits generates LNURLs from the incoming request URL. In the Docker
setup, internal calls arrive as http://lnbits:5000/... which fails
LNURL validation (requires HTTPS or localhost).

Use Config.baseUrl (dev.lightning.space / lightning.space / localhost)
to set Host and X-Forwarded-Proto headers on all LNBits HTTP calls.
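
A minimal sketch of the header override on an LNBits call, assuming axios
and a BASE_URL-style setting standing in for Config.baseUrl; the helper
name and API-key header are illustrative, only the Host / X-Forwarded-Proto
idea comes from the commit message:

```typescript
// Sketch: present the public origin to LNBits so generated LNURLs point at
// the public host instead of the internal docker hostname.
import axios from 'axios';

const baseUrl = new URL(process.env.BASE_URL ?? 'https://dev.lightning.space');

export async function lnbitsGet<T>(path: string, apiKey: string): Promise<T> {
  const response = await axios.get<T>(`http://lnbits:5000${path}`, {
    headers: {
      'X-Api-Key': apiKey,
      Host: baseUrl.host,
      'X-Forwarded-Proto': baseUrl.protocol.replace(':', ''),
    },
  });
  return response.data;
}
```
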
The swap stats query selects cs.version from chainSwaps, but this
column does not exist in the Boltz database schema (verified against
both LDS fork and upstream BoltzExchange/boltz-backend v3.13.0).
This causes /v1/support/swaps to fail with "column cs.version does
not exist".
Chain IDs are fixed per chain (Ethereum=1, Polygon=137, etc.),
not deployment-specific secrets. The env vars contained stale
Goerli/Mumbai testnet IDs from the Azure DEV setup, causing the
Alchemy SDK to query deprecated testnet endpoints.

…186)

Alchemy gateway URLs follow a fixed pattern per chain
(e.g. https://eth-mainnet.g.alchemy.com/v2). The env vars
contained stale Goerli/Mumbai testnet URLs, causing
NETWORK_ERROR in the ethers.js provider used by
MonitoringService for ERC20 contract calls.

Only ALCHEMY_API_KEY remains as an env var (it's a secret).
Citrea keeps its env var (a non-Alchemy, deployment-specific RPC).
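
As a sketch of the approach taken in these two commits (fixed per-chain
constants instead of per-deployment env vars), assuming a hypothetical
Chain enum and map shape; the chain IDs and the Alchemy gateway pattern
themselves are public constants:

```typescript
// Illustrative constants only: the enum and map names are assumptions.
export enum Chain {
  ETHEREUM = 'Ethereum',
  POLYGON = 'Polygon',
}

// Chain IDs are protocol constants, not deployment secrets.
export const CHAIN_ID: Record<Chain, number> = {
  [Chain.ETHEREUM]: 1,
  [Chain.POLYGON]: 137,
};

// Alchemy gateways follow a fixed per-chain pattern; only the API key is
// environment-specific.
export const ALCHEMY_GATEWAY: Record<Chain, string> = {
  [Chain.ETHEREUM]: 'https://eth-mainnet.g.alchemy.com/v2',
  [Chain.POLYGON]: 'https://polygon-mainnet.g.alchemy.com/v2',
};

export function alchemyRpcUrl(chain: Chain): string {
  return `${ALCHEMY_GATEWAY[chain]}/${process.env.ALCHEMY_API_KEY}`;
}
```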