feat: add GIT_AI_INTERNAL_DIR to isolate per-machine git-ai state on shared NFS homes by jwiegley · Pull Request #1650 · git-ai-project/git-ai

jwiegley · 2026-06-24T22:23:30Z

Summary

Adds a GIT_AI_INTERNAL_DIR environment variable so that multiple machines sharing an NFS home directory can each confine git-ai's volatile, lock-prone state to a machine-local directory — eliminating collisions on the SQLite databases, daemon sockets, and lock files when running git-ai (and its background daemon) concurrently across hosts.

Motivation: git-ai keeps per-user runtime state under ~/.git-ai/internal/ — four WAL-mode SQLite DBs (+ -wal/-shm sidecars), the daemon control.sock/trace2.sock/daemon.lock/daemon.pid.json/logs, plus distinct_id/update_check. WAL-mode SQLite, flock, and AF_UNIX sockets are all unsafe over NFS / across hosts, so a shared home means lock contention and potential database corruption.

Behavior

When GIT_AI_INTERNAL_DIR is set to a non-empty value it is git-ai's internal directory (verbatim — no internal segment appended), replacing the default ~/.git-ai/internal. Each host points it at a machine-local path. Unset / empty / whitespace-only preserves current behavior byte-for-byte.

Relocated when set: the 4 SQLite DBs (db, metrics-db, notes-db, transcripts-db), daemon sockets/lock/pid/logs, distinct_id, update_check, checkpoint-debug-logs, debug-self-checks, and the diagnostics trace2 self-check files.

Intentionally stays shared in ~/.git-ai (so login/config carry across machines): config.json, OAuth credentials, and skills/. Per-repo .git/ai/ working logs are unaffected (already per-repo).

Implementation

config::internal_dir_path() honors the var via a single git_ai_internal_dir_env_override() parser (empty / whitespace-only ⇒ unset).
The three DB modules that resolved their path directly off dirs::home_dir() (authorship/internal_db.rs, metrics/db.rs, notes/db.rs) now route through internal_dir_path(); their cfg(test) per-DB path overrides still take precedence. transcripts-db is centralized in DaemonConfig::transcripts_db_path(), used by both the daemon and the tests.
DaemonConfig::from_env_or_default_paths() makes GIT_AI_INTERNAL_DIR the highest-precedence internal-dir source (> GIT_AI_DAEMON_HOME > default); the explicit socket overrides still apply on top.
The detached daemon inherits the var by ordinary environment propagation (the spawn never env_clear()s nor sanitizes it), so the client, git proxy, and daemon resolve identical socket/lock/DB paths.
diagnostics.rs (trace2 self-check) and diagnostic_sentinels.rs (debug_self_check_root) route through internal_dir_path() too.
The test-only GIT_AI_TEST_COMPLETION_LOG_DIR daemon signal is cfg(any(test, feature = "test-support"))-gated so a production release binary cannot be coerced into daemon test mode by an env var (the test daemon is built with --features test-support, so it still honors it).

Tests

tests/integration/internal_dir.rs + per-module unit tests demonstrate the requirement on three levels, all grounded in the production resolvers (internal_dir_path(), DaemonConfig::from_default_paths(), each DB's database_path()) under each machine's GIT_AI_INTERNAL_DIR — so they fail if the variable is ever ignored — and hermetic against ambient per-DB test overrides:

Collision when unset / shared — identical resolved paths, a mutually-exclusive daemon.lock, shared DB files.
Isolation when set per machine — pairwise- and globally-disjoint paths for two and three machines, coexisting locks, isolated DB writes.
End-to-end concurrency — two and three real daemons (simulated hosts), each with its own GIT_AI_INTERNAL_DIR, driving real commit/authorship flows concurrently with correct line-level authorship on each; the collision case where a second daemon on a shared dir cannot start; and the detached-auto-spawn env-inheritance invariant (cross-platform, incl. Windows named pipes).
Per-DB path rerouting + override precedence; empty/whitespace handling.

Also includes a small unrelated fix: the api-key tests in config_fresh_test.rs now clear an ambient GIT_AI_API_KEY so they are hermetic (a pre-existing test-isolation gap surfaced during the full-suite run; it never affected CI's clean environment).

Verification

Full CI matrix green — Ubuntu, macOS, and all six Windows shards, plus Lint / Format / Doc / CodeQL / CLA / Snyk. Local: task build / task fmt / task lint clean; full suite 5151 tests pass.

🤖 Generated with Claude Code

https://claude.ai/code/session_01NQ4uZG1GGWWCTbND93jJuR

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.

When multiple machines share an NFS home directory, git-ai's per-user runtime state under ~/.git-ai/internal -- the four WAL-mode SQLite databases, the daemon control/trace sockets, daemon.lock, and pid file -- collides across hosts, risking lock contention and database corruption. Add a GIT_AI_INTERNAL_DIR environment variable: when set to a non-empty value it IS git-ai's internal directory (verbatim), replacing the default ~/.git-ai/internal. Each host points it at a machine-local path so the hosts never share a database, socket, or lock. Unset / empty / whitespace-only preserves the default behavior byte-for-byte. Implementation: - config::internal_dir_path() honors the var via a shared git_ai_internal_dir_env_override() parser (empty/whitespace => unset). - The three DB modules that resolved their path directly off dirs::home_dir() (authorship db, metrics-db, notes-db) now route through internal_dir_path(); their cfg(test) per-DB path overrides still win. - DaemonConfig::from_env_or_default_paths() makes GIT_AI_INTERNAL_DIR the highest-precedence internal-dir source (> GIT_AI_DAEMON_HOME > default); explicit socket overrides still apply on top. - The detached daemon inherits the var by ordinary environment propagation (the spawn never env_clear()s it nor sanitizes it), so the client, git proxy, and daemon resolve identical socket/lock/DB paths. - The diagnostics trace2 self-check and debug-self-check root now route through internal_dir_path() so all in-scope runtime state relocates. Credentials, config.json, and skills stay shared in ~/.git-ai by design, so login and configuration remain shared across machines. Tests (tests/integration/internal_dir.rs + per-module unit tests): - collision when unset/shared: identical resolved paths, a mutually exclusive daemon lock, and shared DB files; - isolation when set per machine: pairwise- and globally-disjoint paths for two AND three machines, coexisting locks, isolated DB writes; - end-to-end concurrency with two and three real daemons driving real attribution flows with correct line-level authorship after every commit; - env inheritance on detached daemon auto-spawn; - per-DB path rerouting and per-DB override precedence. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NQ4uZG1GGWWCTbND93jJuR

…ion resolution A fess honesty-audit flagged that two path-only integration tests did not exercise the real resolver: - test_internal_dir_shared_machines_resolve_identical_colliding_paths asserted f(x) == f(x) (determinism), not that the env var is honored. - test_internal_dir_distinct_db_writes_are_isolated wrote to the test's own hardcoded paths instead of resolving through internal_dir_path(), so it would have passed even if GIT_AI_INTERNAL_DIR were ignored. Rewrite both to drive the production resolvers internal_dir_path() and DaemonConfig::from_default_paths() under the env guard and assert the resolved paths equal the configured machine-local dir, so each test now fails if the variable is ever ignored. Also cover the previously untested whitespace-only value (claimed equivalent to unset) in test_internal_dir_empty_or_whitespace_value_is_treated_as_unset. Test-only change; build / fmt / lint clean, 14 integration + 9 unit tests pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NQ4uZG1GGWWCTbND93jJuR

test_config_fresh_picks_up_api_key_changes and test_api_context_picks_up_api_key_changes build a temp HOME with an empty config and assert api_key().is_none(), but Config resolves the API key from the GIT_AI_API_KEY env var ahead of the on-disk config. On a machine where that var is exported (e.g. a developer's shell) the ambient key leaks in and both tests fail; in CI's clean environment they pass. This is a pre-existing test-isolation gap, unrelated to the GIT_AI_INTERNAL_DIR work. Add an ApiKeyEnvGuard (mirroring the existing HomeEnvGuard) that clears GIT_AI_API_KEY for the duration of these two tests and restores it on drop, so they exercise file-config behavior hermetically regardless of the ambient environment. Verified: both tests now pass with GIT_AI_API_KEY set in the environment; build / fmt / lint clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NQ4uZG1GGWWCTbND93jJuR

…(Windows named pipes) test_internal_dir_env_is_used_verbatim_for_all_runtime_paths asserted that every resolved runtime path is "under the internal dir, or the Unix hashed temp fallback". On Windows the daemon control/trace sockets are named pipes (\\.\pipe\git-ai-<hash>-...) -- derived from a hash of the internal dir but not filesystem paths under it -- so the predicate was false and the test failed deterministically on Windows (CI: Test on windows-latest integration 4/6, 592 passed / 1 failed). This was previously and incorrectly attributed to the unrelated 0xC0000142 process-spawn flake seen in shard 6/6. Accept the Windows named-pipe form as a third valid "derived from the internal dir" shape. The `== canonical` assertions just above already prove the client and daemon agree on the exact derived paths on every platform; this predicate is the looser sanity check that needed the Windows case. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NQ4uZG1GGWWCTbND93jJuR

…production resolvers Addresses three fess-audit findings. #2 (production safety): gate the GIT_AI_TEST_COMPLETION_LOG_DIR test-sync signal behind cfg(any(test, feature = "test-support")) in daemon.rs, so a production RELEASE binary (built without test-support) can no longer be flipped into daemon test mode by an env var. The test daemon is built with --features test-support, so it still honors the signal (verified: e2e + default-mode completion-log sync all pass). Extracted the completion-log-dir resolution into resolve_test_completion_log_dir() (returns None in production). #3 (vacuous / fixture-drift tests): the verbatim test asserted db_paths(&dir)[0] == dir.join("db") -- the test comparing its own arithmetic to itself. Add #[cfg(feature = "test-support")] database_path_for_test() wrappers to InternalDatabase / MetricsDatabase / NotesDatabase and resolve all test DB and runtime paths through the real production resolvers (database_path() + DaemonConfig::from_default_paths()) under each machine's GIT_AI_INTERNAL_DIR, hermetic against ambient per-DB test overrides. The collision / isolation / disjointness tests (two AND three machines) are now env-grounded: they fail if the variable is ever ignored. #4 (fragile / doc): tighten the ApiKeyEnvGuard doc; replace the fragile `git-ai usage` side-effect metrics-db assertion in the e2e tests with the now-deterministic verbatim/unit-test routing proof; fix module-doc drift. build / fmt / lint clean; 14 integration + 9 unit internal_dir tests pass; daemon regression smoke green (one pre-existing 500ms-timeout timing flake, daemon_partial_trace_line_..., passes 3/3 in isolation). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NQ4uZG1GGWWCTbND93jJuR

…-db e2e witness Two residuals from the adversarial review of cd85206: - transcripts-db was still self-compared: the test reconstructed the literal internal_dir.join("transcripts-db") and asserted it equals the same literal, so a production rename would not be caught (unlike the other three DBs, which route through database_path()). Centralize the literal in DaemonConfig::transcripts_db_path(), used by BOTH the daemon's open path (daemon.rs) and the test helpers, so the assertion now fails on a rename. - Restore one end-to-end metrics-db check (single-machine e2e) via `git-ai usage` as the subprocess-env-propagation witness, with a comment documenting the dependency. The deterministic routing proof remains in the verbatim/unit tests, so this is no longer the sole (fragile) proof -- it adds back the client-subprocess coverage the prior commit dropped. build / fmt / lint clean; 14 integration + 9 unit internal_dir tests pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NQ4uZG1GGWWCTbND93jJuR

jwiegley · 2026-07-01T17:34:53Z

This stack of pull requests is managed by Graphite. Learn more about stacking.

devin-ai-integration

Devin Review found 0 new potential issues.

devin-ai-integration Bot reviewed Jun 24, 2026

View reviewed changes

jwiegley requested a review from svarlamov June 24, 2026 22:44

jwiegley and others added 6 commits July 1, 2026 17:34

jwiegley force-pushed the johnw/git-ai-dir branch from a5c2a9f to faf840d Compare July 1, 2026 17:34

devin-ai-integration Bot reviewed Jul 1, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: add GIT_AI_INTERNAL_DIR to isolate per-machine git-ai state on shared NFS homes#1650

feat: add GIT_AI_INTERNAL_DIR to isolate per-machine git-ai state on shared NFS homes#1650
jwiegley wants to merge 6 commits into
mainfrom
johnw/git-ai-dir

jwiegley commented Jun 24, 2026 •

edited

Loading

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

jwiegley commented Jul 1, 2026

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

jwiegley commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Behavior

Implementation

Tests

Verification

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

✅ Devin Review: No Issues Found

Uh oh!

jwiegley commented Jul 1, 2026

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

jwiegley commented Jun 24, 2026 •

edited

Loading