initrd: add TPM DA lockout gating, counter auth fix, bad_auth tooling and documentation#2117

Draft
tlaurion wants to merge 5 commits into master from tpm1_fixes

@tlaurion (Collaborator) commented May 14, 2026

Summary

Fixes a TPM1 counter auth regression introduced by PR #2068, which changed increment_tpm_counter from a hardcoded -pwdc '' (empty counter auth) to -pwdc "${tpm_passphrase:-}" (owner passphrase) while counters continued to be created with -pwdc "". Every increment therefore computed SHA1(owner_pass) against a counter created with SHA1(""), producing a persistent TPM_AUTHFAIL.

Per TCG TPM Main Spec Part 3, TPM_CreateCounter uses owner auth (-pwdo) but TPM_IncrementCounter uses the counter's own authData, not the owner password. The correct design for Heads' rollback counter is empty auth.
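To make the mismatch concrete, here is a minimal sketch that hashes the two passwords the way the summary describes. The helper name `auth_digest` and the passphrase value are hypothetical; the real Heads code hands the password to the `tpm` tool rather than hashing it itself.

```shell
# Illustration only: compare the authData of a counter created with
# -pwdc '' against what an increment using the owner passphrase presents.
auth_digest() {
	# SHA-1 over the exact bytes of the password, without a trailing newline.
	printf '%s' "$1" | sha1sum | cut -d' ' -f1
}

counter_auth="$(auth_digest '')"              # counter created with -pwdc ''
presented_auth="$(auth_digest 'owner-pass')"  # hypothetical owner passphrase

if [ "$counter_auth" != "$presented_auth" ]; then
	echo "auth mismatch: the increment would be rejected"
fi
```

Any non-empty owner passphrase gives a digest different from the empty-password digest, so the increment can never succeed until both sides use the same value.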

The repeated auth failures (3 per boot) triggered TPM 1.2 dictionary-attack lockout (TPM_DEFEND_LOCK_RUNNING), which persisted through forceclear on some implementations.

Also adds DA state diagnostics, a preflight guard, and testing tooling.

Changes

initrd/bin/tpmr.sh — auth fix + DA state + bad_auth

  • tpm1_counter_increment(): detect -pwdc '', call tpm directly (bypass _tpm_auth_retry)
  • tpm1_reset(): detect defend lock, cycle physical presence, retry takeown
  • tpm1_da_state(): TPM1 DA via TPM_CAP_DA_LOGIC, output DA: line
  • tpm2_da_state(): TPM2 DA via getcap properties-variable, unlock estimate
  • tpm1_bad_auth() / tpm2_bad_auth(): deliberate wrong-auth for testing
  • Add 'defend' and '0x98e|0x149' to auth detection patterns
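The pattern change in the last bullet can be sketched as follows. The helper name `is_auth_failure` is an assumption for illustration, and only the three tokens named in this PR are shown; the actual grep patterns in tpmr.sh match additional strings that are not reproduced here.

```shell
# Hypothetical helper: decide whether tool output looks like a retryable
# authorization failure. Only 'defend', '0x98e' and '0x149' come from
# this PR; the real patterns in tpmr.sh contain more alternatives.
is_auth_failure() {
	printf '%s\n' "$1" | grep -qiE 'defend|0x98e|0x149'
}

if is_auth_failure "Defend lock running"; then
	echo "treated as auth-related"
fi
```

Classifying the defend-lock message as auth-related keeps it on the retry path instead of the fatal-error path.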

initrd/etc/functions.sh — counter auth + DA preflight

Other

  • initrd/bin/tpm-reset.sh: TPM reset frontend
  • initrd/bin/oem-factory-reset.sh: -pwdc '' for consistency
  • doc/tpm.md: DA diagnosis, testing, escalation, physical presence

Copilot AI review requested due to automatic review settings May 14, 2026 19:09

Copilot AI left a comment

Pull request overview

This PR improves Heads’ TPM1 error handling by classifying the tpmtotp “Defend lock running” output as an authorization-related failure, so it no longer hard-fails immediately. It also adds a recovery path in tpm1_reset() that attempts to clear the TPM1 defend lock after forceclear by cycling physical presence.

Changes:

  • Extend auth-failure grep patterns to include defend (and unify inclusion of TPM2 auth hex codes in the shared retry helper).
  • Enhance tpm1_reset() to detect “defend lock” after takeown and retry after cycling physical presence.
  • Expand TPM documentation to describe tool selection, auth retry detection, and TPM1 defend-lock behavior.

Reviewed changes

Copilot reviewed 1 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| initrd/bin/tpmr.sh | Treat “defend lock” as auth-related and add TPM1 defend-lock recovery logic during reset. |
| doc/tpm.md | Document TPM toolchain selection and the updated auth retry / defend-lock behavior. |

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 2 changed files in this pull request and generated 2 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 2 changed files in this pull request and generated no new comments.

@notgivenby (Contributor)

The fix did not work…will copy logs.

@tlaurion (Collaborator, Author)

> The fix did not work…will copy logs.

Found the bug and where the regression comes from. Damn, that one was not easy. Pushing a fix and updating the other PR.

@tlaurion requested a review from Copilot May 16, 2026 01:08
@tlaurion changed the title from "initrd/bin/tpmr.sh: fix TPM1 auth failure detection and defend lock recovery" to "initrd: fix TPM1 counter auth regression and defend lock cascade failure" May 16, 2026

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated no new comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 4 changed files in this pull request and generated no new comments.

PR #2068 (tpm_reseal_ux-integrity_report-detect_disk_and_tpm_swap,
merged at d3d8053) changed increment_tpm_counter from hardcoded
-pwdc '' (empty counter auth) to -pwdc "${tpm_passphrase:-}" (owner
passphrase from cache/prompt), but left check_tpm_counter using empty
-pwdc when called from kexec-sign-config.sh without a $3 passphrase
argument.  This caused every counter increment to compute
SHA1(owner_pass) while the counter was created with SHA1(""),
producing a persistent TPM_AUTHFAIL.

Per TCG TPM Main Spec Part 3, TPM_CreateCounter uses owner auth
(-pwdo) but TPM_IncrementCounter uses the counter's own authData,
not the owner password.  The correct design for Heads' rollback
counter is empty auth: rollback security comes from the signed
/boot/kexec_rollback.txt and TPM sealing, not counter access control.

The repeated auth failures (3 per boot x ~5 boots via the
_tpm_auth_retry loop) triggered TPM 1.2 dictionary-attack lockout
(TPM_DEFEND_LOCK_RUNNING), which persisted through forceclear on
some implementations, causing tpm takeown to fail and TPM reset to
abort - a cascade failure from the counter auth mismatch.

Changes:
- initrd/bin/tpmr.sh (_tpm_auth_retry, tpm2_counter_inc, tpm2_seal,
  tpm1_seal): add 'defend' and '0x98e|0x149' to auth detection grep
  patterns so defend lock and TPM2 RC codes are treated as retryable
  auth failures rather than fatal errors
- initrd/bin/tpmr.sh (tpm1_reset): detect defend lock after takeown
  failure and cycle physical presence to clear the lock state before
  retrying; full AC power cycle remains the fallback if software
  presence is insufficient
- initrd/bin/tpmr.sh (tpm1_counter_increment): detect -pwdc '' and
  call tpm directly, bypassing _tpm_auth_retry which injected the
  owner passphrase.  Use || return to survive set -e on expected
  auth failure.
- initrd/etc/functions.sh (check_tpm_counter): pass -pwdc '' instead
  of -pwdc "${tpm_passphrase:-}" so counters use SHA1("") per TCG
  spec.  Document that $3 is intentionally ignored.
- initrd/etc/functions.sh (increment_tpm_counter): try -pwdc '' first
  for TPM1.  If that fails on a readable counter (created by PR #2068
  era code), prompt for owner passphrase and retry as migration
  fallback with clear WARN explaining the one-time migration and
  TPM reset option.
- initrd/etc/functions.sh (increment_tpm_counter): remove the
  TPM1-specific owner-passphrase prompt block added by PR #2068
- initrd/etc/functions.sh (increment_tpm_counter): DIE-path fallback
  counter_create: -pwdc '' for consistency
- initrd/bin/oem-factory-reset.sh: counter_create -pwdc '' for
  consistency with the empty-auth design
- doc/tpm.md: document TPM1 boot chain, tpmtotp tool selection,
  auth retry patterns, defend lock recovery, and physical presence

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…iew fixes

Add preflight dictionary attack (DA) lockout guard to
increment_tpm_counter, querying da_state before every counter
increment. DIE on active lockout, WARN when count nears threshold.
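
The guard's decision logic might look roughly like the sketch below. The function name, the argument order, and the "one failure left" warning margin are all assumptions made for illustration; the commit only states that the guard dies on active lockout and warns when the count nears the threshold.

```shell
# Hypothetical preflight decision: prints DIE, WARN or OK.
#   $1 = 1 if the TPM reports an active lockout, else 0
#   $2 = current failed-auth count
#   $3 = failure threshold that triggers lockout
da_preflight_decision() {
	if [ "$1" -eq 1 ]; then
		echo "DIE"    # active lockout: do not attempt the increment
	elif [ "$2" -ge $(( $3 - 1 )) ]; then
		echo "WARN"   # assumed margin: one more failure would lock the TPM
	else
		echo "OK"
	fi
}

da_preflight_decision 0 2 3   # prints WARN
```

Checking before the increment means a TPM that is already locked never receives another authorization attempt from this code path.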

Add tpm1_da_state and tpm2_da_state for unified DA state query:
  - TPM1: reads TPM_CAP_DA_LOGIC (0x19); locked when
    actionDependValue > 0 and state = 1; the timer= field of the
    DA: line maps to actionDependValue
  - TPM2: reads getcap properties-variable (no single DA query);
    locked when count >= max_auth; estimate =
    (counter - max_auth + 1) * interval; the DA: line carries
    timer= when locked and omits it when clean
  - Both output machine-parsable DA: line for the preflight guard
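
The TPM2 estimate formula above can be worked through with plain shell arithmetic. This is a sketch only: the function name and the sample values are hypothetical, and the real tpm2_da_state reads its inputs from getcap properties-variable.

```shell
# Hypothetical helper applying: estimate = (counter - max_auth + 1) * interval
#   $1 = failed-auth counter, $2 = max_auth threshold, $3 = interval (seconds)
# Prints the estimate only when locked (counter >= max_auth), else nothing.
tpm2_unlock_estimate() {
	if [ "$1" -ge "$2" ]; then
		echo $(( ($1 - $2 + 1) * $3 ))
	fi
}

tpm2_unlock_estimate 5 3 600   # (5 - 3 + 1) * 600 = 1800
```

With a counter below the threshold the function prints nothing, which mirrors the timer= field being absent from the DA: line on a clean TPM.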

Fix da_timer sed pattern: use sed -n with /p so da_timer stays
empty when DA: line has no timer= field (TPM2 count < threshold).
Without -n, non-matching lines echoed the full line as da_timer.
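
The difference between the two sed invocations is easy to reproduce. In this sketch the function names and the sample DA: lines are made up for illustration; only the timer= field name and the use of sed -n with the p flag come from the commit message.

```shell
# Fixed form: -n suppresses automatic printing and the p flag prints
# only lines where the substitution actually matched.
extract_da_timer() {
	printf '%s\n' "$1" | sed -n 's/.*timer=\([0-9][0-9]*\).*/\1/p'
}

# Buggy form: without -n, a non-matching line is echoed unchanged.
extract_da_timer_buggy() {
	printf '%s\n' "$1" | sed 's/.*timer=\([0-9][0-9]*\).*/\1/'
}

extract_da_timer "DA: count=5 max=3 timer=1800"   # prints 1800
extract_da_timer "DA: count=1 max=3"              # prints nothing
extract_da_timer_buggy "DA: count=1 max=3"        # prints the whole line
```

An empty result is what lets the caller distinguish "no timer reported" from a real timer value.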

Add tpm1_bad_auth and tpm2_bad_auth for testing:
  - Uses NV index auth (-P) not owner auth (-C o -P) for TPM2
    because NV auth failure produces TPM2_RC_AUTH_FAIL (0x98e) and
    increments LOCKOUT_COUNTER; owner auth may not increment on
    some TPM2 implementations
  - Intentionally wrong password TPM_DEFEND_LOCK_TEST_WRONG_PASSWORD
  - Shows DA state before and after; uses || true to survive set -e
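
The `|| true` guard in the last bullet can be sketched like this. The function names are hypothetical and `wrong_auth_attempt` is a stub that stands in for the deliberately failing TPM command.

```shell
# Stub standing in for the wrong-auth TPM call, which is expected to fail.
wrong_auth_attempt() {
	return 1
}

# The function body is a subshell so that set -e stays local to the demo.
bad_auth_demo() (
	set -e
	echo "DA state before"
	# Without '|| true' the expected failure would abort the run under
	# set -e before the second state report.
	wrong_auth_attempt || true
	echo "DA state after"
)

bad_auth_demo
```

Both state reports are printed even though the command in between fails, which is the point of the test tooling.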

Add tpm-reset.sh as a TPM reset frontend via tpmr.sh wrapper.

Add DA documentation in doc/tpm.md covering diagnosis, testing,
escalation, TPM1 vs TPM2 DA parameter configurability.

Review fixes:
  - Fix TPM2 property names: TPM_PT_ -> TPM2_PT_ prefix in doc/tpm.md
  - Remove misplaced design decision comment from tpm2_da_state
    (belongs in tpm2_bad_auth where it already exists)
  - Add DEBUG logging at every decision point across preflight guard
    and all DA state functions for runtime traceability
  - Document design decisions inline: timer logic, estimate formula,
    empty auth retry bypass, NV vs owner auth

Signed-off-by: Thierry Laurion <insurgo@riseup.net>

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 5 changed files in this pull request and generated 3 comments.

- Fix TPM2 time remaining table entry: the estimate is derived from
  LOCKOUT_COUNTER vs MAX_AUTH_FAIL times LOCKOUT_INTERVAL, not from
  LOCKOUT_RECOVERY (the timer that blocks lockout auth after a
  failure, not the time remaining until unlock)
- Reword migration WARN: 'older Heads version' not 'older firmware'
  (the migration case is caused by previous Heads code, not platform
  firmware)
- Remove fragile PR #2117 reference from preventing-future-lockouts
  section: describe the fix generically (restoring empty counter auth)
  so the doc is correct regardless of branch context

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
@tlaurion changed the title from "initrd: fix TPM1 counter auth regression and defend lock cascade failure" to "initrd: add TPM DA lockout gating, counter auth fix, bad_auth tooling and documentation" May 17, 2026
When TPM1 does not support TPM_CAP_DA_LOGIC (0x19), tpm getcapability
may print raw TSS error text to stdout instead of returning empty.
The empty-string check missed this because error text is non-empty.

Fix: check that the output contains the 'State' field (expected for
valid DA capability data) before echoing. If missing, return
'unavailable' and suppress the raw TSS garbage.
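
A minimal sketch of that check follows. The function name and the sample strings are hypothetical; the only detail taken from the commit message is that valid DA capability output contains the 'State' field.

```shell
# Hypothetical filter: pass through capability output only when it looks
# like valid DA data, otherwise report 'unavailable'.
da_output_or_unavailable() {
	if printf '%s\n' "$1" | grep -q 'State'; then
		printf '%s\n' "$1"
	else
		# Empty output and raw TSS error text both end up here.
		echo "unavailable"
	fi
}

da_output_or_unavailable "some raw TSS error text"   # prints unavailable
```

Testing for an expected field rather than for emptiness covers both failure shapes: no output at all, and error text printed to stdout.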

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
@tlaurion tlaurion marked this pull request as draft May 17, 2026 18:35
TPM_CAP_DA_LOGIC (0x19) was added late in TPM 1.2 spec rev 103.
Older Infineon TPMs (X230-era SLB9635/9645) and some Atmel chips
do not implement it and return TPM_BAD_MODE (exit 44).

Document this limitation in:
- tpm1_da_state function comment: specific TPM models affected
- increment_tpm_counter preflight guard: note that da_state may
  return unavailable on older TPM1
- doc/tpm.md: explain why some TPM1 hardware shows 'unavailable'

Signed-off-by: Thierry Laurion <insurgo@riseup.net>