fix(secrets): add standalone Telegram bot token detection rule#656

Draft
mstfash wants to merge 1 commit into main from
fix/telegram-bot-token-standalone-redaction

@mstfash mstfash commented Mar 11, 2026

Summary

Fixes secret redaction for Telegram bot tokens (e.g. 8797664862:AAF...v4) that are pasted or typed without surrounding config-style context.

Problem

The existing telegram-bot-api-token rule from gitleaks requires:

  1. A contextual keyword like telegr before the token
  2. An assignment operator (=, :, =>, etc.) between the keyword and the value

This means bare tokens are not redacted when:

  • Pasted directly into the input or ask_user custom input
  • Typed character-by-character without surrounding context like TELEGRAM_TOKEN=...
  • Embedded in free-form text without the word "telegram"

The generic-api-key rule cannot help either: the : separator in Telegram tokens is not matched by \w in its capture group, so the candidate value is split at the colon.

Why "sometimes works"

The gitleaks rule's keywords include generic terms like bot, token, and key. When any of those appears elsewhere in the input, the keyword pre-filter passes, but the regex itself still demands (?:telegr) in the prefix, so the rule only fires when the user writes something like telegram_token = <value>.
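
The two-stage behaviour described above can be sketched roughly as follows. This is a std-only illustration, not the gitleaks implementation: `passes_keyword_prefilter` and `has_telegram_context` are hypothetical helpers, and the context check is a deliberately simplified stand-in for the real rule's prefix regex.

```rust
/// Hypothetical keyword pre-filter: a rule is only evaluated when the input
/// contains at least one of its keywords (compared case-insensitively).
/// An empty keyword list means the rule is always evaluated.
fn passes_keyword_prefilter(input: &str, keywords: &[&str]) -> bool {
    if keywords.is_empty() {
        return true;
    }
    let lowered = input.to_lowercase();
    keywords.iter().any(|k| lowered.contains(&k.to_lowercase()))
}

/// Hypothetical, simplified context check: "telegr" must appear before the
/// token, with an assignment-like operator ('=' or ':') after the keyword.
fn has_telegram_context(input: &str, token: &str) -> bool {
    let token_start = match input.find(token) {
        Some(pos) => pos,
        None => return false,
    };
    let prefix = input[..token_start].to_lowercase();
    match prefix.find("telegr") {
        Some(kw) => prefix[kw..].contains('=') || prefix[kw..].contains(':'),
        None => false,
    }
}
```

With a made-up token in the text "my bot token is <token>", the pre-filter passes because of "bot" and "token", yet the context check fails, which matches the "sometimes works" symptom: the rule is evaluated but does not match.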

Solution

Added a new telegram-bot-token-standalone rule in additional_rules.toml that matches Telegram bot tokens purely by their distinctive format:

regex = '''\b([0-9]{5,16}:A[A-Za-z0-9_\-]{34})\b'''
keywords = []

  • Format-based detection: Matches the well-defined Telegram token structure (5-16 digit bot ID + : + A + 34 base64url chars) without requiring any surrounding context
  • keywords = []: Disables the keyword pre-filter so the rule runs against all input — acceptable because the regex is highly specific to Telegram's token format
  • Coexists with the original rule: The gitleaks telegram-bot-api-token rule is preserved for config-style detection; overlapping matches are deduplicated by the existing logic in redact_secrets()
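
For readers who want to see the token shape spelled out, here is a minimal std-only sketch of the same format check. It is an approximation written for illustration, not the code in this PR: the function names are hypothetical, and the tokenizer-based scan can be stricter than the regex in edge cases (for example `key:12345678:A...` is rejected here because the first colon splits off `key`).

```rust
/// Returns true if `candidate` has the shape described above:
/// 5-16 ASCII digits, ':', 'A', then exactly 34 characters from [A-Za-z0-9_-].
fn looks_like_telegram_bot_token(candidate: &str) -> bool {
    let (bot_id, secret) = match candidate.split_once(':') {
        Some(parts) => parts,
        None => return false,
    };
    let id_ok = (5..=16).contains(&bot_id.len())
        && bot_id.bytes().all(|b| b.is_ascii_digit());
    // 'A' plus 34 further characters gives 35 bytes in total.
    let secret_ok = secret.len() == 35
        && secret.starts_with('A')
        && secret
            .bytes()
            .skip(1)
            .all(|b| b.is_ascii_alphanumeric() || b == b'_' || b == b'-');
    id_ok && secret_ok
}

/// Scans free-form text by splitting on every character that cannot be part
/// of a token, then keeping the pieces that pass the format check.
fn find_telegram_bot_tokens(text: &str) -> Vec<&str> {
    text.split(|c: char| !(c.is_ascii_alphanumeric() || c == '_' || c == '-' || c == ':'))
        .filter(|piece| looks_like_telegram_bot_token(piece))
        .collect()
}
```

Because no keyword or assignment operator is consulted, a bare token inside a sentence is found the same way as one following `TELEGRAM_BOT_TOKEN=`, while a bot ID shorter than 5 digits or a secret not starting with `A` is rejected.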

Test plan

Added 5 test cases across 3 tests:

| Test | Validates |
| --- | --- |
| test_telegram_bot_token_standalone_detection | Bare token detection, in-sentence detection, rejects short bot IDs (<5 digits), rejects non-A prefix |
| test_telegram_bot_token_redaction_bare | Full redaction pipeline + round-trip restoration |
| test_telegram_bot_token_redaction_in_context | Token with TELEGRAM_BOT_TOKEN= prefix still detected |

cargo test -p stakpak-shared --lib -- telegram
# 3 passed, 0 failed
cargo test -p stakpak-shared --lib -- secrets::
# 54 passed, 0 failed (all existing tests unaffected)

Files changed

| File | Change |
| --- | --- |
| libs/shared/src/secrets/additional_rules.toml | New telegram-bot-token-standalone rule |
| libs/shared/src/secrets/gitleaks.rs | Test for detect_secrets() layer |
| libs/shared/src/secrets/mod.rs | Tests for full redact_secrets() pipeline |

The existing gitleaks telegram-bot-api-token rule requires contextual
keywords ("telegr") and an assignment operator before the token value.
This means bare tokens like 8797664862:AAF...v4 are not redacted when
pasted or typed without surrounding config-style context.

Add a new telegram-bot-token-standalone rule in additional_rules.toml
that matches Telegram bot tokens purely by their distinctive format
([0-9]{5,16}:A[A-Za-z0-9_-]{34}) with no keyword pre-filter, so they
are detected regardless of surrounding text.