Module Reference

All modules in src/gitlab_copilot_agent/, organized by architectural layer.

HTTP Ingestion Layer

`main.py`

Purpose: FastAPI application entrypoint, lifespan management, poller startup.

Key Functions:

lifespan(app: FastAPI) -> AsyncIterator[None]: Initialize telemetry, load settings, build AppContext, start pollers, and shut down gracefully on exit
health() -> dict[str, object]: Health check endpoint, includes GitLab poller status if enabled
config_reload(body: RenderedMap, request: Request) -> dict: Hot-reload project registry from new mapping JSON (requires X-Gitlab-Token auth)
_cleanup_stale_repos(clone_dir: str | None) -> None: Remove leftover mr-review-* dirs on startup
_create_executor(backend: str, settings: Settings | None) -> TaskExecutor: Factory for LocalTaskExecutor or RemoteTaskExecutor. Supports dispatch_backend="local" bypass.

Key Globals:

app: FastAPI: FastAPI application instance with lifespan and webhook router

Internal Imports: config/, app_context, telemetry/, gitlab_client, gitlab_poller, jira_client, jira_poller, gitlab_webhook, concurrency/, state, task_executor, coding_pipeline, git/, mapping_models, credential_registry, project_registry, dedup, events

Depended On By: Deployed as uvicorn entrypoint

`gitlab_webhook.py`

Purpose: FastAPI router for GitLab webhooks (merge_request, note). Routes note events to the unified discussion handler via @mention detection.

Key Functions:

webhook(request: Request, background_tasks: BackgroundTasks, x_gitlab_token: str | None) -> dict[str, str]: POST endpoint, validates the X-Gitlab-Token shared secret, dispatches to background handlers
_validate_webhook_token(received: str | None, expected: str) -> None: Constant-time token comparison using hmac.compare_digest
_process_review(request: Request, payload: MergeRequestWebhookPayload) -> None: Background task for MR review. Uses get_app_context() for typed access to settings, executor, credential_registry. Resolves per-project resolution_behavior from project registry. Calls run_pipeline(ReviewPipeline(...), ctx).
_is_agent_directed(payload: NoteWebhookPayload, agent_identity: AgentIdentity, request: Request) -> bool: Check if note @mentions the agent
_process_discussion(request: Request, payload: NoteWebhookPayload, agent_identity: AgentIdentity) -> None: Background task for discussion interactions. Uses get_app_context() for typed access. Resolves per-project resolution_behavior from project registry. Calls run_pipeline(DiscussionPipeline(...), ctx).
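
The token check described for _validate_webhook_token can be sketched as follows. This is a minimal illustration, not the module's code: the InvalidWebhookToken exception is hypothetical, and the real handler presumably raises an HTTP error instead.

```python
from __future__ import annotations

import hmac


class InvalidWebhookToken(Exception):
    """Hypothetical error type used only in this sketch."""


def validate_webhook_token(received: str | None, expected: str) -> None:
    """Reject the request unless the received token equals the expected secret."""
    if received is None:
        raise InvalidWebhookToken("missing X-Gitlab-Token header")
    # compare_digest runs in time independent of where the inputs differ,
    # so response timing does not leak how much of the secret matched.
    if not hmac.compare_digest(received.encode(), expected.encode()):
        raise InvalidWebhookToken("token mismatch")
```

A plain `==` comparison would return as soon as the first differing character is found, which is the timing side channel compare_digest is designed to avoid.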

Key Constants:

HANDLED_ACTIONS = frozenset({"open", "update", "reopen"}): MR actions that trigger review

Internal Imports: models, pipeline, review_pipeline, discussion_pipeline, discussion_models, metrics, app_context, project_registry

Depended On By: main.py (includes router)

`gitlab_poller.py`

Purpose: Background poller that discovers open MRs and @mention notes via GitLab API.

Key Classes:

GitLabPoller: Polls projects on interval, synthesizes webhook payloads, dispatches to handlers
- start() -> None: Start polling loop
- stop() -> None: Cancel polling task
- _poll_once() -> None: Single poll cycle (all projects, MRs, notes)
- _process_mr(project_id: int, mr: MRListItem) -> None: Dispatch MR review
- _process_notes(project_id: int, mrs: list[MRListItem]) -> None: Dispatch @mention interactions
- _watermark: str | None: ISO timestamp of last poll start (updated after each cycle)
- _failures: int: Consecutive failure count for exponential backoff
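
One way the consecutive failure count could drive exponential backoff is sketched below. The function name, the doubling factor, and the 300-second cap are assumptions for illustration; the poller's actual schedule is not documented here.

```python
def next_poll_delay(base_interval: float, failures: int, max_delay: float = 300.0) -> float:
    """Double the delay for each consecutive failure, capped at max_delay.

    Assumption: the cap and the factor of 2 are illustrative values.
    """
    if failures <= 0:
        # A successful cycle resets the counter, so polling resumes at the base interval.
        return base_interval
    return min(base_interval * (2 ** failures), max_delay)
```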

Key Functions:

_build_note_payload(note: NoteListItem, mr: MRListItem, project_id: int, settings: Settings) -> NoteWebhookPayload: Synthesize webhook payload from API models

Internal Imports: config, gitlab_client, models, pipeline, review_pipeline, discussion_pipeline, concurrency, task_executor, credential_registry

Depended On By: main.py (started in lifespan if gitlab_poll=true)

`jira_poller.py`

Purpose: Background poller that searches Jira for issues in trigger status.

Key Protocols:

CodingTaskHandler: Interface for handling discovered issues
- handle(issue: JiraIssue, project_mapping: ResolvedProject) -> None

Key Classes:

JiraPoller: Polls Jira on interval, dispatches to handler
- start() -> None: Start polling loop
- stop() -> None: Cancel polling task
- _poll_once() -> None: Search all mapped projects, invoke handler for new issues
- _processed_issues: set[str]: Issue keys processed in this run

Internal Imports: config, jira_client, jira_models, project_registry, telemetry

Depended On By: main.py (started in lifespan if Jira configured)

Processing Layer

`discussion_pipeline.py`

Purpose: Discussion interaction pipeline implementation. Handles @mention and thread-reply interactions.

Key Classes:

DiscussionContext(BasePipelineContext): Context for discussion stages
DiscussionPipeline: Implements Pipeline — clone, fetch context, LLM, reply ± commit/push ± resolve

Internal Imports: pipeline, events, git/, gitlab_client, discussion_engine, coding_workflow, telemetry/, concurrency/

Depended On By: gitlab_webhook.py, gitlab_poller.py

`review_pipeline.py`

Purpose: MR review pipeline implementation. Orchestrators call run_pipeline(ReviewPipeline(...), ctx).

Key Classes:

ReviewContext(BasePipelineContext): Context for review stages (settings, event, executor, etc.)
ReviewPipeline: Implements Pipeline — clone, review via LLM, parse, post comments

Internal Imports: pipeline, events, git/, gitlab_client, review_engine, comment_parser, comment_poster, telemetry/

Depended On By: gitlab_webhook.py, gitlab_poller.py

`discussion_engine.py`

Purpose: Discussion prompt construction and response parsing for @mention/thread interactions.

Key Constants:

MAX_DIFF_CHARS = 80_000: Max diff characters included in prompt
MAX_OTHER_DISCUSSIONS = 5: Max other threads summarized for context
MAX_OTHER_NOTE_CHARS = 100: Max characters per summarized note

Key Models:

DiscussionResponse: Structured LLM response — reply (text to post), has_code_changes (bool), resolution (optional Resolution for thread resolution)

Key Functions:

build_discussion_prompt(mr_details: MRDetails, discussion_history: DiscussionHistory, triggering_discussion: Discussion) -> str: Build user prompt with MR metadata, triggering thread, diff, and other discussion context
parse_discussion_response(raw: str) -> DiscussionResponse: Extract structured response from LLM output. Detects files_changed JSON blocks for code changes and resolution JSON blocks for thread resolution signals
run_discussion(executor: TaskExecutor, settings: Settings, repo_path: str, repo_url: str, system_prompt: str, user_prompt: str, source_branch: str) -> TaskResult: Execute discussion LLM session via executor
_parse_resolution(data: dict[str, object]) -> Resolution | None: Extract Resolution from parsed JSON if present

Internal Imports: task_executor, config, discussion_models, gitlab_client, comment_parser

Depended On By: discussion_pipeline.py

`review_engine.py`

Purpose: Review prompt construction and execution.

Key Constants:

REVIEW_SYSTEM_PROMPT: str: Review system prompt (re-exported from prompt_defaults.DEFAULT_REVIEW_PROMPT)
MAX_DIFF_CHARS: int: Maximum characters of diff to include in the prompt before truncation
MAX_COMMIT_CHARS: int: Maximum characters of commit messages to include in the prompt before truncation
_SEVERITY_PREFIX_RE: re.Pattern: Compiled regex to strip severity prefixes (e.g., **[WARNING]**) from comments
_SUGGESTION_BLOCK_RE: re.Pattern: Compiled regex to strip suggestion code blocks from comments
_PRIOR_FEEDBACK_RULES: str: Prompt rules instructing the LLM not to duplicate prior feedback
_SUPPRESSED_FEEDBACK_RULES: str: Prompt rules instructing the LLM not to re-raise human-resolved or dismissed items
_DISMISSAL_PATTERNS: list[re.Pattern]: Compiled regexes for dismissal phrase detection (case-insensitive): "won't fix", "intentional", "by design", "not a bug", "false positive", "not an issue", "acceptable risk", "wontfix"
_RESOLUTION_EVAL_INSTRUCTIONS: Prompt instructions for LLM to evaluate whether prior feedback has been addressed

Key Models:

ReviewRequest: MR metadata (title, description, source/target branches, commit_messages)

Key Functions:

build_review_prompt(req: ReviewRequest, diff_text: str | None = None, discussion_history: DiscussionHistory | None = None, is_incremental: bool = False, head_sha: str = "") -> str: Build user prompt; includes commit messages section when available, diff inline when available, injects prior unresolved feedback with outdated position annotations, labels incremental diffs, and appends suppressed feedback section for human-resolved/dismissed items
run_review(executor: TaskExecutor, settings: Settings, repo_path: str, repo_url: str, review_request: ReviewRequest, diff_text: str | None = None, discussion_history: DiscussionHistory | None = None, head_sha: str = "", is_incremental: bool = False) -> TaskResult: Execute review task and return structured result. Appends head_sha to task ID for dedup
_format_prior_feedback(history: DiscussionHistory, current_head_sha: str = "") -> str: Render agent's unresolved inline comments as a prompt section. Includes [discussion: {id}] tags for LLM resolution referencing. Annotates comments whose position.head_sha differs from current_head_sha as outdated
_is_human_resolved(disc: Discussion, agent_user_id: int) -> bool: Returns True when discussion is resolved and the resolver is not the agent (detected via resolved_by_id field)
_is_dismissed(disc: Discussion, agent_user_id: int) -> bool: Returns True when a non-agent note matches any _DISMISSAL_PATTERNS regex
_format_suppressed_feedback(history: DiscussionHistory) -> str: Render human-resolved ([MANUALLY RESOLVED]) and dismissed ([DISMISSED]) items as a "Suppressed Feedback (Do Not Re-Raise)" prompt section. Returns empty string when no items qualify
_file_line(note: DiscussionNote) -> str: Format a note's file path and line number for display
_strip_comment_formatting(body: str) -> str: Remove agent-added severity prefix and suggestion blocks from a comment
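
The dismissal check performed by _is_dismissed might look roughly like this. The phrase list is taken from _DISMISSAL_PATTERNS above; the Note dataclass is a hypothetical stand-in for DiscussionNote, and the real patterns may use word boundaries that this sketch omits.

```python
import re
from dataclasses import dataclass

# Phrases listed for _DISMISSAL_PATTERNS; matched case-insensitively.
DISMISSAL_PHRASES = [
    "won't fix", "intentional", "by design", "not a bug",
    "false positive", "not an issue", "acceptable risk", "wontfix",
]
DISMISSAL_PATTERNS = [re.compile(re.escape(p), re.IGNORECASE) for p in DISMISSAL_PHRASES]


@dataclass(frozen=True)
class Note:
    """Hypothetical stand-in for DiscussionNote with only the fields used here."""
    author_id: int
    body: str


def is_dismissed(notes: list[Note], agent_user_id: int) -> bool:
    """True when any note written by someone other than the agent matches a dismissal phrase."""
    return any(
        note.author_id != agent_user_id
        and any(pattern.search(note.body) for pattern in DISMISSAL_PATTERNS)
        for note in notes
    )
```

Notes authored by the agent are ignored so that the agent quoting a phrase such as "by design" in its own comment does not suppress the thread.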

Internal Imports: config, prompt_defaults, task_executor, discussion_models (TYPE_CHECKING)

Depended On By: review_pipeline.py

`coding_engine.py`

Purpose: Coding task prompt construction and .gitignore hygiene.

Key Constants:

CODING_SYSTEM_PROMPT: str: Coding system prompt (re-exported from prompt_defaults.DEFAULT_CODING_PROMPT)
_PYTHON_GITIGNORE_PATTERNS: list[str]: Standard Python ignore patterns

Key Functions:

build_jira_coding_prompt(issue_key: str, summary: str, description: str | None) -> str: Build user prompt from Jira issue
ensure_gitignore(repo_root: str) -> bool: Ensure .gitignore contains Python patterns, returns True if modified
run_coding_task(...) -> str: Ensure .gitignore, execute coding task
parse_agent_output(text: str) -> CodingAgentOutput: Extract structured JSON from agent response (Pydantic-validated summary + files_changed)
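
A minimal sketch of the ensure_gitignore behavior, assuming it appends only the patterns that are missing. The three patterns shown are placeholders; the actual contents of _PYTHON_GITIGNORE_PATTERNS are not listed in this reference.

```python
from pathlib import Path

# Placeholder patterns for illustration only.
PYTHON_GITIGNORE_PATTERNS = ["__pycache__/", "*.pyc", ".venv/"]


def ensure_gitignore(repo_root: str) -> bool:
    """Append any missing patterns to .gitignore; return True if the file changed."""
    path = Path(repo_root) / ".gitignore"
    existing = path.read_text().splitlines() if path.exists() else []
    present = {line.strip() for line in existing}
    missing = [p for p in PYTHON_GITIGNORE_PATTERNS if p not in present]
    if not missing:
        return False
    # Existing entries are preserved in order; new patterns go at the end.
    path.write_text("\n".join(existing + missing) + "\n")
    return True
```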

Internal Imports: config, prompt_defaults, task_executor

Depended On By: coding_pipeline.py, task_runner.py

`prompt_defaults.py`

Purpose: Canonical source of built-in system prompts and configurable prompt resolution.

Key Types:

PromptType = Literal["coding", "review", "discussion"]

Key Constants:

DEFAULT_CODING_PROMPT: str: Built-in coding system prompt
DEFAULT_REVIEW_PROMPT: str: Built-in review system prompt

Key Functions:

get_prompt(settings: Settings, prompt_type: PromptType) -> str: Resolve the effective system prompt for a given type. Resolution: global base (SYSTEM_PROMPT + suffix) → type-specific override or built-in default + suffix → combined result.

Internal Imports: config (TYPE_CHECKING only)

Depended On By: review_engine.py, coding_engine.py, task_runner.py

`coding_workflow.py`

Purpose: Shared helper for applying coding results (diff passback from k8s pods).

Key Functions:

apply_coding_result(result: TaskResult, repo_path: Path) -> None: Validate base_sha, apply patch via git apply --3way if CodingResult has a patch. No-op for local executor (empty patch).

Internal Imports: task_executor, git/, telemetry/

Depended On By: coding_pipeline.py, discussion_pipeline.py

Execution Layer

`task_executor.py`

Purpose: TaskExecutor protocol and LocalTaskExecutor implementation.

Key Models:

TaskParams: Parameters for a Copilot task (task_type, repo_url, branch, prompts, settings, repo_path)
TaskResult: Union type ReviewResult | CodingResult (return type for all executors)
ReviewResult: summary: str (Pydantic BaseModel, frozen=True)
CodingResult: summary: str, patch: str, base_sha: str (Pydantic BaseModel, frozen=True)

Key Protocols:

TaskExecutor: execute(task: TaskParams) -> TaskResult

Key Classes:

LocalTaskExecutor: Runs copilot_session.py in-process, returns ReviewResult for reviews, CodingResult with empty patch for coding
- Requires task.repo_path to be set
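
The relationship between the protocol and the result union can be illustrated as below. This sketch uses frozen dataclasses instead of the Pydantic models to stay dependency-free, trims TaskParams to two fields, assumes execute is a coroutine, and introduces a hypothetical StubExecutor in place of LocalTaskExecutor.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol, Union


@dataclass(frozen=True)
class ReviewResult:
    summary: str


@dataclass(frozen=True)
class CodingResult:
    summary: str
    patch: str = ""
    base_sha: str = ""


TaskResult = Union[ReviewResult, CodingResult]


@dataclass(frozen=True)
class TaskParams:
    """Trimmed to the fields this sketch needs."""
    task_type: str
    repo_path: str | None = None


class TaskExecutor(Protocol):
    async def execute(self, task: TaskParams) -> TaskResult: ...


class StubExecutor:
    """Hypothetical executor that satisfies the protocol structurally."""

    async def execute(self, task: TaskParams) -> TaskResult:
        if task.repo_path is None:
            raise ValueError("repo_path is required for in-process execution")
        if task.task_type == "coding":
            # Local runs edit the working tree directly, so no patch is passed back.
            return CodingResult(summary="done")
        return ReviewResult(summary="done")
```

Callers can branch on `isinstance(result, CodingResult)` to decide whether a patch needs to be applied.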

Internal Imports: config

Depended On By: Review, discussion, and coding pipelines; main.py (instantiation)

`remote_executor.py`

Purpose: Unified remote task executor — claim-check dispatch for any KEDA-backed backend (K8s Jobs or ACA Job executions). Replaces the former k8s_executor.py and aca_executor.py.

Key Constants:

_POLL_INTERVAL = 5: Seconds between result blob checks
_LOCK_PREFIX = "remote_exec:": Idempotency lock prefix

Key Functions:

parse_result(raw: str, task_type: str) -> TaskResult: Parse result JSON or wrap raw string. Handles review, coding, and error result types with traceback logging.

Key Classes:

RemoteTaskExecutor: Implements TaskExecutor
- execute(task: TaskParams) -> TaskResult: Check cache, check lock, upload tarball, enqueue, poll for result
- _poll_result(task: TaskParams) -> TaskResult: Poll ResultStore until result or timeout
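
The poll-until-result-or-timeout step can be sketched as a small loop. The fetch callable stands in for a ResultStore lookup and is hypothetical, as is the function name; only the 5-second default mirrors _POLL_INTERVAL.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional


async def poll_for_result(
    fetch: Callable[[], Awaitable[Optional[str]]],
    timeout: float,
    poll_interval: float = 5.0,
) -> str:
    """Call fetch() until it returns a value, or raise TimeoutError after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        result = await fetch()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("no result before deadline")
        await asyncio.sleep(poll_interval)
```

Using a monotonic clock for the deadline keeps the timeout stable if the wall clock is adjusted while the job runs.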

Internal Imports: task_executor, git/, concurrency/

Depended On By: main.py (instantiation when task_executor=kubernetes)

`copilot_session.py`

Purpose: Copilot SDK wrapper — client init, session config, result extraction.

Key Constants:

_SDK_ENV_ALLOWLIST = frozenset({"PATH", "HOME", "LANG", "TERM", "TMPDIR", "USER"}): Safe env vars for SDK subprocess

Key Functions:

build_sdk_env(github_token: str | None) -> dict[str, str]: Build minimal env dict for SDK subprocess (excludes service secrets)
run_copilot_session(settings: Settings, repo_path: str, system_prompt: str, user_prompt: str, timeout: int, task_type: str, validate_response: Callable[[str], str | None] | None) -> str: Full Copilot session lifecycle
- Creates CopilotClient with minimal env
- Discovers repo config (skills, agents, instructions)
- Injects repo instructions into system prompt
- Creates session with BYOK provider if configured
- Sends user prompt, waits for session.idle
- If validate_response returns a string, sends it as a follow-up (one retry max)
- Returns last assistant message
- Emits copilot_session_duration metric
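
The allowlist filtering behind build_sdk_env could be written along these lines. Two details are assumptions: the token is passed to the subprocess as GITHUB_TOKEN, and the environ parameter exists only to make the sketch testable.

```python
from __future__ import annotations

import os
from typing import Mapping

SDK_ENV_ALLOWLIST = frozenset({"PATH", "HOME", "LANG", "TERM", "TMPDIR", "USER"})


def build_sdk_env(
    github_token: str | None,
    environ: Mapping[str, str] | None = None,  # hypothetical parameter for testing
) -> dict[str, str]:
    """Copy only allowlisted variables so service secrets never reach the SDK subprocess."""
    source = os.environ if environ is None else environ
    env = {name: value for name, value in source.items() if name in SDK_ENV_ALLOWLIST}
    if github_token:
        # Assumption: the SDK reads its credential from GITHUB_TOKEN.
        env["GITHUB_TOKEN"] = github_token
    return env
```

Building the environment from an allowlist, instead of deleting known secrets from a copy, means a newly added secret is excluded by default.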

Internal Imports: config, repo_config, process_sandbox, metrics, telemetry

Depended On By: task_executor.py (LocalTaskExecutor), task_runner.py (K8s Job)

`task_runner.py`

Purpose: K8s Job entrypoint (python -m gitlab_copilot_agent.task_runner).

Key Constants:

VALID_TASK_TYPES = frozenset({"review", "coding", "echo"})
_RESULT_TTL = 3600

Key Functions:

run_task() -> int: Main entry point
- Dequeues task from Azure Storage Queue (or reads env vars for echo tasks)
- Validates task type
- Validates repo_blob_key starts with repos/ prefix
- Downloads repo tarball from blob and extracts to temp dir
- Calls run_copilot_session()
- For coding tasks: calls _build_coding_result() to capture diff
- Stores result in Azure Blob Storage (JSON-encoded TaskResult)
- Returns exit code 0/1
_build_coding_result(response: str, repo_path: Path) -> CodingResult: Parse CodingAgentOutput from response, stage listed files explicitly, capture git diff --cached --binary, validate size ≤ MAX_PATCH_SIZE, validate patch (no ../), return CodingResult
_coding_response_validator(response: str) -> str | None: Validate agent response contains structured JSON; returns retry prompt if missing
_store_result(task_id: str, result: str) -> None: Persist to Azure Blob Storage with TTL
_dequeue_task() -> tuple | None: Dequeue from Azure Storage Queue if configured
_get_required_env(name: str) -> str: Raise if env var missing
_parse_task_payload(raw: str) -> dict[str, str]: Parse JSON payload
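
The size and path-traversal checks applied to a captured patch in _build_coding_result might be approximated as follows. The header prefixes inspected here are an assumption about what the real validator looks at, and the check is deliberately conservative: a removed content line that happens to look like a header would also be rejected.

```python
MAX_PATCH_SIZE = 10 * 1024 * 1024  # 10 MB, as documented for the git/ package

# Assumption: only these header lines carry file paths worth checking.
_HEADER_PREFIXES = ("diff --git ", "--- ", "+++ ", "rename from ", "rename to ")


def validate_patch(patch: str) -> None:
    """Raise ValueError if the patch is too large or a header path contains a '..' component."""
    if len(patch.encode("utf-8")) > MAX_PATCH_SIZE:
        raise ValueError("patch exceeds MAX_PATCH_SIZE")
    for line in patch.splitlines():
        if not line.startswith(_HEADER_PREFIXES):
            continue
        for token in line.split():
            # Split on '/' so names such as 'notes..md' are not flagged.
            if ".." in token.split("/"):
                raise ValueError(f"path traversal in patch header: {line!r}")
```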

Security: Zero GitLab credentials. Repo received via blob transfer from controller.

Internal Imports: config/, copilot_session, git/, coding_engine, prompt_defaults

Depended On By: K8s Job container command

External Service Clients

`gitlab_client.py`

Purpose: Async GitLab REST API client using httpx — fully typed, with retry and pagination.

Key Models:

MRAuthor: id, username
MRListItem: iid, title, description, source/target branches, sha, web_url, state, author, updated_at
NoteListItem: id, body, author, system, created_at
MRDiffRef: base_sha, start_sha, head_sha
MRChange: old_path, new_path, diff, new_file, deleted_file, renamed_file
MRDetails: title, description, diff_refs, changes
MRCommit: id, title, message (frozen, extra="ignore")

Key Protocols:

GitLabClientProtocol: Interface for all GitLab operations

Key Classes:

GitLabClient:
- __init__(url: str, token: str): Initialize httpx async client with PRIVATE-TOKEN header
- aclose() -> None: Close the underlying httpx client
- __aenter__/__aexit__: Async context manager support
- _request(method, path, *, idempotent, **kwargs) -> Response: HTTP request with retry on 429/5xx for idempotent (GET) requests; respects Retry-After header
- _paginate(path, params) -> list[dict]: Fetch all pages of a paginated endpoint
- get_mr_details(project_id, mr_iid) -> MRDetails: Fetch MR changes; retries on null diff_refs (GitLab race)
- clone_repo(clone_url, branch, token, clone_dir) -> Path: Clone repo via git/ package
- cleanup(repo_path) -> None: Remove cloned repo
- create_merge_request(...) -> int: Create MR, return iid
- post_mr_comment(project_id, mr_iid, body) -> None: Post MR note
- create_mr_discussion(project_id, mr_iid, body, position) -> None: Create inline discussion on diff
- list_project_mrs(project_id, state, updated_after) -> list[MRListItem]: List MRs (paginated)
- list_mr_notes(project_id, mr_iid, created_after) -> list[NoteListItem]: List notes (paginated)
- resolve_project(id_or_path) -> int: Resolve project ID (URL-encodes paths)
- list_mr_discussions(project_id, mr_iid) -> list[Discussion]: List discussions (paginated)
- get_current_user() -> AgentIdentity: GET /user for authenticated identity
- resolve_discussion(project_id, mr_iid, discussion_id) -> None: PUT to resolve a thread
- reply_to_discussion(project_id, mr_iid, discussion_id, body) -> None: POST reply to thread
- compare_commits(project_id, from_sha, to_sha) -> list[MRChange]: Compare two commits
- get_mr_commits(project_id, mr_iid) -> list[MRCommit]: Fetch MR commits (paginated)
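
The retry policy described for _request (retry only idempotent requests on 429/5xx, honoring Retry-After) can be expressed as a pure delay calculation. The function name, backoff base, and cap are assumptions, and only the numeric form of Retry-After is handled.

```python
from __future__ import annotations


def retry_delay(
    status_code: int,
    attempt: int,
    *,
    idempotent: bool = True,
    retry_after: str | None = None,
    base: float = 0.5,
    cap: float = 30.0,
) -> float | None:
    """Return seconds to wait before retrying, or None when the request should not be retried."""
    retryable = status_code == 429 or 500 <= status_code < 600
    if not idempotent or not retryable:
        return None
    if retry_after is not None:
        try:
            # The server's hint takes precedence over local backoff.
            return min(float(retry_after), cap)
        except ValueError:
            pass  # HTTP-date form is not parsed in this sketch; fall back to backoff
    return min(base * (2 ** attempt), cap)
```

Non-idempotent requests such as posting a comment are never retried here, since a retry after an ambiguous failure could create a duplicate.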

Internal Imports: git/, discussion_models

Depended On By: gitlab_webhook.py, gitlab_poller.py, main.py, review_pipeline.py, discussion_pipeline.py, discussion_engine.py, comment_poster.py

`jira_client.py`

Purpose: Jira REST API v3 client using basic auth.

Key Protocols:

JiraClientProtocol: Interface for Jira operations

Key Classes:

JiraClient:
- __init__(base_url: str, email: str, api_token: str): Initialize httpx client with Basic auth
- close() -> None: Close HTTP client
- search_issues(jql: str) -> list[JiraIssue]: Paginated JQL search
- transition_issue(issue_key: str, target_status: str) -> None: Transition by status name
- add_comment(issue_key: str, body: str) -> None: Add a plain-text comment, wrapped in Atlassian Document Format (ADF)

Internal Imports: jira_models

Depended On By: jira_poller.py, coding_pipeline.py, main.py

Shared Utilities

`git/` package (formerly `git_operations.py`)

Purpose: Git CLI wrappers (clone, branch, commit, push, patch, archive, validation). Split from the former git_operations.py monolith into focused submodules.

Submodules:

clone.py: Repository cloning (git_clone) with URL validation and credential embedding
operations.py: Branch, commit, push operations (git_create_branch, git_unique_branch, git_commit, git_push, git_head_sha)
patches.py: Patch application and staged diff capture (git_apply_patch, git_diff_staged, _validate_patch)
archive.py: Repository archiving utilities for remote executor blob transfer
validation.py: URL and patch validation (_validate_clone_url, _sanitize_url_for_log, MAX_PATCH_SIZE)

Key Constants:

CLONE_DIR_PREFIX = "mr-review-"
MAX_PATCH_SIZE = 10 * 1024 * 1024 (10 MB) — maximum allowed patch size for diff passback

Key Functions (re-exported from git/__init__.py):

git_clone(clone_url: str, branch: str, token: str, clone_dir: str | None) -> Path: Clone repo with embedded credentials, validate URL
git_create_branch(repo_path: Path, branch_name: str) -> None: Create and checkout branch
git_unique_branch(repo_path: Path, base_name: str) -> str: Create branch with collision detection — appends -2, -3, etc. using git ls-remote --heads (works with shallow clones)
git_commit(repo_path: Path, message: str, author_name: str, author_email: str) -> bool: Stage all, commit, return False if nothing to commit
git_push(repo_path: Path, remote: str, branch: str, token: str) -> None: Push with token sanitization
git_apply_patch(repo_path: Path, patch: str) -> None: Apply unified diff with git apply --3way --binary
git_head_sha(repo_path: Path) -> str: Get current HEAD commit SHA
git_diff_staged(repo_path: Path) -> str: Capture staged diff (git diff --cached --binary), preserves trailing whitespace
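
The collision handling in git_unique_branch can be separated into two pure steps: parsing `git ls-remote --heads` output and picking the first free suffix. Both helper names below are hypothetical, and the sketch leaves out the subprocess call and the branch creation itself.

```python
from __future__ import annotations


def parse_ls_remote_heads(output: str) -> set[str]:
    """Extract branch names from `git ls-remote --heads` output (one 'sha<TAB>ref' per line)."""
    heads: set[str] = set()
    prefix = "refs/heads/"
    for line in output.splitlines():
        _sha, _, ref = line.partition("\t")
        if ref.startswith(prefix):
            heads.add(ref[len(prefix):])
    return heads


def unique_branch_name(base_name: str, existing: set[str]) -> str:
    """Return base_name, or base_name-2, base_name-3, ... if earlier names are taken."""
    if base_name not in existing:
        return base_name
    suffix = 2
    while f"{base_name}-{suffix}" in existing:
        suffix += 1
    return f"{base_name}-{suffix}"
```

Querying the remote instead of local refs is what makes this work with shallow clones, where most remote branches are not fetched.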

Internal Imports: telemetry/

Depended On By: gitlab_client.py, review_pipeline.py, discussion_pipeline.py, coding_pipeline.py, task_runner.py, coding_workflow.py

`comment_parser.py`

Purpose: Extract structured review output from Copilot agent response.

Key Models:

ReviewComment: file, line, severity, comment, suggestion, suggestion_start_offset, suggestion_end_offset
Resolution: discussion_id, status (resolved/not_addressed/partial), message — resolution determination for prior feedback
ParsedReview: comments, summary, resolutions

Key Functions:

parse_review(raw: str) -> ParsedReview: Extract JSON object with comments and resolutions arrays from code fence or raw text, parse into models, extract summary
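
The "code fence or raw text" extraction in parse_review could work roughly as shown here. The function name and the empty-result fallback are assumptions; the real parser also builds ReviewComment and Resolution models and extracts a summary, which this sketch skips.

```python
from __future__ import annotations

import json
import re

_FENCE = "`" * 3  # built at runtime so this example does not contain a literal fence
_FENCE_RE = re.compile(_FENCE + r"(?:json)?\s*\n(.*?)" + _FENCE, re.DOTALL)


def extract_review_json(raw: str) -> dict[str, list]:
    """Return the comments and resolutions arrays from a fenced JSON block, or from raw JSON."""
    empty: dict[str, list] = {"comments": [], "resolutions": []}
    match = _FENCE_RE.search(raw)
    candidate = match.group(1) if match else raw
    try:
        data = json.loads(candidate)
    except json.JSONDecodeError:
        return empty
    if not isinstance(data, dict):
        return empty
    return {
        "comments": data.get("comments", []),
        "resolutions": data.get("resolutions", []),
    }
```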

Internal Imports: None

Depended On By: review_pipeline.py, comment_poster.py, discussion_engine.py

`comment_poster.py`

Purpose: Post review comments to GitLab MR as inline discussions and summary. Handles resolution actions for prior feedback. Embeds SHA marker in summary note for incremental review tracking. Composes a structured activity summary (posting outcomes + resolution stats) into the summary note.

Key Functions:

post_review(gitlab_client: gl.Gitlab, project_id: int, mr_iid: int, diff_refs: MRDiffRef, review: ParsedReview, changes: list[MRChange], resolution_behavior: str = "suggest", allowed_discussion_ids: frozenset[str] = frozenset(), head_sha: str = "") -> None: Post inline comments + resolve/acknowledge prior feedback + summary with SHA marker and activity section
- Validates comment positions against diff hunks
- Falls back to note with file:line context if position invalid
- Tracks posting outcomes (inline, fallback, skipped) via counters
- Processes resolutions via _handle_resolutions() before posting summary
- When comments or resolutions are nonzero, inserts activity section between summary text and SHA marker
- When head_sha provided, appends format_sha_marker(head_sha) to summary note body
_build_activity_section(posted_inline: int, posted_fallback: int, resolutions: list[Resolution], resolved_count: int) -> str: Compose markdown activity summary from posting outcomes and resolution data. Returns empty string when all counts are zero. Includes: new comments (inline + fallback total), threads resolved, partial resolutions — with singular/plural handling
_handle_resolutions(mr: object, resolutions: list[Resolution], resolution_behavior: str) -> int: Process resolutions per configured behavior (auto-resolve/suggest/off). Returns count of resolved threads
_parse_hunk_lines(diff: str, new_path: str) -> set[tuple[str, int]]: Extract valid (file, line) positions from unified diff
_is_valid_position(file: str, line: int, valid_positions: set[tuple[str, int]]) -> bool: Check if position valid
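
Position validation depends on knowing which new-side line numbers appear in the diff. A sketch of that extraction follows; treating both added and context lines as valid positions is an assumption about the real _parse_hunk_lines.

```python
from __future__ import annotations

import re

# Matches hunk headers such as "@@ -10,3 +12,4 @@" and captures the new-side start line.
_HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def parse_hunk_lines(diff: str, new_path: str) -> set[tuple[str, int]]:
    """Collect (file, line) pairs for every new-side line present in the unified diff."""
    positions: set[tuple[str, int]] = set()
    new_line: int | None = None
    for line in diff.splitlines():
        header = _HUNK_RE.match(line)
        if header:
            new_line = int(header.group(1))
            continue
        if new_line is None:
            continue  # skip anything before the first hunk header
        if line.startswith("+") or line.startswith(" "):
            positions.add((new_path, new_line))
            new_line += 1
        # Lines starting with "-" exist only on the old side and do not advance the counter.
    return positions


def is_valid_position(file: str, line: int, valid_positions: set[tuple[str, int]]) -> bool:
    return (file, line) in valid_positions
```

A comment whose position is not in the set would fall back to a plain note with file:line context, as described for post_review.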

Internal Imports: comment_parser, gitlab_client, incremental

Depended On By: review_pipeline.py

`incremental.py`

Purpose: SHA marker utilities for incremental MR review. Embeds and extracts a hidden HTML comment in overview notes to track the last-reviewed commit SHA.

Key Functions:

extract_last_reviewed_sha(discussion_history: DiscussionHistory | None) -> str | None: Scans overview notes in reverse chronological order for the SHA marker
format_sha_marker(head_sha: str) -> str: Generates the hidden HTML comment marker

Key Constants:

_SHA_MARKER_RE: Regex matching the hidden HTML comment marker produced by format_sha_marker

Internal Imports: discussion_models (TYPE_CHECKING only)

Depended On By: review_pipeline.py (extraction), comment_poster.py (formatting)

ADR: 0009-incremental-review-sha-marker.md

`repo_config.py`

Purpose: Discover repo-level Copilot configuration (skills, agents, instructions).

Key Constants:

_CONFIG_ROOTS = [".github", ".claude"]
_SKILLS_DIR = "skills"
_AGENTS_DIR = "agents"
_INSTRUCTIONS_DIR = "instructions"
_CONFIG_ROOT_INSTRUCTIONS: dict[str, list[str]]: Root-specific instruction files
_AGENT_SUFFIX = ".agent.md"
_AGENTS_MD = "AGENTS.md"
_CLAUDE_MD = "CLAUDE.md"

Key Models:

AgentConfig: name, prompt, description, tools, display_name, mcp_servers, infer
RepoConfig: skill_directories, custom_agents, instructions

Key Functions:

discover_repo_config(repo_path: str) -> RepoConfig: Discover all skills, agents, instructions
- Scans .github/ and .claude/ for skills, agents, instructions
- Reads AGENTS.md (root, then subdirs)
- Reads CLAUDE.md (if not in .claude/)
- Deduplicates symlinks (resolved paths must stay within repo)
_parse_agent_file(path: Path) -> AgentConfig | None: Parse .agent.md with YAML frontmatter
_resolve_real_path(path: Path, repo_root: Path) -> Path | None: Resolve symlinks, reject paths escaping repo
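
The symlink containment rule in _resolve_real_path can be sketched with pathlib alone. This is an illustration of the technique under the stated signature, not the module's implementation.

```python
from __future__ import annotations

from pathlib import Path


def resolve_real_path(path: Path, repo_root: Path) -> Path | None:
    """Resolve symlinks and return the real path, or None if it lies outside repo_root."""
    root = repo_root.resolve()
    real = path.resolve()
    # Compare resolved paths so a symlink inside the repo cannot point at outside files.
    if real == root or root in real.parents:
        return real
    return None
```

Returning the resolved path also gives the caller a stable key for deduplication: two symlinks to the same file resolve to the same value.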

Internal Imports: None (external: frontmatter, pydantic)

Depended On By: copilot_session.py

Data & Configuration

`models.py`

Purpose: Pydantic models for GitLab webhook payloads.

Key Models:

WebhookUser: id, username
WebhookProject: id, path_with_namespace, git_http_url
MRLastCommit: id (sha), message
MRObjectAttributes: iid, title, description, action, source/target branches, last_commit, url, oldrev
MergeRequestWebhookPayload: object_kind, user, project, object_attributes
NoteObjectAttributes: note, noteable_type
NoteMergeRequest: iid, title, source/target branches
NoteWebhookPayload: object_kind, user, project, object_attributes, merge_request

All use strict=True config.

Internal Imports: None

Depended On By: gitlab_webhook.py, gitlab_poller.py, review_pipeline.py

`jira_models.py`

Purpose: Pydantic models for Jira REST API responses.

Key Models:

JiraUser: account_id, display_name, email_address
JiraStatus: name, id
JiraIssueFields: summary, description, status, assignee, labels
JiraIssue: id, key, fields
- project_key property: extract "PROJ" from "PROJ-123"
JiraSearchResponse: issues, next_page_token, total
JiraTransition: id, name
JiraTransitionsResponse: transitions

All use extra="ignore" config.

Internal Imports: None

Depended On By: jira_client.py, jira_poller.py, coding_pipeline.py

`discussion_models.py`

Purpose: Pydantic models for MR discussion history, shared by review and discussion flows.

Key Models:

DiscussionNote: note_id, author_id, author_username, body, created_at, is_system, resolved, resolvable, position
Discussion: discussion_id, notes, is_resolved, is_inline
AgentIdentity: user_id, username (discovered via GET /user)
DiscussionHistory: discussions, agent

All use frozen=True config.

Dependencies: pydantic

Internal Imports: None

Depended On By: review_pipeline.py, credential_registry.py, discussion_pipeline.py, discussion_engine.py

`mapping_models.py`

Purpose: Pydantic models for YAML source mappings and rendered JSON format (v1 config).

Key Models:

MappingSource: YAML source with defaults + bindings list
RenderedMap: Flat JSON for JIRA_PROJECT_MAP env var and /config/reload body
RenderedBinding: Single binding — repo, target_branch, credential_ref

Depended On By: mapping_cli.py, project_registry.py, main.py

`config_v2.py`

Purpose: GitLab-centric YAML config models (v2). Replaces Jira-keyed mapping_models.py for project configuration.

Key Models:

ConfigFile: Root model with version: 2, gitlab, dispatch, copilot, server, prompts, defaults, projects, integrations
ProjectConfig: Single GitLab project; all fields except repo are optional (fall back to ConfigDefaults)
JiraIntegrationConfig: Jira integration referenced by projects via name

Key Functions:

load_config_file(path: Path | None) -> ConfigFile: Load + validate YAML, audit-log marketplace URLs (S10)

Depended On By: mapping_cli.py, project_registry.py

`app_context.py`

Purpose: Frozen AppContext dataclass replacing app.state service locator. Provides typed dependency injection.

Key Types:

AppContext: Frozen dataclass holding settings, executor, repo_locks, dedup_store, dedup, credential_registry, allowed_project_ids

Key Functions:

get_app_context(request: Request) -> AppContext: FastAPI Depends() accessor

Depended On By: gitlab_webhook.py, main.py

`credential_registry.py`

Purpose: Resolve credential aliases to GitLab tokens from environment. TTL-cached identity resolution via httpx.

Key Methods:

from_env() -> CredentialRegistry: Reads GITLAB_TOKEN + GITLAB_TOKEN__<ALIAS> env vars
resolve(credential_ref: str) -> str: Returns token for alias, raises KeyError if unknown
resolve_identity(credential_ref: str, gitlab_url: str) -> AgentIdentity: TTL-cached identity lookup (default 1hr, time.monotonic()). Uses httpx GET /api/v4/user.
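
A rough sketch of the two mechanisms named above: alias discovery from GITLAB_TOKEN__<ALIAS> variables and a TTL cache driven by time.monotonic(). The "default" alias name, the lowercasing of aliases, and the TTLCache class are assumptions made for illustration.

```python
from __future__ import annotations

import time
from typing import Callable, Generic, Mapping, TypeVar

T = TypeVar("T")
_ALIAS_PREFIX = "GITLAB_TOKEN__"


def tokens_from_env(environ: Mapping[str, str]) -> dict[str, str]:
    """Map credential aliases to tokens. Assumption: GITLAB_TOKEN is exposed as 'default'."""
    tokens: dict[str, str] = {}
    if "GITLAB_TOKEN" in environ:
        tokens["default"] = environ["GITLAB_TOKEN"]
    for name, value in environ.items():
        if name.startswith(_ALIAS_PREFIX) and len(name) > len(_ALIAS_PREFIX):
            tokens[name[len(_ALIAS_PREFIX):].lower()] = value
    return tokens


class TTLCache(Generic[T]):
    """Hypothetical cache: entries expire ttl_seconds after they were loaded."""

    def __init__(self, ttl_seconds: float = 3600.0, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, T]] = {}

    def get_or_load(self, key: str, loader: Callable[[], T]) -> T:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now < hit[0]:
            return hit[1]
        value = loader()  # e.g. a GET /api/v4/user call in the real registry
        self._entries[key] = (now + self._ttl, value)
        return value
```

A monotonic clock is used for expiry so that wall-clock adjustments cannot extend or shorten the cache lifetime.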

Depended On By: project_registry.py, main.py, gitlab_poller.py

`project_registry.py`

Purpose: Fully resolved project context for runtime use.

Key Types:

ResolvedProject: Frozen Pydantic model — jira_project (optional), repo, gitlab_project_id, clone_url, target_branch, credential_ref, token (masked in repr)
ProjectRegistry: Registry with from_rendered_map() (v1) and from_config() (v2) async factories, get_by_jira(), get_by_project_id(), jira_keys()

Depended On By: jira_poller.py, coding_pipeline.py, main.py

`config/` package (formerly `config.py`)

Purpose: Application configuration via environment variables. Split from config.py into a package for better organization.

Submodules:

settings.py: Settings (BaseSettings) — all env vars (see configuration-reference.md), JiraSettings
runner_settings.py: TaskRunnerSettings for K8s Job entrypoint configuration
base.py: Shared mixins (CopilotSettingsMixin, PromptSettingsMixin, DispatchSettingsMixin) used by both Settings and TaskRunnerSettings
validators.py: Cross-field validators (auth checks, state backend validation, project list validation)

Key Models (re-exported from config/__init__.py):

JiraSettings: url, email, api_token, trigger_status, in_progress_status, poll_interval, project_map_json
Settings (BaseSettings): All env vars (see configuration-reference.md)
- jira property: returns JiraSettings if all required fields are set, else None
- _check_auth() validator: ensures either GITHUB_TOKEN or COPILOT_PROVIDER_TYPE is set; validates REDIS_URL if backend=redis; validates GITLAB_PROJECTS if gitlab_poll=true

Internal Imports: None

Depended On By: All modules

State & Concurrency

`concurrency/` package (formerly `concurrency.py`)

Purpose: In-memory locking and deduplication primitives. Split from concurrency.py into a package.

Submodules:

protocols.py: DistributedLock and DeduplicationStore protocol definitions, TaskQueue, QueueMessage
memory.py: MemoryLock, MemoryDedup in-memory implementations

Key Protocols:

DistributedLock: acquire(key: str, ttl_seconds: int) -> AbstractAsyncContextManager[None], aclose()
DeduplicationStore: is_seen(key: str) -> bool, mark_seen(key: str, ttl_seconds: int) -> None, aclose()

Key Classes:

MemoryLock: Async lock per key with LRU eviction
- acquire(key: str, ttl_seconds: int): Context manager, evicts unlocked entries after release
MemoryDedup: In-memory seen set with size-based eviction
- is_seen(key: str) -> bool, mark_seen(key: str, ttl_seconds: int) -> None

Aliases:

RepoLockManager = MemoryLock (backward compatibility)

Internal Imports: None

Depended On By: main.py, gitlab_webhook.py, gitlab_poller.py, coding_pipeline.py, dedup.py
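To make the per-key locking and size-based eviction concrete, here is a rough sketch of how the two in-memory primitives could work. It is not the package's code: the class names, the max_size parameter, the waiter count used to decide when an entry is idle, and the assumption that the dedup methods are coroutines are all illustrative.

```python
from __future__ import annotations

import asyncio
from collections import OrderedDict
from contextlib import asynccontextmanager
from typing import AsyncIterator


class SketchMemoryLock:
    """One asyncio.Lock per key; idle entries are evicted in LRU order."""

    def __init__(self, max_size: int = 1024) -> None:
        self._locks: OrderedDict[str, asyncio.Lock] = OrderedDict()
        self._users: dict[str, int] = {}  # holders plus waiters per key
        self._max_size = max_size

    @asynccontextmanager
    async def acquire(self, key: str, ttl_seconds: int = 300) -> AsyncIterator[None]:
        # ttl_seconds is accepted for protocol compatibility and ignored here.
        lock = self._locks.setdefault(key, asyncio.Lock())
        self._locks.move_to_end(key)  # mark as most recently used
        self._users[key] = self._users.get(key, 0) + 1
        try:
            async with lock:
                yield
        finally:
            self._users[key] -= 1
            self._evict()

    def _evict(self) -> None:
        # Walk from least to most recently used, skipping keys still in use.
        for key in list(self._locks):
            if len(self._locks) <= self._max_size:
                break
            if self._users.get(key, 0) == 0:
                del self._locks[key]
                self._users.pop(key, None)


class SketchMemoryDedup:
    """Seen-set that drops its oldest keys once max_size is exceeded."""

    def __init__(self, max_size: int = 10_000) -> None:
        self._seen: OrderedDict[str, None] = OrderedDict()
        self._max_size = max_size

    async def is_seen(self, key: str) -> bool:
        return key in self._seen

    async def mark_seen(self, key: str, ttl_seconds: int = 3600) -> None:
        # ttl_seconds is ignored; size alone bounds memory in this sketch.
        self._seen[key] = None
        self._seen.move_to_end(key)
        while len(self._seen) > self._max_size:
            self._seen.popitem(last=False)
```

Counting waiters as well as holders avoids evicting a lock that another task is about to acquire, which would otherwise let two tasks hold "the same" key at once.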

`dedup.py`

Purpose: Unified deduplication service consolidating all dedup logic into a single interface. Replaces the former ReviewedMRTracker and ProcessedIssueTracker classes.

Key Classes:

DeduplicationService: Wraps a DeduplicationStore with typed helpers for each event kind
- is_review_seen(project_id: int, mr_iid: int, head_sha: str) -> bool
- mark_review(project_id: int, mr_iid: int, head_sha: str) -> None
- is_note_seen(project_id: int, mr_iid: int, note_id: int) -> bool
- mark_note(project_id: int, mr_iid: int, note_id: int) -> None
- is_issue_seen(issue_key: str) -> bool
- mark_issue(issue_key: str) -> None

Internal Imports: concurrency/

Depended On By: gitlab_webhook.py, gitlab_poller.py, jira_poller.py, app_context.py
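A wrapper of this shape could be sketched as below. The key formats (`review:…`, `note:…`, `issue:…`), the default TTL, the async method signatures, and the tiny InMemoryStore used to keep the example self-contained are all assumptions; the reference only specifies the helper names and their arguments.

```python
from __future__ import annotations


class InMemoryStore:
    """Minimal stand-in for a DeduplicationStore implementation."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    async def is_seen(self, key: str) -> bool:
        return key in self._seen

    async def mark_seen(self, key: str, ttl_seconds: int) -> None:
        self._seen.add(key)  # TTL ignored in this stand-in


class SketchDeduplicationService:
    """Typed helpers that translate event identity into store keys."""

    def __init__(self, store: InMemoryStore, ttl_seconds: int = 86_400) -> None:
        self._store = store
        self._ttl = ttl_seconds  # hypothetical default of one day

    async def is_review_seen(self, project_id: int, mr_iid: int, head_sha: str) -> bool:
        return await self._store.is_seen(f"review:{project_id}:{mr_iid}:{head_sha}")

    async def mark_review(self, project_id: int, mr_iid: int, head_sha: str) -> None:
        await self._store.mark_seen(f"review:{project_id}:{mr_iid}:{head_sha}", self._ttl)

    async def is_note_seen(self, project_id: int, mr_iid: int, note_id: int) -> bool:
        return await self._store.is_seen(f"note:{project_id}:{mr_iid}:{note_id}")

    async def mark_note(self, project_id: int, mr_iid: int, note_id: int) -> None:
        await self._store.mark_seen(f"note:{project_id}:{mr_iid}:{note_id}", self._ttl)

    async def is_issue_seen(self, issue_key: str) -> bool:
        return await self._store.is_seen(f"issue:{issue_key}")

    async def mark_issue(self, issue_key: str) -> None:
        await self._store.mark_seen(f"issue:{issue_key}", self._ttl)
```

Including head_sha in the review key means a new push to the same MR is treated as unseen, while a repeated webhook for the same commit is skipped.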

`events.py`

Purpose: Unified internal event model. Provides TaskEvent Pydantic model that replaces direct webhook payload passing between ingestion and processing layers.

Key Models:

TaskEvent: Pydantic model representing a normalized event from any ingestion source (webhook, poller). Orchestrators and pipelines receive TaskEvent instead of raw webhook payloads.

Internal Imports: models

Depended On By: gitlab_webhook.py, gitlab_poller.py, review_pipeline.py, discussion_pipeline.py, coding_pipeline.py

`pipeline.py`

Purpose: Pipeline protocol and runner for structured multi-stage processing. Defines a 4-stage protocol (prepare, execute, process, cleanup) and a run_pipeline() function that drives any conforming pipeline.

Key Protocols:

Pipeline: Protocol with stages prepare(), execute(), process(), cleanup()

Key Classes:

BasePipelineContext: Base dataclass for pipeline stage context

Key Functions:

run_pipeline(pipeline: Pipeline, context: BasePipelineContext) -> None: Sequential runner that calls each stage, with cleanup in finally block

Internal Imports: None

Depended On By: review_pipeline.py, discussion_pipeline.py, coding_pipeline.py
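The four-stage contract and its runner can be illustrated with a small sketch. The stage signatures (async, taking the context as their only argument), the `log` field on the context, and the DemoPipeline class are assumptions added for the example; only the stage names, the sequential order, and cleanup-in-finally come from the description above.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class BasePipelineContext:
    # `log` is a hypothetical field used only to make stage order visible.
    log: list[str] = field(default_factory=list)


class Pipeline(Protocol):
    async def prepare(self, ctx: BasePipelineContext) -> None: ...
    async def execute(self, ctx: BasePipelineContext) -> None: ...
    async def process(self, ctx: BasePipelineContext) -> None: ...
    async def cleanup(self, ctx: BasePipelineContext) -> None: ...


async def run_pipeline(pipeline: Pipeline, context: BasePipelineContext) -> None:
    """Run stages in order; cleanup runs even if an earlier stage raises."""
    try:
        await pipeline.prepare(context)
        await pipeline.execute(context)
        await pipeline.process(context)
    finally:
        await pipeline.cleanup(context)


class DemoPipeline:
    """Toy pipeline that records each stage and can fail on demand."""

    def __init__(self, fail_in: str | None = None) -> None:
        self._fail_in = fail_in

    async def _stage(self, name: str, ctx: BasePipelineContext) -> None:
        if name == self._fail_in:
            raise RuntimeError(f"{name} failed")
        ctx.log.append(name)

    async def prepare(self, ctx: BasePipelineContext) -> None:
        await self._stage("prepare", ctx)

    async def execute(self, ctx: BasePipelineContext) -> None:
        await self._stage("execute", ctx)

    async def process(self, ctx: BasePipelineContext) -> None:
        await self._stage("process", ctx)

    async def cleanup(self, ctx: BasePipelineContext) -> None:
        await self._stage("cleanup", ctx)
```

Running `asyncio.run(run_pipeline(DemoPipeline(fail_in="execute"), ctx))` raises RuntimeError but still records the cleanup stage, which is the property that lets pipelines remove cloned repositories after a failure.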

`review_pipeline.py`

Purpose: MR review pipeline implementation. Orchestrators call run_pipeline(ReviewPipeline(...), ctx).

Key Classes:

ReviewContext(BasePipelineContext): Context for review stages (settings, event, executor, etc.)
ReviewPipeline: Implements Pipeline — clone, review via LLM, parse, post comments

Internal Imports: pipeline, events, git/, gitlab_client, review_engine, comment_parser, comment_poster, telemetry/

Depended On By: gitlab_webhook.py, gitlab_poller.py

`discussion_pipeline.py`

Purpose: Discussion interaction pipeline implementation. Handles @mention and thread-reply interactions.

Key Classes:

DiscussionContext(BasePipelineContext): Context for discussion stages
DiscussionPipeline: Implements Pipeline — clone, fetch context, LLM, reply ± commit/push ± resolve

Internal Imports: pipeline, events, git/, gitlab_client, discussion_engine, coding_workflow, telemetry/, concurrency/

Depended On By: gitlab_webhook.py, gitlab_poller.py

`coding_pipeline.py`

Purpose: Jira coding task pipeline implementation. Handles issue-to-MR workflow.

Key Classes:

CodingContext(BasePipelineContext): Context for coding stages
CodingPipeline: Implements Pipeline — clone, branch, code via LLM, apply result, commit, push, create MR

Internal Imports: pipeline, events, git/, gitlab_client, jira_client, coding_engine, coding_workflow, telemetry/, concurrency/

Depended On By: jira_poller.py (as Jira poller handler)

`state.py`

Purpose: Factory functions for concurrency primitives (lock, dedup, result store, task queue). Uses in-memory implementations for lock/dedup and delegates to Azure Storage for result store and task queue when configured.

Key Functions:

create_lock() -> DistributedLock: Factory — returns MemoryLock (single-controller deployment)
create_dedup() -> DeduplicationStore: Factory — returns MemoryDedup (single-controller deployment)
create_result_store(*, azure_storage_account_url, azure_storage_connection_string, task_blob_container) -> ResultStore: Factory — returns BlobResultStore when Azure Storage is configured, otherwise MemoryResultStore
create_task_queue(*, azure_storage_queue_url, azure_storage_account_url, azure_storage_connection_string, task_queue_name, task_blob_container) -> TaskQueue: Factory — returns AzureStorageTaskQueue when Azure Storage is configured, otherwise MemoryTaskQueue

Internal Imports: concurrency/, azure_storage (lazy)

Depended On By: main.py
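The "Azure when configured, otherwise in-memory" selection could be sketched along these lines. The two store classes are placeholders defined inline (the real BlobResultStore lives in azure_storage and is imported lazily), and treating "configured" as "account URL or connection string is set", as well as the default container name, are assumptions.

```python
from __future__ import annotations


class MemoryResultStore:
    """Placeholder for the in-memory result store."""


class BlobResultStore:
    """Placeholder for the Azure Blob-backed result store."""

    def __init__(self, *, account_url: str | None, connection_string: str | None,
                 container: str) -> None:
        self.account_url = account_url
        self.connection_string = connection_string
        self.container = container


def create_result_store(
    *,
    azure_storage_account_url: str | None = None,
    azure_storage_connection_string: str | None = None,
    task_blob_container: str = "tasks",  # hypothetical default
) -> MemoryResultStore | BlobResultStore:
    """Pick the Azure-backed store only when some Azure credential is present."""
    if azure_storage_account_url or azure_storage_connection_string:
        # In the real module this branch would import azure_storage lazily,
        # so deployments without Azure never load the SDK.
        return BlobResultStore(
            account_url=azure_storage_account_url,
            connection_string=azure_storage_connection_string,
            container=task_blob_container,
        )
    return MemoryResultStore()
```

Keyword-only arguments keep call sites explicit about which setting feeds which parameter.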

Telemetry

`telemetry/` package (formerly `telemetry.py`)

Purpose: OpenTelemetry tracing, metrics, and log export setup. Split from telemetry.py into a package.

Submodules:

tracing.py: TracerProvider setup, get_tracer(), span utilities
logging.py: Structlog configuration, add_trace_context() processor, emit_to_otel_logs() processor
exporters.py: OTLP exporter configuration (gRPC and HTTP/protobuf)
_state.py: Module-level state for provider instances

Key Constants:

_SERVICE_NAME = "gitlab-copilot-agent"

Key Functions (re-exported from telemetry/__init__.py):

init_telemetry() -> None: Configure OTEL providers, exporters, auto-instrumentation (FastAPI, httpx)
- No-op if OTEL_EXPORTER_OTLP_ENDPOINT unset
- Sets up TracerProvider, MeterProvider, LoggerProvider
- Exports via OTLP gRPC
shutdown_telemetry() -> None: Flush and shutdown providers
get_tracer(name: str) -> trace.Tracer: Get tracer instance
add_trace_context(logger, method, event_dict) -> dict: Structlog processor injecting trace_id, span_id
emit_to_otel_logs(logger, method, event_dict) -> dict: Structlog processor re-emitting to stdlib logging for OTLP export

Internal Imports: None

Depended On By: main.py, copilot_session.py, git/, jira_poller.py, review_pipeline.py, discussion_pipeline.py, coding_pipeline.py

`metrics.py`

Purpose: Shared OTel metrics instruments.

Key Constants:

METER_NAME = "gitlab_copilot_agent"

Key Metrics:

reviews_total (Counter): Total MR reviews processed (labels: outcome)
reviews_duration (Histogram): MR review duration in seconds (labels: outcome)
coding_tasks_total (Counter): Total coding tasks processed (labels: outcome)
coding_tasks_duration (Histogram): Coding task duration in seconds (labels: outcome)
webhook_received_total (Counter): Total webhooks received (labels: object_kind)
webhook_errors_total (Counter): Webhook background errors (labels: handler)
copilot_session_duration (Histogram): Copilot session duration in seconds (labels: task_type)

Internal Imports: None

Depended On By: copilot_session.py, gitlab_webhook.py

`process_sandbox.py`

Purpose: Copilot CLI binary resolution.

Key Functions:

_get_real_cli_path() -> str: Resolve bundled Copilot CLI binary path from github-copilot-sdk package

Internal Imports: None

Depended On By: copilot_session.py

`plugin_manager.py`

Purpose: Runtime plugin installation into isolated per-session HOME directories.

Key Functions:

setup_plugins(home_dir, plugins, marketplaces): Install marketplaces and plugins into an isolated HOME
add_marketplace(home_dir, marketplace_url): Register a custom plugin marketplace
install_plugin(home_dir, plugin_spec): Install a single Copilot CLI plugin
_run_cli(args, home_dir, timeout): Execute a copilot CLI command with timeout and kill-on-timeout
_sanitize_url(url): Strip credentials and query params from URLs for safe logging

Internal Imports: process_sandbox.get_real_cli_path

Depended On By: copilot_session.py
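As an illustration of the URL sanitising step, a helper that strips credentials and query parameters before logging might be written as follows. This is a sketch built on the standard library's urllib.parse; the public name sanitize_url and the decision to drop the fragment as well are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit


def sanitize_url(url: str) -> str:
    """Return url without userinfo, query string, or fragment."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals lose their brackets in .hostname; restore them.
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # Rebuild from scheme, bare host[:port], and path only.
    return urlunsplit((parts.scheme, host, parts.path, "", ""))
```

For example, `sanitize_url("https://user:tok@gitlab.example.com/group/repo.git?private_token=abc")` yields `https://gitlab.example.com/group/repo.git`, so a marketplace URL can be logged without exposing an embedded token.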

Summary Table

Module	Layer	LOC (approx)	Key Responsibility
`main.py`	Ingestion	168	FastAPI app, lifespan, pollers
`gitlab_webhook.py`	Ingestion	117	Webhook endpoint, HMAC validation
`gitlab_poller.py`	Ingestion	175	MR/note discovery
`jira_poller.py`	Ingestion	90	Issue discovery
`review_engine.py`	Processing	160	Review prompt construction
`coding_engine.py`	Processing	109	Coding prompt construction
`prompt_defaults.py`	Processing	164	System prompt defaults & resolution
`discussion_engine.py`	Processing	147	Discussion prompt construction & response parsing
`coding_workflow.py`	Processing	~80	Shared helper for applying coding results
`pipeline.py`	Processing	~120	Pipeline protocol + runner
`review_pipeline.py`	Processing	~180	Review pipeline implementation
`discussion_pipeline.py`	Processing	~220	Discussion pipeline implementation
`coding_pipeline.py`	Processing	~150	Coding pipeline implementation
`task_executor.py`	Execution	53	TaskExecutor protocol
`remote_executor.py`	Execution	162	Unified claim-check dispatch (K8s + ACA)
`copilot_session.py`	Execution	143	SDK wrapper
`task_runner.py`	Execution	134	K8s Job entrypoint
`gitlab_client.py`	Clients	224	GitLab API client
`jira_client.py`	Clients	117	Jira API client
`git/`	Utils	~250	Git CLI wrappers (clone, branch, commit, push, patch, archive)
`comment_parser.py`	Utils	75	Review parsing
`comment_poster.py`	Utils	184	Comment posting + activity summary
`repo_config.py`	Utils	184	Repo config discovery
`models.py`	Data	79	Webhook models
`jira_models.py`	Data	87	Jira API models
`discussion_models.py`	Data	71	MR discussion history models
`config/`	Data	~200	Settings (env vars, mixins, validators)
`concurrency/`	State	~150	In-memory locks/dedup protocols + implementations
`dedup.py`	State	~80	Unified DeduplicationService
`events.py`	Data	~80	TaskEvent internal event model
`state.py`	State	79	Factory functions for concurrency primitives
`telemetry/`	Telemetry	~180	OTEL setup (tracing, logging, exporters)
`metrics.py`	Telemetry	52	Metrics instruments
`process_sandbox.py`	Utils	20	CLI path resolution
`plugin_manager.py`	Utils	88	Plugin installation

Total: 35 modules/packages, ~4,500 lines of code
