Matas/fix/fix actions #41
Conversation
Caution: Review failed. The pull request is closed.

Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Introduces public repository analysis with unauthenticated and authenticated access, per-IP/per-user rate limiting, a graph-based repository analysis agent backed by GitHub integrations and LLM-driven rule generation, widespread Ruff-centric tooling consolidation, and expanded API, models, and tests for the new analysis flow.
Sequence Diagram(s)

```mermaid
sequenceDiagram
actor Client
participant API as API Endpoint
participant Auth as Auth Dependency
participant RateLimit as Rate Limiter
participant Agent as Repository Analysis Agent
participant GitHub as GitHub API
participant LLM as LLM Provider
participant Response as Response Builder
Client->>API: POST /api/v1/rules/recommend (repo_url, optional token)
API->>Auth: get_current_user_optional()
Auth-->>API: User | None
API->>RateLimit: rate_limiter(request, user)
alt Over limit
RateLimit-->>API: HTTP 429 + Retry-After
API-->>Client: 429 Rate Limited
else Under limit
RateLimit-->>API: Proceed
API->>Agent: execute(repo_full_name, ...)
Agent->>GitHub: fetch_repository_metadata()
GitHub-->>Agent: repo signals (files, languages, workflows)
Agent->>GitHub: fetch_pr_signals()
GitHub-->>Agent: PR data
Agent->>Agent: compute_hygiene_metrics()
Agent->>LLM: generate_rule_recommendations()
LLM-->>Agent: RuleRecommendation[]
Agent-->>API: AnalysisState (recommendations, metadata)
API->>Response: Build AnalysisResponse
Response-->>Client: 200 {recommendations, features, metadata}
end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review details

Configuration used: Organization UI
Review profile: CHILL
Plan: Pro

⛔ Files ignored due to path filters (1)
📒 Files selected for processing (26)

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Summary of Changes

Hello @MT-superdev, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request delivers a significant overhaul of the repository analysis and rule recommendation system. The primary goal is to enhance the robustness, maintainability, and intelligence of the platform, particularly through the introduction of an 'AI Immune System' capable of detecting and mitigating low-quality or AI-generated contributions. This is achieved by refactoring the core agent to use a LangGraph-based state machine, improving GitHub API interactions, strengthening type safety, and implementing comprehensive error handling and structured logging. Additionally, the API now supports unauthenticated access for public repositories with appropriate rate limiting, making the service more accessible while maintaining stability.

Highlights
Ignored Files
Codecov Report

❌ Patch coverage is 63.6%.
❌ Your patch status has failed because the patch coverage (63.6%) is below the target coverage (80.0%). You can increase the patch coverage or adjust the target coverage.

```diff
@@ Coverage Diff @@
## main #41 +/- ##
========================================
+ Coverage 32.8% 43.6% +10.7%
========================================
Files 85 101 +16
Lines 5162 6104 +942
========================================
+ Hits 1698 2666 +968
+ Misses 3464 3438 -26
```

Continue to review full report in Codecov by Sentry.
Code Review
This pull request introduces several significant improvements to the Watchflow project, including enhanced URL parsing, structured logging, specific error handling, improved type safety, new data models, GitHub integration enhancements, rate limiting, and execution timeouts. The code review focuses on identifying potential issues related to correctness and maintainability, particularly concerning error handling and adherence to best practices.
```diff
 # Linter (Fixes imports, modernizes syntax, checks bugs)
 - id: ruff
   name: "Linting and import sorting"
-  args: ["--fix"]
-
-# Pyupgrade: Check and fix Python version incompatibilities and outdated syntax
-- repo: https://github.com/asottile/pyupgrade
-  rev: v3.15.2
-  hooks:
-    - id: pyupgrade
-      name: "Upgrade syntax for Python 3.12+"
-      args: [--py312-plus]
+  # Auto-fixes outdated syntax to Python 3.12+ compatible code
+  args: [--fix, --exit-non-zero-on-fix]
 # Formatter (Replaces Black)
 - id: ruff-format
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check-ast hook has been removed. While Ruff generally covers AST checks, ensure that the removal of this specific hook doesn't inadvertently reduce the scope of syntax validation, potentially missing edge cases that check-ast might have caught. Consider adding a Ruff rule that specifically covers the checks previously done by check-ast.
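For context, `check-ast`'s core behavior is simply verifying that each file parses; the following is an illustrative stand-in sketch (not the hook's actual source):

```python
# Illustrative stand-in for pre-commit's check-ast hook: it only verifies
# that each file parses into a valid Python AST.
import ast
import sys


def check_ast(filenames: list[str]) -> int:
    exit_code = 0
    for name in filenames:
        with open(name, "rb") as f:
            source = f.read()
        try:
            ast.parse(source, filename=name)
        except SyntaxError as exc:
            print(f"{name}: failed to parse: {exc}")
            exit_code = 1
    return exit_code


if __name__ == "__main__":
    sys.exit(check_ast(sys.argv[1:]))
```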
| "ruff>=0.1.0", # Replaces black, isort, flake8 | ||
| ] |
```python
except Exception as e:
    # Catching Exception here is only for the top-level orchestration safety
    logger.exception("agent_execution_failed", agent="repository_analysis", error=str(e))
    return AgentResult(success=False, message=str(e))
```
```python
from src.integrations.github.api import github_client
from src.integrations.providers.factory import get_chat_model

import logging
from typing import Any

logger = structlog.get_logger()
```
```python
except openai.OpenAIError as e:
    logger.error("rule_generation_failed", repo=repo_name, error=str(e), error_type="llm_provider_error")
    fallback_reason = f"AI provider error: {e.__class__.__name__}"
except pydantic.ValidationError as e:
    logger.error("rule_generation_failed", repo=repo_name, error=str(e), error_type="schema_mismatch")
    fallback_reason = "AI model returned data in an unexpected format."
except Exception as e:
    logger.error("rule_generation_failed", repo=repo_name, error=str(e), error_type="unknown_error", exc_info=True)
    fallback_reason = f"An unexpected error occurred: {str(e)}"
```
The `except openai.OpenAIError`, `except pydantic.ValidationError`, and final `except Exception` blocks together catch all exceptions. While this prevents the agent from crashing, the broad catch-all might mask specific, actionable exceptions. Consider logging the exception type and message for better debugging and potentially handling specific exception types differently.
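To illustrate, a sketch of narrower handling might look like the following. `call_llm` is a hypothetical stand-in for the real generation call; `openai.RateLimitError` and `openai.APIConnectionError` are subclasses of `openai.OpenAIError` in the v1 SDK:

```python
import openai
import pydantic
import structlog

logger = structlog.get_logger()


async def call_llm(repo_name: str) -> list:
    """Hypothetical stand-in for the real LLM generation call."""
    return []


async def generate_rules_safely(repo_name: str) -> list | None:
    try:
        return await call_llm(repo_name)
    except openai.RateLimitError as e:
        # Actionable: the caller can retry with backoff instead of falling back.
        logger.error("rule_generation_failed", repo=repo_name, error=str(e), error_type="rate_limited")
    except openai.APIConnectionError as e:
        # Actionable: likely a transient network failure.
        logger.error("rule_generation_failed", repo=repo_name, error=str(e), error_type="network")
    except openai.OpenAIError as e:
        logger.error("rule_generation_failed", repo=repo_name, error=str(e), error_type="llm_provider_error")
    except pydantic.ValidationError as e:
        logger.error("rule_generation_failed", repo=repo_name, error=str(e), error_type="schema_mismatch")
    return None
```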
| logger.warning(f"Invalid URL provided by {client_ip}: {repo_url_str}") | ||
| raise HTTPException(status_code=status.HTTP_422_UNPROCESSABLE_ENTITY, detail=str(e)) from e | ||
|
|
||
| # Step 2: Rate limiting—TODO: use Redis. For now, agent handles GitHub 429s internally. |
| logger.error(f"Error listing PRs for {repo}: {e}") | ||
| return [] | ||
|
|
||
| async def _get_session(self) -> aiohttp.ClientSession: |
```python
# Fallback: ensure we have a valid session
self._session = aiohttp.ClientSession()
return self._session
```
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/rules/validators.py (1)
622-630: Exception handling may fail if `pytz` is not installed.

The exception clause references `pytz.exceptions.UnknownTimeZoneError`, but if `pytz` is not installed, this will raise a `NameError` when Python evaluates the exception tuple, before catching `ImportError`.

🐛 Suggested fix
```diff
 try:
     import pytz

     tz = pytz.timezone(timezone_str)
     current_time = datetime.now(tz)
-except (ImportError, pytz.exceptions.UnknownTimeZoneError):
-    # Fallback to UTC if pytz is not available or timezone is invalid
-    logger.warning(f"Invalid timezone '{timezone_str}', using UTC")
-    current_time = datetime.now()
+except ImportError:
+    logger.warning(f"pytz not available, using local time for timezone '{timezone_str}'")
+    current_time = datetime.now()
+except Exception as e:
+    # Catches UnknownTimeZoneError and other pytz errors
+    logger.warning(f"Invalid timezone '{timezone_str}': {e}, using local time")
+    current_time = datetime.now()
```

Alternatively, move the import to the top of the file and handle gracefully:
```python
try:
    import pytz

    PYTZ_AVAILABLE = True
except ImportError:
    PYTZ_AVAILABLE = False
```

src/api/rules.py (1)
14-27: Missing error handling for agent execution.

The comment on line 19 notes that "agent may throw if rule malformed," but there's no try/except block. An unhandled exception would result in a 500 error with potentially sensitive stack trace information in the response.
Consider adding explicit error handling:
Proposed fix
```diff
 @router.post("/rules/evaluate")
 async def evaluate_rule(request: RuleEvaluationRequest):
     # Agent: uses central config—change here affects all rule evals.
     agent = get_agent("feasibility")
-    # Async call—agent may throw if rule malformed.
-    result = await agent.execute(rule_description=request.rule_text)
-
-    # Output: keep format stable for frontend. Brittle if agent changes keys.
-    return {
-        "supported": result.data.get("is_feasible", False),
-        "snippet": result.data.get("yaml_content", ""),
-        "feedback": result.message,
-    }
+    try:
+        result = await agent.execute(rule_description=request.rule_text)
+        return {
+            "supported": result.data.get("is_feasible", False),
+            "snippet": result.data.get("yaml_content", ""),
+            "feedback": result.message,
+        }
+    except Exception as e:
+        return {
+            "supported": False,
+            "snippet": "",
+            "feedback": f"Evaluation failed: {str(e)}",
+        }
```
🤖 Fix all issues with AI agents
In `@docs/benchmarks.md`:
- Line 9: The phrase "70 + enterprise policies" contains an extra space and
should be normalized to "70+ enterprise policies" to match the later usage;
update the string "Our analysis of 70 + enterprise policies..." to remove the
space between 70 and + so it reads "70+ enterprise policies" (leave surrounding
text and "85% of real-world governance policies require context" unchanged).
In `@src/agents/feasibility_agent/agent.py`:
- Line 2: Fix the double space in the module/class docstring: locate the
module-level docstring at the top of src/agents/feasibility_agent/agent.py (and
any docstring on the FeasibilityAgent class if present) that currently reads
"error handling and retry logic." and remove the extra space so it reads "error
handling and retry logic."
In `@src/agents/repository_analysis_agent/models.py`:
- Around line 6-17: Replace the manual string hacks in
parse_github_repo_identifier with giturlparse parsing: call
giturlparse.parse(identifier), and if the result is valid use
f"{parsed.owner}/{parsed.repo}" (or parsed.owner + "/" + parsed.name if repo is
empty) to normalize SSH and HTTPS forms; if giturlparse reports an invalid URL,
fall back to the existing strip/remove logic (remove protocol/domain, ".git",
and trailing slashes) or raise a clear ValueError to surface bad input. Ensure
you update parse_github_repo_identifier to import and use giturlparse so SSH
URLs like "[email protected]:owner/repo.git" and https URLs are handled
consistently.
In `@src/agents/repository_analysis_agent/nodes.py`:
- Around line 111-113: The calls to github_client.get_file_content and similar
methods in nodes.py are hardcoding installation_id=None which prevents access to
private repos; change those calls (e.g., the get_file_content invocation and the
other occurrences around the blocks you flagged) to pass the actual
installation_id available in the current scope (or a provided parameter like
installation_id) instead of None so authenticated requests are used for private
repos; update every instance you noted (lines ~111, ~126, ~145, ~195 ranges) to
forward the real installation_id variable to github_client methods.
In `@src/api/dependencies.py`:
- Around line 34-41: The bearer parsing currently accepts any token and returns
a fixed User; update the logic around auth_header / scheme / token (the block
that does scheme, token = auth_header.split() and returns User(...,
github_token=token)) so that you do not grant a fixed authenticated identity for
arbitrary strings: either validate the token with the real IdP or GitHub (verify
JWT signature or call the appropriate token-introspect/verify endpoint) and
populate User with the real identity on success, or treat unverified tokens as
anonymous and return None/anonymous user instead of User(id=123,...). Replace
the TODO stub with a call to a verification helper (e.g., verify_idp_token or
verify_github_token) and only construct/return User when that verification
passes.
In `@src/api/rate_limit.py`:
- Around line 21-28: The rate_limiter dependency always receives user=None
because it uses Depends(lambda: None); replace that with
Depends(get_current_user_optional) (and add an import for
get_current_user_optional if missing) so the rate_limiter function can detect
authenticated users by inspecting the user argument and correctly switch between
AUTH_LIMIT and ANON_LIMIT (refer to the rate_limiter function and
get_current_user_optional dependency).
In `@src/api/recommendations.py`:
- Around line 88-99: Replace raw PII in logs by hashing or redacting the user
identifier and client IP before logging: compute a one-way hash (e.g., sha256)
of client_ip and user.email (fallback values like "unknown"/"Anonymous" can be
hashed too) and use those hashed values in the logger.info and logger.warning
messages around repo_url_str and parse_repo_from_url; ensure the new variables
(hashed_user_id, hashed_client_ip) are used in place of user_id/client_ip so no
raw email or IP is written to logs.
- Line 5: The model currently imports and uses pydantic.HttpUrl which blocks
SSH/Git URLs; change the affected model fields from HttpUrl to plain str (remove
HttpUrl usage and its import) so SSH/git@... URLs are accepted, and rely on the
existing parse_repo_from_url function to validate/parse those strings (update
any model or field declarations that reference HttpUrl—e.g., the request model
fields at the top of src/api/recommendations.py and the fields around lines
27-29—to be str and keep Field(...) metadata unchanged).
- Around line 46-62: The host validation in parse_repo_from_url is too
permissive and allows lookalike domains; update the check to first ensure p.host
is not None and then perform a case-insensitive exact equality comparison
against "github.com" (e.g., compare p.host.lower() == "github.com") instead of
using `"github.com" in p.host`, leaving the rest of the ValueError behavior
intact; reference parse_repo_from_url and the p.host field when making this
change.
In `@src/core/models.py`:
- Around line 4-18: Change the User.github_token field to use pydantic's
SecretStr and mark it excluded from serialization; specifically import SecretStr
and update the github_token declaration in class User to use type SecretStr with
Field(default=None, description="OAuth token for GitHub API access",
exclude=True, repr=False) so the secret is stored securely and omitted from
model_dump()/serialization and reprs.
In `@src/integrations/github/api.py`:
- Around line 950-953: Inside the function that performs HTTP requests in
src/integrations/github/api.py remove the local "import httpx" and the except
block catching httpx.HTTPStatusError (this is dead because the code uses
aiohttp); instead catch aiohttp.ClientResponseError (or aiohttp.ClientError for
broader coverage) in the same except block and log the caught exception via the
existing logger (structlog.get_logger()) so the error details are recorded;
locate the try/except around the aiohttp request and replace the httpx-specific
handler with the aiohttp exception type and remove the unused import.
In `@src/integrations/github/service.py`:
- Around line 66-74: The current loop treating any non-200 from the contents API
as "file missing" hides repo-level errors (auth, rate limits, missing repo); add
a preflight repo validation before iterating files: call the repo endpoint
(e.g., f"{self.BASE_URL}/repos/{owner}/{repo}") or an authenticated HEAD on the
repo to check for 200 and explicitly handle 401/403/404/rate-limit responses
(raise or return an error) rather than appending to found_files; keep the
existing per-file checks only after the preflight succeeds and reuse headers and
the AsyncClient used around files_to_check to avoid duplicate clients.
In `@src/main.py`:
- Around line 88-96: The current CORSMiddleware setup (app.add_middleware with
CORSMiddleware and allow_origins=["*"], allow_credentials=True) violates the
CORS spec; either replace the wildcard origins with an explicit list of allowed
origins (set allow_origins to the exact host(s) you expect) when keeping
allow_credentials=True, or if you must keep wildcard origins set
allow_credentials=False; update the CORSMiddleware config accordingly in the
app.add_middleware call to use explicit origins or disable credentials so
browsers will accept the responses.
In `@tests/conftest.py`:
- Around line 5-48: The hard-coded base64 literal in the mock_settings fixture
(PRIVATE_KEY_BASE64_GITHUB) triggers secret scanners; change it to compute the
value at runtime instead (e.g., base64-encode a non-secret byte string) so tests
still get a valid value without a literal. Update the mock_settings fixture to
import/use Python's base64 (or encode b"mock-key" and decode to str) and set
PRIVATE_KEY_BASE64_GITHUB to that computed string rather than the literal
"bW9jay1rZXk=".
🧹 Nitpick comments (11)
src/core/utils/logging.py (1)
148-149: Comment is slightly verbose but harmless.

The comment on line 148 restates what the TYPE_CHECKING block already conveys. Consider simplifying or removing it, but this is purely cosmetic.
Optional: simplify the comment
```diff
-# Callable is now only needed for typing, so it's safe to use the string name or handled by __future__
+# Type annotation only; Callable imported under TYPE_CHECKING
 log_fn: Callable[..., Any] = getattr(logger_obj, level, logger_obj.info)
```

src/core/utils/caching.py (1)
161-163: Minor edge case with `ttl=0`.

The ternary refactor is clean. Note that `ttl=0` would be falsy, causing `TTLCache` with `ttl=3600` to be created instead of `AsyncCache` with zero TTL. This is unlikely to be a real issue since a zero-TTL cache has no practical use, but consider using `ttl is not None` if you want to allow explicit zero values in the future.

```diff
-cache = AsyncCache(maxsize=maxsize, ttl=ttl) if ttl else TTLCache(maxsize=maxsize, ttl=ttl or 3600)
+cache = AsyncCache(maxsize=maxsize, ttl=ttl) if ttl is not None else TTLCache(maxsize=maxsize, ttl=3600)
```

src/rules/validators.py (3)
201-206: Consider using the existing `_compile_glob` function.

There are two glob-to-regex conversion implementations in this file:

- `_compile_glob` (line 13): handles `**` for multi-directory matching and uses caching
- `_glob_to_regex` (line 202): simpler, doesn't distinguish `*` from `**`

The top-level `_compile_glob` is more robust and already used by `_matches_any`. Consider consolidating to avoid inconsistent behavior.

♻️ Suggested refactor
```diff
 @staticmethod
 def _glob_to_regex(glob_pattern: str) -> str:
     """Converts a glob pattern to a regex pattern."""
-    # Simple glob→regex—fragile, production needs better.
-    regex = glob_pattern.replace(".", "\\.").replace("*", ".*").replace("?", ".")
-    return f"^{regex}$"
+    # Reuse the module-level implementation for consistency.
+    return _compile_glob(glob_pattern).pattern
```
229-243: Consider consolidating `ApprovalCountCondition` with `MinApprovalsCondition`.

`ApprovalCountCondition` (always returns `True`) duplicates the intent of `MinApprovalsCondition` (lines 279-302), which has a working implementation. Having both in the registry under different names (`has_min_approvals` and `min_approvals`) could cause confusion.

Consider either (see the sketch after this list):

- Removing `ApprovalCountCondition` and aliasing `has_min_approvals` to `MinApprovalsCondition`
- Delegating `ApprovalCountCondition.validate` to `MinApprovalsCondition`
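A minimal sketch of the delegation option; the class and method names come from the comment above, but the parameter shapes and bodies are assumptions:

```python
from typing import Any


class MinApprovalsCondition:
    """Working implementation referenced at lines 279-302 (body illustrative)."""

    async def validate(self, params: dict[str, Any], event: dict[str, Any]) -> bool:
        approvals = event.get("approval_count", 0)
        return approvals >= params.get("min_approvals", 1)


class ApprovalCountCondition:
    """Keeps the has_min_approvals registry name but reuses the working logic."""

    def __init__(self) -> None:
        self._delegate = MinApprovalsCondition()

    async def validate(self, params: dict[str, Any], event: dict[str, Any]) -> bool:
        return await self._delegate.validate(params, event)
```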
189-199: `_get_changed_files` always returns an empty list.

Both the `pull_request` and `push` branches return empty lists, making `FilePatternCondition` effectively non-functional: it will always return `False` (lines 171-173). The TODOs acknowledge this, but consider adding a more prominent warning or raising `NotImplementedError` to make the placeholder status explicit during testing.

♻️ Optional: Make placeholder status explicit
```diff
 def _get_changed_files(self, event: dict[str, Any]) -> list[str]:
     """Extracts the list of changed files from the event."""
     event_type = event.get("event_type", "")
     if event_type == "pull_request":
         # TODO: Pull request—fetch changed files via GitHub API. Placeholder for now.
+        logger.warning("FilePatternCondition: Changed files extraction not implemented for pull_request")
         return []
     elif event_type == "push":
         # Push event—files in commits, not implemented.
+        logger.warning("FilePatternCondition: Changed files extraction not implemented for push")
         return []
     else:
         return []
```
47-112: Deduplication infrastructure is unused.

The `_create_event_hash` method (lines 47-112) and the `event_hashes` dict are fully implemented, but `event_hash` is always set to `None` at line 126 with the comment "No deduplication for now". This means duplicate webhook events will be processed multiple times.

If deduplication is not intended for this release, consider either:
- Adding a TODO comment explaining when this will be enabled, or
- Removing the unused code to reduce maintenance burden.
Would you like me to open an issue to track enabling the deduplication feature?
Also applies to: 126-126
.pre-commit-config.yaml (1)
17-24: Ruff version is significantly outdated.

The Ruff version `v0.3.0` is from early 2024. The latest release is `v0.14.13` (January 2026). Consider updating to a recent version for improved linting coverage, bug fixes, and new rules.
243-250: Session lifecycle pattern doesn't align with aiohttp best practices.

Accessing `self._session._loop` (a private attribute) is fragile, but the deeper issue is that maintaining a long-lived session and trying to detect event loop changes contradicts aiohttp's design. According to aiohttp documentation, sessions should be created and managed within their intended event loop context, typically using `async with aiohttp.ClientSession()` or ensuring the session lifecycle is tied to a proper async startup/shutdown hook. The suggested alternative using `connector.closed` still relies on implementation details.

Instead of detecting loop changes, align with aiohttp's recommendation: manage the session's lifecycle relative to the running event loop, or accept it as a parameter from the caller.
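A sketch of that direction, tying the session to an async context instead of probing private attributes (the class name is reused from the discussion; details are assumptions):

```python
import aiohttp


class GitHubClient:
    """Owns a ClientSession only while inside an async context."""

    def __init__(self) -> None:
        self._session: aiohttp.ClientSession | None = None

    async def __aenter__(self) -> "GitHubClient":
        self._session = aiohttp.ClientSession()
        return self

    async def __aexit__(self, *exc_info: object) -> None:
        if self._session is not None:
            await self._session.close()
            self._session = None

    @property
    def session(self) -> aiohttp.ClientSession:
        if self._session is None or self._session.closed:
            raise RuntimeError("GitHubClient must be used inside 'async with'")
        return self._session
```

With FastAPI, the same idea maps onto startup/shutdown (lifespan) hooks so the session is always created inside the running event loop.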
src/integrations/github/service.py (1)
107-115: Unify GitHub URL parsing to handle SSH/.git consistently.

`split("/")` fails for SSH URLs and preserves `.git` suffixes, which can generate invalid API paths if those forms ever reach this layer. Consider reusing the same robust parsing strategy used elsewhere (e.g., giturlparse) or move it into a shared utility.

♻️ Suggested refactor (giturlparse)
```diff
-import logging
-from typing import Any
+import logging
+from typing import Any
+
+from giturlparse import parse as parse_git_url
 ...
     def _parse_url(self, url: str) -> tuple[str, str]:
         """
         Extracts owner and repo from https://github.com/owner/repo
         """
-        clean_url = str(url).rstrip("/")
-        parts = clean_url.split("/")
-        if len(parts) < 2:
-            raise ValueError("Invalid GitHub URL")
-        return parts[-2], parts[-1]
+        p = parse_git_url(str(url))
+        if not p.valid or not p.owner or not p.repo or "github.com" not in (p.host or ""):
+            raise ValueError("Invalid GitHub URL")
+        return p.owner, p.repo
```

src/api/dependencies.py (1)
8-9: Remove duplicate logger initialization.

This is harmless but redundant.
♻️ Suggested cleanup
```diff
-logger = logging.getLogger(__name__)  # Logger: keep at module level for reuse.
-logger = logging.getLogger(__name__)
+logger = logging.getLogger(__name__)  # Logger: keep at module level for reuse.
```

src/main.py (1)
121-125: Snapshot tasks list to keep counts consistent.

Multiple passes over a live view can drift if tasks mutate mid-request; take a snapshot once.
♻️ Suggested refactor
```diff
-    tasks = task_queue.tasks.values()
+    tasks = list(task_queue.tasks.values())
     pending_count = sum(1 for t in tasks if t.status.value == "pending")
     running_count = sum(1 for t in tasks if t.status.value == "running")
     completed_count = sum(1 for t in tasks if t.status.value == "completed")
     failed_count = sum(1 for t in tasks if t.status.value == "failed")
 @@
-        "total": len(tasks),
+        "total": len(tasks),
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (45)
- .github/workflows/docs.yaml
- .pre-commit-config.yaml
- README.md
- docs/benchmarks.md
- docs/features.md
- linting_commands.txt
- pyproject.toml
- src/__init__.py
- src/agents/base.py
- src/agents/feasibility_agent/agent.py
- src/agents/repository_analysis_agent/agent.py
- src/agents/repository_analysis_agent/models.py
- src/agents/repository_analysis_agent/nodes.py
- src/agents/repository_analysis_agent/prompts.py
- src/agents/repository_analysis_agent/test_agent.py
- src/api/dependencies.py
- src/api/rate_limit.py
- src/api/recommendations.py
- src/api/rules.py
- src/core/config/settings.py
- src/core/models.py
- src/core/utils/caching.py
- src/core/utils/logging.py
- src/integrations/github/api.py
- src/integrations/github/service.py
- src/integrations/providers/bedrock_provider.py
- src/integrations/providers/factory.py
- src/main.py
- src/rules/models.py
- src/rules/utils/contributors.py
- src/rules/validators.py
- src/tasks/scheduler/deployment_scheduler.py
- src/tasks/task_queue.py
- src/webhooks/auth.py
- src/webhooks/dispatcher.py
- src/webhooks/handlers/deployment.py
- src/webhooks/handlers/deployment_status.py
- src/webhooks/handlers/issue_comment.py
- src/webhooks/handlers/pull_request.py
- src/webhooks/router.py
- tests/conftest.py
- tests/integration/test_recommendations.py
- tests/integration/test_rules_api.py
- tests/unit/agents/test_repository_analysis_models.py
- tests/unit/api/test_recommendations.py
💤 Files with no reviewable changes (1)
- src/agents/repository_analysis_agent/test_agent.py
🧰 Additional context used
🧬 Code graph analysis (16)
tests/unit/agents/test_repository_analysis_models.py (1)
  src/agents/repository_analysis_agent/models.py (4): HygieneMetrics (119-141), PRSignal (94-116), RepositoryAnalysisRequest (20-37), parse_github_repo_identifier (6-17)
src/webhooks/handlers/deployment.py (1)
  src/core/models.py (1): repo_full_name (56-58)
tests/integration/test_recommendations.py (1)
  src/core/utils/caching.py (1): get (44-66)
src/agents/repository_analysis_agent/agent.py (3)
  src/agents/base.py (5): AgentResult (16-29), BaseAgent (32-119), _build_graph (55-60), execute (114-119), _execute_with_timeout (93-111)
  src/agents/repository_analysis_agent/nodes.py (3): fetch_repository_metadata (58-175), fetch_pr_signals (178-258), generate_rule_recommendations (261-336)
  src/core/models.py (1): repo_full_name (56-58)
tests/unit/api/test_recommendations.py (1)
  src/api/recommendations.py (1): parse_repo_from_url (46-62)
src/agents/base.py (4)
  src/agents/repository_analysis_agent/agent.py (2): _build_graph (22-40), execute (42-76)
  src/agents/feasibility_agent/agent.py (2): _build_graph (34-55), execute (57-123)
  src/agents/acknowledgment_agent/agent.py (2): _build_graph (36-52), execute (212-217)
  src/agents/engine_agent/agent.py (2): _build_graph (54-73), execute (75-173)
src/api/rate_limit.py (2)
  src/core/models.py (1): User (7-18)
  tests/integration/test_rules_api.py (1): client (21-23)
src/api/dependencies.py (3)
  src/core/models.py (1): User (7-18)
  src/integrations/github/service.py (1): GitHubService (19-115)
  src/core/utils/caching.py (1): get (44-66)
src/integrations/providers/factory.py (2)
  src/core/config/provider_config.py (2): get_max_tokens_for_agent (57-63), get_temperature_for_agent (65-71)
  tests/unit/test_feasibility_agent.py (1): agent (18-31)
src/main.py (3)
  src/webhooks/handlers/deployment_protection_rule.py (1): DeploymentProtectionRuleEventHandler (10-33)
  src/tasks/scheduler/deployment_scheduler.py (2): get_deployment_scheduler (364-369), stop (40-48)
  src/tasks/task_queue.py (1): stop_workers (143-149)
src/tasks/task_queue.py (3)
  src/webhooks/handlers/issue_comment.py (1): event_type (19-20)
  src/core/models.py (1): repo_full_name (56-58)
  src/core/utils/caching.py (1): get (44-66)
src/webhooks/router.py (2)
  src/webhooks/handlers/issue_comment.py (1): event_type (19-20)
  src/core/models.py (1): EventType (24-38)
src/core/utils/caching.py (4)
  src/core/utils/logging.py (1): decorator (86-128)
  src/core/utils/metrics.py (1): decorator (83-143)
  src/core/utils/retry.py (2): decorator (45-76), wrapper (47-74)
  src/core/utils/timeout.py (2): decorator (65-73), wrapper (66-71)
src/agents/repository_analysis_agent/models.py (1)
  src/core/models.py (1): repo_full_name (56-58)
src/webhooks/dispatcher.py (9)
  src/webhooks/handlers/deployment.py (1): handle (17-46)
  src/webhooks/handlers/deployment_status.py (1): handle (17-50)
  src/webhooks/handlers/issue_comment.py (2): handle (25-147), event_type (19-20)
  src/webhooks/handlers/pull_request.py (1): handle (18-45)
  src/webhooks/handlers/base.py (1): handle (16-26)
  src/webhooks/handlers/deployment_protection_rule.py (1): handle (16-33)
  src/webhooks/handlers/deployment_review.py (1): handle (16-33)
  src/webhooks/handlers/push.py (1): handle (16-29)
  src/webhooks/handlers/check_run.py (1): handle (16-29)
src/integrations/github/api.py (2)
  src/core/models.py (1): repo_full_name (56-58)
  src/core/utils/caching.py (1): get (44-66)
🪛 Gitleaks (8.30.0)
src/integrations/github/api.py
[high] 104-104: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
[high] 120-120: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
tests/conftest.py
[high] 42-42: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🔇 Additional comments (63)
src/core/utils/logging.py (1)
8-18: LGTM! Clean typing refactor using TYPE_CHECKING pattern.

The combination of `from __future__ import annotations` with the `TYPE_CHECKING` guard for `Callable` is the correct approach. With PEP 563, annotations are stored as strings and not evaluated at runtime, so `Callable` is safely available only for static type checkers without any runtime import overhead.
61-65: LGTM! Clean refactor combining nested conditionals.

The short-circuit evaluation of `and` ensures `_has_recent_activity()` is only called when `contributions >= min_contributions`, preserving the original behavior while improving readability.

Minor nit: the `# SIM102: Combined nested if statements` comment could be removed; linter rule references are better suited for commit messages than permanent code comments.
169-170: LGTM!

Clean ternary simplification for cache key generation. The logic is preserved correctly.
173-175: LGTM!

Good unification of the `.get()` call. Both `AsyncCache` and `cachetools.TTLCache` have compatible `.get()` interfaces that return `None` for missing/expired keys.

src/rules/validators.py (4)
85-91: LGTM!

The clarifying comment about validator metadata is helpful for maintainability.
138-147: LGTM!

The TODO comment appropriately flags the placeholder implementation, and the code correctly returns `False` for unknown teams by defaulting to an empty list.
223-226: LGTM!

The TODO comment appropriately flags the placeholder implementation with hardcoded contributors.
246-260: LGTM with a note on timezone handling.

The comment correctly flags that `datetime.now()` uses local time, which may behave inconsistently across deployments. For consistency with `AllowedHoursCondition` (which accepts a `timezone` parameter), consider adding timezone support here in a future iteration.
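A sketch of what that could look like with the standard library; the parameter name mirrors `AllowedHoursCondition`, and the UTC fallback is an assumption:

```python
from datetime import datetime
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def current_time_in(timezone_str: str) -> datetime:
    """Return an aware 'now' so behavior is consistent across deployments."""
    try:
        return datetime.now(ZoneInfo(timezone_str))
    except ZoneInfoNotFoundError:
        return datetime.now(ZoneInfo("UTC"))


print(current_time_in("Europe/Athens"))
print(current_time_in("Not/AZone"))  # falls back to UTC
```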
src/tasks/scheduler/deployment_scheduler.py (1)

2-47: LGTM: cancellation handling is cleaner with `contextlib.suppress`.

src/__init__.py (1)
1-1: LGTM: comment clarifies intent.

tests/conftest.py (2)
11-29: LGTM: test helpers and src path setup.
54-61: LGTM: dedicated session event loop fixture.

.github/workflows/docs.yaml (1)

71-71: LGTM: deploy gated to main repo.

src/webhooks/auth.py (1)

32-45: LGTM: clearer signature verification notes + explicit success return.

.pre-commit-config.yaml (1)
14-14: Good addition of the `check-added-large-files` hook.

This hook helps prevent accidentally committing large files to the repository, which is a good safeguard for repository hygiene.
src/agents/base.py (2)
54-60: LGTM - pragmatic typing for polymorphic return.

The `Any` return type is appropriate here since subclasses return different compiled graph types (e.g., `StateGraph` in `acknowledgment_agent`, `Any` in `repository_analysis_agent`). The docstring clearly documents the rationale.
113-119: LGTM - explicit kwargs typing for MyPy strict mode.

The `**kwargs: Any` typing satisfies MyPy strict mode requirements while maintaining flexibility for subclass implementations with varying parameter needs.

src/integrations/providers/factory.py (2)
8-9: Good addition of future annotations.

The `from __future__ import annotations` import enables PEP 563 postponed annotation evaluation, which is beneficial for forward references and aligns with modern Python practices.
78-79: LGTM - cleaner ternary expressions.

The refactored ternary expressions are more concise while maintaining the same precedence logic: explicit params > agent config > global config.
src/tasks/task_queue.py (1)
35-35: LGTM - improved documentation clarity.

The updated comments better explain the purpose of deduplication fields and event-specific hash components.
Also applies to: 43-43, 49-49, 57-57, 65-68, 89-89
src/rules/models.py (1)
20-28: LGTM - well-structured category enum.

The `RuleCategory` enum is well-documented with clear inline comments explaining each category's purpose. The `HYGIENE` category aligns with the PR's "AI Immune System" features for AI spam detection and contribution governance.

src/integrations/providers/bedrock_provider.py (2)
151-167: LGTM: inference-profile matching is clearer and case-insensitive.
The explicit provider checks read well and reduce ambiguity.
178-181: LGTM.
Localized import is fine; just ensure the optional dependency is present when this path is used.

src/core/config/settings.py (2)
114-118: LGTM: centralized analysis feature settings.
Defaults look consistent with the intended limits/timeouts.
142-156: LGTM: provider validation is clearer.
Unified Bedrock/Vertex checks should improve config guardrails.

pyproject.toml (6)
28-29: Dependency additions look aligned with feature changes.
Please verify the minimum versions are sufficient for your usage.
33-46: LGTM: dev/docs dependency updates.
Verify the tool versions align with your CI expectations.
51-85: Ruff configuration consolidation looks solid.
Please confirm the configured options are supported by your minimum Ruff version.
91-94: LGTM: stricter mypy settings.
Ensure the team is ready for the stricter typing enforcement.
98-108: LGTM: pytest options remain consistent.
Verify these flags still match your CI coverage expectations.
117-118: LGTM: uv dev-dependencies update.
Please confirm the same min versions are used across your package managers.

linting_commands.txt (1)
1-9: LGTM: concise lint/type-check workflow.
Verify the commands align with your pinned tool versions.

src/core/models.py (2)
24-38: LGTM: str-backed EventType improves serialization ergonomics.
41-63: LGTM: helper docstrings are clearer.

src/webhooks/handlers/issue_comment.py (3)
35-143: LGTM: clearer command routing and explicit ignore path.
155-169: LGTM: more flexible acknowledgment parsing.
193-194: LGTM: `any()` simplifies help detection.
16-17: LGTM - Comment improvements add useful context.

The updated comments clarify the registry's purpose (hot-swap, test injection), dispatch behavior, and thread-safety caveats. No functional changes.
Also applies to: 52-53, 60-60
src/webhooks/handlers/deployment_status.py (1)
27-30: LGTM - Useful fragility annotations.

The comments accurately document the coupling to GitHub's payload structure and potential queue failures. Good defensive documentation.
Also applies to: 36-42
src/webhooks/handlers/deployment.py (1)
27-29: LGTM - Consistent documentation pattern.

Matches the fragility annotations in `deployment_status.py`, maintaining consistency across handlers.

Also applies to: 35-38
src/webhooks/router.py (1)
31-34: Good defensive handling for unknown events.

Returning 202 for unsupported event types prevents GitHub from retrying and avoids alert noise. The pattern of raising `HTTPException(status_code=202)` and catching it in the endpoint (lines 62-65) works but is unconventional; consider returning a sentinel or using a custom exception class for clearer intent.
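A sketch of the custom-exception alternative; `UnsupportedEvent` and the handler registry are hypothetical names, not the repository's actual code:

```python
from fastapi.responses import JSONResponse

HANDLERS: dict[str, object] = {}  # hypothetical handler registry


class UnsupportedEvent(Exception):
    """Raised when an event type has no registered handler."""


def dispatch(event_type: str) -> None:
    if event_type not in HANDLERS:
        raise UnsupportedEvent(event_type)
    # ... route to the matching handler ...


async def webhook_endpoint(event_type: str) -> JSONResponse:
    try:
        dispatch(event_type)
    except UnsupportedEvent:
        # 202 tells GitHub "received, intentionally ignored" and avoids retries.
        return JSONResponse(status_code=202, content={"detail": "event ignored"})
    return JSONResponse(status_code=200, content={"detail": "processed"})
```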
94-141: Excellent documentation on PRSignal and HygieneMetrics.

The detailed docstrings explaining the "AI Immune System" feature, signal purposes, and how metrics trigger defensive rules provide valuable context for future maintainers.
144-171: Clean blackboard pattern for AnalysisState.

Well-structured state model with clear separation of inputs, collected signals, outputs, and execution metadata. The `Field(default_factory=list)` usage is correct for mutable defaults.

src/integrations/github/api.py (3)
37-65: Clean priority-based authentication flow.

The three-tier auth priority (user token → installation token → anonymous) is well-structured, and the `allow_anonymous` flag cleanly enables public repo analysis.

Note: The static analysis warnings about "Generic API Key" on lines referencing `allow_anonymous` are false positives; they're flagging the parameter name, not actual secrets.
1044-1060: Good heuristic for issue reference detection.

The regex patterns cover the most common linking conventions (`#123`, `fixes #123`, `closes #456`). For a heuristic used in hygiene analysis, this coverage is appropriate.
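For illustration, a heuristic of this shape can be expressed as a single pattern; this is inferred from the examples above, not the repository's exact implementation:

```python
import re

# Matches "#123" optionally preceded by a linking keyword like "fixes" or "closes".
ISSUE_REF = re.compile(
    r"(?:\b(?:close[sd]?|fix(?:es|ed)?|resolve[sd]?)\s+)?#\d+",
    re.IGNORECASE,
)


def has_issue_reference(text: str) -> bool:
    return bool(ISSUE_REF.search(text or ""))


assert has_issue_reference("Fixes #123")
assert has_issue_reference("see #456 for details")
assert not has_issue_reference("no references here")
```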
99-130: Clean public repo access methods.

`get_repository` and `list_directory_any_auth` properly leverage the new `allow_anonymous=True` flow, enabling Phase 1 public analysis without requiring authentication.
32-49: Clean 404/rate-limit handling for metadata fetch.

Nice to see explicit checks before `raise_for_status()`.

README.md (1)
239-244: Docs update matches the new public/anonymous rate-limit behavior.

Clear and concise.
tests/unit/api/test_recommendations.py (1)
6-54: Solid coverage of URL parsing edge cases.

These cases align well with the updated parser behavior.
docs/features.md (1)
282-287: Documentation aligns with new rate-limit semantics.

Looks good.
tests/integration/test_rules_api.py (1)
28-28: Cleaner env flag check.

The direct inequality reads clearer while keeping behavior the same.
Also applies to: 71-71, 106-106
tests/unit/agents/test_repository_analysis_models.py (1)
1-43: Good coverage for new model types and normalization.

The added tests exercise URL normalization and the new hygiene/PR signal models well.
src/main.py (3)
16-18: Deployment protection rule handler is fully wired.

Import and dispatcher registration are consistent with existing handlers.
Also applies to: 54-63
38-38: Startup/shutdown logging is consistent and concise.

Also applies to: 65-65, 70-70, 78-78
103-103: Recommendations router now aligned with /api/v1.

tests/integration/test_recommendations.py (1)
1-145: Great end-to-end coverage for public/private/auth cases.

src/agents/repository_analysis_agent/prompts.py (1)
1-66: Prompt constraints are explicit and well-scoped.

src/agents/repository_analysis_agent/nodes.py (2)
261-336: Structured output handling + fallback path look solid.
17-54: The PR payload mapping is correct.

`fetch_recent_pull_requests` explicitly populates all required fields: `number`, `has_issue_ref`, `author_association`, and `lines_changed`. The GraphQL/REST response properly computes `lines_changed` (additions + deletions) and `has_issue_ref` via `_detect_issue_references`, with sensible defaults for missing fields.

src/agents/repository_analysis_agent/agent.py (2)
22-40: Graph wiring is clear and linear.

Node registration and edges align with the intended flow.
42-76: Timeout + error handling look solid.

The hard timeout and final state normalization keep execution robust.
src/api/recommendations.py (2)
33-40: Response schema is straightforward.

Clear, stable shape for clients.
149-177: Stub endpoint validation is clear.

Required-field checks make the stub predictable for Phase 1.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
```diff
 ### Context Dependency in Enterprise Policies

-Our analysis of 70+ enterprise policies from major tech companies revealed a critical insight: **85% of real-world governance policies require context** and cannot be effectively enforced with traditional static rules.
+Our analysis of 70 + enterprise policies from major tech companies revealed a critical insight: **85% of real-world governance policies require context** and cannot be effectively enforced with traditional static rules.
```
Fix spacing around “70+”.
The added space reads like arithmetic and is inconsistent with later “70+ enterprise policies”.
✏️ Proposed fix
-Our analysis of 70 + enterprise policies from major tech companies revealed a critical insight: **85% of real-world governance policies require context** and cannot be effectively enforced with traditional static rules.
+Our analysis of 70+ enterprise policies from major tech companies revealed a critical insight: **85% of real-world governance policies require context** and cannot be effectively enforced with traditional static rules.
🤖 Prompt for AI Agents
In `@docs/benchmarks.md` at line 9, The phrase "70 + enterprise policies" contains
an extra space and should be normalized to "70+ enterprise policies" to match
the later usage; update the string "Our analysis of 70 + enterprise policies..."
to remove the space between 70 and + so it reads "70+ enterprise policies"
(leave surrounding text and "85% of real-world governance policies require
context" unchanged).
```diff
@@ -1,5 +1,5 @@
 """
-Rule Feasibility Agent implementation with error handling and retry logic.
+Rule Feasibility Agent implementation with error handling  and retry logic.
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix minor typo in docstring (double space).
There’s an extra space between “handling” and “and.”
✏️ Proposed fix
-Rule Feasibility Agent implementation with error handling  and retry logic.
+Rule Feasibility Agent implementation with error handling and retry logic.
🤖 Prompt for AI Agents
In `@src/agents/feasibility_agent/agent.py` at line 2, Fix the double space in the
module/class docstring: locate the module-level docstring at the top of
src/agents/feasibility_agent/agent.py (and any docstring on the FeasibilityAgent
class if present) that currently reads "error handling  and retry logic." and
remove the extra space so it reads "error handling and retry logic."
```diff
 def parse_github_repo_identifier(identifier: str) -> str:
     """
-    Normalize a GitHub repository identifier.
-
-    Accepts:
-    - owner/repo
-    - https://github.com/owner/repo
-    - https://github.com/owner/repo.git
-    - owner/repo/
+    Normalizes various GitHub identifiers into 'owner/repo' format.
+    Used by tests to verify repository strings.
     """
-    raw = (value or "").strip()
-    if not raw:
-        return ""
-
-    if raw.startswith("https://") or raw.startswith("http://"):
-        parts = raw.split("/")
-        try:
-            gh_idx = parts.index("github.com")
-        except ValueError:
-            # Could be enterprise; keep as-is and let API validation fail.
-            return raw.rstrip("/").removesuffix(".git")
-
-        owner = parts[gh_idx + 1] if len(parts) > gh_idx + 1 else ""
-        repo = parts[gh_idx + 2] if len(parts) > gh_idx + 2 else ""
-        return f"{owner}/{repo}".rstrip("/").removesuffix(".git")
-
-    return raw.rstrip("/").removesuffix(".git")
+    # Remove protocol and domain
+    clean_id = identifier.replace("https://github.com/", "").replace("http://github.com/", "")
+
+    # Remove .git suffix and trailing slashes
+    clean_id = clean_id.replace(".git", "").strip("/")
+
+    return clean_id
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Check if giturlparse is used anywhere in the codebase
rg -n "giturlparse" --type py
```
Repository: warestack/watchflow
Length of output: 182

🏁 Script executed:
```bash
#!/bin/bash
# Check if giturlparse is in project dependencies
fd -e "requirements" -e "txt" -e "toml" -e "cfg" -e "ini" | xargs grep -l "giturlparse" 2>/dev/null || echo "giturlparse not found in dependency files"
```
Repository: warestack/watchflow
Length of output: 77

🏁 Script executed:
```bash
#!/bin/bash
# Find and examine the parse_github_repo_identifier function usage
rg -n "parse_github_repo_identifier" --type py -A 2 -B 2
```
Repository: warestack/watchflow
Length of output: 1850

🏁 Script executed:
```bash
#!/bin/bash
# Check for SSH URL patterns that might be handled
rg -n "git@github" --type py
```
Repository: warestack/watchflow
Length of output: 146

🏁 Script executed:
```bash
#!/bin/bash
# Check pyproject.toml for giturlparse dependency
cat pyproject.toml | grep -A 50 "dependencies" | head -60
```
Repository: warestack/watchflow
Length of output: 1566

🏁 Script executed:
```bash
#!/bin/bash
# Check the test file to understand expected behavior
cat -n tests/unit/agents/test_repository_analysis_models.py | head -30
```
Repository: warestack/watchflow
Length of output: 1296

🏁 Script executed:
```bash
#!/bin/bash
# Look at the full models.py file to see the complete context
cat -n src/agents/repository_analysis_agent/models.py | head -50
```
Repository: warestack/watchflow
Length of output: 2507
Use giturlparse to handle SSH and HTTPS URLs consistently.
The function has access to giturlparse (a project dependency), but the implementation only handles HTTPS URLs. SSH URLs like git@github.com:owner/repo.git will not be parsed correctly and would be returned unparsed. Since giturlparse is already a dependency, it should be used to properly handle both URL formats and edge cases.
Suggested implementation
+from giturlparse import parse as parse_git_url
+
+
def parse_github_repo_identifier(identifier: str) -> str:
"""
Normalizes various GitHub identifiers into 'owner/repo' format.
Used by tests to verify repository strings.
"""
- # Remove protocol and domain
- clean_id = identifier.replace("https://github.com/", "").replace("http://github.com/", "")
-
- # Remove .git suffix and trailing slashes
- clean_id = clean_id.replace(".git", "").strip("/")
-
- return clean_id
+ parsed = parse_git_url(identifier)
+ if parsed.owner and parsed.repo:
+ return f"{parsed.owner}/{parsed.repo}"
+ # Fallback: assume already in owner/repo format
+ return identifier.replace(".git", "").strip("/")🤖 Prompt for AI Agents
In `@src/agents/repository_analysis_agent/models.py` around lines 6 - 17, Replace
the manual string hacks in parse_github_repo_identifier with giturlparse
parsing: call giturlparse.parse(identifier), and if the result is valid use
f"{parsed.owner}/{parsed.repo}" (or parsed.owner + "/" + parsed.name if repo is
empty) to normalize SSH and HTTPS forms; if giturlparse reports an invalid URL,
fall back to the existing strip/remove logic (remove protocol/domain, ".git",
and trailing slashes) or raise a clear ValueError to surface bad input. Ensure
you update parse_github_repo_identifier to import and use giturlparse so SSH
URLs like "[email protected]:owner/repo.git" and https URLs are handled
consistently.
@MT-superdev is this a leftover?
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
```python
# Token extraction—fragile if header format changes.
scheme, token = auth_header.split()
if scheme.lower() != "bearer":
    return None

# TODO: Wire to real IdP (Supabase/Auth0). For now, fake user if token present. WARNING: Must verify signature in prod.
return User(id=123, username="authenticated_user", email="[email protected]", github_token=token)
except Exception as e:
```
Bearer parsing currently grants “authenticated” status to any token.
Any string becomes a fixed User (same id/email), which (a) lets callers bypass anonymous rate limits and (b) collapses all “authenticated” users into one bucket. If the intent is real auth, please validate the token (IdP/GitHub) and populate a real identity; otherwise, treat unverified tokens as anonymous to avoid elevating privileges by default.
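A minimal sketch of verifying the token against GitHub before trusting it. This assumes `httpx` is available; the endpoint and field names follow GitHub's public REST API, and the `User` fields match `src/core/models.py`:

```python
import httpx

from src.core.models import User


async def verify_github_token(token: str) -> User | None:
    """Return a real identity on success, or None so the caller stays anonymous."""
    async with httpx.AsyncClient() as client:
        resp = await client.get(
            "https://api.github.com/user",
            headers={"Authorization": f"Bearer {token}"},
        )
    if resp.status_code != 200:
        return None
    data = resp.json()
    return User(
        id=data["id"],
        username=data["login"],
        email=data.get("email"),
        avatar_url=data.get("avatar_url"),
        github_token=token,
    )
```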
🤖 Prompt for AI Agents
In `@src/api/dependencies.py` around lines 34 - 41, The bearer parsing currently
accepts any token and returns a fixed User; update the logic around auth_header
/ scheme / token (the block that does scheme, token = auth_header.split() and
returns User(..., github_token=token)) so that you do not grant a fixed
authenticated identity for arbitrary strings: either validate the token with the
real IdP or GitHub (verify JWT signature or call the appropriate
token-introspect/verify endpoint) and populate User with the real identity on
success, or treat unverified tokens as anonymous and return None/anonymous user
instead of User(id=123,...). Replace the TODO stub with a call to a verification
helper (e.g., verify_idp_token or verify_github_token) and only construct/return
User when that verification passes.
```python
from pydantic import BaseModel, Field


class EventType(Enum):
    """Supported GitHub event types."""


class User(BaseModel):
    """
    Represents an authenticated user in the system.
    Used for dependency injection in API endpoints.
    """

    id: int
    username: str
    email: str | None = None
    avatar_url: str | None = None
    # storing the token allows the service layer to make requests on behalf of the user
    github_token: str | None = Field(None, description="OAuth token for GitHub API access")
```
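The review's `src/core/models.py` fix (see the AI-agent prompt earlier) recommends switching this field to `SecretStr`; a minimal sketch of that change against the fields shown above:

```python
from pydantic import BaseModel, Field, SecretStr


class User(BaseModel):
    id: int
    username: str
    email: str | None = None
    avatar_url: str | None = None
    github_token: SecretStr | None = Field(
        default=None,
        description="OAuth token for GitHub API access",
        exclude=True,  # omitted from model_dump()/serialization
        repr=False,    # omitted from repr() output
    )


user = User(id=1, username="octocat", github_token=SecretStr("example-token"))
print(user.model_dump())  # github_token is excluded
if user.github_token is not None:
    print(user.github_token.get_secret_value())  # explicit access only
```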
🧩 Analysis chain
🏁 Script executed:
```bash
# First, let's read the complete User model definition
cat -n src/core/models.py
```
Repository: warestack/watchflow
Length of output: 2374

🏁 Script executed:
```bash
# Search for model_dump() calls on User instances
rg "model_dump\(\)" --type py -B 2 -A 2
```
Repository: warestack/watchflow
Length of output: 45

🏁 Script executed:
```bash
# Search for places where User model is serialized or logged
rg "User" --type py -B 1 -A 1 | grep -E "(model_dump|json|serialize|log)" | head -20
```
Repository: warestack/watchflow
Length of output: 201

🏁 Script executed:
```bash
# Check if there are any logging patterns that might serialize the User model
rg "logger|print|log" src/core/models.py --type py -B 2 -A 2
```
Repository: warestack/watchflow
Length of output: 225

🏁 Script executed:
```bash
# Search for token-related fields in the codebase to see if there's a pattern
rg "token" --type py | grep -i "field\|str" | head -20
```
Repository: warestack/watchflow
Length of output: 2211

🏁 Script executed:
```bash
# Search for User model being used in endpoints or response models
rg "response_model.*User|User.*response_model" --type py
```
Repository: warestack/watchflow
Length of output: 45

🏁 Script executed:
```bash
# Search for endpoints that return User or use User in routes
rg "def.*\(.*User|-> User" --type py -B 2 -A 5
```
Repository: warestack/watchflow
Length of output: 1881

🏁 Script executed:
```bash
# Check for FastAPI route definitions that might serialize User
rg "@.*\(|@router" --type py -A 5 | grep -E "(def|User|response_model)" | head -30
```
Repository: warestack/watchflow
Length of output: 2627

🏁 Script executed:
```bash
# Look for any JSON serialization patterns with User
rg "json|serialize|dump" src/ --type py | grep -i user
```
Repository: warestack/watchflow
Length of output: 45

🏁 Script executed:
```bash
# Check the main application file to see how User is used
find . -name "main.py" -o -name "app.py" -o -name "__main__.py" | head -5
```
Repository: warestack/watchflow
Length of output: 76

🏁 Script executed:
```bash
# Search for dependency injection patterns where User is provided
rg "Depends\|get_current_user" --type py -B 2 -A 2
```
Repository: warestack/watchflow
Length of output: 45

🏁 Script executed:
```bash
# Check the api/dependencies.py file to understand User creation
cat -n src/api/dependencies.py
```
Repository: warestack/watchflow
Length of output: 2193

🏁 Script executed:
```bash
# Search for all function definitions in api directory to see return types
rg "async def|def " src/api/ --type py -A 1 | grep -E "(def |-> )" | head -30
```
Repository: warestack/watchflow
Length of output: 992

🏁 Script executed:
```bash
# Check if User is ever returned as response_model in any endpoint
rg "response_model|return.*User|User\(" src/api/ --type py -B 1 -A 1
```
Repository: warestack/watchflow
Length of output: 569

🏁 Script executed:
```bash
# Search for any serialization or logging of User objects
rg "str\(user|user\.|\.model_dump|\.dict\(" src/api/ --type py -B 1 -A 1
```
Repository: warestack/watchflow
Length of output: 466

🏁 Script executed:
```bash
# Check where github_token is actually used from the User object
rg "user\.github_token|github_token" src/api/ --type py -B 2 -A 2
```
Repository: warestack/watchflow
Length of output: 503

🏁 Script executed:
```bash
# Check if User objects are ever returned in any endpoint responses
rg "return user|return.*User" src/api/ --type py -B 3 -A 1
```
Repository: warestack/watchflow
Length of output: 728

🏁 Script executed:
```bash
# Verify if FastAPI response_model includes User anywhere
rg "response_model" src/api/ --type py -B 2 -A 2
```
Repository: warestack/watchflow
Length of output: 357

🏁 Script executed:
```bash
# Check if there's any JSON response serialization of User
rg "JSONResponse|json\.dumps" src/api/ --type py -B 2 -A 2
```
Repository: warestack/watchflow
Length of output: 45

🏁 Script executed:
```bash
# Check all endpoints to see their return types
find src/api -name "*.py" -exec grep -l "async def\|def " {} \; | xargs -I {} sh -c 'echo "=== {} ===" && grep -A 3 "async def\|def " {}'
```
Repository: warestack/watchflow
Length of output: 2278

🏁 Script executed:
```bash
# Check if there's any logging that might include User objects
rg "logger\.|print\(" src/api/ --type py -B 1 -A 1 | grep -C 2 "user"
```
Repository: warestack/watchflow
Length of output: 237

🏁 Script executed:
```bash
# Check the actual endpoint that uses user parameter to see what happens with it
cat -n src/api/recommendations.py | head -50
```
Repository: warestack/watchflow
Length of output: 1834

🏁 Script executed:
```bash
# Search for any debugger or pdb usage that might print User objects
rg "pdb|breakpoint|print\(.*user" --type py
```
Repository: warestack/watchflow
Length of output: 45
🏁 Script executed:
# Check all files in the codebase to see if github_token is accessed from User objects
rg "\.github_token|user\.github_token" --type pyRepository: warestack/watchflow
Length of output: 45
🏁 Script executed:
# Check recommendations.py in detail to see the full endpoint implementation
cat -n src/api/recommendations.py | sed -n '50,100p'Repository: warestack/watchflow
Length of output: 2281
🏁 Script executed:
# Check if User object is ever passed to any logging or debugging functions
rg "logger\.(debug|info|warning|error)\(" src/api/ --type py -B 1 -A 1 | head -30Repository: warestack/watchflow
Length of output: 707
🏁 Script executed:
# Verify Pydantic version to confirm SecretStr and exclude behavior
rg "pydantic" pyproject.toml requirements.txt setup.py 2>/dev/null || echo "No version file found"Repository: warestack/watchflow
Length of output: 122
🏁 Script executed:
# Check if there are any other models that might handle secrets differently
rg "SecretStr|exclude=True" --type pyRepository: warestack/watchflow
Length of output: 45
Secure github_token using SecretStr with exclude=True.
While github_token is not currently serialized in API responses, it will be included in model_dump() by default if called. Use SecretStr and exclude it from serialization to follow Pydantic security best practices and prevent accidental exposure through future logging or debugging.
🛡️ Proposed fix
-from pydantic import BaseModel, Field
+from pydantic import BaseModel, Field, SecretStr
@@
-    github_token: str | None = Field(None, description="OAuth token for GitHub API access")
+    github_token: SecretStr | None = Field(
+        None,
+        description="OAuth token for GitHub API access",
+        exclude=True,
+        repr=False,
+    )

🤖 Prompt for AI Agents
In `@src/core/models.py` around lines 4 - 18, Change the User.github_token field
to use pydantic's SecretStr and mark it excluded from serialization;
specifically import SecretStr and update the github_token declaration in class
User to use type SecretStr with Field(default=None, description="OAuth token for
GitHub API access", exclude=True, repr=False) so the secret is stored securely
and omitted from model_dump()/serialization and reprs.
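
To make the effect of the proposed change concrete, a small self-contained sketch of standard Pydantic v2 behavior (the model here is trimmed to three fields for brevity; the field shape mirrors the diff above):

from pydantic import BaseModel, Field, SecretStr


class User(BaseModel):
    id: int
    username: str
    github_token: SecretStr | None = Field(None, exclude=True, repr=False)


user = User(id=1, username="octocat", github_token=SecretStr("ghp_example"))

print(user.model_dump())  # {'id': 1, 'username': 'octocat'} - token excluded
print(user)               # repr=False also keeps the field out of repr output
if user.github_token is not None:
    raw = user.github_token.get_secret_value()  # explicit opt-in to read the raw value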
import httpx
import structlog

logger = structlog.get_logger()
Dead code: httpx.HTTPStatusError handler will never trigger.
This method uses aiohttp for HTTP requests but catches httpx.HTTPStatusError. Since aiohttp raises aiohttp.ClientResponseError (not httpx exceptions), lines 1031-1039 are dead code.
Also, importing httpx inside the function (line 950) is unnecessary and misleading.
🐛 Proposed fix
-import httpx
 import structlog

 logger = structlog.get_logger()

 # ... rest of method ...
-        except httpx.HTTPStatusError as e:
-            logger.error(
-                "pr_fetch_http_error",
-                repo=repo_full_name,
-                status_code=e.response.status_code,
-                error_type="network_error",
-                error=str(e),
-            )
-            return []
         except Exception as e:
             logger.error("pr_fetch_unexpected_error", repo=repo_full_name, error_type="unknown_error", error=str(e))
             return []

Also applies to: 1031-1039
🤖 Prompt for AI Agents
In `@src/integrations/github/api.py` around lines 950 - 953, Inside the function
that performs HTTP requests in src/integrations/github/api.py remove the local
"import httpx" and the except block catching httpx.HTTPStatusError (this is dead
because the code uses aiohttp); instead catch aiohttp.ClientResponseError (or
aiohttp.ClientError for broader coverage) in the same except block and log the
caught exception via the existing logger (structlog.get_logger()) so the error
details are recorded; locate the try/except around the aiohttp request and
replace the httpx-specific handler with the aiohttp exception type and remove
the unused import.
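
A minimal sketch of the corrected handler shape, assuming the method holds an aiohttp.ClientSession (the function name, URL parameter, and return type here are illustrative, not the repo's actual signatures):

import aiohttp
import structlog

logger = structlog.get_logger()


async def fetch_pull_requests(session: aiohttp.ClientSession, url: str, repo_full_name: str) -> list:
    try:
        async with session.get(url) as resp:
            resp.raise_for_status()  # raises aiohttp.ClientResponseError on 4xx/5xx
            return await resp.json()
    except aiohttp.ClientResponseError as e:
        # aiohttp's HTTP-error type (note: .status, not httpx's .response.status_code)
        logger.error(
            "pr_fetch_http_error",
            repo=repo_full_name,
            status_code=e.status,
            error_type="network_error",
            error=str(e),
        )
        return []
    except aiohttp.ClientError as e:
        # broader transport failures: DNS errors, connection resets, timeouts
        logger.error("pr_fetch_client_error", repo=repo_full_name, error=str(e))
        return []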
src/integrations/github/service.py
Outdated
async with httpx.AsyncClient() as client:
    for filepath in files_to_check:
        # Tricky: Public repos can be read without auth, Private need auth
        # We use the 'contents' API
        check_url = f"{self.BASE_URL}/repos/{owner}/{repo}/contents/{filepath}"
        resp = await client.get(check_url, headers=headers)
        if resp.status_code == 200:
            found_files.append(filepath)
Surface repo-level errors instead of silently treating them as missing files.
Right now any non‑200 (invalid token, rate limit, or missing repo) is interpreted as “file missing,” which can return misleading recommendations and hide real failures. Consider a preflight repo check (or explicit 401/403 handling) before the file loop so repo/rate-limit issues are surfaced.
🛠️ Suggested fix (preflight repo validation)
 async with httpx.AsyncClient() as client:
+    repo_check_url = f"{self.BASE_URL}/repos/{owner}/{repo}"
+    repo_resp = await client.get(repo_check_url, headers=headers)
+    if repo_resp.status_code == 404:
+        raise GitHubResourceNotFoundError(f"Repo {owner}/{repo} not found")
+    if repo_resp.status_code == 403 and "rate limit" in repo_resp.text.lower():
+        raise GitHubRateLimitError("GitHub API rate limit exceeded")
+    repo_resp.raise_for_status()
+
     for filepath in files_to_check:
         # Tricky: Public repos can be read without auth, Private need auth
         # We use the 'contents' API
         check_url = f"{self.BASE_URL}/repos/{owner}/{repo}/contents/{filepath}"
         resp = await client.get(check_url, headers=headers)
         if resp.status_code == 200:
             found_files.append(filepath)

🤖 Prompt for AI Agents
found_files.append(filepath)🤖 Prompt for AI Agents
In `@src/integrations/github/service.py` around lines 66 - 74, The current loop
treating any non-200 from the contents API as "file missing" hides repo-level
errors (auth, rate limits, missing repo); add a preflight repo validation before
iterating files: call the repo endpoint (e.g.,
f"{self.BASE_URL}/repos/{owner}/{repo}") or an authenticated HEAD on the repo to
check for 200 and explicitly handle 401/403/404/rate-limit responses (raise or
return an error) rather than appending to found_files; keep the existing
per-file checks only after the preflight succeeds and reuse headers and the
AsyncClient used around files_to_check to avoid duplicate clients.
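
Note that the preflight snippet assumes GitHubResourceNotFoundError and GitHubRateLimitError exist; if the integration has no such error types yet, a minimal sketch could be (class names assumed for illustration):

class GitHubServiceError(Exception):
    """Base class for GitHub integration failures."""


class GitHubResourceNotFoundError(GitHubServiceError):
    """Raised when the repository or another resource returns 404."""


class GitHubRateLimitError(GitHubServiceError):
    """Raised when GitHub reports that the API rate limit was exceeded."""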
 # We explicitly allow all origins ("*") to prevent the browser from blocking requests
 # from your local file system or different localhost ports.

 app.add_middleware(
     CORSMiddleware,
-    allow_origins=config.cors.origins,
+    allow_origins=["*"],  # Explicitly allow all for development
     allow_credentials=True,
     allow_methods=["*"],
-    allow_headers=config.cors.headers,
+    allow_headers=["*"],
🧩 Analysis chain
🌐 Web query:
CORS wildcard origins credentials allow_credentials browser specification
💡 Result:
- You cannot combine `Access-Control-Allow-Origin: *` with credentialed CORS requests. If the browser sends a CORS request with credentials (cookies, HTTP auth, TLS client certs, e.g. `fetch(..., { credentials: "include" })` / `xhr.withCredentials = true`), then the response must not use the wildcard `*` for `Access-Control-Allow-Origin`; browsers will block JS access to the response and surface a CORS error. [1], [2]
- What the spec requires instead (credentialed CORS):
  - Respond with one explicit origin: `Access-Control-Allow-Origin: <value of the request Origin header>`
  - And include: `Access-Control-Allow-Credentials: true` (the only valid value) [1], [3]
- Preflight note: CORS preflight (`OPTIONS`) requests "must never include credentials"; the server indicates whether the actual request may include credentials by returning `Access-Control-Allow-Credentials: true`. [4]
Sources:
[1] W3C CORS (2020 snapshot): wildcard * “cannot be used for a resource that supports credentials” and credentialed responses must echo Origin + Access-Control-Allow-Credentials: true.
[2] MDN CORS error: credentials not supported when Access-Control-Allow-Origin is *.
[3] MDN Access-Control-Allow-Credentials: credentials allowed only when server sends Access-Control-Allow-Credentials: true.
[4] MDN CORS guide: preflight requests must never include credentials; server must return Access-Control-Allow-Credentials: true to allow credentialed actual request.
🏁 Script executed:
# Check if there's more context in the repository about this CORS configuration
find . -type f -name "*.py" | head -20 | xargs grep -l "CORSMiddleware" 2>/dev/null | head -5
Length of output: 45
🏁 Script executed:
# Get the actual file context around the lines mentioned
if [ -f "src/main.py" ]; then
head -n 100 src/main.py | tail -n 30
fi
Length of output: 773
🏁 Script executed:
# Search for FastAPI/Starlette CORSMiddleware implementation or behavior with wildcard + credentials
rg -i "allow_credentials.*allow_origins|allow_origins.*allow_credentials" --type py -A 5 -B 5Repository: warestack/watchflow
Length of output: 45
🌐 Web query:
FastAPI Starlette CORSMiddleware allow_origins wildcard allow_credentials=True behavior
💡 Result:
In Starlette/FastAPI CORSMiddleware, setting:

    allow_origins=["*"]
    allow_credentials=True

is effectively "self-contradictory" for browsers: credentialed CORS responses must not use `Access-Control-Allow-Origin: *`; they must echo a specific origin. Browsers will block credentialed responses if `*` is used. [3], [4]

What Starlette actually does (current behavior)

Starlette computes:
- `allow_all_origins = True` (because `*` is in `allow_origins`)
- `preflight_explicit_allow_origin = True` (because `allow_credentials=True`) [2]

Preflight (OPTIONS with Origin + Access-Control-Request-Method):
- Starlette echoes the request origin into `Access-Control-Allow-Origin: <requested_origin>` (since all origins are "allowed" when `*` is configured), and includes `Access-Control-Allow-Credentials: true`. [2]

Non-preflight ("simple") responses:
- Starlette initially sets `Access-Control-Allow-Origin: *` and `Access-Control-Allow-Credentials: true`. [2]
- But if the request includes a `Cookie` header, Starlette overrides `Access-Control-Allow-Origin` to the explicit request origin and adds `Vary: Origin`. [2]

Practical implication
- If your credentialed requests rely on cookies, Starlette will typically echo the origin (because it detects `Cookie`). [2]
- If your credentialed requests are "credentialed" in the browser sense but don't include a `Cookie` header (e.g., you rely on other credential modes / auth behaviors), you can still end up with `Access-Control-Allow-Origin: *` on the actual response, which browsers reject for credentialed CORS. [3], [4]

Recommended configuration
If you want credentials, don't use `*`; list explicit origins (or use an allowlist regex), per Starlette's own docs. [1], [4]

Sources: Starlette CORSMiddleware docs [1], FastAPI/Starlette CORSMiddleware source listing [2], MDN CORS credentialed-wildcard rules [4] and related error explanation [3].
Wildcard origins + credentials violates CORS spec and will break credentialed requests.
Browsers reject Access-Control-Allow-Origin: * when credentials are allowed (per CORS spec). Starlette's CORSMiddleware with this configuration behaves unpredictably: it echoes the origin for preflight and cookie-based requests but may still respond with * for non-cookie credentialed requests, which browsers will block. Use explicit origins for credentialed requests or disable credentials.
🔒 Suggested fix (choose one)
 app.add_middleware(
     CORSMiddleware,
-    allow_origins=["*"],  # Explicitly allow all for development
-    allow_credentials=True,
+    allow_origins=["*"],  # dev only; use explicit origins in prod
+    allow_credentials=False,
     allow_methods=["*"],
     allow_headers=["*"],
 )

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# We explicitly allow all origins ("*") to prevent the browser from blocking requests
# from your local file system or different localhost ports.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # dev only; use explicit origins in prod
    allow_credentials=False,
    allow_methods=["*"],
    allow_headers=["*"],
)
🤖 Prompt for AI Agents
In `@src/main.py` around lines 88 - 96, The current CORSMiddleware setup
(app.add_middleware with CORSMiddleware and allow_origins=["*"],
allow_credentials=True) violates the CORS spec; either replace the wildcard
origins with an explicit list of allowed origins (set allow_origins to the exact
host(s) you expect) when keeping allow_credentials=True, or if you must keep
wildcard origins set allow_credentials=False; update the CORSMiddleware config
accordingly in the app.add_middleware call to use explicit origins or disable
credentials so browsers will accept the responses.
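
If credentialed requests are genuinely needed, the spec-compliant alternative is an explicit allowlist rather than the wildcard; a sketch of that shape (the origin values are placeholders, not the project's real hosts):

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    # Browsers require a concrete origin to be echoed back whenever
    # Access-Control-Allow-Credentials is true, so no wildcard here.
    allow_origins=[
        "http://localhost:3000",    # placeholder dev frontend
        "https://app.example.com",  # placeholder production frontend
    ],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)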
import os
import sys
from pathlib import Path

import pytest

# 1. Ensure src is in path
ROOT = Path(__file__).resolve().parent.parent
SRC = ROOT / "src"

if str(SRC) not in sys.path:
    sys.path.insert(0, str(SRC))


# 2. Helper for environment mocking
class Helpers:
    @staticmethod
    def mock_env(env_vars):
        from unittest.mock import patch

        return patch.dict(os.environ, env_vars)


@pytest.fixture
def helpers():
    return Helpers


# 3. Mock Environment Variables (Security First)
# We do this BEFORE importing app code to ensure no real secrets are read
@pytest.fixture(autouse=True)
def mock_settings():
    """Forces the test environment to use dummy values."""
    with Helpers.mock_env(
        {
            "APP_CLIENT_ID_GITHUB": "mock-client-id",
            "APP_CLIENT_SECRET_GITHUB": "mock-client-secret",
            "WEBHOOK_SECRET_GITHUB": "mock-webhook-secret",
            "PRIVATE_KEY_BASE64_GITHUB": "bW9jay1rZXk=",  # "mock-key" in base64
            "AI_PROVIDER": "openai",
            "OPENAI_API_KEY": "sk-mock-key",
            "ENVIRONMENT": "test",
        }
    ):
        yield
Avoid hard-coded base64 key to prevent secret-scanner false positives.
The literal on Line 42 is already flagged by Gitleaks; computing it at runtime avoids this while keeping validity.
🔧 Proposed fix
-import os
+import base64
+import os
@@
-            "PRIVATE_KEY_BASE64_GITHUB": "bW9jay1rZXk=",  # "mock-key" in base64
+            "PRIVATE_KEY_BASE64_GITHUB": base64.b64encode(b"mock-key").decode(),

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
import base64
import os
import sys
from pathlib import Path

import pytest

# 1. Ensure src is in path
ROOT = Path(__file__).resolve().parent.parent
SRC = ROOT / "src"

if str(SRC) not in sys.path:
    sys.path.insert(0, str(SRC))


# 2. Helper for environment mocking
class Helpers:
    @staticmethod
    def mock_env(env_vars):
        from unittest.mock import patch

        return patch.dict(os.environ, env_vars)


@pytest.fixture
def helpers():
    return Helpers


# 3. Mock Environment Variables (Security First)
# We do this BEFORE importing app code to ensure no real secrets are read
@pytest.fixture(autouse=True)
def mock_settings():
    """Forces the test environment to use dummy values."""
    with Helpers.mock_env(
        {
            "APP_CLIENT_ID_GITHUB": "mock-client-id",
            "APP_CLIENT_SECRET_GITHUB": "mock-client-secret",
            "WEBHOOK_SECRET_GITHUB": "mock-webhook-secret",
            "PRIVATE_KEY_BASE64_GITHUB": base64.b64encode(b"mock-key").decode(),
            "AI_PROVIDER": "openai",
            "OPENAI_API_KEY": "sk-mock-key",
            "ENVIRONMENT": "test",
        }
    ):
        yield
🧰 Tools
🪛 Gitleaks (8.30.0)
[high] 42-42: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🤖 Prompt for AI Agents
In `@tests/conftest.py` around lines 5 - 48, The hard-coded base64 literal in the
mock_settings fixture (PRIVATE_KEY_BASE64_GITHUB) triggers secret scanners;
change it to compute the value at runtime instead (e.g., base64-encode a
non-secret byte string) so tests still get a valid value without a literal.
Update the mock_settings fixture to import/use Python's base64 (or encode
b"mock-key" and decode to str) and set PRIVATE_KEY_BASE64_GITHUB to that
computed string rather than the literal "bW9jay1rZXk=".
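
As a quick sanity check that test behavior is unchanged, the runtime-computed value equals the flagged literal:

import base64

encoded = base64.b64encode(b"mock-key").decode()
assert encoded == "bW9jay1rZXk="  # identical to the previously hard-coded value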
Summary by CodeRabbit
New Features
Documentation
Tests
✏️ Tip: You can customize this high-level summary in your review settings.