Skip to content

Conversation

@miroslavpojer
Copy link
Collaborator

@miroslavpojer miroslavpojer commented Nov 6, 2025

Release Notes:

  • Fixed CWE-295: Improper Certificate Validation by reimplementing the target method solution.

Summary by CodeRabbit

  • Refactor

    • Improved GitHub API integration for authenticated requests, added robust error handling and caching to reduce repeated network calls.
  • Documentation

    • Updated developer setup with local-run initialization and additional debug output.
  • Tests

    • Updated tests and mocks to support the new requester-based flow and to validate caching and error scenarios.

Fixes #206

…tificate Validation

- Code refactored to different solution.
…tificate Validation

- Code refactored to different solution.
@coderabbitai
Copy link

coderabbitai bot commented Nov 6, 2025

Walkthrough

Replaces direct HTTP GraphQL calls with a requester-based interface and injects a Requester into get_issues_for_pr; updates call sites and tests to pass the requester. Adds small DEVELOPER.md environment-init/debug lines for local action runs.

Changes

Cohort / File(s) Summary
Documentation Setup
DEVELOPER.md
Adds PROJECT_ROOT initialization, appends it to PYTHONPATH, and prints PYTHONPATH and cwd in the "Run Action Locally" snippet.
GraphQL Request Abstraction
release_notes_generator/utils/pull_request_utils.py
Replaces direct HTTP calls with requester.graphql_query(...); changes get_issues_for_pr signature to accept requester: Requester; adds GithubException handling and stricter payload validation and parsing.
Factory Caller Update
release_notes_generator/record/factory/default_record_factory.py
Passes requester=self._github.requester to get_issues_for_pr where PR-linked issues are retrieved.
Test Fixtures
tests/unit/conftest.py
Updates mock signature to mock_get_issues_for_pr(pull_number: int, requester: Requester) and adds Requester import.
Factory Unit Tests
tests/unit/release_notes_generator/record/factory/test_default_record_factory.py
Adjusts mocks to accept the requester parameter and imports Requester.
Pull Request Utils Tests
tests/unit/release_notes_generator/utils/test_pull_request_utils.py
Reworks tests to use a MockRequester, validates query/headers forwarded, asserts caching behavior and error handling for the requester-based flow.

Sequence Diagram(s)

sequenceDiagram
    participant Factory as DefaultRecordFactory
    participant Utils as pull_request_utils.get_issues_for_pr
    participant Requester as Requester
    participant GH as GitHub GraphQL API

    rect rgb(240,250,240)
    Note over Factory,Utils: New requester-injection flow
    Factory->>Utils: get_issues_for_pr(pr_number, requester)
    Utils->>Requester: graphql_query(query, headers)
    Requester->>GH: POST /graphql
    GH-->>Requester: (headers, payload)
    Requester-->>Utils: (headers, payload)
    alt payload valid
        Utils->>Utils: parse closingIssuesReferences
        Utils-->>Factory: set[string] of issues
    else GithubException from Requester
        Utils->>Utils: wrap as RuntimeError (with status/data)
        Utils-->>Factory: raise RuntimeError
    end
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Areas needing extra attention:
    • release_notes_generator/utils/pull_request_utils.py — verify GraphQL query shape, payload validation, and correct use of GithubException.
    • Call sites (factory) and tests — ensure requester is passed consistently and caching semantics remain correct.
    • Tests in tests/unit/release_notes_generator/utils/test_pull_request_utils.py — confirm MockRequester behavior matches real Requester contract.

Possibly related PRs

Suggested reviewers

  • Zejnilovic
  • benedeki

Poem

🐇 I hopped through code to lend a paw,
Swapped raw requests for a Requester law.
Query hops in, payload hops out,
Cached answers saved from needless rout.
A tiny rabbit, neat and spry—code runs smooth, hop by hop! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 4.76% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Description check ❓ Inconclusive The description references the linked issue and mentions the specific CWE fix, but lacks detailed information and does not include the provided checklist or context sections from the template. Consider expanding the description to include the checklist items and additional context sections as specified in the repository template for clarity and completeness.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main objective of fixing issues detected by Aquasec scanner, specifically addressing CWE-295 improper certificate validation.
Linked Issues check ✅ Passed The pull request successfully addresses issue #206 by replacing direct HTTP GraphQL calls with a requester-based flow that enforces proper certificate validation, eliminating the insecure verify=false pattern.
Out of Scope Changes check ✅ Passed All changes directly support the security objective of eliminating improper certificate validation. The test updates and mock adjustments are necessary to support the refactored implementation.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/206-Fix-issues-detected-by-Aquasec-scanner

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
release_notes_generator/utils/pull_request_utils.py (1)

67-75: Stop passing headers as GraphQL variables

github.Requester.Requester.graphql_query takes (query, variables)—right now we hand it the auth header dictionary. Because the query declares no variables, GitHub rejects the call and we surface a GithubException, so every PR/issue lookup fails. Please pass an empty (or real) variables map and rely on the authenticated Requester for headers, then drop the manual header block.

-    headers = {
-        "Authorization": f"Bearer {ActionInputs.get_github_token()}",
-        "Content-Type": "application/json",
-    }
-
-    try:
-        headers, payload = requester.graphql_query(query, headers)
+    try:
+        _, payload = requester.graphql_query(query, {})
🧹 Nitpick comments (1)
tests/unit/release_notes_generator/utils/test_pull_request_utils.py (1)

153-166: Clarify or remove the confusing comment about status and data attributes.

The comment on line 157 states "Do not set self.status or self.data directly" but doesn't explain why or what happens if you do. Since the test expects status=500 to appear in the error message (line 166), these attributes must be accessible—presumably set by the parent GithubException.__init__.

Consider either removing this comment or expanding it to explain the reasoning (e.g., "Parent class sets these via init" or document any issues encountered with direct assignment).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dcf920a and 6ffbe5c.

📒 Files selected for processing (6)
  • DEVELOPER.md (1 hunks)
  • release_notes_generator/record/factory/default_record_factory.py (1 hunks)
  • release_notes_generator/utils/pull_request_utils.py (3 hunks)
  • tests/unit/conftest.py (2 hunks)
  • tests/unit/release_notes_generator/record/factory/test_default_record_factory.py (2 hunks)
  • tests/unit/release_notes_generator/utils/test_pull_request_utils.py (5 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-09-28T14:13:38.979Z
Learnt from: miroslavpojer
Repo: AbsaOSS/generate-release-notes PR: 173
File: release_notes_generator/record/factory/default_record_factory.py:109-126
Timestamp: 2025-09-28T14:13:38.979Z
Learning: In release_notes_generator/record/factory/default_record_factory.py, there is a cross-repo fetch bug where parent_issue_id in format "owner/repo#number" is fetched from the wrong repository because the code only queries data.repository instead of parsing the owner/repo from the ID and using the GitHub client to fetch from the correct repository.

Applied to files:

  • release_notes_generator/record/factory/default_record_factory.py
  • tests/unit/release_notes_generator/record/factory/test_default_record_factory.py
📚 Learning: 2025-09-28T14:09:54.325Z
Learnt from: miroslavpojer
Repo: AbsaOSS/generate-release-notes PR: 173
File: tests/release_notes/record/factory/test_default_record_factory.py:0-0
Timestamp: 2025-09-28T14:09:54.325Z
Learning: In tests/release_notes/record/factory/test_default_record_factory.py, the mock_get_issues_for_pr_no_issues helper function parameter should remain as `str` type (not changed to `int`) because changing it to `int` causes test failures, even though the parameter is unused and flagged by linters.

Applied to files:

  • release_notes_generator/record/factory/default_record_factory.py
  • tests/unit/release_notes_generator/utils/test_pull_request_utils.py
  • tests/unit/release_notes_generator/record/factory/test_default_record_factory.py
  • tests/unit/conftest.py
📚 Learning: 2025-09-28T13:07:31.341Z
Learnt from: miroslavpojer
Repo: AbsaOSS/generate-release-notes PR: 173
File: release_notes_generator/record/factory/default_record_factory.py:0-0
Timestamp: 2025-09-28T13:07:31.341Z
Learning: In release_notes_generator/record/factory/default_record_factory.py, the user miroslavpojer prefers to keep unused parameters like `skip_rec` in the `register_pull_request` function for future usage, even when flagged by linters as unused.

Applied to files:

  • release_notes_generator/record/factory/default_record_factory.py
  • tests/unit/release_notes_generator/record/factory/test_default_record_factory.py
📚 Learning: 2025-09-24T07:24:03.693Z
Learnt from: miroslavpojer
Repo: AbsaOSS/generate-release-notes PR: 168
File: release_notes_generator/utils/pull_request_utils.py:69-70
Timestamp: 2025-09-24T07:24:03.693Z
Learning: In release_notes_generator/utils/pull_request_utils.py, the verify=False parameter in the requests.post call for GitHub API access should be kept as per the repository maintainer's explicit requirement, despite security best practices suggesting otherwise.

Applied to files:

  • release_notes_generator/record/factory/default_record_factory.py
  • tests/unit/release_notes_generator/utils/test_pull_request_utils.py
  • release_notes_generator/utils/pull_request_utils.py
🧬 Code graph analysis (4)
release_notes_generator/record/factory/default_record_factory.py (1)
release_notes_generator/utils/pull_request_utils.py (1)
  • get_issues_for_pr (57-94)
tests/unit/release_notes_generator/utils/test_pull_request_utils.py (1)
release_notes_generator/utils/pull_request_utils.py (1)
  • get_issues_for_pr (57-94)
release_notes_generator/utils/pull_request_utils.py (1)
tests/unit/release_notes_generator/utils/test_pull_request_utils.py (6)
  • graphql_query (105-120)
  • graphql_query (135-144)
  • graphql_query (160-161)
  • graphql_query (173-174)
  • graphql_query (186-196)
  • graphql_query (210-222)
tests/unit/conftest.py (1)
tests/unit/release_notes_generator/record/factory/test_default_record_factory.py (1)
  • mock_get_issues_for_pr (169-174)
🪛 Ruff (0.14.3)
tests/unit/release_notes_generator/utils/test_pull_request_utils.py

135-135: Unused method argument: query

(ARG002)


160-160: Unused method argument: query

(ARG002)


160-160: Unused method argument: headers

(ARG002)


173-173: Unused method argument: query

(ARG002)


186-186: Unused method argument: query

(ARG002)

tests/unit/release_notes_generator/record/factory/test_default_record_factory.py

341-341: Unused function argument: pull_number

(ARG001)


341-341: Unused function argument: requester

(ARG001)

release_notes_generator/utils/pull_request_utils.py

76-76: Avoid specifying long messages outside the exception class

(TRY003)


79-79: Prefer TypeError exception for invalid type

(TRY004)


79-79: Avoid specifying long messages outside the exception class

(TRY003)


84-84: Avoid specifying long messages outside the exception class

(TRY003)


87-87: Avoid specifying long messages outside the exception class

(TRY003)

tests/unit/conftest.py

60-60: Unused function argument: pull_number

(ARG001)


60-60: Unused function argument: requester

(ARG001)

🔇 Additional comments (6)
tests/unit/release_notes_generator/utils/test_pull_request_utils.py (6)

100-128: LGTM! Well-structured test for the new Requester interface.

The MockRequester correctly simulates the GraphQL query flow and validates all key aspects: query formatting, header injection, and response structure.


134-147: LGTM! Correct handling of empty nodes case.

The static analysis warning about the unused query parameter is a false positive—mock methods must match the expected signature.


172-178: LGTM! Correctly tests malformed response handling.

The test validates that missing nested keys in the payload raise a KeyError, which aligns with the production code's dictionary access pattern.


185-202: LGTM! Caching behavior correctly validated.

The test confirms that @lru_cache works as expected with the new requester parameter—identical PR numbers result in cache hits and avoid redundant network calls.


209-229: LGTM! Cache isolation correctly tested.

The test cleverly extracts the PR number from the formatted query to return PR-specific results, confirming that different PR numbers create distinct cache entries and trigger separate network calls.


135-135: Static analysis warnings are false positives for mock implementations.

The unused parameter warnings for query and headers in various MockRequester.graphql_query methods are expected—mock implementations must match the expected signature even when they don't use all parameters. This is a standard testing pattern and can be safely ignored.

Also applies to: 160-160, 173-173, 186-186

DEVELOPER.md Outdated

# Debugging statements
echo "PYTHONPATH: ${PYTHONPATH}"
echo "Current working directory: $(pwd)"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just my nitpick, I would use here the PROJECT_ROOT, since constant assignment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not, changed.

if "data" not in payload:
raise RuntimeError("Malformed GraphQL response: missing 'data'")

test_set = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this naming is missleading.

Copy link
Collaborator

@tmikula-dev tmikula-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please respond to my comments. Good job tho!

Copy link
Collaborator

@tmikula-dev tmikula-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. Nice.

@miroslavpojer miroslavpojer merged commit 15623af into master Nov 6, 2025
6 of 7 checks passed
@miroslavpojer miroslavpojer deleted the feature/206-Fix-issues-detected-by-Aquasec-scanner branch November 6, 2025 12:40
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
release_notes_generator/utils/pull_request_utils.py (2)

72-87: Consider using TypeError for type validation.

The error handling is comprehensive and correctly addresses both HTTP-level and GraphQL-level errors. However, line 79 should use TypeError rather than RuntimeError for type validation to better reflect the nature of the error.

Apply this diff:

     if not isinstance(payload, dict):
-        raise RuntimeError(f"Malformed response payload type: {type(payload)}")
+        raise TypeError(f"Expected dict payload, got {type(payload).__name__}")

Note: The static analysis tool also flags long error messages (TRY003), but these are acceptable given the need for clear diagnostics in API error handling.


89-92: Add defensive checks for nested GraphQL response structure.

The code accesses deeply nested keys without validation. If the response structure is incomplete (e.g., missing repository, pullRequest, or closingIssuesReferences), a KeyError will be raised at runtime.

Add defensive access:

     if "data" not in payload:
         raise RuntimeError("Malformed GraphQL response: missing 'data'")

+    try:
+        nodes = payload["data"]["repository"]["pullRequest"]["closingIssuesReferences"]["nodes"]
+    except KeyError as e:
+        raise RuntimeError(f"Malformed GraphQL response structure: missing key {e}") from e
+
     return {
         f"{owner}/{name}#{node['number']}"
-        for node in payload["data"]["repository"]["pullRequest"]["closingIssuesReferences"]["nodes"]
+        for node in nodes
     }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6ffbe5c and e1e2f10.

📒 Files selected for processing (2)
  • DEVELOPER.md (1 hunks)
  • release_notes_generator/utils/pull_request_utils.py (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-24T07:24:03.693Z
Learnt from: miroslavpojer
Repo: AbsaOSS/generate-release-notes PR: 168
File: release_notes_generator/utils/pull_request_utils.py:69-70
Timestamp: 2025-09-24T07:24:03.693Z
Learning: In release_notes_generator/utils/pull_request_utils.py, the verify=False parameter in the requests.post call for GitHub API access should be kept as per the repository maintainer's explicit requirement, despite security best practices suggesting otherwise.

Applied to files:

  • release_notes_generator/utils/pull_request_utils.py
🧬 Code graph analysis (1)
release_notes_generator/utils/pull_request_utils.py (1)
tests/unit/release_notes_generator/utils/test_pull_request_utils.py (6)
  • graphql_query (105-120)
  • graphql_query (135-144)
  • graphql_query (160-161)
  • graphql_query (173-174)
  • graphql_query (186-196)
  • graphql_query (210-222)
🪛 Ruff (0.14.3)
release_notes_generator/utils/pull_request_utils.py

76-76: Avoid specifying long messages outside the exception class

(TRY003)


79-79: Prefer TypeError exception for invalid type

(TRY004)


79-79: Avoid specifying long messages outside the exception class

(TRY003)


84-84: Avoid specifying long messages outside the exception class

(TRY003)


87-87: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (4)
DEVELOPER.md (3)

179-180: Environment setup looks good.

The PROJECT_ROOT assignment and PYTHONPATH append correctly prepare the local environment for action testing, preserving any pre-existing PYTHONPATH entries.


182-185: Debug output aids local testing.

The echo statements help developers verify the environment is initialized correctly. This is a helpful addition for debugging local test runs.


179-185: Minor context note: DEVELOPER.md changes are environmental setup only.

These changes enhance the local development experience by ensuring PYTHONPATH is correctly configured and providing visibility into the environment. However, they don't directly address the PR's primary objective (CWE-295 certificate validation fix). The actual remediation should be in the main codebase where HTTP/GraphQL requests are made. Confirm that the requester-based interface (mentioned in the AI summary) properly validates certificates in production.

release_notes_generator/utils/pull_request_utils.py (1)

24-25: LGTM! Appropriate imports for the refactored implementation.

The PyGithub imports replace the previous requests library usage, addressing the CWE-295 security concern.

Comment on lines 56 to +57
@lru_cache(maxsize=1024)
def get_issues_for_pr(pull_number: int) -> set[str]:
def get_issues_for_pr(pull_number: int, requester: Requester) -> set[str]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: lru_cache incompatible with Requester parameter.

The @lru_cache decorator requires all parameters to be hashable to generate cache keys. The Requester object is likely unhashable, which will cause a TypeError at runtime. Even if it were hashable, caching based on the requester instance doesn't make semantic sense—the same PR number should return the same results regardless of which requester instance is used.

Refactor to exclude the requester from the cache key:

-@lru_cache(maxsize=1024)
-def get_issues_for_pr(pull_number: int, requester: Requester) -> set[str]:
-    """Fetch closing issue numbers for a PR via GitHub GraphQL and return them as a set."""
+def get_issues_for_pr(pull_number: int, requester: Requester) -> set[str]:
+    """Fetch closing issue numbers for a PR via GitHub GraphQL and return them as a set."""
+    return _get_issues_for_pr_cached(pull_number, requester)
+
+
+@lru_cache(maxsize=1024)
+def _get_issues_for_pr_cached(pull_number: int, requester: Requester) -> set[str]:
+    """Internal cached implementation."""
     owner = ActionInputs.get_github_owner()
     name = ActionInputs.get_github_repo_name()

Wait, that doesn't solve the problem. Better approach:

+@lru_cache(maxsize=1024)
+def _fetch_issues_from_graphql(pull_number: int) -> set[str]:
+    """Fetch closing issue numbers for a PR via GitHub GraphQL and return them as a set."""
+    # This function cannot accept requester as it must remain cacheable
+    # Instead, we'll need to refactor to either:
+    # 1. Remove caching entirely if requester varies
+    # 2. Pass requester globally or through a different mechanism
+    # 3. Cache at a higher level where requester is consistent
+    pass
+
 @lru_cache(maxsize=1024)
 def get_issues_for_pr(pull_number: int, requester: Requester) -> set[str]:

Actually, let me provide a cleaner solution that removes the requester from the cache signature:

-@lru_cache(maxsize=1024)
 def get_issues_for_pr(pull_number: int, requester: Requester) -> set[str]:
     """Fetch closing issue numbers for a PR via GitHub GraphQL and return them as a set."""
+    # Delegate to cached implementation that only caches based on pull_number
+    return _get_issues_for_pr_impl(pull_number, requester)
+
+
+# Cache key is based on (pull_number, owner, name) tuple for the query
+_cache = {}
+
+
+def _get_issues_for_pr_impl(pull_number: int, requester: Requester) -> set[str]:
+    """Internal implementation with caching based on pull_number only."""
     owner = ActionInputs.get_github_owner()
     name = ActionInputs.get_github_repo_name()
+    
+    cache_key = (pull_number, owner, name)
+    if cache_key in _cache:
+        return _cache[cache_key]
+    
     query = ISSUES_FOR_PRS.format(
         number=pull_number,
         owner=owner,
         name=name,
         first=LINKED_ISSUES_MAX,
     )
     headers = {
         "Authorization": f"Bearer {ActionInputs.get_github_token()}",
         "Content-Type": "application/json",
     }

     try:
         headers, payload = requester.graphql_query(query, headers)
     except GithubException as e:
         # e.status (int), e.data (dict/str) often contains useful details
         raise RuntimeError(f"GitHub HTTP error {getattr(e, 'status', '?')}: {getattr(e, 'data', e)}") from e

     if not isinstance(payload, dict):
         raise RuntimeError(f"Malformed response payload type: {type(payload)}")

     # GraphQL-level errors come inside a successful HTTP 200
     if "errors" in payload:
         messages = "; ".join(err.get("message", str(err)) for err in payload["errors"])
         raise RuntimeError(f"GitHub GraphQL errors: {messages}")

     if "data" not in payload:
         raise RuntimeError("Malformed GraphQL response: missing 'data'")

-    return {
+    result = {
         f"{owner}/{name}#{node['number']}"
         for node in payload["data"]["repository"]["pullRequest"]["closingIssuesReferences"]["nodes"]
     }
+    
+    _cache[cache_key] = result
+    return result

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In release_notes_generator/utils/pull_request_utils.py around lines 56-57, the
function is decorated with @lru_cache but accepts a Requester object
(unhashable) which will raise TypeError and incorrectly couples cache keys to
the requester; refactor by removing requester from the cached function signature
— create a cached helper that takes only pull_number and returns the issue set,
and have a thin non-cached wrapper that accepts (pull_number, requester) and
calls the cached helper after using the requester to fetch or inject necessary
data; update type hints accordingly and remove @lru_cache from any function that
still accepts Requester.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix issues detected by Aquasec scanner - CWE-295: Improper Certificate Validation

3 participants