-
Notifications
You must be signed in to change notification settings - Fork 0
Clone bugfix/st synthetic named expr in match #12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…ic-named-expr-in-match
…ic-named-expr-in-match
…ic-named-expr-in-match
…ic-named-expr-in-match
…ic-named-expr-in-match
Summary of ChangesHello @visz11, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a bug in Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
WalkthroughIntroduces unique dummy-name generation and a helper to consistently name match subjects in TypeChecker, refactors visit_match_stmt to use it, and expands Python 3.10 pattern-matching tests with additional scenarios and previously xfailed case enabled. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant TC as TypeChecker
participant MS as MatchStmt
participant Expr as Subject Expression
participant Var as Dummy Var/NameExpr
TC->>MS: visit_match_stmt(s)
TC->>TC: _make_named_statement_for_match(s)
alt Subject already named or dummy provided
TC-->>Expr: Return existing named subject
else Need a named subject
TC->>TC: new_unique_dummy_name("match_subject")
TC->>Var: Create Var + NameExpr
TC->>MS: Set s.subject_dummy
TC-->>Expr: Return dummy NameExpr
end
TC->>TC: Perform case inference using named subject
TC-->>MS: Complete type checking for cases
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
/refacto-visz |
Refacto is reviewing this PR. Please wait for the review comments to be posted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors and extends the logic for handling complex subjects in match
statements. By creating a synthetic named expression for subjects that are not simple literals or variables, it enables more accurate type narrowing and exhaustiveness checking. The changes are well-implemented and include a comprehensive set of new tests that cover various new scenarios. My only suggestion is a minor improvement for maintainability.
expressions_to_preserve = ( | ||
# Already named - we should infer type of it as given | ||
NameExpr, | ||
AssignmentExpr, | ||
# Primitive literals - their type is known, no need to name them | ||
IntExpr, | ||
StrExpr, | ||
BytesExpr, | ||
FloatExpr, | ||
ComplexExpr, | ||
EllipsisExpr, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For better maintainability and a minor performance improvement, consider defining expressions_to_preserve
as a constant at the module or class level, rather than recreating it on each call to _make_named_statement_for_match
.
For example, you could define it at the module level:
_MATCH_SUBJECT_EXPRS_TO_PRESERVE: Final = (
# Already named - we should infer type of it as given
NameExpr,
AssignmentExpr,
# Primitive literals - their type is known, no need to name them
IntExpr,
StrExpr,
BytesExpr,
FloatExpr,
ComplexExpr,
EllipsisExpr,
)
class TypeChecker(...):
...
def _make_named_statement_for_match(self, s: MatchStmt) -> Expression:
subject = s.subject
if isinstance(subject, _MATCH_SUBJECT_EXPRS_TO_PRESERVE):
...
Code Review: Source Location Preservation in Match Statements👍 Well Done
📁 Selected files for review (2)
🎯 Custom Instructions
📝 Additional Comments
|
name = self.new_unique_dummy_name("match") | ||
v = Var(name) | ||
named_subject = NameExpr(name) | ||
named_subject.node = v | ||
s.subject_dummy = named_subject | ||
return named_subject |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Source Location Preservation
The synthetic NameExpr creation does not preserve source location information from the original subject expression. This violates the repository guideline requiring preservation of source-location info when synthesising dummy subjects. Missing location data can impact debugging and error reporting accuracy.
name = self.new_unique_dummy_name("match")
v = Var(name)
named_subject = NameExpr(name)
named_subject.node = v
# Preserve source location from original subject
named_subject.line = s.subject.line
named_subject.column = s.subject.column
s.subject_dummy = named_subject
return named_subject
Commitable Suggestion
name = self.new_unique_dummy_name("match") | |
v = Var(name) | |
named_subject = NameExpr(name) | |
named_subject.node = v | |
s.subject_dummy = named_subject | |
return named_subject | |
name = self.new_unique_dummy_name("match") | |
v = Var(name) | |
named_subject = NameExpr(name) | |
named_subject.node = v | |
# Preserve source location from original subject | |
named_subject.line = s.subject.line | |
named_subject.column = s.subject.column | |
s.subject_dummy = named_subject | |
return named_subject |
Standards
- Repo-Guideline-Preserve source‑location info when synthesising a dummy subject.
- CWE-209
expressions_to_preserve = ( | ||
# Already named - we should infer type of it as given | ||
NameExpr, | ||
AssignmentExpr, | ||
# Primitive literals - their type is known, no need to name them | ||
IntExpr, | ||
StrExpr, | ||
BytesExpr, | ||
FloatExpr, | ||
ComplexExpr, | ||
EllipsisExpr, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Enhanced Pattern Coverage
Expression preservation tuple could benefit from additional literal types like ListExpr, DictExpr, SetExpr for comprehensive literal coverage. Current implementation handles basic literals but may create unnecessary dummy names for compound literals with known types.
Standards
- SOLID-OCP
- Clean-Code-Functions
- Design-Pattern-Strategy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
mypy/checker.py (2)
431-447
: Reset the unique-id counter in reset()For determinism across runs and to avoid unbounded growth when reusing TypeChecker in FG incremental mode, reset the counter.
Apply this diff within reset():
self.expr_checker.reset() + + # Reset unique dummy-name counter for determinism across runs + self._unique_id = 0
5506-5535
: Use identifier‐safe dummy names and preserve source location
- In
_make_named_statement_for_match
, copy the subject’s line/column:@@ def _make_named_statement_for_match(self, s: MatchStmt) -> Expression: - named_subject.node = v + named_subject.node = v + named_subject.line = subject.line + named_subject.column = subject.column
- In
new_unique_dummy_name
, replace hyphens with underscores:@@ def new_unique_dummy_name(self, namespace: str) -> str: - name = f"dummy-{namespace}-{self._unique_id}" + name = f"__mypy_{namespace}_subject_{self._unique_id}"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
mypy/checker.py
(7 hunks)test-data/unit/check-python310.test
(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
mypy/checker.py (1)
mypy/nodes.py (23)
ComplexExpr
(1783-1797)MatchStmt
(1640-1667)Expression
(201-207)NameExpr
(1875-1895)AssignmentExpr
(2074-2087)IntExpr
(1696-1710)StrExpr
(1719-1733)BytesExpr
(1740-1763)FloatExpr
(1766-1780)EllipsisExpr
(1800-1806)name
(234-235)name
(353-354)name
(532-533)name
(571-577)name
(809-810)name
(907-908)name
(1068-1069)name
(2287-2288)name
(2572-2573)name
(3205-3207)name
(3752-3753)name
(3864-3865)Var
(973-1118)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
- GitHub Check: mypyc runtime tests with py39-macos
- GitHub Check: Test suite with py39-windows-64
- GitHub Check: Test suite with py311-ubuntu, mypyc-compiled
- GitHub Check: Type check our own code (py39-windows-64)
- GitHub Check: Formatting and code style with Black + ruff
- GitHub Check: Type check our own code (py39-ubuntu)
- GitHub Check: Test suite with py312-ubuntu, mypyc-compiled
- GitHub Check: Test suite with py310-ubuntu
- GitHub Check: Test suite with py313-ubuntu, mypyc-compiled
- GitHub Check: Test suite with py39-ubuntu, mypyc-compiled
- GitHub Check: Run mypy_primer (2)
- GitHub Check: Run mypy_primer (1)
- GitHub Check: Run mypy_primer (4)
- GitHub Check: Run mypy_primer (3)
- GitHub Check: Run mypy_primer (0)
- GitHub Check: Test mypyc suite with 32-bit Python
🔇 Additional comments (8)
test-data/unit/check-python310.test (6)
1305-1305
: LGTM! Test case enabled.Removing the
-xfail
suffix indicates this test now passes, which aligns with the PR's fix for synthetic named expressions in match statements.
2589-2601
: LGTM! Comprehensive function call matching test.This test case correctly verifies type narrowing when matching on function return values and appropriately flags unreachable code.
2602-2615
: LGTM! Attribute matching test is well-structured.This test case appropriately verifies type narrowing when matching on object attributes and correctly expects unreachable code detection.
2616-2644
: LGTM! Operations matching test covers key scenarios.This test case comprehensively verifies type narrowing for match subjects involving unary, binary, and comparison operations, with appropriate unreachable code detection.
2645-2659
: LGTM! Dict item matching test is correct.This test case appropriately verifies type narrowing when matching on dictionary item access and includes proper unreachable code detection.
2661-2671
: LGTM! Edge case test for literal matching.This test case appropriately verifies the pathological scenario of matching a literal value, including the expected type representation and unreachable code detection.
mypy/checker.py (2)
70-70
: Import of ComplexExpr is correctNeeded by expressions_to_preserve in _make_named_statement_for_match.
5421-5459
: Match subject naming and type-map propagation look solidRefactoring to a helper and propagating types from the synthetic name back to s.subject avoids losing narrowing for complex subjects. Guard handling is preserved. LGTM.
def new_unique_dummy_name(self, namespace: str) -> str: | ||
"""Generate a name that is guaranteed to be unique for this TypeChecker instance.""" | ||
name = f"dummy-{namespace}-{self._unique_id}" | ||
self._unique_id += 1 | ||
return name | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Identifier safety for dummy names
Hyphens in "dummy-match-" are non-identifier. Use an identifier-safe pattern to avoid surprises.
Apply:
- name = f"dummy-{namespace}-{self._unique_id}"
+ name = f"__mypy_{namespace}_subject_{self._unique_id}"
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def new_unique_dummy_name(self, namespace: str) -> str: | |
"""Generate a name that is guaranteed to be unique for this TypeChecker instance.""" | |
name = f"dummy-{namespace}-{self._unique_id}" | |
self._unique_id += 1 | |
return name | |
def new_unique_dummy_name(self, namespace: str) -> str: | |
"""Generate a name that is guaranteed to be unique for this TypeChecker instance.""" | |
name = f"__mypy_{namespace}_subject_{self._unique_id}" | |
self._unique_id += 1 | |
return name |
🤖 Prompt for AI Agents
In mypy/checker.py around lines 7915-7920, the generated dummy name uses hyphens
which are not valid Python identifiers; change the format to produce
identifier-safe names (e.g. use underscores or alphanumeric-only chars),
sanitize the namespace by replacing non-identifier characters with underscores,
ensure the name does not start with a digit (prefix with a letter or underscore
if needed), and keep the uniqueness mechanism (self._unique_id) intact so the
function returns something like "dummy_<sanitized_namespace>_<unique_id>".
Diff from mypy_primer, showing the effect of this PR on open source code: discord.py (https://github.com/Rapptz/discord.py)
- discord/ext/commands/hybrid.py:836: error: Overlap between argument names and ** TypedDict items: "name", "with_app_command" [misc]
+ discord/ext/commands/hybrid.py:836: error: Overlap between argument names and ** TypedDict items: "with_app_command", "name" [misc]
- discord/ext/commands/hybrid.py:860: error: Overlap between argument names and ** TypedDict items: "name", "with_app_command" [misc]
+ discord/ext/commands/hybrid.py:860: error: Overlap between argument names and ** TypedDict items: "with_app_command", "name" [misc]
- discord/ext/commands/hybrid.py:885: error: Overlap between argument names and ** TypedDict items: "name", "with_app_command" [misc]
+ discord/ext/commands/hybrid.py:885: error: Overlap between argument names and ** TypedDict items: "with_app_command", "name" [misc]
- discord/ext/commands/hybrid.py:937: error: Overlap between argument names and ** TypedDict items: "name", "with_app_command" [misc]
+ discord/ext/commands/hybrid.py:937: error: Overlap between argument names and ** TypedDict items: "with_app_command", "name" [misc]
- discord/ext/commands/bot.py:290: error: Overlap between argument names and ** TypedDict items: "name", "with_app_command" [misc]
+ discord/ext/commands/bot.py:290: error: Overlap between argument names and ** TypedDict items: "with_app_command", "name" [misc]
- discord/ext/commands/bot.py:314: error: Overlap between argument names and ** TypedDict items: "name", "with_app_command" [misc]
+ discord/ext/commands/bot.py:314: error: Overlap between argument names and ** TypedDict items: "with_app_command", "name" [misc]
|
(Explain how this PR changes mypy.)
Summary by CodeRabbit
Bug Fixes
Tests