
fix: INHERITS edges missing for Java extends+implements #279

Merged
DeusData merged 1 commit into DeusData:main from loaychlih:fix/extract-base-classes-inherits-edges
May 9, 2026


@loaychlih
Contributor

Problem

extract_base_classes() in internal/cbm/extract_defs.c had two bugs
that caused most INHERITS edges to be missing for Java (and likely other
languages using similar AST field names).

Bug 1 — Early return loses multiple inheritance targets

The field loop returned immediately on the first match:

for (const char **f = fields; *f; f++) {
    TSNode super = ts_node_child_by_field_name(node, *f, ...);
    if (!ts_node_is_null(super)) {
        return make_single_base(...);  // ← returns here, never checks interfaces
    }
}

For class Foo extends Bar implements Baz, it found superclass → returned
immediately → never processed super_interfaces. Result: 1 edge instead of 2.
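
The difference between the two loop shapes can be sketched on a toy model. This is only an illustration: `ToyField`, `toy_field`, `bases_first_match` and `bases_all_matches` are hypothetical names, and a flat array of (field, text) pairs stands in for the tree-sitter node used in the real code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an AST node: a flat list of (field, text) pairs. */
typedef struct {
    const char *field;
    const char *text;
} ToyField;

/* Returns the text stored under `name`, or NULL if the node lacks that field. */
const char *toy_field(const ToyField *node, size_t n, const char *name) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(node[i].field, name) == 0) return node[i].text;
    }
    return NULL;
}

/* Buggy shape: returns as soon as one field matches. */
size_t bases_first_match(const ToyField *node, size_t n, const char **fields,
                         const char **out, size_t out_cap) {
    for (const char **f = fields; *f; f++) {
        const char *t = toy_field(node, n, *f);
        if (t != NULL && out_cap > 0) {
            out[0] = t;
            return 1; /* later fields are never inspected */
        }
    }
    return 0;
}

/* Fixed shape: visits every field and appends each match, bounded by out_cap. */
size_t bases_all_matches(const ToyField *node, size_t n, const char **fields,
                         const char **out, size_t out_cap) {
    size_t count = 0;
    for (const char **f = fields; *f && count < out_cap; f++) {
        const char *t = toy_field(node, n, *f);
        if (t != NULL) out[count++] = t;
    }
    return count;
}
```

For a node carrying both a `superclass` and a `super_interfaces` field, the first function reports one base and the second reports two.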

Bug 2 — cbm_node_text returns full node text including keywords

Calling cbm_node_text on the superclass field node returned
"extends Bar" instead of "Bar". On super_interfaces it returned
"implements Baz, Qux" instead of the individual names.

Fix

Added collect_bases_from_field() which:

  • Walks into child AST nodes to extract type_identifier / generic_type /
    qualified_name text directly (skips keyword nodes like extends/implements)
  • Handles type_list / interface_type_list children for multiple interfaces
  • Strips generic args at < (e.g. List<String> → List)
  • Falls back to raw cbm_node_text for languages where the field IS the type name

The field loop now collects from all matching fields before returning.
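
A minimal sketch of the collection logic described above, written against a toy AST rather than tree-sitter. `ToyNode`, `walk_types`, `collect_bases` and the fixed 64-byte name buffers are assumptions made for illustration; the actual `collect_bases_from_field()` in extract_defs.c may be structured differently.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CHILDREN 8
#define TOY_NAME_LEN 64

/* Hypothetical AST node; the real code operates on tree-sitter nodes. */
typedef struct ToyNode {
    const char *type; /* e.g. "type_identifier", "type_list", "implements" */
    const char *text; /* source text covered by the node */
    const struct ToyNode *children[TOY_MAX_CHILDREN];
    size_t child_count;
} ToyNode;

/* True for node types that directly name a base type. */
int is_type_name(const char *type) {
    return strcmp(type, "type_identifier") == 0 ||
           strcmp(type, "generic_type") == 0 ||
           strcmp(type, "qualified_name") == 0;
}

/* Copies `text` into `dst`, stopping at the first '<' to drop generic args. */
void copy_without_generics(char *dst, size_t dst_len, const char *text) {
    if (dst_len == 0) return;
    size_t n = strcspn(text, "<");
    if (n >= dst_len) n = dst_len - 1;
    memcpy(dst, text, n);
    dst[n] = '\0';
}

/* Appends type names found under `node`; keyword and comma nodes are skipped. */
size_t walk_types(const ToyNode *node, char out[][TOY_NAME_LEN],
                  size_t count, size_t out_cap) {
    for (size_t i = 0; i < node->child_count && count < out_cap; i++) {
        const ToyNode *c = node->children[i];
        if (is_type_name(c->type)) {
            copy_without_generics(out[count++], TOY_NAME_LEN, c->text);
        } else if (strcmp(c->type, "type_list") == 0 ||
                   strcmp(c->type, "interface_type_list") == 0) {
            count = walk_types(c, out, count, out_cap);
        }
    }
    return count;
}

/* Collects bases from one field; falls back to the raw field text when no
 * type child was found (grammars where the field IS the type name). */
size_t collect_bases(const ToyNode *field, char out[][TOY_NAME_LEN],
                     size_t count, size_t out_cap) {
    size_t before = count;
    count = walk_types(field, out, count, out_cap);
    if (count == before && count < out_cap) {
        strncpy(out[count], field->text, TOY_NAME_LEN - 1);
        out[count][TOY_NAME_LEN - 1] = '\0';
        count++;
    }
    return count;
}
```

Every write is guarded by `count < out_cap`, so the function cannot write past the caller's array, and the fallback only runs when the walk added nothing.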

Result

Tested on Modelio
(Java codebase, 253 files):

                 Before   After
INHERITS edges   3        116

Example edges now correctly emitted:

  • DefaultLinkTool → DefaultDiagramTool (extends)
  • DefaultLinkTool → ILinkTool (implements)

Two bugs in extract_base_classes() caused most INHERITS edges to be
missing for Java (and possibly other languages):

Bug 1 - Early return on first match: the field loop returned immediately
on the first matching field (e.g. superclass), never processing subsequent
fields (e.g. super_interfaces). Classes with both extends and implements
only produced one INHERITS edge instead of multiple.

Bug 2 - cbm_node_text returned full node text including keywords: calling
cbm_node_text on the superclass field node returned 'extends Bar' instead
of 'Bar', and on super_interfaces returned 'implements Baz, Qux' instead
of the individual names.

Fix: add collect_bases_from_field() which walks into child AST nodes to
extract type_identifier/generic_type/qualified_name text directly, handles
type_list children for multiple interfaces, and strips generic args at '<'.
The field loop now collects from all matching fields before returning.

Result on modelio codebase (Java): 3 -> 116 INHERITS edges.
@DeusData added the labels bug (Something isn't working) and parsing/quality (Graph extraction bugs, false positives, missing edges) on May 4, 2026
@DeusData
Owner

DeusData commented May 9, 2026

Thanks @loaychlih — clean diagnosis (two distinct bugs in the same function, with a nice repro on Modelio showing 3 → 116 INHERITS edges) and a bounds-safe fix that drops in cleanly alongside the existing arena-pool pattern in this file. Verified: max writes ≤ MAX_BASES_MINUS_1 with the per-iteration count < out_cap checks, no new system/network/file calls, the count == 0 fallback correctly preserves behavior for non-Java grammars where the field IS the type name. Merging now and pushing a small follow-up that adds a regression test for the extends+implements case so this doesn't silently regress.

@DeusData DeusData merged commit 323c68e into DeusData:main May 9, 2026
DeusData added a commit that referenced this pull request May 9, 2026
…es (#279)

Pins the bug-fix from #279 with a Java class declaring both extends and
implements. Asserts that base_classes contains:
  - the superclass name (DefaultDiagramTool)
  - every implements interface (ILinkTool, Closeable)
  - bare type names only — no 'extends' / 'implements' keyword text
    leaking into any entry

Without the fix, the field loop returned on the first match (only the
extends parent emitted) and cbm_node_text on the field returned the full
literal 'extends Bar' / 'implements Baz, Qux' string.
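
The three assertions listed in this commit message could be expressed roughly as follows. This is a sketch only: `has_keyword_leak`, `has_base` and `bases_look_correct` are hypothetical helpers, not the functions used in the project's test suite, and the keyword check is a plain substring search.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `entry` contains 'extends' or 'implements' keyword text. */
int has_keyword_leak(const char *entry) {
    return strstr(entry, "extends") != NULL ||
           strstr(entry, "implements") != NULL;
}

/* Returns 1 if `name` appears in the NULL-terminated `bases` array. */
int has_base(const char **bases, const char *name) {
    for (const char **b = bases; *b; b++) {
        if (strcmp(*b, name) == 0) return 1;
    }
    return 0;
}

/* Returns 1 when every expected name is present and no entry leaks a keyword. */
int bases_look_correct(const char **bases, const char **expected) {
    for (const char **b = bases; *b; b++) {
        if (has_keyword_leak(*b)) return 0;
    }
    for (const char **e = expected; *e; e++) {
        if (!has_base(bases, *e)) return 0;
    }
    return 1;
}
```

With the expected names DefaultDiagramTool, ILinkTool and Closeable, a list that holds only the superclass, or that holds 'extends DefaultDiagramTool', is rejected.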
DeusData added a commit that referenced this pull request May 9, 2026
Resolved conflict in Makefile.cbm: keep both TEST_STACK_OVERFLOW_SRCS
(from main, #217) and the new py_lsp test variables (TEST_SCOPE_SRCS,
TEST_TYPE_REP_SRCS, TEST_PY_LSP_SRCS, TEST_PY_LSP_BENCH_SRCS,
TEST_PY_LSP_STRESS_SRCS, TEST_PY_LSP_SCALE_SRCS) in ALL_TEST_SRCS.

Other auto-merged files: internal/cbm/extract_defs.c (PR #279),
tests/test_main.c (multiple suite registrations on each side).

Brings in 28 commits from main since the branch was forked at 8fbdb0f
(#207 thread safety): #208 decorator USAGE, #209 memory helpers, #210
refactor, #217 traversal stacks, #224 Svelte/Vue imports, #231
search_graph default limit, #243 path aliases, #249 GH Actions shell
injection, #251 incremental destructive overwrite, #257 temporal
properties, #265 Nix flake, #267-270/#289 dependabot, #273 Pine Script,
#278 AUR docs, #279 INHERITS edges, #281 get_architecture wiring +
follow-up, codeql revert.