Skip to content

Docs: clarify VLE semantics and cost model #2349

@gregfelice

Description

@gregfelice

Summary

The variable-length edge (VLE) expansion algorithm used for patterns like [*1..3] exhibits O(n^k) time complexity because it does not perform cycle detection or deduplication during recursive expansion. On graphs with cycles (which are common in real-world data), this causes exponential path explosion.

Reproduction

Setup: 100K nodes with edges forming natural cycles.

-- VLE query: find all nodes reachable within 1-3 hops
SELECT * FROM cypher('benchmark', $$
  MATCH (n:Node {bench_id: 42})-[*1..3]->(m)
  RETURN count(m)
$$) AS (cnt agtype);

On graphs with cycles, the expansion visits the same nodes exponentially many times because UNION ALL semantics are used internally (no deduplication).

Performance Data

Benchmark W06 (variable-length edge traversal):

Approach 10K scale 100K scale
Cypher VLE (before SDK workaround) ~23ms ~23ms
Recursive CTE with UNION (dedup) 0.08ms 0.34ms

The SDK workaround uses a recursive CTE with UNION (not UNION ALL) which provides implicit cycle detection through set deduplication, preventing exponential blowup:

WITH RECURSIVE vle AS (
  SELECT e.end_id AS node_id, 1 AS depth
  FROM benchmark."EDGE" e
  WHERE e.start_id = (SELECT id FROM benchmark."Node" WHERE _bench_id = 42)
  UNION  -- implicit dedup = cycle detection
  SELECT e.end_id, v.depth + 1
  FROM vle v
  JOIN benchmark."EDGE" e ON e.start_id = v.node_id
  WHERE v.depth < 3
)
SELECT count(*) FROM vle WHERE depth >= 1;

Suggested Fix

Add a visited-node set to the VLE expansion in the executor. When expanding [*min..max], track which (node_id, depth) pairs have already been explored and skip duplicates. This is equivalent to using UNION instead of UNION ALL in the recursive expansion.

Environment

  • PostgreSQL 18.2
  • Apache AGE 1.7.0 (built from source, branch release/PG18/1.7.0)
  • 32-core, 64GB RAM, Linux 6.17

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions