@codeflash-ai codeflash-ai bot commented Aug 20, 2025

📄 214% speedup (2.14x faster, i.e. roughly 3.1x the original speed) for find_cycle_vertices in src/dsa/nodes.py

⏱️ Runtime: 48.9 milliseconds → 15.6 milliseconds (best of 170 runs)
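The percentages in this report appear to follow the formula (original - optimized) / optimized. That is inferred from the reported numbers, not taken from Codeflash documentation. A minimal sketch, with `speedup_percent` as a hypothetical helper name:

```python
def speedup_percent(original: float, optimized: float) -> float:
    """Relative speedup in percent; negative values mean a slowdown.

    Assumed formula: (original - optimized) / optimized * 100.
    Both runtimes must use the same unit.
    """
    return (original - optimized) / optimized * 100


# Runtimes copied from the tables and test comments in this report.
print(round(speedup_percent(151, 35.9)))     # test_complex_graph, reported as 321%
print(round(speedup_percent(120, 23.5)))     # test_figure_eight, reported as 411%
print(round(speedup_percent(640, 976), 1))   # 500 self-loops, reported as 34.4% slower
```

Because the displayed runtimes are rounded, recomputing a percentage from them can differ from the reported figure by about one point.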

📝 Explanation and details

The optimized code replaces the expensive nx.simple_cycles() call with nx.strongly_connected_components(), delivering a 214% speedup by fundamentally changing the algorithm approach.

Key Optimization:

  • Original: Enumerates every simple cycle with nx.simple_cycles(). This is expensive because each cycle is traversed explicitly, and the number of simple cycles can grow exponentially with the size of the graph
  • Optimized: Uses strongly connected components (SCCs) to identify cycle vertices. nx.strongly_connected_components() is based on Tarjan's algorithm, which runs in O(V+E) time
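To make the cost difference concrete, here is a small illustration on a graph that is not part of the PR's test suite: a complete directed graph on 5 vertices. It has only one strongly connected component but dozens of simple cycles, all of which the original approach would have to enumerate.

```python
import networkx as nx

# Complete directed graph on 5 vertices: every ordered pair (u, v), u != v, is an edge.
graph = nx.complete_graph(5, create_using=nx.DiGraph)

# Count simple cycles: sum over k = 2..5 of C(5, k) * (k - 1)! = 10 + 20 + 30 + 24.
simple_cycle_count = sum(1 for _ in nx.simple_cycles(graph))

# All 5 vertices reach each other, so there is a single SCC.
components = list(nx.strongly_connected_components(graph))

print(simple_cycle_count)  # 84
print(len(components))     # 1
```

The set of cycle vertices is the same either way (all 5 vertices), but the SCC pass visits each vertex and edge once instead of walking 84 cycles.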

Why This Works:
A vertex participates in a cycle if and only if:

  1. It's in an SCC with multiple vertices (multi-vertex cycles), OR
  2. It's in a single-vertex SCC with a self-loop
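The two conditions above can be sketched as follows. This is not the code from the PR, which is not shown in this conversation. It assumes the function takes a list of directed edges and returns a sorted list of vertices, as the generated tests suggest; `find_cycle_vertices_sketch` is a hypothetical name.

```python
import networkx as nx


def find_cycle_vertices_sketch(edges):
    """Return the sorted vertices that lie on at least one directed cycle."""
    graph = nx.DiGraph()
    graph.add_edges_from(edges)

    cycle_vertices = set()
    for component in nx.strongly_connected_components(graph):
        if len(component) > 1:
            # Condition 1: every vertex in a multi-vertex SCC is on some cycle.
            cycle_vertices.update(component)
        else:
            # Condition 2: a lone vertex is on a cycle only if it has a self-loop.
            (vertex,) = component
            if graph.has_edge(vertex, vertex):
                cycle_vertices.add(vertex)
    return sorted(cycle_vertices)


print(find_cycle_vertices_sketch([(1, 2), (2, 3), (3, 1), (4, 5)]))  # [1, 2, 3]
print(find_cycle_vertices_sketch([(1, 1), (2, 2), (3, 4)]))          # [1, 2]
```

The self-loop check is needed because every vertex forms an SCC of its own when it is not mutually reachable with any other vertex, whether or not it is on a cycle.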

Performance Analysis:
From the line profiler, the original spends 89.4% of time in nx.simple_cycles(), while the optimized version distributes work across SCC analysis (65.5%) and component processing. The SCC approach scales much better - it processes components once rather than enumerating all possible cycle paths.

Test Case Performance:

  • Best gains on graphs with many small or overlapping cycles, where cycle enumeration is most expensive: overlapping cycles are 404-417% faster, and 100 disjoint two-node cycles are 521% faster
  • Consistent speedup across the other cycle types: simple cycles (241-267% faster), disconnected cycles (310% faster), large single cycles (435-438% faster)
  • One exception: The 500-node self-loop test is 34% slower (640μs -> 976μs), attributed to the overhead of checking graph.has_edge(vertex, vertex) for each single-vertex SCC
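The self-loop exception is easier to see by counting what each approach has to process. The sketch below rebuilds the graph from test_large_graph_with_self_loops and counts components and cycles. It illustrates the explanation given above and is not a profile of the PR's code.

```python
import networkx as nx

# Same input as test_large_graph_with_self_loops: 500 vertices, each with a self-loop.
graph = nx.DiGraph()
graph.add_edges_from((i, i) for i in range(500))

# Every vertex is its own SCC, so the SCC approach performs 500 self-loop checks.
component_count = sum(1 for _ in nx.strongly_connected_components(graph))

# There are also only 500 simple cycles, each of length 1, so enumeration is cheap here.
cycle_count = sum(1 for _ in nx.simple_cycles(graph))

print(component_count, cycle_count)  # 500 500
```

With no cycle longer than one vertex, enumeration has nothing expensive to do, so the SCC pass plus the per-vertex check has no work to save.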

The optimization is particularly effective for real-world graphs with complex cycle structures where the original algorithm's cycle enumeration becomes prohibitively expensive.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 24 Passed |
| 🌀 Generated Regression Tests | 55 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
⚙️ Existing Unit Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| test_dsa_nodes.py::test_complex_graph | 151μs | 35.9μs | 321%✅ |
| test_dsa_nodes.py::test_cycle_with_extra_nodes_edges | 85.2μs | 28.6μs | 198%✅ |
| test_dsa_nodes.py::test_figure_eight | 120μs | 23.5μs | 411%✅ |
| test_dsa_nodes.py::test_multiple_disjoint_cycles | 100.0μs | 24.2μs | 312%✅ |
| test_dsa_nodes.py::test_multiple_overlapping_cycles | 123μs | 23.9μs | 417%✅ |
| test_dsa_nodes.py::test_no_cycles_dag | 34.3μs | 19.6μs | 75.3%✅ |
| test_dsa_nodes.py::test_self_loop | 22.1μs | 14.6μs | 51.0%✅ |
| test_dsa_nodes.py::test_simple_triangle_cycle | 69.4μs | 18.8μs | 270%✅ |
| test_dsa_nodes.py::test_simple_two_node_cycle | 59.3μs | 17.2μs | 244%✅ |
| test_dsa_nodes.py::test_string_vertices | 89.0μs | 29.2μs | 205%✅ |
🌀 Generated Regression Tests and Runtime
import networkx as nx  # for the function to test
# imports
import pytest  # used for our unit tests
from src.dsa.nodes import find_cycle_vertices

# unit tests

# 1. Basic Test Cases

def test_no_edges():
    # Graph with no edges and no nodes
    codeflash_output = find_cycle_vertices([]) # 18.6μs -> 9.88μs (88.6% faster)

def test_single_edge_no_cycle():
    # One edge, no cycle possible
    codeflash_output = find_cycle_vertices([(1, 2)]) # 28.5μs -> 16.1μs (77.2% faster)

def test_two_nodes_cycle():
    # Two nodes forming a cycle
    codeflash_output = find_cycle_vertices([(1, 2), (2, 1)]) # 58.2μs -> 17.0μs (241% faster)

def test_three_node_cycle():
    # Three nodes in a cycle
    codeflash_output = find_cycle_vertices([(1, 2), (2, 3), (3, 1)]) # 69.5μs -> 19.0μs (266% faster)

def test_three_node_path_no_cycle():
    # Three nodes in a path, no cycle
    codeflash_output = find_cycle_vertices([(1, 2), (2, 3)]) # 32.0μs -> 18.4μs (73.8% faster)

def test_disconnected_cycles():
    # Two disconnected cycles
    edges = [(1, 2), (2, 1), (3, 4), (4, 3)]
    codeflash_output = find_cycle_vertices(edges) # 87.2μs -> 21.2μs (310% faster)

def test_cycle_and_noncycle():
    # One cycle, one path
    edges = [(1, 2), (2, 3), (3, 1), (4, 5)]
    codeflash_output = find_cycle_vertices(edges) # 75.0μs -> 23.2μs (223% faster)

def test_self_loop():
    # Single node with a self-loop
    codeflash_output = find_cycle_vertices([(1, 1)]) # 22.0μs -> 14.4μs (52.3% faster)

def test_multiple_self_loops():
    # Multiple nodes with self-loops
    edges = [(1, 1), (2, 2), (3, 4)]
    codeflash_output = find_cycle_vertices(edges) # 32.2μs -> 21.2μs (52.4% faster)

def test_overlapping_cycles():
    # Overlapping cycles: 1-2-3-1 and 2-3-4-2
    edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 2)]
    codeflash_output = find_cycle_vertices(edges) # 110μs -> 21.7μs (410% faster)

# 2. Edge Test Cases

def test_empty_graph():
    # No nodes, no edges
    codeflash_output = find_cycle_vertices([]) # 18.7μs -> 9.79μs (90.6% faster)

def test_single_node_no_edges():
    # An edge list cannot express an isolated node, so this is the empty graph again
    codeflash_output = find_cycle_vertices([]) # 18.2μs -> 9.67μs (87.9% faster)

def test_single_node_self_loop():
    # One node, self-loop
    codeflash_output = find_cycle_vertices([(0, 0)]) # 22.1μs -> 14.6μs (51.0% faster)

def test_large_cycle():
    # Cycle of 11 nodes (0 through 10)
    edges = [(i, i+1) for i in range(10)]
    edges.append((10, 0))
    codeflash_output = find_cycle_vertices(edges) # 154μs -> 36.5μs (323% faster)

def test_cycle_with_tail():
    # Cycle with a tail node leading into it
    edges = [(0, 1), (1, 2), (2, 0), (3, 0)]
    codeflash_output = find_cycle_vertices(edges) # 73.1μs -> 21.5μs (239% faster)

def test_cycle_with_exit():
    # Cycle with an edge leaving the cycle
    edges = [(1, 2), (2, 3), (3, 1), (3, 4)]
    codeflash_output = find_cycle_vertices(edges) # 73.1μs -> 21.4μs (241% faster)

def test_multiple_components_some_with_cycles():
    # Multiple components, some cyclic, some not
    edges = [(1, 2), (2, 3), (3, 1), (4, 5), (6, 7), (7, 6), (8, 9)]
    codeflash_output = find_cycle_vertices(edges) # 109μs -> 31.4μs (250% faster)

def test_duplicate_edges():
    # Duplicate edges in the input
    edges = [(1, 2), (2, 3), (3, 1), (1, 2), (2, 3)]
    codeflash_output = find_cycle_vertices(edges) # 69.3μs -> 20.1μs (245% faster)

def test_graph_with_isolated_nodes():
    # Isolated nodes cannot appear in an edge list, so only the 2-node cycle is present
    edges = [(1, 2), (2, 1)]
    codeflash_output = find_cycle_vertices(edges) # 58.0μs -> 16.8μs (245% faster)

def test_graph_with_negative_and_zero_nodes():
    # Negative and zero as node labels
    edges = [(0, -1), (-1, -2), (-2, 0), (1, 2)]
    codeflash_output = find_cycle_vertices(edges) # 91.2μs -> 29.2μs (212% faster)

def test_graph_with_string_nodes():
    # Node labels are strings
    edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]
    codeflash_output = find_cycle_vertices(edges) # 80.3μs -> 24.0μs (235% faster)


def test_large_acyclic_graph():
    # Large DAG, should return empty list
    edges = [(i, i+1) for i in range(1000)]
    codeflash_output = find_cycle_vertices(edges) # 3.13ms -> 1.81ms (72.8% faster)

def test_large_single_cycle():
    # Large cycle of 1000 nodes
    edges = [(i, i+1) for i in range(999)]
    edges.append((999, 0))
    codeflash_output = find_cycle_vertices(edges) # 9.49ms -> 1.77ms (435% faster)

def test_large_graph_with_multiple_small_cycles():
    # 10 cycles of 10 nodes each, disconnected
    edges = []
    for k in range(10):
        base = k*10
        for i in range(10):
            edges.append((base+i, base+(i+1)%10))
    codeflash_output = find_cycle_vertices(edges); result = codeflash_output # 1.08ms -> 200μs (436% faster)

def test_large_graph_with_cycles_and_paths():
    # 5 cycles of 10 nodes, and 50 node path
    edges = []
    for k in range(5):
        base = k*10
        for i in range(10):
            edges.append((base+i, base+(i+1)%10))
    # Add a path
    edges += [(100+i, 100+i+1) for i in range(49)]
    codeflash_output = find_cycle_vertices(edges); result = codeflash_output # 721μs -> 205μs (252% faster)

def test_large_graph_sparse_cycles():
    # 100 cycles of 2 nodes each
    edges = []
    for i in range(0, 200, 2):
        edges.append((i, i+1))
        edges.append((i+1, i))
    codeflash_output = find_cycle_vertices(edges); result = codeflash_output # 2.36ms -> 380μs (521% faster)

def test_large_graph_with_self_loops():
    # 500 nodes, each with a self-loop
    edges = [(i, i) for i in range(500)]
    codeflash_output = find_cycle_vertices(edges); result = codeflash_output # 640μs -> 976μs (34.4% slower)

def test_large_graph_mixed():
    # 250 nodes in a single large cycle, 250 nodes with self-loops, 250 node path
    edges = []
    # Large cycle
    for i in range(250):
        edges.append((i, (i+1)%250))
    # Self-loops
    for i in range(250, 500):
        edges.append((i, i))
    # Path
    for i in range(500, 749):
        edges.append((i, i+1))
    codeflash_output = find_cycle_vertices(edges); result = codeflash_output # 3.60ms -> 1.40ms (157% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import networkx as nx
# imports
import pytest  # used for our unit tests
from src.dsa.nodes import find_cycle_vertices

# unit tests

# ------------------------
# Basic Test Cases
# ------------------------

def test_no_edges_empty_graph():
    # No edges, no vertices, so no cycles
    codeflash_output = find_cycle_vertices([]) # 19.0μs -> 10.0μs (88.8% faster)

def test_single_vertex_no_cycle():
    # Despite the test name, (1, 1) is a self-loop, so vertex 1 lies on a cycle
    codeflash_output = find_cycle_vertices([(1, 1)]) # 22.5μs -> 14.7μs (53.0% faster)

def test_two_node_cycle():
    # Simple 2-node cycle: 1->2->1
    codeflash_output = find_cycle_vertices([(1, 2), (2, 1)]) # 58.6μs -> 16.9μs (247% faster)

def test_three_node_cycle():
    # Simple 3-node cycle: 1->2->3->1
    codeflash_output = find_cycle_vertices([(1, 2), (2, 3), (3, 1)]) # 69.4μs -> 18.9μs (267% faster)

def test_disconnected_cycle_and_noncycle():
    # 1->2->3->1 is a cycle, 4->5 is not
    edges = [(1, 2), (2, 3), (3, 1), (4, 5)]
    codeflash_output = find_cycle_vertices(edges) # 75.3μs -> 23.0μs (227% faster)

def test_multiple_disconnected_cycles():
    # Two cycles: 1->2->1 and 3->4->5->3
    edges = [(1, 2), (2, 1), (3, 4), (4, 5), (5, 3)]
    codeflash_output = find_cycle_vertices(edges) # 96.8μs -> 23.6μs (310% faster)

def test_cycle_with_tail():
    # 1->2->3->1 is a cycle, 0->1 is a tail
    edges = [(0, 1), (1, 2), (2, 3), (3, 1)]
    codeflash_output = find_cycle_vertices(edges) # 72.2μs -> 21.5μs (236% faster)

def test_multiple_cycles_sharing_vertices():
    # 1->2->3->1 and 2->4->5->2
    edges = [(1, 2), (2, 3), (3, 1), (2, 4), (4, 5), (5, 2)]
    codeflash_output = find_cycle_vertices(edges) # 122μs -> 23.8μs (414% faster)

# ------------------------
# Edge Test Cases
# ------------------------

def test_self_loop():
    # Single self-loop
    codeflash_output = find_cycle_vertices([(42, 42)]) # 22.0μs -> 14.7μs (50.3% faster)

def test_multiple_self_loops():
    # Multiple self-loops, disconnected
    edges = [(1, 1), (2, 2), (3, 3)]
    codeflash_output = find_cycle_vertices(edges) # 25.2μs -> 19.3μs (30.4% faster)

def test_cycle_with_self_loop():
    # 1->2->3->1 is a cycle, 2->2 is a self-loop (should only appear once)
    edges = [(1, 2), (2, 3), (3, 1), (2, 2)]
    codeflash_output = find_cycle_vertices(edges) # 69.9μs -> 19.9μs (252% faster)

def test_cycle_with_non_participating_nodes():
    # 1->2->3->1 is a cycle, 4->5->6 is not
    edges = [(1, 2), (2, 3), (3, 1), (4, 5), (5, 6)]
    codeflash_output = find_cycle_vertices(edges) # 79.1μs -> 25.8μs (206% faster)

def test_empty_edges():
    # No edges at all
    codeflash_output = find_cycle_vertices([]) # 18.4μs -> 9.67μs (90.5% faster)

def test_cycle_with_duplicate_edges():
    # 1->2->3->1 is a cycle, with duplicate edges
    edges = [(1, 2), (2, 3), (3, 1), (1, 2), (2, 3)]
    codeflash_output = find_cycle_vertices(edges) # 69.6μs -> 19.8μs (252% faster)

def test_large_single_vertex_self_loop():
    # Large value vertex with self-loop
    codeflash_output = find_cycle_vertices([(999999, 999999)]) # 22.9μs -> 16.0μs (43.0% faster)

def test_cycle_with_isolated_vertex():
    # 1->2->3->1 is a cycle; 4 is disconnected from it but has its own self-loop (4, 4)
    edges = [(1, 2), (2, 3), (3, 1)]
    codeflash_output = find_cycle_vertices(edges + [(4, 4)]) # 70.9μs -> 21.5μs (230% faster)

def test_cycle_with_non_integer_nodes():
    # Using string nodes
    edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]
    codeflash_output = find_cycle_vertices(edges) # 80.3μs -> 24.4μs (229% faster)


def test_multiple_cycles_with_overlap():
    # 1->2->3->1 and 3->4->5->3 (3 is shared)
    edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 3)]
    codeflash_output = find_cycle_vertices(edges) # 128μs -> 25.0μs (414% faster)

def test_cycle_with_branches():
    # 1->2->3->1 is a cycle, 2->4 is a branch
    edges = [(1, 2), (2, 3), (3, 1), (2, 4)]
    codeflash_output = find_cycle_vertices(edges) # 73.6μs -> 22.2μs (232% faster)

# ------------------------
# Large Scale Test Cases
# ------------------------

def test_large_acyclic_graph():
    # Large DAG: no cycles
    edges = [(i, i+1) for i in range(1000)]
    codeflash_output = find_cycle_vertices(edges) # 3.13ms -> 1.81ms (73.2% faster)

def test_large_single_cycle():
    # Large cycle: 0->1->2->...->999->0
    edges = [(i, (i+1)%1000) for i in range(1000)]
    codeflash_output = find_cycle_vertices(edges) # 9.50ms -> 1.77ms (438% faster)

def test_large_graph_with_multiple_small_cycles():
    # 10 cycles of length 10, disjoint
    edges = []
    for base in range(0, 100, 10):
        for i in range(base, base+10):
            edges.append((i, base + ((i-base+1)%10)))
    codeflash_output = find_cycle_vertices(edges) # 1.08ms -> 203μs (433% faster)

def test_large_graph_with_cycles_and_noncycles():
    # 500-node cycle, 500-node chain
    cycle_edges = [(i, (i+1)%500) for i in range(500)]
    chain_edges = [(i+500, i+501) for i in range(499)]
    edges = cycle_edges + chain_edges
    codeflash_output = find_cycle_vertices(edges) # 6.40ms -> 1.80ms (255% faster)

def test_large_graph_with_self_loops_and_cycles():
    # 100 self-loops, 100-node cycle
    self_loops = [(i, i) for i in range(100)]
    cycle_edges = [(100+i, 100+((i+1)%100)) for i in range(100)]
    edges = self_loops + cycle_edges
    codeflash_output = find_cycle_vertices(edges) # 1.13ms -> 389μs (191% faster)

# ------------------------
# Miscellaneous/Regression Tests
# ------------------------

def test_cycle_vertices_are_sorted():
    # Ensure output is sorted
    edges = [(3, 1), (1, 2), (2, 3)]
    codeflash_output = find_cycle_vertices(edges); result = codeflash_output # 69.8μs -> 19.4μs (259% faster)

def test_multiple_cycles_with_duplicate_nodes():
    # 1->2->1 and 2->3->4->2, node 2 in both cycles
    edges = [(1, 2), (2, 1), (2, 3), (3, 4), (4, 2)]
    codeflash_output = find_cycle_vertices(edges) # 111μs -> 22.2μs (404% faster)

def test_large_sparse_graph_with_one_cycle():
    # Chain 0->1->...->997 leading into the 3-node cycle 997->998->999->997
    edges = [(i, i+1) for i in range(997)] + [(997, 998), (998, 999), (999, 997)]
    codeflash_output = find_cycle_vertices(edges) # 3.14ms -> 1.80ms (74.4% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
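The generated tests above contain no assertions. As the closing comment says, correctness is checked by comparing the output of the original and optimized code. A minimal sketch of that kind of equivalence check follows. Both functions are hypothetical stand-ins (the PR's actual implementations are not shown here), and the inputs are copied from the tests above.

```python
import networkx as nx


def build_graph(edges):
    graph = nx.DiGraph()
    graph.add_edges_from(edges)
    return graph


def cycle_vertices_by_enumeration(edges):
    # Stand-in for the original approach: collect vertices from every simple cycle.
    vertices = set()
    for cycle in nx.simple_cycles(build_graph(edges)):
        vertices.update(cycle)
    return sorted(vertices)


def cycle_vertices_by_scc(edges):
    # Stand-in for the optimized approach: multi-vertex SCCs plus self-loop vertices.
    graph = build_graph(edges)
    vertices = set()
    for component in nx.strongly_connected_components(graph):
        if len(component) > 1 or any(graph.has_edge(v, v) for v in component):
            vertices.update(component)
    return sorted(vertices)


CASES = [
    [],                                                # empty graph
    [(1, 1)],                                          # single self-loop
    [(1, 2), (2, 3)],                                  # path, no cycle
    [(1, 2), (2, 3), (3, 1), (3, 4), (4, 2)],          # overlapping cycles
    [(1, 2), (2, 3), (3, 1), (2, 2)],                  # cycle plus self-loop
    [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")],  # string vertices
]

for edges in CASES:
    assert cycle_vertices_by_enumeration(edges) == cycle_vertices_by_scc(edges)
```

Agreement on a handful of inputs is evidence, not proof. The equivalence itself rests on the SCC argument given in the explanation.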

To edit these changes, run git checkout codeflash/optimize-find_cycle_vertices-mejgju1v and push.

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Aug 20, 2025
@codeflash-ai codeflash-ai bot requested a review from KRRT7 August 20, 2025 04:13