@YairVaknin-starkware
Collaborator

@YairVaknin-starkware YairVaknin-starkware commented Sep 4, 2025

TITLE

[BUGFIX] Fix_temp_segment_chain_bug

Description

Fixes relocation chaining: when a temp segment's relocation rule pointed to another temp segment (a multi-hop chain of temp segments), we weren’t resolving it all the way to its final destination.
We now flatten relocation rules so each temp maps directly to a concrete address (or int under extensive_hints).
Includes a cycle guard and rejects non-zero offsets when chains end at an int.

Factor out relocation-rule flattening into flatten_relocation_rules() with cfg variants:

  • non-extensive_hints: HashMap<usize, Relocatable>
  • extensive_hints: HashMap<usize, MaybeRelocatable>

Keep shared relocation flow in relocate_memory(); only preprocessing differs.
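
As a rough illustration of the flattening described above (non-extensive_hints case), here is a minimal, self-contained sketch. The `Relocatable` struct, the `FlattenError` enum, and the `temp_key` helper are simplified stand-ins invented for this example; they are assumptions, not the VM's actual definitions or error types. The cycle guard shown here (bounding the number of hops by the number of rules) is one possible approach and may differ from the one used in the PR.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the VM's address type (assumption for this sketch).
/// A negative `segment_index` denotes a temporary segment.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Relocatable {
    pub segment_index: isize,
    pub offset: usize,
}

/// Hypothetical error type; the PR itself reports `MemoryError` variants.
#[derive(Debug, PartialEq, Eq)]
pub enum FlattenError {
    Cycle(usize),
    MissingRule(usize),
}

/// Maps a temp segment index (-1, -2, ...) to its rule key (0, 1, ...).
fn temp_key(segment_index: isize) -> usize {
    (-(segment_index + 1)) as usize
}

/// Rewrites every rule so that it points directly at a real address.
pub fn flatten_relocation_rules(
    rules: &mut HashMap<usize, Relocatable>,
) -> Result<(), FlattenError> {
    let keys: Vec<usize> = rules.keys().copied().collect();
    // An acyclic chain can never need more hops than there are rules.
    let max_hops = rules.len();
    for key in keys {
        let mut dst = rules[&key];
        let mut hops = 0;
        while dst.segment_index < 0 {
            if hops >= max_hops {
                return Err(FlattenError::Cycle(key));
            }
            hops += 1;
            let next_key = temp_key(dst.segment_index);
            let next = *rules
                .get(&next_key)
                .ok_or(FlattenError::MissingRule(next_key))?;
            // Compose offsets: the offset accumulated so far is added to the next hop's base.
            dst = Relocatable {
                segment_index: next.segment_index,
                offset: next.offset + dst.offset,
            };
        }
        rules.insert(key, dst);
    }
    Ok(())
}
```

With rules `0 -> (-2, 3)` and `1 -> (5, 10)`, the sketch rewrites rule 0 to `(5, 13)`, so temp segment -1 maps straight to the real segment.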

Added unit tests:

Without extensive_hints

  • flatten_relocation_rules_chain_happy — temp→temp→real chain flattens correctly (offsets composed).
  • flatten_relocation_rules_cycle_err — detects cycle and returns MemoryError::Relocation.
  • flatten_relocation_rules_missing_next_err — dangling chain (no rule for next temp) returns MemoryError::UnallocatedSegment((next_key, temp_len)).

With extensive_hints

  • flatten_relocation_rules_chain_happy_extensive_reloc_and_int — mixed chain:
    • temp→real flattens with offset composition.
    • multi-hop temp→temp→…→int collapses to the final int when cumulative offset is zero.
  • flatten_relocation_rules_int_with_non_zero_offset_err — multi-hop ending in Int with non-zero offset returns MemoryError::NonZeroOffset.
  • flatten_relocation_rules_cycle_err_extensive — detects cycle and returns MemoryError::Relocation.
  • flatten_relocation_rules_missing_next_err_extensive — dangling chain (no rule for next temp) returns MemoryError::UnallocatedSegment((next_key, temp_len)).
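
The extensive_hints behaviour exercised by these tests can be sketched roughly as follows. This is an illustration only: `MaybeRelocatable::Int` holds a `u64` here instead of a field element, and `FlattenError` is a hypothetical error enum standing in for the `MemoryError` variants named above.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the VM's address type (assumption for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Relocatable {
    pub segment_index: isize,
    pub offset: usize,
}

/// Simplified stand-in: the real VM stores a field element, not a u64.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MaybeRelocatable {
    RelocatableValue(Relocatable),
    Int(u64),
}

/// Hypothetical error type used only in this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum FlattenError {
    Cycle(usize),
    MissingRule(usize),
    NonZeroOffset,
}

pub fn flatten_relocation_rules(
    rules: &mut HashMap<usize, MaybeRelocatable>,
) -> Result<(), FlattenError> {
    let keys: Vec<usize> = rules.keys().copied().collect();
    let max_hops = rules.len();
    for key in keys {
        let mut dst = rules[&key];
        let mut hops = 0;
        // Follow the chain only while the destination is still a temp address.
        while let MaybeRelocatable::RelocatableValue(r) = dst {
            if r.segment_index >= 0 {
                break;
            }
            if hops >= max_hops {
                return Err(FlattenError::Cycle(key));
            }
            hops += 1;
            let next_key = (-(r.segment_index + 1)) as usize;
            let next = *rules
                .get(&next_key)
                .ok_or(FlattenError::MissingRule(next_key))?;
            dst = match next {
                MaybeRelocatable::RelocatableValue(n) => {
                    MaybeRelocatable::RelocatableValue(Relocatable {
                        segment_index: n.segment_index,
                        offset: n.offset + r.offset,
                    })
                }
                // An int has no address space to offset into, so a non-zero
                // accumulated offset is rejected.
                MaybeRelocatable::Int(_) if r.offset != 0 => {
                    return Err(FlattenError::NonZeroOffset)
                }
                other => other,
            };
        }
        rules.insert(key, dst);
    }
    Ok(())
}
```

In this sketch a chain such as `0 -> (-2, 0)`, `1 -> Int(99)` collapses rule 0 to `Int(99)`, while `0 -> (-2, 1)`, `1 -> Int(99)` is rejected with `NonZeroOffset`.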

Integration (relocate_memory)

  • relocate_memory_temp_chain_to_reloc_multi_hop — exercises a multi-hop temp→temp→real chain end-to-end: references into temp memory are updated to real addresses, offsets are composed correctly, and the temp segment data is moved into the target real segment with consistency checks.
  • relocate_memory_temp_chain_to_int_multi_hop (with extensive_hints) — verifies the “collapse to Int” semantics: a chain like temp→temp→…→Int(99) causes all references to that temp to become Int(99), and the involved temp segments are dropped (their raw cells are not copied) by design.

Checklist

  • Linked to Github Issue
  • Unit tests added
  • Integration tests added.
  • This change requires new documentation.
    • Documentation has been added/updated.
    • CHANGELOG has been updated.


@github-actions

github-actions bot commented Sep 4, 2025

**Hyper Threading Benchmark results**




hyperfine -r 2 -n "hyper_threading_main threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_main' -n "hyper_threading_pr threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 1
  Time (mean ± σ):     24.482 s ±  0.050 s    [User: 23.810 s, System: 0.670 s]
  Range (min … max):   24.447 s … 24.518 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 1
  Time (mean ± σ):     24.802 s ±  0.019 s    [User: 24.140 s, System: 0.658 s]
  Range (min … max):   24.789 s … 24.815 s    2 runs
 
Summary
  hyper_threading_main threads: 1 ran
    1.01 ± 0.00 times faster than hyper_threading_pr threads: 1




hyperfine -r 2 -n "hyper_threading_main threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_main' -n "hyper_threading_pr threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 2
  Time (mean ± σ):     13.315 s ±  0.012 s    [User: 23.943 s, System: 0.667 s]
  Range (min … max):   13.306 s … 13.324 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 2
  Time (mean ± σ):     13.378 s ±  0.088 s    [User: 24.194 s, System: 0.615 s]
  Range (min … max):   13.316 s … 13.441 s    2 runs
 
Summary
  hyper_threading_main threads: 2 ran
    1.00 ± 0.01 times faster than hyper_threading_pr threads: 2




hyperfine -r 2 -n "hyper_threading_main threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_main' -n "hyper_threading_pr threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 4
  Time (mean ± σ):     10.134 s ±  0.233 s    [User: 36.060 s, System: 0.715 s]
  Range (min … max):    9.969 s … 10.299 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 4
  Time (mean ± σ):     10.016 s ±  0.278 s    [User: 36.991 s, System: 0.711 s]
  Range (min … max):    9.819 s … 10.212 s    2 runs
 
Summary
  hyper_threading_pr threads: 4 ran
    1.01 ± 0.04 times faster than hyper_threading_main threads: 4




hyperfine -r 2 -n "hyper_threading_main threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_main' -n "hyper_threading_pr threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 6
  Time (mean ± σ):     10.278 s ±  0.167 s    [User: 36.367 s, System: 0.755 s]
  Range (min … max):   10.160 s … 10.396 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 6
  Time (mean ± σ):      9.897 s ±  0.042 s    [User: 37.437 s, System: 0.737 s]
  Range (min … max):    9.867 s …  9.927 s    2 runs
 
Summary
  hyper_threading_pr threads: 6 ran
    1.04 ± 0.02 times faster than hyper_threading_main threads: 6




hyperfine -r 2 -n "hyper_threading_main threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_main' -n "hyper_threading_pr threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 8
  Time (mean ± σ):      9.891 s ±  0.020 s    [User: 37.207 s, System: 0.793 s]
  Range (min … max):    9.877 s …  9.906 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 8
  Time (mean ± σ):      9.989 s ±  0.071 s    [User: 37.205 s, System: 0.792 s]
  Range (min … max):    9.939 s … 10.040 s    2 runs
 
Summary
  hyper_threading_main threads: 8 ran
    1.01 ± 0.01 times faster than hyper_threading_pr threads: 8




hyperfine -r 2 -n "hyper_threading_main threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_main' -n "hyper_threading_pr threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 16
  Time (mean ± σ):     10.000 s ±  0.029 s    [User: 37.203 s, System: 0.875 s]
  Range (min … max):    9.980 s … 10.021 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 16
  Time (mean ± σ):     10.023 s ±  0.289 s    [User: 37.290 s, System: 0.833 s]
  Range (min … max):    9.818 s … 10.227 s    2 runs
 
Summary
  hyper_threading_main threads: 16 ran
    1.00 ± 0.03 times faster than hyper_threading_pr threads: 16


@github-actions

github-actions bot commented Sep 4, 2025

Benchmark Results for unmodified programs 🚀

| Command | Unit | Mean | Min | Max | Relative |
|---|---|---|---|---|---|
| base big_factorial | s | 2.044 ± 0.028 | 2.027 | 2.120 | 1.02 ± 0.02 |
| head big_factorial | s | 1.998 ± 0.027 | 1.969 | 2.053 | 1.00 |
| base big_fibonacci | s | 1.978 ± 0.021 | 1.962 | 2.028 | 1.03 ± 0.01 |
| head big_fibonacci | s | 1.914 ± 0.014 | 1.900 | 1.946 | 1.00 |
| base blake2s_integration_benchmark | s | 7.297 ± 0.085 | 7.193 | 7.485 | 1.05 ± 0.03 |
| head blake2s_integration_benchmark | s | 6.966 ± 0.148 | 6.875 | 7.353 | 1.00 |
| base compare_arrays_200000 | s | 2.112 ± 0.014 | 2.096 | 2.142 | 1.03 ± 0.01 |
| head compare_arrays_200000 | s | 2.045 ± 0.014 | 2.023 | 2.072 | 1.00 |
| base dict_integration_benchmark | s | 1.367 ± 0.005 | 1.361 | 1.375 | 1.01 ± 0.01 |
| head dict_integration_benchmark | s | 1.356 ± 0.007 | 1.348 | 1.368 | 1.00 |
| base field_arithmetic_get_square_benchmark | s | 1.194 ± 0.011 | 1.186 | 1.216 | 1.03 ± 0.01 |
| head field_arithmetic_get_square_benchmark | s | 1.164 ± 0.007 | 1.155 | 1.176 | 1.00 |
| base integration_builtins | s | 7.387 ± 0.113 | 7.281 | 7.633 | 1.04 ± 0.02 |
| head integration_builtins | s | 7.075 ± 0.094 | 6.966 | 7.201 | 1.00 |
| base keccak_integration_benchmark | s | 7.430 ± 0.069 | 7.366 | 7.592 | 1.02 ± 0.05 |
| head keccak_integration_benchmark | s | 7.284 ± 0.317 | 7.081 | 8.159 | 1.00 |
| base linear_search | s | 2.079 ± 0.013 | 2.063 | 2.110 | 1.03 ± 0.01 |
| head linear_search | s | 2.025 ± 0.018 | 2.005 | 2.058 | 1.00 |
| base math_cmp_and_pow_integration_benchmark | s | 1.438 ± 0.005 | 1.429 | 1.444 | 1.01 ± 0.01 |
| head math_cmp_and_pow_integration_benchmark | s | 1.420 ± 0.014 | 1.399 | 1.447 | 1.00 |
| base math_integration_benchmark | s | 1.405 ± 0.017 | 1.395 | 1.453 | 1.02 ± 0.01 |
| head math_integration_benchmark | s | 1.372 ± 0.009 | 1.363 | 1.387 | 1.00 |
| base memory_integration_benchmark | s | 1.160 ± 0.005 | 1.152 | 1.168 | 1.02 ± 0.01 |
| head memory_integration_benchmark | s | 1.134 ± 0.011 | 1.121 | 1.151 | 1.00 |
| base operations_with_data_structures_benchmarks | s | 1.484 ± 0.005 | 1.475 | 1.495 | 1.02 ± 0.01 |
| head operations_with_data_structures_benchmarks | s | 1.456 ± 0.008 | 1.448 | 1.474 | 1.00 |
| base pedersen | ms | 522.7 ± 1.7 | 520.8 | 525.7 | 1.02 ± 0.01 |
| head pedersen | ms | 512.8 ± 2.2 | 510.2 | 517.9 | 1.00 |
| base poseidon_integration_benchmark | ms | 598.3 ± 1.8 | 596.0 | 602.0 | 1.03 ± 0.01 |
| head poseidon_integration_benchmark | ms | 581.4 ± 5.1 | 574.6 | 592.9 | 1.00 |
| base secp_integration_benchmark | s | 1.794 ± 0.023 | 1.779 | 1.856 | 1.03 ± 0.01 |
| head secp_integration_benchmark | s | 1.749 ± 0.008 | 1.739 | 1.766 | 1.00 |
| base set_integration_benchmark | ms | 619.0 ± 19.4 | 610.1 | 673.7 | 1.03 ± 0.03 |
| head set_integration_benchmark | ms | 600.8 ± 3.1 | 597.4 | 608.0 | 1.00 |
| base uint256_integration_benchmark | s | 4.125 ± 0.079 | 4.071 | 4.314 | 1.04 ± 0.02 |
| head uint256_integration_benchmark | s | 3.954 ± 0.023 | 3.928 | 4.007 | 1.00 |

@YairVaknin-starkware YairVaknin-starkware force-pushed the yairv/fix_temp_segment_chain_bug branch from f8ad17b to 602c809 Compare September 4, 2025 10:32
@codecov

codecov bot commented Sep 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.71%. Comparing base (7b32c3e) to head (5b3a8d1).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2195      +/-   ##
==========================================
+ Coverage   96.69%   96.71%   +0.02%     
==========================================
  Files         104      104              
  Lines       44023    44339     +316     
==========================================
+ Hits        42567    42883     +316     
  Misses       1456     1456              


@YairVaknin-starkware
Collaborator Author

Will add tests next week for codecov

@gabrielbosio
Collaborator

Hi, @YairVaknin-starkware! It would be great if we had a description of this PR, just to keep track of the development in the base branch more easily.

@YairVaknin-starkware
Collaborator Author

> Hi, @YairVaknin-starkware! It would be great if we had a description of this PR, just to keep track of the development in the base branch more easily.

sure, done. PTAL.

@YairVaknin-starkware
Collaborator Author

> Will add tests next week for codecov

Done.

@YairVaknin-starkware
Collaborator Author

YairVaknin-starkware commented Sep 7, 2025

Also, please note that this is a quick and simple fix, since we assume this table won't grow too much (and each separate chain won't be long). I could also implement it so that we don't re-traverse intermediate chain entries once the value for the key that started the chain has been set, but as I said, that doesn't seem worth it, since it would require recording the visited entries in each chain.
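
For reference, the alternative mentioned above could look roughly like the sketch below: each walk records the keys it passes through and writes the resolved destination back to all of them, so later walks stop at the first already-flattened entry. This is only an illustration under simplified assumptions (`Relocatable`, `FlattenError`, and the function name are invented for the example), not the code in this PR.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the VM's address type (assumption for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Relocatable {
    pub segment_index: isize,
    pub offset: usize,
}

/// Hypothetical error type used only in this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum FlattenError {
    Cycle(usize),
    MissingRule(usize),
}

pub fn flatten_with_path_recording(
    rules: &mut HashMap<usize, Relocatable>,
) -> Result<(), FlattenError> {
    let keys: Vec<usize> = rules.keys().copied().collect();
    for start in keys {
        // (key, offset stored in that key's own rule) for every temp hop on this walk.
        let mut path: Vec<(usize, usize)> = Vec::new();
        let mut key = start;
        let mut dst = loop {
            let rule = *rules.get(&key).ok_or(FlattenError::MissingRule(key))?;
            if rule.segment_index >= 0 {
                // Real address, or an entry flattened by an earlier walk.
                break rule;
            }
            // Linear scan is acceptable under the assumption that chains are short.
            if path.iter().any(|&(k, _)| k == key) {
                return Err(FlattenError::Cycle(start));
            }
            path.push((key, rule.offset));
            key = (-(rule.segment_index + 1)) as usize;
        };
        // Unwind from the entry nearest the real address back to the start,
        // accumulating offsets and storing the result for every visited key.
        for (k, local_offset) in path.into_iter().rev() {
            dst.offset += local_offset;
            rules.insert(k, dst);
        }
    }
    Ok(())
}
```

The extra bookkeeping is the `path` vector; whether it pays off depends on how long chains get in practice.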

@gabrielbosio
Collaborator

Description looks good. Also I like the detailed tests.

  • Is it possible to add a test that calls relocate_memory like this one?
  • Is this something strictly related to the work being done in the starkware-development branch, or might there be a case where Cairo VM 2 has to handle a chain of temp segments?

@YairVaknin-starkware YairVaknin-starkware force-pushed the yairv/fix_temp_segment_chain_bug branch from d23f7a6 to 8ab7764 Compare September 16, 2025 17:24
@YairVaknin-starkware
Collaborator Author

> • Is it possible to add a test that calls relocate_memory like this one?

added. PTAL @FrancoGiachetta @gabrielbosio @Yael-Starkware.

> • Is this something strictly related to the work being done in the starkware-development branch, or might there be a case where Cairo VM 2 has to handle a chain of temp segments?

It's a bug that could occur in any cairo0 code, but I only know of a (future) use-case that's needed for Stwo's backend (so aligned with the changes in starkware-development currently).

Collaborator

@Yael-Starkware Yael-Starkware left a comment


Reviewable status: 0 of 2 files reviewed, 7 unresolved discussions (waiting on @FrancoGiachetta and @YairVaknin-starkware)


vm/src/vm/vm_memory/memory.rs line 290 at r2 (raw file):

    }
    #[cfg(not(feature = "extensive_hints"))]
    fn flatten_relocation_rules(&mut self) -> Result<(), MemoryError> {

This function and the next one have a lot of common logic; I'd merge them into one and use the cfg attribute only where they actually differ.

Code quote:

fn flatten_relocation_rules(&mut self) -> Result<(), MemoryError> {

vm/src/vm/vm_memory/memory.rs line 330 at r2 (raw file):

            loop {
                match dst {
                    MaybeRelocatable::RelocatableValue(r) if r.segment_index < 0 => {

Suggestion:

relocatable

vm/src/vm/vm_memory/memory.rs line 344 at r2 (raw file):

                        match next {
                            MaybeRelocatable::RelocatableValue(nr) => {

Suggestion:

next_relocatable

vm/src/vm/vm_memory/memory.rs line 381 at r2 (raw file):

        for segment in self.data.iter_mut().chain(self.temp_data.iter_mut()) {
            for cell in segment.iter_mut() {
                let value = cell.get_value();

how does a value from the segment turn into a relocatable?

Code quote:

 let value = cell.get_value();

vm/src/vm/vm_memory/memory.rs line 387 at r2 (raw file):

                            addr,
                            &self.relocation_rules,
                        )?);

Isn't that a duplicate of what happens in flatten_relocation_rules?

Code quote:

                    Some(MaybeRelocatable::RelocatableValue(addr)) if addr.segment_index < 0 => {
                        let mut new_cell = MemoryCell::new(Memory::relocate_address(
                            addr,
                            &self.relocation_rules,
                        )?);

@YairVaknin-starkware YairVaknin-starkware force-pushed the yairv/fix_temp_segment_chain_bug branch from 8ab7764 to 5b3a8d1 Compare October 19, 2025 08:41
@YairVaknin-starkware YairVaknin-starkware changed the base branch from starkware-development to main October 19, 2025 08:42
Collaborator Author

@YairVaknin-starkware YairVaknin-starkware left a comment


Reviewable status: 0 of 2 files reviewed, 7 unresolved discussions (waiting on @FrancoGiachetta and @Yael-Starkware)


vm/src/vm/vm_memory/memory.rs line 290 at r2 (raw file):

Previously, Yael-Starkware (YaelD) wrote…

> This function and the next one have a lot of common logic; I'd merge them into one and use the cfg attribute only where they actually differ.

They're almost entirely different. It's much cleaner that way imo. Using config per code line makes the code much less readable.


vm/src/vm/vm_memory/memory.rs line 381 at r2 (raw file):

Previously, Yael-Starkware (YaelD) wrote…

> how does a value from the segment turn into a relocatable?

Not sure, but if I understand your question: in each cell, the value is either an address (relocatable) or an absolute value (int).


vm/src/vm/vm_memory/memory.rs line 387 at r2 (raw file):

Previously, Yael-Starkware (YaelD) wrote…

> Isn't that a duplicate of what happens in flatten_relocation_rules?

This does the actual relocation after we decide on the final address of the temp segment in flatten_relocation_rules.
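
To make the split between the two steps concrete, here is a hedged sketch of the second step: once the rules are flattened, relocating a cell is a single lookup plus an offset addition. The types and the `relocate_value` / `relocate_segment` helpers are simplified, hypothetical stand-ins (the actual code calls `Memory::relocate_address` and works on `MemoryCell`s); leaving an address untouched when no rule exists is an assumption of this sketch.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the VM's address type (assumption for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Relocatable {
    pub segment_index: isize,
    pub offset: usize,
}

/// Simplified stand-in: the real VM stores a field element, not a u64.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MaybeRelocatable {
    RelocatableValue(Relocatable),
    Int(u64),
}

/// Relocates one cell value using already-flattened rules (hypothetical helper).
pub fn relocate_value(
    value: MaybeRelocatable,
    rules: &HashMap<usize, Relocatable>,
) -> MaybeRelocatable {
    match value {
        // Only addresses into temp segments (negative index) need rewriting.
        MaybeRelocatable::RelocatableValue(addr) if addr.segment_index < 0 => {
            let key = (-(addr.segment_index + 1)) as usize;
            match rules.get(&key) {
                // One lookup suffices because the rule already points at a real address.
                Some(base) => MaybeRelocatable::RelocatableValue(Relocatable {
                    segment_index: base.segment_index,
                    offset: base.offset + addr.offset,
                }),
                // Assumption: without a rule the address is left as-is.
                None => value,
            }
        }
        other => other,
    }
}

/// Applies `relocate_value` to every filled cell of a segment.
pub fn relocate_segment(
    segment: &mut [Option<MaybeRelocatable>],
    rules: &HashMap<usize, Relocatable>,
) {
    for cell in segment.iter_mut() {
        if let Some(v) = cell {
            *v = relocate_value(*v, rules);
        }
    }
}
```

For example, with the flattened rule `0 -> (5, 13)`, a cell holding the temp address `(-1, 2)` becomes `(5, 15)`, while ints and real addresses pass through unchanged.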


vm/src/vm/vm_memory/memory.rs line 330 at r2 (raw file):

            loop {
                match dst {
                    MaybeRelocatable::RelocatableValue(r) if r.segment_index < 0 => {

Done.


vm/src/vm/vm_memory/memory.rs line 344 at r2 (raw file):

                        match next {
                            MaybeRelocatable::RelocatableValue(nr) => {

Done.



4 participants