Skip to content

small codefix for Calc crash-on-open AV — latent UAF, debug-CRT-deter…#485

Open
leginee wants to merge 1 commit into
trunkfrom
fix-letent-use-after-free-bug
Open

small codefix for Calc crash-on-open AV — latent UAF, debug-CRT-deter…#485
leginee wants to merge 1 commit into
trunkfrom
fix-letent-use-after-free-bug

Conversation

@leginee

@leginee leginee commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

but present in all builds.
ScViewData::ReadUserDataSequence (viewdata.cxx:2821) does delete pTabData[nTab]; pTabData[nTab] = new ScViewDataTable; per sheet but never refreshes pThisTab (which pointed at pTabData[nTabNo]). Back in ScTabView::SetTabNo, line 1660 reads aViewData.
GetActivePart() → pThisTab->eWhichActive before line 1663 fixes pThisTab → use-after-free → AV sc!ScTabView::SetTabNo mov ecx,[eax+edx*4+0x664] with edx=0xDDDDDDDD (pGridWin[0xdddddddd]).

This is a genuine latent UAF (reading a dangling pointer is UB); it is masked in release by unspecified allocator behaviour. The delete and new run in the same-thread, adjacent, same size class, with nothing between them. Mainstream allocators keep per-size free lists with MRU/LIFO reuse. New almost always returns the just-freed block — and when it does, pThisTab == pTabData[nTabNo] is true again and points at the live, freshly constructed object.

But it is not guaranteed, and can surface non-deterministically (AI anaysis, may be wrong):

  • Windows LFH randomizes allocation slots (Win8+ exploit mitigation), so new may return a different slot. In release the freed block isn't poisoned, so the immediate read usually still sees plausible old bytes — but the window here is not two instructions: the rest of ReadUserDataSequence parses all other sheets (lots of allocation), any of which can reclaim that block and overwrite the eWhichActive offset with a large value → pGridWin[garbage] → a rare, timing-dependent release crash.
  • Hardened/diagnostic allocators (PageHeap, Application Verifier, ASan, a custom operator new) don't do MRU reuse at all → they crash release too.
  • The debug CRT (MSVCR90D) is just the deterministic trigger: it poison-fills freed blocks with 0xDD and delays reuse, so new returns a different block every time.

@leginee

leginee commented Jun 17, 2026

Copy link
Copy Markdown
Contributor Author

hi Arrigo, Damjan.
can you review? Thanks.

but present in all builds.
ScViewData::ReadUserDataSequence (viewdata.cxx:2821) does delete
pTabData[nTab]; pTabData[nTab] = new ScViewDataTable; per sheet but
never refreshes pThisTab (which pointed at pTabData[nTabNo]). Back in
ScTabView::SetTabNo, line 1660 reads aViewData.
GetActivePart() → pThisTab->eWhichActive before line 1663 fixes
pThisTab → use-after-free → AV sc!ScTabView::SetTabNo
mov ecx,[eax+edx*4+0x664] with edx=0xDDDDDDDD (pGridWin[0xdddddddd]).

This is a genuine latent UAF (reading a dangling pointer is UB);
it is masked in release by unspecified allocator behaviour.
The delete and new run in the same-thread, adjacent, same size class,
with nothing between them. Mainstream allocators keep per-size free
lists with MRU/LIFO reuse. New almost always returns the just-freed
block — and when it does, pThisTab == pTabData[nTabNo] is true again and
points at the live, freshly constructed object.

But it is not guaranteed, and can surface non-deterministically
(AI anaysis, may be wrong):
- Windows LFH randomizes allocation slots (Win8+ exploit mitigation),
  so new may return a different slot. In release the freed block isn't
  poisoned, so the immediate read usually still sees plausible old bytes
   — but the window here is not two instructions: the rest of
  ReadUserDataSequence parses all other sheets (lots of allocation),
  any of which can reclaim that block and overwrite the eWhichActive
  offset with a large value → pGridWin[garbage] → a rare,
  timing-dependent release crash.
- Hardened/diagnostic allocators (PageHeap, Application Verifier, ASan,
  a custom operator new) don't do MRU reuse at all → they crash release
  too.
- The debug CRT (MSVCR90D) is just the deterministic trigger: it
  poison-fills freed blocks with 0xDD and delays reuse, so new returns
  a different block every time.
@leginee leginee force-pushed the fix-letent-use-after-free-bug branch from e6374a3 to 9dfdb73 Compare June 21, 2026 05:56
@ardovm

ardovm commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Do we have any way to reproduce the issue?
It could help to understand the difference between the proposed fix (CreateTabData( nTabNo )) and UpdateThis()

pTabData[nZoomTab]->aPageZoomY = aDefPageZoomY;
}

// The loop above delete'd and re-new'd pTabData[] entries (including the

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that "the loop above" should be at lines 2811-2828. This comment and the following code should be closer to that, shouldn't they?

In addition, I do not see the delete and new as deterministic as the comment states. Reading the code, I understand that pTabData[] entries could be recreated or not.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment refers to the code lines below that where added. Basically all we do is trigger a refresh of the index and nothing else.
The culprit lines are

							delete pTabData[nTab];
							pTabData[nTab] = new ScViewDataTable;

I think it is that far below because we do not read the index before that point. The If else clause i think ends in 2942. so we could move the code in 2954.
The code needs to be outside the if - else clause in order to ensure that the code is refreshed. Better would be not handle the index static, but recalculate the index every time we want to read it, because on today architecture thats faster. (says the AI, and it is plausible to me :D ) there are more details at:
https://github.com/apache/openoffice/blob/bazel-migration/bug-readme.md#14-calc-crash-on-open-av--latent-uaf-debug-crt-deterministic-not-a-migration-bug

// ~1502). Without it, a debug build AVs on document open (reads 0xDDDDDDDD);
// release masks it only via allocator MRU reuse.
CreateTabData( nTabNo );
pThisTab = pTabData[nTabNo];

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not rather call ScViewData::UpdateThis()?

This code creates pTabData[nTabNo] if it does not exist.
ScViewData::UpdateThis() searches for the last existing element of pTabData[] without changing its contents.

@leginee leginee Jun 22, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly i do not have the week of AI discussion anymore.

Do we have any way to reproduce the issue?

You need a debug build, that poisens freed memory. That will cause AOO to crash when you start calc. If i understood the AI correctly it is independant of Platform. Valgrind for example can do this on Linux.
But i dont know how that is done.
(from the AI:)
The memory explicitly lists the equivalents that also expose it: PageHeap / AppVerifier / ASan on Windows, and valgrind / MallocScribble / Guard Malloc / MALLOC_PERTURB_ on mac/linux (same shared source). ASan in particular would report it as a use-after-free with a clean read-after-free stack, not an opaque 0xDD AV — so it's actually a better surfacing tool than the debug CRT.

Here is how i crossed the issue: (adapted to the current code)
I did build AOO from the bazel migration branch using:
bazel build //main/staging:install --compilation_mode=dbg --//build:debug_info=True --features=generate_pdb --jobs=1 (takes a hole day, because of jobs=1, if you checkout head, then you should revert commit 2955bc9. Then build AOO with:
bazel build //main/staging:install --compilation_mode=dbg --//build:debug_info=True --features=generate_pdb
and you should get the build in 2h to 3h.)
Then run
cdb -g -G bazel-bin\main\staging\install\program\soffice.exe -env:UserInstallation=file:///C:/temp/ooo_crash -norestore
Go through the wizard until you are in the start menu, then select calc, calc should crash right away. Should be similar on other tools.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not rather call ScViewData::UpdateThis()?

This code creates pTabData[nTabNo] if it does not exist.
ScViewData::UpdateThis() searches for the last existing element of pTabData[] without changing its contents.

I let the AI answer:

Both would refresh pThisTab, but UpdateThis() is the wrong tool here — it carries a side effect that's appropriate for tab deletion, not for an active-tab read. Three reasons our two-line form is correct and UpdateThis() isn't:

  1. UpdateThis() mutates nTabNo — it clamps the active sheet downward.
void ScViewData::UpdateThis() {
    do {
        pThisTab = pTabData[nTabNo];
        if (!pThisTab) {
            if (nTabNo>0) --nTabNo;              // ← walks the active tab DOWN
            else pThisTab = pTabData[0] = new ScViewDataTable;
        }
    } while (!pThisTab);
}

That downward walk is exactly what DeleteTab/InsertTab (lines 515/528) want: tabs were removed, so nTabNo may now be out of range and must be pulled back to a surviving sheet. But ReadUserDataSequence is restoring the saved active tab — nTabNo is meaningful and must not move. If the active sheet happened to have no entry in the user data (pTabData[nTabNo] == NULL), UpdateThis() would silently switch the document to a different, lower-indexed sheet on open. That's a behavior regression, not a fix. CreateTabData(nTabNo) instead materializes that slot in place and leaves nTabNo alone.

  1. CreateTabData initializes the table; UpdateThis's fallback doesn't. CreateTabData (viewdata.cxx:1470) seeds the new ScViewDataTable with the default zoom (eDefZoomType, aDefZoomX, …) — which is the whole subject of the code right above our fix (the zoom-default loop at 2944-2953). UpdateThis's NULL branch only does a bare new ScViewDataTable (and only at index 0), with no zoom defaults. So our form stays consistent with the surrounding zoom logic.

  2. It's a verbatim copy of the file's own blessed idiom. SetTabNo (lines 1501-1502) is literally CreateTabData(nTabNo); pThisTab = pTabData[nTabNo];. For a user-authorized source divergence in a build-migration project, the right posture is the smallest change that mirrors existing correct code in the same translation unit — not reaching for a different helper with broader semantics. Lower risk, obviously-correct, easy to defend upstream.

Worth noting (from the memory): the proper structural fix is neither — it's to delete the cached pThisTab member entirely and derive it via a ThisTab() const { return pTabData[nTabNo]; } accessor, making the invariant correct by construction and killing the whole bug class. We didn't do that because the goal was a minimal, low-risk unblock, not a refactor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants