small codefix for Calc crash-on-open AV — latent UAF, debug-CRT-deter…#485
small codefix for Calc crash-on-open AV — latent UAF, debug-CRT-deter…#485leginee wants to merge 1 commit into
Conversation
|
hi Arrigo, Damjan. |
but present in all builds. ScViewData::ReadUserDataSequence (viewdata.cxx:2821) does delete pTabData[nTab]; pTabData[nTab] = new ScViewDataTable; per sheet but never refreshes pThisTab (which pointed at pTabData[nTabNo]). Back in ScTabView::SetTabNo, line 1660 reads aViewData. GetActivePart() → pThisTab->eWhichActive before line 1663 fixes pThisTab → use-after-free → AV sc!ScTabView::SetTabNo mov ecx,[eax+edx*4+0x664] with edx=0xDDDDDDDD (pGridWin[0xdddddddd]). This is a genuine latent UAF (reading a dangling pointer is UB); it is masked in release by unspecified allocator behaviour. The delete and new run in the same-thread, adjacent, same size class, with nothing between them. Mainstream allocators keep per-size free lists with MRU/LIFO reuse. New almost always returns the just-freed block — and when it does, pThisTab == pTabData[nTabNo] is true again and points at the live, freshly constructed object. But it is not guaranteed, and can surface non-deterministically (AI anaysis, may be wrong): - Windows LFH randomizes allocation slots (Win8+ exploit mitigation), so new may return a different slot. In release the freed block isn't poisoned, so the immediate read usually still sees plausible old bytes — but the window here is not two instructions: the rest of ReadUserDataSequence parses all other sheets (lots of allocation), any of which can reclaim that block and overwrite the eWhichActive offset with a large value → pGridWin[garbage] → a rare, timing-dependent release crash. - Hardened/diagnostic allocators (PageHeap, Application Verifier, ASan, a custom operator new) don't do MRU reuse at all → they crash release too. - The debug CRT (MSVCR90D) is just the deterministic trigger: it poison-fills freed blocks with 0xDD and delays reuse, so new returns a different block every time.
e6374a3 to
9dfdb73
Compare
|
Do we have any way to reproduce the issue? |
| pTabData[nZoomTab]->aPageZoomY = aDefPageZoomY; | ||
| } | ||
|
|
||
| // The loop above delete'd and re-new'd pTabData[] entries (including the |
There was a problem hiding this comment.
I understand that "the loop above" should be at lines 2811-2828. This comment and the following code should be closer to that, shouldn't they?
In addition, I do not see the delete and new as deterministic as the comment states. Reading the code, I understand that pTabData[] entries could be recreated or not.
There was a problem hiding this comment.
The comment refers to the code lines below that where added. Basically all we do is trigger a refresh of the index and nothing else.
The culprit lines are
delete pTabData[nTab];
pTabData[nTab] = new ScViewDataTable;
I think it is that far below because we do not read the index before that point. The If else clause i think ends in 2942. so we could move the code in 2954.
The code needs to be outside the if - else clause in order to ensure that the code is refreshed. Better would be not handle the index static, but recalculate the index every time we want to read it, because on today architecture thats faster. (says the AI, and it is plausible to me :D ) there are more details at:
https://github.com/apache/openoffice/blob/bazel-migration/bug-readme.md#14-calc-crash-on-open-av--latent-uaf-debug-crt-deterministic-not-a-migration-bug
| // ~1502). Without it, a debug build AVs on document open (reads 0xDDDDDDDD); | ||
| // release masks it only via allocator MRU reuse. | ||
| CreateTabData( nTabNo ); | ||
| pThisTab = pTabData[nTabNo]; |
There was a problem hiding this comment.
Why not rather call ScViewData::UpdateThis()?
This code creates pTabData[nTabNo] if it does not exist.
ScViewData::UpdateThis() searches for the last existing element of pTabData[] without changing its contents.
There was a problem hiding this comment.
Sadly i do not have the week of AI discussion anymore.
Do we have any way to reproduce the issue?
You need a debug build, that poisens freed memory. That will cause AOO to crash when you start calc. If i understood the AI correctly it is independant of Platform. Valgrind for example can do this on Linux.
But i dont know how that is done.
(from the AI:)
The memory explicitly lists the equivalents that also expose it: PageHeap / AppVerifier / ASan on Windows, and valgrind / MallocScribble / Guard Malloc / MALLOC_PERTURB_ on mac/linux (same shared source). ASan in particular would report it as a use-after-free with a clean read-after-free stack, not an opaque 0xDD AV — so it's actually a better surfacing tool than the debug CRT.
Here is how i crossed the issue: (adapted to the current code)
I did build AOO from the bazel migration branch using:
bazel build //main/staging:install --compilation_mode=dbg --//build:debug_info=True --features=generate_pdb --jobs=1 (takes a hole day, because of jobs=1, if you checkout head, then you should revert commit 2955bc9. Then build AOO with:
bazel build //main/staging:install --compilation_mode=dbg --//build:debug_info=True --features=generate_pdb
and you should get the build in 2h to 3h.)
Then run
cdb -g -G bazel-bin\main\staging\install\program\soffice.exe -env:UserInstallation=file:///C:/temp/ooo_crash -norestore
Go through the wizard until you are in the start menu, then select calc, calc should crash right away. Should be similar on other tools.
There was a problem hiding this comment.
Why not rather call ScViewData::UpdateThis()?
This code creates pTabData[nTabNo] if it does not exist.
ScViewData::UpdateThis() searches for the last existing element of pTabData[] without changing its contents.
I let the AI answer:
Both would refresh pThisTab, but UpdateThis() is the wrong tool here — it carries a side effect that's appropriate for tab deletion, not for an active-tab read. Three reasons our two-line form is correct and UpdateThis() isn't:
- UpdateThis() mutates nTabNo — it clamps the active sheet downward.
void ScViewData::UpdateThis() {
do {
pThisTab = pTabData[nTabNo];
if (!pThisTab) {
if (nTabNo>0) --nTabNo; // ← walks the active tab DOWN
else pThisTab = pTabData[0] = new ScViewDataTable;
}
} while (!pThisTab);
}
That downward walk is exactly what DeleteTab/InsertTab (lines 515/528) want: tabs were removed, so nTabNo may now be out of range and must be pulled back to a surviving sheet. But ReadUserDataSequence is restoring the saved active tab — nTabNo is meaningful and must not move. If the active sheet happened to have no entry in the user data (pTabData[nTabNo] == NULL), UpdateThis() would silently switch the document to a different, lower-indexed sheet on open. That's a behavior regression, not a fix. CreateTabData(nTabNo) instead materializes that slot in place and leaves nTabNo alone.
-
CreateTabData initializes the table; UpdateThis's fallback doesn't. CreateTabData (viewdata.cxx:1470) seeds the new ScViewDataTable with the default zoom (eDefZoomType, aDefZoomX, …) — which is the whole subject of the code right above our fix (the zoom-default loop at 2944-2953). UpdateThis's NULL branch only does a bare new ScViewDataTable (and only at index 0), with no zoom defaults. So our form stays consistent with the surrounding zoom logic.
-
It's a verbatim copy of the file's own blessed idiom. SetTabNo (lines 1501-1502) is literally CreateTabData(nTabNo); pThisTab = pTabData[nTabNo];. For a user-authorized source divergence in a build-migration project, the right posture is the smallest change that mirrors existing correct code in the same translation unit — not reaching for a different helper with broader semantics. Lower risk, obviously-correct, easy to defend upstream.
Worth noting (from the memory): the proper structural fix is neither — it's to delete the cached pThisTab member entirely and derive it via a ThisTab() const { return pTabData[nTabNo]; } accessor, making the invariant correct by construction and killing the whole bug class. We didn't do that because the goal was a minimal, low-risk unblock, not a refactor.
but present in all builds.
ScViewData::ReadUserDataSequence (viewdata.cxx:2821) does delete pTabData[nTab]; pTabData[nTab] = new ScViewDataTable; per sheet but never refreshes pThisTab (which pointed at pTabData[nTabNo]). Back in ScTabView::SetTabNo, line 1660 reads aViewData.
GetActivePart() → pThisTab->eWhichActive before line 1663 fixes pThisTab → use-after-free → AV sc!ScTabView::SetTabNo mov ecx,[eax+edx*4+0x664] with edx=0xDDDDDDDD (pGridWin[0xdddddddd]).
This is a genuine latent UAF (reading a dangling pointer is UB); it is masked in release by unspecified allocator behaviour. The delete and new run in the same-thread, adjacent, same size class, with nothing between them. Mainstream allocators keep per-size free lists with MRU/LIFO reuse. New almost always returns the just-freed block — and when it does, pThisTab == pTabData[nTabNo] is true again and points at the live, freshly constructed object.
But it is not guaranteed, and can surface non-deterministically (AI anaysis, may be wrong):