[seekdb][memtable] Replace global Iterator pool with session-level cache#915

Open
hnwyllmm wants to merge 1 commit into master from task/2026061600116771881

@hnwyllmm

Member

Task Description

Replace the global object pool for memtable range scan iterators (Iterator<BtreeIterator>, ~5KB each) with a session-level caching mechanism. The previous global ObFixedClassAllocator pool had three problems: its memory was charged to tenant 500 and was not observable, it only grew and never shrank when idle, and it was shared by foreground and background tasks without distinction.

Solution Description

Introduced a two-path dispatch mechanism to replace the global object pool:

  • Foreground SQL: Obtains the session via THIS_WORKER.get_session() and uses a session-level free list (ObBtreeIterCache) to cache iterators. Iterators are reused across SQL statements within the same session (zero allocation after prewarm). All memory is released when the session disconnects.
  • Background merge/row estimation: Session is nullptr, iterators are directly allocated/freed via ob_malloc/ob_free. Memory is released immediately upon task completion.
  • Thread safety: Only one worker thread executes per session at any given time (PX workers have independent deserialized sessions), so the free list requires no locking.
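
The two paths above can be sketched roughly as follows. This is an illustrative, standalone sketch and not the code in this PR: `ScanIter`, `SessionIterCache`, `acquire_iter`, and `release_iter` are hypothetical names standing in for Iterator<BtreeIterator>, ObBtreeIterCache, and the call sites, and plain `std::malloc`/`std::free` stand in for ob_malloc/ob_free (tenant attribution is omitted). A null session pointer models the background path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for Iterator<BtreeIterator> (~5KB each per the description).
struct ScanIter {
  ScanIter *next_free = nullptr;  // intrusive link, used only while cached
  unsigned char state[5 * 1024];  // placeholder payload
};

// Hypothetical per-session free list. It takes no lock, relying on the
// assumption stated above that one worker thread runs per session at a time.
class SessionIterCache {
public:
  SessionIterCache() = default;
  SessionIterCache(const SessionIterCache &) = delete;
  SessionIterCache &operator=(const SessionIterCache &) = delete;
  ~SessionIterCache() { clear(); }  // session disconnect frees everything

  // Pop a cached iterator, or return nullptr when the list is empty.
  ScanIter *pop() {
    ScanIter *it = head_;
    if (it != nullptr) {
      head_ = it->next_free;
      it->next_free = nullptr;
      --cached_;
    }
    return it;
  }

  // Return an iterator to the session for reuse by a later statement.
  void push(ScanIter *it) {
    it->next_free = head_;
    head_ = it;
    ++cached_;
  }

  // Free every cached iterator.
  void clear() {
    while (ScanIter *it = pop()) {
      destroy(it);
    }
  }

  std::size_t cached() const { return cached_; }

  // Direct allocation, also used by the background path.
  static ScanIter *create() {
    void *buf = std::malloc(sizeof(ScanIter));
    return buf != nullptr ? new (buf) ScanIter() : nullptr;
  }

  static void destroy(ScanIter *it) {
    it->~ScanIter();
    std::free(it);
  }

private:
  ScanIter *head_ = nullptr;
  std::size_t cached_ = 0;
};

// Foreground: reuse from the session cache, allocating only on a miss.
// Background (session == nullptr): always allocate directly.
ScanIter *acquire_iter(SessionIterCache *session) {
  if (session != nullptr) {
    if (ScanIter *it = session->pop()) {
      return it;
    }
  }
  return SessionIterCache::create();
}

// Foreground: hand the iterator back to the session cache.
// Background: free it immediately.
void release_iter(SessionIterCache *session, ScanIter *it) {
  if (it == nullptr) {
    return;
  }
  if (session != nullptr) {
    session->push(it);
  } else {
    SessionIterCache::destroy(it);
  }
}
```

In this sketch a second `acquire_iter(&cache)` after a `release_iter(&cache, it)` returns the same object without touching the allocator, which is the "zero allocation after prewarm" behavior described above. A real implementation would also need to reset iterator state before reuse; whether the cache is bounded in size is not stated in this description.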

Passed Regressions

  • test_query_engine: PASSED (1/1)
  • test_mvcc_callback: PASSED (19/19)
  • Functional verification: All passed for range scans, point queries, multi-range, scans after DML, transaction scans, reverse scans, LIMIT, scans after freeze+merge, cross-layer scans, 10 concurrent connections, and PX PARALLEL(4).
  • Sysbench oltp_read_only 4 threads 30s: TPS=278, QPS=4444, zero errors.
  • Sysbench oltp_point_select 4 threads 30s: TPS=3519, QPS=3519, zero errors.
  • Memory verification: BtreeIterCache is fully released after session disconnect, BtreeIter is freed immediately after merge completion, and the old Iterator<BtreeI tag no longer appears.

Upgrade Compatibility

No impact on upgrades. This is a purely internal memory management change with no persistent format modifications.

Other Information

Under no load, all btree iterator-related memory returns to zero, resolving the issue of permanent occupancy by the global pool's peak watermark.
Performance regression: http://:9090/jenkins/job/lite_perf_guard/1003/

Release Note

Replaced the global memory pool for B-tree iterators used in memtable range scans with a session-level cache, improving memory observability and reclaiming memory immediately when sessions end or background tasks complete.

… cache

The global ObFixedClassAllocator pool for memtable btree iterators was
pinned to tenant 500 and never released memory. This change introduces
a per-session free list (ObBtreeIterCache) so iterator memory is properly
attributed to the business tenant and released when sessions disconnect,
achieving full memory reclaim at idle.

Foreground SQL scans use the session cache via THIS_WORKER.get_session();
background merge and estimation paths use direct ob_malloc/ob_free with
proper tenant attribution via MTL_ID.

Co-Authored-By: Claude Opus 4.6 (1M context) <[REDACTED_EMAIL]>
@hnwyllmm

Member Author

The corresponding Dima issue tracks optimizing memory usage for the Iterator/BtreeI labels.

@footka

footka commented Jun 18, 2026

Member

src/sql/session/ob_sql_session_info.h:80

Does PX in seekdb still mock sessions?

@hnwyllmm

Member Author

src/sql/session/ob_sql_session_info.h:80

Why is it called a mock session? Isn't it just serializing and deserializing the session information once?
