@nipunbatra8 nipunbatra8 commented Aug 12, 2025

Motivation

Add a reproducible “update storm” workload to study merge behavior and evaluate a potential bandwidth-capped merge scheduler. This is also intended for future contributors to benchmark and tune their own merge schedulers under realistic update-storm conditions.

Summary of changes

  • Added periodic docs/sec ramp (×2 every 20s) to simulate storms; steady “peace time” afterward.
  • Exposed knobs to vary indexing/search load and reopen cadence.
  • Set aggressive TMP delete reclaim (setDeletesPctAllowed(2.0)) for storm scenarios.
  • Directed InfoStream to file for offline analysis and graphing.
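As a rough illustration of the ramp described in the list above, here is a minimal Python sketch of a target-rate schedule that doubles every 20 seconds and then settles. It is not the code in NRTPerfTest.java. The function name and the `storm_sec` and `peace_dps` parameters are hypothetical, since the description does not state how long the storm lasts or what the peace-time rate is.

```python
def docs_per_sec_at(t_sec, base_dps=200.0, ramp_every_sec=20.0,
                    storm_sec=100.0, peace_dps=200.0):
    """Return the target indexing rate (docs/sec) at t_sec seconds into the run.

    Assumptions (not taken from the PR): the storm lasts storm_sec seconds,
    and the rate afterwards is a fixed peace_dps.
    """
    if t_sec < 0:
        raise ValueError("t_sec must be non-negative")
    if t_sec >= storm_sec:
        # Peace time: steady rate once the storm is over.
        return peace_dps
    # Storm: the rate doubles once per completed ramp interval.
    doublings = int(t_sec // ramp_every_sec)
    return base_dps * (2 ** doublings)


if __name__ == "__main__":
    for t in (0, 20, 40, 60, 80, 100):
        print(t, docs_per_sec_at(t))
```

With these placeholder defaults the rate goes 200, 400, 800, 1600, 3200 docs/sec over the first 100 seconds, then returns to 200.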

What we observe

  • Docs/sec ramps mark the storm onset.
  • Merge MB/s, merge rate, and segment count spike during the storm.
  • Merging “catches up” and stabilizes in peace time.

Parameters varied (NRTPerfTest.java)

  • Indexing load: docsPerSec (ramp ×2/20s), numIndexThreads (≤ docsPerSec), runTimeSec
  • Searcher pressure: numSearchThreads, reopenPerSec
  • Merge policy (TMP): setDeletesPctAllowed(2.0)
  • Potential to switch Merge scheduler: CMS (baseline) vs. (...)
  • Misc/infra: InfoStream to file

On-graph metrics tracked

  • Rates: Index rate (C/s), Merge rate (C/s), Delete create/reclaim (C/s)
  • Byte rates: Merge (MB/s), Flush (MB/s), Total IO (MB/s)
  • Sizes/counts: Segment count, Index size (GB), Delete %, Commit delta (CMB), Full flush (s)
  • Concurrency: Merge threads, Flush threads
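For readers who want to compute similar byte-rate series themselves, the sketch below shows one plausible way to bucket timestamped byte counts into MB/s. It is not how `infostream_to_segments.py` or `segments_to_html.py` work internally (their internals are not shown in this PR). The event format, the function name, and the choice of 1 MB = 1024 * 1024 bytes are all assumptions.

```python
from collections import defaultdict


def mb_per_sec(events, bucket_sec=1.0):
    """Aggregate (timestamp_sec, num_bytes) events into a MB/s series.

    Returns a dict mapping each bucket's start time to its average MB/s.
    Buckets with no events are omitted.
    """
    totals = defaultdict(int)
    for timestamp_sec, num_bytes in events:
        # Assign each event to the bucket containing its timestamp.
        totals[int(timestamp_sec // bucket_sec)] += num_bytes
    return {
        bucket * bucket_sec: total / (1024 * 1024) / bucket_sec
        for bucket, total in sorted(totals.items())
    }


if __name__ == "__main__":
    # Hypothetical merge-write events: (seconds since start, bytes written).
    sample = [(0.2, 1048576), (0.9, 1048576), (2.5, 524288)]
    print(mb_per_sec(sample))
```

The same bucketing would apply to flush bytes, and total IO could be taken as the sum of the two series, if that is how the graph defines it.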

Reproduce

  1. Set up Lucene util.
  2. Check out the draft PR branch.
  3. Run:
    • python3 util/src/python/nrtPerf.py -source wikimediumall -dps 200 -rps 0.06 -nst 8 -nit 8 -rts 3000
  4. cd util/src/python/
  5. Build segment trace:
    • python3 -u infostream_to_segments.py ../../lucene-infostream.log test-output.pk
  6. Generate HTML:
    • python3 -u segments_to_html.py test-output.pk out.html
  7. Open the produced HTML (e.g., out.html or segmetrics.html).
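The steps above could be chained in a small driver script along these lines. This is only a sketch: the commands and arguments are copied from the steps, but the `build_commands` and `run_all` helpers are hypothetical, and it assumes the script is launched from the directory that contains `util/`.

```python
import subprocess


def build_commands(infostream_log="../../lucene-infostream.log",
                   pickle_path="test-output.pk",
                   html_path="out.html"):
    """Return (command, working_directory) pairs for steps 3 through 6."""
    scripts_dir = "util/src/python"
    return [
        # Step 3: run the NRT benchmark with the storm settings.
        (["python3", "util/src/python/nrtPerf.py", "-source", "wikimediumall",
          "-dps", "200", "-rps", "0.06", "-nst", "8", "-nit", "8",
          "-rts", "3000"], "."),
        # Steps 4-5: build the segment trace from the InfoStream log.
        (["python3", "-u", "infostream_to_segments.py",
          infostream_log, pickle_path], scripts_dir),
        # Step 6: render the trace to HTML.
        (["python3", "-u", "segments_to_html.py",
          pickle_path, html_path], scripts_dir),
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd, cwd in commands:
        print(f"[{cwd}] {' '.join(cmd)}")
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, cwd=cwd, check=True)


if __name__ == "__main__":
    run_all(build_commands(), dry_run=True)
```

Leaving `dry_run=True` only prints the commands, which is a cheap way to check paths before starting a 3000-second run.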

@nipunbatra8 nipunbatra8 changed the title from "[DRAFT] Update Storm Simulation with NRTPerfTest" to "[Draft] Update Storm Simulation with NRTPerfTest" on Aug 12, 2025
Owner

@mikemccand mikemccand left a comment

This looks great! Do you have any example segment traces you could attach for some eye-candy? I'm curious what War & Peace times look like w/ the default ConcurrentMergeScheduler...

Also, this tool (NRTPerfTest) is used by the nightly benchmark, so I'd rather make these new options opt-in so we don't change the behavior for nightlies, at least on first cut. It produces this chart (phew, 14+ years of data now!).

@ytgu ytgu left a comment

Thanks for the changes!

+1 for making these features opt-in so that people can use them to simulate update storms when needed.

@nipunbatra8
Author

Here are the aggregate graph and the segment graph with the default ConcurrentMergeScheduler on this test. I'm working on producing a graph that runs the update storm in a while loop, so you can see a storm followed by peace time, repeated.

Yes, I made it opt-in so it doesn't disturb the nightlies for now.

@nipunbatra8 nipunbatra8 requested review from mikemccand and ytgu August 28, 2025 19:11
@mikemccand
Owner

Hi @nipunbatra8 -- is this still a draft PR? Or is it ready for merging after review?

Owner

@mikemccand mikemccand left a comment

Thank you @nipunbatra8! I'm still wondering how we could fold this into nightly benchy. Have you opened a spinoff issue for that?

Oh yes, I see it! #459 Thanks.

@nipunbatra8
Author

Hi @mikemccand. Yes, let me first write a single comment that explains the specific config options, instead of spreading the explanation through the tool, and then I will mark this as ready for review.

@nipunbatra8 nipunbatra8 changed the title from "[Draft] Update Storm Simulation with NRTPerfTest" to "Update Storm Simulation with NRTPerfTest" on Sep 5, 2025
@nipunbatra8
Author

Ready for merging!

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

@github-actions github-actions bot added the Stale label Sep 20, 2025