docs(datasource): START_COMMIT is exclusive, not inclusive by yihua · Pull Request #18954 · apache/hudi

yihua · 2026-06-10T03:07:31Z

Describe the issue this Pull Request addresses

The doc for hoodie.datasource.read.begin.instanttime (config alias START_COMMIT) currently states "New data written with completion_time >= START_COMMIT are fetched out", with the example phrased as "on or after". This is inconsistent with the actual implementation, which treats START_COMMIT as exclusive:

V1 relation: the timeline is filtered via findInstantsInRange(start, end), whose range is (start, end], i.e. start-exclusive and end-inclusive. See InstantComparison.isInRange.
V2 relation: defaults to RangeType.OPEN_CLOSED (start-exclusive) since commit 31166ce6f1, "fix(query): Change start commit time to be exclusive in incremental query on Spark".
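
To make the boundary behavior concrete, here is a minimal sketch of an (start, end] check next to a [start, end] check. It is not Hudi's implementation: the class, the enum, and the method names are hypothetical, and CLOSED_CLOSED is only this sketch's label for the start-inclusive reading that the old doc text implied. It assumes instant times are fixed-width digit strings, so that string order matches chronological order.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch only; these names are hypothetical and are not Hudi's API.
final class InstantRangeSketch {

    // OPEN_CLOSED is (start, end]; CLOSED_CLOSED is [start, end].
    enum RangeKind { OPEN_CLOSED, CLOSED_CLOSED }

    // Assumes fixed-width, digit-only instant times, so lexicographic
    // String comparison agrees with chronological order.
    static boolean isInRange(String ts, String start, String end, RangeKind kind) {
        int cmpStart = ts.compareTo(start);
        boolean afterStart = (kind == RangeKind.OPEN_CLOSED) ? cmpStart > 0 : cmpStart >= 0;
        return afterStart && ts.compareTo(end) <= 0;
    }

    // Keeps only the instants that fall inside the requested range.
    static List<String> filter(List<String> timeline, String start, String end, RangeKind kind) {
        return timeline.stream()
                .filter(ts -> isInRange(ts, start, end, kind))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> timeline = List.of(
                "20260601000000000", "20260602000000000", "20260603000000000");
        String start = "20260601000000000";
        String end = "20260603000000000";
        // Start-exclusive: the instant equal to start is dropped.
        System.out.println(filter(timeline, start, end, RangeKind.OPEN_CLOSED));
        // Start-inclusive: the instant equal to start is kept.
        System.out.println(filter(timeline, start, end, RangeKind.CLOSED_CLOSED));
    }
}
```

With the start-exclusive range, an instant whose time equals START_COMMIT is not returned, which is the behavior the corrected doc describes.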

Summary and Changelog

Updates the START_COMMIT config description to use > instead of >=, and rephrases the example from "on or after" to "strictly after", matching the runtime behavior of both the V1 and V2 incremental relations.

Impact

Documentation only. No code behavior change.

Risk Level

none

Documentation Update

This PR is the documentation update. A companion PR will update the published configuration pages on the Hudi docs site.

Contributor's checklist

Read through contributor's guide
Enough context is provided in the sections above
Adequate tests were added if applicable

Spark's incremental relation filters with completion_time > START_COMMIT (start-exclusive). The config doc said >=, which contradicts both the V1 relation's findInstantsInRange (start, end] and the V2 relation's RangeType.OPEN_CLOSED default.

V1 relation (used for table version 6 source tables) interprets START_COMMIT as requested/instant time; V2 relation (table version 8+) interprets it as completion time. Both are start-exclusive.
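
The difference between the two interpretations can be sketched as follows. This is a simplified model, not Hudi code: TimelineEntry, selectByRequestedTime, and selectByCompletionTime are hypothetical names, and the timestamps are made up. It assumes fixed-width digit strings for both times and only models the start bound.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; class, field, and method names are hypothetical.
final class StartCommitSemanticsSketch {

    // A commit with the time it was requested and the time it completed.
    static final class TimelineEntry {
        final String requestedTime;
        final String completionTime;

        TimelineEntry(String requestedTime, String completionTime) {
            this.requestedTime = requestedTime;
            this.completionTime = completionTime;
        }
    }

    // V1-style reading: START_COMMIT is compared against the requested time.
    static List<String> selectByRequestedTime(List<TimelineEntry> timeline, String startCommit) {
        List<String> selected = new ArrayList<>();
        for (TimelineEntry e : timeline) {
            if (e.requestedTime.compareTo(startCommit) > 0) { // strictly after
                selected.add(e.requestedTime);
            }
        }
        return selected;
    }

    // V2-style reading: START_COMMIT is compared against the completion time.
    static List<String> selectByCompletionTime(List<TimelineEntry> timeline, String startCommit) {
        List<String> selected = new ArrayList<>();
        for (TimelineEntry e : timeline) {
            if (e.completionTime.compareTo(startCommit) > 0) { // strictly after
                selected.add(e.requestedTime);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<TimelineEntry> timeline = List.of(
                new TimelineEntry("20260601000000000", "20260604000000000"),
                new TimelineEntry("20260602000000000", "20260603000000000"));
        String startCommit = "20260601000000000";
        // Requested-time reading: the first commit equals START_COMMIT and is skipped.
        System.out.println(selectByRequestedTime(timeline, startCommit));
        // Completion-time reading: both commits completed after START_COMMIT.
        System.out.println(selectByCompletionTime(timeline, startCommit));
    }
}
```

In this model the same START_COMMIT value selects different commits depending on which time it is compared against, while the strict comparison keeps both readings start-exclusive.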


codecov-commenter · 2026-06-10T05:28:09Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.25%. Comparing base (1fd2c36) to head (0b00aad).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #18954      +/-   ##
============================================
- Coverage     68.26%   68.25%   -0.01%     
- Complexity    29500    29513      +13     
============================================
  Files          2542     2542              
  Lines        142618   142637      +19     
  Branches      17790    17789       -1     
============================================
+ Hits          97352    97353       +1     
- Misses        37261    37281      +20     
+ Partials       8005     8003       -2

| Flag | Coverage Δ | |
|---|---|---|
| common-and-other-modules | `44.78% <100.00%> (-0.01%)` | ⬇️ |
| hadoop-mr-java-client | `44.74% <ø> (+0.06%)` | ⬆️ |
| spark-client-hadoop-common | `48.05% <ø> (-0.01%)` | ⬇️ |
| spark-java-tests | `48.76% <100.00%> (-0.01%)` | ⬇️ |
| spark-scala-tests | `44.84% <100.00%> (+<0.01%)` | ⬆️ |
| utilities | `37.26% <100.00%> (-0.01%)` | ⬇️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ | |
|---|---|---|
| ...main/scala/org/apache/hudi/DataSourceOptions.scala | `95.53% <100.00%> (+0.04%)` | ⬆️ |

... and 21 files with indirect coverage changes

hudi-bot · 2026-06-10T05:45:24Z

CI report:

a1c37b2 UNKNOWN
0b00aad Azure: SUCCESS

Bot commands

@hudi-bot supports the following commands:

@hudi-bot run azure: re-run the last Azure build

github-actions Bot added the size:XS PR with lines of changes in <= 10 label Jun 10, 2026

yihua added 3 commits June 9, 2026 21:03

Clarify START_COMMIT semantics by table version

13c860b

Clarify START_COMMIT and END_COMMIT semantics by source table version

a1c37b2

Note override configs (incr/streaming read table version) in START_CO…

0b00aad

…MMIT/END_COMMIT docs

github-actions Bot added size:S PR with lines of changes in (10, 100] and removed size:XS PR with lines of changes in <= 10 labels Jun 10, 2026

yihua wants to merge 4 commits into apache:master from yihua:fix-start-commit-doc