docs(datasource): START_COMMIT is exclusive, not inclusive#18954

Draft
yihua wants to merge 4 commits into apache:master from yihua:fix-start-commit-doc

yihua commented Jun 10, 2026

Contributor

Describe the issue this Pull Request addresses

The doc for hoodie.datasource.read.begin.instanttime (config alias START_COMMIT) currently states "New data written with completion_time >= START_COMMIT are fetched out", and phrases its example as "on or after". This is inconsistent with the actual implementation, which treats START_COMMIT as exclusive:

  • V1 relation: the timeline is filtered via findInstantsInRange(start, end), which selects the range (start, end] (start-exclusive). See InstantComparison.isInRange.
  • V2 relation: defaults to RangeType.OPEN_CLOSED (start-exclusive) since commit 31166ce6f1, "fix(query): Change start commit time to be exclusive in incremental query on Spark".
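
The start-exclusive range described in the two bullets above can be illustrated with a minimal sketch. The function below is hypothetical and is not Hudi's InstantComparison.isInRange; it assumes instant times are fixed-width timestamp strings, so that plain string comparison orders them chronologically. The timestamps are made up for illustration.

```python
def in_open_closed_range(instant: str, start: str, end: str) -> bool:
    """Return True if instant lies in (start, end]: start excluded, end included."""
    return start < instant <= end


# Hypothetical instant times in a fixed-width timestamp format.
timeline = ["20260601000000000", "20260602000000000", "20260603000000000"]

selected = [
    t for t in timeline
    if in_open_closed_range(t, "20260601000000000", "20260603000000000")
]
# The instant equal to the start value is left out;
# the instant equal to the end value is kept.
```

Under these assumptions, an instant whose time equals the start value is skipped, which is the behavior the old ">=" wording misdescribed.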

Summary and Changelog

Updates the START_COMMIT config description to use ">" instead of ">=", and rephrases the example from "on or after" to "strictly after", matching the runtime behavior of both the V1 and V2 incremental relations.
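
For readers of the config doc, the practical consequence is that a commit whose time equals START_COMMIT is not returned, so the value passed should be strictly earlier than the first commit wanted. A minimal sketch of the read options follows. The helper name is hypothetical, and the hoodie.datasource.query.type key is not mentioned in this PR; it is included here as an assumption about how an incremental read is usually configured.

```python
def incremental_read_options(start_commit: str) -> dict:
    """Build reader options for an incremental query (hypothetical helper)."""
    return {
        # Assumption: this key selects the incremental query type.
        "hoodie.datasource.query.type": "incremental",
        # Start-exclusive: only data strictly after this value is fetched.
        "hoodie.datasource.read.begin.instanttime": start_commit,
    }


opts = incremental_read_options("20260601000000000")
# With a Spark session available, the options would typically be passed as:
#   spark.read.format("hudi").options(**opts).load(base_path)
```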

Impact

Documentation only. No code behavior change.

Risk Level

none

Documentation Update

This PR is the documentation update. A companion PR will update the published configuration pages on the Hudi docs site.

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

Spark's incremental relation filters with completion_time > START_COMMIT
(start-exclusive). The config doc said >=, which contradicts both the V1
relation's findInstantsInRange (start, end] and the V2 relation's
RangeType.OPEN_CLOSED default.
github-actions bot added the size:XS label (PR with lines of changes in <= 10) Jun 10, 2026
yihua added 3 commits June 9, 2026 21:03
V1 relation (used for table version 6 source tables) interprets
START_COMMIT as requested/instant time; V2 relation (table version 8+)
interprets it as completion time. Both are start-exclusive.
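
The distinction drawn in this commit message can be sketched as follows. This is an illustration only, not the relations' actual code: the Instant class, the after_start function, and the timestamps are hypothetical, and the by_completion_time flag stands in for the choice between V1-style and V2-style interpretation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instant:
    requested_time: str
    completion_time: str


def after_start(instant: Instant, start_commit: str, by_completion_time: bool) -> bool:
    """Start-exclusive check against either the requested or the completion time."""
    key = instant.completion_time if by_completion_time else instant.requested_time
    return key > start_commit


# An instant requested before the start value but completed after it.
straddling = Instant("20260601000000000", "20260601000500000")
start = "20260601000100000"

v1_like = after_start(straddling, start, by_completion_time=False)
v2_like = after_start(straddling, start, by_completion_time=True)
```

In this sketch the same instant is excluded when compared by requested time and included when compared by completion time, while both comparisons remain strict.
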
github-actions bot added the size:S label (PR with lines of changes in (10, 100]) and removed the size:XS label (PR with lines of changes in <= 10) Jun 10, 2026
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.25%. Comparing base (1fd2c36) to head (0b00aad).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18954      +/-   ##
============================================
- Coverage     68.26%   68.25%   -0.01%     
- Complexity    29500    29513      +13     
============================================
  Files          2542     2542              
  Lines        142618   142637      +19     
  Branches      17790    17789       -1     
============================================
+ Hits          97352    97353       +1     
- Misses        37261    37281      +20     
+ Partials       8005     8003       -2     
Flag Coverage Δ
common-and-other-modules 44.78% <100.00%> (-0.01%) ⬇️
hadoop-mr-java-client 44.74% <ø> (+0.06%) ⬆️
spark-client-hadoop-common 48.05% <ø> (-0.01%) ⬇️
spark-java-tests 48.76% <100.00%> (-0.01%) ⬇️
spark-scala-tests 44.84% <100.00%> (+<0.01%) ⬆️
utilities 37.26% <100.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
...main/scala/org/apache/hudi/DataSourceOptions.scala 95.53% <100.00%> (+0.04%) ⬆️

... and 21 files with indirect coverage changes

@hudi-bot

Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build
