[core] Support ignore corrupt or lost files during read #6821

Open
baiyangtx wants to merge 1 commit into apache:master from baiyangtx:ignore-corrupt-files

@baiyangtx
Contributor

Purpose

Introduce the options scan.ignore-corrupt-files and scan.ignore-lost-files, which let a read skip data files that are corrupt or missing instead of failing.
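
To illustrate the intended behavior, here is a minimal, self-contained sketch of a scan loop that tolerates bad files. It is not the code in this PR and does not use Paimon's reader classes: `TolerantScan`, `CorruptFileException`, the toy "DATA" header format, and the two boolean flags are all hypothetical stand-ins. The flags are assumed to mirror scan.ignore-corrupt-files and scan.ignore-lost-files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not Paimon's actual reader. */
final class TolerantScan {

    /** Raised by the toy decoder when a file's content is invalid. */
    static class CorruptFileException extends IOException {
        CorruptFileException(String message) {
            super(message);
        }
    }

    /** Toy format (an assumption): a valid file starts with the header line "DATA". */
    static List<String> readFile(Path file) throws IOException {
        // Files.readAllLines throws NoSuchFileException if the file is missing.
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        if (lines.isEmpty() || !lines.get(0).equals("DATA")) {
            throw new CorruptFileException("Bad header in " + file);
        }
        return lines.subList(1, lines.size());
    }

    /**
     * Reads every file in order. A missing file is skipped only when ignoreLost is set,
     * and a corrupt file only when ignoreCorrupt is set; otherwise the error propagates.
     */
    static List<String> scan(List<Path> files, boolean ignoreCorrupt, boolean ignoreLost)
            throws IOException {
        List<String> rows = new ArrayList<>();
        for (Path file : files) {
            try {
                rows.addAll(readFile(file));
            } catch (NoSuchFileException e) {
                if (!ignoreLost) {
                    throw e;
                }
            } catch (CorruptFileException e) {
                if (!ignoreCorrupt) {
                    throw e;
                }
            }
        }
        return rows;
    }
}
```

Keeping the two flags separate lets a job tolerate one failure mode while still failing fast on the other. Skipped files mean silently missing rows, so a real implementation would likely also want to log each skipped file.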

Tests

  • paimon-spark/paimon-spark-ut/src/test/scala/org/apache/paimon/spark/sql/PaimonQueryTest.scala

API and Format

No

Documentation

No

@baiyangtx force-pushed the ignore-corrupt-files branch from 29d9a59 to 8011de4 on December 16, 2025 06:46
@baiyangtx marked this pull request as draft on December 22, 2025 09:17
@baiyangtx force-pushed the ignore-corrupt-files branch 2 times, most recently from c8b50f3 to 3192e87, on December 25, 2025 11:36
@baiyangtx marked this pull request as ready for review on December 25, 2025 13:18
@baiyangtx force-pushed the ignore-corrupt-files branch from f301176 to 27bd82f on February 12, 2026 07:12
@baiyangtx force-pushed the ignore-corrupt-files branch from 27bd82f to 3f12b84 on February 12, 2026 07:18
@baiyangtx
Contributor Author

@Aitozi @JingsongLi PTAL

In production environments, unexpected failures can leave data files corrupted or lost. So that downstream tasks can resume quickly, this PR introduces options to ignore such files during reads.

Contributor

@Aitozi left a comment

LGTM +1