[SPARK-56022][SQL] Preserve SparkThrowable error classes in parmap/awaitResult #54846

Open
linhongliu-db wants to merge 1 commit into apache:master from linhongliu-db:preserve-spark-throwable

@linhongliu-db (Contributor)

What changes were proposed in this pull request?

The SparkException thrown from ParquetFileFormat$.readParquetFootersInParallel already carries the structured error class FAILED_READ_FILE.CANNOT_READ_FILE_FOOTER with SQL state KD001, but SparkThreadUtils.awaitResult wraps it in a generic SparkException, which hides the error class and SQL state on the exception the caller sees.

This PR:

  1. Adds preserveSparkThrowable parameter to SparkThreadUtils.awaitResult, ThreadUtils.awaitResult, and ThreadUtils.parmap (default false — no behavior change for existing callers)
  2. Updates ParquetFileFormat.readParquetFootersInParallel to pass preserveSparkThrowable = true to parmap, preserving the structured error class
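The flag described above can be sketched roughly as follows. This is a minimal, Spark-free illustration and not the code in this PR: `StructuredError`, `FooterReadError`, `GenericWrapper`, and the parameter name `preserveStructured` are hypothetical stand-ins for `SparkThrowable`, the footer-read `SparkException`, the generic wrapping `SparkException`, and `preserveSparkThrowable`.

```scala
import scala.concurrent.{Await, Future, TimeoutException}
import scala.concurrent.duration.Duration
import scala.util.control.NonFatal

object AwaitResultSketch {
  // Hypothetical stand-in for SparkThrowable: a throwable with a structured error class.
  trait StructuredError extends Throwable {
    def errorClass: String
  }

  // Hypothetical stand-in for the structured exception raised while reading a footer.
  final class FooterReadError(msg: String)
      extends RuntimeException(msg) with StructuredError {
    val errorClass: String = "FAILED_READ_FILE.CANNOT_READ_FILE_FOOTER"
  }

  // Hypothetical stand-in for the generic wrapper exception.
  final class GenericWrapper(msg: String, cause: Throwable)
      extends RuntimeException(msg, cause)

  // With the flag off (the default), every non-fatal, non-timeout failure is wrapped.
  // With the flag on, structured errors are rethrown as-is and the rest are still wrapped.
  def awaitResult[T](
      future: Future[T],
      atMost: Duration,
      preserveStructured: Boolean = false): T = {
    try {
      Await.result(future, atMost)
    } catch {
      case e: StructuredError if preserveStructured => throw e
      case NonFatal(t) if !t.isInstanceOf[TimeoutException] =>
        throw new GenericWrapper("Exception thrown in awaitResult: ", t)
    }
  }
}
```

Keeping the default at false means existing callers still see the wrapper; only call sites that opt in, such as the parquet footer reader in this PR, receive the structured exception directly.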

Files changed:

  • SparkThreadUtils.scala — new 3-param awaitResult overload
  • ThreadUtils.scala — new 3-param awaitResult overloads (Awaitable + JFuture) + parmap flag
  • ParquetFileFormat.scala — caller passes preserveSparkThrowable = true to parmap
  • ThreadUtilsSuite.scala — tests for awaitResult and parmap preservation
  • ParquetFileFormatSuite.scala — updated test to verify error class is thrown directly

Why are the changes needed?

When reading parquet footers in parallel, awaitResult wraps the exception carrying the FAILED_READ_FILE.CANNOT_READ_FILE_FOOTER error class in a generic SparkException, so the top-level exception no longer exposes the error class or SQL state. This makes it harder for users and applications to programmatically handle specific error conditions.
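To illustrate the burden on callers, here is a small sketch of what handling looks like when the structured error is buried as a cause. It assumes a hypothetical `StructuredException` type standing in for a `SparkException` that exposes an error class and SQL state; it does not use Spark's actual API.

```scala
object ErrorHandlingSketch {
  // Hypothetical stand-in for an exception exposing an error class and SQL state.
  final class StructuredException(
      val errorClass: String,
      val sqlState: String,
      cause: Throwable = null)
      extends RuntimeException(errorClass, cause)

  // When the structured error is wrapped, callers have to walk the cause chain to find it.
  @scala.annotation.tailrec
  def findStructured(t: Throwable): Option[StructuredException] = t match {
    case null => None
    case s: StructuredException => Some(s)
    case other => findStructured(other.getCause)
  }

  // Convenience accessor: the SQL state of the first structured error in the chain, if any.
  def sqlStateOf(t: Throwable): Option[String] = findStructured(t).map(_.sqlState)
}
```

If the structured exception is thrown directly, a plain `case e: StructuredException` in the caller's catch block is enough and the cause-chain walk becomes unnecessary.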

Does this PR introduce any user-facing change?

Yes. ParquetFileFormat.readParquetFootersInParallel now throws the structured SparkException directly instead of wrapping it in a generic SparkException. The error class and SQL state are preserved.

How was this patch tested?

  1. Added unit tests in ThreadUtilsSuite for awaitResult (Awaitable and JFuture) and parmap with preserveSparkThrowable flag true/false
  2. Updated ParquetFileFormatSuite to verify the error class is thrown directly

Was this patch authored or co-authored using generative AI tooling?

Yes.

@linhongliu-db changed the title from "[SPARK-XXXXX][SQL] Preserve SparkThrowable error classes in parmap/awaitResult" to "[SPARK-56022][SQL] Preserve SparkThrowable error classes in parmap/awaitResult" on Mar 17, 2026
Co-authored-by: Isaac
@linhongliu-db force-pushed the preserve-spark-throwable branch from 9df65fd to b754861 on March 17, 2026 20:45
@cloud-fan (Contributor) left a comment

LGTM if CI is green
