[SPARK-50996][K8S] Increase spark.kubernetes.allocation.batch.size to 10

### What changes were proposed in this pull request?

This PR aims to increase `spark.kubernetes.allocation.batch.size` to 10 from 5 in Apache Spark 4.0.0.

### Why are the changes needed?

Apache Spark has conservatively kept `5` as the default executor allocation batch size for 8 years, since Apache Spark 2.3.0:
- #19468

Given the improvements in K8s hardware infrastructure over the last 8 years, it is reasonable to use a larger value, `10`, starting with Apache Spark 4.0.0 in 2025.

Technically, when we request 1200 executor pods with the default `spark.kubernetes.allocation.batch.delay` of `1s`:
- Batch size `5` takes 4 minutes (240 rounds).
- Batch size `10` takes 2 minutes (120 rounds).
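As a minimal sketch of the arithmetic behind those figures — assuming one allocation round fires per `spark.kubernetes.allocation.batch.delay` interval (default `1s`) and ignoring API-server latency; the object and helper names are illustrative:

```scala
// Back-of-the-envelope estimate: executor pods are requested in rounds of
// `spark.kubernetes.allocation.batch.size`, roughly one round per
// `spark.kubernetes.allocation.batch.delay` (default 1s).
object AllocationEstimate extends App {
  val pods = 1200
  val delaySeconds = 1 // spark.kubernetes.allocation.batch.delay default

  def allocationSeconds(batchSize: Int): Int =
    math.ceil(pods.toDouble / batchSize).toInt * delaySeconds

  println(s"batch size 5:  ${allocationSeconds(5)}s")  // 240s = 4 minutes
  println(s"batch size 10: ${allocationSeconds(10)}s") // 120s = 2 minutes
}
```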

### Does this PR introduce _any_ user-facing change?

Yes. Users will see faster Spark job resource allocation. The migration guide is updated correspondingly.
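As a hedged sketch (not part of this patch), a job that depends on the old ramp-up behavior can pin the previous default explicitly, either with `--conf spark.kubernetes.allocation.batch.size=5` on `spark-submit` or programmatically:

```scala
import org.apache.spark.SparkConf

// Illustrative only: pin the pre-4.0 default so upgrading to Spark 4.0
// does not change executor allocation ramp-up.
val conf = new SparkConf()
  .set("spark.kubernetes.allocation.batch.size", "5")
```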

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #49681 from dongjoon-hyun/SPARK-50996.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun committed Jan 27, 2025
1 parent e7821c8 commit 9da1cd0
Showing 3 changed files with 4 additions and 2 deletions.
2 changes: 2 additions & 0 deletions docs/core-migration-guide.md
@@ -36,6 +36,8 @@ license: |

- In Spark 4.0, support for Apache Mesos as a resource manager was removed.

+- Since Spark 4.0, Spark will allocate executor pods with a batch size of `10`. To restore the legacy behavior, you can set `spark.kubernetes.allocation.batch.size` to `5`.
+
- Since Spark 4.0, Spark uses `ReadWriteOncePod` instead of `ReadWriteOnce` access mode in persistence volume claims. To restore the legacy behavior, you can set `spark.kubernetes.legacy.useReadWriteOnceAccessMode` to `true`.

- Since Spark 4.0, Spark reports its executor pod status by checking all containers of that pod. To restore the legacy behavior, you can set `spark.kubernetes.executor.checkAllContainers` to `false`.
2 changes: 1 addition & 1 deletion docs/running-on-kubernetes.md
@@ -682,7 +682,7 @@ See the [configuration page](configuration.html) for information on Spark config
</tr>
<tr>
<td><code>spark.kubernetes.allocation.batch.size</code></td>
-<td><code>5</code></td>
+<td><code>10</code></td>
<td>
Number of pods to launch at once in each round of executor pod allocation.
</td>
2 changes: 1 addition & 1 deletion resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
@@ -450,7 +450,7 @@ private[spark] object Config extends Logging {
.version("2.3.0")
.intConf
.checkValue(value => value > 0, "Allocation batch size should be a positive integer")
-.createWithDefault(5)
+.createWithDefault(10)

val KUBERNETES_ALLOCATION_BATCH_DELAY =
ConfigBuilder("spark.kubernetes.allocation.batch.delay")
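For reference, a hedged sketch of how a `ConfigBuilder` entry like this is typically consumed elsewhere in the code base, assuming the entry's val name is `KUBERNETES_ALLOCATION_BATCH_SIZE` (the consuming class here is hypothetical):

```scala
package org.apache.spark.deploy.k8s

import org.apache.spark.SparkConf
import org.apache.spark.deploy.k8s.Config.KUBERNETES_ALLOCATION_BATCH_SIZE

// Hypothetical consumer: typed access through the ConfigEntry yields the
// new default, 10, unless the user overrides it.
private[spark] class AllocatorSettings(conf: SparkConf) {
  val podAllocationSize: Int = conf.get(KUBERNETES_ALLOCATION_BATCH_SIZE)
}
```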
