Commit ea323d5

[SDP] BatchTableWrite Flow Execution
1 parent 2b3591a commit ea323d5

4 files changed: +43 −12

docs/declarative-pipelines/BatchTableWrite.md
Lines changed: 35 additions & 11 deletions

````diff
@@ -1,4 +1,8 @@
-# BatchTableWrite
+---
+title: BatchTableWrite
+---
+
+# BatchTableWrite Flow Execution
 
 `BatchTableWrite` is a [FlowExecution](FlowExecution.md) that writes a batch `DataFrame` to a [Table](#destination).
 
@@ -7,26 +11,46 @@
 `BatchTableWrite` takes the following to be created:
 
 * <span id="identifier"> `TableIdentifier`
-* <span id="flow"> `ResolvedFlow`
+* <span id="flow"> [ResolvedFlow](ResolvedFlow.md)
 * <span id="graph"> [DataflowGraph](DataflowGraph.md)
 * <span id="destination"> [Table](Table.md)
 * <span id="updateContext"> [PipelineUpdateContext](PipelineUpdateContext.md)
 * <span id="sqlConf"> Configuration Properties
 
 `BatchTableWrite` is created when:
 
-* FIXME
+* `FlowPlanner` is requested to [plan a CompleteFlow](FlowPlanner.md#plan)
 
-## executeInternal { #executeInternal }
+## Execute { #executeInternal }
 
-```scala
-executeInternal(): Future[Unit]
-```
+??? note "FlowExecution"
 
-`executeInternal`...FIXME
+    ```scala
+    executeInternal(): Future[Unit]
+    ```
 
----
+    `executeInternal` is part of the [FlowExecution](FlowExecution.md#executeInternal) abstraction.
+
+`executeInternal` activates the [configuration properties](#sqlConf) in the current [SparkSession](FlowExecution.md#spark).
+
+`executeInternal` requests this [PipelineUpdateContext](#updateContext) for the [FlowProgressEventLogger](PipelineUpdateContext.md#flowProgressEventLogger) to [recordRunning](FlowProgressEventLogger.md#recordRunning) with this [ResolvedFlow](#flow).
+
+`executeInternal` requests this [DataflowGraph](#graph) to [re-analyze](DataflowGraph.md#reanalyzeFlow) this [ResolvedFlow](#flow) to get the [DataFrame](ResolvedFlow.md#df) (the logical query plan).
+
+`executeInternal` executes an `append` batch write asynchronously:
+
+1. Creates a [DataFrameWriter](../DataFrameWriter.md) for the batch query's logical plan (the [DataFrame](ResolvedFlow.md#df)).
+1. Sets the write format to the [format](Table.md#format) of this [Table](#destination).
+1. In the end, `executeInternal` appends the rows to this [Table](#destination) (using the [DataFrameWriter.saveAsTable](../DataFrameWriter.md#saveAsTable) operator).
+
+## isStreaming { #isStreaming }
+
+??? note "FlowExecution"
+
+    ```scala
+    isStreaming: Boolean
+    ```
 
-`executeInternal` is used when:
+    `isStreaming` is part of the [FlowExecution](FlowExecution.md#isStreaming) abstraction.
 
-* FIXME
+`isStreaming` is always disabled (`false`).
````
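
The `executeInternal` flow added in this commit can be sketched in plain Scala. All names below (`BatchTableWriteSketch`, `Df`, the function parameters) are hypothetical stand-ins for the Spark classes the doc links to, wired up so the sequencing is visible; this is an illustration under those assumptions, not the actual Spark source:

```scala
import scala.collection.mutable
import scala.concurrent.{ExecutionContext, Future}
import ExecutionContext.Implicits.global

// Hypothetical stand-ins for the Spark classes referenced in the doc above.
final case class Table(name: String, format: Option[String])
final case class Df(rows: Seq[String]) // stands in for a re-analyzed DataFrame

class BatchTableWriteSketch(
    destination: Table,
    sqlConf: Map[String, String],
    sessionConf: mutable.Map[String, String], // stands in for the SparkSession conf
    recordRunning: () => Unit,                // FlowProgressEventLogger.recordRunning
    reanalyzeFlow: () => Df,                  // DataflowGraph.reanalyzeFlow
    saveAsTable: (Df, Table) => Unit) {       // DataFrameWriter.saveAsTable

  // Batch table writes are never streaming.
  val isStreaming: Boolean = false

  def executeInternal(): Future[Unit] = {
    // 1. Activate the configuration properties in the current session.
    sqlConf.foreach { case (k, v) => sessionConf(k) = v }
    // 2. Emit a "running" flow progress event for this flow.
    recordRunning()
    // 3. Re-analyze the flow to get a fresh logical query plan.
    val df = reanalyzeFlow()
    // 4. Append the rows to the destination table asynchronously.
    Future(saveAsTable(df, destination))
  }
}
```

Note that only step 4 runs inside the returned `Future`; the conf activation, event logging, and re-analysis happen synchronously on the caller's thread before the async append is scheduled.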

docs/declarative-pipelines/FlowExecution.md

Lines changed: 4 additions & 0 deletions

````diff
@@ -30,6 +30,10 @@ getOrigin: QueryOrigin
 isStreaming: Boolean
 ```
 
+See:
+
+* [BatchTableWrite](BatchTableWrite.md#isStreaming)
+
 ### PipelineUpdateContext { #updateContext }
 
 ```scala
````

docs/declarative-pipelines/FlowProgressEventLogger.md
Lines changed: 3 additions & 0 deletions

```diff
@@ -0,0 +1,3 @@
+# FlowProgressEventLogger
+
+`FlowProgressEventLogger` is...FIXME
```

docs/declarative-pipelines/index.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -4,7 +4,7 @@ subtitle: ⚠️ 4.1.0-SNAPSHOT
 
 # Declarative Pipelines
 
-**Spark Declarative Pipelines (SDP)** is a declarative framework for building ETL pipelines on Apache Spark.
+**Spark Declarative Pipelines (SDP)** is a declarative framework for building ETL pipelines on Apache Spark using Python or SQL.
 
 !!! danger
     Declarative Pipelines framework is only available in the development branch of Apache Spark 4.1.0-SNAPSHOT.
```
