@denik (Contributor) commented Oct 21, 2025

Changes

Engine selection

The engine is now selected based on the available state rather than the env var. The env var is still consulted if there are no local or remote state files.

Since selection requires remote state, it happens during state pull. This means the engine is not known in bundle validate/summary, or in deploy/destroy, until that stage is reached.

Since we don't know whether the remote has been migrated, we always pull both the terraform state and the direct state and decide which to use based on the serial number.
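
To make the selection rule concrete, here is a minimal sketch of it, not the code from this PR. The `Engine`, `StateInfo`, and `selectEngine` names are hypothetical, and the tie-break in favor of direct is an assumption inferred from a later follow-up commit that mentions "preferring direct".

```go
package main

import (
	"fmt"
)

// Engine names the deployment backend. The type is illustrative only.
type Engine string

const (
	EngineTerraform Engine = "terraform"
	EngineDirect    Engine = "direct"
)

// StateInfo is a hypothetical summary of one pulled state file.
type StateInfo struct {
	Engine Engine
	Serial int
	Found  bool // false when no such state file exists locally or remotely
}

// selectEngine picks the engine whose state has the highest serial number.
// When no state exists at all, it falls back to the env var value.
// Ties go to direct; that tie-break is an assumption of this sketch.
func selectEngine(states []StateInfo, envEngine Engine) Engine {
	var winner *StateInfo
	for i := range states {
		s := &states[i]
		if !s.Found {
			continue
		}
		if winner == nil ||
			s.Serial > winner.Serial ||
			(s.Serial == winner.Serial && s.Engine == EngineDirect) {
			winner = s
		}
	}
	if winner == nil {
		return envEngine
	}
	return winner.Engine
}

func main() {
	states := []StateInfo{
		{Engine: EngineTerraform, Serial: 4, Found: true},
		{Engine: EngineDirect, Serial: 7, Found: true},
	}
	// The direct state has the higher serial, so the env var is ignored.
	fmt.Println(selectEngine(states, EngineTerraform))
	// No state anywhere: the env var decides.
	fmt.Println(selectEngine(nil, EngineTerraform))
}
```

Because the state outranks the env var, a bundle that already has state keeps its engine no matter what the environment says.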

Command refactoring

Many commands needed refactoring, so I extracted the common bundle steps into cmd/bundle/utils/process.go. This makes it possible to enforce a certain order in which things run and to encode assumptions in one place. For example, you cannot pull state until you have called phases.Initialize(), because certain paths are not initialized before that.
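
The idea of encoding the step order in one place can be sketched as follows. This is an illustration under assumptions, not the contents of process.go: `Bundle`, `ProcessOptions`, `processBundle`, `initialize`, and `pullState` are hypothetical stand-ins, with `initialize` playing the role of phases.Initialize().

```go
package main

import (
	"errors"
	"fmt"
)

// Bundle is a stand-in for the real bundle type; only the flags needed
// for this illustration are modeled.
type Bundle struct {
	initialized bool
	statePulled bool
}

// initialize stands in for phases.Initialize(): it prepares the paths
// that later steps depend on.
func initialize(b *Bundle) error {
	b.initialized = true
	return nil
}

// pullState stands in for the state pull; it refuses to run before initialize.
func pullState(b *Bundle) error {
	if !b.initialized {
		return errors.New("cannot pull state before initialize")
	}
	b.statePulled = true
	return nil
}

// ProcessOptions is a hypothetical set of switches a command passes in.
type ProcessOptions struct {
	PullState bool
}

// processBundle runs the shared steps in one fixed order so that
// individual commands cannot get the order wrong.
func processBundle(b *Bundle, opts ProcessOptions) error {
	if err := initialize(b); err != nil {
		return fmt.Errorf("initialize: %w", err)
	}
	if opts.PullState {
		if err := pullState(b); err != nil {
			return fmt.Errorf("pull state: %w", err)
		}
	}
	return nil
}

func main() {
	b := &Bundle{}
	// Called out of order: returns an error.
	fmt.Println(pullState(b))
	// Called through the shared helper: the order is guaranteed.
	fmt.Println(processBundle(b, ProcessOptions{PullState: true}))
}
```

Commands then differ only in the options they pass, while the ordering assumption lives in a single function.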

Why

This makes the bundle engine setting sticky: once a bundle is migrated to direct, it stays on direct. This will be important for the subsequent 'bundle deployment migrate' command.

@eng-dev-ecosystem-bot (Collaborator) commented Oct 21, 2025

Run: 18936300410

| Env | 🔄 flaky | 💚 RECOVERED | 🙈 SKIP | ✅ pass | 🙈 skip |
| --- | --- | --- | --- | --- | --- |
| 🔄 aws linux | 3 | | 2 | 324 | 590 |
| 🔄 aws windows | 2 | 1 | 2 | 325 | 589 |
| 💚 aws-ucws linux | | 2 | 1 | 446 | 486 |
| 💚 aws-ucws windows | | 2 | 1 | 447 | 485 |
| 🔄 azure linux | 3 | | 2 | 324 | 589 |
| 💚 azure windows | | 1 | 2 | 327 | 588 |
| 🔄 azure-ucws linux | 2 | | 1 | 444 | 485 |
| 🔄 azure-ucws windows | 2 | | 1 | 445 | 484 |
| 🔄 gcp linux | 1 | 1 | 2 | 324 | 591 |
| 🔄 gcp windows | 7 | | 2 | 320 | 590 |

16 failing tests:

| Test Name | aws linux | aws windows | aws-ucws linux | aws-ucws windows | azure linux | azure windows | azure-ucws linux | azure-ucws windows | gcp linux | gcp windows |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TestAccept | 🔄 f | 💚 R | 💚 R | 💚 R | 🔄 f | 💚 R | 🔄 f | 🔄 f | 💚 R | 🔄 f |
| TestAccept/bundle/resources/pipelines/lakeflow-pipeline | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | 🔄 f |
| TestAccept/bundle/resources/pipelines/lakeflow-pipeline/DATABRICKS_BUNDLE_ENGINE=direct-exp | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | 🔄 f |
| TestAccept/bundle/resources/synced_database_tables/basic | 🙈 S | 🙈 S | 💚 R | 💚 R | 🙈 S | 🙈 S | 🔄 f | 🔄 f | 🙈 S | 🙈 S |
| TestAccept/bundle/run/app-with-job | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| TestAccept/bundle/templates/default-python/integration_classic | ✅ p | ✅ p | ✅ p | ✅ p | 🔄 f | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p |
| TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=direct-exp/UV_PYTHON=3.9 | ✅ p | ✅ p | ✅ p | ✅ p | 🔄 f | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p |
| TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=terraform/UV_PYTHON=3.11 | ✅ p | 🔄 f | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p |
| TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=terraform/UV_PYTHON=3.13 | ✅ p | 🔄 f | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p |
| TestAccept/selftest/record_cloud/pipeline-crud | 🔄 f | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p |
| TestAccept/selftest/record_cloud/pipeline-crud/DATABRICKS_BUNDLE_ENGINE=terraform | 🔄 f | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p |
| TestGenerateFromExistingPipelineAndDeploy | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | 🔄 f | ✅ p |
| TestFsCpDirToDirFileNotOverwritten | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | 🔄 f |
| TestFsCpDirToDirFileNotOverwritten/dbfs_to_dbfs | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | 🔄 f |
| TestFsCpFileToDirFileNotOverwritten | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | 🔄 f |
| TestFsCpFileToDirFileNotOverwritten/dbfs_to_dbfs | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | ✅ p | 🔄 f |

@denik denik changed the title from "Require bundle engine initialization + refactor" to "Explicit bundle engine initialization + refactor" Oct 21, 2025
@denik denik marked this pull request as draft October 23, 2025 07:44
@denik denik force-pushed the denik/require-engine-init branch from 27cb3b3 to 0be1903 Compare October 23, 2025 10:54
@denik denik force-pushed the denik/require-engine-init branch from 9326a2b to bc34c50 Compare October 23, 2025 13:12
@denik denik force-pushed the denik/require-engine-init branch from bc34c50 to 84de40b Compare October 23, 2025 13:15
@denik denik changed the title from "Explicit bundle engine initialization + refactor" to "Set engine based on state rather than env var" Oct 23, 2025
@denik denik force-pushed the denik/require-engine-init branch from a47f91a to c64329d Compare October 23, 2025 14:02
@denik denik force-pushed the denik/require-engine-init branch from c64329d to 4cfc80b Compare October 23, 2025 21:05
@denik denik force-pushed the denik/require-engine-init branch from d382efa to c64cb09 Compare October 24, 2025 10:26
@denik denik changed the base branch from main to denik/log-query October 24, 2025 10:35
@denik denik force-pushed the denik/require-engine-init branch from 5bac25e to 6b4d7ea Compare October 30, 2025 09:39
}()

bundle.ApplySeqContext(ctx, b,
statemgmt.StatePull(),
Contributor:

Why don't we pull state anymore?

@denik (Contributor, Author):

We do; it's a function called PullResourcesState().

@denik denik added this pull request to the merge queue Oct 30, 2025
Merged via the queue into main with commit 269568a Oct 30, 2025
13 checks passed
@denik denik deleted the denik/require-engine-init branch October 30, 2025 14:00
denik added a commit that referenced this pull request Oct 31, 2025
## Why
Now that the engine is selected based on the actual state available,
it's safe to allow users to specify
DATABRICKS_BUNDLE_ENGINE="direct"; only new bundles will be affected.

Depends on #3797
pietern added a commit that referenced this pull request Oct 31, 2025
The test TestParseResourcesStateWithExistingStateFile was creating a
terraform.tfstate file in the test source directory instead of in the
temporary directory.

This was introduced in PR #3797 when the code was changed from using
b.StateLocalPath(ctx) to b.StateFilenameTerraform(ctx). The bug was that
StateFilenameTerraform returns two values (remote filename, local path),
but the test only captured the first value (filename) instead of the
second value (full local path).

This fix:
- Captures the second return value (full local path) instead of the first
- Creates the parent directory structure with os.MkdirAll
- Writes the file to the correct location in the temp directory
- Uses proper file permissions consistent with the codebase (0o700/0o600)
pietern added a commit that referenced this pull request Nov 3, 2025
## Changes

PR #3797 replaced a function that returns a state file path with one
that returns both a filename and a path. The test used only the
filename, causing the state file to be created in the wrong directory.

In the current state, it creates the file in `$PWD`, equal to the
directory where the test file is located. The path in `localPath` is
located inside the bundle cache directory, which is initialized to
somewhere inside `t.TempDir()` above.
shreyas-goenka added a commit that referenced this pull request Nov 6, 2025
## Changes
Removes validation that incorrectly rejected bundles using the
`definitions` field in `databricks.yml`.

## Why
PR #3797 inadvertently applied the validation that rejects OSS Spark
Declarative Pipelines to all Databricks CLI bundles. This error was only
meant for the pipelines CLI. Customers commonly use the `definitions`
field for YAML anchors (e.g., to reuse cluster configurations), and this
validation blocked their deployments with the error:

```
Error: databricks.yml seems to be formatted for open-source Spark Declarative Pipelines.
Pipelines CLI currently only supports Lakeflow Declarative Pipelines development.
```

The validation has been removed entirely. The Pipelines CLI is not yet
in active use, and the pipelines team can add a better context-aware
check later that only applies when running as the Pipelines CLI.

## Tests
- New acceptance test:
`acceptance/bundle/validate/definitions_yaml_anchors/` verifies that bundles
with a `definitions` field validate successfully.
- Existing test: `acceptance/pipelines/deploy/oss-spark-error/` ensures
the error continues to work for the pipelines CLI.
github-merge-queue bot pushed a commit that referenced this pull request Nov 7, 2025
## Changes
### User visible
- Since #3797 the DATABRICKS_BUNDLE_ENGINE env var is only
consulted if there is no state. Now it is always consulted and must
match the state; otherwise an error is raised.
- New “bundle debug states” command to print info about available state files.
- Stricter state validation in case both terraform and direct state
files are present. If serial numbers are the same, error rather than
preferring direct.
- Better error messages when state validation fails.

### Internal
- Use a string enum for the engine instead of a bool and update all functions to
use the enum.
- Move all engine parsing/configuration to bundle/config/engine package.
- Engine configuration is done in top level command handler rather than
inside PullResourcesState.
- PullResourcesState() returns winning StateDesc object.
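
As a rough illustration of the string-enum approach and the "env var must match the state" rule described above, here is a minimal sketch. It is not the contents of the bundle/config/engine package: `EngineType`, `parseEngine`, and `resolveEngine` are hypothetical names, and the exact error wording is invented.

```go
package main

import (
	"fmt"
)

// EngineType is a string enum; the empty value means "not set".
type EngineType string

const (
	EngineNotSet    EngineType = ""
	EngineTerraform EngineType = "terraform"
	EngineDirect    EngineType = "direct"
)

// parseEngine validates a raw setting such as an env var value.
func parseEngine(raw string) (EngineType, error) {
	switch EngineType(raw) {
	case EngineNotSet, EngineTerraform, EngineDirect:
		return EngineType(raw), nil
	default:
		return EngineNotSet, fmt.Errorf("unexpected engine %q (expected %q or %q)",
			raw, EngineTerraform, EngineDirect)
	}
}

// resolveEngine checks the requested engine against the one recorded in state.
// A mismatch is an error; if only one side is set, that side wins.
// If neither is set, EngineNotSet is returned and the caller picks a default.
func resolveEngine(requested, fromState EngineType) (EngineType, error) {
	switch {
	case requested == EngineNotSet:
		return fromState, nil
	case fromState == EngineNotSet || requested == fromState:
		return requested, nil
	default:
		return EngineNotSet, fmt.Errorf("requested engine %q does not match state engine %q",
			requested, fromState)
	}
}

func main() {
	requested, err := parseEngine("direct")
	if err != nil {
		panic(err)
	}
	// The state says terraform but the user asked for direct: rejected.
	_, err = resolveEngine(requested, EngineTerraform)
	fmt.Println(err)
}
```

Compared with a bool, the enum leaves room for a distinct "not set" value, which is what lets the mismatch check tell "no preference" apart from an explicit choice.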

## Why
Ensure users get the engine they select via env var (and in the future
this will extend to config setting).

## Tests
New acceptance tests.
New test helper print_state.py to print state as is.

---------

Co-authored-by: Pieter Noordhuis