Auto-generate session names for serverless SSH connect#4701

Open
anton-107 wants to merge 5 commits into main from antonnek/auto-session-names

Conversation


anton-107 (Contributor) commented Mar 11, 2026

Summary

  • Remove the requirement for --name in serverless ssh connect; --accelerator alone is now sufficient
  • Auto-generate human-readable session names (e.g. databricks-gpu-a10-20260310-f3a2b1c0) with workspace host hash to avoid SSH known_hosts conflicts across workspaces
  • Track sessions in ~/.databricks/ssh-tunnel-sessions.json with user identity to prevent cross-profile session mixups
  • Offer reconnection to existing sessions on subsequent runs
  • Clean up stale sessions (local keys, SSH config, secret scopes, workspace content) only on definitive server-not-found errors; transient errors are logged as warnings
  • Prune expired sessions (>24h) from disk automatically during session lookup

Stacked on #4697.

Test plan

  • Unit tests for session store (load/save/add/remove/find/expiry/pruning)
  • Unit tests for name generation (format, uniqueness, workspace differentiation, regex validity)
  • Updated validation and proxy command tests
  • Manual test: databricks ssh connect --accelerator GPU_1xA10 creates new session
  • Manual test: subsequent run detects existing session and prompts to reconnect
  • Manual test: stale sessions are cleaned up when server is definitively gone
  • Manual test: different workspaces produce different session names (no known_hosts conflicts)

🤖 Generated with Claude Code


eng-dev-ecosystem-bot commented Mar 11, 2026

Commit: 42afe03

Run: 23438883179

Env 🟨​KNOWN 🔄​flaky 💚​RECOVERED 🙈​SKIP ✅​pass 🙈​skip Time
🟨​ aws linux 7 1 9 271 801 10:10
🟨​ aws windows 7 1 9 273 799 9:08
🔄​ aws-ucws linux 3 7 9 367 716 7:48
💚​ aws-ucws windows 8 9 371 714 5:14
💚​ azure linux 2 11 274 799 8:24
💚​ azure windows 2 11 276 797 6:22
🔄​ azure-ucws linux 4 1 11 371 712 8:34
🔄​ azure-ucws windows 2 2 11 374 710 6:53
💚​ gcp linux 2 11 270 802 9:38
💚​ gcp windows 2 11 272 800 9:04
22 interesting tests: 9 SKIP, 7 KNOWN, 6 flaky
Test Name aws linux aws windows aws-ucws linux aws-ucws windows azure linux azure windows azure-ucws linux azure-ucws windows gcp linux gcp windows
🟨​ TestAccept 🟨​K 🟨​K 🔄​f 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
🙈​ TestAccept/bundle/resources/permissions 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions 🟨​K 🟨​K 💚​R 💚​R 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions 🟨​K 🟨​K 💚​R 💚​R 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 🟨​K 💚​R 💚​R
🙈​ TestAccept/bundle/resources/postgres_branches/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/recreate 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/update_protected 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/without_branch_id 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_endpoints/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_endpoints/recreate 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_projects/update_display_name 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/synced_database_tables/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🔄​ TestAccept/ssh/connect-serverless-gpu 🙈​s 🙈​s 🔄​f ✅​p 🙈​s 🙈​s 🔄​f ✅​p 🙈​s 🙈​s
🔄​ TestAccept/ssh/connect-serverless-gpu/DATABRICKS_BUNDLE_ENGINE=direct 🔄​f ✅​p 🔄​f ✅​p
🔄​ TestAccept/ssh/connection 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 🔄​f 💚​R 💚​R 💚​R
🔄​ TestAccept/ssh/connection/DATABRICKS_BUNDLE_ENGINE=direct ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p 🔄​f ✅​p ✅​p ✅​p
🔄​ TestFilerWorkspaceNotebook ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p 🔄​f ✅​p ✅​p
🔄​ TestFilerWorkspaceNotebook/rNb.r ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p 🔄​f ✅​p ✅​p
Top 26 slowest tests (at least 2 minutes):
duration env testname
6:11 gcp linux TestSecretsPutSecretStringValue
6:03 gcp windows TestSecretsPutSecretStringValue
5:33 aws linux TestSecretsPutSecretStringValue
5:04 aws windows TestSecretsPutSecretStringValue
4:10 azure linux TestSecretsPutSecretStringValue
3:45 gcp linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:43 gcp linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:39 gcp windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:25 azure windows TestSecretsPutSecretStringValue
3:20 aws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:14 gcp windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:13 aws-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:11 aws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:11 aws-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:10 azure-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:47 aws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:45 azure-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:43 aws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:42 azure linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:42 azure windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:40 azure-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:36 aws-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:32 aws-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:23 azure linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:17 azure-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:13 azure windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform

@simonfaltum (Member) left a comment

[Agent Swarm Review] Verdict: Not ready yet

  • 1 Critical
  • 3 Major
  • 1 Gap
  • 2 Nit
  • 2 Suggestion

The feature design (auto-generate session names, track sessions, offer reconnection) is good UX. However, there are several correctness issues that need addressing before this is safe to ship:

  1. Critical: Any probe failure triggers irreversible remote resource cleanup, even for transient network errors
  2. Major: No identity tracking means profile switches can cause cross-identity destructive cleanup
  3. Major: Expired sessions accumulate forever (remote resources leaked)
  4. Major: CLI version changes break reconnection within the 24h window

See inline comments for details.

		alive = append(alive, s)
	} else {
		cleanupStaleSession(ctx, client, s, version)
	}
Member

[Agent Swarm Review] [Critical]

Any probe error is treated as proof that the session is stale.

resolveServerlessSession() calls cleanupStaleSession() for every getServerMetadata() failure. That probe can fail for transient auth, network, workspace API, or version-mismatch reasons. In those cases the CLI will delete local SSH config, remove the session from state, and best-effort delete secret scopes and workspace content for a session that may still be alive.

Both reviewers flagged this. Isaac confirmed Critical in cross-review due to irreversible blast radius.

Suggestion: Only run destructive cleanup on definitive stale signals (e.g., 404/not-found). For transient errors, keep the session and surface a warning.

Contributor Author

Fixed. Now only errServerMetadata (definitive not-found) triggers cleanup. Transient errors (network, auth) are logged as warnings and the session is kept.

Accelerator   string    `json:"accelerator"`
WorkspaceHost string    `json:"workspace_host"`
CreatedAt     time.Time `json:"created_at"`
ClusterID     string    `json:"cluster_id,omitempty"`
Member

[Agent Swarm Review] [Major]

Session matching ignores Databricks identity. No username/profile is stored, so switching profiles on the same workspace causes cross-identity session mixup. Combined with the probe-failure cleanup issue, probing another identity's session fails and triggers cleanup of their remote resources.

Suggestion: Persist an identity key (e.g., workspace username) in the Session struct and include it in FindMatching.

Contributor Author

Added UserName to the Session struct. FindMatching now requires and filters by user name, preventing cross-identity session mixups.

result = append(result, s)
}
}
return result, nil
Member

[Agent Swarm Review] [Major]

Expired sessions accumulate in state file indefinitely. FindMatching filters out expired sessions from results but never removes them from the store on disk or triggers cleanup of their remote resources (secret scopes, workspace content). Over time the state file grows unboundedly and remote resources are leaked.

Suggestion: Prune expired sessions during Load or FindMatching, saving the cleaned store back.

Contributor Author

Fixed. FindMatching now prunes expired sessions from disk when it encounters them, so the state file stays bounded.


date := time.Now().Format("20060102")
b := make([]byte, 3)
_, _ = rand.Read(b)
Member

[Agent Swarm Review] [Nit]

_, _ = rand.Read(b) discards the error. If crypto/rand.Read fails, b is all zeros, producing predictable session names. Consider checking the error or adding a comment explaining why it's intentionally ignored.

Contributor Author

Now panics on rand.Read failure (which should never happen with crypto/rand on a healthy system) instead of silently producing predictable names.

Comment on lines +578 to +593
+ hostKeyChecking := "StrictHostKeyChecking=accept-new"
+ if opts.IsServerlessMode() {
+     hostKeyChecking = "StrictHostKeyChecking=no"
+ }
+
  sshArgs := []string{
      "-l", userName,
      "-i", privateKeyPath,
      "-o", "IdentitiesOnly=yes",
-     "-o", "StrictHostKeyChecking=accept-new",
+     "-o", hostKeyChecking,
      "-o", "ConnectTimeout=360",
      "-o", "ProxyCommand=" + proxyCommand,
  }
- if opts.UserKnownHostsFile != "" {
+ if opts.IsServerlessMode() {
+     sshArgs = append(sshArgs, "-o", "UserKnownHostsFile=/dev/null")
+ } else if opts.UserKnownHostsFile != "" {
@ilia-db (Contributor) commented Mar 13, 2026

We've had such options before, but security didn't like it.

With auto-generated session names we should not have that many host conflicts, right?

Before, you would get them if you re-used the same name to connect to a different workspace. Re-using the same name for the same workspace is fine, since our server gets its SSH host key from the secret scope tied to the name (and with the same name the scope is the same). But across different workspaces we will get a problem, since the server keys will be different.

Can we also add workspace id (real one, or based on the host url) to the generated session name?

Contributor Author

Good call — removed StrictHostKeyChecking=no entirely. Instead, session names now include a workspace host hash (4 hex chars from MD5 of the host URL), so different workspaces get different session names and won't have known_hosts conflicts.

// resolveServerlessSession handles auto-generation and reconnection for serverless sessions.
// It checks local state for existing sessions matching the workspace and accelerator,
// probes them to see if they're still alive, and prompts the user to reconnect or create new.
func resolveServerlessSession(ctx context.Context, client *databricks.WorkspaceClient, opts *ClientOptions) error {
Contributor

Nit, but this can be a method on the ClientOptions struct, might be easier to understand that we are mutating the options here then

Contributor Author

Done — resolveServerlessSession is now a method on *ClientOptions.

func resolveServerlessSession(ctx context.Context, client *databricks.WorkspaceClient, opts *ClientOptions) error {
	version := build.GetInfo().Version

	matching, err := sessions.FindMatching(ctx, client.Config.Host, opts.Accelerator)
Contributor

I feel like the majority of this logic can be moved to the sessions package (up until line 788). getServerMetadata can be passed as an argument. Then it will be easier to test.

Same for cleanupStaleSession. Or will there be circular dependencies if we do that? (since that function has a lot of them)

Contributor Author

Good idea — I'll consider moving the probe/prompt logic to the sessions package in a follow-up. For now the probing depends on getServerMetadata and cleanupStaleSession which have heavy workspace client dependencies, so extracting it cleanly would need passing those as function arguments. Keeping it here for now to avoid a larger refactor in this PR.


// GenerateSessionName creates a human-readable session name from the accelerator type.
// Format: <prefix>-<random_hex>, e.g. "gpu-a10-f3a2b1c0".
func GenerateSessionName(accelerator string) string {
Contributor

As mentioned above, will it help with known_hosts conflicts if we add a workspace id/host here?

Contributor Author

Yes! Added a 4-char hash of the workspace host to generated session names (format: databricks-gpu-a10-20260316-<wshash><random>). Different workspaces produce different names, avoiding known_hosts conflicts without needing StrictHostKeyChecking=no.

anton-107 force-pushed the ssh-connect-elapsed-time branch 2 times, most recently from eb36b9e to c65e058 on March 17, 2026 12:23
Base automatically changed from ssh-connect-elapsed-time to main March 17, 2026 13:33
anton-107 force-pushed the antonnek/auto-session-names branch from a177bea to e8fa14a on March 18, 2026 14:59
anton-107 temporarily deployed to test-trigger-is March 18, 2026 15:00 with GitHub Actions
anton-107 temporarily deployed to test-trigger-is March 18, 2026 15:18 with GitHub Actions
anton-107 temporarily deployed to test-trigger-is March 18, 2026 15:31 with GitHub Actions
anton-107 added a commit that referenced this pull request Mar 18, 2026
## Summary
- Wrap the settings JSON preview in `{ }` with proper indentation so it
stands out visually
- Add blank lines around the settings block for padding
- Default the "Apply these settings?" prompt to yes (`[Y/n]`) — pressing
Enter accepts
- Shorten inline comments (`// Global setting` instead of `// Global
setting that affects all remote ssh connections`)

Stacked on #4701.

## Test plan
- [x] Existing vscode settings tests pass
- [ ] Manual test: verify the prompt renders with proper formatting and
padding
- [ ] Manual test: pressing Enter without typing accepts the settings

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
anton-107 temporarily deployed to test-trigger-is March 18, 2026 16:54 with GitHub Actions
@simonfaltum (Member) left a comment

Review Swarm Summary (2 independent reviewers + cross-review)

Verdict: REQUEST CHANGES. The UX direction is good, but two bugs need fixing before merge, plus one important design concern.

  1. [CRITICAL] Shared key directory deletion - cleanupStaleSession() removes filepath.Dir(keyPath) which is the shared ~/.databricks/ssh-tunnel-keys directory, not the session's files. One cleanup wipes keys for ALL active SSH tunnels.
  2. [IMPORTANT] Transient errors trigger cleanup - resolveServerlessSession() treats ANY getServerMetadata error as proof the session is dead. Transient network/auth failures will delete a live session's local keys, SSH config, secret scope, and workspace content.
  3. [IMPORTANT] Session store not safe for concurrent processes - No file locking on read-modify-write, fixed temp file name for concurrent Save() calls, os.Rename not atomic on Windows.

Comment on lines +800 to +803
// Remove local SSH keys.
keyPath, err := keys.GetLocalSSHKeyPath(ctx, s.Name, "")
if err == nil {
os.RemoveAll(filepath.Dir(keyPath))
Member

[CRITICAL] os.RemoveAll(filepath.Dir(keyPath)) is destructive. keys.GetLocalSSHKeyPath() returns ~/.databricks/ssh-tunnel-keys/<sessionName> (a file), so filepath.Dir() resolves to ~/.databricks/ssh-tunnel-keys/, the shared directory for ALL SSH tunnels.

Cleaning up one stale session will wipe keys for every other active SSH tunnel on the machine.

Fix: delete only the session-specific files:

os.Remove(keyPath)
os.Remove(keyPath + ".pub")

Or store each session in its own subdirectory.

Comment on lines +757 to +766
var alive []sessions.Session
for _, s := range matching {
	_, _, _, probeErr := getServerMetadata(ctx, client, s.Name, s.ClusterID, version, o.Liteswap)
	if probeErr == nil {
		alive = append(alive, s)
	} else if errors.Is(probeErr, errServerMetadata) {
		// Only clean up when the server is definitively gone (metadata endpoint returns not-found).
		// Transient errors (network, auth) should not trigger cleanup.
		cleanupStaleSession(ctx, client, s, version)
	} else {
Member

[IMPORTANT] This probe treats ANY getServerMetadata error as proof the session is dead and immediately runs cleanupStaleSession(). But getServerMetadata() returns errors for transient network failures, auth failures, and HTTP client errors, not just definitive "session not found".

A transient workspace API problem will therefore delete a live session's local keys, SSH config, secret scope, workspace content, and state entry.

Fix: distinguish permanent "session not found" errors from transient probe failures. Only run cleanup on a proven not-found condition. Log transient errors as warnings and skip cleanup.

Comment on lines +89 to +97
	}
	return nil
}

// Add persists a new session to the store, replacing any existing session with the same name.
func Add(ctx context.Context, s Session) error {
	store, err := Load(ctx)
	if err != nil {
		return err
Member

[IMPORTANT] The session store is not safe for concurrent CLI processes. Add()/Remove() do a read-modify-write with no file locking. Two simultaneous ssh connect commands can clobber each other's updates.

Additionally, Save() uses a fixed temp path (path + ".tmp"), so concurrent writes can collide on the temp file. On Windows, os.Rename is not an atomic replace of an existing file.

Fix: add interprocess locking (e.g., flock), use a unique temp file per write, and consider a platform-safe replace helper.

Comment on lines +738 to +741
func (o *ClientOptions) resolveServerlessSession(ctx context.Context, client *databricks.WorkspaceClient) error {
	version := build.GetInfo().Version

	me, err := client.CurrentUser.Me(ctx)
Member

[SUGGESTION] resolveServerlessSession() calls client.CurrentUser.Me(ctx) here. Then cleanupStaleSession() calls it again for each stale session (line 810). And session persistence calls it a third time (line 344). Each call is a network round-trip.

Consider calling Me() once and passing the result through to avoid 3+ redundant API calls per connection.

Comment on lines 233 to 237
return " {\n" + strings.Join(lines, ",\n") + "\n }"
}

func promptUserForUpdate(ctx context.Context, ide, connectionName string, missing *missingSettings) (bool, error) {
question := fmt.Sprintf(
Member

[SUGGESTION] Changing from AskYesOrNo to Ask with [Y/n] default "y" changes the default behavior: empty input now means yes. The old AskYesOrNo may have had different semantics. This UX change should be called out in the PR description.

anton-107 and others added 5 commits March 23, 2026 14:07
…upport

Remove the requirement for --name in serverless SSH connect. Sessions are
now auto-generated with human-readable names (e.g. databricks-gpu-a10-20260310-a1b2c3),
tracked in ~/.databricks/ssh-tunnel-sessions.json, and offered for reconnection
on subsequent runs. Stale sessions are cleaned up automatically. Sessions expire
after 24 hours. Also fixes known_hosts key mismatches for serverless by disabling
strict host key checking (identity verified via Databricks auth).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…stKeyChecking=no

- Add UserName to Session struct and include in FindMatching to prevent
  cross-identity session mixups when switching profiles
- Only run cleanup on definitive errServerMetadata errors; log and skip
  on transient failures (network, auth) to avoid deleting live sessions
- Add workspace host hash to generated session names to avoid SSH
  known_hosts conflicts across workspaces, removing the need for
  StrictHostKeyChecking=no and UserKnownHostsFile=/dev/null
- Prune expired sessions from disk during FindMatching
- Make resolveServerlessSession a method on ClientOptions
- Handle rand.Read error explicitly

Co-authored-by: Isaac
anton-107 force-pushed the antonnek/auto-session-names branch from 6575760 to 42afe03 on March 23, 2026 13:08
anton-107 temporarily deployed to test-trigger-is March 23, 2026 13:08 with GitHub Actions