Skip to content

Add create, attach, and list endpoints to dynamic disks API#2743

Open
Alphasite wants to merge 5 commits into
mainfrom
feature/dynamic-disk-phase2-cli-endpoints
Open

Add create, attach, and list endpoints to dynamic disks API#2743
Alphasite wants to merge 5 commits into
mainfrom
feature/dynamic-disk-phase2-cli-endpoints

Conversation

@Alphasite

@Alphasite Alphasite commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Summary

Extends the BOSH Director's dynamic disk API with three new endpoints to support Phase 2 CLI commands (TNZ-109499):

  • GET /dynamic_disks — list all dynamic disks (bosh.admin or bosh.dynamic-disks.list scope)
  • POST /dynamic_disks — create a disk in the IaaS without attaching it (bosh.dynamic-disks.create scope)
  • POST /dynamic_disks/:name/attach — attach an existing named disk to a VM (bosh.dynamic-disks.attach scope)

Changes

New files

  • src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb — background job that creates a disk via CPI without attaching
  • src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb — background job that attaches a named disk to a VM (idempotent)
  • src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/create_dynamic_disk_spec.rb
  • src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/attach_dynamic_disk_spec.rb

Modified files

  • src/bosh-director/lib/bosh/director/api/controllers/dynamic_disks_controller.rb — added 3 new routes
  • src/bosh-director/lib/bosh/director.rb — required new job classes
  • src/bosh-director/spec/support/test_identity_provider.rb — added dynamic-disks-lister and dynamic-disks-attacher test users
  • src/bosh-director/spec/unit/bosh/director/api/controllers/dynamic_disks_controller_spec.rb — full coverage for all 6 endpoints

Test plan

  • bundle exec rspec spec/unit/bosh/director/api/controllers/dynamic_disks_controller_spec.rb spec/unit/bosh/director/jobs/dynamic_disks/70 examples, 0 failures

Related: TNZ-109499, TNZ-99509

Adds three new Director API endpoints to complement the existing
provide/detach/delete endpoints:

- GET  /dynamic_disks          list all dynamic disks (bosh.dynamic-disks.list)
- POST /dynamic_disks          create disk without attaching (bosh.dynamic-disks.create)
- POST /dynamic_disks/:name/attach  attach existing disk to a VM (bosh.dynamic-disks.attach)

Implements CreateDynamicDisk and AttachDynamicDisk background jobs,
both on the :dynamic_disks worker queue. All operations are idempotent.

70 unit specs, all passing.

[TNZ-109499]

Co-authored-by: Cursor <cursoragent@cursor.com>
Copilot AI review requested due to automatic review settings June 17, 2026 21:18
@Alphasite Alphasite marked this pull request as draft June 17, 2026 21:18
@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Two new background job classes are introduced under Bosh::Director::Jobs::DynamicDisks: CreateDynamicDisk and AttachDynamicDisk. CreateDynamicDisk resolves the target deployment and active VM in the requested availability zone, calls the CPI to create a disk, persists a Models::DynamicDisk record, and optionally updates disk metadata. AttachDynamicDisk resolves the disk, instance, and VM, enforces attachment-state invariants, calls the CPI to attach the disk under a VM lock to capture a disk hint, updates the model with VM and zone data, and notifies the agent. The find_disk_cloud_properties helper is updated to accept either a deployment or instance as its first argument. Three new routes are added to DynamicDisksController: GET / lists all dynamic disks with eager-loaded names, POST / enqueues CreateDynamicDisk after validation, and POST /:disk_name/attach enqueues AttachDynamicDisk after validation. Both job files are added to the director require graph. Full unit test coverage is added for both jobs and all three controller routes, including test infrastructure updates.

Suggested reviewers

  • rkoster
  • mkocher
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely describes the main change: adding three new endpoints (create, attach, list) to the dynamic disks API.
Description check ✅ Passed The description comprehensively covers all required sections: what the change is about, context (with tracking numbers), tests run successfully (70 examples, 0 failures), release notes guidance, breaking changes assessment, and team collaboration tags.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/dynamic-disk-phase2-cli-endpoints

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 OpenGrep (1.23.0)
src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/create_dynamic_disk_spec.rb

┌──────────────┐
│ Opengrep CLI │
└──────────────┘

�[32m✔�[39m �[1mOpengrep OSS�[0m
�[32m✔�[39m Basic security coverage for first-party code vulnerabilities.

[00.12][ERROR]: unable to find a config; path .coderabbit-opengrep-fallback.yml does not exist


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends the BOSH Director dynamic disks API with endpoints to list disks, create a disk (without attaching), and attach an existing disk to a VM, to support the next phase of CLI commands.

Changes:

  • Add GET /dynamic_disks to list dynamic disks (new scope: bosh.dynamic-disks.list).
  • Add POST /dynamic_disks to create a dynamic disk in the IaaS without attaching it.
  • Add POST /dynamic_disks/:name/attach to attach an existing named disk to a VM via a background job.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/bosh-director/lib/bosh/director/api/controllers/dynamic_disks_controller.rb Adds list/create/attach endpoints and enqueues new background jobs
src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb New job to create a dynamic disk without attaching
src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb New job to attach an existing dynamic disk to an instance’s active VM
src/bosh-director/lib/bosh/director.rb Requires the newly added job classes
src/bosh-director/spec/support/test_identity_provider.rb Adds a scoped test user for the list endpoint
src/bosh-director/spec/unit/bosh/director/api/controllers/dynamic_disks_controller_spec.rb Adds controller coverage for the new endpoints and scope enforcement
src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/create_dynamic_disk_spec.rb Adds unit tests for the create job
src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/attach_dynamic_disk_spec.rb Adds unit tests for the attach job

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/bosh-director/lib/bosh/director/api/controllers/dynamic_disks_controller.rb Outdated
Comment on lines +27 to +33
cloud_properties = find_disk_cloud_properties(instance, @disk_pool_name)
cloud = Bosh::Director::CloudFactory.create.get(vm.cpi)

disk_model = Models::DynamicDisk.find(name: @disk_name)
if disk_model.nil?
cloud_properties['name'] = @disk_name
disk_cid = cloud.create_disk(@disk_size, cloud_properties, vm.cid)

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@src/bosh-director/lib/bosh/director/api/controllers/dynamic_disks_controller.rb`:
- Around line 8-21: The get '/' route handler in the DynamicDisksController has
an N+1 query issue because it retrieves all dynamic disks with
Models::DynamicDisk.all and then accesses the associated objects
(disk.deployment and disk.vm.instance) within the map block, causing separate
database queries for each disk. Fix this by using Sequel's eager loading method
on the Models::DynamicDisk query to pre-load the deployment and vm associations
along with the nested instance association before performing the map operation.

In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb`:
- Around line 30-42: The code directly mutates the cloud_properties hash by
assigning cloud_properties['name'] = `@disk_name` on line 32, which could cause
unintended side effects if this hash is shared or cached. Instead of modifying
the original hash in place, create a new hash by merging the cloud_properties
with a new hash containing the name key, then pass this merged hash to the
cloud.create_disk call. This preserves the original cloud_properties hash and
prevents potential side effects from mutations.

In
`@src/bosh-director/spec/unit/bosh/director/api/controllers/dynamic_disks_controller_spec.rb`:
- Around line 337-356: Expand the validation test coverage for the POST '/'
endpoint to match the comprehensiveness of the existing POST '/provide' endpoint
tests. In addition to the current test contexts for disk_name being nil and
disk_size being 0, add more test cases covering nil values for other fields,
empty string validations, and boundary value tests for disk_size and other
numeric properties. Reference the validation test patterns used in the POST
'/provide' endpoint (lines 84-168) to ensure consistency in what scenarios are
being tested, while leveraging the same safe_property validation helper that
both endpoints use.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: fb65caa2-5ab2-45be-a214-c92ffe8705bb

📥 Commits

Reviewing files that changed from the base of the PR and between df5d172 and 7e2883d.

📒 Files selected for processing (8)
  • src/bosh-director/lib/bosh/director.rb
  • src/bosh-director/lib/bosh/director/api/controllers/dynamic_disks_controller.rb
  • src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb
  • src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb
  • src/bosh-director/spec/support/test_identity_provider.rb
  • src/bosh-director/spec/unit/bosh/director/api/controllers/dynamic_disks_controller_spec.rb
  • src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/attach_dynamic_disk_spec.rb
  • src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/create_dynamic_disk_spec.rb

Comment thread src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb Outdated
KauzClay added 2 commits June 23, 2026 14:35
Models::DynamicDisk.all.map will cause N+1 queries when accessing
disk.deployment and disk.vm.instance for each row. Eager-load the associated
records so listing scales with number of disks.

ai-assisted=yes
[TNZ-109499] Add CLI commands for Dynamic Disks - Phase 2 (list, create, standalone attach)

Signed-off-by: Nishad Mathur <nishad.mathur@broadcom.com>
also allows attach_dynamic_disk to accept and set metadata, since ProvideDynamicDisk technically can override metadata if a disk already exists

ai-assisted=yes
[TNZ-109499] Add CLI commands for Dynamic Disks - Phase 2 (list, create, standalone attach)

Signed-off-by: Nishad Mathur <nishad.mathur@broadcom.com>
@Alphasite Alphasite marked this pull request as ready for review June 23, 2026 21:06

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb (2)

29-31: 🩺 Stability & Availability | 🟠 Major | 🏗️ Heavy lift

Same-VM idempotent return can suppress agent reconciliation after partial failure.

Line 37 persists attachment state before agent notification (Lines 44-45), but Lines 29-31 return early on retries when VM already matches. If an earlier attempt fails after Line 37 (e.g., metadata/agent error), retry skips add_dynamic_disk, leaving the agent unsynced.

Also applies to: 37-45

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb`
around lines 29 - 31, The early return check in the unless block at lines 29-31
prevents agent reconciliation on retries when a disk is already attached to the
same VM, which can leave the agent out of sync if a previous attempt failed
after persisting attachment state at line 37 but before agent notification at
lines 44-45. Modify the logic to allow the flow to continue to the
add_dynamic_disk call even when the disk is already attached to the same VM, or
remove this early return condition entirely so that agent notification (lines
44-45) is always executed regardless of whether the attachment state already
exists in the database.

29-37: 🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

Attachment-state check is outside a disk-scoped lock, allowing cross-VM races.

The disk_model.vm.nil? check (Lines 29-33) happens before locking, and the lock at Line 36 is VM-scoped. Two jobs targeting different VMs can concurrently pass the nil check for the same disk and race into attach_disk.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb`
around lines 29 - 37, The disk attachment state check (disk_model.vm.nil?)
occurs outside any disk-scoped lock, allowing concurrent jobs targeting
different VMs to race on the same disk. The existing with_vm_lock protects the
attach_disk call but only for a specific VM, not the disk itself. Add a
disk-scoped lock that wraps the entire disk attachment logic including the nil
check, the condition that verifies if the disk is already attached to the same
or different VM, and the attach_disk operation. This ensures that only one job
can check and modify the attachment state of a specific disk at any given time,
preventing cross-VM races on disk_model.
src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb (1)

21-42: 🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

CPI disk creation is not failure-safe across the DB write boundary.

At Line 34, the CPI create happens before the DB write at Line 35. If persistence fails (race/DB failure), the job can leave orphaned IaaS disks, and retries may create more. The pre-check at Lines 21-22 is not atomic protection.

💡 Suggested hardening pattern
-        cloud_properties['name'] = `@disk_name`
-        disk_cid = cloud.create_disk(`@disk_size`, cloud_properties, vm.cid)
-        disk_model = Models::DynamicDisk.create(
-          name: `@disk_name`,
-          disk_cid: disk_cid,
-          deployment_id: instance.deployment.id,
-          size: `@disk_size`,
-          disk_pool_name: `@disk_pool_name`,
-          cpi: vm.cpi,
-        )
+        cloud_properties = cloud_properties.merge('name' => `@disk_name`)
+        disk_cid = nil
+        begin
+          disk_cid = cloud.create_disk(`@disk_size`, cloud_properties, vm.cid)
+          disk_model = Models::DynamicDisk.create(
+            name: `@disk_name`,
+            disk_cid: disk_cid,
+            deployment_id: instance.deployment.id,
+            size: `@disk_size`,
+            disk_pool_name: `@disk_pool_name`,
+            cpi: vm.cpi,
+          )
+        rescue
+          cloud.delete_disk(disk_cid) unless disk_cid.nil?
+          raise
+        end
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb`
around lines 21 - 42, The cloud.create_disk call at line 34 happens before the
Models::DynamicDisk.create call at line 35, creating a window where the disk
exists in the IaaS but not in the database. If the database write fails,
orphaned disks are left and retries create duplicates. Wrap both the
cloud.create_disk operation and the Models::DynamicDisk.create operation within
a single database transaction to ensure atomicity, or consider creating the
database record first with a pending state, then updating it after successful
cloud disk creation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb`:
- Around line 29-31: The early return check in the unless block at lines 29-31
prevents agent reconciliation on retries when a disk is already attached to the
same VM, which can leave the agent out of sync if a previous attempt failed
after persisting attachment state at line 37 but before agent notification at
lines 44-45. Modify the logic to allow the flow to continue to the
add_dynamic_disk call even when the disk is already attached to the same VM, or
remove this early return condition entirely so that agent notification (lines
44-45) is always executed regardless of whether the attachment state already
exists in the database.
- Around line 29-37: The disk attachment state check (disk_model.vm.nil?) occurs
outside any disk-scoped lock, allowing concurrent jobs targeting different VMs
to race on the same disk. The existing with_vm_lock protects the attach_disk
call but only for a specific VM, not the disk itself. Add a disk-scoped lock
that wraps the entire disk attachment logic including the nil check, the
condition that verifies if the disk is already attached to the same or different
VM, and the attach_disk operation. This ensures that only one job can check and
modify the attachment state of a specific disk at any given time, preventing
cross-VM races on disk_model.

In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb`:
- Around line 21-42: The cloud.create_disk call at line 34 happens before the
Models::DynamicDisk.create call at line 35, creating a window where the disk
exists in the IaaS but not in the database. If the database write fails,
orphaned disks are left and retries create duplicates. Wrap both the
cloud.create_disk operation and the Models::DynamicDisk.create operation within
a single database transaction to ensure atomicity, or consider creating the
database record first with a pending state, then updating it after successful
cloud disk creation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2d6b5c6f-fbc5-4b73-a085-3193aa69fb49

📥 Commits

Reviewing files that changed from the base of the PR and between 7e2883d and 875c239.

📒 Files selected for processing (6)
  • src/bosh-director/lib/bosh/director/api/controllers/dynamic_disks_controller.rb
  • src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb
  • src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb
  • src/bosh-director/spec/unit/bosh/director/api/controllers/dynamic_disks_controller_spec.rb
  • src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/attach_dynamic_disk_spec.rb
  • src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/create_dynamic_disk_spec.rb

…on attach

- GET /dynamic_disks now returns disk_pool_name and metadata fields
- POST /dynamic_disks (create): replace instance_id with deployment_name+az;
  finds an active VM in the requested AZ to use as CPI hint for placement;
  stores AZ in availability_zone column at creation time
- POST /dynamic_disks/:name/attach: assert disk AZ matches target VM AZ,
  skip check when disk AZ is nil (disks created before AZ tracking)
- Refactor find_disk_cloud_properties helper to accept deployment or instance
- 76 unit specs, all passing

[TNZ-109499]

Co-authored-by: Cursor <cursoragent@cursor.com>
@Alphasite Alphasite force-pushed the feature/dynamic-disk-phase2-cli-endpoints branch from 563e69e to 68e7872 Compare June 24, 2026 01:25

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb (1)

29-33: 🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

Serialize attach operations by disk identity, not only VM identity.

Line 29 pre-checks disk_model.vm, but Line 42 locks by vm.cid. Two concurrent attaches of the same disk to different VMs can both pass the pre-check and race into CPI attach before Line 43 persists ownership, causing inconsistent disk attachment state.

Please guard the full precheck + attach + update sequence with a disk-scoped lock (or equivalent atomic claim on the disk row).

Also applies to: 42-43

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb`
around lines 29 - 33, The current locking mechanism in the attach_dynamic_disk
operation is scoped by VM identity (at line 42 using vm.cid), but the pre-check
validation (lines 29-33) operates on disk identity. This creates a race
condition where two concurrent attach operations targeting the same disk to
different VMs can both pass the pre-check with different VM locks, then both
proceed to call the CPI attach, causing inconsistent disk state. Change the
synchronization strategy from VM-scoped locking to disk-scoped locking to ensure
that the entire sequence of pre-check validation, CPI attach call, and
persistence update is atomically protected by a lock based on the disk identity
(disk_model) rather than the target VM identity, preventing concurrent
attachments of the same disk.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb`:
- Around line 29-33: The current locking mechanism in the attach_dynamic_disk
operation is scoped by VM identity (at line 42 using vm.cid), but the pre-check
validation (lines 29-33) operates on disk identity. This creates a race
condition where two concurrent attach operations targeting the same disk to
different VMs can both pass the pre-check with different VM locks, then both
proceed to call the CPI attach, causing inconsistent disk state. Change the
synchronization strategy from VM-scoped locking to disk-scoped locking to ensure
that the entire sequence of pre-check validation, CPI attach call, and
persistence update is atomically protected by a lock based on the disk identity
(disk_model) rather than the target VM identity, preventing concurrent
attachments of the same disk.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3597a1b6-f1df-4370-a83d-9e12224d5241

📥 Commits

Reviewing files that changed from the base of the PR and between 875c239 and 563e69e.

📒 Files selected for processing (7)
  • src/bosh-director/lib/bosh/director/api/controllers/dynamic_disks_controller.rb
  • src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb
  • src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb
  • src/bosh-director/lib/bosh/director/jobs/helpers/dynamic_disk_helpers.rb
  • src/bosh-director/spec/unit/bosh/director/api/controllers/dynamic_disks_controller_spec.rb
  • src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/attach_dynamic_disk_spec.rb
  • src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/create_dynamic_disk_spec.rb

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb (1)

29-33: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Do not return early before reconciliation in idempotent attach path.

Line 30 returns immediately when already attached to the same VM, which skips metadata reconciliation and agent sync; this makes retries non-convergent. Also, Line 51 unconditionally calls add_dynamic_disk, unlike existing dynamic-disk flow that guards when disk_hint is nil.

Suggested fix shape
- unless disk_model.vm.nil?
-   return "disk `#{`@disk_name`}` is already attached to vm `#{disk_model.vm.cid}`" if disk_model.vm.id == vm.id
+ already_attached_same_vm = false
+ unless disk_model.vm.nil?
+   if disk_model.vm.id == vm.id
+     already_attached_same_vm = true
+   else
+     raise "disk `#{`@disk_name`}` is already attached to a different vm `#{disk_model.vm.cid}`"
+   end
-   raise "disk `#{`@disk_name`}` is already attached to a different vm `#{disk_model.vm.cid}`"
  end

- disk_hint = with_vm_lock(vm.cid, timeout: VM_LOCK_TIMEOUT) { cloud.attach_disk(vm.cid, disk_model.disk_cid) }
- disk_model.update(vm_id: vm.id, availability_zone: vm.instance.availability_zone, disk_hint: disk_hint)
+ unless already_attached_same_vm
+   disk_hint = with_vm_lock(vm.cid, timeout: VM_LOCK_TIMEOUT) { cloud.attach_disk(vm.cid, disk_model.disk_cid) }
+   disk_model.update(vm_id: vm.id, availability_zone: vm.instance.availability_zone, disk_hint: disk_hint)
+ end

- agent_client.add_dynamic_disk(disk_model.disk_cid, disk_model.disk_hint)
+ agent_client.add_dynamic_disk(disk_model.disk_cid, disk_model.disk_hint) unless disk_model.disk_hint.nil?

Also applies to: 50-51

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb`
around lines 29 - 33, The idempotent attach path returns early when a disk is
already attached to the same VM, which skips necessary metadata reconciliation
and agent sync operations, making retries non-convergent. Remove the early
return in the condition block within the unless statement (that returns the
"disk is already attached" message) so that the method continues execution to
perform reconciliation and sync even when already attached to the correct VM.
Additionally, guard the unconditional call to add_dynamic_disk so that it only
executes when disk_hint is not nil, matching the pattern used in the existing
dynamic-disk flow.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb`:
- Around line 22-39: The create_dynamic_disk operation has a race condition
between the existence check at `Models::DynamicDisk.find(name: `@disk_name`)` and
the actual disk creation at `cloud.create_disk()`. Two concurrent jobs with the
same disk_name can both pass the existence check and create disks in the IaaS,
leaving orphaned resources. Serialize the create-by-name operation by adding a
locking mechanism (such as a database-level lock or mutex based on `@disk_name`)
around the entire sequence from the existence check through to the
Models::DynamicDisk.create call. This ensures that only one job can execute the
full check-and-create flow for a given disk_name at any time, preventing
duplicate IaaS disk creation.

---

Outside diff comments:
In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb`:
- Around line 29-33: The idempotent attach path returns early when a disk is
already attached to the same VM, which skips necessary metadata reconciliation
and agent sync operations, making retries non-convergent. Remove the early
return in the condition block within the unless statement (that returns the
"disk is already attached" message) so that the method continues execution to
perform reconciliation and sync even when already attached to the correct VM.
Additionally, guard the unconditional call to add_dynamic_disk so that it only
executes when disk_hint is not nil, matching the pattern used in the existing
dynamic-disk flow.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7606deb5-cd63-4db7-a8a0-090b6b9c747f

📥 Commits

Reviewing files that changed from the base of the PR and between 563e69e and 68e7872.

📒 Files selected for processing (7)
  • src/bosh-director/lib/bosh/director/api/controllers/dynamic_disks_controller.rb
  • src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb
  • src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb
  • src/bosh-director/lib/bosh/director/jobs/helpers/dynamic_disk_helpers.rb
  • src/bosh-director/spec/unit/bosh/director/api/controllers/dynamic_disks_controller_spec.rb
  • src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/attach_dynamic_disk_spec.rb
  • src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/create_dynamic_disk_spec.rb

Comment on lines +22 to +39
disk_model = Models::DynamicDisk.find(name: @disk_name)
raise "disk `#{@disk_name}` already exists" unless disk_model.nil?

deployment = Models::Deployment.find(name: @deployment_name)
raise "deployment `#{@deployment_name}` not found" if deployment.nil?

# Find an active VM in the requested AZ to use as the hint for create_disk.
# The IaaS uses the VM's location to determine which AZ/datastore to place the disk in.
vm = find_active_vm_in_az(deployment, @az)
raise "no active VM found in deployment `#{@deployment_name}` in AZ `#{@az}`" if vm.nil?

cloud = Bosh::Director::CloudFactory.create.get(vm.cpi)

cloud_properties = find_disk_cloud_properties(deployment, @disk_pool_name)
cloud_properties['name'] = @disk_name

disk_cid = cloud.create_disk(@disk_size, cloud_properties, vm.cid)
disk_model = Models::DynamicDisk.create(

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🔴 Critical | 🏗️ Heavy lift

Serialize create-by-name to prevent duplicate/orphan IaaS disks.

Line 22 (find) and Line 38 (cloud.create_disk) are not atomic. Two concurrent jobs with the same @disk_name can both pass the existence check and both create disks in CPI; only one record may persist, leaving orphaned infrastructure resources.

Suggested direction
- disk_model = Models::DynamicDisk.find(name: `@disk_name`)
- raise "disk `#{`@disk_name`}` already exists" unless disk_model.nil?
+ # Serialize by disk name (DB lock or distributed lock) before calling CPI create_disk
+ # and persist reservation/record before external side effects where possible.
+ # If CPI create succeeds but persistence fails, perform compensating cleanup (delete_disk).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb`
around lines 22 - 39, The create_dynamic_disk operation has a race condition
between the existence check at `Models::DynamicDisk.find(name: `@disk_name`)` and
the actual disk creation at `cloud.create_disk()`. Two concurrent jobs with the
same disk_name can both pass the existence check and create disks in the IaaS,
leaving orphaned resources. Serialize the create-by-name operation by adding a
locking mechanism (such as a database-level lock or mutex based on `@disk_name`)
around the entire sequence from the existence check through to the
Models::DynamicDisk.create call. This ensures that only one job can execute the
full check-and-create flow for a given disk_name at any time, preventing
duplicate IaaS disk creation.

- Avoid mutating the cloud_properties hash returned by find_disk_cloud_properties; use merge instead
- Clean up orphaned IaaS disk with cloud.delete_disk if the DB write fails after cloud.create_disk
- Fix attach idempotency: remove early return on same-VM so agent reconciliation (add_dynamic_disk) always runs on retry; guard the agent call when disk_hint is nil
- Expand POST / controller validation tests to match the coverage of POST /provide

ai-assisted=yes
[TNZ-109499] Add CLI commands for Dynamic Disks - Phase 2 (list, create, standalone attach)

Co-authored-by: Cursor <cursoragent@cursor.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/attach_dynamic_disk_spec.rb (1)

93-112: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add an idempotent retry test when disk_hint is nil.

Current same-VM idempotency coverage only exercises a non-nil disk_hint. Add a disk_hint: nil case so behavior is explicit and protected against regressions.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/attach_dynamic_disk_spec.rb`
around lines 93 - 112, The current test for the idempotent retry scenario only
covers the case where disk_hint has a non-nil value. Add an additional test
context after the existing 'when disk is already attached to the same VM'
context that specifically exercises the idempotent retry behavior when disk_hint
is nil. This new context should use the same FactoryBot.create pattern to create
a disk but with disk_hint explicitly set to nil, and verify that the same
behavior occurs: the cloud attach_disk method is not called, but the
agent_client receives the add_dynamic_disk call with nil as the disk_hint
parameter.
src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb (2)

29-49: 🗄️ Data Integrity & Integration | 🔴 Critical | 🏗️ Heavy lift

Protect attachment-state checks with a disk-level lock/transaction.

Line 30-35 (disk_model.vm check) and Line 47-48 (attach_disk + update) are not atomic per disk. VM-level locking only serializes per target VM, so concurrent jobs can attach the same disk to different VMs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb`
around lines 29 - 49, The disk attachment check and update operations are not
protected by a disk-level lock, creating a race condition where concurrent jobs
can attach the same disk to different VMs. The `disk_model.vm` nil check and the
`attach_disk` + `disk_model.update` operations must be wrapped in a disk-level
lock or transaction to ensure atomicity. Currently only a VM-level lock
(with_vm_lock) is used which only serializes per target VM, not per disk. Add a
disk-level locking mechanism (similar to with_vm_lock but for the disk) that
encompasses both the initial disk state check and the attach/update operations
to prevent concurrent attachment of the same disk to multiple VMs.

56-57: 🩺 Stability & Availability | 🟠 Major

Add nil check before notifying agent in idempotent retry path.

Line 57 passes disk_model.disk_hint to the agent without validation. If the disk is already attached to the same VM (idempotent retry), the code skips lines 46-49 and uses the stored disk_hint directly from the database. Since disk_hint_json is nullable in the schema and the model returns nil when empty, this can result in add_dynamic_disk(disk_cid, nil) being sent to the agent.

This inconsistency exists elsewhere in the codebase: provide_dynamic_disk.rb (lines 59–61) explicitly guards with unless disk_model.disk_hint.nil? before calling add_dynamic_disk, and mount_info_mismatch.rb (line 74) guards with if disk_hint. Add a similar guard in the idempotent retry path to ensure compatibility with legacy or partially-updated disk records that may have a null disk_hint_json value.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb`
around lines 56 - 57, The call to agent_client.add_dynamic_disk at line 57 in
the attach_dynamic_disk method passes disk_model.disk_hint without validating it
first. Since disk_hint_json is nullable in the database schema,
disk_model.disk_hint can be nil, which would result in sending
add_dynamic_disk(disk_cid, nil) to the agent. Add a guard condition around the
agent_client.add_dynamic_disk call to check that disk_model.disk_hint is not nil
before invoking it, similar to the pattern used in provide_dynamic_disk.rb
(lines 59-61) and mount_info_mismatch.rb (line 74). Only call add_dynamic_disk
when disk_model.disk_hint has a value.
♻️ Duplicate comments (1)
src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb (1)

21-52: 🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

Serialize create-by-name around the CPI call path.

Line 22 and Line 38 are still a check-then-act race: concurrent jobs for the same @disk_name can both pass existence check and both call cloud.create_disk before one DB write fails. Even with Line 49 cleanup, this still causes non-atomic external side effects and brittle retry behavior.

Suggested direction
 def perform
-  disk_model = Models::DynamicDisk.find(name: `@disk_name`)
-  raise "disk `#{`@disk_name`}` already exists" unless disk_model.nil?
+  with_lock("dynamic-disk-create:#{`@disk_name`}") do
+    disk_model = Models::DynamicDisk.find(name: `@disk_name`)
+    raise "disk `#{`@disk_name`}` already exists" unless disk_model.nil?
 
-  deployment = Models::Deployment.find(name: `@deployment_name`)
-  ...
-  disk_cid = cloud.create_disk(`@disk_size`, cloud_properties, vm.cid)
-  begin
-    disk_model = Models::DynamicDisk.create(...)
-  rescue
-    cloud.delete_disk(disk_cid)
-    raise
+    deployment = Models::Deployment.find(name: `@deployment_name`)
+    ...
+    disk_cid = cloud.create_disk(`@disk_size`, cloud_properties, vm.cid)
+    begin
+      disk_model = Models::DynamicDisk.create(...)
+    rescue
+      cloud.delete_disk(disk_cid)
+      raise
+    end
   end
 end
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb`
around lines 21 - 52, The perform method has a race condition where concurrent
jobs can both pass the disk existence check at line 22-23 and proceed to call
cloud.create_disk at line 37, causing duplicate disk creation attempts in the
cloud. Serialize access around the critical section from the disk existence
check through the cloud.create_disk call by implementing a lock mechanism (such
as a database-level lock or mutex) based on the disk_name to ensure only one job
can execute this sequence at a time and prevent concurrent calls to
cloud.create_disk for the same disk name.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb`:
- Around line 29-49: The disk attachment check and update operations are not
protected by a disk-level lock, creating a race condition where concurrent jobs
can attach the same disk to different VMs. The `disk_model.vm` nil check and the
`attach_disk` + `disk_model.update` operations must be wrapped in a disk-level
lock or transaction to ensure atomicity. Currently only a VM-level lock
(with_vm_lock) is used which only serializes per target VM, not per disk. Add a
disk-level locking mechanism (similar to with_vm_lock but for the disk) that
encompasses both the initial disk state check and the attach/update operations
to prevent concurrent attachment of the same disk to multiple VMs.
- Around line 56-57: The call to agent_client.add_dynamic_disk at line 57 in the
attach_dynamic_disk method passes disk_model.disk_hint without validating it
first. Since disk_hint_json is nullable in the database schema,
disk_model.disk_hint can be nil, which would result in sending
add_dynamic_disk(disk_cid, nil) to the agent. Add a guard condition around the
agent_client.add_dynamic_disk call to check that disk_model.disk_hint is not nil
before invoking it, similar to the pattern used in provide_dynamic_disk.rb
(lines 59-61) and mount_info_mismatch.rb (line 74). Only call add_dynamic_disk
when disk_model.disk_hint has a value.

In
`@src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/attach_dynamic_disk_spec.rb`:
- Around line 93-112: The current test for the idempotent retry scenario only
covers the case where disk_hint has a non-nil value. Add an additional test
context after the existing 'when disk is already attached to the same VM'
context that specifically exercises the idempotent retry behavior when disk_hint
is nil. This new context should use the same FactoryBot.create pattern to create
a disk but with disk_hint explicitly set to nil, and verify that the same
behavior occurs: the cloud attach_disk method is not called, but the
agent_client receives the add_dynamic_disk call with nil as the disk_hint
parameter.

---

Duplicate comments:
In
`@src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb`:
- Around line 21-52: The perform method has a race condition where concurrent
jobs can both pass the disk existence check at line 22-23 and proceed to call
cloud.create_disk at line 37, causing duplicate disk creation attempts in the
cloud. Serialize access around the critical section from the disk existence
check through the cloud.create_disk call by implementing a lock mechanism (such
as a database-level lock or mutex) based on the disk_name to ensure only one job
can execute this sequence at a time and prevent concurrent calls to
cloud.create_disk for the same disk name.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: de11dbb8-dd69-4155-9f0f-19da14c419f2

📥 Commits

Reviewing files that changed from the base of the PR and between 68e7872 and bb78550.

📒 Files selected for processing (5)
  • src/bosh-director/lib/bosh/director/jobs/dynamic_disks/attach_dynamic_disk.rb
  • src/bosh-director/lib/bosh/director/jobs/dynamic_disks/create_dynamic_disk.rb
  • src/bosh-director/spec/unit/bosh/director/api/controllers/dynamic_disks_controller_spec.rb
  • src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/attach_dynamic_disk_spec.rb
  • src/bosh-director/spec/unit/bosh/director/jobs/dynamic_disks/create_dynamic_disk_spec.rb

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Waiting for Changes | Open for Contribution

Development

Successfully merging this pull request may close these issues.

3 participants