Add AWS spot instance support for EC2 hypervisors #63
    @@ -6,6 +6,7 @@ Conditions:
       AddSecondaryVolume: !Not [!Equals [!Ref EC2Type, 'MetalMachine']]
       UseCapacityReservation: !Not [!Equals [!Ref CapacityReservationId, '']]
       UseSpecificAZ: !Not [!Equals [!Ref AvailabilityZone, '']]
    +  UseSpotInstance: !Equals [!Ref UseSpot, 'Yes']

     Mappings:
       VolumeSize:
    @@ -54,6 +55,13 @@ Parameters:
         Type: String
         Description: Specific AZ for instance placement (optional)
         Default: ""
    +  UseSpot:
    +    Type: String
    +    Description: Whether to use spot instances (Yes/No)
    +    Default: "No"
    +    AllowedValues:
    +      - "Yes"
    +      - "No"

     Metadata:
       AWS::CloudFormation::Interface:
    @@ -238,6 +246,18 @@ Resources:
                 CapacityReservationTarget:
                   CapacityReservationId: !Ref CapacityReservationId

    +  # Launch Template for Spot Instances (only created when spot is used)
    +  RHELSpotLaunchTemplate:
    +    Type: AWS::EC2::LaunchTemplate
    +    Condition: UseSpotInstance
    +    Properties:
    +      LaunchTemplateData:
    +        InstanceMarketOptions:
    +          MarketType: spot
    +          SpotOptions:
    +            SpotInstanceType: persistent
    +            InstanceInterruptionBehavior: stop
    +
Comment on lines +249 to +260

🧩 Analysis chain

🏁 Script executed:

    #!/bin/bash
    # Look for any spot-request cancellation logic across the deploy tree
    rg -nP -C2 'cancel-spot-instance-requests|describe-spot-instance-requests|SpotInstanceRequestId'
    fd -t f 'destroy|delete|cleanup|teardown' deploy/aws-hypervisor | xargs -I{} sh -c 'echo "=== {} ==="; cat "{}"'

Repository: openshift-eng/two-node-toolbox
Length of output: 2729

🌐 Web query:

💡 Result: No, the associated Spot Instance request is not cancelled automatically when the CloudFormation stack is deleted. CloudFormation deletes the EC2 instance by terminating it, but for persistent Spot requests, termination causes the request to become open again, potentially launching a new instance. Multiple user reports confirm the Spot request persists after stack deletion, requiring manual cancellation.

Add spot instance request cancellation to teardown process.

The teardown script must explicitly cancel all Spot Instance Requests associated with the stack before or after CloudFormation stack deletion. Reference the Spot request ID stored during instance launch and use …
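To check whether a teardown left anything behind, one option is to list the persistent Spot requests that are still open or active. The sketch below is illustrative only: the function name `list_persistent_spot_requests` is hypothetical and not part of this PR, `REGION` is assumed to be set by the caller, and the query is region-wide, so it may also return requests that belong to other stacks.

```shell
# Sketch only: print the IDs of persistent Spot requests that are still
# open or active in ${REGION}, one per line.
# Assumption: REGION is exported by the calling script.
list_persistent_spot_requests() {
    aws --region "${REGION}" ec2 describe-spot-instance-requests \
        --filters "Name=type,Values=persistent" "Name=state,Values=open,active" \
        --query 'SpotInstanceRequests[].SpotInstanceRequestId' \
        --output text --no-cli-pager \
        | tr '\t' '\n' \
        | sed '/^$/d'   # text output is tab-separated; drop empty lines
}
```

Running `REGION=us-east-1 list_persistent_spot_requests` after a stack deletion and finding an ID that belonged to the deleted hypervisor would indicate an orphaned request that still needs `cancel-spot-instance-requests`.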
       # EC2 Instance
       RHELInstance:
         Type: AWS::EC2::Instance
    @@ -251,10 +271,14 @@ Resources:
           IamInstanceProfile: !Ref RHELInstanceProfile
           InstanceType: !Ref HostInstanceType
           LaunchTemplate: !If
    -        - UseCapacityReservation
    -        - LaunchTemplateId: !Ref RHELLaunchTemplate
    -          Version: !GetAtt RHELLaunchTemplate.LatestVersionNumber
    -        - !Ref AWS::NoValue
    +        - UseSpotInstance
    +        - LaunchTemplateId: !Ref RHELSpotLaunchTemplate
    +          Version: !GetAtt RHELSpotLaunchTemplate.LatestVersionNumber
    +        - !If
    +          - UseCapacityReservation
    +          - LaunchTemplateId: !Ref RHELLaunchTemplate
    +            Version: !GetAtt RHELLaunchTemplate.LatestVersionNumber
    +          - !Ref AWS::NoValue
           NetworkInterfaces:
             - AssociatePublicIpAddress: "False"
               DeviceIndex: "0"
Spot request cancellation fails if the instance is already terminated.

The current approach queries `describe-instances` to get the spot request ID, but this fails silently (line 71: `2>/dev/null || echo ""`) when the instance no longer exists. This leaves persistent spot requests orphaned: they will continue attempting to launch new instances.

Prefer reading from a persisted `spot-request-id` file (if created at stack time), falling back to the instance query:

Proposed fix

     # Cancel persistent spot request if the instance is a spot instance
     instance_id_file="${instance_data_dir}/aws-instance-id"
    +spot_request_file="${instance_data_dir}/spot-request-id"
    +spot_request_id=""
    +
    +# Prefer persisted spot request ID (survives instance termination)
    +if [[ -f "${spot_request_file}" ]]; then
    +    spot_request_id=$(cat "${spot_request_file}")
    +elif [[ -f "${instance_id_file}" ]]; then
    +    # Fallback: query instance (only works if instance still exists)
    -if [[ -f "${instance_id_file}" ]]; then
         instance_id=$(cat "${instance_id_file}")
         spot_request_id=$(aws --region "${REGION}" ec2 describe-instances \
             --instance-ids "${instance_id}" \
             --query 'Reservations[0].Instances[0].SpotInstanceRequestId' \
             --output text --no-cli-pager 2>/dev/null || echo "")
    +fi

    -    if [[ -n "${spot_request_id}" && "${spot_request_id}" != "None" && "${spot_request_id}" != "null" ]]; then
    -        msg_info "Canceling persistent spot request ${spot_request_id}..."
    -        aws --region "${REGION}" ec2 cancel-spot-instance-requests \
    -            --spot-instance-request-ids "${spot_request_id}" \
    -            --no-cli-pager >/dev/null 2>&1 || msg_warning "Failed to cancel spot request (may already be canceled)"
    -    fi
    +if [[ -n "${spot_request_id}" && "${spot_request_id}" != "None" && "${spot_request_id}" != "null" ]]; then
    +    msg_info "Canceling persistent spot request ${spot_request_id}..."
    +    aws --region "${REGION}" ec2 cancel-spot-instance-requests \
    +        --spot-instance-request-ids "${spot_request_id}" \
    +        --no-cli-pager >/dev/null 2>&1 || msg_warning "Failed to cancel spot request (may already be canceled)"
     fi
    +
    +rm -f "${spot_request_file}"
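The proposed fix reads a `spot-request-id` file but does not show where that file would be written. A minimal sketch of the creation-side step follows, assuming it runs once the instance ID is known. The helper name `save_spot_request_id` and its arguments are hypothetical, and `REGION` is assumed to be set as in the teardown script.

```shell
# Hypothetical helper (not part of this PR): record the Spot request ID right
# after launch so teardown can cancel it even if the instance is already gone.
# Usage: save_spot_request_id <instance-id> <output-file>
# Assumption: REGION is set by the calling script.
save_spot_request_id() {
    local instance_id out_file spot_request_id
    instance_id="$1"
    out_file="$2"

    # Same query the teardown script uses; empty string on lookup failure.
    spot_request_id=$(aws --region "${REGION}" ec2 describe-instances \
        --instance-ids "${instance_id}" \
        --query 'Reservations[0].Instances[0].SpotInstanceRequestId' \
        --output text --no-cli-pager 2>/dev/null || echo "")

    # On-demand instances report no Spot request ID: write nothing.
    case "${spot_request_id}" in
        ""|None|null) return 1 ;;
    esac

    printf '%s\n' "${spot_request_id}" > "${out_file}"
}
```

A create script could call it as `save_spot_request_id "${instance_id}" "${instance_data_dir}/spot-request-id" || true`, so that on-demand deployments simply skip the file.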