feat: Add a rds-instance-stop chaos fault #710
experimentDetails.InstanceID = types.Getenv("INSTANCE_ID", "")
experimentDetails.ChaosPodName = types.Getenv("POD_NAME", "")
experimentDetails.Delay, _ = strconv.Atoi(types.Getenv("STATUS_CHECK_DELAY", "2"))
experimentDetails.Timeout, _ = strconv.Atoi(types.Getenv("STATUS_CHECK_TIMEOUT", "600"))
Stopping an RDS instance takes about 10 minutes, so I set the default STATUS_CHECK_TIMEOUT to 600.
Signed-off-by: Jongwoo Han <[email protected]>
log.Infof("[Wait]: Wait for RDS instance '%v' to get in available state", identifier)
if err := awslib.WaitForRDSInstanceUp(experimentsDetails.Timeout, experimentsDetails.Delay, experimentsDetails.Region, identifier); err != nil {
return stacktrace.Propagate(err, "rds instance failed to start")
}
Instances can remain stopped for up to 168 hours. This duration is generally sufficient, so it should not pose any issues.
Thank you so much for this contribution and your patience! 🙌
<tr>
<td> RDS Instance Stop </td>
<td> This experiment causes termination of an RDS instance before bringing it back to available state using the instance identifier after the specified chaos duration. We can also control the number of target instance using instance affected percentage</td>
<td> <a href="https://litmuschaos.github.io/litmus/experiments/categories/aws/rds-instance-stop/"> Here </a> </td>
This documentation page needs to be added before the URL is provided here.
The requested changes can be addressed in subsequent PRs. Thanks!
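The table row quoted above mentions limiting the number of target instances by an affected percentage. A minimal sketch of one way to do that is shown here. `pickTargets` is hypothetical, and both the rounding down and the "at least one target" rule are assumptions for illustration, not confirmed behavior of the fault.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickTargets is a hypothetical helper (not part of litmus-go). It returns a
// random subset of ids whose size is percentage of len(ids), rounded down,
// with a minimum of one target when ids is non-empty (both are assumptions).
func pickTargets(ids []string, percentage int) []string {
	if len(ids) == 0 {
		return nil
	}
	// Clamp the percentage to the valid range.
	if percentage < 0 {
		percentage = 0
	}
	if percentage > 100 {
		percentage = 100
	}
	n := len(ids) * percentage / 100
	if n < 1 {
		n = 1
	}
	// Shuffle a copy so the caller's slice is left untouched.
	shuffled := append([]string(nil), ids...)
	rand.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})
	return shuffled[:n]
}

func main() {
	ids := []string{"db-1", "db-2", "db-3", "db-4"}
	fmt.Println(len(pickTargets(ids, 50)))
}
```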
if experimentsDetails.EngineName != "" {
// Marking AUT as running, as we already checked the status of application under test
msg := "AUT: Running"
Suggested change:
- msg := "AUT: Running"
+ msg := "RDS Instance: Running"
// Verify the aws rds instance is available (pre-chaos)
if chaosDetails.DefaultHealthCheck {
log.Info("[Status]: Verify that the aws rds instances are in available state (pre-chaos)")
It should be added before the probe (line 85).
if experimentsDetails.EngineName != "" {
// Marking AUT as running, as we already checked the status of application under test
msg := "AUT: Running"
same as above
* feat: Add a rds-instance-stop chaos fault
Signed-off-by: Jongwoo Han <[email protected]>
Signed-off-by: Sami Shabaneh <[email protected]>
…Limit (#738)
* Fix: handle pagination in ssm describe
* implement exponential backoff with jitter for API rate limiting
* Refactor
* Update pkg/cloud/aws/ssm/ssm-operations.go
* fixup
* Update pkg/cloud/aws/ssm/ssm-operations.go
* Fix: include error message from stderr if container-kill fails (#740) (#741)
* fix(logs): Fix the error logs for container-kill fault (#745)
* fix(container-kill): Fixed the container stop command timeout issue (#747)
* feat: Add a rds-instance-stop chaos fault (#710)
* Update pkg/cloud/aws/ssm/ssm-operations.go
* fix go fmt ./...
* Filter instances on api call
* fixes lint
Signed-off-by: Sami Shabaneh <[email protected]>
Signed-off-by: Björn Kylberg <[email protected]>
Signed-off-by: Shubham Chaudhary <[email protected]>
Signed-off-by: Jongwoo Han <[email protected]>
Signed-off-by: Udit Gaurav <[email protected]>
Co-authored-by: Neelanjan Manna <[email protected]>
Co-authored-by: Udit Gaurav <[email protected]>
Co-authored-by: Björn Kylberg <[email protected]>
Co-authored-by: Shubham Chaudhary <[email protected]>
Co-authored-by: Jongwoo Han <[email protected]>
What this PR does / why we need it:
I have implemented the rds-instance-stop chaos fault in this PR based on the proposal.
Please see also litmuschaos/chaos-charts#635
Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #
Special notes for your reviewer:
cc. @namkyu1999
Checklist:
- breaking-changes tag
- requires-upgrade tag