Production - [Alerting] Android emulator failure rate alert #11620

Closed
dotnet-eng-status bot opened this issue Nov 14, 2022 · 9 comments
Labels: Critical, Grafana Alert, Inactive Alert, Ops - First Responder, Production

Comments

@dotnet-eng-status

💔 Metric state changed to alerting

Description and instructions for this alert

Please note that this alert will fire every 12 hours as the list of machines can change while the alert is alive. So please keep an eye on the list of machines in the comment.

  • FailureRate {Machine=API open / a001OKI} 92
  • FailureRate {Machine=API open / a001OKK} 81
  • FailureRate {Machine=API open / a001OL7} 88
  • FailureRate {Machine=API open / a001OLF} 100
  • FailureRate {Machine=API open / a001OLH} 100
  • FailureRate {Machine=API open / a001OLO} 100
  • FailureRate {Machine=API open / a001OLP} 90
  • FailureRate {Machine=API open / a001OLQ} 100
  • FailureRate {Machine=API open / a001OLU} 84
  • FailureRate {Machine=API open / a001OLV} 100
  • FailureRate {Machine=API open / a001OLZ} 92
  • FailureRate {Machine=API open / a001OM1} 100
  • FailureRate {Machine=API open / a001OMB} 94

Go to rule

@dotnet/dnceng, please investigate

Automation information below, do not change

Grafana-Automated-Alert-Id-e38f14fe3367451d8de43da6e2453fdd

dotnet-eng-status bot added the Active Alert, Critical, Ops - First Responder, Grafana Alert, and Production labels Nov 14, 2022
premun self-assigned this Nov 14, 2022
@premun
Member

premun commented Nov 14, 2022

Ah, right, this was re-opened after I closed #11605.

Will just wait for it to go green before closing.

@dotnet-eng-status
Author

💔 Metric state changed to alerting

Description and instructions for this alert

Please note that this alert will fire every 12 hours as the list of machines can change while the alert is alive. So please keep an eye on the list of machines in the comment.

  • FailureRate {Machine=API 29 / a000VJC} 85
  • FailureRate {Machine=API 29 / a000VJJ} 92
  • FailureRate {Machine=API 29 / a000VKD} 85
  • FailureRate {Machine=API open / a001OLD} 84
  • FailureRate {Machine=API open / a001OLP} 94
  • FailureRate {Machine=API open / a001OLU} 84
  • FailureRate {Machine=API open / a001OLZ} 96
  • FailureRate {Machine=API open / a001OMB} 94
  • FailureRate {Machine=API open / a001OPE} 94
  • FailureRate {Machine=API open / a001OPF} 94
  • FailureRate {Machine=API open / a001ORY} 94
  • FailureRate {Machine=API open / a001OS0} 85
  • FailureRate {Machine=API open / a001OS3} 100
  • FailureRate {Machine=API open / a001OS9} 100
  • FailureRate {Machine=API open / a001OSC} 83
  • FailureRate {Machine=API open / a001OSF} 93
  • FailureRate {Machine=API open / a001OSG} 100
  • FailureRate {Machine=API open / a001OSL} 81
  • FailureRate {Machine=API open / a001OSN} 85
  • FailureRate {Machine=API open / a001OSQ} 91
  • FailureRate {Machine=API open / a001OSS} 95
  • FailureRate {Machine=API open / a001OT4} 90
  • FailureRate {Machine=API open / a001OT6} 83
  • FailureRate {Machine=API open / a001OT7} 95
  • FailureRate {Machine=API open / a001OX7} 82

Go to rule

@dotnet-eng-status
Author

💔 Metric state changed to alerting

Description and instructions for this alert

Please note that this alert will fire every 12 hours as the list of machines can change while the alert is alive. So please keep an eye on the list of machines in the comment.

  • FailureRate {Machine=API 29 / a000VJC} 85
  • FailureRate {Machine=API 29 / a000VJJ} 92
  • FailureRate {Machine=API 29 / a000VKD} 85
  • FailureRate {Machine=API open / a001OPE} 94
  • FailureRate {Machine=API open / a001OPF} 94
  • FailureRate {Machine=API open / a001ORY} 94
  • FailureRate {Machine=API open / a001OS0} 85
  • FailureRate {Machine=API open / a001OS3} 100
  • FailureRate {Machine=API open / a001OS9} 100
  • FailureRate {Machine=API open / a001OSC} 83
  • FailureRate {Machine=API open / a001OSF} 93
  • FailureRate {Machine=API open / a001OSG} 100
  • FailureRate {Machine=API open / a001OSL} 81
  • FailureRate {Machine=API open / a001OSN} 85
  • FailureRate {Machine=API open / a001OSQ} 91
  • FailureRate {Machine=API open / a001OSS} 95
  • FailureRate {Machine=API open / a001OT4} 90
  • FailureRate {Machine=API open / a001OT6} 83
  • FailureRate {Machine=API open / a001OT7} 95
  • FailureRate {Machine=API open / a001OVZ} 88
  • FailureRate {Machine=API open / a001OWX} 91
  • FailureRate {Machine=API open / a001OX6} 88
  • FailureRate {Machine=API open / a001OX7} 82

Go to rule

4 similar comments from @dotnet-eng-status followed, each listing the same machines and failure rates as the comment above.

@dotnet-eng-status
Author

💔 Metric state changed to alerting

Description and instructions for this alert

Please note that this alert will fire every 12 hours as the list of machines can change while the alert is alive. So please keep an eye on the list of machines in the comment.

  • FailureRate {Machine=API open / a001OVZ} 88
  • FailureRate {Machine=API open / a001OWX} 93
  • FailureRate {Machine=API open / a001OX6} 88

Go to rule

dotnet-eng-status bot added the Inactive Alert label and removed the Active Alert label Nov 18, 2022
@dotnet-eng-status
Author

💚 Metric state changed to ok

Description and instructions for this alert

Please note that this alert will fire every 12 hours as the list of machines can change while the alert is alive. So please keep an eye on the list of machines in the comment.

Go to rule

premun closed this as completed Nov 18, 2022