upgrade modules, docs and add priority offset
obeleh committed Jul 22, 2022
1 parent 7f01df5 commit bff7a28
Showing 23 changed files with 175 additions and 112 deletions.
29 changes: 29 additions & 0 deletions .github/workflows/documentation.yaml
@@ -0,0 +1,29 @@
name: Generate terraform docs

on:
  push:
    # don't run when we push a tag
    tags-ignore:
      - '*'
    # don't run when we merge to main
    # the action should have run already
    branches-ignore:
      - 'main'
jobs:
  pre-commit:
    runs-on: ubuntu-latest
    steps:
      - uses: terraform-linters/setup-tflint@v2
        name: Setup TFLint
        with:
          tflint_version: v0.38.1
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
      - uses: pre-commit/[email protected]
        # pre-commit fails if it changed files
        # we want to go on
        continue-on-error: true
      - uses: pre-commit/[email protected]
      - uses: EndBug/add-and-commit@v9
        with:
          default_author: github_actions
1 change: 1 addition & 0 deletions .gitignore
@@ -2,3 +2,4 @@
*.tfstate.*
**/.terraform/
**/secrets.auto.tfvars
examples/.terraform.lock.hcl
7 changes: 3 additions & 4 deletions .pre-commit-config.yaml
@@ -1,14 +1,13 @@
repos:
- repo: https://github.com/gruntwork-io/pre-commit
rev: v0.1.14
rev: v0.1.12
hooks:
- id: terraform-fmt
- id: terraform-validate
- id: tflint
- repo: git@github.com:kabisa/terraform-datadog-pre-commit-hook.git
rev: "1.2.2"
- repo: https://github.com/kabisa/terraform-datadog-pre-commit-hook
rev: "1.3.6"
hooks:
- id: terraform-datadog-docs
exclude: ^README.md$
args:
- "."
21 changes: 18 additions & 3 deletions .terraform.lock.hcl


164 changes: 85 additions & 79 deletions README.md
@@ -22,7 +22,7 @@ Modules are generated with this tool: https://github.com/kabisa/datadog-terrafor
module "containers" {
source = "kabisa/docker-container/datadog"
notification_channel = "[email protected]"
notification_channel = "@[email protected]"
service = "MyApp"
env = "prd"
filter_str = "app:myapp"
@@ -37,25 +37,30 @@ module "containers" {
ingress_traffic_warning = 7500000 # 7.5MB/s
ingress_traffic_critical = 10000000 # 10MB/s
}
```


[Module Variables](#module-variables)

Monitors:
* [Terraform module for Datadog Docker Container](#terraform-module-for-datadog-docker-container)
* [CPU Usage](#cpu-usage)
* [Thread Count](#thread-count)
* [Disk Io Write](#disk-io-write)
* [Ingress Traffic](#ingress-traffic)
* [Memory Used Percent](#memory-used-percent)
* [Egress Traffic](#egress-traffic)
* [Disk Io Read](#disk-io-read)
* [Module Variables](#module-variables)

| Monitor name | Default enabled | Priority | Query |
|-----------------|------|----|------------------------|
| [CPU Usage](#cpu-usage) | True | 3 | `avg(last_15m):avg:docker.cpu.usage{tag:xxx} by {container_name,host${local.by_cluster}} > 85` |
| [Disk Io Read](#disk-io-read) | True | 3 | `avg(last_15m):avg:docker.io.read_bytes{tag:xxx} by {container_name,host${local.by_cluster}} > ` |
| [Disk Io Write](#disk-io-write) | True | 3 | `avg(last_15m):avg:docker.io.write_bytes{tag:xxx} by {container_name,host${local.by_cluster}} > ` |
| [Egress Traffic](#egress-traffic) | True | 3 | `avg(last_15m):avg:docker.net.bytes_sent{tag:xxx} by {container_name,host${local.by_cluster}} > ` |
| [Ingress Traffic](#ingress-traffic) | True | 3 | `avg(last_15m):avg:docker.net.bytes_rcvd{tag:xxx} by {container_name,host${local.by_cluster}} > ` |
| [Memory Used Percent](#memory-used-percent) | True | 3 | `avg(last_5m):avg:docker.mem.in_use{tag:xxx} by {container_name,host${local.by_cluster}} > 85` |
| [Thread Count](#thread-count) | False | 3 | `avg(last_30m):avg:docker.thread.count{tag:xxx} by {host${local.by_cluster},container_name} > ` |

# Getting started developing
[pre-commit](http://pre-commit.com/) was used to do Terraform linting and validating.

Steps:
- Install [pre-commit](http://pre-commit.com/). E.g. `brew install pre-commit`.
- Run `pre-commit install` in this repo. (Every time you cloud a repo with pre-commit enabled you will need to run the pre-commit install command)
- Run `pre-commit install` in this repo. (Every time you clone a repo with pre-commit enabled you will need to run the pre-commit install command)
- That’s it! Now every time you commit a code change (`.tf` file), the hooks in the `hooks:` config `.pre-commit-config.yaml` will execute.

## CPU Usage
@@ -83,29 +88,29 @@ avg(last_15m):avg:docker.cpu.usage{tag:xxx} by {container_name,host${local.by_cl
| cpu_usage_priority | 3 | No | Number from 1 (high) to 5 (low). |


## Thread Count
## Disk Io Read

Query:
```terraform
avg(last_30m):avg:docker.thread.count{tag:xxx} by {host${local.by_cluster},container_name} >
avg(last_15m):avg:docker.io.read_bytes{tag:xxx} by {container_name,host${local.by_cluster}} >
```

| variable | default | required | description |
|--------------------------------|----------|----------|----------------------------------|
| thread_count_enabled | False | No | |
| thread_count_warning | None | No | |
| thread_count_critical | None | No | |
| thread_count_evaluation_period | last_30m | No | |
| thread_count_note | "" | No | |
| thread_count_docs | "" | No | |
| thread_count_filter_override | "" | No | |
| thread_count_alerting_enabled | True | No | |
| thread_count_no_data_timeframe | None | No | |
| thread_count_notify_no_data | False | No | |
| thread_count_ok_threshold | None | No | |
| thread_count_name_prefix | "" | No | |
| thread_count_name_suffix | "" | No | |
| thread_count_priority | 3 | No | Number from 1 (high) to 5 (low). |
| disk_io_read_enabled | True | No | |
| disk_io_read_warning | None | No | |
| disk_io_read_critical | | Yes | |
| disk_io_read_evaluation_period | last_15m | No | |
| disk_io_read_note | "" | No | |
| disk_io_read_docs | "" | No | |
| disk_io_read_filter_override | "" | No | |
| disk_io_read_alerting_enabled | True | No | |
| disk_io_read_no_data_timeframe | None | No | |
| disk_io_read_notify_no_data | False | No | |
| disk_io_read_ok_threshold | None | No | |
| disk_io_read_name_prefix | "" | No | |
| disk_io_read_name_suffix | "" | No | |
| disk_io_read_priority | 3 | No | Number from 1 (high) to 5 (low). |


## Disk Io Write
@@ -133,6 +138,31 @@ avg(last_15m):avg:docker.io.write_bytes{tag:xxx} by {container_name,host${local.
| disk_io_write_priority | 3 | No | Number from 1 (high) to 5 (low). |


## Egress Traffic

Query:
```terraform
avg(last_15m):avg:docker.net.bytes_sent{tag:xxx} by {container_name,host${local.by_cluster}} >
```

| variable | default | required | description |
|----------------------------------|----------|----------|----------------------------------|
| egress_traffic_enabled | True | No | |
| egress_traffic_warning | None | No | |
| egress_traffic_critical | | Yes | |
| egress_traffic_evaluation_period | last_15m | No | |
| egress_traffic_note | "" | No | |
| egress_traffic_docs | "" | No | |
| egress_traffic_filter_override | "" | No | |
| egress_traffic_alerting_enabled | True | No | |
| egress_traffic_no_data_timeframe | None | No | |
| egress_traffic_notify_no_data | False | No | |
| egress_traffic_ok_threshold | None | No | |
| egress_traffic_name_prefix | "" | No | |
| egress_traffic_name_suffix | "" | No | |
| egress_traffic_priority | 3 | No | Number from 1 (high) to 5 (low). |


## Ingress Traffic

Query:
@@ -183,68 +213,44 @@ avg(last_5m):avg:docker.mem.in_use{tag:xxx} by {container_name,host${local.by_cl
| memory_used_percent_priority | 3 | No | Number from 1 (high) to 5 (low). |


## Egress Traffic

Query:
```terraform
avg(last_15m):avg:docker.net.bytes_sent{tag:xxx} by {container_name,host${local.by_cluster}} >
```

| variable | default | required | description |
|----------------------------------|----------|----------|----------------------------------|
| egress_traffic_enabled | True | No | |
| egress_traffic_warning | None | No | |
| egress_traffic_critical | | Yes | |
| egress_traffic_evaluation_period | last_15m | No | |
| egress_traffic_note | "" | No | |
| egress_traffic_docs | "" | No | |
| egress_traffic_filter_override | "" | No | |
| egress_traffic_alerting_enabled | True | No | |
| egress_traffic_no_data_timeframe | None | No | |
| egress_traffic_notify_no_data | False | No | |
| egress_traffic_ok_threshold | None | No | |
| egress_traffic_name_prefix | "" | No | |
| egress_traffic_name_suffix | "" | No | |
| egress_traffic_priority | 3 | No | Number from 1 (high) to 5 (low). |


## Disk Io Read
## Thread Count

Query:
```terraform
avg(last_15m):avg:docker.io.read_bytes{tag:xxx} by {container_name,host${local.by_cluster}} >
avg(last_30m):avg:docker.thread.count{tag:xxx} by {host${local.by_cluster},container_name} >
```

| variable | default | required | description |
|--------------------------------|----------|----------|----------------------------------|
| disk_io_read_enabled | True | No | |
| disk_io_read_warning | None | No | |
| disk_io_read_critical | | Yes | |
| disk_io_read_evaluation_period | last_15m | No | |
| disk_io_read_note | "" | No | |
| disk_io_read_docs | "" | No | |
| disk_io_read_filter_override | "" | No | |
| disk_io_read_alerting_enabled | True | No | |
| disk_io_read_no_data_timeframe | None | No | |
| disk_io_read_notify_no_data | False | No | |
| disk_io_read_ok_threshold | None | No | |
| disk_io_read_name_prefix | "" | No | |
| disk_io_read_name_suffix | "" | No | |
| disk_io_read_priority | 3 | No | Number from 1 (high) to 5 (low). |
| thread_count_enabled | False | No | |
| thread_count_warning | None | No | |
| thread_count_critical | None | No | |
| thread_count_evaluation_period | last_30m | No | |
| thread_count_note | "" | No | |
| thread_count_docs | "" | No | |
| thread_count_filter_override | "" | No | |
| thread_count_alerting_enabled | True | No | |
| thread_count_no_data_timeframe | None | No | |
| thread_count_notify_no_data | False | No | |
| thread_count_ok_threshold | None | No | |
| thread_count_name_prefix | "" | No | |
| thread_count_name_suffix | "" | No | |
| thread_count_priority | 3 | No | Number from 1 (high) to 5 (low). |


## Module Variables

| variable | default | required | description |
|----------------------|----------|----------|--------------|
| filter_str | | Yes | |
| env | | Yes | |
| service | | Yes | |
| notification_channel | | Yes | |
| additional_tags | [] | No | |
| locked | False | No | |
| name_prefix | "" | No | |
| name_suffix | "" | No | |
| runs_in_k8s | False | No | |
| variable | default | required | description |
|----------------------|----------|----------|----------------------------------------------------------|
| filter_str | | Yes | |
| env | | Yes | |
| service | | Yes | |
| notification_channel | | Yes | |
| additional_tags | [] | No | |
| locked | False | No | |
| name_prefix | "" | No | |
| name_suffix | "" | No | |
| runs_in_k8s | False | No | |
| priority_offset | 0 | No | For non production workloads we can +1 on the priorities |
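
A minimal usage sketch for the new `priority_offset` input, assuming a hypothetical non-production environment named `stg`. The module label, notification handle, and threshold values are placeholders and do not come from this commit. The variable tables list `disk_io_read_critical` and `egress_traffic_critical` as required; the write and ingress counterparts are assumed to follow the same pattern.

```terraform
module "containers_staging" {
  source = "kabisa/docker-container/datadog"

  notification_channel = "@example-team" # placeholder handle
  service              = "MyApp"
  env                  = "stg" # hypothetical non-production environment
  filter_str           = "app:myapp"

  # Lower the urgency of monitors that apply the offset.
  # With the default priority of 3: min(3 + 1, 5) = 4.
  priority_offset = 1

  # Placeholder thresholds; tune these for your workload.
  disk_io_read_critical    = 10000000
  disk_io_write_critical   = 10000000 # assumed to be required
  egress_traffic_critical  = 10000000
  ingress_traffic_critical = 10000000 # 10MB/s, as in the README example
}
```

Production workloads can leave `priority_offset` at its default of 0, which keeps each monitor at its own configured priority.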


2 changes: 1 addition & 1 deletion cpu-usage-variables.tf
@@ -68,4 +68,4 @@ variable "cpu_usage_priority" {

type = number
default = 3
}
}
5 changes: 3 additions & 2 deletions cpu-usage.tf
@@ -6,7 +6,8 @@ locals {
}

module "cpu_usage" {
source = "git@github.com:kabisa/terraform-datadog-generic-monitor.git?ref=0.7.0"
source = "kabisa/generic-monitor/datadog"
version = "1.0.0"

name = "Container - CPU usage"
query = "avg(${var.cpu_usage_evaluation_period}):avg:docker.cpu.usage{${local.cpu_usage_filter}} by {container_name,host${local.by_cluster}} > ${var.cpu_usage_critical}"
@@ -22,7 +23,7 @@ module "cpu_usage" {
alerting_enabled = var.cpu_usage_alerting_enabled
warning_threshold = var.cpu_usage_warning
critical_threshold = var.cpu_usage_critical
priority = var.cpu_usage_priority
priority = min(var.cpu_usage_priority + var.priority_offset, 5)
docs = var.cpu_usage_docs
note = var.cpu_usage_note
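
The `priority` expression in this hunk adds `var.priority_offset` to the monitor's own priority and caps the result at 5, the lowest priority. The declaration of `priority_offset` is not visible in this part of the diff, so the block below is only a sketch of what it might look like, based on the README table (default `0`, same description). The `example_priority` local is hypothetical and exists only to show the arithmetic.

```terraform
# Hypothetical declaration; the real one is not shown in this diff.
variable "priority_offset" {
  description = "For non production workloads we can +1 on the priorities"
  type        = number
  default     = 0
}

locals {
  # Worked example of the clamp with the default monitor priority of 3:
  #   offset 0 -> min(3 + 0, 5) = 3
  #   offset 1 -> min(3 + 1, 5) = 4
  #   offset 3 -> min(3 + 3, 5) = 5 (capped)
  example_priority = min(3 + var.priority_offset, 5)
}
```

Note that `min` only enforces the upper bound. A negative offset could push the result below 1, so a `validation` block on the variable may be worth considering if negative values are not intended.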

2 changes: 1 addition & 1 deletion disk-io-read-variables.tf
@@ -67,4 +67,4 @@ variable "disk_io_read_priority" {

type = number
default = 3
}
}
5 changes: 3 additions & 2 deletions disk-io-read.tf
@@ -6,7 +6,8 @@ locals {
}

module "disk_io_read" {
source = "git@github.com:kabisa/terraform-datadog-generic-monitor.git?ref=0.7.0"
source = "kabisa/generic-monitor/datadog"
version = "1.0.0"

name = "Container - Disk IO Read"
query = "avg(${var.disk_io_read_evaluation_period}):avg:docker.io.read_bytes{${local.disk_io_read_filter}} by {container_name,host${local.by_cluster}} > ${var.disk_io_read_critical}"
@@ -22,7 +23,7 @@ module "disk_io_read" {
alerting_enabled = var.disk_io_read_alerting_enabled
warning_threshold = var.disk_io_read_warning
critical_threshold = var.disk_io_read_critical
priority = var.disk_io_read_priority
priority = min(var.disk_io_read_priority + var.priority_offset, 5)
docs = var.disk_io_read_docs
note = var.disk_io_read_note

2 changes: 1 addition & 1 deletion disk-io-write-variables.tf
@@ -67,4 +67,4 @@ variable "disk_io_write_priority" {

type = number
default = 3
}
}