142 changes: 142 additions & 0 deletions content/en/docs/operations/services/monitoring.md
@@ -0,0 +1,142 @@
---
title: "Monitoring Hub Reference"
linkTitle: "Monitoring"
---

{{< include "monitoring-overview.md" >}}

## Monitoring Architecture

```mermaid
graph TD
A[VMAgent] --> B[VMCluster]
B --> C[Grafana]
B --> D[Alerta]
E[Fluent Bit] --> F[VLogs]
D --> G[Telegram]
D --> H[Slack]
```

## Alerting Flow

```mermaid
sequenceDiagram
participant P as Prometheus
participant AM as Alertmanager
participant A as Alerta
participant T as Telegram
participant S as Slack
P->>AM: Send Alert
AM->>A: Forward Alert
A->>T: Send Notification
A->>S: Send Notification
```

<!--
Autogenerated content. Don't edit this file directly; edit sources instead.
metadata: https://github.com/cozystack/website/blob/main/content/en/docs/operations/services/_include/monitoring.md
source: https://github.com/cozystack/cozystack/blob/main/packages/extra/monitoring/README.md
-->


## Parameters

### Common parameters

| Name | Description | Type | Value |
| ------ | ----------------------------------------------------------------------------------------------------- | -------- | ----- |
| `host` | The hostname used to access Grafana externally (defaults to 'grafana' subdomain for the tenant host). | `string` | `""` |
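
As a minimal sketch of overriding this parameter in a values file (the hostname below is a placeholder, not a value taken from this page):

```yaml
# Illustrative values fragment.
# grafana.example.org is a hypothetical hostname; leaving host empty keeps
# the default described in the table above.
host: "grafana.example.org"
```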


### Metrics storage configuration

| Name | Description | Type | Value |
| ------------------------------------------------ | ------------------------------------------- | ---------- | ------- |
| `metricsStorages` | Configuration of metrics storage instances. | `[]object` | `[...]` |
| `metricsStorages[i].name` | Name of the storage instance. | `string` | `""` |
| `metricsStorages[i].retentionPeriod` | Retention period for metrics. | `string` | `""` |
| `metricsStorages[i].deduplicationInterval` | Deduplication interval for metrics. | `string` | `""` |
| `metricsStorages[i].storage` | Persistent volume size. | `string` | `10Gi` |
| `metricsStorages[i].storageClassName` | StorageClass used for the data. | `string` | `""` |
| `metricsStorages[i].vminsert` | Configuration for vminsert. | `object` | `{}` |
| `metricsStorages[i].vminsert.minAllowed` | Minimum guaranteed resources. | `object` | `{}` |
| `metricsStorages[i].vminsert.minAllowed.cpu` | CPU request. | `quantity` | `""` |
| `metricsStorages[i].vminsert.minAllowed.memory` | Memory request. | `quantity` | `""` |
| `metricsStorages[i].vminsert.maxAllowed` | Maximum allowed resources. | `object` | `{}` |
| `metricsStorages[i].vminsert.maxAllowed.cpu` | CPU limit. | `quantity` | `""` |
| `metricsStorages[i].vminsert.maxAllowed.memory` | Memory limit. | `quantity` | `""` |
| `metricsStorages[i].vmselect` | Configuration for vmselect. | `object` | `{}` |
| `metricsStorages[i].vmselect.minAllowed` | Minimum guaranteed resources. | `object` | `{}` |
| `metricsStorages[i].vmselect.minAllowed.cpu` | CPU request. | `quantity` | `""` |
| `metricsStorages[i].vmselect.minAllowed.memory` | Memory request. | `quantity` | `""` |
| `metricsStorages[i].vmselect.maxAllowed` | Maximum allowed resources. | `object` | `{}` |
| `metricsStorages[i].vmselect.maxAllowed.cpu` | CPU limit. | `quantity` | `""` |
| `metricsStorages[i].vmselect.maxAllowed.memory` | Memory limit. | `quantity` | `""` |
| `metricsStorages[i].vmstorage` | Configuration for vmstorage. | `object` | `{}` |
| `metricsStorages[i].vmstorage.minAllowed` | Minimum guaranteed resources. | `object` | `{}` |
| `metricsStorages[i].vmstorage.minAllowed.cpu` | CPU request. | `quantity` | `""` |
| `metricsStorages[i].vmstorage.minAllowed.memory` | Memory request. | `quantity` | `""` |
| `metricsStorages[i].vmstorage.maxAllowed` | Maximum allowed resources. | `object` | `{}` |
| `metricsStorages[i].vmstorage.maxAllowed.cpu` | CPU limit. | `quantity` | `""` |
| `metricsStorages[i].vmstorage.maxAllowed.memory` | Memory limit. | `quantity` | `""` |
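
The fragment below sketches how one entry of `metricsStorages` might be filled in using the keys from the table above. The instance name, the duration strings, and the resource figures are illustrative assumptions, not recommended or default values.

```yaml
# Illustrative values fragment; only the key names come from the table above.
metricsStorages:
  - name: example-storage          # hypothetical instance name
    retentionPeriod: "3d"          # assumed duration format
    deduplicationInterval: "15s"   # assumed duration format
    storage: 10Gi                  # default shown in the table
    storageClassName: ""           # default shown in the table
    vmstorage:
      minAllowed:                  # minimum guaranteed resources
        cpu: 500m                  # hypothetical CPU request
        memory: 1Gi                # hypothetical memory request
      maxAllowed:                  # maximum allowed resources
        cpu: "2"                   # hypothetical CPU limit
        memory: 4Gi                # hypothetical memory limit
```

The `vminsert` and `vmselect` blocks accept the same `minAllowed` and `maxAllowed` keys, so they can be set in the same way.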


### Logs storage configuration

| Name | Description | Type | Value |
| ---------------------------------- | ---------------------------------------- | ---------- | ------------ |
| `logsStorages` | Configuration of logs storage instances. | `[]object` | `[...]` |
| `logsStorages[i].name` | Name of the storage instance. | `string` | `""` |
| `logsStorages[i].retentionPeriod` | Retention period for logs. | `string` | `1` |
| `logsStorages[i].storage` | Persistent volume size. | `string` | `10Gi` |
| `logsStorages[i].storageClassName` | StorageClass used to store the data. | `string` | `replicated` |
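
A single `logsStorages` entry could look like the sketch below. The instance name is a hypothetical placeholder; the other values repeat the defaults listed in the table.

```yaml
# Illustrative values fragment.
logsStorages:
  - name: example-logs             # hypothetical instance name
    retentionPeriod: "1"           # default from the table; the unit is not stated on this page
    storage: 10Gi                  # default from the table
    storageClassName: replicated   # default from the table
```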


### Alerta configuration

| Name | Description | Type | Value |
| ----------------------------------------- | --------------------------------------------------------------------- | ---------- | ------- |
| `alerta` | Configuration for the Alerta service. | `object` | `{}` |
| `alerta.storage` | Persistent volume size for the database. | `string` | `10Gi` |
| `alerta.storageClassName` | StorageClass used for the database. | `string` | `""` |
| `alerta.resources` | Resource configuration. | `object` | `{}` |
| `alerta.resources.requests` | Resource requests. | `object` | `{}` |
| `alerta.resources.requests.cpu` | CPU request. | `quantity` | `100m` |
| `alerta.resources.requests.memory` | Memory request. | `quantity` | `256Mi` |
| `alerta.resources.limits` | Resource limits. | `object` | `{}` |
| `alerta.resources.limits.cpu` | CPU limit. | `quantity` | `1` |
| `alerta.resources.limits.memory` | Memory limit. | `quantity` | `1Gi` |
| `alerta.alerts` | Alert routing configuration. | `object` | `{}` |
| `alerta.alerts.telegram` | Configuration for Telegram alerts. | `object` | `{}` |
| `alerta.alerts.telegram.token` | Telegram bot token. | `string` | `""` |
| `alerta.alerts.telegram.chatID` | Telegram chat ID(s), separated by commas. | `string` | `""` |
| `alerta.alerts.telegram.disabledSeverity` | List of severities without alerts (e.g. ["informational","warning"]). | `[]string` | `[]` |
| `alerta.alerts.slack` | Configuration for Slack alerts. | `object` | `{}` |
| `alerta.alerts.slack.url` | Configuration URL for Slack alerts. | `string` | `""` |
| `alerta.alerts.slack.disabledSeverity` | List of severities without alerts (e.g. ["informational","warning"]). | `[]string` | `[]` |
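
The following sketch shows how the Alerta parameters nest in a values file. The token, chat IDs, and Slack URL are placeholders to be replaced with real credentials; the severity lists reuse the example names given in the table.

```yaml
# Illustrative values fragment; all credentials are placeholders.
alerta:
  storage: 10Gi
  storageClassName: ""
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: "1"
      memory: 1Gi
  alerts:
    telegram:
      token: "<telegram-bot-token>"        # placeholder
      chatID: "<chat-id-1>,<chat-id-2>"    # comma-separated, per the table
      disabledSeverity: ["informational", "warning"]
    slack:
      url: "<slack-url>"                   # placeholder
      disabledSeverity: ["informational"]
```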


### Grafana configuration

| Name | Description | Type | Value |
| ----------------------------------- | ---------------------------------------- | ---------- | ------- |
| `grafana` | Configuration for Grafana. | `object` | `{}` |
| `grafana.db` | Database configuration. | `object` | `{}` |
| `grafana.db.size` | Persistent volume size for the database. | `string` | `10Gi` |
| `grafana.resources` | Resource configuration. | `object` | `{}` |
| `grafana.resources.requests` | Resource requests. | `object` | `{}` |
| `grafana.resources.requests.cpu` | CPU request. | `quantity` | `100m` |
| `grafana.resources.requests.memory` | Memory request. | `quantity` | `256Mi` |
| `grafana.resources.limits` | Resource limits. | `object` | `{}` |
| `grafana.resources.limits.cpu` | CPU limit. | `quantity` | `1` |
| `grafana.resources.limits.memory` | Memory limit. | `quantity` | `1Gi` |
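
As a sketch, the Grafana parameters map onto a values file as follows. The figures repeat the defaults from the table and are shown only to illustrate the nesting.

```yaml
# Illustrative values fragment using the defaults listed above.
grafana:
  db:
    size: 10Gi
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: "1"
      memory: 1Gi
```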


### VMAgent configuration

| Name | Description | Type | Value |
| ------------------------ | ---------------------------------------- | -------- | ----- |
| `vmagent` | Configuration for VictoriaMetrics Agent. | `object` | `{}` |
| `vmagent.externalLabels` | External labels applied to all metrics. | `object` | `{}` |
| `vmagent.remoteWrite` | Remote write configuration. | `object` | `{}` |
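
A minimal sketch of the `vmagent` block follows. The label key and value are hypothetical, and `remoteWrite` is left at its documented default because this page does not describe the shape of that object.

```yaml
# Illustrative values fragment.
vmagent:
  externalLabels:
    cluster: example-cluster   # hypothetical label applied to all metrics
  remoteWrite: {}              # default from the table; inner keys are not documented here
```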