Partially Update an Alert Policy
Request
Request line
PATCH /apis/monitoring.coreos.com/v1/namespaces/{namespace}/prometheusrules/{name}
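Partially updates the specified Prometheus alerting rule.
Path parameters
Name | Type | Required | Description |
---|---|---|---|
name | string | Yes | Name of the Prometheus alerting rule. |
namespace | string | Yes | Namespace. Defines the scope for object names and authorization, for example per team or project. |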
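As a minimal sketch of how the two path parameters fill the request line, the snippet below builds the path for one rule. The helper name `prometheus_rule_path` is hypothetical and only illustrates the substitution; the namespace and name are taken from the response example on this page.

```python
from urllib.parse import quote

API_PREFIX = "/apis/monitoring.coreos.com/v1"


def prometheus_rule_path(namespace: str, name: str) -> str:
    """Return the request path for one PrometheusRule (hypothetical helper)."""
    if not namespace or not name:
        raise ValueError("namespace and name are both required")
    # Percent-encode each segment so unexpected characters cannot alter the path.
    return (
        f"{API_PREFIX}/namespaces/{quote(namespace, safe='')}"
        f"/prometheusrules/{quote(name, safe='')}"
    )


print(prometheus_rule_path("cpaas-system", "cpaas-prometheus-rules"))
# /apis/monitoring.coreos.com/v1/namespaces/cpaas-system/prometheusrules/cpaas-prometheus-rules
```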
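Query parameters
Name | Type | Required | Description |
---|---|---|---|
dryRun | string | No | When present, indicates that modifications should not be persisted. An invalid or unrecognized dryRun directive results in an error response, and the request is not processed further. Valid value: All, which processes all dry-run stages. |
fieldManager | string | No | Name associated with the actor or entity that is making these changes. The value must be fewer than 128 characters long and contain only printable characters, as defined by https://golang.org/pkg/unicode/#IsPrint. |
pretty | string | No | If true, the returned result is pretty-printed. |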
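The query parameters above can be assembled as in the following sketch. The function `patch_query` is a hypothetical helper, and Python's `str.isprintable()` is used only as an approximation of Go's `unicode.IsPrint`; the two do not agree on every code point.

```python
from typing import Optional
from urllib.parse import urlencode


def patch_query(dry_run: bool = False,
                field_manager: Optional[str] = None,
                pretty: bool = False) -> str:
    """Build the query string for the PATCH request (hypothetical helper)."""
    params = {}
    if dry_run:
        # "All" is the only valid value listed in the table above.
        params["dryRun"] = "All"
    if field_manager is not None:
        # The table requires fewer than 128 printable characters.
        # isprintable() approximates Go's unicode.IsPrint (assumption).
        if not (0 < len(field_manager) < 128) or not field_manager.isprintable():
            raise ValueError("fieldManager must be 1-127 printable characters")
        params["fieldManager"] = field_manager
    if pretty:
        params["pretty"] = "true"
    return urlencode(params)


print(patch_query(dry_run=True, field_manager="docs-example", pretty=True))
# dryRun=All&fieldManager=docs-example&pretty=true
```

Sending `dryRun=All` first is a cheap way to check that a patch is accepted before persisting it.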
Request body
Content-Type
application/json-patch+json, application/merge-patch+json, application/apply-patch+yaml
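When `application/json-patch+json` is chosen, the body is a list of JSON Patch operations (RFC 6902) rather than a partial object. The sketch below builds one such operation for an annotation; the helper names are hypothetical. Note that annotation keys such as `cpaas.io/operator` contain `/`, which must be escaped as `~1` in the JSON Pointer path (RFC 6901).

```python
import json


def pointer_token(key: str) -> str:
    """Escape one JSON Pointer reference token: '~' -> '~0', then '/' -> '~1'."""
    return key.replace("~", "~0").replace("/", "~1")


def annotation_patch(key: str, value: str) -> list:
    """Return a JSON Patch that sets one annotation (hypothetical helper).

    "add" on an existing object member replaces its value; the
    /metadata/annotations object itself must already exist.
    """
    path = "/metadata/annotations/" + pointer_token(key)
    return [{"op": "add", "path": path, "value": value}]


body = json.dumps(annotation_patch("cpaas.io/operator", "admin@cpaas.io"))
print(body)
# [{"op": "add", "path": "/metadata/annotations/cpaas.io~1operator", "value": "admin@cpaas.io"}]
```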
Request body example
Patch is provided to give a concrete name and type to the Kubernetes PATCH request body.
{
"metadata": {
"annotations": {
"cpaas.io/operator": "admin@cpaas.io"
}
}
}
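Sent as `application/merge-patch+json`, a body like the one above changes only the fields it names. The following is a local sketch of JSON Merge Patch semantics (RFC 7386), useful for predicting the result; the actual merge is performed by the API server, and the `current` object here is made-up illustration data.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7386) and return the merged value."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target wholesale.
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # null in the patch deletes the member.
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


# Illustration data only, not a real object from the cluster.
current = {
    "metadata": {
        "name": "cpaas-prometheus-rules",
        "annotations": {
            "cpaas.io/display-name": "example",
            "cpaas.io/operator": "someone-else",
        },
    }
}
patch = {"metadata": {"annotations": {"cpaas.io/operator": "admin@cpaas.io"}}}

updated = merge_patch(current, patch)
print(updated["metadata"]["annotations"])
# {'cpaas.io/display-name': 'example', 'cpaas.io/operator': 'admin@cpaas.io'}
```

The other annotation and `metadata.name` are left untouched, which is the point of a partial update.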
Response
Content-Type
application/json, application/yaml
Status code: 200
OK
Response body example
Alerting rules of a Prometheus instance.
{
"apiVersion": "monitoring.coreos.com/v1",
"kind": "PrometheusRule",
"metadata": {
"annotations": {
"alert.cpaas.io/alert.legend": "{\"custom-s1uxp-8246a8b710391e713bc3168cdbafe3c3\":\"{{.namespace}}/{{.service}}\"}",
"alert.cpaas.io/alert.status": "{\"custom-fvl73-b39cdec96f4630c3570b88acf5800928\":\"inactive\",\"custom-s1uxp-8246a8b710391e713bc3168cdbafe3c3\":\"firing\",\"pod.cpu.utilization-iz54h-c600647e59aaaaa5e6c431e207459b67\":\"inactive\",\"pod.memory.utilization-ltvdj-dd6049c7f8a00a9e164293947a42744b\":\"inactive\",\"workload.cpu.utilization-b0qt1-12b6e8d32e8a2b97747921995afef25c\":\"inactive\",\"workload.memory.utilization-ug1tl-16eb03e59370f0f4acf5d7c583936f84\":\"inactive\",\"workload.pod.restarted.count-dvmuq-4056dd427e38b3dac6893109c5a9d88c\":\"inactive\",\"workload.replicas.available-tkak6-4eb3f543b0dc15b887ddbf1b42d20d76\":\"inactive\"}",
"alert.cpaas.io/notifications": "[{\"namespace\":\"cpaas-system\",\"name\":\"cpaas-admin-notification\"}]",
"alert.cpaas.io/patch": "",
"alert.cpaas.io/rules.version": "5",
"alert.cpaas.io/silence.config": "{\"startsAt\":\"2021-06-19T07:17:43Z\",\"endsAt\":\"2021-06-24T07:16:18Z\",\"creator\":\"admin@cpaas.io\"}",
"cpaas.io/description": "Alert policy for the Cpaas platform component Prometheus",
"cpaas.io/display-name": "Platform component Prometheus",
"cpaas.io/operator": "admin@cpaas.io",
"cpaas.io/updated-at": "2021-06-18T07:17:58Z"
},
"creationTimestamp": "2021-06-11T05:42:11Z",
"generation": 2,
"labels": {
"alert.cpaas.io/alert.status": "firing",
"alert.cpaas.io/built-in": "true",
"alert.cpaas.io/cluster": "calico",
"alert.cpaas.io/kind": "StatefulSet",
"alert.cpaas.io/name": "prometheus-kube-prometheus-0",
"alert.cpaas.io/namespace": "cpaas-system",
"alert.cpaas.io/owner": "System",
"alert.cpaas.io/project": "system",
"alert.cpaas.io/silence.status": "active",
"alert.cpaas.io/silence.uuid": "6de1ecd1-f151-4219-b80b-89386205fdeb",
"helm.sh/chart-name": "kube-prometheus",
"helm.sh/chart-version": "v3.5.0-feat-add-recordingrules-for-platform-overview.2106110246",
"helm.sh/release-name": "kube-prometheus",
"helm.sh/release-namespace": "cpaas-system",
"prometheus": "kube-prometheus"
},
"managedFields": [
{
"apiVersion": "monitoring.coreos.com/v1",
"fieldsType": "FieldsV1",
"fieldsV1": {
"f:metadata": {
"f:annotations": {
"f:alert.cpaas.io/silence.config": {},
"f:cpaas.io/operator": {},
"f:cpaas.io/updated-at": {}
}
}
},
"manager": "sentry",
"operation": "Update",
"time": "2021-06-18T07:18:05Z"
},
{
"apiVersion": "monitoring.coreos.com/v1",
"fieldsType": "FieldsV1",
"fieldsV1": {
"f:metadata": {
"f:annotations": {
".": {},
"f:alert.cpaas.io/alert.legend": {},
"f:alert.cpaas.io/notifications": {},
"f:cpaas.io/description": {},
"f:cpaas.io/display-name": {}
},
"f:labels": {
".": {},
"f:alert.cpaas.io/built-in": {},
"f:alert.cpaas.io/cluster": {},
"f:alert.cpaas.io/kind": {},
"f:alert.cpaas.io/name": {},
"f:alert.cpaas.io/namespace": {},
"f:alert.cpaas.io/owner": {},
"f:alert.cpaas.io/project": {},
"f:helm.sh/chart-name": {},
"f:helm.sh/chart-version": {},
"f:helm.sh/release-name": {},
"f:helm.sh/release-namespace": {},
"f:prometheus": {}
}
},
"f:spec": {
".": {},
"f:groups": {}
}
},
"manager": "Go-http-client",
"operation": "Update",
"time": "2021-06-23T05:25:06Z"
},
{
"apiVersion": "monitoring.coreos.com/v1",
"fieldsType": "FieldsV1",
"fieldsV1": {
"f:metadata": {
"f:annotations": {
"f:alert.cpaas.io/action": {},
"f:alert.cpaas.io/alert.status": {},
"f:alert.cpaas.io/rules.version": {}
},
"f:labels": {
"f:alert.cpaas.io/alert.status": {},
"f:alert.cpaas.io/silence.status": {},
"f:alert.cpaas.io/silence.uuid": {}
}
}
},
"manager": "manager",
"operation": "Update",
"time": "2021-06-23T05:25:06Z"
}
],
"name": "cpaas-prometheus-rules",
"namespace": "cpaas-system",
"resourceVersion": "14752116",
"selfLink": "/apis/monitoring.coreos.com/v1/namespaces/cpaas-system/prometheusrules/cpaas-prometheus-rules",
"uid": "fe6d2514-c27e-4913-afd1-1cd8adeaae07"
},
"spec": {
"groups": [
{
"name": "general",
"rules": [
{
"alert": "pod.cpu.utilization-iz54h-c600647e59aaaaa5e6c431e207459b67",
"annotations": {
"alert_current_value": "{{$value}}",
"alert_notifications": "[{\"namespace\":\"cpaas-system\",\"name\":\"cpaas-admin-notification\"}]"
},
"expr": " sum by (pod_name) (container_cpu_usage_seconds_total_irate5m{namespace=\"cpaas-system\",pod_name=~\"prometheus-kube-prometheus-0-[0-9]{1,3}\",image!=\"\",container_name!=\"POD\"}) / sum by (pod_name) (container_spec_cpu_quota{namespace=\"cpaas-system\",pod_name=~\"prometheus-kube-prometheus-0-[0-9]{1,3}\",image!=\"\",container_name!=\"POD\"}) * 100000 \u003e0.9",
"for": "60s",
"labels": {
"alert_cluster": "calico",
"alert_indicator": "pod.cpu.utilization",
"alert_indicator_aggregate_range": "0",
"alert_indicator_comparison": "\u003e",
"alert_indicator_query": "",
"alert_indicator_threshold": "0.9",
"alert_involved_object_kind": "StatefulSet",
"alert_involved_object_name": "prometheus-kube-prometheus-0",
"alert_involved_object_namespace": "cpaas-system",
"alert_name": "pod.cpu.utilization-iz54h",
"alert_project": "system",
"alert_resource": "cpaas-prometheus-rules",
"severity": "Medium"
}
},
{
"alert": "pod.memory.utilization-ltvdj-dd6049c7f8a00a9e164293947a42744b",
"annotations": {
"alert_current_value": "{{$value}}",
"alert_notifications": "[{\"namespace\":\"cpaas-system\",\"name\":\"cpaas-admin-notification\"}]"
},
"expr": " sum by(pod_name) (container_memory_usage_bytes_without_cache{namespace=\"cpaas-system\",pod_name=~\"prometheus-kube-prometheus-0-[0-9]{1,3}\"}) / sum by (pod_name) (container_spec_memory_limit_bytes{namespace=\"cpaas-system\",pod_name=~\"prometheus-kube-prometheus-0-[0-9]{1,3}\",image!=\"\",container_name!=\"POD\"}) \u003e0.9",
"for": "60s",
"labels": {
"alert_cluster": "calico",
"alert_indicator": "pod.memory.utilization",
"alert_indicator_aggregate_range": "0",
"alert_indicator_comparison": "\u003e",
"alert_indicator_query": "",
"alert_indicator_threshold": "0.9",
"alert_involved_object_kind": "StatefulSet",
"alert_involved_object_name": "prometheus-kube-prometheus-0",
"alert_involved_object_namespace": "cpaas-system",
"alert_name": "pod.memory.utilization-ltvdj",
"alert_project": "system",
"alert_resource": "cpaas-prometheus-rules",
"severity": "Medium"
}
},
{
"alert": "workload.cpu.utilization-b0qt1-12b6e8d32e8a2b97747921995afef25c",
"annotations": {
"alert_current_value": "{{$value}}",
"alert_notifications": "[{\"namespace\":\"cpaas-system\",\"name\":\"cpaas-admin-notification\"}]"
},
"expr": " sum by (deployment_name) (container_cpu_usage_seconds_total_irate5m{namespace=\"cpaas-system\",pod_name=~\"prometheus-kube-prometheus-0-[0-9]{1,3}\",image!=\"\",container_name!=\"POD\"}) / sum by (deployment_name) (container_spec_cpu_quota{namespace=\"cpaas-system\",pod_name=~\"prometheus-kube-prometheus-0-[0-9]{1,3}\",image!=\"\",container_name!=\"POD\"}) * 100000 \u003e0.9",
"for": "60s",
"labels": {
"alert_cluster": "calico",
"alert_indicator": "workload.cpu.utilization",
"alert_indicator_aggregate_range": "0",
"alert_indicator_comparison": "\u003e",
"alert_indicator_query": "",
"alert_indicator_threshold": "0.9",
"alert_involved_object_kind": "StatefulSet",
"alert_involved_object_name": "prometheus-kube-prometheus-0",
"alert_involved_object_namespace": "cpaas-system",
"alert_name": "workload.cpu.utilization-b0qt1",
"alert_project": "system",
"alert_resource": "cpaas-prometheus-rules",
"severity": "Medium"
}
},
{
"alert": "workload.memory.utilization-ug1tl-16eb03e59370f0f4acf5d7c583936f84",
"annotations": {
"alert_current_value": "{{$value}}",
"alert_notifications": "[{\"namespace\":\"cpaas-system\",\"name\":\"cpaas-admin-notification\"}]"
},
"expr": " sum by (deployment_name) (container_memory_usage_bytes_without_cache{namespace=\"cpaas-system\",pod_name=~\"prometheus-kube-prometheus-0-[0-9]{1,3}\"}) / sum by (deployment_name) (container_spec_memory_limit_bytes{namespace=\"cpaas-system\",pod_name=~\"prometheus-kube-prometheus-0-[0-9]{1,3}\",image!=\"\",container_name!=\"POD\"}) \u003e0.9",
"for": "60s",
"labels": {
"alert_cluster": "calico",
"alert_indicator": "workload.memory.utilization",
"alert_indicator_aggregate_range": "0",
"alert_indicator_comparison": "\u003e",
"alert_indicator_query": "",
"alert_indicator_threshold": "0.9",
"alert_involved_object_kind": "StatefulSet",
"alert_involved_object_name": "prometheus-kube-prometheus-0",
"alert_involved_object_namespace": "cpaas-system",
"alert_name": "workload.memory.utilization-ug1tl",
"alert_project": "system",
"alert_resource": "cpaas-prometheus-rules",
"severity": "Medium"
}
},
{
"alert": "workload.pod.restarted.count-dvmuq-4056dd427e38b3dac6893109c5a9d88c",
"annotations": {
"alert_current_value": "{{$value}}",
"alert_notifications": "[{\"namespace\":\"cpaas-system\",\"name\":\"cpaas-admin-notification\"}]"
},
"expr": "sum (delta(kube_pod_container_status_restarts_total{namespace=\"cpaas-system\",pod=~\"prometheus-kube-prometheus-0-[0-9]{1,3}\"}[5m]))\u003e5",
"for": "60s",
"labels": {
"alert_cluster": "calico",
"alert_indicator": "workload.pod.restarted.count",
"alert_indicator_aggregate_range": "0",
"alert_indicator_comparison": "\u003e",
"alert_indicator_query": "",
"alert_indicator_threshold": "5",
"alert_involved_object_kind": "StatefulSet",
"alert_involved_object_name": "prometheus-kube-prometheus-0",
"alert_involved_object_namespace": "cpaas-system",
"alert_name": "workload.pod.restarted.count-dvmuq",
"alert_project": "system",
"alert_resource": "cpaas-prometheus-rules",
"severity": "Medium"
}
},
{
"alert": "workload.replicas.available-tkak6-4eb3f543b0dc15b887ddbf1b42d20d76",
"annotations": {
"alert_current_value": "{{$value}}",
"alert_notifications": "[{\"namespace\":\"cpaas-system\",\"name\":\"cpaas-admin-notification\"}]"
},
"expr": " min(kube_statefulset_status_replicas_ready{statefulset=\"prometheus-kube-prometheus-0\",namespace=\"cpaas-system\"}) \u003c1",
"for": "60s",
"labels": {
"alert_cluster": "calico",
"alert_indicator": "workload.replicas.available",
"alert_indicator_aggregate_range": "0",
"alert_indicator_comparison": "\u003c",
"alert_indicator_query": "",
"alert_indicator_threshold": "1",
"alert_involved_object_kind": "StatefulSet",
"alert_involved_object_name": "prometheus-kube-prometheus-0",
"alert_involved_object_namespace": "cpaas-system",
"alert_name": "workload.replicas.available-tkak6",
"alert_project": "system",
"alert_resource": "cpaas-prometheus-rules",
"severity": "Medium"
}
},
{
"alert": "custom-s1uxp-8246a8b710391e713bc3168cdbafe3c3",
"annotations": {
"alert_current_value": "{{$value}}",
"alert_notifications": "[{\"namespace\":\"cpaas-system\",\"name\":\"cpaas-admin-notification\"}]",
"display_name": "Availability of the Services that Prometheus scrapes metrics from",
"summary": "The Service ({{ $labels.service }}) that Prometheus scrapes metrics from is unreachable"
},
"expr": "max by (namespace, service) (up{service!=\"kube-prometheus-exporter-dockerd\"})!=1",
"for": "60s",
"labels": {
"alert_cluster": "calico",
"alert_indicator": "custom",
"alert_indicator_aggregate_range": "0",
"alert_indicator_alias": "custom.prometheus.targets.up",
"alert_indicator_comparison": "!=",
"alert_indicator_query": "",
"alert_indicator_threshold": "1",
"alert_indicator_unit": "",
"alert_involved_object_kind": "StatefulSet",
"alert_involved_object_name": "prometheus-kube-prometheus-0",
"alert_involved_object_namespace": "cpaas-system",
"alert_name": "custom-s1uxp",
"alert_project": "system",
"alert_resource": "cpaas-prometheus-rules",
"severity": "Medium"
}
},
{
"alert": "custom-fvl73-b39cdec96f4630c3570b88acf5800928",
"annotations": {
"alert_current_value": "{{$value}}",
"alert_notifications": "[{\"namespace\":\"cpaas-system\",\"name\":\"cpaas-admin-notification\"}]",
"display_name": "Whether the most recent Prometheus configuration reload succeeded"
},
"expr": "max(prometheus_config_last_reload_successful{namespace=\"cpaas-system\",pod=~\"prometheus-kube-prometheus-0-[0-9]{1,3}\"})!=1",
"for": "60s",
"labels": {
"alert_cluster": "calico",
"alert_indicator": "custom",
"alert_indicator_aggregate_range": "0",
"alert_indicator_alias": "custom.prometheus.config.last.reload.successful",
"alert_indicator_comparison": "!=",
"alert_indicator_query": "",
"alert_indicator_threshold": "1",
"alert_indicator_unit": "",
"alert_involved_object_kind": "StatefulSet",
"alert_involved_object_name": "prometheus-kube-prometheus-0",
"alert_involved_object_namespace": "cpaas-system",
"alert_name": "custom-fvl73",
"alert_project": "system",
"alert_resource": "cpaas-prometheus-rules",
"severity": "Medium"
}
}
]
}
]
}
}
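Several annotations in the response above hold JSON encoded as a string, so they need a second decoding step. As a sketch, assuming the `alert.cpaas.io/alert.status` annotation always maps rule names to a state string as in this sample, the firing rules can be listed like this (the function name is hypothetical, and the response is trimmed to the relevant annotation):

```python
import json


def firing_alerts(rule: dict) -> list:
    """Return the names of rules whose recorded status is "firing"."""
    annotations = rule.get("metadata", {}).get("annotations", {})
    raw = annotations.get("alert.cpaas.io/alert.status") or "{}"
    # The annotation value is itself a JSON document stored as a string.
    statuses = json.loads(raw)
    return sorted(name for name, state in statuses.items() if state == "firing")


# Trimmed copy of the response example above.
response = {
    "metadata": {
        "annotations": {
            "alert.cpaas.io/alert.status": json.dumps({
                "custom-fvl73-b39cdec96f4630c3570b88acf5800928": "inactive",
                "custom-s1uxp-8246a8b710391e713bc3168cdbafe3c3": "firing",
            })
        }
    }
}

print(firing_alerts(response))
# ['custom-s1uxp-8246a8b710391e713bc3168cdbafe3c3']
```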
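Response body description
Name | Type | Description |
---|---|---|
apiVersion | string | See the common parameters. |
kind | string | See the common parameters. |
metadata | object | See the common parameters. |
spec | object | PrometheusRuleSpec contains the standard parameters of the rules. Field path: spec |
spec.groups | array | Content of the Prometheus alerting rule file. Field path: spec.groups |
spec.groups[] | object | A list of recording and alerting rules that are evaluated in order. Field path: spec.groups[] |
groups[].name | string | Name of the rule group. Field path: spec.groups[].name |
groups[].rules | array | List of rules. Field path: spec.groups[].rules |
groups[].rules[] | object | Describes one alerting or recording rule. Field path: spec.groups[].rules[] |
rules[].alert | string | Name of the alerting rule. Field path: spec.groups[].rules[].alert |
rules[].annotations | object | Annotations of the alerting rule. An alerting rule must include the following annotations. `alert_current_value`: value of the Prometheus query expression; the fixed value is "{{$value}}". `alert_notifications`: list of notification policies that send notifications when the alert fires; "[]" means no notification is configured, and "[{\"name\":\"xqren\",\"namespace\":\"cpaas-system\"}]" means the alert policy uses one notification policy named xqren in the cpaas-system namespace. Field path: spec.groups[].rules[].annotations |
rules[].for | string | How long the condition must hold before the rule fires. Field path: spec.groups[].rules[].for |
rules[].labels | object | Labels of the alerting rule. An alerting rule must include the following labels. `alert_cluster`: name of the cluster the alerting rule belongs to. `alert_project`: name of the project the alerting rule belongs to. `alert_namespace`: name of the namespace the alerting rule belongs to. `alert_resource`: name of the alert policy. `alert_name`: name of the alerting rule. `alert_involved_object_kind`: type of the alerting resource, one of Cluster, Node, Workload. `alert_involved_object_namespace`: name of the namespace the alerting resource belongs to. `alert_involved_object_name`: name of the alerting resource. `alert_indicator`: name of the metric the rule queries; "custom" denotes a custom metric. `alert_indicator_comparison`: comparison operator of the rule, for example >. `alert_indicator_threshold`: threshold of the rule, for example 0.5. `alert_indicator_unit`: unit of the metric, for example %. `severity`: severity level of the alert, one of Critical, High, Medium, Low. Field path: spec.groups[].rules[].labels |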
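Before sending a patch that touches `spec.groups[].rules[]`, it can help to check each rule against the label and annotation conventions in the table above. The following is a client-side sketch only: the helper names are hypothetical, the label list is copied from the table, and whether the server rejects rules that lack a label is not stated on this page.

```python
import json

# Labels the table above lists as required, in table order.
REQUIRED_LABELS = (
    "alert_cluster", "alert_project", "alert_namespace", "alert_resource",
    "alert_name", "alert_involved_object_kind",
    "alert_involved_object_namespace", "alert_involved_object_name",
    "alert_indicator", "alert_indicator_comparison",
    "alert_indicator_threshold", "alert_indicator_unit", "severity",
)


def missing_labels(rule: dict) -> list:
    """Return the listed labels that the rule does not carry."""
    labels = rule.get("labels", {})
    return [name for name in REQUIRED_LABELS if name not in labels]


def notification_targets(rule: dict) -> list:
    """Decode alert_notifications into (namespace, name) pairs."""
    raw = rule.get("annotations", {}).get("alert_notifications") or "[]"
    return [(item["namespace"], item["name"]) for item in json.loads(raw)]


# Illustration rule assembled from values in the response example.
rule = {
    "alert": "pod.cpu.utilization-iz54h-c600647e59aaaaa5e6c431e207459b67",
    "annotations": {
        "alert_current_value": "{{$value}}",
        "alert_notifications":
            '[{"namespace":"cpaas-system","name":"cpaas-admin-notification"}]',
    },
    "labels": {
        "alert_cluster": "calico",
        "alert_project": "system",
        "alert_resource": "cpaas-prometheus-rules",
        "alert_name": "pod.cpu.utilization-iz54h",
        "alert_involved_object_kind": "StatefulSet",
        "alert_involved_object_namespace": "cpaas-system",
        "alert_involved_object_name": "prometheus-kube-prometheus-0",
        "alert_indicator": "pod.cpu.utilization",
        "alert_indicator_comparison": ">",
        "alert_indicator_threshold": "0.9",
        "severity": "Medium",
    },
}

print(missing_labels(rule))        # ['alert_namespace', 'alert_indicator_unit']
print(notification_targets(rule))  # [('cpaas-system', 'cpaas-admin-notification')]
```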