使用大卫的k8s监控面板(k8s+prometheus+grafana)

问题

书接上回,对EKS(AWS云k8s)启用AMP(AWS云Prometheus)监控+AMG(AWS云 grafana),上次我们只是配通了EKS+AMP+AMG的监控路径。这次使用一位大卫老师的grafana的面板,具体地址如下:
https://grafana.com/grafana/dashboards/15757-kubernetes-views-global/

安装kube-state-metrics

为了想Prometheus暴露一些有用的性能指标,需要在k8s集群中,安装kube-state-metrics。

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install kube-state-metrics prometheus-community/kube-state-metrics -n kube-system

测试验证:

kubectl port-forward svc/kube-state-metrics -n kube-system 8080:8080

使用PromQL测试:

count(kube_pod_status_ready{condition="false"}) by (namespace, pod)

prometheus配置

scrape_configs:
- job_name: kube-state-metrics
  honor_timestamps: true
  scrape_interval: 1m
  scrape_timeout: 1m
  metrics_path: /metrics
  scheme: http
  static_configs:
  - targets:
    - kube-state-metrics.kube-system.svc.cluster.local:8080

安装 prometheus-node-exporter

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus-node-exporter prometheus-community/prometheus-node-exporter -n kube-system

测试:

export POD_NAME=$(kubectl get pods --namespace kube-system -l "app.kubernetes.io/name=prometheus-node-exporter,app.kubernetes.io/instance=prometheus-node-exporter" -o jsonpath="{.items[0].metadata.name}")
kubectl port-forward --namespace kube-system $POD_NAME 9100

prometheus配置

scrape_configs:
- job_name: 'node-exporter'
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: replace
    source_labels: [__address__]
    regex: '(.*):10250'
    replacement: '${1}:9100'
    target_label: __address__

整体prometheus配置

global:
  scrape_interval: 30s
  # external_labels:
    # clusterArn: <REPLACE_ME>
scrape_configs:
  # pod metrics
  - job_name: pod_exporter
    kubernetes_sd_configs:
      - role: pod
  # container metrics
  - job_name: cadvisor
    scheme: https
    authorization:
      credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - replacement: kubernetes.default.svc:443
        target_label: __address__
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/$1/proxy/metrics/cadvisor
  # apiserver metrics
  - bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    job_name: kubernetes-apiservers
    kubernetes_sd_configs:
    - role: endpoints
    relabel_configs:
    - action: keep
      regex: default;kubernetes;https
      source_labels:
      - __meta_kubernetes_namespace
      - __meta_kubernetes_service_name
      - __meta_kubernetes_endpoint_port_name
    scheme: https
  # kube proxy metrics
  - job_name: kube-proxy
    honor_labels: true
    kubernetes_sd_configs:
    - role: pod
    relabel_configs:
    - action: keep
      source_labels:
      - __meta_kubernetes_namespace
      - __meta_kubernetes_pod_name
      separator: '/'
      regex: 'kube-system/kube-proxy.+'
    - source_labels:
      - __address__
      action: replace
      target_label: __address__
      regex: (.+?)(\\:\\d+)?
      replacement: $1:10249
  # kube-state-metrics
  - job_name: kube-state-metrics
    honor_timestamps: true
    scrape_interval: 1m
    scrape_timeout: 1m
    metrics_path: /metrics
    scheme: http
    static_configs:
    - targets:
      - kube-state-metrics.kube-system.svc.cluster.local:8080
  # node-exporter
  - job_name: 'node-exporter'
    kubernetes_sd_configs:
    - role: node
    relabel_configs:
    - action: replace
      source_labels: [__address__]
      regex: '(.*):10250'
      replacement: '${1}:9100'
      target_label: __address__

这里需要重新创建一个抓取程序。

效果

全局监控效果

参考

相关推荐

  1. 常规k8s监控指标

    2024-04-23 02:50:02       37 阅读
  2. k8s部署管理以及prometheus相关监控

    2024-04-23 02:50:02       64 阅读

最近更新

  1. docker php8.1+nginx base 镜像 dockerfile 配置

    2024-04-23 02:50:02       94 阅读
  2. Could not load dynamic library ‘cudart64_100.dll‘

    2024-04-23 02:50:02       100 阅读
  3. 在Django里面运行非项目文件

    2024-04-23 02:50:02       82 阅读
  4. Python语言-面向对象

    2024-04-23 02:50:02       91 阅读

热门阅读

  1. C语言C++面试题 (包答案)

    2024-04-23 02:50:02       76 阅读
  2. 信息收集的方式与工具

    2024-04-23 02:50:02       104 阅读
  3. rst文件是什么?如何阅读rst文件

    2024-04-23 02:50:02       97 阅读
  4. 第十四届蓝桥杯 子串简写 | 树状数组解法

    2024-04-23 02:50:02       30 阅读
  5. AI时代,智能体成下一个爆点

    2024-04-23 02:50:02       142 阅读
  6. 网络通信 mac表 tcp连接

    2024-04-23 02:50:02       184 阅读
  7. Elasticsearch克隆索引

    2024-04-23 02:50:02       41 阅读
  8. 什么是掩码补丁位置?

    2024-04-23 02:50:02       54 阅读
  9. 图标字体库——Font Awesome

    2024-04-23 02:50:02       37 阅读
  10. MySQL用户和权限管理深入指南

    2024-04-23 02:50:02       30 阅读