Prometheus
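1. Why monitoring matters
Monitoring means watching our servers and services. When something goes wrong, the system alerts us so that operations staff can respond promptly and keep the company's losses to a minimum.
Monitoring matters in every company. Behind the scenes it also collects and analyzes data, which makes it possible to anticipate some problems and handle them early.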
Official site: https://prometheus.io/
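2. Prometheus architecture
1. TSDB (time series database): stores the collected samples on local disk, either HDD (hard disk drive, mechanical) or SSD (solid state drive).
PromQL: Prometheus's own query language, used to query the data in the TSDB. It is read-only and is not SQL, so there are no SELECT or INSERT statements.
2. HTTP server: the web service.
3. Pushgateway: an intermediary that temporarily holds metrics pushed to it, so that the server can scrape them later.
4. Alertmanager: the alerting software.
5. Exporter: collects the data. It is an agent installed on the monitored machine, which exposes the metrics over HTTP.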
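As a rough illustration of how PromQL is used, the sketch below builds the URL for an instant query against the Prometheus HTTP API (/api/v1/query). The server address is the one used later in this guide. The helper name build_query_url and the example expressions are illustrative choices, not part of the original setup.

```python
from urllib.parse import urlencode


def build_query_url(base_url: str, promql: str) -> str:
    """Return the instant-query URL for a PromQL expression.

    base_url is the Prometheus server address, for example
    http://192.168.182.136:9090 (adjust it to your own server).
    """
    # urlencode escapes characters such as { } " and = in the expression.
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    # "up" is 1 for every target whose last scrape succeeded, 0 otherwise.
    print(build_query_url("http://192.168.182.136:9090", "up"))
    # Restrict the query to one job by adding a label matcher.
    print(build_query_url("http://192.168.182.136:9090", 'up{job="master"}'))
```

Opening the printed URL in a browser, or fetching it with any HTTP client, returns the result as JSON.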
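Core components of Prometheus:
Prometheus Server: contains the TSDB, the HTTP server, and retrieval (the part that scrapes targets)
Alertmanager: the alerting software
Pushgateway: the intermediary for pushed metrics
Exporter: collects the data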
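An exporter publishes its data as plain text over HTTP, one sample per line, and the Prometheus server scrapes that page. The sketch below parses a small, made-up excerpt of such a page. It assumes that lines carry no trailing timestamp and it skips the HELP/TYPE comment lines. The function parse_metrics and the sample values are hypothetical and for illustration only.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse exposition-format text into {series: value}.

    Simplification: assumes no optional timestamp after the value.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last whitespace-separated field; everything
        # before it is the metric name plus its optional {label="..."} set.
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples


# Made-up excerpt in the shape node_exporter serves under /metrics.
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.25
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
"""

if __name__ == "__main__":
    for series, value in parse_metrics(SAMPLE).items():
        print(series, value)
```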
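3. Installing Prometheus
Step 1: install the Prometheus server
Install from the release tarball (the linux-amd64 package contains prebuilt binaries, so nothing needs compiling)
1. Upload the downloaded tarball to the Linux server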
[root@nfs-server ~]# mkdir /prom
[root@nfs-server ~]# cd /prom
[root@nfs-server prom]# ls
[root@nfs-server prom]# tar xf prometheus-2.51.0.linux-amd64.tar.gz
[root@nfs-server prom]# ls
node_exporter-1.7.0.linux-amd64.tar.gz prometheus-2.51.0.linux-amd64.tar.gz
prometheus-2.51.0.linux-amd64
[root@nfs-server prom]# mv prometheus-2.51.0.linux-amd64 prometheus
[root@nfs-server prom]#
2. Add the prometheus directory to PATH, both for the current shell and permanently
[root@nfs-server prom]# cd prometheus
[root@nfs-server prometheus]# PATH=/prom/prometheus:$PATH
[root@nfs-server prometheus]# vim /etc/profile
PATH=/prom/prometheus:$PATH   # add this as the last line
3. Run the prometheus program
[root@nfs-server prometheus]# nohup prometheus --config.file=/prom/prometheus/prometheus.yml &
[1] 2954
[root@nfs-server prometheus]# nohup: ignoring input and appending output to 'nohup.out'
[root@nfs-server prometheus]# ps aux|grep prome
root 2954 0.8 2.2 1322356 42344 pts/0 Sl 17:16 0:00 prometheus --config.file=/prom/prometheus/prometheus.yml
root 2965 0.0 0.0 112824 976 pts/0 S+ 17:16 0:00 grep --color=auto prome
[root@nfs-server prometheus]#
[root@nfs-server prometheus]# netstat -antplu|grep prome
tcp6 0 0 :::9090 :::* LISTEN 2954/prometheus
tcp6 0 0 ::1:9090 ::1:56866 ESTABLISHED 2954/prometheus
tcp6 0 0 ::1:56866 ::1:9090 ESTABLISHED 2954/prometheus
[root@nfs-server prometheus]#
Prometheus listens on port 9090.
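One quick way to confirm from another machine that the server is listening is to attempt a TCP connection to port 9090. The following is a minimal sketch. The helper port_is_open is hypothetical, and the host address is the server address that appears later in this guide, so replace it with your own.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed address of the Prometheus server.
    print(port_is_open("192.168.182.136", 9090))
```

A False result with the process running usually points at a firewall rule or a wrong address rather than at Prometheus itself.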
Running Prometheus as a systemd service makes it much easier to manage.
[root@prometheus prometheus]# vim /usr/lib/systemd/system/prometheus.service
[Unit]
Description=prometheus
[Service]
ExecStart=/prom/prometheus/prometheus --config.file=/prom/prometheus/prometheus.yml
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
[Install]
WantedBy=multi-user.target
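Run systemctl daemon-reload so that systemd picks up the new unit file.
Prometheus is reachable on port 9090.
Note:
Because Prometheus was first started with nohup, that first process still has to be stopped with kill.
After that, Prometheus can be managed as a service.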
[root@nfs-server prometheus]# service prometheus stop
Redirecting to /bin/systemctl stop prometheus.service
[root@nfs-server prometheus]# ps aux|grep prome
root 2954 0.0 2.8 1322612 53800 pts/0 Sl 17:16 0:00 prometheus --config.file=/prom/prometheus/prometheus.yml
root 3100 0.0 0.0 112824 972 pts/0 S+ 17:26 0:00 grep --color=auto prome
[root@nfs-server prometheus]# kill -9 2954
[root@nfs-server prometheus]# ps aux|grep prome
root 3102 0.0 0.0 112824 976 pts/0 S+ 17:26 0:00 grep --color=auto prome
[1]+ Killed nohup prometheus --config.file=/prom/prometheus/prometheus.yml
[root@nfs-server prometheus]#
Start Prometheus again, this time as a service
[root@nfs-server prometheus]# service prometheus start
Redirecting to /bin/systemctl start prometheus.service
[root@nfs-server prometheus]#
Enable Prometheus at boot
[root@nfs-server prometheus]# systemctl enable prometheus
Created symlink from /etc/systemd/system/multi-user.target.wants/prometheus.service to /usr/lib/systemd/system/prometheus.service.
[root@nfs-server prometheus]#
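Step 3: install the exporter on the node servers (any Linux machine, for example master or node-1)
The exporter is Prometheus's client-side data collection tool, written in Go.
1. Download node_exporter-1.7.0.linux-amd64.tar.gz and upload it to the node server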
[root@master ~]# mkdir /exporter
[root@master ~]# cd /exporter/
[root@master exporter]# ls
node_exporter-1.7.0.linux-amd64.tar.gz
[root@master exporter]#
2. Extract it
[root@master exporter]# ls
node_exporter-1.7.0.linux-amd64.tar.gz
[root@master exporter]# tar xf node_exporter-1.7.0.linux-amd64.tar.gz
[root@master exporter]# mv node_exporter-1.7.0.linux-amd64 node_exporter
[root@master exporter]# mv node_exporter /node_exporter
[root@master exporter]# cd /node_exporter/
[root@master node_exporter]#
3. Modify the PATH variable
[root@master node_exporter]# PATH=/node_exporter/:$PATH
[root@master node_exporter]# ls
LICENSE node_exporter NOTICE
[root@master node_exporter]# vim /etc/bashrc
Append this at the end of the file:
PATH=/node_exporter/:$PATH
4. Run the node exporter agent
[root@master node_exporter]# nohup node_exporter --web.listen-address 0.0.0.0:8090 &
[1] 35088
[root@master node_exporter]# nohup: ignoring input and appending output to 'nohup.out'
[root@master node_exporter]# ps aux|grep node_exporter
root 35088 0.0 0.3 1240220 6428 pts/0 Sl 17:42 0:00 node_exporter --web.listen-address 0.0.0.0:8090
root 35203 0.0 0.0 112824 984 pts/0 S+ 17:42 0:00 grep --color=auto node_exporter
[root@master node_exporter]#
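The port number is your choice, as long as it does not conflict with another service (without this flag, node_exporter listens on 9100).
5. Start node_exporter at boot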
[root@master node_exporter]# vim /etc/rc.local
[root@master node_exporter]# cat /etc/rc.local
#!/bin/bash
# THIS FILE IS ADDED FOR COMPATIBILITY PURPOSES
#
# It is highly advisable to create own systemd services or udev rules
# to run scripts during boot instead of using this file.
#
# In contrast to previous versions due to parallel execution during boot
# this script will NOT be run after all other services.
#
# Please note that you must run 'chmod +x /etc/rc.d/rc.local' to ensure
# that this script will be executed during boot.
touch /var/lock/subsys/local
nohup /node_exporter/node_exporter --web.listen-address 0.0.0.0:8090 &
[root@master node_exporter]#
[root@master node_exporter]# chmod +x /etc/rc.d/rc.local
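The comment block in /etc/rc.local itself advises using a systemd service instead. As an alternative sketch, the snippet below renders a unit file for node_exporter that mirrors the prometheus.service file shown earlier. The helper render_unit is hypothetical, and the script only prints the text without writing any file. The location /usr/lib/systemd/system/node_exporter.service is an assumption modeled on the Prometheus unit.

```python
def render_unit(description: str, exec_start: str) -> str:
    """Return the text of a simple systemd service unit."""
    return "\n".join([
        "[Unit]",
        f"Description={description}",
        "[Service]",
        f"ExecStart={exec_start}",
        "Restart=on-failure",
        "[Install]",
        "WantedBy=multi-user.target",
        "",  # makes the file end with a newline
    ])


if __name__ == "__main__":
    # Same binary path and listen address as the nohup command above.
    print(render_unit(
        "node_exporter",
        "/node_exporter/node_exporter --web.listen-address 0.0.0.0:8090",
    ))
```

If the output is saved under the assumed path, systemctl daemon-reload followed by systemctl enable --now node_exporter would start it and enable it at boot, and the rc.local line would no longer be needed.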
Step 4: configure the Prometheus server
On the server, list the machines where an exporter is installed, so that Prometheus knows where to pull data from.
[root@nfs-server prometheus]# ls
console_libraries data nohup.out prometheus promtool
consoles LICENSE NOTICE prometheus.yml
[root@nfs-server prometheus]# vim prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "master"
    static_configs:
      - targets: ["192.168.182.133:8090"]
[root@nfs-server prometheus]#
Restart the service
[root@nfs-server prometheus]# service prometheus restart
Redirecting to /bin/systemctl restart prometheus.service
[root@nfs-server prometheus]#
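After the restart, the new job should show up on the server's Targets page. The same information is available from the HTTP API at /api/v1/targets. The sketch below picks out the targets whose health is not "up" from an already-decoded response. The sample payload is a trimmed, made-up example of that response shape, and unhealthy_targets is a hypothetical helper.

```python
def unhealthy_targets(payload: dict) -> list[str]:
    """Return scrape URLs of active targets whose health is not 'up'."""
    active = payload.get("data", {}).get("activeTargets", [])
    return [t["scrapeUrl"] for t in active if t.get("health") != "up"]


# Trimmed, illustrative response; a real one carries more fields per target.
SAMPLE_RESPONSE = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"scrapeUrl": "http://localhost:9090/metrics", "health": "up"},
            {"scrapeUrl": "http://192.168.182.133:8090/metrics", "health": "down"},
        ]
    },
}

if __name__ == "__main__":
    print(unhealthy_targets(SAMPLE_RESPONSE))
```

A target that stays "down" usually means the exporter is not running, the port in prometheus.yml is wrong, or a firewall blocks the connection.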
4. Graphing with Grafana
Grafana is installed on the same server as the Prometheus server.
Step 1: download grafana-enterprise-10.2.3-1.x86_64.rpm from the official site
[root@nfs-server ~]# ls
anaconda-ks.cfg grafana-enterprise-10.2.3-1.x86_64.rpm
back_log_pwd_shadow.sh hpa-example.tar
[root@nfs-server ~]# yum install grafana-enterprise-10.2.3-1.x86_64.rpm -y
Enable Grafana at boot
[root@nfs-server ~]# systemctl enable grafana-server
Created symlink from /etc/systemd/system/multi-user.target.wants/grafana-server.service to /usr/lib/systemd/system/grafana-server.service.
[root@nfs-server ~]#
Check that the process is running. One catch: enabling the service does not start it, so start it first.
sudo systemctl start grafana-server
[root@nfs-server ~]# ps aux|grep grafana
grafana 3442 28.5 5.5 1475704 102608 ? Ssl 18:14 0:01 /usr/share/grafana/bin/grafana server --config=/etc/grafana/grafana.ini --pidfile=/var/run/grafana/grafan-server.pid --packaging=rpm cfg:default.paths.logs=/var/log/grafana cfg:default.paths.data=/var/lib/grafana cfg:default.paths.plugins=/var/lib/grafana/plugins cfg:default.paths.provisioning=/etc/grafana/provisioning
root 3454 0.0 0.0 112824 972 pts/0 S+ 18:14 0:00 grep --color=auto grafana
[root@nfs-server ~]# netstat -antplu|grep grafana
tcp 0 0 192.168.182.136:36376 34.120.177.193:443 ESTABLISHED 3442/grafana
tcp 0 0 192.168.182.136:36378 34.120.177.193:443 ESTABLISHED 3442/grafana
tcp6 0 0 :::3000 :::* LISTEN 3442/grafana
[root@nfs-server ~]#
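The interface language can be switched to Chinese
in the profile settings at the top-right corner.
Configure the Prometheus data source
Administration -> Data sources -> Add new data source -> Prometheus
Enter the data source URL:
http://192.168.182.136:9090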
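The same data source can also be created without the UI, through Grafana's HTTP API (POST /api/datasources). The sketch below only builds the JSON body. Sending it requires admin credentials or a token, which this guide does not cover. The data source name "Prometheus" and the helper datasource_payload are illustrative assumptions.

```python
import json


def datasource_payload(name: str, url: str) -> str:
    """Return the JSON body describing a Prometheus data source."""
    body = {
        "name": name,
        "type": "prometheus",
        "url": url,
        # "proxy" means the Grafana backend sends the queries to Prometheus.
        "access": "proxy",
        "isDefault": True,
    }
    return json.dumps(body, indent=2)


if __name__ == "__main__":
    print(datasource_payload("Prometheus", "http://192.168.182.136:9090"))
```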
Import Grafana dashboard templates
These two dashboard IDs work well and are recommended:
1860
8919 --> recommended here because its text is in Chinese
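5. Quick notes
5.1 System load
The load average reflects the number of processes running or waiting to run, averaged over the last 1, 5, and 15 minutes.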
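The three load figures can be read from /proc/loadavg on Linux. The sketch below parses a line in that format and normalizes the load by CPU count, since a load of 2.0 means something different on 2 cores than on 16. The function names are hypothetical, and the sample line used in testing is made up.

```python
import os


def parse_loadavg(text: str) -> tuple[float, float, float]:
    """Parse the first three fields of /proc/loadavg (1, 5, 15 minute load)."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def load_per_cpu(load: float, cpus: int) -> float:
    """Normalize a load value by CPU count; above 1.0 suggests saturation."""
    return load / cpus


if __name__ == "__main__":
    # os.getloadavg() returns the same three numbers on Unix-like systems.
    one, five, fifteen = os.getloadavg()
    cpus = os.cpu_count() or 1
    print(f"1m={one:.2f} 5m={five:.2f} 15m={fifteen:.2f} "
          f"per-cpu(1m)={load_per_cpu(one, cpus):.2f}")
```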
5.2 Routine monitoring items
- CPU usage
- Memory usage
- Disk I/O read/write throughput and utilization
- Network bandwidth
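Usage percentages such as these are derived from raw values rather than collected directly. node_exporter exposes CPU time as the counter node_cpu_seconds_total (one series per mode, including mode="idle") and memory as node_memory_MemAvailable_bytes and node_memory_MemTotal_bytes. The sketch below shows only the arithmetic, with made-up numbers, and the function names are hypothetical.

```python
def cpu_usage_percent(idle_delta: float, total_delta: float) -> float:
    """CPU usage over an interval, from the growth of idle and total CPU seconds."""
    if total_delta <= 0:
        raise ValueError("total_delta must be positive")
    return 100.0 * (1.0 - idle_delta / total_delta)


def memory_usage_percent(available_bytes: float, total_bytes: float) -> float:
    """Memory usage derived from available and total bytes."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * (1.0 - available_bytes / total_bytes)


if __name__ == "__main__":
    # Made-up numbers: over 60 s on one core, 45 s were spent idle.
    print(cpu_usage_percent(idle_delta=45.0, total_delta=60.0))
    # Made-up numbers: 1 GiB available out of 4 GiB.
    print(memory_usage_percent(1 * 2**30, 4 * 2**30))
```

In PromQL the CPU figure is commonly written as 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100), which is the same calculation applied to the scraped counters.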
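5.3 Monitoring a specific process or piece of software
- nginx
- mysql
- kafka
Each of these needs its own dedicated exporter, downloaded from the official site.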
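5.4 Directions for studying Prometheus further
- Exporters for specific software such as nginx and mysql
- PromQL, to get to know more of the Prometheus metrics
- Prometheus data types (the metric types: counter, gauge, histogram, and summary)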
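As a starting point for the last two items: a counter is a value that only increases and drops back to zero when the monitored process restarts, and PromQL's rate() turns it into a per-second rate. The sketch below is a deliberately simplified stand-in for that idea and is not how Prometheus implements it. In particular, it does not extrapolate to the edges of the time window as the real rate() does. The function name and the sample values are made up.

```python
def simple_rate(samples: list[tuple[float, float]]) -> float:
    """Per-second increase of a counter over (timestamp, value) samples.

    Simplification: no extrapolation to the window edges, unlike PromQL rate().
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter restarted from zero, so the whole
        # current value counts as new increase.
        increase += cur - prev if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


if __name__ == "__main__":
    # Scrapes every 15 s; the counter resets between the 2nd and 3rd sample.
    print(simple_rate([(0, 10), (15, 25), (30, 5), (45, 20)]))
```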