(3) Kafka Monitoring: Streams Monitoring and Other

Contents

1. Preface

2. Kafka Streams Monitoring

2.7. RocksDB Metrics

2.8. Record Cache Metrics

3. Other


1. Preface

    This article continues from the previous one, "(2) Kafka Monitoring: Streams Monitoring", and picks up at section 2.7.

2. Kafka Streams Monitoring

2.7. RocksDB Metrics

Original text: RocksDB metrics are grouped into statistics-based metrics and properties-based metrics. The former are recorded from statistics that a RocksDB state store collects whereas the latter are recorded from properties that RocksDB exposes. Statistics collected by RocksDB provide cumulative measurements over time, e.g. bytes written to the state store. Properties exposed by RocksDB provide current measurements, e.g., the amount of memory currently used. Note that the store-scope for built-in RocksDB state stores are currently the following:

  • rocksdb-state (for RocksDB backed key-value store)
  • rocksdb-window-state (for RocksDB backed window store)
  • rocksdb-session-state (for RocksDB backed session store)

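    In short, RocksDB metrics fall into two groups: statistics-based metrics, recorded from the statistics that a RocksDB state store collects, and properties-based metrics, recorded from the properties that RocksDB exposes. Statistics are cumulative over time (for example, the number of bytes written to the state store), whereas properties are point-in-time readings (for example, the amount of memory currently in use). The store-scope of a built-in RocksDB state store is currently one of the three values listed above: rocksdb-state for key-value stores, rocksdb-window-state for window stores, and rocksdb-session-state for session stores.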
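The difference between cumulative and current measurements matters when you post-process the numbers yourself. The following is a minimal sketch, not part of Kafka: it assumes you have already scraped two readings of a cumulative metric (such as bytes-written-total) by some means, and derives an average per-second rate from them. The `Sample` type, the `rate_per_second` helper, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One reading of a metric: a timestamp in seconds and the observed value."""
    timestamp_s: float
    value: float


def rate_per_second(earlier: Sample, later: Sample) -> float:
    """Average per-second rate between two readings of a cumulative metric."""
    elapsed = later.timestamp_s - earlier.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = later.value - earlier.value
    if delta < 0:
        # A cumulative value that shrinks usually means the counter was reset,
        # for example after a restart; this sketch refuses to guess a rate.
        raise ValueError("cumulative metric decreased between samples")
    return delta / elapsed


# Hypothetical readings taken 60 seconds apart.
first = Sample(timestamp_s=0.0, value=1_000_000.0)
second = Sample(timestamp_s=60.0, value=1_600_000.0)
print(rate_per_second(first, second))  # 10000.0
```

A current measurement, such as the amount of memory in use, needs no such step: each reading is already the value of interest.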

Original text: RocksDB Statistics-based Metrics: All of the following statistics-based metrics have a recording level of debug because collecting statistics in RocksDB may have an impact on performance. Statistics-based metrics are collected every minute from the RocksDB state stores. If a state store consists of multiple RocksDB instances, as is the case for WindowStores and SessionStores, each metric reports an aggregation over the RocksDB instances of the state store.

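    RocksDB statistics-based metrics: every metric in the table below has a recording level of debug, because collecting statistics in RocksDB can affect performance. The metrics are collected from the RocksDB state stores once per minute. When a state store consists of several RocksDB instances, as WindowStores and SessionStores do, each metric reports an aggregate over the store's RocksDB instances.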

MBean name (identical for every metric in this table): kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)

| METRIC/ATTRIBUTE NAME | DESCRIPTION |
| --- | --- |
| bytes-written-rate | The average number of bytes written per second to the RocksDB state store. |
| bytes-written-total | The total number of bytes written to the RocksDB state store. |
| bytes-read-rate | The average number of bytes read per second from the RocksDB state store. |
| bytes-read-total | The total number of bytes read from the RocksDB state store. |
| memtable-bytes-flushed-rate | The average number of bytes flushed per second from the memtable to disk. |
| memtable-bytes-flushed-total | The total number of bytes flushed from the memtable to disk. |
| memtable-hit-ratio | The ratio of memtable hits relative to all lookups to the memtable. |
| memtable-flush-time-avg | The average duration of memtable flushes to disc in ms. |
| memtable-flush-time-min | The minimum duration of memtable flushes to disc in ms. |
| memtable-flush-time-max | The maximum duration of memtable flushes to disc in ms. |
| block-cache-data-hit-ratio | The ratio of block cache hits for data blocks relative to all lookups for data blocks to the block cache. |
| block-cache-index-hit-ratio | The ratio of block cache hits for index blocks relative to all lookups for index blocks to the block cache. |
| block-cache-filter-hit-ratio | The ratio of block cache hits for filter blocks relative to all lookups for filter blocks to the block cache. |
| write-stall-duration-avg | The average duration of write stalls in ms. |
| write-stall-duration-total | The total duration of write stalls in ms. |
| bytes-read-compaction-rate | The average number of bytes read per second during compaction. |
| bytes-written-compaction-rate | The average number of bytes written per second during compaction. |
| compaction-time-avg | The average duration of disc compactions in ms. |
| compaction-time-min | The minimum duration of disc compactions in ms. |
| compaction-time-max | The maximum duration of disc compactions in ms. |
| number-open-files | The number of current open files. |
| number-file-errors-total | The total number of file errors occurred. |
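When these metrics are exported through JMX, the store-scope and the store name are encoded in the MBean name, so a monitoring script usually has to parse that name first. The following is a minimal sketch of such parsing, built from the MBean name pattern and the three built-in store-scopes given in this section. The helper name `parse_state_store_mbean`, the group names, and the sample MBean name are hypothetical.

```python
import re
from typing import Dict, Optional

# The three built-in store-scopes listed in section 2.7.
STORE_SCOPES = ("rocksdb-state", "rocksdb-window-state", "rocksdb-session-state")

# Mirrors the documented pattern; [store-scope] is expanded to the known scopes.
MBEAN_PATTERN = re.compile(
    r"kafka\.streams:type=stream-state-metrics,"
    r"thread-id=(?P<thread_id>[-.\w]+),"
    r"task-id=(?P<task_id>[-.\w]+),"
    r"(?P<store_scope>" + "|".join(STORE_SCOPES) + r")-id=(?P<store_id>[-.\w]+)"
)


def parse_state_store_mbean(name: str) -> Optional[Dict[str, str]]:
    """Split a state-store MBean name into its tags, or return None if it does not match."""
    match = MBEAN_PATTERN.fullmatch(name)
    return match.groupdict() if match else None


# Hypothetical MBean name for a window store called "counts-store".
sample = (
    "kafka.streams:type=stream-state-metrics,"
    "thread-id=my-app-StreamThread-1,task-id=0_1,"
    "rocksdb-window-state-id=counts-store"
)
print(parse_state_store_mbean(sample))
```

Names that belong to another metric group, such as the record cache metrics, do not match and yield `None`, which makes the helper usable as a filter.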

Original text: RocksDB Properties-based Metrics: All of the following properties-based metrics have a recording level of info and are recorded when the metrics are accessed. If a state store consists of multiple RocksDB instances, as is the case for WindowStores and SessionStores, each metric reports the sum over all the RocksDB instances of the state store, except for the block cache metrics block-cache-*. The block cache metrics report the sum over all RocksDB instances if each instance uses its own block cache, and they report the recorded value from only one instance if a single block cache is shared among all instances.

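    RocksDB properties-based metrics: every metric in the table below has a recording level of info and is recorded at the moment the metric is accessed. When a state store consists of several RocksDB instances, as WindowStores and SessionStores do, each metric reports the sum over all of the store's RocksDB instances, with the exception of the block cache metrics block-cache-*. Those report the sum over all instances when each instance has its own block cache, and the value recorded from a single instance when one block cache is shared by all instances.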

MBean name (identical for every metric in this table): kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)

| METRIC/ATTRIBUTE NAME | DESCRIPTION |
| --- | --- |
| num-immutable-mem-table | The number of immutable memtables that have not yet been flushed. |
| cur-size-active-mem-table | The approximate size of the active memtable in bytes. |
| cur-size-all-mem-tables | The approximate size of active and unflushed immutable memtables in bytes. |
| size-all-mem-tables | The approximate size of active, unflushed immutable, and pinned immutable memtables in bytes. |
| num-entries-active-mem-table | The number of entries in the active memtable. |
| num-entries-imm-mem-tables | The number of entries in the unflushed immutable memtables. |
| num-deletes-active-mem-table | The number of delete entries in the active memtable. |
| num-deletes-imm-mem-tables | The number of delete entries in the unflushed immutable memtables. |
| mem-table-flush-pending | This metric reports 1 if a memtable flush is pending, otherwise it reports 0. |
| num-running-flushes | The number of currently running flushes. |
| compaction-pending | This metric reports 1 if at least one compaction is pending, otherwise it reports 0. |
| num-running-compactions | The number of currently running compactions. |
| estimate-pending-compaction-bytes | The estimated total number of bytes a compaction needs to rewrite on disk to get all levels down to under target size (only valid for level compaction). |
| total-sst-files-size | The total size in bytes of all SST files. |
| live-sst-files-size | The total size in bytes of all SST files that belong to the latest LSM tree. |
| num-live-versions | Number of live versions of the LSM tree. |
| block-cache-capacity | The capacity of the block cache in bytes. |
| block-cache-usage | The memory size of the entries residing in block cache in bytes. |
| block-cache-pinned-usage | The memory size for the entries being pinned in the block cache in bytes. |
| estimate-num-keys | The estimated number of keys in the active and unflushed immutable memtables and storage. |
| estimate-table-readers-mem | The estimated memory in bytes used for reading SST tables, excluding memory used in block cache. |
| background-errors | The total number of background errors. |
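The aggregation rule for properties-based metrics is easy to misread, so here is a minimal sketch of it in code. It is an illustration of the rule as quoted above, not Kafka's implementation: the function name `aggregate_property_metric` and the sample values are hypothetical, and picking the first instance for a shared block cache is an assumption (the text only says the value comes from one instance).

```python
from typing import Sequence


def aggregate_property_metric(
    metric_name: str,
    per_instance_values: Sequence[float],
    shared_block_cache: bool,
) -> float:
    """Combine per-instance readings of one properties-based metric for a store."""
    if not per_instance_values:
        raise ValueError("a state store has at least one RocksDB instance")
    if metric_name.startswith("block-cache-") and shared_block_cache:
        # All instances look at the same cache, so summing would count it
        # several times; report the value of a single instance instead.
        return per_instance_values[0]
    # Every other case: sum over all RocksDB instances of the store.
    return sum(per_instance_values)


# Hypothetical window store made up of three RocksDB instances.
print(aggregate_property_metric("num-running-flushes", [1, 0, 2], shared_block_cache=True))   # 3
print(aggregate_property_metric("block-cache-usage", [64, 64, 64], shared_block_cache=True))  # 64
print(aggregate_property_metric("block-cache-usage", [10, 20, 30], shared_block_cache=False)) # 60
```

The sketch applies only to the properties-based table; the statistics-based block-cache-*-hit-ratio metrics follow the separate aggregation described earlier in this section.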

2.8. Record Cache Metrics

Original text: All of the following metrics have a recording level of debug:

In other words, every metric in this section is recorded at the debug level:

MBean name (identical for every metric in this table): kafka.streams:type=stream-record-cache-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),record-cache-id=([-.\w]+)

| METRIC/ATTRIBUTE NAME | DESCRIPTION |
| --- | --- |
| hit-ratio-avg | The average cache hit ratio defined as the ratio of cache read hits over the total cache read requests. |
| hit-ratio-min | The minimum cache hit ratio. |
| hit-ratio-max | The maximum cache hit ratio. |
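Sections 2.7 and 2.8 tag each group of metrics with a recording level (info or debug). The practical consequence is that debug-level metrics are absent unless the application's recording level (the `metrics.recording.level` setting) is raised to debug. The following minimal sketch models that relationship; it is an illustration rather than Kafka's code, it covers only the two levels named in this article, and the `RecordingLevel` enum and `is_recorded` helper are hypothetical names.

```python
from enum import IntEnum


class RecordingLevel(IntEnum):
    """The two recording levels mentioned in this article, least verbose first."""
    INFO = 0
    DEBUG = 1


def is_recorded(metric_level: RecordingLevel, configured_level: RecordingLevel) -> bool:
    """A metric is recorded when the configured level is at least as verbose as the metric's."""
    return metric_level <= configured_level


# Properties-based RocksDB metrics (info) are recorded under either setting.
print(is_recorded(RecordingLevel.INFO, RecordingLevel.INFO))    # True
# Statistics-based RocksDB metrics and record cache metrics (debug) need debug.
print(is_recorded(RecordingLevel.DEBUG, RecordingLevel.INFO))   # False
print(is_recorded(RecordingLevel.DEBUG, RecordingLevel.DEBUG))  # True
```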

3. Other

Original text: We recommend monitoring GC time and other stats and various server stats such as CPU utilization, I/O service time, etc. On the client side, we recommend monitoring the message/byte rate (global and per topic), request rate/size/time, and on the consumer side, max lag in messages among all partitions and min fetch request rate. For a consumer to keep up, max lag needs to be less than a threshold and min fetch rate needs to be larger than 0.

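    In practice this means monitoring GC time and other JVM statistics, along with server-level statistics such as CPU utilization and I/O service time. On the client side, monitor the message/byte rate (both global and per topic) and the request rate, size, and time. On the consumer side, monitor the maximum lag in messages across all partitions and the minimum fetch request rate. For a consumer to keep up, the maximum lag must stay below a threshold and the minimum fetch rate must be greater than 0.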
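The consumer-side rule at the end of the paragraph above can be turned into a simple health check. The following is a minimal sketch under stated assumptions: the function name `consumer_is_keeping_up`, its parameters, and the threshold of 1000 messages are hypothetical, and how the two input values are obtained from the consumer's metrics is left out.

```python
def consumer_is_keeping_up(
    max_lag: float,
    min_fetch_rate: float,
    lag_threshold: float,
) -> bool:
    """Apply the rule: max lag below a threshold and min fetch rate above 0."""
    return max_lag < lag_threshold and min_fetch_rate > 0


# Hypothetical readings checked against a hypothetical threshold of 1000 messages.
print(consumer_is_keeping_up(max_lag=250, min_fetch_rate=4.2, lag_threshold=1000))   # True
print(consumer_is_keeping_up(max_lag=5000, min_fetch_rate=4.2, lag_threshold=1000))  # False
print(consumer_is_keeping_up(max_lag=250, min_fetch_rate=0.0, lag_threshold=1000))   # False
```

The right threshold depends on the workload; the source only states that one must exist, so treat the number above as a placeholder.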
