A Stateful Application Example on Kubernetes: ZooKeeper

Environment

  • RHEL 9.3
  • Docker Community 24.0.7
  • minikube v1.32.0

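Introduction to ZooKeeper

Apache ZooKeeper is a distributed, open-source coordination service for distributed systems. ZooKeeper lets you read and write data and observe data updates. Data are organized in a file-system-like hierarchy and replicated to every ZooKeeper server in the ensemble (the set of ZooKeeper servers). All operations on the data are atomic and sequentially consistent. ZooKeeper ensures this by using the Zab consensus protocol to replicate a state machine across all servers in the ensemble.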

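The ensemble uses the Zab protocol to elect a leader, and no data can be written until a leader has been elected. Once a leader is elected, the ensemble uses Zab to ensure that every write is replicated to a quorum before it is acknowledged and made visible to clients. Leaving weighted quorums aside, a quorum is a majority subset of the ensemble that contains the current leader. For example, if the ensemble has 3 servers, a subset containing the leader and one other server constitutes a quorum. If the ensemble cannot achieve a quorum, data cannot be written.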
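The majority rule is easy to check with a little arithmetic. The following is a minimal sketch, not part of ZooKeeper: quorum_size is a hypothetical helper name, and weighted quorums are ignored.

```shell
#!/bin/sh
# Hypothetical helper: size of a majority quorum for an ensemble of N servers.
# Weighted quorums are not considered.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Print the quorum size for a few ensemble sizes.
for n in 1 3 5 7; do
  echo "ensemble=$n quorum=$(quorum_size "$n")"
done
```

For the 3-server ensemble used in this article, the function returns 2, which matches the example of the leader plus one other server.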

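ZooKeeper servers keep their entire state machine in memory and write every change to a durable WAL (Write Ahead Log). When a server crashes, it can recover its previous state by replaying the WAL. To prevent the WAL from growing without bound, ZooKeeper servers periodically snapshot their in-memory state to storage media. These snapshots can be loaded directly into memory, and all WAL entries that precede a snapshot can be discarded.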
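To make the snapshot-plus-WAL idea concrete, here is a toy sketch of recovery. It is only an illustration under stated assumptions: the text file formats (one "key value" pair per snapshot line, "set key value" or "delete key" per log line) and the recover_state helper are invented for this example and have nothing to do with ZooKeeper's real on-disk formats.

```shell
#!/bin/sh
# Toy recovery: load a snapshot, then replay the log entries written after it.
# Both file formats are hypothetical; values must not contain spaces.
# Arguments: snapshot file, WAL file
recover_state() {
  awk '
    FILENAME == ARGV[1] { state[$1] = $2; next }  # load snapshot entries
    $1 == "set"         { state[$2] = $3 }        # replay a write
    $1 == "delete"      { delete state[$2] }      # replay a delete
    END { for (k in state) print k, state[k] }
  ' "$1" "$2" | sort
}

# Demo: the snapshot holds an old value, the WAL holds later changes.
dir=$(mktemp -d)
printf '/hello world\n' > "$dir/snapshot"
printf 'set /hello world_should_work\nset /tmp x\ndelete /tmp\n' > "$dir/wal"
recover_state "$dir/snapshot" "$dir/wal"
rm -rf "$dir"
```

The point of the sketch is the order of operations: the snapshot provides the starting state, and only the log entries recorded after it need to be replayed.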

Preparation

Clean up the environment:

minikube delete --all

Reboot the machine.

Start minikube:

minikube start

Confirm that the environment is clean:

$ kubectl get all
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   11m
$ kubectl get pvc
No resources found in default namespace.
$ kubectl get pv
No resources found

Create the file zookeeper.yaml with the following content:

apiVersion: v1
kind: Service
metadata:
  name: zk-hs
  labels:
    app: zk
spec:
  ports:
  - port: 2888
    name: server
  - port: 3888
    name: leader-election
  clusterIP: None
  selector:
    app: zk
---
apiVersion: v1
kind: Service
metadata:
  name: zk-cs
  labels:
    app: zk
spec:
  ports:
  - port: 2181
    name: client
  selector:
    app: zk
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  selector:
    matchLabels:
      app: zk
  maxUnavailable: 1
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zk
spec:
  selector:
    matchLabels:
      app: zk
  serviceName: zk-hs
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: OrderedReady
  template:
    metadata:
      labels:
        app: zk
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                    - zk
              topologyKey: "kubernetes.io/hostname"
      containers:
      - name: kubernetes-zookeeper
        imagePullPolicy: Always
        # image: "registry.k8s.io/kubernetes-zookeeper:1.0-3.4.10"
        image: "docker.io/kaiding1/kubernetes-zookeeper:1.0-3.4.10"
        resources:
          requests:
            memory: "1Gi"
            cpu: "0.5"
        ports:
        - containerPort: 2181
          name: client
        - containerPort: 2888
          name: server
        - containerPort: 3888
          name: leader-election
        command:
        - sh
        - -c
        - "start-zookeeper \
          --servers=3 \
          --data_dir=/var/lib/zookeeper/data \
          --data_log_dir=/var/lib/zookeeper/data/log \
          --conf_dir=/opt/zookeeper/conf \
          --client_port=2181 \
          --election_port=3888 \
          --server_port=2888 \
          --tick_time=2000 \
          --init_limit=10 \
          --sync_limit=5 \
          --heap=512M \
          --max_client_cnxns=60 \
          --snap_retain_count=3 \
          --purge_interval=12 \
          --max_session_timeout=40000 \
          --min_session_timeout=4000 \
          --log_level=INFO"
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        volumeMounts:
        - name: datadir
          mountPath: /var/lib/zookeeper
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
  volumeClaimTemplates:
  - metadata:
      name: datadir
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi

Note: registry.k8s.io is not reachable from my environment, so I pulled the image in advance and pushed it to a location that is reachable.

Deployment

Attempt 1

$ kubectl apply -f zookeeper.yaml
service/zk-hs created
service/zk-cs created
poddisruptionbudget.policy/zk-pdb created
statefulset.apps/zk created

Check the pods:

$ kubectl get pod
NAME   READY   STATUS    RESTARTS   AGE
zk-0   1/1     Running   0          57s
zk-1   0/1     Pending   0          35s

Enter the zk-0 pod:

kubectl exec -it zk-0 -- bash

Check the current user and the /var/lib/zookeeper directory:

zookeeper@zk-0:/$ whoami
zookeeper
zookeeper@zk-0:/$ ls -l /var/lib
total 0
drwxr-xr-x. 1 root root  42 Jun 13  2017 apt
......
drwxrwxrwx. 3 root root  18 Feb  6 07:13 zookeeper
zookeeper@zk-0:/$ ls -l /var/lib/zookeeper/
total 0
drwxr-xr-x. 4 zookeeper zookeeper 46 Feb  6 07:13 data

As shown, /var/lib/zookeeper has mode 777, so the zookeeper user can create directories in it.

Exit the container and check the pods again:

$ kubectl get pod
NAME   READY   STATUS    RESTARTS   AGE
zk-0   1/1     Running   0          7m37s
zk-1   0/1     Pending   0          7m15s

As shown, zk-1 stays in the Pending state.

$ kubectl describe pod zk-1
......
Events:
  Type     Reason            Age                   From               Message
  ----     ------            ----                  ----               -------
  Warning  FailedScheduling  8m4s                  default-scheduler  0/1 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling..
  Warning  FailedScheduling  2m39s (x2 over 8m3s)  default-scheduler  0/1 nodes are available: 1 node(s) didn't match pod anti-affinity rules. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod..

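As shown, no node satisfies the pod anti-affinity rule: this minikube cluster has a single node, zk-0 is already running on it, and the rule forbids placing two zk pods on the same host.

Attempt 2

Clean up the environment: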

$ kubectl delete -f zookeeper.yaml 
service "zk-hs" deleted
service "zk-cs" deleted
poddisruptionbudget.policy "zk-pdb" deleted
statefulset.apps "zk" deleted
$ kubectl get pvc
NAME           STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
datadir-zk-0   Bound    pvc-f4a42476-b3e7-48ca-ad8d-b8f6ec278f1e   10Gi       RWO            standard       15m
datadir-zk-1   Bound    pvc-924378b6-5810-4ebc-86a7-90037221381f   10Gi       RWO            standard       14m
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                  STORAGECLASS   REASON   AGE
pvc-924378b6-5810-4ebc-86a7-90037221381f   10Gi       RWO            Delete           Bound    default/datadir-zk-1   standard                14m
pvc-f4a42476-b3e7-48ca-ad8d-b8f6ec278f1e   10Gi       RWO            Delete           Bound    default/datadir-zk-0   standard                15m
$ kubectl delete pvc datadir-zk-0 datadir-zk-1
persistentvolumeclaim "datadir-zk-0" deleted
persistentvolumeclaim "datadir-zk-1" deleted

Modify the manifest, changing requiredDuringSchedulingIgnoredDuringExecution to preferredDuringSchedulingIgnoredDuringExecution (the modified copy is saved as zookeeper2.yaml, which is the file applied below):

......
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: "app"
                      operator: In
                      values:
                      - zk
                topologyKey: "kubernetes.io/hostname"
......
$ kubectl apply -f zookeeper2.yaml
service/zk-hs created
service/zk-cs created
poddisruptionbudget.policy/zk-pdb created
statefulset.apps/zk created
$ kubectl get pod
NAME   READY   STATUS    RESTARTS   AGE
zk-0   1/1     Running   0          2m1s
zk-1   1/1     Running   0          99s
zk-2   1/1     Running   0          77s
$ kubectl get sts
NAME   READY   AGE
zk     3/3     6m31s

Note: if a pod gets stuck in Pending again, this time with the following events:

Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  0/1 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/1 nodes are available: 1 Insufficient memory.

then run minikube delete first and start minikube with more memory, for example:

minikube start --memory='7500mb'

Check the storage:

$ kubectl get pvc
NAME           STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
datadir-zk-0   Bound    pvc-7b501343-085f-48d8-b88a-ade5686b26c9   10Gi       RWO            standard       7m18s
datadir-zk-1   Bound    pvc-4af3ab29-a3c8-4b26-addf-40656e998021   10Gi       RWO            standard       6m56s
datadir-zk-2   Bound    pvc-60869d4a-389d-4f86-8446-4fda9858fe40   10Gi       RWO            standard       6m34s
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                  STORAGECLASS   REASON   AGE
pvc-4af3ab29-a3c8-4b26-addf-40656e998021   10Gi       RWO            Delete           Bound    default/datadir-zk-1   standard                6m58s
pvc-60869d4a-389d-4f86-8446-4fda9858fe40   10Gi       RWO            Delete           Bound    default/datadir-zk-2   standard                6m36s
pvc-7b501343-085f-48d8-b88a-ade5686b26c9   10Gi       RWO            Delete           Bound    default/datadir-zk-0   standard                7m20s

Verification

Enter the container:

kubectl exec -it zk-0 -- bash

Check connectivity:

zookeeper@zk-0:/$ echo "Are you ok? $(echo ruok | nc 127.0.0.1 2181)"
Are you ok? imok

Check the mode:

zookeeper@zk-0:/$ echo srvr | nc localhost 2181
Zookeeper version: 3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
Latency min/avg/max: 0/0/0
Received: 123
Sent: 122
Connections: 1
Outstanding: 0
Zxid: 0x0
Mode: follower
Node count: 4

As shown, pod zk-0 is a follower.

Exit the container and check the other two pods the same way:

zookeeper@zk-1:/$ echo srvr | nc localhost 2181 | grep -i mode
Mode: leader
zookeeper@zk-2:/$ echo srvr | nc localhost 2181 | grep -i mode
Mode: follower
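If you want to collect the mode of every server without entering each container, a small filter helps. This is a sketch: zk_mode is a hypothetical helper that reads srvr output on standard input, and the commented kubectl line assumes the three pods from this walkthrough are running.

```shell
#!/bin/sh
# Hypothetical helper: read `srvr` output on stdin and print the Mode value.
zk_mode() {
  awk -F': ' '$1 == "Mode" { print $2 }'
}

# Against a live cluster (not run here):
#   for i in 0 1 2; do
#     kubectl exec zk-$i -- sh -c 'echo srvr | nc localhost 2181' | zk_mode
#   done

# Offline demo with a fragment of the output shown above.
printf 'Zxid: 0x0\nMode: follower\nNode count: 4\n' | zk_mode
```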

On the leader pod:

zookeeper@zk-1:/$ echo dump | nc localhost 2181
SessionTracker dump:
Session Sets (0):
ephemeral nodes dump:
Sessions with Ephemerals (0):
zookeeper@zk-1:/$ echo stat | nc localhost 2181
Zookeeper version: 3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
Clients:
 /127.0.0.1:60780[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 221
Sent: 220
Connections: 1
Outstanding: 0
Zxid: 0x100000000
Mode: leader
Node count: 4
zookeeper@zk-1:/$ echo envi | nc localhost 2181
Environment:
zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
host.name=zk-1.zk-hs.default.svc.cluster.local
java.version=1.8.0_131
java.vendor=Oracle Corporation
java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre
java.class.path=/usr/bin/../build/classes:/usr/bin/../build/lib/*.jar:/usr/bin/../share/zookeeper/zookeeper-3.4.10.jar:/usr/bin/../share/zookeeper/slf4j-log4j12-1.6.1.jar:/usr/bin/../share/zookeeper/slf4j-api-1.6.1.jar:/usr/bin/../share/zookeeper/netty-3.10.5.Final.jar:/usr/bin/../share/zookeeper/log4j-1.2.16.jar:/usr/bin/../share/zookeeper/jline-0.9.94.jar:/usr/bin/../src/java/lib/*.jar:/usr/bin/../etc/zookeeper:
java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
java.io.tmpdir=/tmp
java.compiler=<NA>
os.name=Linux
os.arch=amd64
os.version=5.14.0-362.18.1.el9_3.x86_64
user.name=zookeeper
user.home=/home/zookeeper
user.dir=/
zookeeper@zk-1:/$ echo conf | nc localhost 2181
clientPort=2181
dataDir=/var/lib/zookeeper/data/version-2
dataLogDir=/var/lib/zookeeper/data/log/version-2
tickTime=2000
maxClientCnxns=60
minSessionTimeout=4000
maxSessionTimeout=40000
serverId=2
initLimit=10
syncLimit=5
electionAlg=3
electionPort=3888
quorumPort=2888
peerType=0
zookeeper@zk-1:/$ echo mntr | nc localhost 2181
zk_version	3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
zk_avg_latency	0
zk_max_latency	0
zk_min_latency	0
zk_packets_received	242
zk_packets_sent	241
zk_num_alive_connections	1
zk_outstanding_requests	0
zk_server_state	leader
zk_znode_count	4
zk_watch_count	0
zk_ephemerals_count	0
zk_approximate_data_size	27
zk_open_file_descriptor_count	41
zk_max_file_descriptor_count	1048576
zk_followers	2
zk_synced_followers	2
zk_pending_syncs	0

In fact, the spec uses zookeeper-ready 2181 as both the readinessProbe and the livenessProbe:

......
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
......

Take a look at zookeeper-ready:

zookeeper@zk-1:/$ which zookeeper-ready
/usr/bin/zookeeper-ready
zookeeper@zk-1:/$ cat /usr/bin/zookeeper-ready
#!/usr/bin/env bash
# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# zkOk.sh uses the ruok ZooKeeper four letter work to determine if the instance
# is health. The $? variable will be set to 0 if server responds that it is 
# healthy, or 1 if the server fails to respond.

OK=$(echo ruok | nc 127.0.0.1 $1)
if [ "$OK" == "imok" ]; then
	exit 0
else
	exit 1
fi

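As shown, it checks connectivity with the same ruok method used above.

A Closer Look at ZooKeeper

Leader and followers

Every server in a ZooKeeper ensemble has a unique ID that is associated with its network address, and every server knows these IDs.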

hostname:

$ for i in 0 1 2; do kubectl exec zk-$i -- hostname; done
zk-0
zk-1
zk-2

Each ZooKeeper node stores its server ID in the myid file in its data directory (in this example the data directory is /var/lib/zookeeper/data).

$ for i in 0 1 2; do echo "myid zk-$i"; kubectl exec zk-$i -- cat /var/lib/zookeeper/data/myid; done
myid zk-0
1
myid zk-1
2
myid zk-2
3

Note: Kubernetes counts from 0, whereas ZooKeeper counts from 1.
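The relationship between the two numbering schemes can be written down as a one-line mapping. The sketch below is consistent with the output above, but it is my own illustration: myid_from_hostname is a hypothetical helper, and I am not claiming it is how the image's startup script computes the ID.

```shell
#!/bin/sh
# Hypothetical helper: derive the ZooKeeper server ID from a StatefulSet pod
# hostname by taking the ordinal after the last '-' and adding 1.
myid_from_hostname() {
  ordinal=${1##*-}        # zk-2 -> 2
  echo $(( ordinal + 1 ))
}

for host in zk-0 zk-1 zk-2; do
  echo "$host -> myid $(myid_from_hostname "$host")"
done
```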

Check the FQDN (Fully Qualified Domain Name):

$ for i in 0 1 2; do kubectl exec zk-$i -- hostname -f; done
zk-0.zk-hs.default.svc.cluster.local
zk-1.zk-hs.default.svc.cluster.local
zk-2.zk-hs.default.svc.cluster.local

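The ZooKeeper headless Service (zk-hs) creates a domain name for each pod in the StatefulSet.

The A records in Kubernetes DNS resolve each FQDN to the pod's IP address. If a pod is later rescheduled or upgraded, its A record is updated with the new IP address, but the name stays the same.

ZooKeeper uses a configuration file named zoo.cfg (/opt/zookeeper/conf/zoo.cfg). Let's look at it: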

$ kubectl exec zk-0 -- cat /opt/zookeeper/conf/zoo.cfg
#This file was autogenerated DO NOT EDIT
clientPort=2181
dataDir=/var/lib/zookeeper/data
dataLogDir=/var/lib/zookeeper/data/log
tickTime=2000
initLimit=10
syncLimit=5
maxClientCnxns=60
minSessionTimeout=4000
maxSessionTimeout=40000
autopurge.snapRetainCount=3
autopurge.purgeInteval=12
server.1=zk-0.zk-hs.default.svc.cluster.local:2888:3888
server.2=zk-1.zk-hs.default.svc.cluster.local:2888:3888
server.3=zk-2.zk-hs.default.svc.cluster.local:2888:3888

This file is created by start-zookeeper:

......
        - sh
        - -c
        - "start-zookeeper \
          --servers=3 \
          --data_dir=/var/lib/zookeeper/data \
          --data_log_dir=/var/lib/zookeeper/data/log \
          --conf_dir=/opt/zookeeper/conf \
......
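The server.N entries in zoo.cfg follow directly from the StatefulSet name, the headless Service domain, and the number of replicas. The sketch below shows one way such lines could be generated. print_servers is a hypothetical helper written for this article, not the code inside start-zookeeper.

```shell
#!/bin/sh
# Hypothetical helper: print zoo.cfg server entries for a StatefulSet.
# Arguments: StatefulSet name, headless Service domain, replicas,
#            server port, leader-election port
print_servers() {
  name=$1 domain=$2 replicas=$3 server_port=$4 election_port=$5
  i=1
  while [ "$i" -le "$replicas" ]; do
    # Server ID i belongs to the pod with ordinal i - 1.
    echo "server.$i=$name-$(( i - 1 )).$domain:$server_port:$election_port"
    i=$(( i + 1 ))
  done
}

print_servers zk zk-hs.default.svc.cluster.local 3 2888 3888
```

With the arguments shown, the output reproduces the three server lines of the zoo.cfg listed above.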

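Testing

Use zkCli.sh, the ZooKeeper command-line tool, for testing. It can:

  • Create znodes
  • Get data
  • Watch a znode for changes
  • Set data on a znode
  • Create children of a znode
  • List children
  • Check status
  • Delete znodes

A znode in ZooKeeper is like a file (it can hold content) and also like a directory (it can have child znodes).

Let's write world to the znode /hello through zk-0.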

$ kubectl exec zk-0 zkCli.sh create /hello world
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Connecting to localhost:2181
2024-02-06 08:35:34,023 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
2024-02-06 08:35:34,027 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=zk-0.zk-hs.default.svc.cluster.local
2024-02-06 08:35:34,027 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_131
2024-02-06 08:35:34,028 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2024-02-06 08:35:34,028 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre
2024-02-06 08:35:34,029 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/usr/bin/../build/classes:/usr/bin/../build/lib/*.jar:/usr/bin/../share/zookeeper/zookeeper-3.4.10.jar:/usr/bin/../share/zookeeper/slf4j-log4j12-1.6.1.jar:/usr/bin/../share/zookeeper/slf4j-api-1.6.1.jar:/usr/bin/../share/zookeeper/netty-3.10.5.Final.jar:/usr/bin/../share/zookeeper/log4j-1.2.16.jar:/usr/bin/../share/zookeeper/jline-0.9.94.jar:/usr/bin/../src/java/lib/*.jar:/usr/bin/../etc/zookeeper:
2024-02-06 08:35:34,029 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
2024-02-06 08:35:34,029 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2024-02-06 08:35:34,029 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2024-02-06 08:35:34,029 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2024-02-06 08:35:34,029 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2024-02-06 08:35:34,029 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=5.14.0-362.18.1.el9_3.x86_64
2024-02-06 08:35:34,030 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=zookeeper
2024-02-06 08:35:34,030 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/home/zookeeper
2024-02-06 08:35:34,030 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/
2024-02-06 08:35:34,031 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@22d8cfe0
2024-02-06 08:35:34,051 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2024-02-06 08:35:34,096 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@876] - Socket connection established to localhost/127.0.0.1:2181, initiating session
2024-02-06 08:35:34,114 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x18d7d581b330000, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
Created /hello

Then read the value from another pod:

$ kubectl exec zk-1 -- zkCli.sh get /hello
Connecting to localhost:2181
2024-02-06 08:36:34,736 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
2024-02-06 08:36:34,738 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=zk-1.zk-hs.default.svc.cluster.local
2024-02-06 08:36:34,738 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_131
2024-02-06 08:36:34,740 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2024-02-06 08:36:34,740 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre
2024-02-06 08:36:34,740 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/usr/bin/../build/classes:/usr/bin/../build/lib/*.jar:/usr/bin/../share/zookeeper/zookeeper-3.4.10.jar:/usr/bin/../share/zookeeper/slf4j-log4j12-1.6.1.jar:/usr/bin/../share/zookeeper/slf4j-api-1.6.1.jar:/usr/bin/../share/zookeeper/netty-3.10.5.Final.jar:/usr/bin/../share/zookeeper/log4j-1.2.16.jar:/usr/bin/../share/zookeeper/jline-0.9.94.jar:/usr/bin/../src/java/lib/*.jar:/usr/bin/../etc/zookeeper:
2024-02-06 08:36:34,740 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
2024-02-06 08:36:34,740 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2024-02-06 08:36:34,740 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2024-02-06 08:36:34,740 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2024-02-06 08:36:34,740 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2024-02-06 08:36:34,741 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=5.14.0-362.18.1.el9_3.x86_64
2024-02-06 08:36:34,741 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=zookeeper
2024-02-06 08:36:34,741 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/home/zookeeper
2024-02-06 08:36:34,741 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/
2024-02-06 08:36:34,742 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@22d8cfe0
2024-02-06 08:36:34,755 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2024-02-06 08:36:34,820 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@876] - Socket connection established to localhost/127.0.0.1:2181, initiating session
2024-02-06 08:36:34,833 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x28d7d581b360000, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
world
cZxid = 0x100000002
ctime = Tue Feb 06 08:35:34 UTC 2024
mZxid = 0x100000002
mtime = Tue Feb 06 08:35:34 UTC 2024
pZxid = 0x100000002
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 5
numChildren = 0

In fact, every server in the ZooKeeper ensemble can read /hello:

$ for i in 0 1 2; do kubectl exec zk-$i -- zkCli.sh get /hello | grep world; done
world
cZxid = 0x100000002
ctime = Tue Feb 06 08:35:34 UTC 2024
mZxid = 0x100000002
mtime = Tue Feb 06 08:35:34 UTC 2024
pZxid = 0x100000002
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 5
numChildren = 0
world
cZxid = 0x100000002
ctime = Tue Feb 06 08:35:34 UTC 2024
mZxid = 0x100000002
mtime = Tue Feb 06 08:35:34 UTC 2024
pZxid = 0x100000002
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 5
numChildren = 0
cZxid = 0x100000002
world
ctime = Tue Feb 06 08:35:34 UTC 2024
mZxid = 0x100000002
mtime = Tue Feb 06 08:35:34 UTC 2024
pZxid = 0x100000002
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 5
numChildren = 0
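Notice that the stat lines are printed even though they do not contain "world". A pipe only filters standard output, so the likely explanation is that this zkCli.sh version writes the stat lines to standard error. That is an assumption on my part, inferred from the output, not something verified in the ZooKeeper source. The sketch below reproduces the effect with a stand-in function (fake_get is hypothetical) and shows how redirecting stderr gives clean output.

```shell
#!/bin/sh
# Stand-in for a command that writes data to stdout and metadata to stderr.
fake_get() {
  echo "world"
  echo "cZxid = 0x100000002" >&2
}

# grep sees only stdout, so the stderr line would bypass it;
# discarding stderr leaves just the matching data line.
fake_get 2>/dev/null | grep world

# The same idea against the real pods (not run here):
#   for i in 0 1 2; do kubectl exec zk-$i -- zkCli.sh get /hello 2>/dev/null | grep world; done
```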

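Tolerating node failure

Let's look at the consensus algorithm. Delete one server, then try to write data to the ZooKeeper ensemble. Because two ZooKeeper servers remain, which is still a majority of three, the write continues to work.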

$ kubectl delete --force=true --grace-period=0 pod zk-2  &
sleep 1; kubectl delete --force=true --grace-period=0 pod zk-2  &
sleep 1
kubectl exec zk-0 zkCli.sh set /hello world_should_work
sleep 1
kubectl exec zk-1 zkCli.sh get /hello
[1] 148438
Warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "zk-2" force deleted
[1]+  Done                    kubectl delete --force=true --grace-period=0 pod zk-2
[1] 148686
Warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "zk-2" force deleted
[1]+  Done                    kubectl delete --force=true --grace-period=0 pod zk-2
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Connecting to localhost:2181
2024-02-06 08:45:21,133 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
2024-02-06 08:45:21,135 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=zk-0.zk-hs.default.svc.cluster.local
2024-02-06 08:45:21,135 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_131
2024-02-06 08:45:21,137 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2024-02-06 08:45:21,137 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre
2024-02-06 08:45:21,137 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/usr/bin/../build/classes:/usr/bin/../build/lib/*.jar:/usr/bin/../share/zookeeper/zookeeper-3.4.10.jar:/usr/bin/../share/zookeeper/slf4j-log4j12-1.6.1.jar:/usr/bin/../share/zookeeper/slf4j-api-1.6.1.jar:/usr/bin/../share/zookeeper/netty-3.10.5.Final.jar:/usr/bin/../share/zookeeper/log4j-1.2.16.jar:/usr/bin/../share/zookeeper/jline-0.9.94.jar:/usr/bin/../src/java/lib/*.jar:/usr/bin/../etc/zookeeper:
2024-02-06 08:45:21,137 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
2024-02-06 08:45:21,138 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2024-02-06 08:45:21,138 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2024-02-06 08:45:21,138 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2024-02-06 08:45:21,140 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2024-02-06 08:45:21,140 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=5.14.0-362.18.1.el9_3.x86_64
2024-02-06 08:45:21,140 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=zookeeper
2024-02-06 08:45:21,140 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/home/zookeeper
2024-02-06 08:45:21,140 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/
2024-02-06 08:45:21,141 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@22d8cfe0
2024-02-06 08:45:21,157 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2024-02-06 08:45:21,201 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@876] - Socket connection established to localhost/127.0.0.1:2181, initiating session
2024-02-06 08:45:21,206 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x18d7d581b330003, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
cZxid = 0x100000002
ctime = Tue Feb 06 08:35:34 UTC 2024
mZxid = 0x100000017
mtime = Tue Feb 06 08:45:21 UTC 2024
pZxid = 0x100000002
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 17
numChildren = 0
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Connecting to localhost:2181
2024-02-06 08:45:22,821 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
2024-02-06 08:45:22,824 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=zk-1.zk-hs.default.svc.cluster.local
2024-02-06 08:45:22,824 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_131
2024-02-06 08:45:22,827 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2024-02-06 08:45:22,827 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre
2024-02-06 08:45:22,827 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/usr/bin/../build/classes:/usr/bin/../build/lib/*.jar:/usr/bin/../share/zookeeper/zookeeper-3.4.10.jar:/usr/bin/../share/zookeeper/slf4j-log4j12-1.6.1.jar:/usr/bin/../share/zookeeper/slf4j-api-1.6.1.jar:/usr/bin/../share/zookeeper/netty-3.10.5.Final.jar:/usr/bin/../share/zookeeper/log4j-1.2.16.jar:/usr/bin/../share/zookeeper/jline-0.9.94.jar:/usr/bin/../src/java/lib/*.jar:/usr/bin/../etc/zookeeper:
2024-02-06 08:45:22,827 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
2024-02-06 08:45:22,827 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2024-02-06 08:45:22,827 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2024-02-06 08:45:22,828 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2024-02-06 08:45:22,828 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2024-02-06 08:45:22,828 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=5.14.0-362.18.1.el9_3.x86_64
2024-02-06 08:45:22,828 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=zookeeper
2024-02-06 08:45:22,828 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/home/zookeeper
2024-02-06 08:45:22,828 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/
2024-02-06 08:45:22,830 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@22d8cfe0
2024-02-06 08:45:22,845 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2024-02-06 08:45:22,897 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@876] - Socket connection established to localhost/127.0.0.1:2181, initiating session
2024-02-06 08:45:22,905 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x28d7d581b360005, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
cZxid = 0x100000002
world_should_work
ctime = Tue Feb 06 08:35:34 UTC 2024
mZxid = 0x100000017
mtime = Tue Feb 06 08:45:21 UTC 2024
pZxid = 0x100000002
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 17
numChildren = 0

Verify:

$ for i in 0 1 2; do kubectl exec zk-$i -- zkCli.sh get /hello | grep world; done
cZxid = 0x100000002
world_should_work
ctime = Tue Feb 06 08:35:34 UTC 2024
mZxid = 0x100000017
mtime = Tue Feb 06 08:45:21 UTC 2024
pZxid = 0x100000002
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 17
numChildren = 0
cZxid = 0x100000002
world_should_work
ctime = Tue Feb 06 08:35:34 UTC 2024
mZxid = 0x100000017
mtime = Tue Feb 06 08:45:21 UTC 2024
pZxid = 0x100000002
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 17
numChildren = 0
cZxid = 0x100000002
world_should_work
ctime = Tue Feb 06 08:35:34 UTC 2024
mZxid = 0x100000017
mtime = Tue Feb 06 08:45:21 UTC 2024
pZxid = 0x100000002
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 17
numChildren = 0

The write still succeeds because the ZooKeeper ensemble still has a quorum (a majority of its servers). If two servers are deleted, the quorum is lost.
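The majority rule behind this can be sketched in a few lines of shell. This is only an illustration: `quorum_size` and `has_quorum` are hypothetical helper names, and the sketch assumes simple majority quorums rather than weighted ones.

```shell
#!/bin/sh
# Hypothetical helpers illustrating simple-majority quorum arithmetic.

quorum_size() {
  # Smallest majority of an ensemble with $1 servers.
  echo $(( $1 / 2 + 1 ))
}

has_quorum() {
  # $1 = ensemble size, $2 = number of servers currently alive.
  [ "$2" -ge "$(quorum_size "$1")" ]
}

# Walk through the 3-server ensemble used in this article.
for alive in 3 2 1; do
  if has_quorum 3 "$alive"; then
    echo "alive=$alive: quorum, writes can be committed"
  else
    echo "alive=$alive: no quorum, writes cannot be committed"
  fi
done
```

With three servers the majority is two, so losing one server keeps the quorum and losing two does not. The commands that follow force-delete two of the three pods to test exactly that case.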

$ kubectl delete --force=true --grace-period=0 pod zk-2  &
kubectl delete --force=true --grace-period=0 pod zk-1  &
sleep 1; kubectl delete --force=true --grace-period=0 pod zk-2  &
sleep 1; kubectl delete --force=true --grace-period=0 pod zk-1  &
sleep 1
kubectl exec zk-0 zkCli.sh set /hello world_should_not_work
sleep 1
kubectl exec zk-0 zkCli.sh get /hello
sleep 20 # If you are running manually use kubectl get pods to see status of pods restarting
kubectl exec zk-0 zkCli.sh get /hello
[1] 159326
[2] 159327
Warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
Warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "zk-2" force deleted
pod "zk-1" force deleted
[1]-  Done                    kubectl delete --force=true --grace-period=0 pod zk-2
[2]+  Done                    kubectl delete --force=true --grace-period=0 pod zk-1
[1] 159519
Warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
Error from server (NotFound): pods "zk-2" not found
[1]+  Exit 1                  kubectl delete --force=true --grace-period=0 pod zk-2
[1] 159526
Warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "zk-1" force deleted
[1]+  Done                    kubectl delete --force=true --grace-period=0 pod zk-1
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Connecting to localhost:2181
2024-02-06 08:51:54,920 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
2024-02-06 08:51:54,923 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=zk-0.zk-hs.default.svc.cluster.local
2024-02-06 08:51:54,923 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_131
2024-02-06 08:51:54,925 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2024-02-06 08:51:54,925 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre
2024-02-06 08:51:54,925 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/usr/bin/../build/classes:/usr/bin/../build/lib/*.jar:/usr/bin/../share/zookeeper/zookeeper-3.4.10.jar:/usr/bin/../share/zookeeper/slf4j-log4j12-1.6.1.jar:/usr/bin/../share/zookeeper/slf4j-api-1.6.1.jar:/usr/bin/../share/zookeeper/netty-3.10.5.Final.jar:/usr/bin/../share/zookeeper/log4j-1.2.16.jar:/usr/bin/../share/zookeeper/jline-0.9.94.jar:/usr/bin/../src/java/lib/*.jar:/usr/bin/../etc/zookeeper:
2024-02-06 08:51:54,925 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
2024-02-06 08:51:54,925 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2024-02-06 08:51:54,925 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2024-02-06 08:51:54,926 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2024-02-06 08:51:54,926 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2024-02-06 08:51:54,926 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=5.14.0-362.18.1.el9_3.x86_64
2024-02-06 08:51:54,926 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=zookeeper
2024-02-06 08:51:54,926 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/home/zookeeper
2024-02-06 08:51:54,926 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/
2024-02-06 08:51:54,929 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@22d8cfe0
2024-02-06 08:51:54,949 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2024-02-06 08:51:55,025 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@876] - Socket connection established to localhost/127.0.0.1:2181, initiating session
2024-02-06 08:51:55,036 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x18d7d581b330005, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
cZxid = 0x100000002
ctime = Tue Feb 06 08:35:34 UTC 2024
mZxid = 0x100000022
mtime = Tue Feb 06 08:51:55 UTC 2024
pZxid = 0x100000002
cversion = 0
dataVersion = 2
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 21
numChildren = 0
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Connecting to localhost:2181
2024-02-06 08:51:56,664 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
2024-02-06 08:51:56,666 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=zk-0.zk-hs.default.svc.cluster.local
2024-02-06 08:51:56,666 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_131
2024-02-06 08:51:56,668 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2024-02-06 08:51:56,668 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre
2024-02-06 08:51:56,668 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/usr/bin/../build/classes:/usr/bin/../build/lib/*.jar:/usr/bin/../share/zookeeper/zookeeper-3.4.10.jar:/usr/bin/../share/zookeeper/slf4j-log4j12-1.6.1.jar:/usr/bin/../share/zookeeper/slf4j-api-1.6.1.jar:/usr/bin/../share/zookeeper/netty-3.10.5.Final.jar:/usr/bin/../share/zookeeper/log4j-1.2.16.jar:/usr/bin/../share/zookeeper/jline-0.9.94.jar:/usr/bin/../src/java/lib/*.jar:/usr/bin/../etc/zookeeper:
2024-02-06 08:51:56,668 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
2024-02-06 08:51:56,668 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2024-02-06 08:51:56,668 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2024-02-06 08:51:56,668 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2024-02-06 08:51:56,668 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2024-02-06 08:51:56,668 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=5.14.0-362.18.1.el9_3.x86_64
2024-02-06 08:51:56,668 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=zookeeper
2024-02-06 08:51:56,669 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/home/zookeeper
2024-02-06 08:51:56,669 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/
2024-02-06 08:51:56,670 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@22d8cfe0
2024-02-06 08:51:56,690 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2024-02-06 08:51:56,748 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@876] - Socket connection established to localhost/127.0.0.1:2181, initiating session
2024-02-06 08:51:56,755 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x18d7d581b330006, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
world_should_not_work
cZxid = 0x100000002
ctime = Tue Feb 06 08:35:34 UTC 2024
mZxid = 0x100000022
mtime = Tue Feb 06 08:51:55 UTC 2024
pZxid = 0x100000002
cversion = 0
dataVersion = 2
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 21
numChildren = 0
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Connecting to localhost:2181
2024-02-06 08:52:17,354 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
2024-02-06 08:52:17,355 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=zk-0.zk-hs.default.svc.cluster.local
2024-02-06 08:52:17,356 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_131
2024-02-06 08:52:17,357 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2024-02-06 08:52:17,357 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre
2024-02-06 08:52:17,357 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/usr/bin/../build/classes:/usr/bin/../build/lib/*.jar:/usr/bin/../share/zookeeper/zookeeper-3.4.10.jar:/usr/bin/../share/zookeeper/slf4j-log4j12-1.6.1.jar:/usr/bin/../share/zookeeper/slf4j-api-1.6.1.jar:/usr/bin/../share/zookeeper/netty-3.10.5.Final.jar:/usr/bin/../share/zookeeper/log4j-1.2.16.jar:/usr/bin/../share/zookeeper/jline-0.9.94.jar:/usr/bin/../src/java/lib/*.jar:/usr/bin/../etc/zookeeper:
2024-02-06 08:52:17,357 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
2024-02-06 08:52:17,358 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2024-02-06 08:52:17,358 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2024-02-06 08:52:17,358 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2024-02-06 08:52:17,358 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2024-02-06 08:52:17,358 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=5.14.0-362.18.1.el9_3.x86_64
2024-02-06 08:52:17,358 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=zookeeper
2024-02-06 08:52:17,358 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/home/zookeeper
2024-02-06 08:52:17,358 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/
2024-02-06 08:52:17,359 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@22d8cfe0
2024-02-06 08:52:17,374 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2024-02-06 08:52:17,436 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@876] - Socket connection established to localhost/127.0.0.1:2181, initiating session
2024-02-06 08:52:17,444 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x18d7d581b330007, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
cZxid = 0x100000002
world_should_not_work
ctime = Tue Feb 06 08:35:34 UTC 2024
mZxid = 0x100000022
mtime = Tue Feb 06 08:51:55 UTC 2024
pZxid = 0x100000002
cversion = 0
dataVersion = 2
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 21
numChildren = 0

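As shown, the write still went through. I repeated the test several times with the same result. Could it be that the pods recovered too quickly? The warnings printed above also say that force deletion does not wait for the running resource to terminate, so the deleted servers may still have been running when the write arrived.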
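One way to narrow this down is to record how many pods were actually ready just before the write. The sketch below is a rough aid, not part of the original test: `count_ready` is a hypothetical helper, the label `app=zk` is assumed from the manifest, and the sample input is illustrative rather than captured from a real run.

```shell
#!/bin/sh
# count_ready: hypothetical helper. Reads `kubectl get pods --no-headers`
# output on stdin and prints how many pods are 1/1 Running.
count_ready() {
  awk '$2 == "1/1" && $3 == "Running" { n++ } END { print n + 0 }'
}

# Against a live cluster (assumes the pods carry the label app=zk):
#   kubectl get pods -l app=zk --no-headers | count_ready

# Illustrative input only, not taken from a real run:
count_ready <<'EOF'
zk-0   1/1   Running             0   20m
zk-1   0/1   ContainerCreating   0   1s
zk-2   0/1   Pending             0   1s
EOF
```

Running the commented `kubectl` pipeline immediately before the `set` command would show whether fewer than two pods were ready at that moment. Note that a pod missing from the list does not prove its container has already stopped.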

Persistent storage

Delete the StatefulSet:

kubectl delete statefulset zk

Verify:

$ kubectl get pod
No resources found in default namespace.

Recreate the StatefulSet:

$ kubectl apply -f zookeeper2.yaml
service/zk-hs unchanged
service/zk-cs unchanged
poddisruptionbudget.policy/zk-pdb configured
statefulset.apps/zk created
$ kubectl exec zk-0 -- zkCli.sh get /hello
......
WATCHER::

WatchedEvent state:SyncConnected type:None path:null
ccccc
......

Note: ccccc is the most recently written data.

As shown, the original data is still there after the StatefulSet and its pods are recreated.

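When a pod of the ZooKeeper StatefulSet is rescheduled or upgraded, its PV is mounted to the ZooKeeper server's data directory again, so the persisted data is still there. The same holds for Cassandra, Consul, etcd, or any other database.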
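The reason a recreated pod finds its old data is that, by default, deleting a StatefulSet leaves its PersistentVolumeClaims in place, and each pod looks up its claim by a fixed name built from the claim template name, the StatefulSet name, and the pod ordinal. A minimal sketch of that naming rule follows; `pvc_name` is a hypothetical helper, and `datadir` is an assumed template name that should be checked against the actual manifest.

```shell
#!/bin/sh
# pvc_name: hypothetical helper that builds the claim name a StatefulSet
# pod uses: <volumeClaimTemplate name>-<StatefulSet name>-<ordinal>.
pvc_name() {
  echo "$1-$2-$3"
}

# "datadir" is an assumed template name; check the manifest for the real one.
for i in 0 1 2; do
  pvc_name datadir zk "$i"
done

# Against a live cluster, the surviving claims can be listed with:
#   kubectl get pvc
```

Because the recreated pod `zk-0` asks for the same claim name as its predecessor, it is bound to the same volume.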

References

  • https://github.com/cloudurable/kube-zookeeper-statefulsets/wiki/Tutorial-Part-1:--Managing-Kubernetes-StatefulSets-using-ZooKeeper-and-Minikube
  • https://kubernetes.io/docs/tutorials/stateful-application/zookeeper
