Deploying EMQX on K8S

Installation method 1:

1. Quickly deploy a simple EMQX cluster:

  Add the Helm repository:

$ helm repo add emqx https://repos.emqx.io/charts

# Search for the EMQX charts
$ helm search repo emqx

  Start the EMQX cluster with service.type=NodePort:

$ helm install my-emqx emqx/emqx --set service.type=NodePort
NAME: my-emqx
LAST DEPLOYED: Thu Jul 11 10:56:16 2024
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None

  Check the state of the EMQX cluster:

$ kubectl get pods -o wide
NAME                       READY   STATUS              RESTARTS       AGE    IP              NODE        NOMINATED NODE   READINESS GATES
my-emqx-0                  0/1     ContainerCreating   0              12s    <none>          master-01   <none>           <none>
my-emqx-1                  0/1     ContainerCreating   0              12s    <none>          worker-02   <none>           <none>
my-emqx-2                  0/1     ContainerCreating   0              12s    <none>          worker-01   <none>           <none>

$ kubectl get pods -o wide
NAME                       READY   STATUS    RESTARTS       AGE    IP              NODE        NOMINATED NODE   READINESS GATES
my-emqx-0                  1/1     Running   0              3m7s   172.20.32.158   master-01   <none>           <none>
my-emqx-1                  1/1     Running   0              3m7s   172.20.58.232   worker-02   <none>           <none>
my-emqx-2                  1/1     Running   0              3m7s   172.20.85.250   worker-01   <none>           <none>

$ kubectl exec -it my-emqx-0 -- emqx_ctl status
Node 'my-emqx@my-emqx-0.my-emqx-headless.default.svc.cluster.local' 5.7.1 is started

$ kubectl exec -it my-emqx-0 -- emqx_ctl cluster status
Cluster status: #{running_nodes =>
                      ['my-emqx@my-emqx-0.my-emqx-headless.default.svc.cluster.local',
                       'my-emqx@my-emqx-1.my-emqx-headless.default.svc.cluster.local',
                       'my-emqx@my-emqx-2.my-emqx-headless.default.svc.cluster.local'],
                  stopped_nodes => []}

  Check the EMQX Services:

$ kubectl get svc -o wide
NAME                             TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                                                       AGE     SELECTOR
service/my-emqx                  NodePort    10.68.183.22    <none>        1883:32596/TCP,8883:31933/TCP,8083:32522/TCP,8084:32535/TCP,18083:32717/TCP   10m     app.kubernetes.io/instance=my-emqx,app.kubernetes.io/name=emqx
service/my-emqx-headless         ClusterIP   None            <none>        1883/TCP,8883/TCP,8083/TCP,8084/TCP,18083/TCP,4370/TCP                        10m     app.kubernetes.io/instance=my-emqx,app.kubernetes.io/name=emqx

# Alternatively, run this command to list everything
$ sudo kubectl get all -o wide
NAME                           READY   STATUS    RESTARTS       AGE    IP              NODE        NOMINATED NODE   READINESS GATES
pod/my-emqx-0                  1/1     Running   0              10m    172.20.32.158   master-01   <none>           <none>
pod/my-emqx-1                  1/1     Running   0              10m    172.20.58.232   worker-02   <none>           <none>
pod/my-emqx-2                  1/1     Running   0              10m    172.20.85.250   worker-01   <none>           <none>

NAME                             TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                                                       AGE     SELECTOR
service/my-emqx                  NodePort    10.68.183.22    <none>        1883:32596/TCP,8883:31933/TCP,8083:32522/TCP,8084:32535/TCP,18083:32717/TCP   10m     app.kubernetes.io/instance=my-emqx,app.kubernetes.io/name=emqx
service/my-emqx-headless         ClusterIP   None            <none>        1883/TCP,8883/TCP,8083/TCP,8084/TCP,18083/TCP,4370/TCP                        10m     app.kubernetes.io/instance=my-emqx,app.kubernetes.io/name=emqx

NAME                                      READY   AGE    CONTAINERS   IMAGES
statefulset.apps/my-emqx                  3/3     10m    emqx         emqx/emqx:5.7.1

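  The output shows that port 18083 of my-emqx is mapped to port 32717 on the host. (The NodePort changes with each deployment; use the value from your own deployment.)

  Open port 32717 on the IP of any Kubernetes node, then log in to the EMQX dashboard with the default username admin and the default password public.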
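
  Because the NodePort differs on every deployment, it can help to read it from the PORT(S) column rather than copy it by eye. The following is a minimal sketch that only parses text; the function name node_port_for is hypothetical, and the sample string is the PORT(S) value from the output above.

```shell
#!/bin/sh
# Extract the node port mapped to a given service port from a PORT(S) string.
# node_port_for is a hypothetical helper, not part of kubectl or EMQX.
node_port_for() {
  # $1: service port (for example 18083)
  # $2: PORT(S) value as printed by `kubectl get svc`
  printf '%s\n' "$2" | tr ',' '\n' | awk -F'[:/]' -v p="$1" '$1 == p { print $2 }'
}

# Sample value copied from the `kubectl get svc -o wide` output above.
PORTS='1883:32596/TCP,8883:31933/TCP,8083:32522/TCP,8084:32535/TCP,18083:32717/TCP'

# Print the node port of the dashboard (service port 18083).
node_port_for 18083 "$PORTS"
```

  On a live cluster, kubectl's -o jsonpath output is probably a more robust way to read the same field than parsing the table.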

  Delete the EMQX cluster:

$ helm uninstall my-emqx
release "my-emqx" uninstalled
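
2. Deploy a persistent EMQX cluster:

  EMQX persists its Pods' data by creating PVC resources and mounting them at /opt/emqx/data/mnesia. Before deploying EMQX, you need to deploy a load balancer such as HAProxy or NGINX Plus, and create the PVC or StorageClass resources in Kubernetes yourself.

  If you have already created a PVC, set persistence.existingClaim=your_pv_name: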

$ helm install my-emqx emqx/emqx --set persistence.enabled=true --set persistence.existingClaim=your_pv_name

  If you have already created a StorageClass, set persistence.storageClass=your_storageClass_name:

$ helm install my-emqx emqx/emqx --set persistence.enabled=true --set persistence.storageClass=your_storageClass_name

  Check the state of the EMQX cluster:

$ kubectl get pods
NAME        READY   STATUS    RESTARTS   AGE
my-emqx-0   1/1     Running   0          56s
my-emqx-1   1/1     Running   0          40s
my-emqx-2   1/1     Running   0          21s

$ kubectl exec -it my-emqx-0 -- emqx_ctl cluster status
Cluster status: #{running_nodes =>
                      ['my-emqx@my-emqx-0.my-emqx-headless.default.svc.cluster.local',
                       'my-emqx@my-emqx-1.my-emqx-headless.default.svc.cluster.local',
                       'my-emqx@my-emqx-2.my-emqx-headless.default.svc.cluster.local'],
                  stopped_nodes => []}

  Taking the StorageClass case as an example, the PVCs have been created successfully:

$ kubectl get pvc
NAME                  STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
emqx-data-my-emqx-0   Bound     pvc-8094cd75-adb5-11e9-80cc-0697b59e8064   1Gi        RWO            gp2            2m11s
emqx-data-my-emqx-1   Bound     pvc-9325441d-adb5-11e9-80cc-0697b59e8064   1Gi        RWO            gp2            99s
emqx-data-my-emqx-2   Bound     pvc-ad425e9d-adb5-11e9-80cc-0697b59e8064   1Gi        RWO            gp2            56s

  The cluster mounts the PVC at EMQX's /opt/emqx/data/mnesia directory, so after a Pod is rescheduled EMQX reads the data in /opt/emqx/data/mnesia and recovers. Next, look up the ClusterIP of EMQX:

$ kubectl get svc
NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                                  AGE
my-emqx              ClusterIP   10.100.205.13   <none>        1883/TCP,8883/TCP,8081/TCP,8083/TCP,8084/TCP,18083/TCP   26m
my-emqx-headless     ClusterIP   None            <none>        1883/TCP,8883/TCP,8081/TCP,8083/TCP,8084/TCP,18083/TCP   26m

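  The output shows that the ClusterIP of my-emqx is 10.100.205.13. (The ClusterIP changes with each deployment; use the value from your own deployment.)

  Forward ports 1883, 8883, 8081, 8083, 8084 and 18083 of the URL the load balancer listens on to the ClusterIP of my-emqx. If you need TLS, terminating it at the load balancer is recommended: clients connect to the load balancer over TLS, and the load balancer connects to EMQX over plain TCP.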
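
  To make the forwarding rule concrete, here is a rough sketch that prints one HAProxy listen section per EMQX port, in plain TCP mode. It assumes HAProxy is the load balancer and that it can reach the ClusterIP; the function name and the section names emqx_<port> are hypothetical, and TLS termination is deliberately left out.

```shell
#!/bin/sh
# Print HAProxy "listen" sections that forward each EMQX port to the Service.
# emqx_haproxy_sections is a hypothetical helper; the output is a config sketch,
# not a complete haproxy.cfg (no global/defaults sections, no TLS termination).
emqx_haproxy_sections() {
  # $1: ClusterIP of the my-emqx Service
  for port in 1883 8883 8081 8083 8084 18083; do
    printf 'listen emqx_%s\n' "$port"
    printf '    bind *:%s\n' "$port"
    printf '    mode tcp\n'
    printf '    server emqx %s:%s\n\n' "$1" "$port"
  done
}

# ClusterIP taken from the `kubectl get svc` output above; replace it with your own.
emqx_haproxy_sections 10.100.205.13
```

  The printed sections could be appended to an existing HAProxy configuration once the ClusterIP has been replaced with the real one.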

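  Open URL:18083 and log in to the EMQX dashboard with the default username admin and the default password public.

  The helm upgrade command makes it easy to scale the EMQX cluster. The example below adds EMQX nodes:

# Change the number of EMQX nodes to 5
# Note: an odd number of EMQX nodes is recommended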
$ helm upgrade --set replicaCount=5 my-emqx emqx/emqx
Release "my-emqx" has been upgraded. Happy Helming!

$ kubectl get pods
NAME       READY  STATUS             RESTARTS  AGE
my-emqx-0  1/1    Running            0         4m25s
my-emqx-1  1/1    Running            0         4m14s
my-emqx-2  1/1    Running            0         4m
my-emqx-3  1/1    Running            0         31s
my-emqx-4  1/1    Running            0         15s

$ kubectl exec -it my-emqx-0 -- emqx_ctl cluster status
Cluster status: #{running_nodes =>
                      ['my-emqx@my-emqx-0.my-emqx-headless.default.svc.cluster.local',
                       'my-emqx@my-emqx-1.my-emqx-headless.default.svc.cluster.local',
                       'my-emqx@my-emqx-2.my-emqx-headless.default.svc.cluster.local',
                       'my-emqx@my-emqx-3.my-emqx-headless.default.svc.cluster.local',
                       'my-emqx@my-emqx-4.my-emqx-headless.default.svc.cluster.local'],
                  stopped_nodes => []}

  Delete the EMQX cluster:

$ helm uninstall my-emqx
release "my-emqx" uninstalled

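  Note: after the EMQX cluster is deleted, the PVCs are not released automatically, so that EMQX can be restored later. Once you are sure no restore is needed, delete the PVCs manually: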

$ kubectl get pvc
NAME                  STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
emqx-data-my-emqx-0   Bound     pvc-8094cd75-adb5-11e9-80cc-0697b59e8064   1Gi        RWO            gp2            84m
emqx-data-my-emqx-1   Bound     pvc-9325441d-adb5-11e9-80cc-0697b59e8064   1Gi        RWO            gp2            84m
emqx-data-my-emqx-2   Bound     pvc-ad425e9d-adb5-11e9-80cc-0697b59e8064   1Gi        RWO            gp2            83m
emqx-data-my-emqx-3   Bound     pvc-b6c5a565-adbd-11e9-80cc-0697b59e8064   1Gi        RWO            gp2            25m
emqx-data-my-emqx-4   Bound     pvc-c626cafd-adbd-11e9-80cc-0697b59e8064   1Gi        RWO            gp2            25m

$ kubectl delete pvc emqx-data-my-emqx-0 emqx-data-my-emqx-1 emqx-data-my-emqx-2 emqx-data-my-emqx-3 emqx-data-my-emqx-4                    
persistentvolumeclaim "emqx-data-my-emqx-0" deleted
persistentvolumeclaim "emqx-data-my-emqx-1" deleted
persistentvolumeclaim "emqx-data-my-emqx-2" deleted
persistentvolumeclaim "emqx-data-my-emqx-3" deleted
persistentvolumeclaim "emqx-data-my-emqx-4" deleted
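
  The PVC names in the listing above follow the pattern <claim>-<release>-<ordinal>, so the list passed to kubectl delete pvc can be generated instead of typed. This is a small sketch under that assumption; pvc_names is a hypothetical helper, and the claim name emqx-data and release name my-emqx are the ones shown in the output above.

```shell
#!/bin/sh
# Generate the PVC names created for a StatefulSet-based release.
# pvc_names is a hypothetical helper; it only prints names and deletes nothing.
pvc_names() {
  # $1: volume claim name, $2: release (StatefulSet) name, $3: replica count
  i=0
  while [ "$i" -lt "$3" ]; do
    printf '%s-%s-%s\n' "$1" "$2" "$i"
    i=$((i + 1))
  done
}

# Names for the 5-node cluster shown above.
pvc_names emqx-data my-emqx 5
```

  After checking the printed names against kubectl get pvc, they could be passed on, for example as kubectl delete pvc $(pvc_names emqx-data my-emqx 5).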

Reference:
在K8S上部署EMQX企业版集群 (Deploying an EMQX Enterprise cluster on K8S)

  A command verified to work in practice:

helm install my-emqx emqx/emqx -n emqx --create-namespace --set service.type=NodePort --set service.nodePorts.dashboard=32717 --set service.nodePorts.mqtt=32596 --set persistence.storageClass=nfs

3. Deploy an EMQX Edge cluster and an EMQX Enterprise cluster:

  EMQX Edge: to deploy an EMQX Edge cluster, set image.repository=emqx/emqx-edge; all other settings are the same as for the EMQX cluster.

$ helm install my-emqx-edge emqx/emqx --set image.repository=emqx/emqx-edge

$ kubectl get pods
NAME            READY   STATUS    RESTARTS   AGE
my-emqx-edge-0  1/1     Running   0          35s
my-emqx-edge-1  1/1     Running   0          23s
my-emqx-edge-2  1/1     Running   0          9s

  EMQX EE: to deploy an EMQX Enterprise cluster, first apply for and download a License file from www.emqx.com, then create a Secret from the License file:

$ kubectl create secret generic your-license-secret-name --from-file=/path/to/emqx.lic

  Then, when deploying, use the emqx/emqx-ee chart and set emqxLicneseSecretName=your-license-secret-name; all other settings are the same as for the EMQX cluster.

$ helm install my-emqx-ee emqx/emqx-ee --set emqxLicneseSecretName=your-license-secret-name

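Installation method 2: custom deployment

  Adapted from: 从零开始建立 EMQX MQTT 服务器的 K8S 集群 (Building a K8S cluster of EMQX MQTT servers from scratch)

1. Deploy EMQX Broker directly with a Pod

  EMQX Broker publishes images on Docker Hub, so it is easy to deploy EMQX Broker in a single Pod. Use the kubectl run command to create a Pod running EMQX Broker: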

$ kubectl run emqx --image=emqx/emqx:v4.1-rc.1  --generator=run-pod/v1
error: unknown flag: --generator
See 'kubectl run --help' for usage.

# Newer kubectl releases no longer accept --generator; run the command without it:
$ kubectl run emqx --image=emqx/emqx:v4.1-rc.1
pod/emqx created

  Check the state of EMQX Broker:

$ kubectl get pods -o wide
NAME                       READY   STATUS              RESTARTS       AGE    IP              NODE        NOMINATED NODE   READINESS GATES
emqx                       0/1     ContainerCreating   0              8s     <none>          master-01   <none>           <none>

$ kubectl get pods -o wide
NAME                       READY   STATUS    RESTARTS       AGE    IP              NODE        NOMINATED NODE   READINESS GATES
emqx                       1/1     Running   0              97s    172.20.32.139   master-01   <none>           <none>

$ kubectl exec emqx -- emqx_ctl status
Node 'emqx@172.20.32.139' is started
emqx 4.1-rc.1 is running

  Delete the Pod:

$ kubectl delete pods emqx
pod "emqx" deleted

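  A Pod is not designed to be a durable resource: it does not survive a scheduling failure, a node crash, or other evictions (for example due to a lack of resources, or node maintenance). A controller is therefore needed to manage Pods.

2. Deploy the Pod with a Deployment

  A Deployment provides a declarative way to define Pods and ReplicaSets, replacing the older ReplicationController, and makes applications easier to manage. Typical use cases include:

  • Defining a Deployment to create Pods and ReplicaSets
  • Rolling upgrades and rollbacks of an application
  • Scaling out and in
  • Pausing and resuming a Deployment

  Deploy one EMQX Broker Pod with a Deployment: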

$ vim deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: emqx-deployment
  labels:
    app: emqx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: emqx
  template:
    metadata:
      labels:
        app: emqx
    spec:
      containers:
      - name: emqx
        image: emqx/emqx:v4.1-rc.1
        ports:
        - name: mqtt
          containerPort: 1883
        - name: mqttssl
          containerPort: 8883
        - name: mgmt
          containerPort: 8081
        - name: ws
          containerPort: 8083
        - name: wss
          containerPort: 8084
        - name: dashboard
          containerPort: 18083

  Apply the Deployment:

$ kubectl apply -f deployment.yaml
deployment.apps/emqx-deployment created

  Check the deployment:

$ kubectl get deployment
NAME              READY   UP-TO-DATE   AVAILABLE   AGE
emqx-deployment   0/1     1            0           49s

$ kubectl get pods
NAME                               READY   STATUS    RESTARTS       AGE
emqx-deployment-75bd4f75b6-wf9sh   1/1     Running   0              73s

$ kubectl exec pod/emqx-deployment-75bd4f75b6-wf9sh -- emqx_ctl status
Node 'emqx-deployment-75bd4f75b6-wf9sh@172.20.58.199' is started
emqx 4.1-rc.1 is running

$ kubectl get pods
NAME                               READY   STATUS    RESTARTS       AGE
emqx-deployment-75bd4f75b6-8njhv   1/1     Running   0              55s

  Try deleting the Pod manually:

$ kubectl delete pods emqx-deployment-75bd4f75b6-8njhv
pod "emqx-deployment-75bd4f75b6-8njhv" deleted

$ kubectl get pods
NAME                              READY   STATUS    RESTARTS   AGE
emqx-deployment-68fcb4bfd6-2nhh6   1/1     Running   0          59s

  The output shows that the EMQX Broker Pod was deployed successfully with a Deployment; even if the Pod is terminated unexpectedly, the Deployment creates a new one.

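3. Expose the EMQX Broker Pod with a Service

  Kubernetes Pods are mortal: they are created, and once destroyed they are not started again. If you run an application with a Deployment, it creates and destroys Pods dynamically.

  Each Pod gets its own IP address, but in a Deployment the set of Pods running at one moment may differ from the set running the application a moment later.

  This leads to a problem: if EMQX Broker Pods serve MQTT clients, how do the clients find out and keep track of which IP address to connect to in order to use the EMQX Broker service?

  The answer is: Service

  A Service is an abstract way to expose an application running on a set of Pods as a network service.

  Use a Service to expose the EMQX Broker Pod as a network service:

$ vim service.yaml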
apiVersion: v1
kind: Service
metadata:
  name: emqx-service
spec:
  selector:
    app: emqx
  ports:
    - name: mqtt
      port: 1883
      protocol: TCP
      targetPort: mqtt
    - name: mqttssl
      port: 8883
      protocol: TCP
      targetPort: mqttssl
    - name: mgmt
      port: 8081
      protocol: TCP
      targetPort: mgmt
    - name: ws
      port: 8083
      protocol: TCP
      targetPort: ws
    - name: wss
      port: 8084
      protocol: TCP
      targetPort: wss
    - name: dashboard
      port: 18083
      protocol: TCP
      targetPort: dashboard

  Apply the Service:

$ kubectl apply -f service.yaml
service/emqx-service created

  Check the deployment:

$ kubectl get svc
NAME                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                                  AGE
emqx-service             ClusterIP   10.68.228.164   <none>        1883/TCP,8883/TCP,8081/TCP,8083/TCP,8084/TCP,18083/TCP   16s

  Query the EMQX Broker API through the IP provided by the Service:

$ curl 10.68.228.164:8081/status
Node emqx-deployment-75bd4f75b6-8njhv@172.20.32.140 is started
emqx is running

  At this point a single EMQX Broker node is deployed on Kubernetes: the Deployment manages the EMQX Broker Pod, and the Service exposes the EMQX Broker service.

4. Automatic clustering of EMQX MQTT servers through Kubernetes

  The section above deployed a single EMQX Broker Pod with a Deployment. Scaling the number of Pods with a Deployment is very convenient: run kubectl scale deployment ${deployment_name} --replicas ${number}. The following scales EMQX Broker to 3 Pods:

$ kubectl scale deployment emqx-deployment --replicas 3
deployment.apps/emqx-deployment scaled

$ kubectl get pods
NAME                               READY   STATUS              RESTARTS       AGE
emqx-deployment-75bd4f75b6-8njhv   1/1     Running             0              8m41s
emqx-deployment-75bd4f75b6-kjv22   0/1     ContainerCreating   0              12s
emqx-deployment-75bd4f75b6-thcbv   1/1     Running             0              12s

$ kubectl exec emqx-deployment-75bd4f75b6-8njhv -- emqx_ctl status
Node 'emqx-deployment-75bd4f75b6-8njhv@172.20.32.140' is started
emqx 4.1-rc.1 is running

$ kubectl exec emqx-deployment-75bd4f75b6-8njhv -- emqx_ctl cluster status
Cluster status: #{running_nodes =>
                      ['emqx-deployment-75bd4f75b6-8njhv@172.20.32.140'],
                  stopped_nodes => []}

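  The number of EMQX Broker Pods has been scaled to 3, but each Pod is independent and no cluster has been formed. Next, try to cluster the EMQX Broker Pods automatically through Kubernetes.

5. Modify the EMQX Broker configuration

  The section on automatic clustering in the EMQX Broker documentation shows that the EMQX Broker configuration needs to be changed: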

cluster.discovery = kubernetes
cluster.kubernetes.apiserver = http://10.110.111.204:8080
cluster.kubernetes.service_name = ekka
cluster.kubernetes.address_type = ip
cluster.kubernetes.app_name = ekka

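  Here cluster.kubernetes.apiserver is the address of the Kubernetes apiserver, which can be obtained with the kubectl cluster-info command; cluster.kubernetes.service_name is the name of the Service defined above; and cluster.kubernetes.app_name is the part of EMQX Broker's node.name before the @ symbol, so all EMQX Broker nodes in the cluster also need the same node.name prefix.

  The EMQX Broker Docker image supports changing the configuration through environment variables; see Docker Hub or GitHub for details.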
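
  As I understand the image's README for EMQX 4.x, a configuration key is derived from an environment variable by dropping the EMQX_ prefix, lowercasing the name, and turning each double underscore into a dot. The sketch below only mimics that naming rule in shell so the effect of an underscore typo is visible; emqx_env_to_key is a hypothetical helper, not something the image ships, and the rule itself should be checked against the README of the image version in use.

```shell
#!/bin/sh
# Mimic the assumed env-var-to-config-key naming rule of the EMQX 4.x image:
#   1. drop the EMQX_ prefix, 2. replace "__" with ".", 3. lowercase the rest.
# emqx_env_to_key is a hypothetical helper used for illustration only.
emqx_env_to_key() {
  printf '%s\n' "$1" | sed -e 's/^EMQX_//' -e 's/__/./g' | tr '[:upper:]' '[:lower:]'
}

# Double underscores separate the levels of the key.
emqx_env_to_key EMQX_CLUSTER__K8S__APP_NAME
# Single underscores stay inside one level, giving a different key.
emqx_env_to_key EMQX_CLUSTER_K8S_APP_NAME
```

  Under this rule the first call yields a dotted key of the cluster.k8s.* family, while the single-underscore spelling yields a key with no dots at all, which would not match any clustering setting.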

$ vim deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: emqx-deployment
  labels:
    app: emqx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: emqx
  template:
    metadata:
      labels:
        app: emqx
    spec:
      containers:
      - name: emqx
        image: emqx/emqx:v4.1-rc.1
        ports:
        - name: mqtt
          containerPort: 1883
        - name: mqttssl
          containerPort: 8883
        - name: mgmt
          containerPort: 8081
        - name: ws
          containerPort: 8083
        - name: wss
          containerPort: 8084
        - name: dashboard
          containerPort: 18083
        env:
        - name: EMQX_NAME
          value: emqx
        - name: EMQX_CLUSTER__DISCOVERY
          value: k8s
        - name: EMQX_CLUSTER__K8S__APP_NAME
          value: emqx
        - name: EMQX_CLUSTER__K8S__SERVICE_NAME
          value: emqx-service
        - name: EMQX_CLUSTER__K8S__APISERVER
          value: "https://kubernetes.default.svc:443"
        - name: EMQX_CLUSTER__K8S__NAMESPACE
          value: default

  Because the kubectl scale deployment ${deployment_name} --replicas ${number} command does not modify the YAML file, spec.replicas: 3 must be set when editing the YAML.

  Pods have the Kubernetes DNS rules built in, so https://kubernetes.default.svc:443 resolves to the address of the Kubernetes apiserver.

  Delete the previous Deployment and redeploy:

$ kubectl delete deployment emqx-deployment
deployment.apps "emqx-deployment" deleted

$ kubectl apply -f deployment.yaml
deployment.apps/emqx-deployment created
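
6. Grant the Pods permission to access the Kubernetes apiserver

  After applying the Deployment above, check the state of EMQX Broker. EMQX Broker started successfully but still did not form a cluster, so look at the log of an EMQX Broker Pod: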

$ kubectl get pods
NAME                               READY   STATUS    RESTARTS       AGE
emqx-deployment-64d5cc575c-dsrgj   1/1     Running   0              41s
emqx-deployment-64d5cc575c-k4twn   1/1     Running   0              41s
emqx-deployment-64d5cc575c-ktdqk   1/1     Running   0              41s

$ kubectl exec emqx-deployment-64d5cc575c-dsrgj -- emqx_ctl status
Node 'emqx@172.20.58.200' is started
emqx 4.1-rc.1 is running

$ kubectl exec emqx-deployment-64d5cc575c-dsrgj -- emqx_ctl cluster status
Cluster status: #{running_nodes => ['emqx@172.20.58.200'],stopped_nodes => []}

$ kubectl logs emqx-deployment-64d5cc575c-dsrgj
...
(emqx@172.20.58.200)1> 2024-07-11 08:14:37.525 [error] Ekka(AutoCluster): Discovery error: {403,"{\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metadata\":{},\"status\":\"Failure\",\"message\":\"endpoints \\\"emqx-service\\\" is forbidden: User \\\"system:serviceaccount:default:default\\\" cannot get resource \\\"endpoints\\\" in API group \\\"\\\" in the namespace \\\"default\\\"\",\"reason\":\"Forbidden\",\"details\":{\"name\":\"emqx-service\",\"kind\":\"endpoints\"},\"code\":403}\n"}
...

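  The Pod is denied when it accesses the Kubernetes apiserver because it lacks permission; the apiserver returns HTTP 403, so clustering fails.

  An ordinary Pod cannot access the Kubernetes apiserver. There are two ways to solve this: one is to open the apiserver's HTTP interface, which carries security risks; the other is to configure RBAC authorization with a ServiceAccount, Role and RoleBinding.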

$ vim rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: default
  name: emqx
---
kind: Role
apiVersion: rbac.authorization.kubernetes.io/v1beta1
metadata:
  namespace: default
  name: emqx
rules:
- apiGroups:
  - ""
  resources:
  - endpoints 
  verbs: 
  - get
  - watch
  - list
---
kind: RoleBinding
apiVersion: rbac.authorization.kubernetes.io/v1beta1
metadata:
  namespace: default
  name: emqx
subjects:
  - kind: ServiceAccount
    name: emqx
    namespace: default
roleRef:
  kind: Role
  name: emqx
  apiGroup: rbac.authorization.kubernetes.io

  Apply the resources:

$ kubectl apply -f rbac.yaml
serviceaccount/emqx created
resource mapping not found for name: "emqx" namespace: "default" from "rbac.yaml": no matches for kind "Role" in version "rbac.authorization.kubernetes.io/v1beta1"
ensure CRDs are installed first
resource mapping not found for name: "emqx" namespace: "default" from "rbac.yaml": no matches for kind "RoleBinding" in version "rbac.authorization.kubernetes.io/v1beta1"
ensure CRDs are installed first

$ kubectl explain role
GROUP:      rbac.authorization.k8s.io
KIND:       Role
VERSION:    v1

$ kubectl explain RoleBinding
GROUP:      rbac.authorization.k8s.io
KIND:       RoleBinding
VERSION:    v1

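  To fix the errors above, edit rbac.yaml again and use the rbac.authorization.k8s.io/v1 API group and version reported by kubectl explain: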

apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: default
  name: emqx
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: default
  name: emqx
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  verbs:
  - get
  - watch
  - list
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: default
  name: emqx
subjects:
  - kind: ServiceAccount
    name: emqx
    namespace: default
roleRef:
  kind: Role
  name: emqx
  apiGroup: rbac.authorization.k8s.io

$ kubectl delete serviceaccount emqx
serviceaccount "emqx" deleted
$ kubectl delete role emqx
role.rbac.authorization.k8s.io "emqx" deleted
$ kubectl delete rolebinding emqx
rolebinding.rbac.authorization.k8s.io "emqx" deleted

$ kubectl apply -f rbac.yaml
serviceaccount/emqx created
role.rbac.authorization.k8s.io/emqx created
rolebinding.rbac.authorization.k8s.io/emqx created

$ kubectl get serviceaccount
NAME                    SECRETS   AGE
emqx                    0         16s

$ kubectl get role
NAME   CREATED AT
emqx   2024-07-11T08:45:48Z

$ kubectl get rolebinding
NAME   ROLE        AGE
emqx   Role/emqx   51s
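
  Before redeploying, it may be worth checking that the new ServiceAccount can actually read endpoints, since that is the permission the 403 error complained about. The sketch below only builds and prints a kubectl auth can-i command for a given namespace and ServiceAccount; sa_user and can_i_command are hypothetical helper names, and the command still has to be run against the cluster by hand.

```shell
#!/bin/sh
# Build the user name Kubernetes assigns to a ServiceAccount.
# sa_user and can_i_command are hypothetical helpers; nothing here contacts the cluster.
sa_user() {
  # $1: namespace, $2: ServiceAccount name
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Print a `kubectl auth can-i` command that impersonates the ServiceAccount
# and asks whether it may get endpoints in its namespace.
can_i_command() {
  # $1: namespace, $2: ServiceAccount name
  printf 'kubectl auth can-i get endpoints --namespace %s --as %s\n' \
    "$1" "$(sa_user "$1" "$2")"
}

# ServiceAccount and namespace from rbac.yaml above.
can_i_command default emqx
```

  Running the printed command should answer yes for the emqx ServiceAccount once the Role and RoleBinding are in place, and no for the default ServiceAccount named in the error log.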

  Edit the Deployment YAML to add spec.template.spec.serviceAccountName, then redeploy:

$ vim deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: emqx-deployment
  labels:
    app: emqx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: emqx
  template:
    metadata:
      labels:
        app: emqx
    spec:
      serviceAccountName: emqx
      containers:
      - name: emqx
        image: emqx/emqx:v4.1-rc.1
        ports:
        - name: mqtt
          containerPort: 1883
        - name: mqttssl
          containerPort: 8883
        - name: mgmt
          containerPort: 8081
        - name: ws
          containerPort: 8083
        - name: wss
          containerPort: 8084
        - name: dashboard
          containerPort: 18083
        env:
        - name: EMQX_NAME
          value: emqx
        - name: EMQX_CLUSTER__DISCOVERY
          value: kubernetes
        - name: EMQX_CLUSTER__K8S__APP_NAME
          value: emqx
        - name: EMQX_CLUSTER__K8S__SERVICE_NAME
          value: emqx-service
        - name: EMQX_CLUSTER__K8S__APISERVER
          value: "https://kubernetes.default.svc:443"
        - name: EMQX_CLUSTER__K8S__NAMESPACE
          value: default

$ kubectl delete deployment emqx-deployment
deployment.apps "emqx-deployment" deleted
$ kubectl apply -f deployment.yaml
deployment.apps/emqx-deployment created

  Check the state:

$ kubectl get pods
NAME                               READY   STATUS    RESTARTS       AGE
emqx-deployment-7c4675f785-6r6dg   1/1     Running   0              28s
emqx-deployment-7c4675f785-cqbgh   1/1     Running   0              28s
emqx-deployment-7c4675f785-nghf6   1/1     Running   0              28s

$ kubectl exec emqx-deployment-7c4675f785-6r6dg  -- emqx_ctl status
Node 'emqx@127.0.0.1' not responding to pings.
command terminated with exit code 1

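  First attempt: I assumed the referenced article had written too many underscores, so I switched to single underscores. EMQX now runs, but not in cluster mode: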

$ vim deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: emqx-deployment
  labels:
    app: emqx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: emqx
  template:
    metadata:
      labels:
        app: emqx
    spec:
      serviceAccountName: emqx
      containers:
      - name: emqx
        image: emqx/emqx:v4.1-rc.1
        ports:
        - name: mqtt
          containerPort: 1883
        - name: mqttssl
          containerPort: 8883
        - name: mgmt
          containerPort: 8081
        - name: ws
          containerPort: 8083
        - name: wss
          containerPort: 8084
        - name: dashboard
          containerPort: 18083
        env:
        - name: EMQX_NAME
          value: emqx
        - name: EMQX_CLUSTER_DISCOVERY
          value: kubernetes
        - name: EMQX_CLUSTER_K8S_APP_NAME
          value: emqx
        - name: EMQX_CLUSTER_K8S_SERVICE_NAME
          value: emqx-service
        - name: EMQX_CLUSTER_K8S_APISERVER
          value: "https://kubernetes.default.svc:443"
        - name: EMQX_CLUSTER_K8S_NAMESPACE
          value: default

$ kubectl delete deployment emqx-deployment
deployment.apps "emqx-deployment" deleted
$ kubectl apply -f deployment.yaml
deployment.apps/emqx-deployment created
$ kubectl get pods
NAME                               READY   STATUS    RESTARTS       AGE
emqx-deployment-794478ff56-2rqwp   1/1     Running   0              19s
emqx-deployment-794478ff56-4jnmt   1/1     Running   0              19s
emqx-deployment-794478ff56-m4mk4   1/1     Running   0              19s
$ kubectl exec emqx-deployment-794478ff56-2rqwp -- emqx_ctl status
Node 'emqx@172.20.85.238' is started
emqx 4.1-rc.1 is running
$ kubectl exec emqx-deployment-794478ff56-bzck8 -- emqx_ctl cluster status
Cluster status: #{running_nodes => ['emqx@172.20.85.244'],stopped_nodes => []}

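  Final fix: the referenced article leaves a big pitfall here. The value of EMQX_CLUSTER__DISCOVERY must be changed from kubernetes to k8s: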

$ vim deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: emqx-deployment
  labels:
    app: emqx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: emqx
  template:
    metadata:
      labels:
        app: emqx
    spec:
      serviceAccountName: emqx
      containers:
      - name: emqx
        image: emqx/emqx:v4.1-rc.1
        ports:
        - name: mqtt
          containerPort: 1883
        - name: mqttssl
          containerPort: 8883
        - name: mgmt
          containerPort: 8081
        - name: ws
          containerPort: 8083
        - name: wss
          containerPort: 8084
        - name: dashboard
          containerPort: 18083
        env:
        - name: EMQX_NAME
          value: emqx
        - name: EMQX_CLUSTER__DISCOVERY
          value: k8s
        - name: EMQX_CLUSTER__K8S__APP_NAME
          value: emqx
        - name: EMQX_CLUSTER__K8S__SERVICE_NAME
          value: emqx-service
        - name: EMQX_CLUSTER__K8S__APISERVER
          value: "https://kubernetes.default.svc:443"
        - name: EMQX_CLUSTER__K8S__NAMESPACE
          value: default

$ kubectl delete deployment emqx-deployment
deployment.apps "emqx-deployment" deleted
$ kubectl apply -f deployment.yaml
deployment.apps/emqx-deployment created

$ kubectl get pods
NAME                               READY   STATUS    RESTARTS     AGE
emqx-deployment-5884fd896f-b6hz9   1/1     Running   0            3s
emqx-deployment-5884fd896f-hvd6v   1/1     Running   0            3s
emqx-deployment-5884fd896f-jv4ck   1/1     Running   0            3s

$ kubectl exec emqx-deployment-5884fd896f-b6hz9 -- emqx_ctl status
Node 'emqx@172.20.32.168' is started
emqx 4.1-rc.1 is running
$ kubectl exec emqx-deployment-5884fd896f-b6hz9 -- emqx_ctl cluster status
Cluster status: #{running_nodes =>
                      ['emqx@172.20.32.168','emqx@172.20.58.237',
                       'emqx@172.20.85.197'],
                  stopped_nodes => []}

  Terminate a Pod:

$ kubectl delete pods emqx-deployment-5884fd896f-b6hz9 --force
Warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "emqx-deployment-5884fd896f-b6hz9" force deleted

$ kubectl get pods
NAME                               READY   STATUS    RESTARTS     AGE
emqx-deployment-5884fd896f-84wxw   1/1     Running   0            34s
emqx-deployment-5884fd896f-hvd6v   1/1     Running   0            5m11s
emqx-deployment-5884fd896f-jv4ck   1/1     Running   0            5m11s

$ kubectl exec emqx-deployment-5884fd896f-84wxw -- emqx_ctl cluster status
Cluster status: #{running_nodes =>
                      ['emqx@172.20.32.169','emqx@172.20.58.237',
                       'emqx@172.20.85.197'],
                  stopped_nodes => ['emqx@172.20.32.168']}

  The output shows that EMQX Broker correctly reports the stopped Pod and adds the new Pod created by the Deployment to the cluster. EMQX Broker has now formed a cluster on Kubernetes.

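7. Persist the EMQX Broker cluster

  The sections above use a Deployment to manage Pods, but a Pod's network identity keeps changing, and when a Pod is destroyed and recreated the data and configuration stored in EMQX Broker are lost, which is unacceptable in production. Next, try to make the EMQX Broker cluster persistent, so that EMQX Broker's data survives even when Pods are destroyed and recreated.

  A ConfigMap is an API object used to store non-confidential data in key-value pairs. It can be consumed as environment variables, command-line arguments, or configuration files in a volume.

  A ConfigMap decouples environment-specific configuration from the container image, making application configuration easier to change.

  Next, record the EMQX Broker configuration in a ConfigMap and import it into the Deployment as environment variables. Define the ConfigMap and apply it:

$ vim configmap.yaml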
apiVersion: v1
kind: ConfigMap
metadata:
  name: emqx-config
data:
  EMQX_CLUSTER__K8S__ADDRESS_TYPE: "hostname"
  EMQX_CLUSTER__K8S__APISERVER: "https://kubernetes.default.svc:443"
  EMQX_CLUSTER__K8S__SUFFIX: "svc.cluster.local"

$ kubectl apply -f configmap.yaml
configmap/emqx-config created

  Configure the Deployment to use the ConfigMap:

$ vim deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: emqx-deployment
  labels:
    app: emqx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: emqx
  template:
    metadata:
      labels:
        app: emqx
    spec:
      serviceAccountName: emqx
      containers:
      - name: emqx
        image: emqx/emqx:v4.1-rc.1
        ports:
        - name: mqtt
          containerPort: 1883
        - name: mqttssl
          containerPort: 8883
        - name: mgmt
          containerPort: 8081
        - name: ws
          containerPort: 8083
        - name: wss
          containerPort: 8084
        - name: dashboard
          containerPort: 18083
        envFrom:
          - configMapRef:
              name: emqx-config

  Redeploy the Deployment and check the state:

$ kubectl delete deployment emqx-deployment
deployment.apps "emqx-deployment" deleted
$ kubectl apply -f deployment.yaml
deployment.apps/emqx-deployment created

$ kubectl get pods
NAME                               READY   STATUS    RESTARTS     AGE
emqx-deployment-5958799d95-7lfw6   1/1     Running   0            20s
emqx-deployment-5958799d95-hfrvt   1/1     Running   0            20s
emqx-deployment-5958799d95-q6d7p   1/1     Running   0            20s

$ kubectl exec emqx-deployment-5958799d95-7lfw6 -- emqx_ctl status
Node 'emqx-deployment-5958799d95-7lfw6@172.20.58.246' is started
emqx 4.1-rc.1 is running

$ kubectl exec emqx-deployment-5958799d95-7lfw6 -- emqx_ctl cluster status
Cluster status: #{running_nodes =>
                      ['emqx-deployment-5958799d95-7lfw6@172.20.58.246'],
                  stopped_nodes => []}

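  Fix: the nodes did not cluster because the ConfigMap above lacks the settings that the Deployment used to pass through env (EMQX_NAME, the discovery strategy, the app name, the Service name and the namespace). Add them to the ConfigMap: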

$ vim configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: emqx-config
data:
  EMQX_NAME: "emqx"
  EMQX_CLUSTER__DISCOVERY: "k8s"
  EMQX_CLUSTER__K8S__APP_NAME: "emqx"
  EMQX_CLUSTER__K8S__SERVICE_NAME: "emqx-service"
  EMQX_CLUSTER__K8S__APISERVER: "https://kubernetes.default.svc:443"
  EMQX_CLUSTER__K8S__NAMESPACE: "default"

$ kubectl delete configmap emqx-config
configmap "emqx-config" deleted
$ kubectl delete deployment emqx-deployment
deployment.apps "emqx-deployment" deleted

$ kubectl apply -f configmap.yaml
$ kubectl apply -f deployment.yaml

$ kubectl get pods
NAME                               READY   STATUS    RESTARTS       AGE
emqx-deployment-5958799d95-6vkth   1/1     Running   0              76m
emqx-deployment-5958799d95-dtnwf   1/1     Running   0              76m
emqx-deployment-5958799d95-j4vcf   1/1     Running   0              76m
  
$ kubectl exec emqx-deployment-5958799d95-6vkth -- emqx_ctl status
Node 'emqx@172.20.58.245' is started
emqx 4.1-rc.1 is running

$ kubectl exec emqx-deployment-5958799d95-6vkth -- emqx_ctl cluster status
Cluster status: #{running_nodes =>
                      ['emqx@172.20.32.143','emqx@172.20.58.245',
                       'emqx@172.20.85.211'],
                  stopped_nodes => []}

  The EMQX Broker configuration is now decoupled into a ConfigMap. If needed, you can define one or more ConfigMaps and bring them into the Pod either as environment variables or as files.

8. StatefulSet

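  StatefulSet addresses the needs of stateful services (Deployments and ReplicaSets, by contrast, are designed for stateless services). Its use cases include:

  • Stable persistent storage: a Pod can still access the same persistent data after it is rescheduled, implemented with PVCs
  • Stable network identity: a Pod keeps its PodName and HostName after it is rescheduled, implemented with a Headless Service (a Service without a Cluster IP)
  • Ordered deployment and ordered scale-out (from 0 to N-1)
  • Ordered scale-in and ordered deletion (from N-1 to 0)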

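  From these use cases, a StatefulSet setup consists of the following parts:

  • A Headless Service that defines the network identity (DNS domain)
  • volumeClaimTemplates that create the PersistentVolumes
  • The StatefulSet that defines the application itself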

  Each Pod in a StatefulSet has a DNS name of the form statefulSetName-{0..N-1}.serviceName.namespace.svc.cluster.local, where:

  • serviceName is the name of the Headless Service
  • 0..N-1 is the ordinal of the Pod, from 0 to N-1
  • statefulSetName is the name of the StatefulSet
  • namespace is the namespace the service runs in; the Headless Service and the StatefulSet must be in the same namespace
  • .cluster.local is the Cluster Domain
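
  The naming rule above can be sketched as a small shell function. This is only an illustration: the function name sts_pod_dns and its argument order are invented here, and the cluster domain is assumed to be the default cluster.local unless a fifth argument is given.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl or EMQX): print the DNS name of
# every Pod in a StatefulSet, following the pattern
#   statefulSetName-{0..N-1}.serviceName.namespace.svc.<cluster domain>
# Arguments: statefulset name, headless service name, namespace, replicas,
# and optionally the cluster domain (assumed default: cluster.local).
sts_pod_dns() {
  sts=$1; svc=$2; ns=$3; replicas=$4; domain=${5:-cluster.local}
  i=0
  while [ "$i" -lt "$replicas" ]; do
    printf '%s-%s.%s.%s.svc.%s\n' "$sts" "$i" "$svc" "$ns" "$domain"
    i=$((i + 1))
  done
}

# Resource names used later in this article.
sts_pod_dns emqx-statefulset emqx-headless default 3
```

  With the names used later in this article, the three lines printed correspond to the host parts of the node names that emqx_ctl cluster status reports once hostname-based clustering is enabled.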

  Next, use a StatefulSet instead of the Deployment to manage the Pods. Delete the Deployment:

$ kubectl delete deployment emqx-deployment
deployment.apps "emqx-deployment" deleted

  Define the StatefulSet:

$ vim statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: emqx-statefulset
  labels:
    app: emqx
spec:
  serviceName: emqx-headless
  updateStrategy:
    type: RollingUpdate
  replicas: 3
  selector:
    matchLabels:
      app: emqx
  template:
    metadata:
      labels:
        app: emqx
    spec:
      serviceAccountName: emqx
      containers:
      - name: emqx
        image: emqx/emqx:v4.1-rc.1
        ports:
        - name: mqtt
          containerPort: 1883
        - name: mqttssl
          containerPort: 8883
        - name: mgmt
          containerPort: 8081
        - name: ws
          containerPort: 8083
        - name: wss
          containerPort: 8084
        - name: dashboard
          containerPort: 18083
        envFrom:
          - configMapRef:
              name: emqx-config

  Note that a StatefulSet needs a Headless Service to provide the stable network identity, so one more Service has to be defined:

$ vim headless.yaml
apiVersion: v1
kind: Service
metadata:
  name: emqx-headless
spec:
  type: ClusterIP
  clusterIP: None
  selector:
    app: emqx
  ports:
  - name: mqtt
    port: 1883
    protocol: TCP
    targetPort: 1883
  - name: mqttssl
    port: 8883
    protocol: TCP
    targetPort: 8883
  - name: mgmt
    port: 8081
    protocol: TCP
    targetPort: 8081
  - name: websocket
    port: 8083
    protocol: TCP
    targetPort: 8083
  - name: wss
    port: 8084
    protocol: TCP
    targetPort: 8084
  - name: dashboard
    port: 18083
    protocol: TCP
    targetPort: 18083

  A Headless Service does not need an IP, so clusterIP: None is set. Deploy the resources:

$ kubectl apply -f headless.yaml
service/emqx-headless created

$ kubectl get svc
NAME                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                                                       AGE
emqx-headless            ClusterIP   None            <none>        1883/TCP,8883/TCP,8081/TCP,8083/TCP,8084/TCP,18083/TCP                        53s
emqx-service             ClusterIP   10.68.228.164   <none>        1883/TCP,8883/TCP,8081/TCP,8083/TCP,8084/TCP,18083/TCP                        21h

$ kubectl apply -f statefulset.yaml
statefulset.apps/emqx-statefulset created

$ kubectl get pods
NAME                       READY   STATUS    RESTARTS       AGE
emqx-statefulset-0         1/1     Running   0              2m
emqx-statefulset-1         1/1     Running   0              24s
emqx-statefulset-2         1/1     Running   0              22s

$ kubectl exec emqx-statefulset-0 -- emqx_ctl cluster status
Cluster status: #{running_nodes =>
                      ['emqx@172.20.32.174','emqx@172.20.58.248',
                       'emqx@172.20.85.214'],
                  stopped_nodes => []}

  Update the ConfigMap: the StatefulSet provides a stable network identity, and EMQX Broker can form a cluster using hostname or dns addressing instead of IP addresses. Taking hostname as an example, emqx.conf needs the following changes:

cluster.k8s.address_type = hostname
cluster.k8s.suffix = svc.cluster.local

  The DNS rules for Pods in a Kubernetes cluster can be customized by the user, and EMQX Broker provides cluster.k8s.suffix so that custom DNS rules can be matched. This article uses the default rule, statefulSetName-{0..N-1}.serviceName.namespace.svc.cluster.local. The serviceName in that rule is the Headless Service used by the StatefulSet, so cluster.k8s.service_name must also be changed to the name of the Headless Service.

  Expressed as environment variables, these settings go into the ConfigMap as:

EMQX_CLUSTER__K8S__ADDRESS_TYPE: "hostname"
EMQX_CLUSTER__K8S__SUFFIX: "svc.cluster.local"
EMQX_CLUSTER__K8S__SERVICE_NAME: emqx-headless
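
  The variable names above follow a visible convention: add the EMQX_ prefix, replace each "." in the configuration key with "__", and upper-case the result. The helper below is a hypothetical sketch of that convention (the function name is invented here, and the keys are assumed to be spelled cluster.k8s.*, as the variable names suggest); it is not how the EMQX image itself performs the conversion.

```shell
#!/bin/sh
# Hypothetical helper: derive the environment variable name for an emqx.conf
# key by prefixing EMQX_, turning each "." into "__", and upper-casing.
conf_key_to_env() {
  printf 'EMQX_%s\n' "$(printf '%s' "$1" | sed 's/\./__/g' | tr '[:lower:]' '[:upper:]')"
}

# The three keys changed in this step (key spelling is an assumption).
for key in cluster.k8s.address_type cluster.k8s.suffix cluster.k8s.service_name; do
  conf_key_to_env "$key"
done
```

  Running it prints the three variable names listed above, which is a quick way to check a key for typos before adding it to the ConfigMap.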

  A ConfigMap can be edited in place: run kubectl edit configmap emqx-config to update it.

$ kubectl edit configmap emqx-config
# Before
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  EMQX_CLUSTER__DISCOVERY: k8s
  EMQX_CLUSTER__K8S__APISERVER: https://kubernetes.default.svc:443
  EMQX_CLUSTER__K8S__APP_NAME: emqx
  EMQX_CLUSTER__K8S__NAMESPACE: default
  EMQX_CLUSTER__K8S__SERVICE_NAME: emqx-service
  EMQX_NAME: emqx
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"EMQX_CLUSTER__DISCOVERY":"k8s","EMQX_CLUSTER__K8S__APISERVER":"https://kubernetes.default.svc:443","EMQX_CLUSTER__K8S__APP_NAME":"emqx","EMQX_CLUSTER__K8S__NAMESPACE":"default","EMQX_CLUSTER__K8S__SERVICE_NAME":"emqx-service","EMQX_NAME":"emqx"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"emqx-config","namespace":"default"}}
  creationTimestamp: "2024-07-12T05:26:16Z"
  name: emqx-config
  namespace: default
  resourceVersion: "1734159"
  uid: 13ccc86a-9ab3-450d-8a91-d73a7745d75b
  
# After
apiVersion: v1
data:
  EMQX_CLUSTER__DISCOVERY: k8s
  EMQX_CLUSTER__K8S__APISERVER: https://kubernetes.default.svc:443
  EMQX_CLUSTER__K8S__APP_NAME: emqx
  EMQX_CLUSTER__K8S__NAMESPACE: default
  EMQX_CLUSTER__K8S__SERVICE_NAME: emqx-headless
  EMQX_NAME: emqx
  EMQX_CLUSTER__K8S__ADDRESS_TYPE: hostname
  EMQX_CLUSTER__K8S__SUFFIX: svc.cluster.local
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"EMQX_CLUSTER__DISCOVERY":"k8s","EMQX_CLUSTER__K8S__APISERVER":"https://kubernetes.default.svc:443","EMQX_CLUSTER__K8S__APP_NAME":"emqx","EMQX_CLUSTER__K8S__NAMESPACE":"default","EMQX_CLUSTER__K8S__SERVICE_NAME":"emqx-headless","EMQX_NAME":"emqx","EMQX_CLUSTER__K8S__ADDRESS_TYPE":"hostname","EMQX_CLUSTER__K8S__SUFFIX":"svc.cluster.local"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"emqx-config","namespace":"default"}}
  creationTimestamp: "2024-07-12T05:26:16Z"
  name: emqx-config
  namespace: default
  resourceVersion: "1734159"
  uid: 13ccc86a-9ab3-450d-8a91-d73a7745d75b

  Redeploy the StatefulSet: Pods are not restarted when the ConfigMap changes, so the StatefulSet has to be updated manually.

$ kubectl delete statefulset emqx-statefulset
statefulset.apps "emqx-statefulset" deleted
$ kubectl apply -f statefulset.yaml
statefulset.apps/emqx-statefulset created
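
  Deleting and recreating the StatefulSet works, but it takes all Pods down at once. A possible alternative is kubectl rollout restart (available in kubectl 1.15 and later), which recreates the Pods one at a time under the RollingUpdate strategy; each new container then reads the current ConfigMap values through envFrom at startup. The wrapper function below is a hypothetical convenience and was not part of the original setup. For a change that alters node names, as in this step, a full recreate may still be the cleaner choice.

```shell
#!/bin/sh
# Hypothetical wrapper: restart the Pods of a StatefulSet in a rolling
# fashion and wait until the rollout has finished.
# Arguments: statefulset name, namespace (assumed default: default).
restart_statefulset() {
  name=$1; ns=${2:-default}
  kubectl rollout restart "statefulset/$name" -n "$ns" &&
    kubectl rollout status "statefulset/$name" -n "$ns"
}

# Usage (requires access to the cluster):
#   restart_statefulset emqx-statefulset default
```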

$ kubectl exec emqx-statefulset-0 -- emqx_ctl cluster status
Cluster status: #{running_nodes =>
                      ['emqx@emqx-statefulset-0.emqx-headless.default.svc.cluster.local'],
                  stopped_nodes => []}
                  
$ kubectl logs pod/emqx-statefulset-0 -n default
...
(emqx@emqx-statefulset-0.emqx-headless.default.svc.cluster.local)1> 2024-07-12 07:59:16.261 [error] Ekka(AutoCluster): Discover error: {badkey,<<"hostname">>}
[{maps,get,
       [<<"hostname">>,
        #{<<"ip">> => <<"172.20.32.158">>,<<"nodeName">> => <<"master-01">>,
          <<"targetRef">> =>
              #{<<"kind">> => <<"Pod">>,<<"name">> => <<"my-emqx-0">>,
                <<"namespace">> => <<"default">>,
                <<"uid">> => <<"de590aa1-d6da-42e0-9515-fe7f1fa68dfe">>}}],
       []},
 {ekka_cluster_k8s,extract_host,2,
                   [{file,"/emqx_rel/_build/emqx/lib/ekka/src/ekka_cluster_k8s.erl"},
                    {line,114}]},
 {ekka_cluster_k8s,'-extract_addresses/2-lc$^1/1-1-',2,
                   [{file,"/emqx_rel/_build/emqx/lib/ekka/src/ekka_cluster_k8s.erl"},
                    {line,110}]},
 {ekka_cluster_k8s,'-extract_addresses/2-lc$^1/1-1-',2,
                   [{file,"/emqx_rel/_build/emqx/lib/ekka/src/ekka_cluster_k8s.erl"},
                    {line,111}]},
 {ekka_cluster_k8s,'-extract_addresses/2-lc$^0/1-0-',2,
                   [{file,"/emqx_rel/_build/emqx/lib/ekka/src/ekka_cluster_k8s.erl"},
                    {line,111}]},
 {ekka_cluster_k8s,extract_addresses,2,
                   [{file,"/emqx_rel/_build/emqx/lib/ekka/src/ekka_cluster_k8s.erl"},
                    {line,112}]},
 {ekka_cluster_k8s,discover,1,
                   [{file,"/emqx_rel/_build/emqx/lib/ekka/src/ekka_cluster_k8s.erl"},
                    {line,48}]},
 {ekka_autocluster,discover_and_join,2,
                   [{file,"/emqx_rel/_build/emqx/lib/ekka/src/ekka_autocluster.erl"},
                    {line,125}]}]

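  Cause: another EMQX cluster (the my-emqx release from installation method one) was already running normally on the same Kubernetes cluster. Discovery picked up its Pods (note my-emqx-0 in the log above), whose endpoint entries have no hostname field. Deleting the my-emqx deployment resolves the error.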

$ kubectl get pods
NAME                 READY   STATUS    RESTARTS   AGE
emqx-statefulset-0   1/1     Running   0          115s
emqx-statefulset-1   1/1     Running   0          112s
emqx-statefulset-2   1/1     Running   0          110s

$ kubectl exec emqx-statefulset-2 -- emqx_ctl cluster status
Cluster status: #{running_nodes =>
                      ['emqx@emqx-statefulset-0.emqx-headless.default.svc.cluster.local',
                       'emqx@emqx-statefulset-1.emqx-headless.default.svc.cluster.local',
                       'emqx@emqx-statefulset-2.emqx-headless.default.svc.cluster.local'],
                  stopped_nodes => []}

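  Note: to switch to version 5.7.1, change the image in statefulset.yaml to image: emqx/emqx:5.7.1. The configuration parameter names in configmap.yaml also differ in that version and need to be changed to: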

apiVersion: v1
kind: ConfigMap
metadata:
  name: emqx-config
data:
  EMQX_NAME: "emqx"
  EMQX_CLUSTER__DISCOVERY_STRATEGY: "k8s"
  EMQX_CLUSTER__K8S__APISERVER: "https://kubernetes.default.svc:443"
  EMQX_CLUSTER__K8S__SERVICE_NAME: "emqx-headless"
  EMQX_CLUSTER__K8S__NAMESPACE: "default"
  EMQX_CLUSTER__K8S__ADDRESS_TYPE: "hostname"
  EMQX_CLUSTER__K8S__SUFFIX: "svc.cluster.local"

  The new EMQX Broker cluster has been established successfully. Terminate a Pod: a Pod in a StatefulSet keeps its PodName and HostName after being rescheduled. Let's try it:

$ kubectl get pods
NAME                 READY   STATUS    RESTARTS   AGE
emqx-statefulset-0   1/1     Running   0          6m20s
emqx-statefulset-1   1/1     Running   0          6m17s
emqx-statefulset-2   1/1     Running   0          6m15s

$ kubectl delete pod emqx-statefulset-0
pod "emqx-statefulset-0" deleted

$ kubectl get pods
NAME                 READY   STATUS    RESTARTS   AGE
emqx-statefulset-0   1/1     Running   0          27s
emqx-statefulset-1   1/1     Running   0          9m45s
emqx-statefulset-2   1/1     Running   0          9m43s

$ kubectl exec emqx-statefulset-2 -- emqx_ctl cluster status
Cluster status: #{running_nodes =>
                      ['emqx@emqx-statefulset-0.emqx-headless.default.svc.cluster.local',
                       'emqx@emqx-statefulset-1.emqx-headless.default.svc.cluster.local',
                       'emqx@emqx-statefulset-2.emqx-headless.default.svc.cluster.local'],
                  stopped_nodes => []}

  As expected, the StatefulSet recreated a Pod with the same network identity, and the EMQX Broker in that Pod rejoined the cluster successfully.

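9. StorageClass, PersistentVolume, and PersistentVolumeClaim

  A PersistentVolume (PV) is a piece of storage provisioned by an administrator and is part of the cluster. Just as a node is a cluster resource, a PV is a cluster resource. PVs are volume plugins like Volumes, but their lifecycle is independent of any Pod that uses the PV. This API object captures the details of the storage implementation, such as NFS, iSCSI, or a cloud-provider-specific storage system.

  A PersistentVolumeClaim (PVC) is a user's request for storage. It is similar to a Pod: Pods consume node resources, and PVCs consume PV resources. Pods can request specific levels of resources (CPU and memory); claims can request a specific size and access modes (for example, mounted once as read/write or many times as read-only).

  A StorageClass gives administrators a way to describe the "classes" of storage they offer. Different classes may map to quality-of-service levels, backup policies, or arbitrary policies determined by the cluster administrator. Kubernetes itself has no opinion about what the classes represent. In other storage systems this concept is sometimes called a "profile".

  When deploying EMQX Broker, you can create a PV or StorageClass in advance and then use a PVC to mount the /opt/emqx/data/mnesia directory of EMQX Broker. After the Pods are rescheduled, EMQX reads the data in /opt/emqx/data/mnesia and recovers from it, which gives EMQX Broker persistence.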

  Define the StatefulSet:

$ vim statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: emqx-statefulset
  labels:
    app: emqx
spec:
  replicas: 3
  serviceName: emqx-headless
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: emqx
  template:
    metadata:
      labels:
        app: emqx
    spec:
      volumes:
      - name: emqx-data
        persistentVolumeClaim:
          claimName: emqx-pvc
      serviceAccountName: emqx
      containers:
      - name: emqx
        image: emqx/emqx:v4.1-rc.1
        ports:
        - name: mqtt
          containerPort: 1883
        - name: mqttssl
          containerPort: 8883
        - name: mgmt
          containerPort: 8081
        - name: ws
          containerPort: 8083
        - name: wss
          containerPort: 8084
        - name: dashboard
          containerPort: 18083
        envFrom:
          - configMapRef:
              name: emqx-config
        volumeMounts:
        - name: emqx-data
          mountPath: "/opt/emqx/data/mnesia"
  volumeClaimTemplates:
  - metadata:
      name: emqx-pvc
      annotations:
        volume.alpha.kubernetes.io/storage-class: manual
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi

  This file uses volumeClaimTemplates to create a PVC named emqx-pvc from the StorageClass named manual. The PVC uses the ReadWriteOnce access mode and requests 1Gi of space. The PVC is then defined as a volume named emqx-data, which is mounted at /opt/emqx/data/mnesia in the Pod.
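
  Each claim created from a volumeClaimTemplates entry is named <template name>-<StatefulSet name>-<ordinal>, which is why the claim in the output below is called emqx-pvc-emqx-statefulset-0. A small sketch of that rule, with a function name invented for illustration:

```shell
#!/bin/sh
# Hypothetical helper: list the PVC names that the StatefulSet controller
# creates for one volumeClaimTemplates entry.
# Arguments: claim template name, statefulset name, replicas.
sts_pvc_names() {
  tmpl=$1; sts=$2; replicas=$3
  i=0
  while [ "$i" -lt "$replicas" ]; do
    printf '%s-%s-%s\n' "$tmpl" "$sts" "$i"
    i=$((i + 1))
  done
}

# Names from the manifest above.
sts_pvc_names emqx-pvc emqx-statefulset 3
```

  Knowing the names in advance helps when claims have to be inspected, deleted, or created by hand, as happens in the troubleshooting steps that follow.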

$ kubectl get pvc
NAME                                  STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
emqx-pvc-emqx-statefulset-0           Pending

$ kubectl get pods
NAME                       READY   STATUS    RESTARTS   AGE
emqx-statefulset-0         0/1     Pending   0          75s

$ kubectl describe pvc emqx-pvc-emqx-statefulset-0
Events:
  Type    Reason         Age               From                         Message
  ----    ------         ----              ----                         -------
  Normal  FailedBinding  4s (x2 over 15s)  persistentvolume-controller  no persistent volumes available for this claim and no storage class is set

$ kubectl describe pod emqx-statefulset-0
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  109s  default-scheduler  0/3 nodes are available: persistentvolumeclaim "local-pv-sdc" not found. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod..

  Fix: create emqx-pvc manually. Reference: "Using Local Persistent Volumes in K8S" (在K8S中使用Local持久卷)

$ kubectl delete statefulset emqx-statefulset
statefulset.apps "emqx-statefulset" deleted

$ kubectl delete pvc emqx-pvc-emqx-statefulset-0
persistentvolumeclaim "emqx-pvc-emqx-statefulset-0" deleted

$ vim emqx-pvc.yml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: emqx-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: local-redis-storage
  resources:
    requests:
      storage: 1Gi

$ kubectl create -f emqx-pvc.yml
persistentvolumeclaim/emqx-pvc created

$ kubectl get pvc
NAME                                  STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS          AGE
emqx-pvc           Pending                                                                        local-redis-storage   5s

$ vim statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: emqx-statefulset
  labels:
    app: emqx
spec:
  replicas: 3
  serviceName: emqx-headless
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: emqx
  template:
    metadata:
      labels:
        app: emqx
    spec:
      volumes:
      - name: emqx-data
        persistentVolumeClaim:
          claimName: emqx-pvc
      serviceAccountName: emqx
      containers:
      - name: emqx
        image: emqx/emqx:v4.1-rc.1
        ports:
        - name: mqtt
          containerPort: 1883
        - name: mqttssl
          containerPort: 8883
        - name: mgmt
          containerPort: 8081
        - name: ws
          containerPort: 8083
        - name: wss
          containerPort: 8084
        - name: dashboard
          containerPort: 18083
        envFrom:
          - configMapRef:
              name: emqx-config
        volumeMounts:
        - name: emqx-data
          mountPath: "/opt/emqx/data/mnesia"

$ kubectl apply -f statefulset.yaml
statefulset.apps/emqx-statefulset created

$ kubectl get pvc
NAME                                  STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS          AGE
emqx-pvc           Bound    pvc-3fd906c3-5372-4e92-b971-062bc235b268   1Gi        RWO            local-redis-storage   22s

$ kubectl get pods
NAME                       READY   STATUS    RESTARTS   AGE
emqx-statefulset-0         1/1     Running   0          11s
emqx-statefulset-1         1/1     Running   0          6s
emqx-statefulset-2         1/1     Running   0          5s

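  The configuration in the original article seems flawed: to get three PVCs I also had to create them manually, since it did not work the way the original article describes, and that approach presupposes that emqx-pvc already exists. Reference: "Pod stays in Pending state when deploying with helm or k8s" (helm或者k8s部署pod时遇到pod一直处于pending状态)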

$ kubectl get pvc emqx-pvc-emqx-statefulset-0 -o yaml > emqx-pvc-emqx-statefulset-0.yaml

$ vim emqx-pvc-emqx-statefulset-0.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.alpha.kubernetes.io/storage-class: local-redis-storage
  creationTimestamp: "2024-07-18T05:09:03Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app: emqx
  name: emqx-pvc-emqx-statefulset-0
  namespace: default
  resourceVersion: "3204901"
  uid: 74f369fb-08d0-41a7-8230-83fddb0efe50
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  volumeMode: Filesystem
status:
  phase: Pending

# Add the following:
spec:
  storageClassName: local-redis-storage

$ kubectl create -f emqx-pvc-emqx-statefulset-0.yaml
persistentvolumeclaim/emqx-pvc-emqx-statefulset-0 created

$ kubectl get pvc
NAME                                  STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS          AGE
emqx-pvc-emqx-statefulset-0           Pending                                                                        local-redis-storage   5s

$ kubectl get pvc emqx-pvc-emqx-statefulset-0 -o yaml > emqx-pvc-emqx-statefulset-1.yaml
# Likewise add storageClassName: local-redis-storage and change name to emqx-pvc-emqx-statefulset-1
$ kubectl create -f emqx-pvc-emqx-statefulset-1.yaml

$ kubectl get pvc emqx-pvc-emqx-statefulset-0 -o yaml > emqx-pvc-emqx-statefulset-2.yaml
# Likewise add storageClassName: local-redis-storage and change name to emqx-pvc-emqx-statefulset-2
$ kubectl create -f emqx-pvc-emqx-statefulset-2.yaml

$ kubectl get pvc
NAME                                  STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS          AGE
emqx-pvc                              Bound     pvc-c12d4f80-07c8-4c72-8a97-765f74f34d53   1Gi        RWO            local-redis-storage   50m
emqx-pvc-emqx-statefulset-0           Pending                                                                        local-redis-storage   5m22s
emqx-pvc-emqx-statefulset-1           Pending                                                                        local-redis-storage   3s
emqx-pvc-emqx-statefulset-2           Pending                                                                        local-redis-storage   70s

$ vim statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: emqx-statefulset
  labels:
    app: emqx
spec:
  replicas: 3
  serviceName: emqx-headless
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: emqx
  template:
    metadata:
      labels:
        app: emqx
    spec:
      volumes:
      - name: emqx-data
        persistentVolumeClaim:
          claimName: emqx-pvc
      serviceAccountName: emqx
      containers:
      - name: emqx
        image: emqx/emqx:v4.1-rc.1
        ports:
        - name: mqtt
          containerPort: 1883
        - name: mqttssl
          containerPort: 8883
        - name: mgmt
          containerPort: 8081
        - name: ws
          containerPort: 8083
        - name: wss
          containerPort: 8084
        - name: dashboard
          containerPort: 18083
        envFrom:
          - configMapRef:
              name: emqx-config
        volumeMounts:
        - name: emqx-data
          mountPath: "/opt/emqx/data/mnesia"
  volumeClaimTemplates:
  - metadata:
      name: emqx-pvc
      annotations:
        volume.alpha.kubernetes.io/storage-class: local-redis-storage
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi

$ kubectl apply -f statefulset.yaml
statefulset.apps/emqx-statefulset created

$ kubectl get pods
NAME                       READY   STATUS                   RESTARTS   AGE
emqx-statefulset-0         1/1     Running                  0          2m3s
emqx-statefulset-1         1/1     Running                  0          117s
emqx-statefulset-2         1/1     Running                  0          112s

$ kubectl get pvc
NAME                                  STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS          AGE
emqx-pvc                              Bound    pvc-c12d4f80-07c8-4c72-8a97-765f74f34d53   1Gi        RWO            local-redis-storage   54m
emqx-pvc-emqx-statefulset-0           Bound    pvc-a718e44c-0b2d-433c-be49-d264eb44b878   1Gi        RWO            local-redis-storage   8m39s
emqx-pvc-emqx-statefulset-1           Bound    pvc-1bb39cb2-4a23-49ab-b874-756cd1a2502f   1Gi        RWO            local-redis-storage   3m20s
emqx-pvc-emqx-statefulset-2           Bound    pvc-7df2cdf4-fc81-47f8-b02b-ffa8e8bb0ae6   1Gi        RWO            local-redis-storage   4m27s

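  In summary, there is no need to create emqx-pvc-emqx-statefulset-0, emqx-pvc-emqx-statefulset-1, and emqx-pvc-emqx-statefulset-2 by hand. Adding one line, storageClassName, to the volumeClaimTemplates in statefulset.yaml is enough. The final file is: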

$ vim statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: emqx-statefulset
  labels:
    app: emqx
spec:
  replicas: 3
  serviceName: emqx-headless
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: emqx
  template:
    metadata:
      labels:
        app: emqx
    spec:
      volumes:
      - name: emqx-data
        persistentVolumeClaim:
          claimName: emqx-pvc
      serviceAccountName: emqx
      containers:
      - name: emqx
        image: emqx/emqx:v4.1-rc.1
        ports:
        - name: mqtt
          containerPort: 1883
        - name: mqttssl
          containerPort: 8883
        - name: mgmt
          containerPort: 8081
        - name: ws
          containerPort: 8083
        - name: wss
          containerPort: 8084
        - name: dashboard
          containerPort: 18083
        envFrom:
          - configMapRef:
              name: emqx-config
        volumeMounts:
        - name: emqx-data
          mountPath: "/opt/emqx/data/mnesia"
  volumeClaimTemplates:
  - metadata:
      name: emqx-pvc
      annotations:
        volume.alpha.kubernetes.io/storage-class: local-redis-storage
    spec:
      storageClassName: local-redis-storage
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi

$ kubectl get pvc
NAME                                  STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS          AGE
emqx-pvc                              Bound    pvc-ce861efc-751c-4fd5-b96f-2605fb41e7d2   1Gi        RWO            local-redis-storage   15m

$ kubectl apply -f statefulset.yaml
statefulset.apps/emqx-statefulset created

$ kubectl get pvc
NAME                                  STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS          AGE
emqx-pvc                              Bound    pvc-ce861efc-751c-4fd5-b96f-2605fb41e7d2   1Gi        RWO            local-redis-storage   15m
emqx-pvc-emqx-statefulset-0           Bound    pvc-21a89457-5c05-4df3-b447-c7fad1424df8   1Gi        RWO            local-redis-storage   3m39s
emqx-pvc-emqx-statefulset-1           Bound    pvc-234da635-d316-4256-b559-98dab56cc5a4   1Gi        RWO            local-redis-storage   3m33s
emqx-pvc-emqx-statefulset-2           Bound    pvc-ec3bcfee-27ac-4776-8662-aa504676ca6a   1Gi        RWO            local-redis-storage   3m28s

$ kubectl get pods
NAME                                 READY   STATUS                   RESTARTS   AGE
emqx-statefulset-0                   1/1     Running                  0          5m43s
emqx-statefulset-1                   1/1     Running                  0          5m37s
emqx-statefulset-2                   1/1     Running                  0          5m32s

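  The output shows that the PVCs are in the Bound state, so the PVC storage has been set up successfully. When a Pod is rescheduled, EMQX Broker reads the data on the mounted PVC, which provides persistence.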
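
  One detail in the final manifest is worth double-checking. The container mounts the volume named emqx-data, and that volume is defined under volumes with claimName: emqx-pvc, so every Pod appears to mount the single, manually created emqx-pvc claim. The per-Pod claims generated from volumeClaimTemplates are bound but are not referenced by any volumeMount. If the intent is one volume per Pod, a possible variant is to drop the volumes entry and mount the claim template by its own name. The sketch below only writes such a manifest to a temporary file for review; the output path is arbitrary, and everything else is copied from the manifest above.

```shell
#!/bin/sh
# Sketch only: write a variant of statefulset.yaml in which the container
# mounts the per-Pod claim generated from volumeClaimTemplates instead of
# the shared, manually created claim. The output path is arbitrary.
manifest="${TMPDIR:-/tmp}/statefulset-per-pod-pvc.yaml"

cat > "$manifest" <<'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: emqx-statefulset
  labels:
    app: emqx
spec:
  replicas: 3
  serviceName: emqx-headless
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: emqx
  template:
    metadata:
      labels:
        app: emqx
    spec:
      # No "volumes:" entry: the controller adds a volume named after each
      # claim template to every Pod.
      serviceAccountName: emqx
      containers:
      - name: emqx
        image: emqx/emqx:v4.1-rc.1
        ports:
        - name: mqtt
          containerPort: 1883
        - name: mqttssl
          containerPort: 8883
        - name: mgmt
          containerPort: 8081
        - name: ws
          containerPort: 8083
        - name: wss
          containerPort: 8084
        - name: dashboard
          containerPort: 18083
        envFrom:
          - configMapRef:
              name: emqx-config
        volumeMounts:
        - name: emqx-pvc   # must match the claim template name below
          mountPath: "/opt/emqx/data/mnesia"
  volumeClaimTemplates:
  - metadata:
      name: emqx-pvc
      annotations:
        volume.alpha.kubernetes.io/storage-class: local-redis-storage
    spec:
      storageClassName: local-redis-storage
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
EOF

echo "wrote $manifest"
# Review the file, then apply it manually:
#   kubectl apply -f "$manifest"
```

  The volumeClaimTemplates section is left unchanged on purpose, so the existing emqx-pvc-emqx-statefulset-N claims would be reused. To see which claim a running Pod actually uses, kubectl describe pod emqx-statefulset-0 lists the ClaimName of each volume under Volumes.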
