Debugging a CoreDNS health-check failure in a Kubernetes cluster running Cilium

This problem was eventually solved after I opened an issue in the Cilium project on GitHub, where a community member pointed me to the fix.

Issue: https://github.com/cilium/cilium/issues/20498

Problem description

Since I plan to study eBPF, I replaced the CNI plugin I had been using, flannel, with Cilium. As an eBPF-based CNI implementation, Cilium offers considerably more functionality than flannel, including (but not limited to) highly customizable network policies and security hardening.

However, after switching to Cilium I ran into the following problem. Below is the cluster deployment process, how the problem surfaced, and its symptoms.

Version information:

  • Kubernetes 1.23.0 (kubelet/kubeadm/kubectl all match the cluster version)
  • Cilium 1.11.6
  1. Initialize the cluster with kubeadm init --config kubeadm.conf. The configuration file is shown below, followed by a sketch of the command sequence.
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.153.21
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  imagePullPolicy: IfNotPresent
  name: nm
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.23.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.5.0.0/16
scheduler: {}
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
resolvConf: /run/systemd/resolve/resolv.conf
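With this file in place, initializing the control plane is a single command. A minimal sketch of the usual sequence (the kubeconfig copy is the standard post-init step that kubeadm itself prints):

kubeadm init --config kubeadm.conf
# make kubectl usable for the current user, as suggested by kubeadm's output
mkdir -p $HOME/.kube
cp /etc/kubernetes/admin.conf $HOME/.kube/config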
  2. Then join the worker nodes as usual; a sketch of the join command follows the IP list. The test cluster has 1 master and 2 workers with the following IPs:
  • master: 192.168.153.21
  • worker1: 192.168.153.22
  • worker2: 192.168.153.23
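The join step itself is the standard kubeadm one; the token and CA-certificate hash below are placeholders printed by kubeadm init, not the real values from my cluster:

# run on worker1 and worker2
kubeadm join 192.168.153.21:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>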
  3. Install the network plugin with cilium install, keeping the Cilium configuration at its defaults.
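The installation is just the CLI with default settings; a minimal sketch (pinning the release with --version is optional, and cilium status --wait simply blocks until the status checks pass):

cilium install --version v1.11.6
cilium status --wait

Afterwards the pods looked like this: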
root@nm:/work-place/kubernetes/create-cluster# kubectl get pods -A -o wide
NAMESPACE     NAME                               READY   STATUS             RESTARTS        AGE   IP               NODE   NOMINATED NODE   READINESS GATES
kube-system   cilium-99lxc                       1/1     Running            0               18m   192.168.153.22   na     <none>           <none>
kube-system   cilium-ct5s7                       1/1     Running            0               18m   192.168.153.21   nm     <none>           <none>
kube-system   cilium-drtlh                       1/1     Running            0               18m   192.168.153.23   nb     <none>           <none>
kube-system   cilium-operator-5d67fc458d-zxgdd   1/1     Running            0               18m   192.168.153.22   na     <none>           <none>
kube-system   coredns-6d8c4cb4d-jkssb            0/1     Running            8 (2m55s ago)   19m   10.0.0.240       na     <none>           <none>
kube-system   coredns-6d8c4cb4d-psxvw            0/1     CrashLoopBackOff   8 (83s ago)     19m   10.0.2.176       nb     <none>           <none>
kube-system   etcd-nm                            1/1     Running            2               25m   192.168.153.21   nm     <none>           <none>
kube-system   kube-apiserver-nm                  1/1     Running            2               25m   192.168.153.21   nm     <none>           <none>
kube-system   kube-controller-manager-nm         1/1     Running            2               25m   192.168.153.21   nm     <none>           <none>
kube-system   kube-proxy-hv5nc                   1/1     Running            0               24m   192.168.153.22   na     <none>           <none>
kube-system   kube-proxy-pbzlx                   1/1     Running            0               24m   192.168.153.23   nb     <none>           <none>
kube-system   kube-proxy-rqpxw                   1/1     Running            0               25m   192.168.153.21   nm     <none>           <none>
kube-system   kube-scheduler-nm                  1/1     Running            2               25m   192.168.153.21   nm     <none>           <none>

Before installing Cilium I had already cleaned up the previous CNI configuration, removing everything under /etc/cni/net.d/ (a sketch of that cleanup is below). Even so, I ran into the situation shown above: both coredns Pods are Running but never become Ready.
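For reference, the cleanup was roughly the following, run on every node (restarting the kubelet afterwards is my assumption of good hygiene rather than a documented requirement):

# remove the old flannel CNI configuration so that Cilium's is the only one left
rm -f /etc/cni/net.d/*
systemctl restart kubelet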

I then looked at the coredns logs and the describe output.

root@nm:/work-place/kubernetes/create-cluster# kubectl logs  coredns-6d8c4cb4d-jkssb -n kube-system
[WARNING] plugin/kubernetes: starting server with unsynced Kubernetes API
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.8.6
linux/amd64, go1.17.1, 13a9191
[ERROR] plugin/errors: 2 7607030484537686268.4300248127207674545. HINFO: read udp 10.0.0.240:39983->192.168.153.2:53: i/o timeout
[ERROR] plugin/errors: 2 7607030484537686268.4300248127207674545. HINFO: read udp 10.0.0.240:53240->192.168.153.2:53: i/o timeout
[ERROR] plugin/errors: 2 7607030484537686268.4300248127207674545. HINFO: read udp 10.0.0.240:49802->192.168.153.2:53: i/o timeout
[ERROR] plugin/errors: 2 7607030484537686268.4300248127207674545. HINFO: read udp 10.0.0.240:54428->192.168.153.2:53: i/o timeout
[ERROR] plugin/errors: 2 7607030484537686268.4300248127207674545. HINFO: read udp 10.0.0.240:43974->192.168.153.2:53: i/o timeout
[ERROR] plugin/errors: 2 7607030484537686268.4300248127207674545. HINFO: read udp 10.0.0.240:37821->192.168.153.2:53: i/o timeout
[ERROR] plugin/errors: 2 7607030484537686268.4300248127207674545. HINFO: read udp 10.0.0.240:36545->192.168.153.2:53: i/o timeout
[ERROR] plugin/errors: 2 7607030484537686268.4300248127207674545. HINFO: read udp 10.0.0.240:56785->192.168.153.2:53: i/o timeout
[ERROR] plugin/errors: 2 7607030484537686268.4300248127207674545. HINFO: read udp 10.0.0.240:47913->192.168.153.2:53: i/o timeout
[ERROR] plugin/errors: 2 7607030484537686268.4300248127207674545. HINFO: read udp 10.0.0.240:38162->192.168.153.2:53: i/o timeout
[INFO] SIGTERM: Shutting down servers then terminating
[INFO] plugin/health: Going into lameduck mode for 5s

CoreDNS cannot reach its upstream resolver over UDP; 192.168.153.2 here is the gateway of my VM network, which is the nameserver CoreDNS forwards queries to.
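To reproduce the symptom outside of CoreDNS, the upstream resolver can be queried directly from a throwaway pod; a quick sketch (example.com is just an arbitrary name to look up):

kubectl run -it --rm --restart=Never dnstest --image=docker.io/library/busybox -- \
    nslookup example.com 192.168.153.2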

root@nm:/work-place/kubernetes/create-cluster# kubectl describe pod  coredns-6d8c4cb4d-jkssb -n kube-system
Name:                 coredns-6d8c4cb4d-jkssb
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 na/192.168.153.22
Start Time:           Wed, 13 Jul 2022 00:37:51 +0800
Labels:               k8s-app=kube-dns
                      pod-template-hash=6d8c4cb4d
Annotations:          <none>
Status:               Running
IP:                   10.0.0.240
IPs:
  IP:           10.0.0.240
Controlled By:  ReplicaSet/coredns-6d8c4cb4d
Containers:
  coredns:
    Container ID:  docker://cc35b97903b120cb54765641da47c69ea8c833e6c72958407c7e605a5aa001b4
    Image:         registry.aliyuncs.com/google_containers/coredns:v1.8.6
    Image ID:      docker-pullable://registry.aliyuncs.com/google_containers/coredns@sha256:5b6ec0d6de9baaf3e92d0f66cd96a25b9edbce8716f5f15dcd1a616b3abd590e
    Ports:         53/UDP, 53/TCP, 9153/TCP
    Host Ports:    0/UDP, 0/TCP, 0/TCP
    Args:
      -conf
      /etc/coredns/Corefile
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 13 Jul 2022 00:57:12 +0800
      Finished:     Wed, 13 Jul 2022 00:59:06 +0800
    Ready:          False
    Restart Count:  8
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
    Readiness:    http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-v8hzn (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      coredns
    Optional:  false
  kube-api-access-v8hzn:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 CriticalAddonsOnly op=Exists
                             node-role.kubernetes.io/control-plane:NoSchedule
                             node-role.kubernetes.io/master:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age   From               Message
  ----     ------                  ----  ----               -------
  Normal   Scheduled               21m   default-scheduler  Successfully assigned kube-system/coredns-6d8c4cb4d-jkssb to na
  Warning  FailedCreatePodSandBox  20m   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "8ae4c118e4c3ff1c0bd2c601c808cae2c17cbc27552fb148b755b7d798f0bb71" network for pod "coredns-6d8c4cb4d-jkssb": networkPlugin cni failed to set up pod "coredns-6d8c4cb4d-jkssb_kube-system" network: unable to connect to Cilium daemon: failed to create cilium agent client after 30.000000 seconds timeout: Get "http:///var/run/cilium/cilium.sock/v1/config": dial unix /var/run/cilium/cilium.sock: connect: no such file or directory
Is the agent running?
  Normal   SandboxChanged  20m                   kubelet  Pod sandbox changed, it will be killed and re-created.
  Normal   Pulled          20m                   kubelet  Container image "registry.aliyuncs.com/google_containers/coredns:v1.8.6" already present on machine
  Normal   Created         20m                   kubelet  Created container coredns
  Normal   Started         20m                   kubelet  Started container coredns
  Warning  Unhealthy       20m (x2 over 20m)     kubelet  Readiness probe failed: Get "http://10.0.0.240:8181/ready": dial tcp 10.0.0.240:8181: i/o timeout (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy       18m (x13 over 20m)    kubelet  Readiness probe failed: Get "http://10.0.0.240:8181/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy       15m (x12 over 19m)    kubelet  Liveness probe failed: Get "http://10.0.0.240:8080/health": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  Normal   Killing         14m                   kubelet  Container coredns failed liveness probe, will be restarted

What describe shows is essentially failing health checks. Oddly, the Cilium pods themselves were all Running, although cilium status did report cilium-health-ep errors (below). Either way, with coredns unhealthy the cluster network obviously could not work properly.

root@nm:/work-place/kubernetes/create-cluster# cilium status
    /¯¯\
 /¯¯\__/¯¯\    Cilium:         3 errors
 \__/¯¯\__/    Operator:       OK
 /¯¯\__/¯¯\    Hubble:         disabled
 \__/¯¯\__/    ClusterMesh:    disabled
    \__/

DaemonSet         cilium             Desired: 3, Ready: 3/3, Available: 3/3
Deployment        cilium-operator    Desired: 1, Ready: 1/1, Available: 1/1
Containers:       cilium             Running: 3
                  cilium-operator    Running: 1
Cluster Pods:     2/2 managed by Cilium
Image versions    cilium-operator    quay.io/cilium/operator-generic:v1.11.6@sha256:9f6063c7bcaede801a39315ec7c166309f6a6783e98665f6693939cf1701bc17: 1
                  cilium             quay.io/cilium/cilium:v1.11.6@sha256:f7f93c26739b6641a3fa3d76b1e1605b15989f25d06625260099e01c8243f54c: 3
Errors:           cilium             cilium-hn9g5    controller cilium-health-ep is failing since 27s (21x): Get "http://10.0.2.134:4240/hello": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
                  cilium             cilium-7l6br    controller cilium-health-ep is failing since 27s (21x): Get "http://10.0.0.36:4240/hello": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
                  cilium             cilium-rzkb6    controller cilium-health-ep is failing since 27s (21x): Get "http://10.0.1.222:4240/hello": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
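A few commands that are commonly used to dig further into this kind of cilium-health failure (a sketch; these were not part of my original session):

# detailed agent status and health probing from inside one of the agent pods
kubectl -n kube-system exec ds/cilium -- cilium status --verbose
kubectl -n kube-system exec ds/cilium -- cilium-health status
# end-to-end connectivity test driven by the cilium CLI
cilium connectivity test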

Additional information

Some supplementary information about the key components follows.

  • cilium sysdump
root@nm:/work-place/kubernetes/create-cluster# cilium sysdump
🔍 Collecting sysdump with cilium-cli version: v0.11.11, args: [sysdump]
🔍 Collecting Kubernetes nodes
🔍 Collect Kubernetes nodes
🔍 Collecting Kubernetes events
🔍 Collecting Kubernetes pods
🔍 Collect Kubernetes version
🔍 Collecting Kubernetes namespaces
🔍 Collecting Kubernetes services
🔍 Collecting Kubernetes pods summary
🔍 Collecting Kubernetes endpoints
🔍 Collecting Kubernetes network policies
🔍 Collecting Cilium cluster-wide network policies
🔍 Collecting Cilium network policies
🔍 Collecting Cilium local redirect policies
🔍 Collecting Cilium egress NAT policies
🔍 Collecting Cilium endpoints
🔍 Collecting Cilium identities
🔍 Collecting Cilium nodes
🔍 Collecting Ingresses
🔍 Collecting CiliumEnvoyConfigs
🔍 Collecting CiliumClusterwideEnvoyConfigs
🔍 Collecting Cilium etcd secret
🔍 Collecting the Cilium configuration
🔍 Collecting the Cilium daemonset(s)
🔍 Collecting the Hubble daemonset
🔍 Collecting the Hubble Relay deployment
🔍 Collecting the Hubble Relay configuration
🔍 Collecting the Hubble UI deployment
🔍 Collecting the Cilium operator deployment
🔍 Collecting the CNI configuration files from Cilium pods
⚠️ Deployment "hubble-ui" not found in namespace "kube-system" - this is expected if Hubble UI is not enabled
🔍 Collecting the CNI configmap
🔍 Collecting the 'clustermesh-apiserver' deployment
⚠️ Deployment "hubble-relay" not found in namespace "kube-system" - this is expected if Hubble is not enabled
🔍 Collecting gops stats from Cilium pods
🔍 Collecting gops stats from Hubble pods
🔍 Collecting gops stats from Hubble Relay pods
🔍 Collecting 'cilium-bugtool' output from Cilium pods
🔍 Collecting logs from Cilium pods
🔍 Collecting logs from Cilium operator pods
⚠️ Deployment "clustermesh-apiserver" not found in namespace "kube-system" - this is expected if 'clustermesh-apiserver' isn't enabled
🔍 Collecting logs from 'clustermesh-apiserver' pods
🔍 Collecting logs from Hubble pods
🔍 Collecting logs from Hubble Relay pods
🔍 Collecting logs from Hubble UI pods
🔍 Collecting platform-specific data
🔍 Collecting Hubble flows from Cilium pods
⚠️ The following tasks failed, the sysdump may be incomplete:
⚠️ [11] Collecting Cilium egress NAT policies: failed to collect Cilium egress NAT policies: the server could not find the requested resource (get ciliumegressnatpolicies.cilium.io)
⚠️ [12] Collecting Cilium local redirect policies: failed to collect Cilium local redirect policies: the server could not find the requested resource (get ciliumlocalredirectpolicies.cilium.io)
⚠️ [17] Collecting CiliumClusterwideEnvoyConfigs: failed to collect CiliumClusterwideEnvoyConfigs: the server could not find the requested resource (get ciliumclusterwideenvoyconfigs.cilium.io)
⚠️ [18] Collecting CiliumEnvoyConfigs: failed to collect CiliumEnvoyConfigs: the server could not find the requested resource (get ciliumenvoyconfigs.cilium.io)
⚠️ [23] Collecting the Hubble Relay configuration: failed to collect the Hubble Relay configuration: configmaps "hubble-relay-config" not found
⚠️ cniconflist-cilium-7l6br: error dialing backend: dial tcp 192.168.153.23:10250: connect: no route to host
⚠️ cniconflist-cilium-hn9g5: command terminated with exit code 1
⚠️ cniconflist-cilium-rzkb6: command terminated with exit code 1
⚠️ gops-cilium-7l6br-memstats: failed to list processes "cilium-7l6br" ("cilium-agent") in namespace "kube-system": error dialing backend: dial tcp 192.168.153.23:10250: connect: no route to host
⚠️ gops-cilium-7l6br-stack: failed to list processes "cilium-7l6br" ("cilium-agent") in namespace "kube-system": error dialing backend: dial tcp 192.168.153.23:10250: connect: no route to host
⚠️ gops-cilium-7l6br-stats: failed to list processes "cilium-7l6br" ("cilium-agent") in namespace "kube-system": error dialing backend: dial tcp 192.168.153.23:10250: connect: no route to host
⚠️ cilium-bugtool-cilium-7l6br: failed to collect 'cilium-bugtool' output for "cilium-7l6br" in namespace "kube-system": error dialing backend: dial tcp 192.168.153.23:10250: connect: no route to host: 
⚠️ logs-cilium-7l6br-cilium-agent: failed to collect logs for "cilium-7l6br" ("cilium-agent") in namespace "kube-system": Get "https://192.168.153.23:10250/containerLogs/kube-system/cilium-7l6br/cilium-agent?limitBytes=1073741824&sinceTime=2021-07-13T08%3A20%3A54Z&timestamps=true": dial tcp 192.168.153.23:10250: connect: no route to host
⚠️ logs-cilium-operator-5d67fc458d-gjdc6-cilium-operator: failed to collect logs for "cilium-operator-5d67fc458d-gjdc6" ("cilium-operator") in namespace "kube-system": Get "https://192.168.153.23:10250/containerLogs/kube-system/cilium-operator-5d67fc458d-gjdc6/cilium-operator?limitBytes=1073741824&sinceTime=2021-07-13T08%3A20%3A55Z&timestamps=true": dial tcp 192.168.153.23:10250: connect: no route to host
⚠️ logs-cilium-7l6br-mount-cgroup: failed to collect logs for "cilium-7l6br" ("mount-cgroup") in namespace "kube-system": Get "https://192.168.153.23:10250/containerLogs/kube-system/cilium-7l6br/mount-cgroup?limitBytes=1073741824&sinceTime=2021-07-13T08%3A20%3A54Z&timestamps=true": dial tcp 192.168.153.23:10250: connect: no route to host
⚠️ logs-cilium-7l6br-clean-cilium-state: failed to collect logs for "cilium-7l6br" ("clean-cilium-state") in namespace "kube-system": Get "https://192.168.153.23:10250/containerLogs/kube-system/cilium-7l6br/clean-cilium-state?limitBytes=1073741824&sinceTime=2021-07-13T08%3A20%3A54Z&timestamps=true": dial tcp 192.168.153.23:10250: connect: no route to host
⚠️ hubble-flows-cilium-7l6br: failed to collect hubble flows for "cilium-7l6br" in namespace "kube-system": error dialing backend: dial tcp 192.168.153.23:10250: connect: no route to host: 
⚠️ Please note that depending on your Cilium version and installation options, this may be expected
🗳 Compiling sysdump
✅ The sysdump has been saved to /work-place/kubernetes/create-cluster/cilium-sysdump-20220713-162053.zip
  • coredns ConfigMap
root@nm:/work-place/kubernetes/create-cluster# kubectl describe cm coredns -n kube-system
Name:         coredns
Namespace:    kube-system
Labels:       <none>
Annotations:  <none>

Data
====
Corefile:
----
.:53 {
    errors
    health {
       lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       fallthrough in-addr.arpa ip6.arpa
       ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf {
       max_concurrent 1000
    }
    cache 30
    loop
    reload
    loadbalance
}


BinaryData
====

Events:  <none>
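Note how the Corefile forwards to /etc/resolv.conf. Since the coredns pods run with dnsPolicy Default, they inherit the resolver file the kubelet is configured with (resolvConf: /run/systemd/resolve/resolv.conf, shown in the kubelet config below), which is why their upstream ends up being the VM gateway 192.168.153.2. A quick check on the node (the output is environment-specific):

# the nameserver listed here is what CoreDNS ends up forwarding to
cat /run/systemd/resolve/resolv.conf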
  • kubelet config
root@nm:/home/lzl# cat /var/lib/kubelet/config.yaml 
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
cgroupDriver: systemd
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
cpuManagerReconcilePeriod: 0s
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
logging:
  flushFrequency: 0
  options:
    json:
      infoBufferSize: "0"
  verbosity: 0
memorySwap: {}
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
resolvConf: /run/systemd/resolve/resolv.conf
rotateCertificates: true
runtimeRequestTimeout: 0s
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
volumeStatsAggPeriod: 0s
  • Flags added to the kubelet by kubeadm
root@nm:/home/lzl# cat /var/lib/kubelet/kubeadm-flags.env 
KUBELET_KUBEADM_ARGS="--network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.6"
  • OS information
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
$ uname -a
Linux nm 5.15.0-41-generic #44-Ubuntu SMP Wed Jun 22 14:20:53 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  • Output of sysctl -a | grep -w rp_filter
root@nm:/work-place/kubernetes/create-cluster# sysctl -a | grep -w rp_filter
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.cilium_host.rp_filter = 0
net.ipv4.conf.cilium_net.rp_filter = 0
net.ipv4.conf.cilium_vxlan.rp_filter = 0
net.ipv4.conf.default.rp_filter = 2
net.ipv4.conf.docker0.rp_filter = 2
net.ipv4.conf.ens33.rp_filter = 2
net.ipv4.conf.lo.rp_filter = 2
net.ipv4.conf.lxc_health.rp_filter = 2
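The notable thing in this output is that while conf.all and the cilium_* interfaces are at 0, several other interfaces, including lxc_health (presumably the veth for the cilium-health endpoint), still carry the distribution default of rp_filter = 2. Because the kernel evaluates max(conf.all, conf.{dev}) per interface, those interfaces, and any newly created lxc* veths that inherit conf.default = 2, effectively keep reverse-path filtering enabled. To find where that default comes from, a sketch (on Ubuntu 22.04 it is typically set by systemd's 50-default.conf, but that is an assumption worth verifying):

grep -r -w rp_filter /etc/sysctl.d/ /run/sysctl.d/ /usr/lib/sysctl.d/ 2>/dev/null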

One more note: when I used flannel for the cluster network, everything worked fine.

What I tried

Before getting the pointer on the issue, I made some adjustments based on the links below; an illustrative example of that kind of change follows the list.

  • Troubleshooting CoreDNS loops in Kubernetes clusters: https://github.com/coredns/coredns/blob/master/plugin/loop/README.md#troubleshooting-loops-in-kubernetes-clusters
  • A cluster-network failure caused by plugin/loop that looks quite similar to mine: https://github.com/coredns/coredns/issues/2790
  • The same coredns "[ERROR] plugin/errors: 2 ... read udp" upstream-unreachable error that shows up in this troubleshooting session: https://github.com/kubernetes/kubernetes/issues/86762
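As an illustration of the kind of adjustment those threads suggest (not necessarily the exact edit I made, and in any case none of these changes fixed my problem): point the forward plugin at an explicit upstream instead of /etc/resolv.conf and restart coredns.

# edit the Corefile, e.g. forward . /etc/resolv.conf  ->  forward . 8.8.8.8
kubectl -n kube-system edit configmap coredns
# restart coredns so it reloads the Corefile
kubectl -n kube-system rollout restart deployment coredns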

I then used busybox to get a shell inside a container and tried to ping my gateway, once in the healthy flannel-based network and once in the broken Cilium-based network.

In the flannel network the gateway is reachable (the 100% packet loss when pinging 10.96.0.10 is expected, since Service ClusterIPs generally do not answer ICMP):

root@master:/home/lzl/work-place/kubernetes/deploy-k8s# kubectl run -it --rm --restart=Never busybox --image=docker.io/library/busybox sh
If you don't see a command prompt, try pressing enter.
/ # ping 10.96.0.10
PING 10.96.0.10 (10.96.0.10): 56 data bytes
^C
--- 10.96.0.10 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
/ # ping 192.168.153.2
PING 192.168.153.2 (192.168.153.2): 56 data bytes
64 bytes from 192.168.153.2: seq=0 ttl=127 time=0.458 ms
64 bytes from 192.168.153.2: seq=1 ttl=127 time=0.405 ms
64 bytes from 192.168.153.2: seq=2 ttl=127 time=1.041 ms
^C
--- 192.168.153.2 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.405/0.634/1.041 ms

In the broken Cilium network, the gateway cannot be pinged at all:

root@nm:/work-place/kubernetes/create-cluster# kubectl run -it --rm --restart=Never busybox --image=docker.io/library/busybox sh
If you don't see a command prompt, try pressing enter.
/ # ping 192.168.153.2
PING 192.168.153.2 (192.168.153.2): 56 data bytes
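For a symptom like this, watching the Cilium datapath for dropped packets can show whether the traffic is being discarded on the node (a diagnostic sketch, not something I ran at the time):

# stream drop events from one of the agent pods while reproducing the failing ping
kubectl -n kube-system exec ds/cilium -- cilium monitor --type drop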

None of these attempts helped.

Solution

The final fix was provided by vincentmli, as follows.

Manually create the file below under /etc/sysctl.d/ and then reboot the node.

cat /etc/sysctl.d/99-zzz-override_cilium.conf
# Disable rp_filter on Cilium interfaces since it may cause mangled packets to be dropped
net.ipv4.conf.lxc*.rp_filter = 0
net.ipv4.conf.cilium_*.rp_filter = 0
# The kernel uses max(conf.all, conf.{dev}) as its value, so we need to set .all. to 0 as well.
# Otherwise it will overrule the device specific settings.
net.ipv4.conf.all.rp_filter = 0

After doing this on every node, coredns finally became healthy, and a test application I deployed was reachable over the network. A quick verification sketch is below.
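To confirm the overrides took effect after the reboot (interface names taken from the earlier sysctl output):

sysctl net.ipv4.conf.all.rp_filter net.ipv4.conf.lxc_health.rp_filter
kubectl -n kube-system get pods -l k8s-app=kube-dns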

For the reasoning behind this change, see this pull request: https://github.com/cilium/cilium/pull/20072

For further details, see the issue: https://github.com/cilium/cilium/issues/20498
