etcd command notes
Switching the ETCDCTL API version
with docker exec
docker exec -e "ETCDCTL_API=2" etcd_container_name_or_id etcdctl --help
env
export ETCDCTL_API=2
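Troubleshooting commands
- If etcd is started from a k8s manifest and that manifest sets the ETCDCTL_ENDPOINTS variable, endpoints can only be specified through the environment variable; alternatively, run unset "ETCDCTL_ENDPOINTS" in bash to clear the variable and then pass the endpoints as a flag. Otherwise etcdctl reports a configuration conflict: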
2021-10-19 03:25:28.520086 C | pkg/flags: conflicting environment variable "ETCDCTL_ENDPOINTS" is shadowed by corresponding command-line flag (either unset environment variable or disable flag)
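The unset-then-flag workaround can be wrapped in a small helper. This is a minimal sketch, not part of etcdctl: the function name etcdctl_with_endpoints is hypothetical, and it assumes etcdctl is on PATH. The subshell body means the variable is only cleared for that one call, so the caller's environment is left as it was.

```shell
# Hypothetical helper (not shipped with etcd): pass endpoints as a flag
# without tripping the "conflicting environment variable" check.
# usage: etcdctl_with_endpoints ENDPOINTS [etcdctl args...]
etcdctl_with_endpoints() (
  endpoints=$1
  shift
  # The function body is a subshell, so this unset does not leak
  # into the calling shell.
  unset ETCDCTL_ENDPOINTS
  etcdctl --endpoints="$endpoints" "$@"
)

# Usage (not run here):
#   etcdctl_with_endpoints https://127.0.0.1:2379 member list -w table
```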
The environment variables supported by etcdctl v3 are documented at https://github.com/etcd-io/etcd/blob/main/etcdctl/README.md
Commonly used ones include:
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379,https://10.255.251.102:2379,https://10.255.251.103:2379
export ETCDCTL_CACERT=/etc/kubernetes/ssl/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/ssl/etcd/peer.crt
export ETCDCTL_KEY=/etc/kubernetes/ssl/etcd/peer.key
For an etcd that backs k8s, the commands below initialize the etcdctl environment variables directly in a shell on a k8s controller node:
export ETCD_PEER_PORT=2379
eval $(kubectl get pods -owide -n kube-system|awk '/etcd/{printf "https://"$6":'$ETCD_PEER_PORT',"}'|awk '{gsub(",$","");print "export ETCDCTL_ENDPOINTS=\""$1"\""}')
export ETCDCTL_CACERT=/etc/kubernetes/ssl/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/ssl/etcd/peer.crt
export ETCDCTL_KEY=/etc/kubernetes/ssl/etcd/peer.key
etcdctl member list -w table
etcdctl endpoint status -w table
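The endpoint string that the awk pipeline assembles can also be built by a plain shell function, which may be easier to read and to check in isolation. A minimal sketch: build_endpoints is a hypothetical name, and the https scheme is assumed to match the setup above.

```shell
# Hypothetical helper: join member IPs into an ETCDCTL_ENDPOINTS value.
# usage: build_endpoints PORT IP [IP...]
build_endpoints() (
  port=$1
  shift
  list=""
  for ip in "$@"; do
    # Prepend a comma only when the list already has an entry.
    list="${list:+$list,}https://$ip:$port"
  done
  printf '%s\n' "$list"
)

# Usage (not run here):
#   export ETCDCTL_ENDPOINTS=$(build_endpoints 2379 10.255.251.102 10.255.251.103)
```

The IP arguments can come from any source, for example the IP column of the kubectl output used above.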
etcd member list (v3 API)
- etcdctl member list
[root@test kubelet]# docker exec -e "ETCDCTL_ENDPOINTS=https://192.168.150.12:12379,https://192.168.150.13:12379,https://192.168.150.14:12379" `docker ps | awk '/etcd /{print $1}'` etcdctl member list -w table
+------------------+---------+-------------------------+-----------------------------+------------------------------+------------+
|        ID        | STATUS  |          NAME           |         PEER ADDRS          |         CLIENT ADDRS         | IS LEARNER |
+------------------+---------+-------------------------+-----------------------------+------------------------------+------------+
| 8a8a69237e3e00ef | started | dce-etcd-192.168.150.12 | http://192.168.150.12:12380 | https://192.168.150.12:12379 |      false |
| d58cc05313738455 | started | dce-etcd-192.168.150.13 | http://192.168.150.13:12380 | https://192.168.150.13:12379 |      false |
| ed2566e796a749a6 | started | dce-etcd-192.168.150.14 | http://192.168.150.14:12380 | https://192.168.150.14:12379 |      false |
+------------------+---------+-------------------------+-----------------------------+------------------------------+------------+
etcd endpoint health (v3 API)
- etcdctl endpoint health
docker exec -e "ETCDCTL_API=3" -e "ETCDCTL_ENDPOINTS=https://192.168.155.22:12379,https://192.168.155.23:12379,https://192.168.155.24:12379" etcd_container_name_or_id etcdctl endpoint health -w table
+------------------------------+--------+--------------+-------+
|           ENDPOINT           | HEALTH |     TOOK     | ERROR |
+------------------------------+--------+--------------+-------+
| https://192.168.155.23:12379 |   true |    14.2308ms |       |
| https://192.168.155.22:12379 |   true |  14.572283ms |       |
| https://192.168.155.24:12379 |   true | 351.572429ms |       |
+------------------------------+--------+--------------+-------+
etcd endpoint status (v3 API)
- etcdctl endpoint status
docker exec -e "ETCDCTL_API=3" -e "ETCDCTL_ENDPOINTS=https://192.168.155.22:12379,https://192.168.155.23:12379,https://192.168.155.24:12379" etcd_container_name_or_id etcdctl endpoint status -w table
+------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|           ENDPOINT           |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://192.168.155.22:12379 | 5710b6824446f271 |   3.4.1 |   35 MB |     false |      false |        52 |    7925759 |            7925759 |        |
| https://192.168.155.23:12379 | 2a8509b66bfae6b6 |   3.4.1 |   35 MB |      true |      false |        52 |    7925759 |            7925759 |        |
| https://192.168.155.24:12379 |  72f4884011f8a2b |   3.4.1 |   35 MB |     false |      false |        52 |    7925760 |            7925760 |        |
+------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
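When only the leader matters, the table output can be filtered with awk. This is a rough sketch that assumes the column order shown above (etcd 3.4.x, IS LEADER as the fifth column); leader_endpoint is a hypothetical name, and a different etcdctl version may lay the table out differently.

```shell
# Hypothetical filter: read `etcdctl endpoint status -w table` on stdin
# and print the endpoint whose IS LEADER column is true.
leader_endpoint() {
  # With "|" as separator, $1 is empty, $2 is ENDPOINT, $6 is IS LEADER.
  # Border lines contain no "|" and the header cell never matches "true".
  awk -F'|' '$6 ~ /true/ { gsub(/ /, "", $2); print $2 }'
}

# Usage (not run here):
#   etcdctl endpoint status -w table | leader_endpoint
```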
etcd cluster-health (v2 API)
The environment variables supported by etcdctl v2 are documented at https://github.com/etcd-io/etcd/blob/main/etcdctl/READMEv2.md
Commonly used etcdctl v2 environment variables:
ETCDCTL_ENDPOINT
ETCDCTL_CA_FILE
ETCDCTL_KEY_FILE
ETCDCTL_CERT_FILE
- etcdctl cluster-health
docker exec -e "ETCDCTL_API=2" -e "ETCDCTL_ENDPOINTS=https://192.168.155.22:12379,https://192.168.155.23:12379,https://192.168.155.24:12379" etcd_container_name_or_id etcdctl cluster-health
member 72f4884011f8a2b is healthy: got healthy result from https://192.168.155.24:12379
member 2a8509b66bfae6b6 is healthy: got healthy result from https://192.168.155.23:12379
member 5710b6824446f271 is healthy: got healthy result from https://192.168.155.22:12379
cluster is healthy
docker exec -it -e ETCDCTL_API=2 `docker ps | awk '/etcd /{print $1}'` etcdctl cluster-health
Disk and network performance requirements
etcd is very sensitive to disk write latency. Typically 50 sequential IOPS (e.g., a 7200 RPM disk) is required. For heavily loaded clusters, 500 sequential IOPS (e.g., a typical local SSD or a high performance virtualized block device) is recommended. Note that most cloud providers publish concurrent IOPS rather than sequential IOPS; the published concurrent IOPS can be 10x greater than the sequential IOPS. To measure actual sequential IOPS, we suggest using a disk benchmarking tool such as diskbench or fio.
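Recommended etcd practices from OpenShift Container Platform
In terms of latency, etcd should run on a block device that can sequentially write at least 50 IOPS of 8000 bytes each. That corresponds to a latency of 20ms per write, bearing in mind that etcd uses fdatasync to synchronize every write to the WAL. For heavily loaded clusters, 500 sequential IOPS of 8000 bytes (2ms) is recommended. A benchmarking tool such as fio can measure these numbers.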
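A measurement along those lines could look like the wrapper below. It is a sketch under assumptions, not an official recipe: etcd_disk_bench is a hypothetical name, the 8000-byte block size comes from the text above, and the 22m total size and the job name are arbitrary choices. It assumes fio is installed and that DIR sits on the disk etcd will use.

```shell
# Hypothetical wrapper around fio for an etcd-style WAL write test.
# usage: etcd_disk_bench DIR
etcd_disk_bench() (
  dir=$1
  mkdir -p "$dir" || exit 1
  # Sequential 8000-byte writes, each followed by fdatasync, which
  # approximates how etcd persists WAL entries.
  fio --name=etcd-wal-check --directory="$dir" \
      --rw=write --ioengine=sync --fdatasync=1 \
      --bs=8000 --size=22m
)

# Usage (not run here):
#   etcd_disk_bench /var/lib/etcd/fio-test
```

In the fio report, the fsync/fdatasync latency percentiles are the figures to compare against the 20ms and 2ms targets.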
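IO latency and Queue Depth / Queue Length
IO latency is the total time from the moment the controller issues an IO command until the IO completes. An older unwritten industry rule holds that IO performance is acceptable to applications as long as IO latency stays within 20ms; above 20ms, application performance suffers noticeably. (The JMF602 controller manages only single-digit IOPS on small-file random writes, which is why it feels so sluggish.)
By that rule, the minimum IOPS a storage device must deliver is 1s / 20ms = 50 IOPS, so a mere 50 IOPS meets the requirement. A single 7200 RPM mechanical disk typically delivers around 80 IOPS; SSDs are far faster, and large storage systems that drive N IO channels in parallel can reach hundreds of thousands or even millions of IOPS.
However, storage devices should not always be held to the minimum standard. When few IOs arrive, IO latency is also low. Take an Intel X25-M Gen2 34nm 80G SSD: even with an average latency of 0.1ms, each IO channel delivers IOPS = 1000 / 0.1 = 10000, yet the vendor rates the drive at 35000 read IOPS. This brings in another concept: Queue Depth (also called queue length).
The controller does not send commands to the storage device one at a time but in batches; the target device executes the IOs as a batch and returns the data and results to the controller. As long as the device has enough capacity and throughput, when IO load is light, processing one command takes almost the same time as processing several at once. The maximum number of commands the controller can issue per batch is set by the controller's Queue Depth. (Good SSD controllers generally support a queue depth of 32.)
Given any two of queue depth, IOPS, and IO latency, the third can be derived: IOPS = (queue depth) / (IO latency). In practice IO latency rises as queue depth grows; the two reinforce each other. So as the number of IOs increases, the device quickly reaches its maximum IOPS, at which point IO latency climbs steeply while IOPS grows only slowly (the device is saturated).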
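The formula is easy to check with a couple of awk one-liners. A minimal sketch: the function names iops and queue_depth are hypothetical, latency is taken in milliseconds, and the 3.5 figure in the last example is simply what the formula yields for the numbers quoted above, not a vendor specification.

```shell
# IOPS = queue depth / IO latency, with latency given in milliseconds.
# usage: iops QUEUE_DEPTH LATENCY_MS
iops() {
  awk -v qd="$1" -v lat_ms="$2" 'BEGIN { printf "%.0f\n", qd * 1000 / lat_ms }'
}

# The same formula solved for queue depth.
# usage: queue_depth IOPS LATENCY_MS
queue_depth() {
  awk -v iops="$1" -v lat_ms="$2" 'BEGIN { printf "%.1f\n", iops * lat_ms / 1000 }'
}

iops 1 20              # 20ms latency, one IO at a time -> 50
iops 1 0.1             # 0.1ms latency, one IO at a time -> 10000
queue_depth 35000 0.1  # depth implied by 35000 IOPS at 0.1ms -> 3.5
```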
Official guidance on wal_fsync_duration_seconds
Usually this issue is caused by a slow disk. Before the leader sends heartbeats attached with metadata, it may need to persist the metadata to disk. The disk could be experiencing contention among etcd and other applications, or the disk is too simply slow (e.g., a shared virtualized disk). To rule out a slow disk from causing this warning, monitor wal_fsync_duration_seconds (p99 duration should be less than 10ms) to confirm the disk is reasonably fast. If the disk is too slow, assigning a dedicated disk to etcd or using faster disk will typically solve the problem. To tell whether a disk is fast enough for etcd, a benchmarking tool such as fio can be used. Read here for an example.