Notes on some Elasticsearch APIs
Latest revision as of 12:10, 10 August 2024 (Saturday)
Health
/_cat/health
/_cluster/health
Indices health
List indices filtered by condition (here, by health)
/_cat/indices?help
/_cat/indices?health=red&v&s=store.size:desc,index
/_cat/indices?health=yellow&v&s=store.size:desc,index
/_cat/indices?health=green&v&s=store.size:desc,index
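The three URLs above differ only in the health value, so they can be generated rather than typed. The following is a minimal sketch; the function name `cat_indices_url` and the base URL are assumptions for illustration, not part of Elasticsearch.

```python
from urllib.parse import urlencode


def cat_indices_url(base_url, health):
    """Build a /_cat/indices URL filtered by health, largest store size first.

    base_url is a hypothetical cluster address such as http://127.0.0.1:9200.
    """
    if health not in ("green", "yellow", "red"):
        raise ValueError(f"unknown health value: {health!r}")
    query = {"health": health, "v": "true", "s": "store.size:desc,index"}
    # Keep ':' and ',' unescaped so the sort spec stays readable.
    return f"{base_url.rstrip('/')}/_cat/indices?{urlencode(query, safe=':,')}"


if __name__ == "__main__":
    for health in ("red", "yellow", "green"):
        print(cat_indices_url("http://127.0.0.1:9200", health))
```

The resulting strings can be pasted into curl or a browser as-is.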
Nodes
/_cat/nodes?v
View disk usage, shard count, and so on for each ES node
/_cat/allocation?v
/_cat/nodeattrs
Get master node
/_cat/master?v
Cluster allocation explain related
Useful for locating shard state and finding out why a shard failed
/_cat/shards/index_name-*?v&s=state,index&h=index,shard,prirep,state,docs,store,ip,node,unassigned.reason
/_cluster/allocation/explain
Shards
Get a rough overview of shards, in particular which node each shard is on, and its size and state
GET /_cat/shards
GET /_cat/shards?index=index_name
GET /_cat/shards?index=index_na*
See why shard allocation failed
/_cat/shards/index_name-*?v&s=state,index&h=index,shard,prirep,state,docs,store,ip,node,unassigned.reason
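When many shards are unassigned, it can help to count them per reason instead of reading the table row by row. The sketch below assumes the same request was made with `format=json` added, so that each row arrives as an object keyed by the column names from `h=`; the sample rows are invented for illustration.

```python
from collections import Counter


def unassigned_reasons(rows):
    """Count UNASSIGNED shards per unassigned.reason.

    rows: the parsed JSON list from /_cat/shards?format=json&h=...,unassigned.reason
    """
    counts = Counter()
    for row in rows:
        if row.get("state") == "UNASSIGNED":
            counts[row.get("unassigned.reason") or "UNKNOWN"] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical sample rows, not real cluster output.
    sample = [
        {"index": "index_name-1", "shard": "0", "state": "STARTED", "unassigned.reason": None},
        {"index": "index_name-1", "shard": "1", "state": "UNASSIGNED", "unassigned.reason": "NODE_LEFT"},
        {"index": "index_name-2", "shard": "0", "state": "UNASSIGNED", "unassigned.reason": "NODE_LEFT"},
        {"index": "index_name-3", "shard": "0", "state": "UNASSIGNED", "unassigned.reason": "INDEX_CREATED"},
    ]
    print(unassigned_reasons(sample))
```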
Recovery API
Returns information about ongoing and completed shard recoveries, similar to the index recovery API.
For data streams, the API returns information about the stream’s backing indices
Shows the shards that are currently relocating, along with the progress percentage of each shard
GET /_cat/recovery?active_only=true&s=index&v
Adds a data stream or index to an alias, and sets the write index or data stream for the alias
Set the write index or data stream for an alias
If the alias doesn’t exist, the add action creates it.
POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "es-k8s-logs-000020",
        "alias": "es-k8s-logs-alias",
        "is_write_index": true
      }
    }
  ]
}
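If another index is already the write index for the alias, the switch is usually done in one request so that exactly one write index exists at any time. The following is a sketch of building such a body; the helper name and the index name `es-k8s-logs-000019` are assumptions for illustration.

```python
import json


def switch_write_index_actions(alias, old_index, new_index):
    """Build a POST /_aliases body that moves the write index from old to new.

    Both actions are sent in a single request; the old index stays in the
    alias for reads but stops receiving writes.
    """
    return {
        "actions": [
            {"add": {"index": old_index, "alias": alias, "is_write_index": False}},
            {"add": {"index": new_index, "alias": alias, "is_write_index": True}},
        ]
    }


if __name__ == "__main__":
    body = switch_write_index_actions(
        "es-k8s-logs-alias", "es-k8s-logs-000019", "es-k8s-logs-000020"
    )
    print(json.dumps(body, indent=2))
```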
Thread pool related
https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-thread-pool.html
/_cluster/settings?pretty&include_defaults=true | grep processors
Get maximum number of threads info
curl "127.1:9200/_cat/thread_pool?v&h=ip,node_name,id,name,max,size,queue_size,queue,active,rejected&pretty"
Templates
/_cat/templates?v
⚠️ /_template/${template_name} is the legacy index template API. Legacy index templates are deprecated and will be replaced by the composable templates introduced in Elasticsearch 7.8.
In newer versions, use /_index_template instead.
GET/PUT /_template/${template_name}
Use template to change the replicas settings of all indexes (Legacy index template)
Multiple index templates can potentially match an index; in this case, both the settings and mappings are merged into the final configuration of the index.
The order of the merging can be controlled using the order parameter, with lower order being applied first, and higher orders overriding them.
In legacy ES templates, order ranges from 0 to 2^31-1 (0 to 2147483647).
PUT /_template/${template_name}
{
  "order": 2147483647,
  "index_patterns": [
    "*"
  ],
  "settings": {
    "index": {
      "number_of_replicas": "0"
    }
  }
}
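The effect of order can be pictured as a dictionary merge in ascending order, which is why a template with the maximum order wins. This is a simplified sketch and not how Elasticsearch is implemented: it assumes settings are given as flat dotted keys, treats a missing order as 0, and the template contents are hypothetical.

```python
def merge_template_settings(templates):
    """Merge settings of matching legacy templates, lowest order first.

    Later (higher-order) templates overwrite keys set by earlier ones.
    """
    merged = {}
    for template in sorted(templates, key=lambda t: t.get("order", 0)):
        merged.update(template.get("settings", {}))
    return merged


if __name__ == "__main__":
    # Hypothetical templates that both match the same index.
    templates = [
        {"order": 2147483647, "settings": {"index.number_of_replicas": "0"}},
        {"order": 0, "settings": {"index.number_of_replicas": "1",
                                  "index.number_of_shards": "3"}},
    ]
    print(merge_template_settings(templates))
```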
Using jq to bulk-modify the lifecycle settings of ES index templates
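ILM (index lifecycle management)
As the name suggests, ILM manages the lifecycle of indices; it can also be used to build a hot-warm-cold architecture for an ES cluster.
The actions available in each phase are listed in this document: https://www.elastic.co/guide/en/elasticsearch/reference/current/ilm-index-lifecycle.html
Annoyingly, ILM currently has no endpoint like _cat/templates that lists only the names of the ILM policies configured in the cluster; you can only fetch the full definitions of all policies at once (as a workaround, the JSON preview in the browser's F12 developer tools can collapse them).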
GET /_ilm/policy
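Since the response of this request is a JSON object keyed by policy name, the names can be pulled out on the client side. A minimal sketch, with an invented sample response standing in for real output:

```python
def ilm_policy_names(response):
    """Return the sorted policy names from a parsed GET /_ilm/policy response."""
    return sorted(response)


if __name__ == "__main__":
    # Hypothetical, trimmed response; real entries also carry version and dates.
    sample = {
        "ilm-30d-delete": {"policy": {"phases": {"delete": {"min_age": "30d"}}}},
        "another-policy": {"policy": {"phases": {}}},
    }
    for name in ilm_policy_names(sample):
        print(name)
```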
Get the definition of a specific ILM policy
GET /_ilm/policy/${ilm_name}
PUT /_ilm/policy/ilm-30d-delete
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {
            "delete_searchable_snapshot": true
          }
        }
      }
    }
  }
}
Get an index's current ILM status
GET /${index_name}/_ilm/explain
Move an index to a given ILM step (manually trigger an ILM action)
https://www.elastic.co/guide/en/elasticsearch/reference/current/ilm-move-to-step.html
POST _ilm/move/my-index-000001
{
  "current_step": {
    "phase": "new",
    "action": "complete",
    "name": "complete"
  },
  "next_step": {
    "phase": "warm",
    "action": "forcemerge",
    "name": "forcemerge"
  }
}
Manually create an index that is managed by template and ILM (including rollover operations by day)
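That is, the index would normally be created by the template and ILM, and its name contains a date (a dynamic index name).
⚠️ Do not create this kind of index with a plain PUT /index-name-2022.10.23-000022. If you do, then when rollover happens (for example with rollover max_age: 1d), the new index name still carries the date given at creation time rather than the date on which the rollover occurs (for example index-name-2022.10.23-000023).
You can confirm this by checking the index.provided_name property in GET /index-name/_settings.
Solution: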
PUT %3Cindex-name-%7Bnow%2Fd%7D-000099%3E
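Note: in Kibana Dev Tools, do not URL-decode this; it has to be sent percent-encoded as shown (decoded, it is <index-name-{now/d}-000099>).
PS: if you are worried about getting the name wrong, use GET %3Cindex-name-%7Bnow%2Fd%7D-000099%3E/_settings to preview the resolved index name.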
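The encoded form does not have to be typed by hand. As a small sketch, Python's standard `urllib.parse.quote` with no safe characters produces the same string from the readable date math expression; the helper name is an assumption.

```python
from urllib.parse import quote


def encode_date_math_name(expression):
    """Percent-encode a date math index name for use in a request path."""
    # safe="" forces '/' to be encoded too, which the {now/d} part needs.
    return quote(expression, safe="")


if __name__ == "__main__":
    print(encode_date_math_name("<index-name-{now/d}-000099>"))
```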
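As for the sequence number at the end of the index name: regardless of the previous index's name, this number is always six characters and zero-padded (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-rollover-index.html#increment-index-names-for-alias). Even if a manually created index ends in -00001, the suffix still becomes -000002 after rollover.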
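The naming rule can be reproduced with a few lines of code to predict what rollover will produce. This is only a sketch of the documented rule for names ending in a hyphen and a number; it does not resolve date math, and the function name is hypothetical.

```python
import re


def next_rollover_name(index_name):
    """Predict the next rollover index name for a name ending in -<number>.

    The trailing number is incremented and always zero-padded to six digits.
    """
    match = re.fullmatch(r"(.*)-(\d+)", index_name)
    if match is None:
        raise ValueError(f"{index_name!r} does not end in '-<number>'")
    prefix, number = match.groups()
    return f"{prefix}-{int(number) + 1:06d}"


if __name__ == "__main__":
    print(next_rollover_name("index-name-00001"))
    print(next_rollover_name("index-name-2022.10.23-000022"))
```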
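Troubleshooting: switching an existing index to a different ILM policy, or changing the phase definitions of the policy it uses, can break ILM for that index
ILM processing for the index may then fail (I don't remember how it alerts). If I remember correctly, GET {index_name}/_ilm/explain shows the error and the phase in which the index is stuck.
In that case, fix it by manually moving the index to another ILM step, for example:
POST _ilm/move/insight-es-k8s-logs-dce5-aliyun-default-prd-2024.01.16
{
  "current_step": {
    "phase": "hot",
    "action": "rollover",
    "name": "ERROR"
  },
  "next_step": {
    "phase": "cold"
  }
}
Alternatively, try retrying with POST {index_name}/_ilm/retry.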
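The current_step of a move request has to match what the explain API reports, so it can be copied from there instead of typed. The sketch below assumes an explain entry exposes `phase`, `action`, and `step` fields, as in the request shown above; the helper name and the sample entry are hypothetical.

```python
import json


def move_request_from_explain(entry, next_phase):
    """Build a POST _ilm/move/<index> body from one _ilm/explain index entry."""
    return {
        "current_step": {
            "phase": entry["phase"],
            "action": entry["action"],
            "name": entry["step"],
        },
        # Only the target phase is given here, as in the example above.
        "next_step": {"phase": next_phase},
    }


if __name__ == "__main__":
    # Hypothetical explain entry for an index stuck in the ERROR step.
    entry = {"index": "my-index-000001", "managed": True,
             "phase": "hot", "action": "rollover", "step": "ERROR"}
    print(json.dumps(move_request_from_explain(entry, "cold"), indent=2))
```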
Cluster setting related
/_cluster/settings?include_defaults=true&pretty
/_cluster/settings?include_defaults=true
Wildcard expressions or all indices are not allowed
Allow deleting indices by wildcard pattern
PUT /_cluster/settings
{
  "persistent": {
    "action": {
      "destructive_requires_name": "false"
    }
  }
}
primaries recovery settings
Control the concurrency of index recovery and relocation
{
  "transient": {
    "cluster": {
      "routing": {
        "allocation": {
          "node_initial_primaries_recoveries": 10,
          "node_concurrent_incoming_recoveries": null,
          "node_concurrent_outgoing_recoveries": null,
          "node_concurrent_recoveries": 20
        }
      }
    }
  }
}
{
  "transient": {
    "cluster": {
      "routing": {
        "allocation": {
          "node_initial_primaries_recoveries": null,
          "node_concurrent_incoming_recoveries": null,
          "node_concurrent_recoveries": null
        }
      }
    }
  }
}
{
  "persistent": {
    "cluster": {
      "routing": {
        "allocation": {
          "node_initial_primaries_recoveries": 30,
          "node_concurrent_incoming_recoveries": null,
          "node_concurrent_recoveries": 10
        }
      }
    }
  }
}
cluster.routing.allocation.node_concurrent_recoveries: A shortcut to set both cluster.routing.allocation.node_concurrent_incoming_recoveries and cluster.routing.allocation.node_concurrent_outgoing_recoveries.
# PUT /_cluster/settings
{
  "persistent": {
    "cluster": {
      "routing": {
        "allocation": {
          "node_concurrent_recoveries": 8
        }
      }
    }
  }
}
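These notes write cluster settings both as nested objects and as dotted keys (as in the curl commands in the Error section). A small sketch of converting the nested form to the dotted one, which can make long bodies easier to compare; the function name is an assumption.

```python
def flatten_settings(nested, prefix=""):
    """Flatten nested settings into dotted keys, e.g. {"a": {"b": 1}} -> {"a.b": 1}."""
    flat = {}
    for key, value in nested.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_settings(value, path))
        else:
            # Leaves (numbers, strings, None for resetting a value) are kept as-is.
            flat[path] = value
    return flat


if __name__ == "__main__":
    nested = {"cluster": {"routing": {"allocation": {"node_concurrent_recoveries": 8}}}}
    print({"persistent": flatten_settings(nested)})
```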
recovery.max_bytes_per_sec: change the data transfer rate during relocation (https://www.elastic.co/guide/en/elasticsearch/reference/7.17/recovery.html)
Raising this value can effectively shorten the time ES takes to relocate an index.
indices.recovery.max_bytes_per_sec: Limits total inbound and outbound recovery traffic for each node. Applies to both peer recoveries as well as snapshot recoveries (i.e., restores from a snapshot). Defaults to 40mb unless the node is a dedicated cold or frozen node, in which case the default relates to the total memory available to the node.
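As a rough feel for what the limit means, the best-case transfer time is the shard size divided by the rate. This is only back-of-the-envelope arithmetic under the assumption that the throttle is the sole bottleneck; real recoveries also depend on disk, network, and how many recoveries share the per-node limit. The function name and the 50 GB shard size are made up for illustration.

```python
def relocation_seconds(shard_size_mb, max_mb_per_sec=40.0):
    """Best-case seconds to move shard_size_mb at max_mb_per_sec (default 40mb)."""
    if max_mb_per_sec <= 0:
        raise ValueError("max_mb_per_sec must be positive")
    return shard_size_mb / max_mb_per_sec


if __name__ == "__main__":
    shard_mb = 50 * 1024  # a hypothetical 50 GB shard
    for rate in (40, 100, 200):
        print(f"{rate} MB/s -> {relocation_seconds(shard_mb, rate):.0f} s")
```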
Index settings
modify the number of replicas in bulk
Set the number of replicas for one index or for many at once
PUT /index_name*/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}
Search Documents
match_all search
GET /sw_segment-20230914/_search
{
  "query": {
    "match_all": {}
  },
  "size": 1
}
match search on a single field, with sorting
GET /sw_segment-20230914/_search
{
  "query": {
    "match": {
      "segment_id": "b7bb26fae59e4f45b101346cb83ff796.69.16946808855979526"
    }
  },
  "sort": [
    {
      "start_time": {
        "order": "desc"
      }
    }
  ],
  "size": 1
}
Elastic Cloud on Kubernetes (ECK / Elastic operator)
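For an Elasticsearch cluster managed by the ECK operator, changing the cluster.routing.allocation.exclude settings requires first adding the annotation 'eck.k8s.elastic.co/managed=false' to the elasticsearch resource; otherwise the change is reverted shortly after it is applied.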
Other
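Speeding up recovery of a freshly restarted ES cluster with a large number of indices (10,000 to 20,000 primary shards)
Watch the ES node resource monitoring graphs for CPU pressure and CPU I/O wait.
Where the nodes have headroom, use the update cluster settings API to dynamically raise node_initial_primaries_recoveries (defaults to 4) and node_concurrent_recoveries (a shortcut to set both cluster.routing.allocation.node_concurrent_incoming_recoveries and cluster.routing.allocation.node_concurrent_outgoing_recoveries; defaults to 2).
Query cluster settings with include_defaults=true and filter the output to find the current values.
To shorten the time from red to yellow: increase the number of index replicas and raise node_initial_primaries_recoveries.
To shorten the time from yellow to green: raise node_concurrent_recoveries.
Query /_cluster/allocation/explain to find out what is keeping the cluster from reaching green (or yellow).
ES pods evicted by k8s during cluster recovery because of node memory pressure (node was low on resources: memory.)
Lower the JVM settings and avoid overcommitting (keep requests and limits as equal as possible, or raise requests).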
Error
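Cluster shard count reached the maximum
When the cluster reaches its maximum shard count, the following log message appears, but the cluster health status does not change.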
2022-11-10T10:26:03.643184618Z org.elasticsearch.common.ValidationException: Validation Failed: 1: this action would add [3] shards, but this cluster currently has [1999]/[2000] maximum normal shards open;
Solution:
Adjust the index ILM policy, or raise the cluster's max_shards_per_node setting.
Temporary setting (transient):
curl -H "content-type: application/json" -X PUT "127.0.0.1:9200/_cluster/settings" -d '{"transient": {"cluster.max_shards_per_node": "5000"}}'
Permanent setting (persistent):
curl -H "content-type: application/json" -X PUT "127.0.0.1:9200/_cluster/settings" -d '{"persistent": {"cluster.max_shards_per_node": "5000"}}'
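The numbers in the log message can be checked by hand: the cluster-wide cap is cluster.max_shards_per_node multiplied by the number of data nodes. The sketch below reproduces that check; the two-node count is an assumption inferred from the [2000] in the log together with a per-node value of 1000, and the function names are hypothetical.

```python
def shard_limit(max_shards_per_node, data_nodes):
    """Cluster-wide shard cap: per-node limit times number of data nodes."""
    return max_shards_per_node * data_nodes


def would_exceed(current_open, to_add, max_shards_per_node, data_nodes):
    """True if adding to_add shards would go past the cluster-wide cap."""
    return current_open + to_add > shard_limit(max_shards_per_node, data_nodes)


if __name__ == "__main__":
    # Values from the log above; 2 data nodes at 1000 per node is assumed.
    print(would_exceed(1999, 3, 1000, 2))
    # After raising cluster.max_shards_per_node to 5000 as in the curl commands.
    print(would_exceed(1999, 3, 5000, 2))
```

Raising the per-node limit only lifts the validation; each extra shard still costs heap and cluster-state overhead, so trimming shards through ILM remains the longer-term fix.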