Elasticsearch API notes

Revision as of 20:58, 12 December 2023 (Tue) by Admin (talk | contribs)

Health

/_cat/health
/_cluster/health

Indices health

List indices filtered by health status

/_cat/indices?help
/_cat/indices?health=red&v&s=store.size:desc,index
/_cat/indices?health=yellow&v&s=store.size:desc,index
/_cat/indices?health=green&v&s=store.size:desc,index

Nodes

/_cat/nodes?v

Show disk usage, shard count, and related figures for each ES node

/_cat/allocation?v

Get master node

/_cat/master?v

Cluster allocation explain related

Used to locate shard state and find out why a shard has failed

/_cat/shards/index_name-*?v&s=state,index&h=index,shard,prirep,state,docs,store,ip,node,unassigned.reason
/_cluster/allocation/explain
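
Called without a body, /_cluster/allocation/explain explains the first unassigned shard it finds. To ask about one particular shard, the API also accepts a JSON body with "index", "shard" and "primary". A minimal sketch of building that body; the helper name and the index name are made up for illustration:

```python
import json


def allocation_explain_body(index: str, shard: int, primary: bool) -> str:
    """Build the JSON body for /_cluster/allocation/explain for one shard.

    The helper name is hypothetical; the three keys are the ones the
    allocation explain API accepts for targeting a specific shard.
    """
    return json.dumps({"index": index, "shard": shard, "primary": primary})


# "index_name-000001" is a placeholder index name, not a real index.
print(allocation_explain_body("index_name-000001", 0, True))
```

Send the printed body with GET /_cluster/allocation/explain (for example from Kibana Dev Tools).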

Shards

Quick overview of shards, especially which node each shard is on, plus its size and state

GET /_cat/shards
GET /_cat/shards?index=index_name
GET /_cat/shards?index=index_na*

Show why shard allocation failed

/_cat/shards/index_name-*?v&s=state,index&h=index,shard,prirep,state,docs,store,ip,node,unassigned.reason
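
On a cluster with many shards the table above gets long. The _cat APIs also take format=json, which makes the output easy to summarise. A rough sketch that counts unassigned shards per reason, assuming the JSON keys match the column names requested with h= (the sample rows are invented):

```python
from collections import Counter


def unassigned_reasons(rows):
    """Count UNASSIGNED shards per unassigned.reason.

    `rows` is assumed to be the parsed output of
    /_cat/shards?format=json&h=index,shard,prirep,state,unassigned.reason
    """
    return Counter(
        row.get("unassigned.reason") or "UNKNOWN"
        for row in rows
        if row.get("state") == "UNASSIGNED"
    )


# Invented sample rows, shaped like the _cat JSON output (values are strings).
sample_rows = [
    {"index": "index_name-000001", "shard": "0", "prirep": "p",
     "state": "STARTED", "unassigned.reason": None},
    {"index": "index_name-000001", "shard": "0", "prirep": "r",
     "state": "UNASSIGNED", "unassigned.reason": "NODE_LEFT"},
    {"index": "index_name-000002", "shard": "1", "prirep": "r",
     "state": "UNASSIGNED", "unassigned.reason": "NODE_LEFT"},
    {"index": "index_name-000003", "shard": "0", "prirep": "p",
     "state": "UNASSIGNED", "unassigned.reason": "CLUSTER_RECOVERED"},
]

for reason, count in unassigned_reasons(sample_rows).most_common():
    print(reason, count)
```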

Adds a data stream or index to an alias, and sets the write index or data stream for the alias

Set the write index (or data stream) for an alias

If the alias doesn’t exist, the add action creates it.

POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "es-k8s-logs-000020",
        "alias": "es-k8s-logs-alias",
        "is_write_index": true
      }
    }
  ]
}
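
An alias can have only one write index, so when another index currently holds the flag it is usually cleared in the same request. The actions in one POST /_aliases call are applied atomically. A sketch of building such a body; the function name and the "old" index name are assumptions for illustration:

```python
import json


def switch_write_index_actions(alias, old_index, new_index):
    """Return a POST /_aliases body that moves the write flag in one call.

    The function name is hypothetical; "add" and "is_write_index" are the
    same fields used in the request above.
    """
    return {
        "actions": [
            {"add": {"index": old_index, "alias": alias, "is_write_index": False}},
            {"add": {"index": new_index, "alias": alias, "is_write_index": True}},
        ]
    }


# "es-k8s-logs-000019" is an assumed previous write index.
body = switch_write_index_actions(
    "es-k8s-logs-alias", "es-k8s-logs-000019", "es-k8s-logs-000020"
)
print(json.dumps(body, indent=2))
```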

Thread pool related

https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-thread-pool.html

curl -s "127.0.0.1:9200/_cluster/settings?pretty&include_defaults=true" | grep processors

Get maximum number of threads info

curl "127.1:9200/_cat/thread_pool?v&h=ip,node_name,id,name,max,size,queue_size,queue,active,rejected&pretty"
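
Most of the time the interesting rows are the pools that have rejected work. A small sketch that filters the same data when it is fetched with format=json. It assumes the JSON keys match the h= column names and that values arrive as strings; the node names are invented:

```python
def pools_with_rejections(rows):
    """Return thread pool rows with rejected > 0, highest count first.

    `rows` is assumed to be parsed /_cat/thread_pool?format=json output.
    """
    hits = [row for row in rows if int(row.get("rejected") or 0) > 0]
    return sorted(hits, key=lambda row: int(row["rejected"]), reverse=True)


# Invented sample rows; _cat JSON output carries numbers as strings.
sample_rows = [
    {"node_name": "es-data-0", "name": "write", "active": "3", "queue": "12", "rejected": "57"},
    {"node_name": "es-data-0", "name": "search", "active": "1", "queue": "0", "rejected": "0"},
    {"node_name": "es-data-1", "name": "write", "active": "8", "queue": "200", "rejected": "1043"},
]

for row in pools_with_rejections(sample_rows):
    print(row["node_name"], row["name"], row["rejected"])
```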

Templates

/_cat/templates?v

⚠️ /_template/${template_name} manages legacy index templates, which are deprecated and are being replaced by the composable index templates introduced in Elasticsearch 7.8.

Newer versions use /_index_template instead.

GET/PUT /_template/${template_name}

Use a template to change the replica setting of all new indices (legacy index template)

PUT /_template/${template_name}
{
    "order": 2147483647,
    "index_patterns": [
        "*"
    ],
    "settings": {
        "index": {
            "number_of_replicas": "0"
        }
    }
}
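
For comparison, a composable template (PUT /_index_template/${template_name}) nests settings, mappings and aliases under "template" and uses "priority" where the legacy format uses "order". The sketch below converts a legacy body into that shape. The function name is made up, and the behaviour is not identical: when several composable templates match, only the one with the highest priority is applied, whereas legacy templates are merged by order.

```python
import json


def legacy_to_composable(legacy, priority=None):
    """Reshape a legacy /_template body into an /_index_template body.

    Hypothetical helper: it only moves fields around and does not check
    for overlaps with templates that already exist in the cluster.
    """
    template = {
        key: legacy[key]
        for key in ("settings", "mappings", "aliases")
        if key in legacy
    }
    body = {"index_patterns": list(legacy["index_patterns"]), "template": template}
    chosen = legacy.get("order") if priority is None else priority
    if chosen is not None:
        body["priority"] = chosen
    return body


legacy_body = {
    "order": 2147483647,
    "index_patterns": ["*"],
    "settings": {"index": {"number_of_replicas": "0"}},
}
print(json.dumps(legacy_to_composable(legacy_body), indent=2))
```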

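ILM (index lifecycle management)

As the name suggests, ILM manages the index lifecycle. It can also be used to build a hot-warm-cold architecture for an ES cluster.

The actions available in each phase are listed in the ILM documentation.

Annoyingly, ILM currently has no endpoint like _cat/templates that lists only the names of the ILM policies configured in the cluster. You can only fetch the full definitions of all policies at once: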

GET /_ilm/policy
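
The response to GET /_ilm/policy is a JSON object keyed by policy name, so the names can be pulled out on the client side. A minimal sketch; the function name and the sample response are illustrative only:

```python
def ilm_policy_names(response):
    """Return the sorted policy names from a parsed GET /_ilm/policy response."""
    return sorted(response)


# Trimmed, invented sample; a real response carries the full policy definitions.
sample_response = {
    "ilm-30d-delete": {"version": 1, "policy": {"phases": {"delete": {"min_age": "30d"}}}},
    "ilm-7d-delete": {"version": 3, "policy": {"phases": {"delete": {"min_age": "7d"}}}},
}

print(ilm_policy_names(sample_response))
```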

Get the definition of a specific ILM policy

GET /_ilm/policy/${ilm_name}
PUT /_ilm/policy/ilm-30d-delete
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {
            "delete_searchable_snapshot": true
          }
        }
      }
    }
  }
}

Get an index's current ILM status

GET /${index_name}/_ilm/explain

Move an index to a specific ILM step (manually trigger an ILM action)
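
When the index name is a wildcard, the explain response covers many indices and the ones stuck in the ERROR step are easy to miss. A sketch that picks them out of the parsed response, assuming the usual shape with an "indices" object and per-index "managed", "step" and "step_info" fields (the sample data is made up):

```python
def indices_in_error(explain):
    """Map index name -> step_info for managed indices whose step is ERROR."""
    return {
        name: info.get("step_info", {})
        for name, info in explain.get("indices", {}).items()
        if info.get("managed") and info.get("step") == "ERROR"
    }


# Invented sample shaped like a GET /<pattern>/_ilm/explain response.
sample_explain = {
    "indices": {
        "my-index-000001": {
            "index": "my-index-000001", "managed": True,
            "phase": "hot", "action": "rollover", "step": "ERROR",
            "step_info": {"type": "illegal_argument_exception",
                          "reason": "example reason text (made up)"},
        },
        "my-index-000002": {
            "index": "my-index-000002", "managed": True,
            "phase": "hot", "action": "rollover", "step": "check-rollover-ready",
        },
        "unmanaged-index": {"index": "unmanaged-index", "managed": False},
    }
}

for name, info in indices_in_error(sample_explain).items():
    print(name, info.get("type"), info.get("reason"))
```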

https://www.elastic.co/guide/en/elasticsearch/reference/current/ilm-move-to-step.html

POST _ilm/move/my-index-000001
{
  "current_step": { 
    "phase": "new",
    "action": "complete",
    "name": "complete"
  },
  "next_step": { 
    "phase": "warm",
    "action": "forcemerge", 
    "name": "forcemerge" 
  }
}
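
The current_step in this request has to match the step the index is really in, otherwise the request is rejected. One way to avoid typos is to copy it from the _ilm/explain output, where the step name is reported in the "step" field. A sketch with a hypothetical helper name:

```python
import json


def move_to_step_body(explain_entry, next_phase, next_action, next_name):
    """Build a POST _ilm/move/<index> body from one _ilm/explain entry.

    `explain_entry` is assumed to be the per-index object from the explain
    response; its "step" field becomes "name" in current_step.
    """
    return {
        "current_step": {
            "phase": explain_entry["phase"],
            "action": explain_entry["action"],
            "name": explain_entry["step"],
        },
        "next_step": {"phase": next_phase, "action": next_action, "name": next_name},
    }


# Sample entry matching the request above.
entry = {"index": "my-index-000001", "managed": True,
         "phase": "new", "action": "complete", "step": "complete"}
print(json.dumps(move_to_step_body(entry, "warm", "forcemerge", "forcemerge"), indent=2))
```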

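Manually create an index that is normally managed by a template and ILM (including daily rollover), with a date in the index name (date math index name)

Do not create such an index with a plain PUT /index-name-2022.10.23-000022. If you do, then when the index rolls over (for example with rollover max_age: 1d), the new index keeps the date that was fixed at creation time instead of the date on which the rollover happens (you get, for example, index-name-2022.10.23-000023).

You can confirm this by checking the index.provided_name property in the output of GET /index-name/_settings.

Solution: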

PUT %3Cindex-name-%7Bnow%2Fd%7D-000099%3E

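Note: in Kibana Dev Tools, do not URL-decode this. The name has to be sent percent-encoded exactly as shown (decoded, it is <index-name-{now/d}-000099>).

P.S. If you are worried about creating the wrong name, run GET %3Cindex-name-%7Bnow%2Fd%7D-000099%3E/_settings first to preview the index name it resolves to.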
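
The encoded form does not have to be written by hand. A small sketch using the Python standard library; the function name is made up, and safe="" is what makes the slash inside {now/d} get encoded as well:

```python
from urllib.parse import quote


def encode_date_math(name):
    """Percent-encode a date math index name for use in a request path."""
    return quote(name, safe="")


print(encode_date_math("<index-name-{now/d}-000099>"))
```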

Cluster settings

/_cluster/settings?include_defaults=true&pretty
/_cluster/settings?include_defaults=true

Wildcard expressions or all indices are not allowed

Allow deleting indices with wildcard expressions
PUT /_cluster/settings
{
  "persistent": {
    "action": {
      "destructive_requires_name": "false"
    }
  }
}

primaries recovery settings

Control the concurrency of index recovery and shard relocation

PUT /_cluster/settings
{
    "transient": {
        "cluster": {
            "routing": {
                "allocation": {
                    "node_initial_primaries_recoveries": 10,
                    "node_concurrent_incoming_recoveries": null,
                    "node_concurrent_outgoing_recoveries": null,
                    "node_concurrent_recoveries": 20
                }
            }
        }
    }
}

PUT /_cluster/settings
{
    "transient": {
        "cluster": {
            "routing": {
                "allocation": {
                    "node_initial_primaries_recoveries": null,
                    "node_concurrent_incoming_recoveries": null,
                    "node_concurrent_recoveries": null
                }
            }
        }
    }
}

PUT /_cluster/settings
{
    "persistent": {
        "cluster": {
            "routing": {
                "allocation": {
                    "node_initial_primaries_recoveries": 30,
                    "node_concurrent_incoming_recoveries": null,
                    "node_concurrent_recoveries": 10
                }
            }
        }
    }
}

cluster.routing.allocation.node_concurrent_recoveries: A shortcut to set both cluster.routing.allocation.node_concurrent_incoming_recoveries and cluster.routing.allocation.node_concurrent_outgoing_recoveries.

PUT /_cluster/settings
{
    "persistent": {
        "cluster": {
            "routing": {
                "allocation": {
                    "node_concurrent_recoveries": 8
                }
            }
        }
    }
}
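
The bodies in this section differ only in scope (transient or persistent) and in values, and a null value resets a setting to its default. A sketch of a helper that generates them with flat dotted keys, which the cluster settings API accepts as well as the nested form. The helper name is hypothetical:

```python
import json

PREFIX = "cluster.routing.allocation."


def recovery_settings_body(scope, **values):
    """Build a PUT /_cluster/settings body for allocation recovery settings.

    Pass None to reset a setting; json.dumps renders it as null.
    """
    if scope not in ("transient", "persistent"):
        raise ValueError("scope must be 'transient' or 'persistent'")
    return json.dumps(
        {scope: {PREFIX + key: value for key, value in values.items()}},
        indent=2,
    )


# Raise the limits temporarily...
print(recovery_settings_body("transient",
                             node_initial_primaries_recoveries=10,
                             node_concurrent_recoveries=20))
# ...and reset them afterwards.
print(recovery_settings_body("transient",
                             node_initial_primaries_recoveries=None,
                             node_concurrent_recoveries=None))
```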
   

indices.recovery.max_bytes_per_sec: adjust the data transfer rate during recovery and relocation

Raising this value can noticeably shorten the time ES takes to relocate indices.

indices.recovery.max_bytes_per_sec: Limits total inbound and outbound recovery traffic for each node. Applies to both peer recoveries as well as snapshot recoveries (i.e., restores from a snapshot). Defaults to 40mb unless the node is a dedicated cold or frozen node, in which case the default relates to the total memory available to the node.
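
As a rough illustration of why the limit matters, the sketch below computes a best-case transfer time for a given amount of shard data. It is only a lower bound: it assumes the per-node limit is the sole bottleneck and ignores disk, network and other recoveries sharing the same budget. The 100 GB figure and the 200mb value are made-up examples.

```python
MB = 1024 * 1024
GB = 1024 * MB


def best_case_transfer_seconds(shard_bytes, max_bytes_per_sec):
    """Lower bound on transfer time if only the recovery rate limit applies."""
    return shard_bytes / max_bytes_per_sec


for limit_mb in (40, 200):
    seconds = best_case_transfer_seconds(100 * GB, limit_mb * MB)
    print(f"{limit_mb}mb/s: at least {seconds / 60:.1f} minutes for 100 GB")
```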

Index settings

Modify the number of replicas in bulk

Set the number of replicas for a single index or for many indices at once

PUT /index_name*/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}

Search Documents

match_all search

GET /sw_segment-20230914/_search
{
  "query": {
    "match_all": {}
  },
  "size": 1
}

Single-field match search with sorting (match)

GET /sw_segment-20230914/_search
{
  "query": {
    "match": {
      "segment_id": "b7bb26fae59e4f45b101346cb83ff796.69.16946808855979526"
    }
  },
  "sort": [
    {
      "start_time": {
        "order": "desc"
      }
    }
  ],
  "size": 1
}

Elastic Cloud on Kubernetes (ECK / Elastic operator)

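To change cluster.routing.allocation.exclude settings on an Elasticsearch cluster managed by the ECK operator, first set the annotation 'eck.k8s.elastic.co/managed=false' on the Elasticsearch resource. Otherwise the setting is reverted shortly after you apply it.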


Other

For a freshly restarted ES cluster with a large number of indices

(10,000 to 20,000 primary shards)

Speeding up ES cluster recovery

Watch the ES node resource monitoring graphs, in particular node CPU load and CPU I/O wait.

As headroom allows, use the update cluster settings API to dynamically raise node_initial_primaries_recoveries (defaults to 4) and node_concurrent_recoveries (a shortcut to set both cluster.routing.allocation.node_concurrent_incoming_recoveries and cluster.routing.allocation.node_concurrent_outgoing_recoveries; defaults to 2).

Use GET /_cluster/settings?include_defaults=true and filter the output to find the current values.
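
With include_defaults=true the response has three sections, "transient", "persistent" and "defaults", and a transient value overrides a persistent one, which overrides the default. A sketch that looks up the value actually in effect for a dotted key, assuming the default nested (not flat_settings) response; the sample data is invented:

```python
def effective_setting(settings, dotted_key):
    """Return (section, value) for the first section that defines the key.

    Sections are checked in precedence order; returns None if the key is
    not present in any of them.
    """
    for section in ("transient", "persistent", "defaults"):
        node = settings.get(section, {})
        for part in dotted_key.split("."):
            if not isinstance(node, dict) or part not in node:
                node = None
                break
            node = node[part]
        if node is not None:
            return section, node
    return None


# Invented, heavily trimmed sample of a nested cluster settings response.
sample_settings = {
    "persistent": {},
    "transient": {"cluster": {"routing": {"allocation": {
        "node_concurrent_recoveries": "20"}}}},
    "defaults": {"cluster": {"routing": {"allocation": {
        "node_concurrent_recoveries": "2",
        "node_initial_primaries_recoveries": "4"}}}},
}

for key in ("node_concurrent_recoveries", "node_initial_primaries_recoveries"):
    print(key, effective_setting(sample_settings, "cluster.routing.allocation." + key))
```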


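To shorten the time from red to yellow: increase the number of index replicas and raise node_initial_primaries_recoveries.

To shorten the time from yellow to green: raise node_concurrent_recoveries.

Query /_cluster/allocation/explain to find out what is keeping the cluster from reaching green (or yellow).

ES pods Evicted by k8s during cluster recovery because of node memory pressure (node was low on resources: memory.)

Reduce the JVM heap setting and avoid overcommitting memory (keep requests and limits as close as possible, or raise requests).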

Error

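Cluster shard count reached the maximum

When the cluster reaches its maximum shard count, a log message like the following appears, but the cluster health status does not change: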

2022-11-10T10:26:03.643184618Z org.elasticsearch.common.ValidationException: Validation Failed: 1: this action would add [3] shards, but this cluster currently has [1999]/[2000] maximum normal shards open;

Solution:

Adjust the index ILM policy, or raise the cluster's cluster.max_shards_per_node setting.

Temporary (transient) setting:

curl -H "content-type: application/json" -X PUT "127.0.0.1:9200/_cluster/settings" -d '{"transient": {"cluster.max_shards_per_node": "5000"}}'

Permanent (persistent) setting:

curl -H "content-type: application/json" -X PUT "127.0.0.1:9200/_cluster/settings" -d '{"persistent": {"cluster.max_shards_per_node": "5000"}}'
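
The limit in the error message is cluster-wide: cluster.max_shards_per_node multiplied by the number of data nodes. A sketch of that check follows. Reading the [2000] in the log above as the default of 1000 on 2 data nodes is an assumption, since the log does not say how many nodes there were.

```python
def shard_headroom(open_shards, max_shards_per_node, data_nodes):
    """Number of shards that can still be opened before hitting the limit."""
    return max_shards_per_node * data_nodes - open_shards


def would_be_rejected(open_shards, new_shards, max_shards_per_node, data_nodes):
    """True if adding new_shards would exceed the cluster-wide shard limit."""
    return open_shards + new_shards > max_shards_per_node * data_nodes


# Numbers from the log line above, assuming 1000 per node on 2 data nodes.
print(shard_headroom(1999, 1000, 2))
print(would_be_rejected(1999, 3, 1000, 2))
# Same cluster after raising cluster.max_shards_per_node to 5000.
print(would_be_rejected(1999, 3, 5000, 2))
```

Raising the limit only removes the error; every extra shard still costs heap and cluster state size, so trimming the shard count through ILM is usually the longer-term fix.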