tencent cloud

Elasticsearch Service

User Guide
Release Notes and Announcements
Release Notes
Product Announcements
Security Announcement
Product Introduction
Overview
Elasticsearch Version Support Notes
Features
Elastic Stack (X-Pack)
Strengths
Scenarios
Capabilities and Restrictions
Related Concepts
Purchase Guide
Billing Overview
Pricing
Elasticsearch Service Serverless Pricing
Notes on Arrears
ES Kernel Enhancement
Kernel Release Notes
Targeted Routing Optimization
Compression Algorithm Optimization
FST Off-Heap Memory Optimization
Getting Started
Evaluation of Cluster Specification and Capacity Configuration
Creating Clusters
Accessing Clusters
ES Serverless Guide
Service Overview
Basic Concepts
5-Minute Quick Experience
Quick Start
Access Control
Writing Data
Data Query
Index Management
Alarm Management
ES API References
Related Issues
Data Application Guide
Data Application Overview
Data Management
Elasticsearch Guide
Managing Clusters
Access Control
Multi-AZ Cluster Deployment
Cluster Scaling
Cluster Configuration
Plugin Configuration
Monitoring and Alarming
Log Query
Data Backup
Upgrade
Practical Tutorial
Data Migration and Sync
Use Case Construction
Index Configuration
SQL Support
Receiving Watcher Alerts via WeCom Bot
API Documentation
History
Introduction
API Category
Instance APIs
Making API Requests
Data Types
Error Codes
FAQs
Product
ES Cluster
Service Level Agreement
Glossary
New Version Introduction
Elasticsearch Service July 2020 Release
Elasticsearch Service February 2020 Release
Elasticsearch Service December 2019 Release

Uneven Cluster Load

PDF
Modo Foco
Tamanho da Fonte
Última atualização: 2021-08-11 11:17:23

Problem Description

In some cases, the CPU utilization of certain nodes in a cluster is much higher than that of other nodes, which can be clearly observed in node monitoring in the ES console.

Causes

The design of index shards is inappropriate.
The sizes of segments are uneven.
There are typical scenarios with requirements for storage of hot and warm data.

Troubleshooting

Inappropriate shard settings

1. Log in to the Kibana console and run the following command in Dev Tools to view the index shard information. If nodes with a high load have more index shards, the shards are unevenly assigned:
GET _cat/shards?v
2. Log in to the Kibana console and run the following command in Dev Tools to view the index information. Check whether the node shards are unevenly assigned based on the cluster configuration:
GET _cat/indices?v
3. Assign shards again and appropriately plan them to ensure that the total number of primary and replica shards is an integer multiple of the number of data nodes in the cluster.
Note:
Elasticsearch also searches .del files during search and filters documents marked with .del, which reduces the search efficiency and wastes the specification resources. We recommend you force a merge during off-peak hours. For more information, please see Force merge API.

Suggestions for shard planning

The shard size and quantity are two important factors affecting the stability and performance of an Elasticsearch cluster. All indexes in the cluster require appropriate shard planning; otherwise, large shards in unspecific businesses may incur excessive Elasticsearch performance overheads. The following are some suggestions for shard planning:
Keep the size of a single shard of an index between 20 and 50 GB.
1. Add a time suffix to the index, so you can implement scrolling by time for easier management.
2. While following the principle of single shard design, predict the final index size and estimate the number of index shards based on the number of cluster nodes so as to distribute the shards among the nodes as evenly as possible.
Note:
More primary shards are not necessarily better, as the more the primary shards, the higher the Elasticsearch performance overheads. We recommend you keep the total number of shards per node 30 times of the node memory. If there are too many shards, file handles are very likely to be used up, causing cluster failures.

Uneven segment sizes

1. Add "profile": true in the query body to check whether the test index has a shard whose query time is longer than that of other shards.
2. Specify preference=_primary and preference=_replica in the query separately and add "profile": true in the body to view the time it takes the primary and replica shards to make the query respectively. Check whether the primary or replica shard uses more time.
3. Log in to the Kibana console, run the following command in Dev Tools to view the shards, analyze the problem based on the segment information, and check whether the uneven loads are related to the uneven segment sizes.
GET _cat/segments/index?v&h=shard,segment,size,size.momery,ip
GET _cat/shards?v
4. Solve the problem in either of the following methods:
Force a merge during off-peak hours. For more information, please see Force merge API. Delete delete.doc in the cache completely and merge small segments into big ones.
Restart the node where the primary shard resides to trigger promoting the replica shard to the primary shard and generate a new replica shard. Data in the new primary shard is replicated to the new replica shard to ensure that they have the same segments.

Typical scenarios with requirements for storage of hot and warm data

If you add routing in a query or query hotspot data with a high query frequency, uneven data load will definitely occur.

Ajuda e Suporte

Esta página foi útil?

comentários