Symptom
When an Elasticsearch (ES) cluster holds a large amount of data, restarting the cluster results in slow data recovery and instance faults.
Solution
1. Check memory configuration parameters.
Log in to the FusionInsight Manager and choose Elasticsearch > Service Configuration. On the service configuration page that is displayed, search for GC_OPTS.
Check the values of -Xms and -Xmx. The values of the two parameters must be the same.
The recommended value of -Xms and -Xmx for a single instance is: total memory of the node / 2 / number of Elasticsearch instances on the node. If the result is greater than 31G, set both parameters to 31G.
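The heap-size rule in step 1 can be sketched as a small shell calculation. This is only an illustration: it assumes the last divisor in the formula is the number of Elasticsearch instances on the node, it works in whole gigabytes with integer division, and the function name recommended_heap_gb is hypothetical.

```shell
#!/bin/sh
# Sketch: recommended -Xms/-Xmx value (in GB) for one Elasticsearch instance.
# Arguments: $1 = total node memory in GB, $2 = number of instances on the node
# (assumed meaning of the last divisor in the formula).
recommended_heap_gb() {
    total_gb=$1
    instances=$2
    # Half of the node memory, split evenly across the instances (integer division).
    heap=$(( total_gb / 2 / instances ))
    # Cap the result at 31 GB, as advised in step 1.
    if [ "$heap" -gt 31 ]; then
        heap=31
    fi
    echo "$heap"
}

# Example: a node with 256 GB of memory running 2 instances.
heap=$(recommended_heap_gb 256 2)
echo "-Xms${heap}G -Xmx${heap}G"
```

For the example node, 256 / 2 / 2 is 64 GB, which exceeds the cap, so both options come out as 31G.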
2. Set parameters by referring to https://forum.huawei.com/enterprise/en/Data-Cannot-Be-Recovered-After-the-ES-Cluster-Is-Restarted-for-a-Large-Amount-of-Data/thread/544257-899.
3. Log in to the FusionInsight Manager and choose Elasticsearch > Service Configuration. On the Service Configuration page that is displayed, choose Configuration > All Configuration > Customized and add the following customized parameters:
cluster.routing.allocation.node_initial_primaries_recoveries 120
cluster.routing.allocation.node_concurrent_recoveries 60
cluster.routing.allocation.cluster_concurrent_rebalance 60
4. Query the cluster status and wait until it changes to green. Replace IP and HTTPPORT with the address and HTTP port of an Elasticsearch instance.
In common mode: curl -XGET "http://IP:HTTPPORT/_cluster/health?pretty"
In security mode: curl -XGET --negotiate -k -u: "https://IP:HTTPPORT/_cluster/health?pretty"
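Instead of rerunning the health query by hand, the wait in step 4 could be scripted roughly as follows. This is a minimal sketch for common mode, not part of the original procedure: the names HEALTH_URL, cluster_status, and wait_for_green are hypothetical, and IP and HTTPPORT still have to be replaced with real values.

```shell
#!/bin/sh
# Sketch: poll _cluster/health until the cluster status is green.
# HEALTH_URL is a hypothetical variable; replace IP and HTTPPORT before use.
HEALTH_URL="http://IP:HTTPPORT/_cluster/health?pretty"

# Read a health response on stdin and print the value of its "status" field.
cluster_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Query the health endpoint up to $1 times, sleeping $2 seconds between tries.
# Returns 0 as soon as the status is green, 1 if it never becomes green.
wait_for_green() {
    attempts=$1
    interval=$2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # For security mode, use the https URL and add: --negotiate -k -u:
        status=$(curl -s -XGET "$HEALTH_URL" | cluster_status)
        echo "cluster status: ${status:-unknown}"
        if [ "$status" = "green" ]; then
            return 0
        fi
        i=$(( i + 1 ))
        sleep "$interval"
    done
    return 1
}
```

For example, calling wait_for_green 120 30 checks every 30 seconds for up to one hour and stops as soon as the status is green.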