
Elasticsearch Development Specifications

Latest reply: Nov 29, 2018 05:15:13

1.1 Rules

Elasticsearch Application Scenarios

1. The data to be searched can be structured (RDS), semi-structured (web pages and XML files), or unstructured (logs, pictures, and images). Elasticsearch cleans and segments this data, builds inverted indexes for it, and then provides full-text search over it.

2. The search criteria are diverse (for example, many fields are involved), and common queries cannot meet requirements such as searching the full text for simple words and phrases, or for multiple forms of a word or phrase.

3. Far more data is read than written.

Import the required classes in Elasticsearch applications

Correct:

//Classes that need to be imported when RestClient is created:
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
//Classes that need to be imported when a request is sent:
import org.apache.http.HttpEntity;
import org.apache.http.entity.ContentType;
//Class that needs to be imported when a response is parsed:
import org.elasticsearch.client.Response;

If the cluster is installed in security mode, ensure that the time on the client is the same as that on the server

If the cluster is the security edition and Kerberos authentication is required, the client clock must be synchronized with the server clock. Account for time zone differences when comparing the two. If the clocks are inconsistent, client authentication fails and subsequent service processes cannot run.
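The time comparison described above can be sketched in Java as follows. This is only an illustration under stated assumptions: the class name ClockSkewCheck is hypothetical, the 5-minute tolerance is a common Kerberos default that may differ on your KDC, and how the server time is obtained is left out. Comparing Instant values rather than local wall-clock times removes the time zone conversion from the picture.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical helper, not part of any Elasticsearch or FusionInsight API.
final class ClockSkewCheck {
    // Assumption: 5 minutes is a common Kerberos default tolerance;
    // check the value actually configured for your KDC.
    static final Duration ASSUMED_MAX_SKEW = Duration.ofMinutes(5);

    // Compares absolute instants, so the time zones of the two clocks do not matter.
    static boolean withinSkew(Instant client, Instant server, Duration maxSkew) {
        return Duration.between(client, server).abs().compareTo(maxSkew) <= 0;
    }

    public static void main(String[] args) {
        // The same moment expressed in two time zones: 05:15 UTC and 13:15 UTC+8.
        ZonedDateTime server = ZonedDateTime.of(2018, 11, 29, 5, 15, 0, 0, ZoneOffset.UTC);
        ZonedDateTime client = ZonedDateTime.of(2018, 11, 29, 13, 15, 0, 0, ZoneOffset.ofHours(8));
        System.out.println(withinSkew(client.toInstant(), server.toInstant(), ASSUMED_MAX_SKEW));
    }
}
```

In practice the clocks are usually kept aligned with NTP; a check like this is only useful as a diagnostic when authentication fails.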

When a user you created performs index data operations, configure authentication information and grant the user the corresponding read and write permissions

If the cluster is of the security edition, Kerberos authentication is required for connection to the server. Perform the following operations to log in to the KDC:

private static void setSecConfig() throws Exception {
        String krb5ConfFile = System.getProperty("user.dir") + File.separator + "conf" + File.separator + "krb5.conf";
        LOG.info("krb5ConfFile: " + krb5ConfFile);
        System.setProperty("java.security.krb5.conf", krb5ConfFile);
        String jaasPath = System.getProperty("user.dir") + File.separator + "conf" + File.separator + "jaas.conf";
        LOG.info("jaasPath: " + jaasPath);
        System.setProperty("java.security.auth.login.config", jaasPath);
        System.setProperty("javax.security.auth.useSubjectCredsOnly", "false");
        //add for ES security indication 
        System.setProperty("es.security.indication", "true");
        LOG.info("es.security.indication is  " + System.getProperty("es.security.indication"));
}

The krb5.conf and user.keytab files are obtained from the user management page of FusionInsight Manager. In jaas.conf, change principal to the user name and keyTab to the actual storage path of user.keytab.

The created user must have the read and write permissions on the index to be operated. Select the administrator role for the user or grant the read and write permissions of the corresponding index to the user.

If the cluster is of the security edition, set up a secure httpClientBuilder

If the cluster is the security edition, set up a secure httpClientBuilder using the following code:

HttpAsyncClientBuilder httpClientBuilder = HttpAsyncClientBuilder.create()
        .setDefaultRequestConfig(requestConfigBuilder.build())
        //default settings for connection pooling may be too constraining
        .setMaxConnPerRoute(DEFAULT_MAX_CONN_PER_ROUTE)
        .setMaxConnTotal(DEFAULT_MAX_CONN_TOTAL)
        .setSSLContext(SSLContext.getDefault());
if (httpClientConfigCallback != null) {
    httpClientBuilder = httpClientConfigCallback.customizeHttpClient(httpClientBuilder);
}
if (isSecureMode) {
    wrapSecureHttpAsyncClientBuilder(httpClientBuilder);
}

Invoke the RestClient closing function before the application ends

When the application ends, invoke the restClient.close() function.
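One way to make sure the close call is never skipped is try-with-resources, which works with RestClient because it implements java.io.Closeable. The sketch below shows the pattern with a stand-in class so that it runs without the Elasticsearch library; StubClient and CloseOnExit are hypothetical names used only for illustration.

```java
import java.io.Closeable;

// Hypothetical demo class; StubClient stands in for RestClient.
final class CloseOnExit {
    static final class StubClient implements Closeable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    // Runs the work and closes the client even if the work throws.
    static StubClient runAndClose(Runnable work) {
        StubClient client = new StubClient();
        try (StubClient c = client) {
            work.run();
        } catch (RuntimeException e) {
            // Swallowed only to keep the demo short; real code should log or rethrow.
        }
        return client;
    }

    public static void main(String[] args) {
        StubClient client = runAndClose(() -> { throw new IllegalStateException("request failed"); });
        System.out.println("closed = " + client.closed);
    }
}
```

With a real client, the same shape applies: declare the RestClient in the try header and the runtime calls close() when the block exits, whether normally or through an exception.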

1.2 Suggestions

Creating Elasticsearch Indexes

•   To reduce the number of indexes and avoid huge mappings, store data with the same index structure in the same index.

•   Do not place irrelevant data in the same index to avoid sparsity.

Note

These suggestions do not apply when you use the parent/child relationship between documents, because that relationship is supported only between documents in the same index.

Shard Division Policy

Each shard can process index and query requests. When setting the number of shards, consider the following aspects:

•   Keep the maximum capacity of a single shard at or below 30 GB.

•   Determine the number of primary shards from the maximum data capacity of the index and the capacity of a single shard.

•   To improve data reliability, set an appropriate number of replica shards.

Note

Once an index is created, its number of primary shards cannot be changed. The number of replica shards can be modified as required.
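The sizing rule above amounts to dividing the expected maximum index size by the per-shard limit and rounding up. The following is a minimal sketch of that arithmetic; the class and method names are hypothetical, and the replica count in the usage is an arbitrary example. The settings keys number_of_shards and number_of_replicas are the standard index settings.

```java
// Hypothetical helper for planning shard counts; not part of any Elasticsearch API.
final class ShardPlanner {
    // Per-shard upper bound recommended in the text (30 GB).
    static final long MAX_SHARD_GB = 30;

    // Smallest number of primary shards that keeps each shard at or below the limit.
    static int primaryShards(long maxIndexSizeGb) {
        if (maxIndexSizeGb <= 0) {
            throw new IllegalArgumentException("index size must be positive");
        }
        return (int) ((maxIndexSizeGb + MAX_SHARD_GB - 1) / MAX_SHARD_GB);
    }

    // JSON body for an index-creation request carrying the chosen shard settings.
    static String createIndexBody(int primaries, int replicas) {
        return "{\"settings\":{\"number_of_shards\":" + primaries
                + ",\"number_of_replicas\":" + replicas + "}}";
    }

    public static void main(String[] args) {
        int primaries = primaryShards(100); // an index expected to grow to 100 GB
        System.out.println(createIndexBody(primaries, 1));
    }
}
```

Because the primary shard count is fixed at creation time, it is safer to size for the largest volume the index is expected to reach rather than for its initial volume.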

Do not return a large result set

Elasticsearch is designed as a search engine, which makes it very good at finding the documents that best match a query. It is not suited to retrieving every document that matches a particular query in a single request. If you need all matching documents, use the Scroll API.
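As a rough illustration of how a scroll works at the REST level, the sketch below builds the paths and JSON bodies for the initial search and for each follow-up page. The class name, the match_all query, and the example values are assumptions for illustration; sending the requests (for example through RestClient) is left out so the sketch runs on its own.

```java
// Hypothetical helper that builds Scroll API request paths and bodies as strings.
final class ScrollRequests {
    // Path of the first request: a normal search with a scroll keep-alive, e.g. "1m".
    static String initialPath(String index, String keepAlive) {
        return "/" + index + "/_search?scroll=" + keepAlive;
    }

    // Body of the first request: page size plus the query (match_all is only an example).
    static String initialBody(int pageSize) {
        return "{\"size\":" + pageSize + ",\"query\":{\"match_all\":{}}}";
    }

    // Path of every follow-up request.
    static String nextPath() {
        return "/_search/scroll";
    }

    // Body of a follow-up request: renews the keep-alive and passes the scroll_id
    // returned by the previous response. The id is assumed to need no JSON escaping.
    static String nextBody(String keepAlive, String scrollId) {
        return "{\"scroll\":\"" + keepAlive + "\",\"scroll_id\":\"" + scrollId + "\"}";
    }

    public static void main(String[] args) {
        System.out.println("POST " + initialPath("my_index", "1m"));
        System.out.println(initialBody(1000));
        System.out.println("POST " + nextPath());
        System.out.println(nextBody("1m", "<scroll_id from the previous response>"));
    }
}
```

The client repeats the follow-up request until a response comes back with no hits, so each page stays small even when the total result set is large.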

 

