
Hive: Development Environment Preparation

Latest reply: Sep 24, 2018 06:10:17

1.1.1 Development and Operating Environment

Hive supports application development by using the JDBC/HCatalog/Python/ODBC API. Table 1-1 describes the development and operating environment to be prepared. JDBC and HCatalog APIs use the same development and operating environment.

Table 1-1 Development environment

Item

Description

OS

- Development environment: Windows OS. Windows 7 or later is recommended.

- Operating environment: Windows OS or Linux OS

JDK installation

Basic configuration for the development and operating environment. The version requirements are as follows:

The FusionInsight HD cluster's server and client support only the built-in Oracle JDK 1.8; therefore, JDK replacement is not allowed.

Customer applications that reference the SDK JAR files and run in their own processes support both Oracle JDK and IBM JDK.

- Oracle JDK versions: 1.7 and 1.8

- Recommended IBM JDK versions: 1.7.8.10, 1.7.9.40, and 1.8.3.0

NOTE

FusionInsight servers support only TLSv1.1 and TLSv1.2 to meet security requirements. IBM JDK supports only TLSv1.0 by default. Therefore, if IBM JDK is used, set com.ibm.jsse2.overrideDefaultTLS to true for IBM JDK to support TLSv1.0, TLSv1.1, and TLSv1.2.

For details, see https://www.ibm.com/support/knowledgecenter/zh/SSYKE2_8.0.0/com.ibm.java.security.component.80.doc/security-component/jsse2Docs/matchsslcontext_tls.html#matchsslcontext_tls.

Eclipse installation and configuration

Basic configuration for the development and operating environment. The Eclipse version must be 4.2 or later.

NOTE

- To use the IBM JDK, ensure that the JDK configured in Eclipse is the IBM JDK.

- To use the Oracle JDK, ensure that the JDK configured in Eclipse is the Oracle JDK.

- If multiple Eclipse installations are used, each requires its own workspace, and the example projects must be stored in different paths.

Developer account preparation

For details, see Application Development Guide > Security Mode > Security Authentication > Preparing the Developer Account in the FusionInsight HD Product Documentation.

Client installation

Development environment: For details, see Application Development Guide > Security Mode > Security Authentication > Configuring Client Files in the FusionInsight HD Product Documentation.

 

Python development environment

Item

Description

OS

Development and operating environment: Linux OS

Python installation

Python is a tool used to develop Hive applications. The version must be 2.6.6 or later, but not later than 2.7.0.

setuptools installation

Basic configuration for the Python development environment. The setuptools version must be 5.0 or later.

Developer account preparation

For details, see Application Development Guide > Security Mode > Security Authentication > Preparing the Developer Account in the FusionInsight HD Product Documentation.

Client installation

For details, see Software Installation > Initial Configuration > Configuring Client > Installing a Client in the FusionInsight HD Product Documentation.

 

note

For details about how to install and configure the Python development tool, see Using the FusionInsight Client for Development.
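The version constraint stated above (2.6.6 or later, but not later than 2.7.0) can be expressed as a simple tuple comparison. The helper below is only an illustrative sketch; python_version_ok is not part of the client tooling:

```python
import sys

def python_version_ok(version_info=None):
    """Check the constraint stated above: 2.6.6 <= version <= 2.7.0."""
    v = tuple((version_info or sys.version_info)[:3])
    return (2, 6, 6) <= v <= (2, 7, 0)

print(python_version_ok((2, 6, 6)))  # True
print(python_version_ok((2, 7, 5)))  # False: later than 2.7.0
print(python_version_ok((2, 6, 5)))  # False: earlier than 2.6.6
```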

ODBC development environment

Item

Description

OS

Development and operating environment: Windows OS or Linux OS

JDK installation

Basic configuration for the development and operating environment. The version requirements are as follows:

The FusionInsight HD cluster's server and client support only the built-in Oracle JDK 1.8; therefore, JDK replacement is not allowed.

Customer applications that reference the SDK JAR files and run in their own processes support both Oracle JDK and IBM JDK.

- Oracle JDK version: 1.8

- Recommended IBM JDK version: 1.8.3.0

NOTE

FusionInsight servers support only TLSv1.1 and TLSv1.2 to meet security requirements. IBM JDK supports only TLSv1.0 by default. Therefore, if IBM JDK is used, set com.ibm.jsse2.overrideDefaultTLS to true for IBM JDK to support TLSv1.0, TLSv1.1, and TLSv1.2.

For details, see https://www.ibm.com/support/knowledgecenter/zh/SSYKE2_8.0.0/com.ibm.java.security.component.80.doc/security-component/jsse2Docs/matchsslcontext_tls.html#matchsslcontext_tls.

OpenSSL installation

Basic configuration for the development environment. The OpenSSL version must be 1.x.

LibSASL installation

Basic configuration for the Linux development environment. The LibSASL version must be 2.x.

Cyrus SASL installation

Basic configuration for the Windows development environment. The Cyrus SASL version must be 2.1.23.

MIT Kerberos installation

Basic configuration for the Windows development environment.

Visual Studio 2012 installation

Basic configuration for the Windows development environment.

Developer account preparation

For details, see Application Development Guide > Security Mode > Security Authentication > Preparing the Developer Account in the FusionInsight HD Product Documentation.

Client installation

- Windows environment: For details, see Application Development Guide > Security Mode > Security Authentication > Configuring Client Files in the FusionInsight HD Product Documentation.

- Linux environment: For details, see Software Installation > Initial Configuration > Configuring Client > Installing a Client in the FusionInsight HD Product Documentation.

 

note

For details about how to install and configure the ODBC development tool, see section "Configuring the ODBC Sample Project" in the FusionInsight HD Product Documentation.

1.1.2 Preparing for Security Authentication

1.1.2.1 Security Authentication

1.1.2.1.1 Security Authentication Principle and Mechanism

Function

Kerberos, named after Cerberus, the ferocious three-headed guard dog of Hades in Greek mythology, is a security authentication protocol. Systems using Kerberos adopt a client/server structure and encryption technologies such as AES, and allow the client and server to authenticate each other. Kerberos prevents interception and replay attacks and protects data integrity. It manages keys by using a symmetric key mechanism.

Structure

Figure 1-1 shows the Kerberos architecture and Table 1-2 describes the Kerberos modules.

Figure 1-1 Kerberos architecture


 

Table 1-2 Kerberos modules

Module

Description

Application Client

An application client, which is usually an application that submits tasks or jobs.

Application Server

An application server, which is usually an application that an application client accesses.

Kerberos

A service that provides security authentication.

KerberosAdmin

A process that provides authentication user management.

KerberosServer

A process that provides authentication ticket distribution.

 

The process and principle are described as follows:

An application client can be a service in the cluster or a secondary development application of the customer. An application client can submit tasks or jobs to an application service.

1.         Before submitting a task or job, the application client needs to apply for a ticket granting ticket (TGT) from the Kerberos service to establish a secure session with the Kerberos server.

2.         After receiving the TGT request, the Kerberos service resolves parameters in the request to generate a TGT, and uses the key of the username specified by the client to encrypt the response.

3.         After receiving the TGT response, the application client (based on the underlying RPC) resolves the response and obtains the TGT, and then applies for a server ticket (ST) of the application server from the Kerberos service.

4.         After receiving the ST request, the Kerberos service verifies the TGT validity in the request and generates an ST of the application service, and then uses the application service key to encrypt the response.

5.         After receiving the ST response, the application client packages the ST into a request and sends the request to the application server.

6.         After receiving the request, the application server uses its local application service key to decrypt the ST. If the verification succeeds, the request is valid.
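The six steps above can be modeled with a toy sketch. The seal/unseal helpers below merely pair a payload with the key that sealed it, standing in for real symmetric encryption such as AES, and all names and key values are illustrative:

```python
import secrets

def seal(key, payload):
    """Simulate encrypting payload with key (a real KDC would use AES)."""
    return {"key": key, "payload": payload}

def unseal(key, box):
    """Simulate decryption; fails when the wrong key is supplied."""
    if box["key"] != key:
        raise ValueError("wrong key")
    return box["payload"]

USER_KEYS = {"develop": "user-secret"}          # known to client and KDC
SERVICE_KEYS = {"AppServer": "service-secret"}  # known to server and KDC

def kdc_grant_tgt(username):
    """Steps 1-2: generate a TGT and seal it with the user's key."""
    tgt = {"user": username, "session": secrets.token_hex(4)}
    return seal(USER_KEYS[username], tgt)

def kdc_grant_st(tgt, service):
    """Steps 3-4: accept a valid TGT, seal an ST with the service key."""
    st = {"user": tgt["user"], "service": service}
    return seal(SERVICE_KEYS[service], st)

# Steps 5-6: the client presents the ST; the server unseals it locally.
tgt = unseal(USER_KEYS["develop"], kdc_grant_tgt("develop"))
st_box = kdc_grant_st(tgt, "AppServer")
request = unseal(SERVICE_KEYS["AppServer"], st_box)
print(request)  # {'user': 'develop', 'service': 'AppServer'}
```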

Basic Concepts

The following concepts can help users learn the Kerberos architecture quickly and understand the Kerberos service better. The following uses security authentication for HDFS as an example:

TGT

A TGT is generated by the Kerberos service and used to establish a secure session between an application and the Kerberos server. The validity period of a TGT is 24 hours. After 24 hours, the TGT expires automatically.

The following describes how to apply for a TGT (HDFS is used as an example):

1.         You can obtain a TGT through an interface provided by HDFS.

/**
 * Log in to Kerberos to get a TGT, if the cluster is in security mode.
 * @throws IOException if login fails
 */
private void login() throws IOException {
    // not security mode, just return
    if (!"kerberos".equalsIgnoreCase(conf.get("hadoop.security.authentication"))) {
        return;
    }

    // security mode
    System.setProperty("java.security.krb5.conf", PATH_TO_KRB5_CONF);

    UserGroupInformation.setConfiguration(conf);
    UserGroupInformation.loginUserFromKeytab(PRINCIPAL_NAME, PATH_TO_KEYTAB);
}

2.         You can obtain a TGT by running the kinit shell command on the client. For details, see the Shell O&M command description.

ST

An ST is generated by the Kerberos service and used to establish a secure session between an application and an application service. An ST is valid only once.

In FusionInsight products, the generation of an ST is based on the Hadoop-RPC communication. The underlying RPC submits a request to the Kerberos server and the Kerberos server generates an ST.

Authentication Code Example Elaboration

 
package com.huawei.bigdata.hdfs.examples;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosTest {
    private static String PATH_TO_HDFS_SITE_XML = KerberosTest.class.getClassLoader().getResource("hdfs-site.xml")
            .getPath();
    private static String PATH_TO_CORE_SITE_XML = KerberosTest.class.getClassLoader().getResource("core-site.xml")
            .getPath();
    private static String PATH_TO_KEYTAB = KerberosTest.class.getClassLoader().getResource("user.keytab").getPath();
    private static String PATH_TO_KRB5_CONF = KerberosTest.class.getClassLoader().getResource("krb5.conf").getPath();
    private static String PRINCIPAL_NAME = "develop";
    private FileSystem fs;
    private Configuration conf;

    /**
     * Initialize the Configuration and add the configuration files.
     */
    private void initConf() {
        conf = new Configuration();

        // add configuration files
        conf.addResource(new Path(PATH_TO_HDFS_SITE_XML));
        conf.addResource(new Path(PATH_TO_CORE_SITE_XML));
    }

    /**
     * Log in to Kerberos to get a TGT, if the cluster is in security mode.
     * @throws IOException if login fails
     */
    private void login() throws IOException {
        // not security mode, just return
        if (!"kerberos".equalsIgnoreCase(conf.get("hadoop.security.authentication"))) {
            return;
        }

        // security mode
        System.setProperty("java.security.krb5.conf", PATH_TO_KRB5_CONF);

        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab(PRINCIPAL_NAME, PATH_TO_KEYTAB);
    }

    /**
     * Initialize the FileSystem; the first RPC obtains an ST from Kerberos.
     * @throws IOException if the file system cannot be initialized
     */
    private void initFileSystem() throws IOException {
        fs = FileSystem.get(conf);
    }

    /**
     * An example of accessing HDFS.
     * @throws IOException if the file status cannot be read
     */
    private void doSth() throws IOException {
        Path path = new Path("/tmp");
        FileStatus fStatus = fs.getFileStatus(path);
        System.out.println("Status of " + path + " is " + fStatus);
        // other operations
    }

    public static void main(String[] args) throws Exception {
        KerberosTest test = new KerberosTest();
        test.initConf();
        test.login();
        test.initFileSystem();
        test.doSth();
    }
}

note

1.     During Kerberos authentication, you need to configure the parameters required for authentication, including the keytab file path, the Kerberos authentication username, and the client's krb5.conf configuration file.

2.     The login() method invokes the Hadoop interface to perform Kerberos authentication and generate a TGT.

3.     The doSth() method invokes the Hadoop interface to access the file system. The underlying RPC automatically carries the TGT to Kerberos for verification, and an ST is then generated.

4.     The preceding code can be used to create KerberosTest.java in the HDFS secondary development sample project in security mode. Run it and view the commissioning result. For details, see the HDFS Development Guide.

1.1.2.1.2 Preparing the Developer Account

Scenario

A developer account is used to run the sample project. Different user rights must be granted for the development of different service components. (For details about rights configuration, see the development guide of the corresponding service.)

Procedure

                               Step 1      Log in to FusionInsight Manager and choose System > Role Management > Add Role.

1.         Enter a role name, for example, developrole.

2.         Edit the role. Table 1-3 describes rights to be granted for different services.

Table 1-3 Rights list

Service

Rights to Be Granted

HDFS

In Rights, choose HDFS > File System, select Read, Write, and Execute for the hdfs://hacluster/, and click OK.

MapReduce/YARN

1.    In Rights, choose HDFS > File System > hdfs://hacluster/, select Read, Write, and Execute for the user, and click OK.

2.    Edit the role. In Rights, choose Yarn > Scheduler Queue > root, select the default option Submit, and click OK.

HBase

In Rights, choose Hbase > HBase Scope > global. Select the admin, create, read, write, and execute permissions, and click OK.

Spark/Spark2x

1.    In Rights, choose Hbase > HBase Scope > global. Select the default option create and click OK.

2.    In Rights, choose Hbase > HBase Scope > global > hbase. Select execute for hbase:meta and click OK.

3.    Edit the role. In Rights, choose HDFS > File System > hdfs://hacluster/ > user, select Execute for hive, and click OK.

4.    Edit the role. In Rights, choose HDFS > File System > hdfs://hacluster/ > user > hive, select Read, Write, and Execute for warehouse, and click OK.

5.    Edit the role. In Rights, choose Hive > Hive Read Write Privileges, select the default option Create, and click OK.

6.    Edit the role. In Rights, choose Yarn > Scheduler Queue > root, select the default option Submit, and click OK.

Hive

In Rights, choose Yarn > Scheduler Queue > root, select the default options Submit and Admin, and click OK.

NOTE

Extra operation permissions required for Hive application development must be obtained from the system administrator. For details about permission requirements, see the Required Permissions section in the Hive Development Guide.

 

Flink

1.    In the Rights table, choose HDFS > File System > hdfs://hacluster/ > flink, select Read, Write, and Execute, and click Service in the Rights table to return.

2.    In the Rights table, choose Yarn > Scheduler Queue > root, select the default option Submit, and click OK.

Solr

-

Kafka

-

Storm/CQL

-

Redis

In Rights, choose Redis > Redis Access Manage, select Read, Write, and Management, and click OK.

Oozie

1.    In Rights, choose Oozie > Common User Privileges, and click OK.

2.    Edit the role. In Rights, choose HDFS > File System, select Read, Write, and Execute for hdfs://hacluster, and click OK.

3.    Edit the role. In Rights, choose Yarn, select Cluster Admin Operations, and click OK.

Unified SQL (Fiber)

1.    Edit the role. In Rights, choose HDFS > File System > hdfs://hacluster/ > user > hive, select Execute, and click OK.

2.    Edit the role. In Rights, choose HDFS > File System > hdfs://hacluster/ > user > hive > warehouse, select Read, Write, and Execute, and click OK.

3.    Edit the role. In Rights, choose Yarn > Scheduler Queue > root, select the default option Submit, and click OK.

4.    Grant the following permissions if the Phoenix engine is to be used:

a.     In Rights, choose Hbase > HBase Scope > global, select the default options create, read, write, and execute, and click OK.

b.    Edit the role. In Rights, choose Hbase > HBase Scope > global > hbase. Select execute for hbase:meta, and click OK.

5.    Perform the following operations if the Hive and Spark engines are to be used:

a.     Perform 4 first if you need to access HBase data.

b.    Edit the role. In Rights, choose HDFS > File System > hdfs://hacluster/ > tmp > hive-scratch, select Read, Write, and Execute, and click OK.

c.     Edit the role. In Rights, choose Hive > Hive Read Write Privileges, select the default option Create, and click OK.

 

                               Step 2      Choose System > User Group Management > Add Group to create a user group for the sample project, for example, developgroup.

                               Step 3      Choose System > User Management > User > Add User to create a user for the sample project.

                               Step 4      Enter a user name, for example, developuser. Select the corresponding User type and User Group to which the user is to be added according to Table 1-4, bind the role developrole to obtain rights, and click OK.

Table 1-4 User type and user group list

Service

User Type

User Group

HDFS

Machine-Machine

Joining the developgroup and supergroup groups

Set the primary group to supergroup.

MR/Yarn

Machine-Machine

Joining the developgroup group.

HBase

Machine-Machine

Joining the hbase group.

Set the primary group to hbase.

Spark

Machine-Machine

Joining the developgroup group. If the user needs to interconnect with Kafka, also join the Kafka user group.

Hive

Machine-Machine/Human-Machine

Joining the hive group.

Solr

Machine-Machine

Joining the solr group.

Kafka

Machine-Machine

Joining the kafkaadmin group.

Storm/CQL

Human-Machine

Joining the storm group.

Redis

Machine-Machine

Joining the developgroup group.

Oozie

Human-Machine

Joining the hadoop, supergroup, and hive groups

If the multi-instance function is enabled for Hive, the user must belong to a specific Hive instance group, for example, hive3.

Unified SQL (Fiber)

Machine-Machine

Joining the developgroup group.

 

                               Step 5      On the homepage of FusionInsight Manager, choose System > User Management. Select developuser from the user list and click the download icon to download authentication credentials. Save the downloaded package and decompress it to obtain the user.keytab and krb5.conf files. These files are used for security authentication during sample project commissioning. For details, see the corresponding service development guide.

note

If the user type is human-machine, you need to change the initial password before downloading the authentication credential file. Otherwise, "Password has expired - change password to reset" is displayed when you use the authentication credential file, and security authentication fails. For details about how to change the password, see section "Changing an Operation User Password" in the Administrator Guide.

 

----End

1.1.2.1.3 Configuring Client Files

During application development, download the cluster client to the local PC.

                               Step 1      Confirm that the components required by the FusionInsight HD cluster have been installed and are running properly.

                               Step 2      Ensure that the time difference between the client and the FusionInsight HD cluster is less than 5 minutes.

Time of the FusionInsight HD cluster can be viewed in the upper-right corner on the FusionInsight Manager page.

                               Step 3      Download the client to the local PC by following instructions in Software Installation > Initial Configuration > Configuring Client > Installing a Client and decompress the installation package. For example, decompress the package to D:\FusionInsight_Services_ClientConfig. The path cannot contain spaces.

                               Step 4      Go to the directory and double-click the install.bat file.

The project dependency packages are automatically imported to the lib directory, and configuration files are automatically imported to the configuration file directory of each service sample project.

Dependency packages and configuration files are required for running a sample project. Table 1-5 lists the path of the configuration file for each component sample project.

Table 1-5 Paths

Project

Path

CQL

src\main\resource

HBase

conf

HDFS

conf

Hive

conf

Kafka

src\main\resource

MapReduce

conf

Oozie

oozie-example\conf

Redis

src\config

Solr

conf

Storm

src\main\resource

 

                               Step 5      Configure network connections for the client.

Copy all items from the hosts file in the decompression directory to the hosts file on the host where the client is installed. Ensure that the network communication between the local PC and hosts listed in the hosts file in the decompression directory is normal.

note

- If the host where the client is installed is not a node in the cluster, configure network connections for the client to prevent errors when you run commands on the client.

- The local hosts file in a Windows environment is stored in, for example, C:\WINDOWS\system32\drivers\etc\hosts.
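The copy-and-merge step above can be sketched as follows; the merge_hosts helper and the sample entries are hypothetical, not part of the client:

```python
def merge_hosts(local_text, cluster_text):
    """Append cluster hosts entries that are not already present locally."""
    lines = local_text.rstrip("\n").split("\n")
    seen = set(lines)
    for entry in cluster_text.strip().split("\n"):
        if entry and entry not in seen:
            lines.append(entry)
            seen.add(entry)
    return "\n".join(lines) + "\n"

local = "127.0.0.1 localhost\n"
cluster = "192.168.0.11 node-master1\n192.168.0.12 node-core1\n"
print(merge_hosts(local, cluster))
```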

----End

1.1.2.1.4 Handling an Authentication Failure

Symptom

An authentication failure occurs during the commissioning and running of an example project.

Procedure

There are many possible causes of an authentication failure. The following troubleshooting steps are recommended for different scenarios:

                               Step 1      Check whether the network connection between the device where this application runs and the FusionInsight cluster is normal, and check whether the TCP and UDP ports required by Kerberos authentication can be accessed.

                               Step 2      Check whether each configuration file is correctly read and stored in a correct directory.

                               Step 3      Check whether the username and keytab file are obtained as instructed.

                               Step 4      Check whether the configuration information is properly set before initiating an authentication.

                               Step 5      Check whether multiple authentication requests are initiated in the same process. That is, check whether the login() method is invoked repeatedly.

                               Step 6      If the problem persists, contact Huawei engineers for further analysis.

----End

Authentication Failure Example

If "clock skew too great" is displayed, handle the problem using the following method:

                               Step 1      Check the FusionInsight cluster time.

                               Step 2      Check the time of the machine where the development environment is located. The difference between the machine time and the cluster time must be less than 5 minutes.

----End
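The 5-minute rule above can be sketched as a simple check; the function and sample times are illustrative:

```python
from datetime import datetime, timedelta

MAX_SKEW = timedelta(minutes=5)

def clock_skew_ok(client_time, cluster_time):
    """Return True when the absolute time difference is under 5 minutes."""
    return abs(client_time - cluster_time) < MAX_SKEW

base = datetime(2018, 9, 24, 6, 10, 0)
print(clock_skew_ok(base, base + timedelta(minutes=3)))   # True
print(clock_skew_ok(base, base + timedelta(minutes=10)))  # False
```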

If "(Receive time out) can not connect to kdc server" is displayed, handle the problem using the following method:

                               Step 1      Check whether the content of the krb5.conf file is correct, that is, whether it matches the service IP address configuration of KerberosServer in the cluster.

                               Step 2      Check whether the Kerberos service is running properly.

                               Step 3      Check whether the firewall is disabled.

----End
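As an illustrative aid for Step 1 above, the kdc entries can be pulled out of a krb5.conf body and compared with the KerberosServer addresses shown on Manager; the helper, realm, addresses, and port below are hypothetical samples:

```python
import re

def kdc_addresses(krb5_conf_text):
    """Extract the host:port value of every 'kdc = ...' line."""
    return re.findall(r"^\s*kdc\s*=\s*(\S+)", krb5_conf_text, re.MULTILINE)

sample = """\
[realms]
 HADOOP.COM = {
  kdc = 192.168.0.11:21732
  kdc = 192.168.0.12:21732
 }
"""
print(kdc_addresses(sample))  # ['192.168.0.11:21732', '192.168.0.12:21732']
```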

1.1.2.2 Preparing Authentication Mechanism Code

Scenario

In a secure cluster environment, components must perform mutual authentication before communicating with each other to ensure communication security. HBase application development requires both ZooKeeper and Kerberos security authentication. Contact the administrator to create and obtain the jaas.conf file used for ZooKeeper authentication, as well as the keytab file and principal file used for Kerberos authentication. For details about how to use the files, see the description in the example code.

Security authentication uses the code authentication mode. This example project applies to the Oracle Java platform and the IBM Java platform.
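For reference, a jaas.conf file for the ZooKeeper client typically defines a Client login context based on the standard Krb5LoginModule, along the following lines; the keytab path, principal, and realm are placeholders to be replaced with the values obtained from the administrator:

```
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
keyTab="/opt/client/user.keytab"
principal="developuser@HADOOP.COM"
useTicketCache=false
storeKey=true
debug=false;
};
```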

The following code snippet belongs to the TestMain class of the com.huawei.bigdata.hbase.examples package.

- Code authentication

try {
    init();
    login();
} catch (IOException e) {
    LOG.error("Failed to login because ", e);
    return;
}

- Initial configuration

private static void init() throws IOException {
    // Load configuration from the conf directory by default
    conf = HBaseConfiguration.create();
    String userdir = System.getProperty("user.dir") + File.separator + "conf" + File.separator;
    conf.addResource(new Path(userdir + "core-site.xml"));
    conf.addResource(new Path(userdir + "hdfs-site.xml"));
    conf.addResource(new Path(userdir + "hbase-site.xml"));
}

- Secure login

Set userName to the actual username, for example, developuser.

private static void login() throws IOException {
    if (User.isHBaseSecurityEnabled(conf)) {
        String userdir = System.getProperty("user.dir") + File.separator + "conf" + File.separator;
        userName = "developuser";
        userKeytabFile = userdir + "user.keytab";
        krb5File = userdir + "krb5.conf";

        /*
         * If ZooKeeper must be connected, provide the JAAS information for
         * ZooKeeper. This can be done directly:
         * System.setProperty("java.security.auth.login.config", confDirPath + "jaas.conf");
         * but the LoginUtil helper below is preferred. Note: if this process
         * connects to more than one ZooKeeper cluster, this approach may not
         * be suitable; contact us for more help.
         */
        LoginUtil.setJaasConf(ZOOKEEPER_DEFAULT_LOGIN_CONTEXT_NAME, userName, userKeytabFile);
        LoginUtil.setZookeeperServerPrincipal(ZOOKEEPER_SERVER_PRINCIPAL_KEY,
            ZOOKEEPER_DEFAULT_SERVER_PRINCIPAL);
        LoginUtil.login(userName, userKeytabFile, krb5File, conf);
    }
}

1.1.3 Using the FusionInsight Client for Development

Background

The application installation directory of the Hive client contains a Hive development example project. You can import the example project to Eclipse and start to learn it.

Prerequisites

Ensure that the time difference between the local PC and the FusionInsight cluster is shorter than 5 minutes. If the time difference cannot be determined, contact the system administrator. The time of the FusionInsight cluster can be viewed in the upper right corner on FusionInsight Manager.

Procedure for Using Java

                               Step 1      Decompress the FusionInsight client installation package and obtain the example projects jdbc-examples and HCatalog-examples, which are saved in the Hive sub-folder of the FusionInsight_Services_ClientConfig folder.

                               Step 2      Copy the keytab and krb5.conf files obtained in section "Preparing the Developer Account" to the conf directory of the example project.

                               Step 3      Import the example project to the Eclipse development environment.

1.         Choose File > Import > General > Existing Projects into Workspace > Next > Browse.

The Browse dialog box is displayed.

2.         Select the example project folder, and click Finish.

                               Step 4      Set the Eclipse text file encoding format to prevent garbled characters.

1.         On the Eclipse menu bar, choose Window > Preferences.

The Preferences window is displayed.

2.         Choose General > Workspace from the navigation tree. In the Text file encoding area, select Other, set the value to UTF-8, click Apply, and click OK, as shown in Figure 1-2.

Figure 1-2 Setting the Eclipse encoding format


 

----End

Procedure for Using Python

                               Step 1      Check that Python 2.6.6 or later has been installed on a client. The Python version cannot be later than 2.7.0.

The Python version can be viewed by running the python command on the command-line interface (CLI) of the client. The following information indicates that the Python version is 2.6.6.

Python 2.6.6 (r266:84292, Oct 12 2012, 14:23:48)  
[GCC 4.4.6 20120305 (Red Hat 4.4.6-4)] on linux2 
Type "help", "copyright", "credits" or "license" for more information.    

                               Step 2      Check that setuptools 5.0 or later has been installed on a client.

To obtain the software, visit the official website.

Copy the downloaded setuptools package to the client, decompress the package, go to the decompressed directory, and run the python setup.py install command in the CLI of the client.

The following information indicates that setuptools 5.7 is installed successfully.

Finished processing dependencies for setuptools==5.7    
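One pitfall in the "5.0 or later" check is comparing version strings lexically ("10.0" sorts before "5.0" as text). A sketch of a numeric comparison, with hypothetical helper names:

```python
def version_tuple(version):
    """Turn '5.7' into (5, 7) so comparison is numeric, not lexical."""
    return tuple(int(part) for part in version.split("."))

def setuptools_ok(installed, minimum="5.0"):
    return version_tuple(installed) >= version_tuple(minimum)

print(setuptools_ok("5.7"))   # True
print(setuptools_ok("10.0"))  # True, although "10.0" < "5.0" as strings
print(setuptools_ok("4.9"))   # False
```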

                               Step 3      Install Python on the client.

1.         Go to the directory in which the FusionInsight client installation package is decompressed, for example, /opt/FusionInsight_Services_ClientConfig.

2.         Go to the Hive subdirectory.

3.         Go to the python-examples folder.

4.         Run the python setup.py install command in the CLI.

If the following information is displayed, the Python client is installed successfully:

Finished processing dependencies for pyhs2==0.5.0    

After the installation is successful, the following files are generated:

- python-examples/pyCLI_sec.py: Python client example code

- python-examples/pyhs2/haconnection.py: Python client API

Run the hive_python_client script to execute SQL statements, for example, hive_python_client 'show tables'. This function applies only to simple SQL statements and depends on the ZooKeeper client.
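As a sketch, the script can also be driven from Python via subprocess; the wrapper below and the echo stand-in are illustrative, since running the real script requires the installed FusionInsight client:

```python
import subprocess

def run_hive_sql(statement, client="hive_python_client"):
    """Invoke the client script with a single SQL statement; return stdout."""
    result = subprocess.run([client, statement], capture_output=True, text=True)
    return result.stdout

# Stand-in demonstration using echo, since the real script needs the
# installed client and ZooKeeper connectivity:
print(run_hive_sql("show tables", client="echo"))  # prints "show tables" (echo adds a trailing newline)
```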

----End

Procedure for Using ODBC

                               Step 1      Download and install MIT Kerberos. Obtain the software from the official website.

Record the installation path, for example, C:\Program Files\MIT\Kerberos.

                               Step 2      Set the Kerberos configuration file.

Obtain the krb5.conf file of cluster Kerberos from the FusionInsight HD cluster administrator, rename the file as krb5.ini, and copy the file to C:\ProgramData\MIT\Kerberos5.

note

Directory C:\ProgramData is hidden. You can unhide it.

                               Step 3      Set a cache file for Kerberos tickets.

1.         Create a directory for storing the tickets, for example, C:\temp.

2.         Configure the Windows environment variable KRB5CCNAME and set its value to C:\temp\krb5cache.

3.         Restart the system.

                               Step 4      Perform authentication on Windows.

- Log in to the system using a human-machine account.

a.         Obtain the username and password for the new developer account created in Preparing the Developer Account in the FusionInsight HD Product Documentation. The username is in the following format: username@Kerberos domain name.

b.         Open MIT Kerberos and click get Ticket. In the MIT Kerberos: Get Ticket window that is displayed, set Principal to the username, set Password to the password, and click OK.


- Log in to the system using a machine-machine account.

a.         Run the following command to go to the MIT Kerberos installation path and find kinit.exe:

C:\Program Files\MIT\Kerberos\bin

b.         Run the following command:

kinit -k -t /path_to_userkeytab/user.keytab UserName

path_to_userkeytab indicates the directory where the keytab file is stored, user.keytab indicates the keytab file of the user, and UserName indicates the username.
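A small helper can assemble this command as an argument list, which avoids quoting mistakes when the keytab path contains spaces. The helper name and the sample username in the test are hypothetical:

```python
def build_kinit_command(keytab_path, username):
    """Assemble the kinit command from Step 4 as an argument list:
    keytab_path is the full path to user.keytab and username is the
    machine-machine account name."""
    return ["kinit", "-k", "-t", keytab_path, username]
```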

                               Step 5      Install the compiling software Visual Studio 2012.

Record the installation path, for example, C:\Program Files (x86)\Microsoft Visual Studio 11.0. After the installation, add C:\Program Files (x86)\Microsoft Visual Studio 11.0\vc\bin to the OS environment variable Path.

                               Step 6      Extract the example project.

1.         Decompress the FusionInsight client installation package and obtain FusionInsight_Hive_ODBC_Driver.zip from the Hive sub-folder of the FusionInsight_Services_ClientConfig folder.

2.         Decompress FusionInsight_Hive_ODBC_Driver.zip to a specified directory, for example, C:\FusionInsight_Hive_ODBC_Driver.

                               Step 7      Install OpenSSL 1.x.

1.         Visit http://www.activestate.com/activeperl/downloads. Download and install ActivePerl 5.24.0 (64-bit, x64).

2.         On the system running the Windows OS, choose Start > Run, and enter cmd in the Run dialog box. In the displayed CLI, go to the ${ActivePerl installation directory}/eg directory and run the following command to check whether ActivePerl is successfully installed:

perl example.pl

If the following information is displayed, ActivePerl is successfully installed.

Hello from ActivePerl!

3.         Go to the C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin directory and run the vcvars32.bat command to configure the VS2012 environment variables.

note

Run vcvars32.bat and all subsequent operations in the same cmd terminal. Otherwise, errors may occur when the nmake command is used.

4.         Visit https://www.openssl.org/source/old/. Download and decompress openssl-1.x.tar.gz.

The HiveODBC driver uses OpenSSL 1.0 or later.

5.         In the command line interface, go to the ${openssl decompression directory} directory and run the following command:

perl Configure VC-WIN32

6.         In the ${openssl decompression directory} directory, run the following command:

ms\do_ms

7.         In the ${openssl decompression directory} directory, run the following command:

nmake -f ms\ntdll.mak

8.         Open the ${openssl decompression directory}/out32dll directory and copy the libeay32.dll and ssleay32.dll files to the directory in which FusionInsight_Hive_ODBC_Driver.zip is decompressed.

                               Step 8      Install Cyrus SASL 2.1.23.

1.         Visit https://sourceforge.net/projects/saslwindows/files/cyrus-sasl-2.1.23/, and download and decompress cyrus-sasl-2.1.23-static-x86.zip.

2.         Run the decompressed .exe file to install Cyrus SASL.

3.         Go to the C:\CMU\bin directory and copy the libsasl.dll file to the Windows directory in the HiveODBC decompression directory.

                               Step 9      Register the driver.

1.         On the Windows OS, choose Start > Run and enter cmd. Go to the Windows directory created in section "Using the FusionInsight Client for Development", for example, C:\FusionInsight_Hive_ODBC_Driver\Windows.

2.         Run the following command:

odbcreg /i hiveodbc hiveodbc.dll C:\FusionInsight_Hive_ODBC_Driver\Windows

Enter y as prompted.

The command output is as follows:

C:\FusionInsight_Hive_ODBC_Driver\Windows>odbcreg /i hiveodbc hiveodbc.dll C:\FusionInsight_Hive_ODBC_Driver\Windows  
ODBCREG - to register or unregister ODBC driver  
Proceeding to register...  
  driver: hiveodbc  
     dll: hiveodbc.dll  
    path: C:\FusionInsight_Hive_ODBC_Driver\Windows  
Confirm(y/n)  
hiveodbc installed/registered successfully

3.         View the registry.

Check that the hiveodbc driver exists under HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\ODBC\ODBCINST.INI in the registry.

note

The registry path of a 32-bit Windows OS is HKEY_LOCAL_MACHINE\SOFTWARE\ODBC\ODBCINST.INI.

                           Step 10      Configure the authentication information and files required for connecting to ZooKeeper.

                           Step 11      Copy the krb5.conf and user.keytab files obtained in Preparing the Developer Account of the FusionInsight HD Product Documentation to a specific path, for example, C:\FusionInsight_Hive_ODBC_Driver\odbcconf. Then create a file named jaas.conf in that directory. The file content is as follows:

Client {  
com.sun.security.auth.module.Krb5LoginModule required  
useKeyTab=true  
keyTab="C:\\FusionInsight_Hive_ODBC_Driver\\odbcconf\\user.keytab"  
principal="hiveuser@HADOOP.COM"  
useTicketCache=false  
storeKey=true  
debug=true;  
};

keyTab is the absolute path for storing user.keytab, and principal is the Principal account of the user.
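The jaas.conf content can also be generated programmatically. The sketch below is illustrative (render_jaas_conf is a hypothetical helper); note that backslashes in the Windows keytab path must be doubled inside the file, as in the sample above:

```python
def render_jaas_conf(keytab_path, principal):
    """Render the jaas.conf Client section shown above. Backslashes
    in the Windows keytab path are doubled for the config file."""
    escaped = keytab_path.replace("\\", "\\\\")
    return (
        "Client {\n"
        "com.sun.security.auth.module.Krb5LoginModule required\n"
        "useKeyTab=true\n"
        f'keyTab="{escaped}"\n'
        f'principal="{principal}"\n'
        "useTicketCache=false\n"
        "storeKey=true\n"
        "debug=true;\n"
        "};\n"
    )
```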

                           Step 12      Configure the data source.

- Configure the file data source.

a.         Rename template.dsn saved in C:\FusionInsight_Hive_ODBC_Driver\Windows as xxx.dsn, for example, hive.dsn.

b.         The following shows how to configure hive.dsn in non-cluster mode:

[ODBC]  
DRIVER=hiveodbc  
MODE=0  
HOST=127.0.0.1  
PORT=21066  
DATABASE=default  
PRINCIPAL=hive/hadoop.huawei.com@HUAWEI.COM  
FRAMED=0  
NAMESPACE=  
KRB5PATH=  
JAASPATH=

- Configure the machine data source.

a.         Rename template.dsn saved in C:\FusionInsight_Hive_ODBC_Driver\Windows as xxx.ini, for example, hive.ini.

b.         Modify hive.ini and change [ODBC] to [Hive]. The following is the content of hive.ini:

[Hive]  
DRIVER=hiveodbc  
MODE=0  
HOST=127.0.0.1  
PORT=21066  
DATABASE=default  
PRINCIPAL=hive/hadoop.huawei.com@HUAWEI.COM  
FRAMED=0  
NAMESPACE=  
KRB5PATH=  
JAASPATH=

c.         On the Windows OS, choose Start > Run and enter cmd. Go to the Windows directory created in section "Using the FusionInsight Client for Development", for example, C:\FusionInsight_Hive_ODBC_Driver\Windows.

d.         Run the following command:

odbcreg /d hiveodbcds C:\FusionInsight_Hive_ODBC_Driver\Windows\hive.ini

e.         After confirming that the printed attributes are consistent with the content of hive.ini, enter y.

f.          The execution result is as follows:

ODBCREG - to register or unregister ODBC driver  
Attribute to be register are:  
DRIVER = hiveodbc  
DATABASE = default  
HOST = 127.0.0.1  
PORT = 21066  
FRAMED = 0  
MODE = 0  
PRINCIPAL = hive/hadoop.huawei.com@HUAWEI.COM  
NAMESPACE =  
KRB5PATH = 
JAASPATH = 
Confirm(y/n)   
Add datasource .

note

The example code requires the machine data source.
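Both data source files follow the same key/value layout, so they can be generated with a short script. The sketch below is hypothetical (the write_dsn helper and its defaults are illustrative; the values come from the templates above). Python's configparser lowercases keys by default, so optionxform is overridden to keep the uppercase names:

```python
import configparser

def write_dsn(path, host="127.0.0.1", port="21066",
              principal="hive/hadoop.huawei.com@HUAWEI.COM",
              section="ODBC"):
    """Write a DSN/INI file matching the templates above.
    Use section="Hive" for the machine data source variant."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # preserve key case (DRIVER, HOST, ...)
    parser[section] = {
        "DRIVER": "hiveodbc",
        "MODE": "0",
        "HOST": host,
        "PORT": port,
        "DATABASE": "default",
        "PRINCIPAL": principal,
        "FRAMED": "0",
        "NAMESPACE": "",
        "KRB5PATH": "",
        "JAASPATH": "",
    }
    with open(path, "w") as f:
        # no spaces around "=", matching the template format
        parser.write(f, space_around_delimiters=False)
```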

                           Step 13      Install JDK 1.8.

1.         Download the 32-bit JDK 1.8 from the Oracle official website and install it.

2.         Configure the PATH environment variable by adding the jre\bin\client directory of the JDK 1.8 installation directory to PATH.

If the application is executed using the CLI, run the set command for configuration:

set PATH=D:\dev\Java\jdk1.8.0_31_win32\jre\bin\client;%PATH%

                           Step 14      Configure the ODBCCLASSPATH parameter.

If the cluster mode is used, configure the ODBCCLASSPATH parameter and add the JAR file path required for connecting to ZooKeeper to ODBCCLASSPATH. The ODBCCLASSPATH value is as follows:

C:/FusionInsight_Hive_ODBC_Driver/Windows/jars/zookeeper-3.5.1.jar;C:/FusionInsight_Hive_ODBC_Driver/Windows/jars/slf4j-api-1.7.10.jar;C:/FusionInsight_Hive_ODBC_Driver/Windows/jars/log4j-1.2.17.jar;C:/FusionInsight_Hive_ODBC_Driver/Windows/jars/slf4j-log4j12-1.7.5.jar;C:/FusionInsight_Hive_ODBC_Driver/Windows/jars/zk-helper.jar;C:/FusionInsight_Hive_ODBC_Driver/Windows/jars/log4j.properties;C:/FusionInsight_Hive_ODBC_Driver/Windows/jars/commons-logging-1.2.jar

C:/FusionInsight_Hive_ODBC_Driver indicates the installation path of the Hive ODBC driver.
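The long ODBCCLASSPATH value is easier to maintain when built from a list of jar names. A hypothetical sketch, using the forward-slash paths and semicolon separator shown above:

```python
def build_odbc_classpath(driver_root, jars):
    """Join the file names under <driver_root>/Windows/jars into the
    semicolon-separated ODBCCLASSPATH value."""
    base = driver_root.rstrip("/") + "/Windows/jars/"
    return ";".join(base + jar for jar in jars)
```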

----End

1.1.4 Using the Maven Repository for Development

This section uses HBase as an example to describe how to use the Maven repository for secondary development.

Background

The example code of FusionInsight has been uploaded to the Huawei DevCloud website. Users can download the example code from Huawei DevCloud and obtain the corresponding dependency components through the Maven repository.

Procedure

                               Step 1      Log in to Huawei DevCloud (https://codehub-cn-south-1.devcloud.huaweicloud.com/codehub/7076065/home) and download the code under components/flink to the local PC.

                               Step 2      Copy the keytab and krb5.conf files obtained in section Preparing the Developer Account of the FusionInsight HD Product Documentation to the conf directory of the example project.

                               Step 3      Import the example project to the Eclipse development environment.

1.         Choose File > Import > General > Existing Projects into Workspace > Next > Browse.

The Browse dialog box is displayed.

2.         Select the example project folder, and click Finish.

                               Step 4      Configure information about the Maven repository.

1.         On the Eclipse menu bar, choose Window > Preferences.

The Preferences window is displayed.

2.         Choose Maven > User Settings from the navigation tree. On the User Settings page, select the settings file downloaded in Step 1, click Apply, and then click OK, as shown in Figure 1-3.

Figure 1-3 Modifying Maven configurations


 

                               Step 5      Set the Eclipse text file encoding format to prevent garbled characters.

1.         On the Eclipse menu bar, choose Window > Preferences.

The Preferences window is displayed.

2.         Choose General > Workspace from the navigation tree. In the Text file encoding area, select Other, set the value to UTF-8, click Apply, and click OK, as shown in Figure 1-4.

Figure 1-4 Setting the Eclipse encoding format


 

----End

note

For details, see the development procedures using Python and ODBC in this document.

 

