MapReduce: Case 2: MapReduce Accessing Multi-component Sample Programs

Latest reply: Nov 26, 2018 09:58:32

1.1 MapReduce

1.1.1 Case 2: MapReduce Accessing Multi-component Sample Programs

1.1.1.1 Scenario

Applicable Versions

FusionInsight HD V100R002C70, FusionInsight HD V100R002C80

Scenario

The following example illustrates how to compile MapReduce jobs to access multiple service components, including HDFS, HBase, and Hive, and helps users understand key actions such as authentication and configuration loading.

The logic process of the example is as follows:

Use an HDFS text file as input data:

log1.txt: input file

YuanJing,male,10

GuoYijun,male,5

Map:

1.         Obtain one row of the input data and extract the user names.

2.         Query a piece of data of HBase.

3.         Query a piece of data of Hive.

4.         Combine the data queried from HBase and that from Hive as the output of Map.

Reduce:

1.         Obtain the last piece of data from the Map output.

2.         Export the data to HBase.

3.         Save the data to HDFS.
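
The map and reduce logic above can be sketched in plain Java, independent of the Hadoop API. The class and method names below are illustrative only and are not part of the sample project; they simply mirror the name extraction, the value combination, and the keep-the-last-value reduce step.

```java
// Plain-Java sketch of the per-record logic described above. The helpers are
// hypothetical stand-ins; the real sample performs the HBase/Hive lookups
// shown later in this document.
class MultiComponentLogicSketch {

    // Map side: extract the user name from one input row such as "YuanJing,male,10".
    static String extractName(String line) {
        if (line.contains("male")) {
            return line.substring(0, line.indexOf(","));
        }
        return "";
    }

    // Map side: combine the HBase and Hive query results into the map output value.
    static String combine(String hbaseData, String hiveData) {
        return "hbase:" + hbaseData + ", hive:" + hiveData;
    }

    // Reduce side: keep only the last value of the iterable, as the sample does.
    static String lastValue(Iterable<String> values) {
        String last = "";
        for (String v : values) {
            last = v;
        }
        return last;
    }
}
```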

Data Planning

1.         Create an HDFS data file.

a.         Create a text file named data.txt in the Linux-based HDFS and copy the content of log1.txt to data.txt.

b.         Create the /tmp/examples/multi-components/mapreduce/input/ folder in HDFS and upload data.txt to it:

On the HDFS client of the Linux OS, run the hdfs dfs -mkdir -p /tmp/examples/multi-components/mapreduce/input/ command.

On the HDFS client of the Linux OS, run the hdfs dfs -put data.txt /tmp/examples/multi-components/mapreduce/input/ command.

2.         Create an HBase table and insert data.

a.         On the HBase client of the Linux OS, run the hbase shell command.

b.         Create the table1 table in the HBase shell interaction window. The table contains a column family cf. Run the create 'table1', 'cf' command.

c.         Insert a record whose rowkey is 1, column name is cid, and data value is 123. Run the put 'table1', '1', 'cf:cid', '123' command.

d.         Run the quit command to exit.

3.         Create a Hive table and insert data.

a.         On the Hive client in Linux, run the beeline command.

b.         Create the person table in the Hive beeline interaction window. The table contains the following fields: name, gender, and stayTime. Run the CREATE TABLE person (name STRING, gender STRING, stayTime INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' stored as textfile; command.

c.         Load the data file in the Hive beeline interactive window. Run the LOAD DATA INPATH '/tmp/examples/multi-components/mapreduce/input/' OVERWRITE INTO TABLE person; command.

d.         Run the !q command to exit.

4.         Loading data into Hive clears the HDFS data directory. Therefore, step 1 must be performed again.

1.1.1.2 Development Guidelines

The development process consists of three parts:

•   Collect the name information from the HDFS original files, then query and combine HBase and Hive data, using the MultiComponentMapper class inherited from the Mapper abstract class.

•   Obtain the last piece of mapped data and output it to HBase and HDFS, using the MultiComponentReducer class inherited from the Reducer abstract class.

•   Use the main method to create a MapReduce job and submit it to the Hadoop cluster.

1.1.1.3 Obtaining Sample Code

Using the FusionInsight Client

Obtain the sample project mapreduce-example-security from the HDFS directory in the FusionInsight_Services_ClientConfig package extracted from the client.

Using the Maven Project

Log in to Huawei DevCloud (https://codehub-cn-south-1.devcloud.huaweicloud.com/codehub/7076065/home) and download the code under components/mapreduce to the local PC.

1.1.1.4 Sample Code Description

The following code snippets are used as an example. For complete code, see the com.huawei.bigdata.mapreduce.examples.MultiComponentExample class.

Example 1: The MultiComponentMapper class defines the map method of the Mapper abstract class.

private static class MultiComponentMapper extends Mapper<Object, Text, Text, Text> {

    Configuration conf;

    @Override
    protected void map(Object key, Text value, Context context) throws IOException, InterruptedException {
      conf = context.getConfiguration();

      // Configure the jaas and krb5 parameters for components that need to access ZooKeeper.
      // The user does not need to log in again in map(); the authentication information
      // configured in the main method is used.
      String krb5 = "krb5.conf";
      String jaas = "jaas_mr.conf";
      // These files are uploaded from the main method.
      File jaasFile = new File(jaas);
      File krb5File = new File(krb5);
      System.setProperty("java.security.auth.login.config", jaasFile.getCanonicalPath());
      System.setProperty("java.security.krb5.conf", krb5File.getCanonicalPath());
      System.setProperty("zookeeper.sasl.client", "true");
      LOG.info("UGI :" + UserGroupInformation.getCurrentUser());

      String name = "";
      String line = value.toString();
      if (line.contains("male")) {
        name = line.substring(0, line.indexOf(","));
      }

      // 1. Read the HBase data.
      String hbaseData = readHBase();

      // 2. Read the Hive data.
      String hiveData = readHive(name);

      // Map outputs a key-value pair: a character string combining the HBase and Hive data.
      context.write(new Text(name), new Text("hbase:" + hbaseData + ", hive:" + hiveData));
    }

Example 2: Use the readHBase method to read HBase data.

    private String readHBase() {
      String tableName = "table1";
      String columnFamily = "cf";
      String hbaseKey = "1";
      String hbaseValue;
      Configuration hbaseConfig = HBaseConfiguration.create(conf);
      org.apache.hadoop.hbase.client.Connection conn = null;
      try {
        // Establish an HBase connection.
        conn = ConnectionFactory.createConnection(hbaseConfig);
        // Obtain the HBase table.
        Table table = conn.getTable(TableName.valueOf(tableName));
        // Create an HBase Get request instance.
        Get get = new Get(hbaseKey.getBytes());
        // Submit the Get request.
        Result result = table.get(get);
        hbaseValue = Bytes.toString(result.getValue(columnFamily.getBytes(), "cid".getBytes()));
        return hbaseValue;
      } catch (IOException e) {
        LOG.warn("Exception occurred ", e);
      } finally {
        // Close the connection to release resources.
        if (conn != null) {
          try {
            conn.close();
          } catch (IOException e) {
            LOG.warn("Failed to close the HBase connection ", e);
          }
        }
      }
      return "";
    }

Example 3: Use the readHive method to read Hive data.

    private String readHive(String name) throws IOException {
      // Load the configuration file.
      Properties clientInfo = null;
      String userdir = System.getProperty("user.dir") + "/";
      InputStream fileInputStream = null;
      try {
        clientInfo = new Properties();
        String hiveclientProp = userdir + "hiveclient.properties";
        File propertiesFile = new File(hiveclientProp);
        fileInputStream = new FileInputStream(propertiesFile);
        clientInfo.load(fileInputStream);
      } catch (Exception e) {
        throw new IOException(e);
      } finally {
        if (fileInputStream != null) {
          fileInputStream.close();
        }
      }
      String zkQuorum = clientInfo.getProperty("zk.quorum");
      String zooKeeperNamespace = clientInfo.getProperty("zooKeeperNamespace");
      String serviceDiscoveryMode = clientInfo.getProperty("serviceDiscoveryMode");

      // Read this section carefully.
      // The MapReduce task accesses Hive in JDBC mode.
      // Hive encapsulates the SQL query into another MapReduce task and submits it.
      // Therefore, it is not recommended to invoke Hive in MapReduce jobs.
      final String driver = "org.apache.hive.jdbc.HiveDriver";
      String sql = "select name,sum(stayTime) as stayTime from person where name = '" + name + "' group by name";
      StringBuilder sBuilder = new StringBuilder("jdbc:hive2://").append(zkQuorum).append("/");
      // In map or reduce, the Hive connection mode is auth=delegationToken.
      sBuilder
          .append(";serviceDiscoveryMode=")
          .append(serviceDiscoveryMode)
          .append(";zooKeeperNamespace=")
          .append(zooKeeperNamespace)
          .append(";auth=delegationToken;");
      String url = sBuilder.toString();

      Connection connection = null;
      PreparedStatement statement = null;
      ResultSet resultSet = null;
      try {
        Class.forName(driver);
        connection = DriverManager.getConnection(url, "", "");
        statement = connection.prepareStatement(sql);
        resultSet = statement.executeQuery();
        if (resultSet.next()) {
          return resultSet.getString(1);
        }
      } catch (ClassNotFoundException e) {
        LOG.warn("Exception occurred ", e);
      } catch (SQLException e) {
        LOG.warn("Exception occurred ", e);
      } finally {
        // Close the JDBC resources.
        try {
          if (resultSet != null) resultSet.close();
          if (statement != null) statement.close();
          if (connection != null) connection.close();
        } catch (SQLException e) {
          LOG.warn("Failed to close JDBC resources ", e);
        }
      }
      return "";
    }
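
The SQL statement in readHive sums stayTime per name. As a plain-Java illustration of what that query computes over the sample rows, the following hypothetical helper performs the same aggregation; the class and method names are illustrative and are not part of the sample project.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java sketch of the Hive query's result: sum(stayTime) grouped by name.
// Each input row mirrors the person table layout: name, gender, stayTime.
class StayTimeAggregation {

    static Map<String, Integer> sumStayTimeByName(String[] rows) {
        Map<String, Integer> totals = new LinkedHashMap<String, Integer>();
        for (String row : rows) {
            String[] fields = row.split(",");
            String name = fields[0];
            int stayTime = Integer.parseInt(fields[2]);
            Integer current = totals.get(name);
            totals.put(name, (current == null ? 0 : current) + stayTime);
        }
        return totals;
    }
}
```

For the sample input rows, the query therefore returns stayTime 10 for YuanJing and 5 for GuoYijun.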

Example 4: The MultiComponentReducer class defines the reduce method of the Reducer abstract class.

  private static class MultiComponentReducer extends Reducer<Text, Text, Text, Text> {

    Configuration conf;

    public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
      conf = context.getConfiguration();

      // Configure the jaas and krb5 parameters for components that need to access ZooKeeper.
      // The user does not need to log in again in reduce(); the authentication information
      // configured in the main method is used.
      String krb5 = "krb5.conf";
      String jaas = "jaas_mr.conf";
      // These files are uploaded from the main method.
      File jaasFile = new File(jaas);
      File krb5File = new File(krb5);
      System.setProperty("java.security.auth.login.config", jaasFile.getCanonicalPath());
      System.setProperty("java.security.krb5.conf", krb5File.getCanonicalPath());
      System.setProperty("zookeeper.sasl.client", "true");

      // Obtain the last piece of data from the iterator.
      Text finalValue = new Text("");
      for (Text value : values) {
        finalValue = value;
      }

      // Export the result to HBase.
      writeHBase(key.toString(), finalValue.toString());

      // Save the result to HDFS.
      context.write(key, finalValue);
    }

Example 5: Use the writeHBase method to write data to HBase.

    private void writeHBase(String rowKey, String data) {
      String tableName = "table1";
      String columnFamily = "cf";

      try {
        LOG.info("UGI read :" + UserGroupInformation.getCurrentUser());
      } catch (IOException e1) {
        // handle the exception
      }

      Configuration hbaseConfig = HBaseConfiguration.create(conf);
      org.apache.hadoop.hbase.client.Connection conn = null;
      try {
        // Establish an HBase connection.
        conn = ConnectionFactory.createConnection(hbaseConfig);
        // Obtain the HBase table.
        Table table = conn.getTable(TableName.valueOf(tableName));

        // Create an HBase Put request instance.
        List<Put> list = new ArrayList<Put>();
        byte[] row = Bytes.toBytes("row" + rowKey);
        Put put = new Put(row);
        byte[] family = Bytes.toBytes(columnFamily);
        byte[] qualifier = Bytes.toBytes("value");
        byte[] value = Bytes.toBytes(data);
        put.addColumn(family, qualifier, value);
        list.add(put);

        // Execute the Put request.
        table.put(list);
      } catch (IOException e) {
        LOG.warn("Exception occurred ", e);
      } finally {
        // Close the connection to release resources.
        if (conn != null) {
          try {
            conn.close();
          } catch (IOException e) {
            LOG.warn("Failed to close the HBase connection ", e);
          }
        }
      }
    }

Example 6: Use the main() method to create a job, configure dependencies and authentication information, and submit the job to the Hadoop cluster.

    public static void main(String[] args) throws Exception {
      // Load the configuration file.
      Properties clientInfo = null;
      String userdir = System.getProperty("user.dir") + "/";
      InputStream fileInputStream = null;
      try {
        clientInfo = new Properties();
        String hiveclientProp = userdir + "hiveclient.properties";
        File propertiesFile = new File(hiveclientProp);
        fileInputStream = new FileInputStream(propertiesFile);
        clientInfo.load(fileInputStream);
      } catch (Exception e) {
        throw new IOException(e);
      } finally {
        if (fileInputStream != null) {
          fileInputStream.close();
        }
      }
      String zkQuorum = clientInfo.getProperty("zk.quorum");
      String zooKeeperNamespace = clientInfo.getProperty("zooKeeperNamespace");
      String serviceDiscoveryMode = clientInfo.getProperty("serviceDiscoveryMode");
      String principal = clientInfo.getProperty("principal");
      String auth = clientInfo.getProperty("auth");
      String sasl_qop = clientInfo.getProperty("sasl.qop");

      String hbaseKeytab = MultiComponentExample.class.getClassLoader().getResource("user.keytab").getPath();
      String hbaseJaas = MultiComponentExample.class.getClassLoader().getResource("jaas_mr.conf").getPath();
      String hiveClientProperties = MultiComponentExample.class.getClassLoader().getResource("hiveclient.properties").getPath();

      // Combine the file lists, separated by commas (,).
      String files = "file://" + KEYTAB + "," + "file://" + KRB + "," + "file://" + JAAS;
      files = files + "," + "file://" + hbaseKeytab;
      files = files + "," + "file://" + hbaseJaas;
      files = files + "," + "file://" + hiveClientProperties;

      // Files listed in the tmpfiles attribute are uploaded to HDFS when the job is submitted.
      config.set("tmpfiles", files);

      // Clear the directories to be used.
      MultiComponentExample.cleanupBeforeRun();

      // Log in to the security cluster.
      LoginUtil.login(PRINCIPAL, KEYTAB, KRB, config);

      // Search for the Hive dependency JAR packages.
      Class hiveDriverClass = Class.forName("org.apache.hive.jdbc.HiveDriver");
      Class thriftClass = Class.forName("org.apache.thrift.TException");
      Class thriftCLIClass = Class.forName("org.apache.hive.service.cli.thrift.TCLIService");
      Class hiveConfClass = Class.forName("org.apache.hadoop.hive.conf.HiveConf");
      Class hiveTransClass = Class.forName("org.apache.thrift.transport.HiveTSaslServerTransport");
      Class hiveMetaClass = Class.forName("org.apache.hadoop.hive.metastore.api.MetaException");
      Class hiveShimClass = Class.forName("org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge23");

      // Add the Hive runtime dependencies to the job.
      JarFinderUtil.addDependencyJars(config, hiveDriverClass, thriftCLIClass, thriftClass, hiveConfClass, hiveTransClass,
          hiveMetaClass, hiveShimClass);

      // Add the Hive configuration file.
      config.addResource("hive-site.xml");
      // Add the HBase configuration file.
      Configuration conf = HBaseConfiguration.create(config);

      // Instantiate the job.
      Job job = Job.getInstance(conf);
      job.setJarByClass(MultiComponentExample.class);

      // Set the mapper and reducer classes.
      job.setMapperClass(MultiComponentMapper.class);
      job.setReducerClass(MultiComponentReducer.class);

      // Set the input and output paths of the job.
      FileInputFormat.addInputPath(job, new Path(baseDir, INPUT_DIR_NAME + File.separator + "data.txt"));
      FileOutputFormat.setOutputPath(job, new Path(baseDir, OUTPUT_DIR_NAME));

      // Set the output key and value types.
      job.setOutputKeyClass(Text.class);
      job.setOutputValueClass(Text.class);

      // HBase provides a tool class to add the HBase runtime dependencies to the job.
      TableMapReduceUtil.addDependencyJars(job);

      // This operation must be performed in security mode.
      // HBase adds authentication information to the job; the map and reduce tasks use it.
      TableMapReduceUtil.initCredentials(job);

      // Create the Hive authentication information.
      StringBuilder sBuilder = new StringBuilder("jdbc:hive2://").append(zkQuorum).append("/");
      sBuilder.append(";serviceDiscoveryMode=").append(serviceDiscoveryMode)
          .append(";zooKeeperNamespace=")
          .append(zooKeeperNamespace)
          .append(";sasl.qop=")
          .append(sasl_qop)
          .append(";auth=")
          .append(auth)
          .append(";principal=")
          .append(principal)
          .append(";");
      String url = sBuilder.toString();
      Connection connection = DriverManager.getConnection(url, "", "");
      String tokenStr = ((HiveConnection) connection)
          .getDelegationToken(UserGroupInformation.getCurrentUser().getShortUserName(), PRINCIPAL);
      connection.close();
      Token<DelegationTokenIdentifier> hive2Token = new Token<DelegationTokenIdentifier>();
      hive2Token.decodeFromUrlString(tokenStr);

      // Add the Hive authentication information to the job.
      job.getCredentials().addToken(new Text("hive.server2.delegation.token"), hive2Token);
      job.getCredentials().addToken(new Text(HiveAuthFactory.HS2_CLIENT_TOKEN), hive2Token);

      // Submit the job.
      System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

note

Replace zkQuorum in the examples with the actual ZooKeeper cluster node information.
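
For reference, the delegation-token JDBC URL used in map and reduce tasks can be assembled as below. This is a minimal sketch with hypothetical placeholder values (the class name and the sample quorum/namespace strings are illustrative, not from the sample project); in the real code the values come from hiveclient.properties.

```java
// Sketch of the map/reduce-side Hive JDBC URL assembly.
class HiveUrlSketch {

    static String buildUrl(String zkQuorum, String serviceDiscoveryMode, String zooKeeperNamespace) {
        // In map or reduce tasks the connection authenticates with a delegation token.
        return new StringBuilder("jdbc:hive2://").append(zkQuorum).append("/")
            .append(";serviceDiscoveryMode=").append(serviceDiscoveryMode)
            .append(";zooKeeperNamespace=").append(zooKeeperNamespace)
            .append(";auth=delegationToken;")
            .toString();
    }
}
```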

1.1.1.5 Application Commissioning

Compilation and Running Results

1.         In the Eclipse development environment, select the LocalRunner.java project and click the run button to run the corresponding application project.

Alternatively, right-click the project and choose Run as > Java Application from the shortcut menu to run the application project.

note

Do not restart the HDFS service during the running of MapReduce jobs. Otherwise, the jobs may fail.

Viewing Commissioning Results

After the MapReduce application is run, you can view the running status of the MapReduce application in the following ways:

•   Check the running status of the application in Eclipse.

•   Use MapReduce logs to obtain the running status of applications.

•   Log in to the MapReduce WebUI to check the running status of the application.

•   Log in to the Yarn WebUI to check the running status of the application.

note

Contact the administrator to obtain a service account that has the right to access the web UI and its password.

1.1.1.5.1 Commissioning Applications on Windows

1. Compiling and Running Programs

Scenario

You can run applications in the Windows environment after application code development is complete.

note

If the IBM JDK is used on Windows, applications cannot be directly run on Windows.

Procedure

1.         In the Eclipse development environment, select the LocalRunner.java project and click the run button to run the corresponding application project.

Alternatively, right-click the project and choose Run as > Java Application from the shortcut menu to run the application project.

note

Do not restart the HDFS service during the running of MapReduce jobs. Otherwise, the jobs may fail.

2. Viewing the Commissioning Result

Scenario

After the MapReduce application is run, you can view the running status of the MapReduce application in the following ways:

•   Check the running status of the application in Eclipse.

•   Use MapReduce logs to obtain the running status of applications.

•   Log in to the MapReduce WebUI to check the running status of the application.

•   Log in to the Yarn WebUI to check the running status of the application.

note

Contact the administrator to obtain a service account that has the right to access the web UI and its password.

Procedure

•   Viewing running results to learn the application running status

View the output result on the console to learn the application running status as follows:

1848 [main] INFO  org.apache.hadoop.security.UserGroupInformation  - Login successful for user admin@HADOOP.COM using keytab file 

Login success!!!!!!!!!!!!!!

7093 [main] INFO  org.apache.hadoop.hdfs.PeerCache  - SocketCache disabled.

9614 [main] INFO  org.apache.hadoop.hdfs.DFSClient  - Created HDFS_DELEGATION_TOKEN token 45 for admin on ha-hdfs:hacluster

9709 [main] INFO  org.apache.hadoop.mapreduce.security.TokenCache  - Got dt for hdfs://hacluster; Kind: HDFS_DELEGATION_TOKEN,

Service: ha-hdfs:hacluster, Ident: 

(HDFS_DELEGATION_TOKEN token 45 for admin)

10914 [main] INFO  org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider  - Failing over to 53

12136 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat  - Total input files to process : 2

12731 [main] INFO  org.apache.hadoop.mapreduce.JobSubmitter  - number of splits:2

13405 [main] INFO  org.apache.hadoop.mapreduce.JobSubmitter  - Submitting tokens for job: job_1456738266914_0006

13405 [main] INFO  org.apache.hadoop.mapreduce.JobSubmitter  - Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hacluster, 

Ident: (HDFS_DELEGATION_TOKEN token 45 for admin)

16019 [main] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl  - Application submission is not finished, 

submitted application application_1456738266914_0006 is still in NEW

16975 [main] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl  - Submitted application application_1456738266914_0006

17069 [main] INFO  org.apache.hadoop.mapreduce.Job  - The url to track the job: 

https://linux2:26001/proxy/application_1456738266914_0006/

17086 [main] INFO  org.apache.hadoop.mapreduce.Job  - Running job: job_1456738266914_0006

29811 [main] INFO  org.apache.hadoop.mapreduce.Job  - Job job_1456738266914_0006 running in uber mode : false

29811 [main] INFO  org.apache.hadoop.mapreduce.Job  -  map 0% reduce 0%

41492 [main] INFO  org.apache.hadoop.mapreduce.Job  -  map 100% reduce 0%

53161 [main] INFO  org.apache.hadoop.mapreduce.Job  -  map 100% reduce 100%

53265 [main] INFO  org.apache.hadoop.mapreduce.Job  - Job job_1456738266914_0006 completed successfully

53393 [main] INFO  org.apache.hadoop.mapreduce.Job  - Counters: 50

note

The following exception may occur when the sample code is run in the Windows OS, but it does not affect services.

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

•   Viewing the task execution status by using the MapReduce WebUI

Log in to FusionInsight Manager, choose Service Management > MapReduce > JobHistoryServer, and check the task execution status on the web page.

Figure 1-1 JobHistory WebUI


 

•   Viewing the task execution status by using the Yarn WebUI

Log in to FusionInsight Manager, choose Service Management > Yarn > ResourceManager (Master), and check the task execution status on the web page.

Figure 1-2 ResourceManager WebUI


 

•   Viewing MapReduce logs to learn the application running status

View MapReduce logs to learn application running status, and adjust applications based on log information.

1.1.1.5.2 Running Applications on Linux

1. Compiling and Running Programs

Scenario

After the program code is developed, you can run the application in the Linux environment.

Prerequisites

The Yarn client has been installed.

Procedure

Step 1      Export the executable MapReduce application package.

•   For the MapReduce statistics sample program, select the FemaleInfoCollector.java, LoginUtil.java, krb5.conf, and user.keytab files, and choose Export from the shortcut menu.

•   For the MapReduce multi-component access sample project, select the LoginUtil.java, MultiComponentExample.java, and JarFinderUtil.java files, and choose Export from the shortcut menu.

Step 2      Select JAR file, as shown in Figure 1-3. Click Next.

Figure 1-3 Selecting JAR file


 

Step 3      Select a path for exporting the package, as shown in Figure 1-4. Click Finish.

Figure 1-4 Selecting a path for exporting the JAR file


 

Step 4      Upload the generated application package mapreduce-example.jar to a Linux client directory, for example, /srv/client/conf, the same directory as the configuration files.

Step 5      Execute the sample project on Linux.

•   For the MapReduce statistics sample project, run the following command:

yarn jar mapreduce-example.jar com.huawei.bigdata.mapreduce.examples.FemaleInfoCollector <inputPath> <outputPath>

This command is used to set parameters and submit jobs. In the command, <inputPath> indicates the input path of the HDFS file system, and <outputPath> indicates the output path of the HDFS file system.

note

l  Before running the yarn jar mapreduce-example.jar com.huawei.bigdata.mapreduce.examples.FemaleInfoCollector<inputPath> <outputPath> command, upload the log1.txt and log2.txt files to the <inputPath> directory of the HDFS. For details, see the description of typical scenarios.

l  Before running the yarn jar mapreduce-example.jar com.huawei.bigdata.mapreduce.examples.FemaleInfoCollector<inputPath> <outputPath> command, ensure that the <outputPath> directory does not exist. Otherwise, an error is reported.

l  Do not restart the HDFS service during the running of MapReduce jobs. Otherwise, the jobs may fail.

•   For the MapReduce multi-component access sample project, perform the following steps:

a.         Obtain the user.keytab, krb5.conf, hbase-site.xml, hiveclient.properties, and hive-site.xml files, and create a folder in the Linux environment to save the configuration files, for example, /srv/client/conf.

note

Contact the administrator to obtain the user.keytab and krb5.conf files corresponding to the account and permission. Obtain hbase-site.xml from the HBase client, and hiveclient.properties and hive-site.xml from the Hive client.

b.         Create the jaas_mr.conf file in the new folder. The file content is as follows:

Client {  
com.sun.security.auth.module.Krb5LoginModule required  
useKeyTab=true  
keyTab="user.keytab"  
principal="test@HADOOP.COM"  
useTicketCache=false  
storeKey=true  
debug=true;  
};

note

In the preceding file content, test@HADOOP.COM is an example. Change it based on the site requirements.

c.         In the Linux environment, add the classpath required for running the sample project, for example,

export YARN_USER_CLASSPATH=/srv/client/conf/:/srv/client/HBase/hbase/lib/*:/srv/client/Hive/Beeline/lib/*

d.         Submit the MapReduce job. Run the following command to execute the sample project:

yarn jar mapreduce-example.jar com.huawei.bigdata.mapreduce.examples.MultiComponentExample

----End

2. Viewing the Commissioning Result

Scenario

After the MapReduce application is run, you can view the running status of the MapReduce application in the following ways:

•   Check the running status of the program based on the running result.

•   Log in to the MapReduce WebUI to check the running status of the application.

•   Log in to the Yarn WebUI to check the running status of the application.

•   Use MapReduce logs to obtain the running status of applications.

note

Contact the administrator to obtain a service account that has the right to access the web UI and its password.

Procedure

•   Viewing the task execution status by using the MapReduce WebUI

Log in to FusionInsight Manager, choose Service Management > MapReduce > JobHistoryServer, and check the task execution status on the web page.

Figure 1-5 JobHistory WebUI


 

•   Viewing the task execution status by using the Yarn WebUI

Log in to FusionInsight Manager, choose Service Management > Yarn > ResourceManager (Master), and check the task execution status on the web page.

Figure 1-6 ResourceManager WebUI


 

•   Viewing the running result of the MapReduce application

After running the yarn jar mapreduce-example.jar command in the Linux OS, you can view the application running status in the command output. For example:

linux1:/opt # yarn jar mapreduce-example.jar /user/mapred/example/input/ /output6  
16/02/24 15:45:40 INFO security.UserGroupInformation: Login successful for user admin@HADOOP.COM using keytab file user.keytab  
Login success!!!!!!!!!!!!!!  
16/02/24 15:45:40 INFO hdfs.PeerCache: SocketCache disabled.
16/02/24 15:45:41 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 28 for admin on ha-hdfs:hacluster
16/02/24 15:45:41 INFO security.TokenCache: Got dt for hdfs://hacluster; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hacluster, Ident: (HDFS_DELEGATION_TOKEN token 28 for admin)  
16/02/24 15:45:41 INFO input.FileInputFormat: Total input files to process : 2  
16/02/24 15:45:41 INFO mapreduce.JobSubmitter: number of splits:2  
16/02/24 15:45:42 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1455853029114_0027  
16/02/24 15:45:42 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hacluster, Ident: (HDFS_DELEGATION_TOKEN token 28 for admin)  
16/02/24 15:45:42 INFO impl.YarnClientImpl: Submitted application application_1455853029114_0027  
16/02/24 15:45:42 INFO mapreduce.Job: The url to track the job: https://linux1:26001/proxy/application_1455853029114_0027/  
16/02/24 15:45:42 INFO mapreduce.Job: Running job: job_1455853029114_0027  
16/02/24 15:45:50 INFO mapreduce.Job: Job job_1455853029114_0027 running in uber mode : false  
16/02/24 15:45:50 INFO mapreduce.Job:  map 0% reduce 0%  
16/02/24 15:45:56 INFO mapreduce.Job:  map 100% reduce 0%  
16/02/24 15:46:03 INFO mapreduce.Job:  map 100% reduce 100%  
16/02/24 15:46:03 INFO mapreduce.Job: Job job_1455853029114_0027 completed successfully  
16/02/24 15:46:03 INFO mapreduce.Job: Counters: 49

•   Run the yarn application -status <ApplicationID> command in the Linux OS. The command output shows the running status of the application. For example:

linux1:/opt # yarn application -status application_1455853029114_0027  
Application Report : 
        Application-Id : application_1455853029114_0027  
        Application-Name : Collect Female Info  
        Application-Type : MAPREDUCE  
        User : admin  
        Queue : default  
        Start-Time : 1456299942302  
        Finish-Time : 1456299962343  
        Progress : 100%  
        State : FINISHED  
        Final-State : SUCCEEDED  
        Tracking-URL : https://linux1:26014/jobhistory/job/job_1455853029114_0027  
        RPC Port : 27100  
        AM Host : SZV1000044726   
        Aggregate Resource Allocation : 114106 MB-seconds, 42 vcore-seconds  
        Log Aggregation Status : SUCCEEDED  
        Diagnostics : Application finished execution. 
        Application Node Label Expression : <Not set>  
        AM container Node Label Expression : <DEFAULT_PARTITION>

•   Viewing MapReduce logs to learn the application running status

View MapReduce logs to learn application running status, and adjust applications based on log information.

 

