1.1 MapReduce
1.1.1 Case 2: MapReduce Accessing Multi-component Sample Programs
1.1.1.1 Scenario
Applicable Versions
FusionInsight HD V100R002C70, FusionInsight HD V100R002C80
Scenario
The following example shows how to compile a MapReduce job that accesses multiple service components (HDFS, HBase, and Hive), helping users understand key actions such as authentication and configuration loading.
The logic process of the example is as follows:
Use an HDFS text file as input data:
log1.txt: input file
YuanJing,male,10
GuoYijun,male,5
Map:
1. Obtain one row of the input data and extract the user name.
2. Query one piece of data from HBase.
3. Query one piece of data from Hive.
4. Combine the data queried from HBase and from Hive as the output of Map.
Reduce:
1. Obtain the last piece of data from the Map output.
2. Export the data to HBase.
3. Save the data to HDFS.
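The name extraction performed in the Map stage can be sketched in plain Java. This is a minimal, standalone illustration; the complete logic resides in the sample project:

```java
// Minimal sketch of the Map-stage parsing: extract the user name
// from an input line such as "YuanJing,male,10".
public class NameExtractor {
    static String extractName(String line) {
        // Only lines containing "male" are processed, as in the sample.
        if (line.contains("male")) {
            return line.substring(0, line.indexOf(','));
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println(extractName("YuanJing,male,10")); // YuanJing
    }
}
```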
Data Planning
1. Create an HDFS data file.
a. Create a text file named data.txt in the Linux-based HDFS and copy the content of log1.txt to data.txt.
b. Run the following commands to create the /tmp/examples/multi-components/mapreduce/input/ folder in HDFS, and upload data.txt to it.
2. On the HDFS client of the Linux OS, run the hdfs dfs -mkdir -p /tmp/examples/multi-components/mapreduce/input/ command.
3. On the HDFS client of the Linux OS, run the hdfs dfs -put data.txt /tmp/examples/multi-components/mapreduce/input/ command.
4. Create an HBase table and insert data.
a. On the HBase client of the Linux OS, run the hbase shell command.
b. Create the table1 table in the HBase shell interaction window. The table contains a column family cf. Run the create 'table1', 'cf' command.
c. Insert a record whose rowkey is 1, column name is cid, and data value is 123. Run the put 'table1','1','cf:cid','123' command.
d. Run the quit command to exit.
5. Create a Hive table and insert data.
a. On the Hive client in Linux, run the beeline command.
b. Create the person table in the Hive beeline interaction window. The table contains the following fields: name, gender, and stayTime. Run the CREATE TABLE person (name STRING, gender STRING, stayTime INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' stored as textfile; command.
c. Load the data file in the Hive beeline interactive window. Run the LOAD DATA INPATH '/tmp/examples/multi-components/mapreduce/input/' OVERWRITE INTO TABLE person; command.
d. Run the !q command to exit.
6. Loading data into Hive clears the HDFS data directory. Therefore, the scenario in section 1.2.3.1 needs to be performed again.
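The person table created above is later queried with select name, sum(stayTime) as stayTime from person ... group by name. As a plain-Java illustration of what that aggregation computes over the log1.txt content (a standalone sketch with hypothetical helper names, not part of the sample project):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StayTimeAggregator {
    // Sums the third field (stayTime) per name, mirroring the Hive query
    // "select name, sum(stayTime) from person group by name".
    static Map<String, Integer> aggregate(String[] lines) {
        Map<String, Integer> sums = new LinkedHashMap<>();
        for (String line : lines) {
            String[] fields = line.split(",");
            String name = fields[0];
            int stayTime = Integer.parseInt(fields[2]);
            sums.merge(name, stayTime, Integer::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        String[] data = {"YuanJing,male,10", "GuoYijun,male,5"};
        System.out.println(aggregate(data)); // {YuanJing=10, GuoYijun=5}
    }
}
```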
1.1.1.2 Development Guidelines
The development process consists of three parts:
l Collect the name information from the original HDFS files, then query and combine the HBase and Hive data using the MultiComponentMapper class, which inherits from the Mapper abstract class.
l Obtain the last piece of mapped data and output it to HBase and HDFS using the MultiComponentReducer class, which inherits from the Reducer abstract class.
l Use the main method to create a MapReduce job and submit it to the Hadoop cluster.
1.1.1.3 Obtaining Sample Code
Using the FusionInsight Client
Obtain the sample project mapreduce-example-security in the HDFS directory in the FusionInsight_Services_ClientConfig file extracted from the client.
Using the Maven Project
Log in to Huawei DevCloud (https://codehub-cn-south-1.devcloud.huaweicloud.com/codehub/7076065/home) and download the code under components/mapreduce to the local PC.
1.1.1.4 Sample Code Description
The following code snippets are used as an example. For complete code, see the com.huawei.bigdata.mapreduce.examples.MultiComponentExample class.
Example 1: The MultiComponentMapper class defines the map method of the Mapper abstract class.
private static class MultiComponentMapper extends Mapper<Object, Text, Text, Text> {
    Configuration conf;

    @Override
    protected void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        conf = context.getConfiguration();

        // Configure the jaas and krb5 parameters for components that need to access ZooKeeper.
        // The user does not need to log in again in Map; the authentication information
        // configured in the main method is used.
        String krb5 = "krb5.conf";
        String jaas = "jaas_mr.conf";
        // These files are uploaded from the main method.
        File jaasFile = new File(jaas);
        File krb5File = new File(krb5);
        System.setProperty("java.security.auth.login.config", jaasFile.getCanonicalPath());
        System.setProperty("java.security.krb5.conf", krb5File.getCanonicalPath());
        System.setProperty("zookeeper.sasl.client", "true");
        LOG.info("UGI :" + UserGroupInformation.getCurrentUser());

        String name = "";
        String line = value.toString();
        if (line.contains("male")) {
            name = line.substring(0, line.indexOf(","));
        }

        // 1. Read the HBase data.
        String hbaseData = readHBase();
        // 2. Read the Hive data.
        String hiveData = readHive(name);

        // Map outputs a key-value pair, whose value is a character string combining the HBase and Hive data.
        context.write(new Text(name), new Text("hbase:" + hbaseData + ", hive:" + hiveData));
    }
Example 2: Use the readHBase method to read HBase data.
private String readHBase() {
    String tableName = "table1";
    String columnFamily = "cf";
    String hbaseKey = "1";
    String hbaseValue;
    Configuration hbaseConfig = HBaseConfiguration.create(conf);
    org.apache.hadoop.hbase.client.Connection conn = null;
    Table table = null;
    try {
        // Establish an HBase connection.
        conn = ConnectionFactory.createConnection(hbaseConfig);
        // Obtain the HBase table.
        table = conn.getTable(TableName.valueOf(tableName));
        // Create an HBase Get request instance.
        Get get = new Get(hbaseKey.getBytes());
        // Submit the Get request.
        Result result = table.get(get);
        hbaseValue = Bytes.toString(result.getValue(columnFamily.getBytes(), "cid".getBytes()));
        return hbaseValue;
    } catch (IOException e) {
        LOG.warn("Exception occurred ", e);
    } finally {
        // Release the table and the connection.
        if (table != null) {
            try { table.close(); } catch (IOException e) { LOG.warn("Failed to close table ", e); }
        }
        if (conn != null) {
            try { conn.close(); } catch (IOException e) { LOG.warn("Failed to close connection ", e); }
        }
    }
    return "";
}
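The HBase Connection and Table obtained in readHBase must be released when no longer needed. In plain Java, the try-with-resources statement handles this automatically for any AutoCloseable; a minimal standalone sketch of the pattern (using a stand-in class, since the HBase client classes are not available outside the cluster):

```java
public class ResourceExample {
    // Stand-in for an HBase Connection or Table; both implement AutoCloseable
    // in the real client API, so the same pattern applies there.
    static class TrackedResource implements AutoCloseable {
        boolean closed = false;
        @Override public void close() { closed = true; }
    }

    static boolean useAndClose() {
        TrackedResource r = new TrackedResource();
        try (TrackedResource resource = r) {
            // Work with the resource here; it is closed automatically on exit,
            // even if an exception is thrown.
        }
        return r.closed;
    }

    public static void main(String[] args) {
        System.out.println(useAndClose()); // true
    }
}
```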
Example 3: Use the readHive method to read Hive data.
private String readHive(String name) throws IOException {
    // Load the configuration file.
    Properties clientInfo = null;
    String userdir = System.getProperty("user.dir") + "/";
    InputStream fileInputStream = null;
    try {
        clientInfo = new Properties();
        String hiveclientProp = userdir + "hiveclient.properties";
        File propertiesFile = new File(hiveclientProp);
        fileInputStream = new FileInputStream(propertiesFile);
        clientInfo.load(fileInputStream);
    } catch (Exception e) {
        throw new IOException(e);
    } finally {
        if (fileInputStream != null) {
            fileInputStream.close();
        }
    }
    String zkQuorum = clientInfo.getProperty("zk.quorum");
    String zooKeeperNamespace = clientInfo.getProperty("zooKeeperNamespace");
    String serviceDiscoveryMode = clientInfo.getProperty("serviceDiscoveryMode");

    // Read this section carefully.
    // The MapReduce task accesses Hive in JDBC mode.
    // Hive encapsulates the SQL query into another MapReduce task and submits it.
    // Therefore, invoking Hive in MapReduce jobs is not recommended.
    final String driver = "org.apache.hive.jdbc.HiveDriver";
    String sql = "select name,sum(stayTime) as stayTime from person where name = '" + name + "' group by name";
    // In map or reduce, the Hive connection mode is auth=delegationToken.
    StringBuilder sBuilder = new StringBuilder("jdbc:hive2://").append(zkQuorum).append("/")
        .append(";serviceDiscoveryMode=").append(serviceDiscoveryMode)
        .append(";zooKeeperNamespace=").append(zooKeeperNamespace)
        .append(";auth=delegationToken;");
    String url = sBuilder.toString();

    Connection connection = null;
    PreparedStatement statement = null;
    ResultSet resultSet = null;
    try {
        Class.forName(driver);
        connection = DriverManager.getConnection(url, "", "");
        statement = connection.prepareStatement(sql);
        resultSet = statement.executeQuery();
        if (resultSet.next()) {
            return resultSet.getString(1);
        }
    } catch (ClassNotFoundException e) {
        LOG.warn("Exception occurred ", e);
    } catch (SQLException e) {
        LOG.warn("Exception occurred ", e);
    } finally {
        // Release the JDBC resources.
        if (resultSet != null) {
            try { resultSet.close(); } catch (SQLException e) { LOG.warn("Failed to close resultSet ", e); }
        }
        if (statement != null) {
            try { statement.close(); } catch (SQLException e) { LOG.warn("Failed to close statement ", e); }
        }
        if (connection != null) {
            try { connection.close(); } catch (SQLException e) { LOG.warn("Failed to close connection ", e); }
        }
    }
    return "";
}
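The delegation-token JDBC URL assembled in readHive can be checked in isolation. Below is a minimal standalone sketch with hypothetical ZooKeeper values (replace them with the actual cluster information):

```java
public class HiveUrlBuilder {
    // Builds the Hive JDBC URL used inside map/reduce tasks
    // (auth=delegationToken), following the readHive snippet.
    static String buildUrl(String zkQuorum, String serviceDiscoveryMode, String zooKeeperNamespace) {
        return new StringBuilder("jdbc:hive2://").append(zkQuorum).append("/")
            .append(";serviceDiscoveryMode=").append(serviceDiscoveryMode)
            .append(";zooKeeperNamespace=").append(zooKeeperNamespace)
            .append(";auth=delegationToken;")
            .toString();
    }

    public static void main(String[] args) {
        // "zk1:24002" is a hypothetical quorum entry.
        System.out.println(buildUrl("zk1:24002", "zooKeeper", "hiveserver2"));
    }
}
```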
Example 4: The MultiComponentReducer class defines the reduce method of the Reducer abstract class.
private static class MultiComponentReducer extends Reducer<Text, Text, Text, Text> {
    Configuration conf;

    public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        conf = context.getConfiguration();

        // Configure the jaas and krb5 parameters for components that need to access ZooKeeper.
        // The user does not need to log in again in Reduce; the authentication information
        // configured in the main method is used.
        String krb5 = "krb5.conf";
        String jaas = "jaas_mr.conf";
        // These files are uploaded from the main method.
        File jaasFile = new File(jaas);
        File krb5File = new File(krb5);
        System.setProperty("java.security.auth.login.config", jaasFile.getCanonicalPath());
        System.setProperty("java.security.krb5.conf", krb5File.getCanonicalPath());
        System.setProperty("zookeeper.sasl.client", "true");

        // Obtain the last piece of data from the Map output.
        Text finalValue = new Text("");
        for (Text value : values) {
            finalValue = value;
        }

        // Output the result to HBase.
        writeHBase(key.toString(), finalValue.toString());
        // Save the result to HDFS.
        context.write(key, finalValue);
    }
Example 5: Use the writeHBase method to write data to HBase.
private void writeHBase(String rowKey, String data) {
    String tableName = "table1";
    String columnFamily = "cf";
    try {
        LOG.info("UGI read :" + UserGroupInformation.getCurrentUser());
    } catch (IOException e1) {
        // Handle the exception.
    }
    Configuration hbaseConfig = HBaseConfiguration.create(conf);
    org.apache.hadoop.hbase.client.Connection conn = null;
    Table table = null;
    try {
        // Establish an HBase connection.
        conn = ConnectionFactory.createConnection(hbaseConfig);
        // Obtain the HBase table.
        table = conn.getTable(TableName.valueOf(tableName));
        // Create an HBase Put request instance.
        List<Put> list = new ArrayList<Put>();
        byte[] row = Bytes.toBytes("row" + rowKey);
        Put put = new Put(row);
        byte[] family = Bytes.toBytes(columnFamily);
        byte[] qualifier = Bytes.toBytes("value");
        byte[] value = Bytes.toBytes(data);
        put.addColumn(family, qualifier, value);
        list.add(put);
        // Execute the Put request.
        table.put(list);
    } catch (IOException e) {
        LOG.warn("Exception occurred ", e);
    } finally {
        // Release the table and the connection.
        if (table != null) {
            try { table.close(); } catch (IOException e) { LOG.warn("Failed to close table ", e); }
        }
        if (conn != null) {
            try { conn.close(); } catch (IOException e) { LOG.warn("Failed to close connection ", e); }
        }
    }
}
Example 6: Use the main() method to create a job, configure dependencies and authentication information, and submit the job to the Hadoop cluster.
public static void main(String[] args) throws Exception {
    // Load the configuration file.
    Properties clientInfo = null;
    String userdir = System.getProperty("user.dir") + "/";
    InputStream fileInputStream = null;
    try {
        clientInfo = new Properties();
        String hiveclientProp = userdir + "hiveclient.properties";
        File propertiesFile = new File(hiveclientProp);
        fileInputStream = new FileInputStream(propertiesFile);
        clientInfo.load(fileInputStream);
    } catch (Exception e) {
        throw new IOException(e);
    } finally {
        if (fileInputStream != null) {
            fileInputStream.close();
        }
    }
    String zkQuorum = clientInfo.getProperty("zk.quorum");
    String zooKeeperNamespace = clientInfo.getProperty("zooKeeperNamespace");
    String serviceDiscoveryMode = clientInfo.getProperty("serviceDiscoveryMode");
    String principal = clientInfo.getProperty("principal");
    String auth = clientInfo.getProperty("auth");
    String sasl_qop = clientInfo.getProperty("sasl.qop");

    String hbaseKeytab = MultiComponentExample.class.getClassLoader().getResource("user.keytab").getPath();
    String hbaseJaas = MultiComponentExample.class.getClassLoader().getResource("jaas_mr.conf").getPath();
    String hiveClientProperties = MultiComponentExample.class.getClassLoader().getResource("hiveclient.properties").getPath();

    // Combine the file lists, separated by commas (,).
    String files = "file://" + KEYTAB + "," + "file://" + KRB + "," + "file://" + JAAS;
    files = files + "," + "file://" + hbaseKeytab;
    files = files + "," + "file://" + hbaseJaas;
    files = files + "," + "file://" + hiveClientProperties;

    // Files listed in the tmpfiles attribute are uploaded to HDFS when the job is submitted.
    config.set("tmpfiles", files);

    // Clear the directories created by a previous run.
    MultiComponentExample.cleanupBeforeRun();

    // Log in to the security cluster.
    LoginUtil.login(PRINCIPAL, KEYTAB, KRB, config);

    // Search for the Hive dependency JAR packages.
    Class hiveDriverClass = Class.forName("org.apache.hive.jdbc.HiveDriver");
    Class thriftClass = Class.forName("org.apache.thrift.TException");
    Class thriftCLIClass = Class.forName("org.apache.hive.service.cli.thrift.TCLIService");
    Class hiveConfClass = Class.forName("org.apache.hadoop.hive.conf.HiveConf");
    Class hiveTransClass = Class.forName("org.apache.thrift.transport.HiveTSaslServerTransport");
    Class hiveMetaClass = Class.forName("org.apache.hadoop.hive.metastore.api.MetaException");
    Class hiveShimClass = Class.forName("org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge23");

    // Add the Hive running dependencies to the job.
    JarFinderUtil.addDependencyJars(config, hiveDriverClass, thriftCLIClass, thriftClass, hiveConfClass, hiveTransClass,
        hiveMetaClass, hiveShimClass);

    // Add the Hive configuration file.
    config.addResource("hive-site.xml");
    // Add the HBase configuration file.
    Configuration conf = HBaseConfiguration.create(config);

    // Instantiate the job.
    Job job = Job.getInstance(conf);
    job.setJarByClass(MultiComponentExample.class);

    // Set the mapper and reducer classes.
    job.setMapperClass(MultiComponentMapper.class);
    job.setReducerClass(MultiComponentReducer.class);

    // Set the input and output paths of the job.
    FileInputFormat.addInputPath(job, new Path(baseDir, INPUT_DIR_NAME + File.separator + "data.txt"));
    FileOutputFormat.setOutputPath(job, new Path(baseDir, OUTPUT_DIR_NAME));

    // Set the output key and value types.
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);

    // HBase provides a tool class to add the HBase running dependencies to the job.
    TableMapReduceUtil.addDependencyJars(job);

    // This operation must be performed in security mode.
    // HBase adds authentication information to the job; the map or reduce tasks use this information.
    TableMapReduceUtil.initCredentials(job);

    // Create the Hive authentication information.
    StringBuilder sBuilder = new StringBuilder("jdbc:hive2://").append(zkQuorum).append("/")
        .append(";serviceDiscoveryMode=").append(serviceDiscoveryMode)
        .append(";zooKeeperNamespace=").append(zooKeeperNamespace)
        .append(";sasl.qop=").append(sasl_qop)
        .append(";auth=").append(auth)
        .append(";principal=").append(principal)
        .append(";");
    String url = sBuilder.toString();
    Connection connection = DriverManager.getConnection(url, "", "");
    String tokenStr = ((HiveConnection) connection)
        .getDelegationToken(UserGroupInformation.getCurrentUser().getShortUserName(), PRINCIPAL);
    connection.close();
    Token<DelegationTokenIdentifier> hive2Token = new Token<DelegationTokenIdentifier>();
    hive2Token.decodeFromUrlString(tokenStr);
    // Add the Hive authentication information to the job.
    job.getCredentials().addToken(new Text("hive.server2.delegation.token"), hive2Token);
    job.getCredentials().addToken(new Text(HiveAuthFactory.HS2_CLIENT_TOKEN), hive2Token);

    // Submit the job.
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
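The tmpfiles value assembled in main() is a comma-separated list of file:// URIs. A standalone sketch of the same assembly with hypothetical paths (StringJoiner is equivalent to the string concatenation used in the sample):

```java
import java.util.StringJoiner;

public class TmpFilesList {
    // Joins local files into the comma-separated "tmpfiles" value,
    // prefixing each path with the file:// scheme as main() does.
    static String buildTmpFiles(String... paths) {
        StringJoiner joiner = new StringJoiner(",");
        for (String path : paths) {
            joiner.add("file://" + path);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Hypothetical paths; the sample uses user.keytab, krb5.conf, jaas_mr.conf, etc.
        System.out.println(buildTmpFiles("/srv/client/conf/user.keytab", "/srv/client/conf/krb5.conf"));
    }
}
```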
Replace all zkQuorum objects in the examples with the actual ZooKeeper cluster node information.
1.1.1.5 Application Commissioning
Compilation and Running Results
1. In the Eclipse development environment, select the LocalRunner.java project and click the run button to run the corresponding application project.
Alternatively, right-click the project and choose Run as > Java Application from the shortcut menu to run the application project.
Do not restart the HDFS service during the running of MapReduce jobs. Otherwise, the jobs may fail.
Viewing Commissioning Results
After the MapReduce application is run, you can view the running status of the MapReduce application in the following ways:
l Check the running status of the application in Eclipse.
l Use MapReduce logs to obtain the running status of applications.
l Log in to the MapReduce WebUI to check the running status of the application.
l Log in to the Yarn WebUI to check the running status of the application.
Contact the administrator to obtain a service account that has the right to access the web UI and its password.
1.1.1.5.1 Commissioning Applications on Windows
1. Compiling and Running Programs
Scenario
You can run applications in the Windows environment after application code development is complete.
If the IBM JDK is used on Windows, applications cannot be directly run on Windows.
Procedure
1. In the Eclipse development environment, select the LocalRunner.java project and click the run button to run the corresponding application project.
Alternatively, right-click the project and choose Run as > Java Application from the shortcut menu to run the application project.
Do not restart the HDFS service during the running of MapReduce jobs. Otherwise, the jobs may fail.
2. Viewing the Commissioning Result
Scenario
After the MapReduce application is run, you can view the running status of the MapReduce application in the following ways:
l Check the running status of the application in Eclipse.
l Use MapReduce logs to obtain the running status of applications.
l Log in to the MapReduce WebUI to check the running status of the application.
l Log in to the Yarn WebUI to check the running status of the application.
Contact the administrator to obtain a service account that has the right to access the web UI and its password.
Procedure
l Viewing the running result to learn the application running status
View the output on the console to learn the application running status, as follows:
1848 [main] INFO org.apache.hadoop.security.UserGroupInformation - Login successful for user admin@HADOOP.COM using keytab file
Login success!!!!!!!!!!!!!!
7093 [main] INFO org.apache.hadoop.hdfs.PeerCache - SocketCache disabled.
9614 [main] INFO org.apache.hadoop.hdfs.DFSClient - Created HDFS_DELEGATION_TOKEN token 45 for admin on ha-hdfs:hacluster
9709 [main] INFO org.apache.hadoop.mapreduce.security.TokenCache - Got dt for hdfs://hacluster; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hacluster, Ident: (HDFS_DELEGATION_TOKEN token 45 for admin)
10914 [main] INFO org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to 53
12136 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input files to process : 2
12731 [main] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:2
13405 [main] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1456738266914_0006
13405 [main] INFO org.apache.hadoop.mapreduce.JobSubmitter - Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hacluster,
Ident: (HDFS_DELEGATION_TOKEN token 45 for admin)
16019 [main] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Application submission is not finished, submitted application application_1456738266914_0006 is still in NEW
16975 [main] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1456738266914_0006
17069 [main] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: https://linux2:26001/proxy/application_1456738266914_0006/
17086 [main] INFO org.apache.hadoop.mapreduce.Job - Running job: job_1456738266914_0006
29811 [main] INFO org.apache.hadoop.mapreduce.Job - Job job_1456738266914_0006 running in uber mode : false
29811 [main] INFO org.apache.hadoop.mapreduce.Job - map 0% reduce 0%
41492 [main] INFO org.apache.hadoop.mapreduce.Job - map 100% reduce 0%
53161 [main] INFO org.apache.hadoop.mapreduce.Job - map 100% reduce 100%
53265 [main] INFO org.apache.hadoop.mapreduce.Job - Job job_1456738266914_0006 completed successfully
53393 [main] INFO org.apache.hadoop.mapreduce.Job - Counters: 50
The following exception may occur when the sample code is running in the Windows OS, but it will not affect services.
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
l Viewing the task execution status by using the MapReduce WebUI
Log in to FusionInsight Manager, choose Service Management > MapReduce > JobHistoryServer, and check the task execution status on the web page.
Figure 1-1 JobHistory WebUI
l Viewing the task execution status by using the Yarn WebUI
Log in to FusionInsight Manager, choose Service Management > Yarn > ResourceManager (Master), and check the task execution status on the web page.
Figure 1-2 ResourceManager WebUI
l Viewing MapReduce logs to learn the application running status
View the MapReduce logs to learn the application running status, and adjust the application based on the log information.
1.1.1.5.2 Running Applications on Linux
1. Compiling and Running Programs
Scenario
After the program code is developed, you can run the application in the Linux environment.
Prerequisites
The Yarn client has been installed.
Procedure
Step 1 Export the executable MapReduce application package.
l For the MapReduce statistics sample program, select the FemaleInfoCollector.java, LoginUtil.java, krb5.conf, and user.keytab files, and choose Export from the shortcut menu.
l For the MapReduce multi-component accessing sample project, select the LoginUtil.java, MultiComponentExample.java, and JarFinderUtil.java files, and choose Export from the shortcut menu.
Step 2 Select JAR file, as shown in Figure 1-3. Click Next.
Figure 1-3 Selecting JAR file
Step 3 Select a path for exporting the package, as shown in Figure 1-4. Click Finish.
Figure 1-4 Selecting a path for exporting the JAR file
Step 4 Upload the generated application package mapreduce-example.jar to the Linux client, for example, to /srv/client/conf, the same directory as the configuration files.
Step 5 Execute the sample project on Linux.
l For the MapReduce statistics sample project, run the following command:
yarn jar mapreduce-example.jar com.huawei.bigdata.mapreduce.examples.FemaleInfoCollector <inputPath> <outputPath>
This command is used to set parameters and submit jobs. In the command, <inputPath> indicates the input path of the HDFS file system, and <outputPath> indicates the output path of the HDFS file system.
l Before running the yarn jar mapreduce-example.jar com.huawei.bigdata.mapreduce.examples.FemaleInfoCollector <inputPath> <outputPath> command, upload the log1.txt and log2.txt files to the <inputPath> directory of HDFS. For details, see the description of typical scenarios.
l Before running the yarn jar mapreduce-example.jar com.huawei.bigdata.mapreduce.examples.FemaleInfoCollector <inputPath> <outputPath> command, ensure that the <outputPath> directory does not exist. Otherwise, an error is reported.
l Do not restart the HDFS service during the running of MapReduce jobs. Otherwise, the jobs may fail.
l For the MapReduce multi-component accessing sample project, perform the following steps:
a. Obtain the user.keytab, krb5.conf, hbase-site.xml, hiveclient.properties, and hive-site.xml files, and create a folder in the Linux environment to save the configuration files, for example, /srv/client/conf.
Contact the administrator to obtain the user.keytab and krb5.conf files corresponding to the account and permission. Obtain hbase-site.xml from the HBase client, and hiveclient.properties and hive-site.xml from the Hive client.
b. Create the jaas_mr.conf file in the new folder. The file content is as follows:
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
keyTab="user.keytab"
principal="test@HADOOP.COM"
useTicketCache=false
storeKey=true
debug=true;
};
In the preceding file content, test@HADOOP.COM is an example. Change it based on the site requirements.
c. In the Linux environment, add the classpath required for running the sample project, for example:
export YARN_USER_CLASSPATH=/srv/client/conf/:/srv/client/HBase/hbase/lib/*:/srv/client/Hive/Beeline/lib/*
d. Run the following command to submit the MapReduce job and run the sample project:
yarn jar mapreduce-example.jar com.huawei.bigdata.mapreduce.examples.MultiComponentExample
----End
2. Viewing the Commissioning Result
Scenario
After the MapReduce application is run, you can view the running status of the MapReduce application in the following ways:
l Check the running status of the program based on the running result.
l Log in to the MapReduce WebUI to check the running status of the application.
l Log in to the Yarn WebUI to check the running status of the application.
l Use MapReduce logs to obtain the running status of applications.
Contact the administrator to obtain a service account that has the right to access the web UI and its password.
Procedure
l Viewing the task execution status by using the MapReduce WebUI
Log in to FusionInsight Manager, choose Service Management > MapReduce > JobHistoryServer, and check the task execution status on the web page.
Figure 1-5 JobHistory WebUI
l Viewing the task execution status by using the Yarn WebUI
Log in to FusionInsight Manager, choose Service Management > Yarn > ResourceManager (Master), and check the task execution status on the web page.
Figure 1-6 ResourceManager WebUI
l Viewing the running result of the MapReduce application
− After running the yarn jar mapreduce-example.jar command in the Linux OS, you can view the application running status on the console. For example:
linux1:/opt # yarn jar mapreduce-example.jar /user/mapred/example/input/ /output6
16/02/24 15:45:40 INFO security.UserGroupInformation: Login successful for user admin@HADOOP.COM using keytab file user.keytab
Login success!!!!!!!!!!!!!!
16/02/24 15:45:40 INFO hdfs.PeerCache: SocketCache disabled.
16/02/24 15:45:41 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 28 for admin on ha-hdfs:hacluster
16/02/24 15:45:41 INFO security.TokenCache: Got dt for hdfs://hacluster; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hacluster, Ident: (HDFS_DELEGATION_TOKEN token 28 for admin)
16/02/24 15:45:41 INFO input.FileInputFormat: Total input files to process : 2
16/02/24 15:45:41 INFO mapreduce.JobSubmitter: number of splits:2
16/02/24 15:45:42 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1455853029114_0027
16/02/24 15:45:42 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hacluster, Ident: (HDFS_DELEGATION_TOKEN token 28 for admin)
16/02/24 15:45:42 INFO impl.YarnClientImpl: Submitted application application_1455853029114_0027
16/02/24 15:45:42 INFO mapreduce.Job: The url to track the job: https://linux1:26001/proxy/application_1455853029114_0027/
16/02/24 15:45:42 INFO mapreduce.Job: Running job: job_1455853029114_0027
16/02/24 15:45:50 INFO mapreduce.Job: Job job_1455853029114_0027 running in uber mode : false
16/02/24 15:45:50 INFO mapreduce.Job: map 0% reduce 0%
16/02/24 15:45:56 INFO mapreduce.Job: map 100% reduce 0%
16/02/24 15:46:03 INFO mapreduce.Job: map 100% reduce 100%
16/02/24 15:46:03 INFO mapreduce.Job: Job job_1455853029114_0027 completed successfully
16/02/24 15:46:03 INFO mapreduce.Job: Counters: 49
l Run the yarn application -status <ApplicationID> command in the Linux OS. The execution result shows the application running status. For example:
linux1:/opt # yarn application -status application_1455853029114_0027
Application Report :
    Application-Id : application_1455853029114_0027
    Application-Name : Collect Female Info
    Application-Type : MAPREDUCE
    User : admin
    Queue : default
    Start-Time : 1456299942302
    Finish-Time : 1456299962343
    Progress : 100%
    State : FINISHED
    Final-State : SUCCEEDED
    Tracking-URL : https://linux1:26014/jobhistory/job/job_1455853029114_0027
    RPC Port : 27100
    AM Host : SZV1000044726
    Aggregate Resource Allocation : 114106 MB-seconds, 42 vcore-seconds
    Log Aggregation Status : SUCCEEDED
    Diagnostics : Application finished execution.
    Application Node Label Expression : <Not set>
    AM container Node Label Expression : <DEFAULT_PARTITION>
l Viewing MapReduce logs to learn application running status
View MapReduce logs to learn application running status, and adjust applications based on log information.