Hello, everyone!
In this post, I will walk you through a simple Java program that appends content to a file in HDFS; I hope you find it useful.
I will be using Maven as the build tool.
First, we need to add Maven dependencies in the pom.xml.
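As a reference point, here is a minimal dependency entry for the Hadoop client libraries. The version shown is only an example; it should match the Hadoop distribution installed on your cluster.

<!-- Hadoop client libraries; use the version that matches your cluster -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.3</version>
</dependency>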
Now, we need to import the following classes:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.io.*;
We will use the org.apache.hadoop.conf.Configuration class to set the file system configuration so that it matches the configuration of the Hadoop cluster you have installed.
Let's now start with configuring the file system:
public FileSystem configureFileSystem(String coreSitePath, String hdfsSitePath) {
    FileSystem fileSystem = null;
    Configuration conf = new Configuration();
    conf.setBoolean("dfs.support.append", true);
    Path coreSite = new Path(coreSitePath);
    Path hdfsSite = new Path(hdfsSitePath);
    conf.addResource(coreSite);
    conf.addResource(hdfsSite);
    try {
        fileSystem = FileSystem.get(conf);
    } catch (IOException ex) {
        System.out.println("Error occurred while configuring FileSystem");
    }
    return fileSystem;
}
Make sure that the property dfs.support.append in hdfs-site.xml is set to true.
You can either set it manually by editing the hdfs-site.xml file or programmatically using:
conf.setBoolean("dfs.support.append", true);
Now that the file system is configured, we can access the files stored in HDFS.
Let's start with appending to a file in HDFS.
public String appendToFile(FileSystem fileSystem, String content, String dest) throws IOException {
    Path destPath = new Path(dest);
    if (!fileSystem.exists(destPath)) {
        System.err.println("File doesn't exist");
        return "Failure";
    }
    Boolean isAppendable = Boolean.valueOf(fileSystem.getConf().get("dfs.support.append"));
    if (isAppendable) {
        FSDataOutputStream fs_append = fileSystem.append(destPath);
        PrintWriter writer = new PrintWriter(fs_append);
        writer.append(content);
        writer.flush();
        fs_append.hflush();
        writer.close();
        fs_append.close();
        return "Success";
    } else {
        System.err.println("Please set the dfs.support.append property to true");
        return "Failure";
    }
}
To see whether the data has been correctly written to HDFS, let's write a method to read from HDFS and return the content as a String.
public String readFromHdfs(FileSystem fileSystem, String hdfsFilePath) {
    Path hdfsPath = new Path(hdfsFilePath);
    StringBuilder fileContent = new StringBuilder("");
    try {
        BufferedReader bfr = new BufferedReader(new InputStreamReader(fileSystem.open(hdfsPath)));
        String str;
        while ((str = bfr.readLine()) != null) {
            fileContent.append(str).append("\n");
        }
    } catch (IOException ex) {
        System.out.println("----------Could not read from HDFS---------\n");
    }
    return fileContent.toString();
}
Now that we have successfully written to and read from a file in HDFS, it's time to close the file system.
public void closeFileSystem(FileSystem fileSystem) {
    try {
        fileSystem.close();
    } catch (IOException ex) {
        System.out.println("----------Could not close the FileSystem----------");
    }
}
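To tie the pieces together, here is a minimal sketch of a driver. It assumes the methods above are members of a class I'm calling HdfsAppendExample, and the configuration and HDFS file paths are placeholders you will need to adjust for your own setup.

public class HdfsAppendExample {

    // configureFileSystem, appendToFile, readFromHdfs, and closeFileSystem
    // from the snippets above are assumed to be members of this class.

    public static void main(String[] args) throws IOException {
        HdfsAppendExample example = new HdfsAppendExample();

        // Placeholder paths: point these at your cluster's actual config files
        FileSystem fileSystem = example.configureFileSystem(
                "/usr/local/hadoop/etc/hadoop/core-site.xml",
                "/usr/local/hadoop/etc/hadoop/hdfs-site.xml");

        // Append a line to an existing HDFS file (placeholder path)
        String result = example.appendToFile(fileSystem, "Hello, HDFS!\n", "/user/hduser/test.txt");
        System.out.println("Append result: " + result);

        // Read the file back to verify the append worked
        System.out.println(example.readFromHdfs(fileSystem, "/user/hduser/test.txt"));

        example.closeFileSystem(fileSystem);
    }
}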
Before executing the code, you should have Hadoop running on your system.
You just need to go to HADOOP_HOME and run the following command:
./sbin/start-all.sh
For the complete program, refer to my GitHub repository.
Happy coding!
Original link:
https://dzone.com/articles/simple-java-program-to-append-to-a-file-in-hdfs