How to set up Maven and read, write, upload, check, list, delete & download files in HDFS using Java.
Let's follow the code below.
Maven Dependencies
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>${hadoop.version}</version>
</dependency>
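The ${hadoop.version} property is expected to be defined elsewhere in the same pom.xml. A minimal sketch, assuming a Hadoop 3.x release (the examples below use port 9820, the Hadoop 3 default NameNode RPC port); set it to whatever version your cluster actually runs:

<properties>
    <!-- Assumed version; match the Hadoop release of your cluster -->
    <hadoop.version>3.1.2</hadoop.version>
</properties>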
Write / Upload File
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://localhost:9820"); // Same as -> etc/hadoop/core-site.xml (fs.defaultFS is the non-deprecated key in Hadoop 2+)
FileSystem fs = FileSystem.get(conf);

FileInputStream fis = new FileInputStream("D:/TestFile.txt"); // Local path
FSDataOutputStream fsdos = fs.create(new Path("/home/TestFile.txt")); // HDFS path

byte[] buffer = new byte[1024];
int bytesRead = 0;
while ((bytesRead = fis.read(buffer)) > 0) {
    fsdos.write(buffer, 0, bytesRead);
}

fis.close();
fsdos.close();
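If you do not need to copy the bytes yourself, FileSystem also offers copyFromLocalFile, which performs the same upload in a single call. A minimal sketch, reusing the same local and HDFS paths (adjust them to your environment):

Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://localhost:9820");
FileSystem fs = FileSystem.get(conf);
fs.copyFromLocalFile(new Path("D:/TestFile.txt"), new Path("/home/TestFile.txt")); // Local file -> HDFS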
Create Directory / Folder
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://localhost:9820"); // Same as -> etc/hadoop/core-site.xml
FileSystem fs = FileSystem.get(conf);
fs.mkdirs(new Path("/home/myfolder"), new FsPermission("1777")); // Chmod code
Read File
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://localhost:9820"); // Same as -> etc/hadoop/core-site.xml
FileSystem fs = FileSystem.get(conf);

// Full HDFS URI; a plain path like /home/TestFile.txt also works since fs.default.name is set
FSDataInputStream fsdis = fs.open(new Path("hdfs://localhost:9820/home/TestFile.txt"));
OutputStream os = System.out; // Print the file contents to the console

byte[] buffer = new byte[1024];
int bytesRead = 0;
while ((bytesRead = fsdis.read(buffer)) > 0) {
    os.write(buffer, 0, bytesRead);
}

fsdis.close();
os.close();
List of Files / Directories
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://localhost:9820"); // Same as -> etc/hadoop/core-site.xml
FileSystem fs = FileSystem.get(conf);

FileStatus[] fileStatus = fs.listStatus(new Path("/"));
for (FileStatus status : fileStatus) {
    System.out.println(">> " + status.getPath().toString());
}
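listStatus returns both files and directories. If you need to tell them apart, FileStatus exposes isDirectory(); a small sketch reusing the fileStatus array from above:

for (FileStatus status : fileStatus) {
    String type = status.isDirectory() ? "DIR " : "FILE"; // isDirectory() distinguishes folders from files
    System.out.println(type + " >> " + status.getPath());
}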
Set File Permission (chmod)
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://localhost:9820"); // Same as -> etc/hadoop/core-site.xml
FileSystem fs = FileSystem.get(conf);
fs.setPermission(new Path("/home/TestFile.txt"), new FsPermission("1777")); // Chmod code
Check File Exists
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://localhost:9820"); // Same as -> etc/hadoop/core-site.xml
FileSystem fs = FileSystem.get(conf);
boolean exists = fs.exists(new Path("/home/TestFile.txt"));
Delete File
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://localhost:9820"); // Same as -> etc/hadoop/core-site.xml
FileSystem fs = FileSystem.get(conf);
fs.delete(new Path("/home/TestFile.txt"), true); // true = delete recursively (needed for non-empty directories)
Download File via HttpServletResponse
public void downloadFromServlet(String remoteFile, HttpServletResponse servletResponse) throws IOException {
    Configuration conf = new Configuration();
    conf.set("fs.default.name", "hdfs://localhost:9820"); // Same as -> etc/hadoop/core-site.xml
    FileSystem fs = FileSystem.get(conf);

    FSDataInputStream fsdis = fs.open(new Path(remoteFile));
    OutputStream os = servletResponse.getOutputStream();

    byte[] buffer = new byte[1024];
    int bytesRead = 0;
    while ((bytesRead = fsdis.read(buffer)) > 0) {
        os.write(buffer, 0, bytesRead);
    }

    fsdis.close();
    os.close();
}
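If the target is the local file system rather than a servlet response, FileSystem also provides copyToLocalFile. A minimal sketch, assuming the HDFS file from the earlier examples and a hypothetical local destination D:/Downloaded.txt:

Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://localhost:9820");
FileSystem fs = FileSystem.get(conf);
fs.copyToLocalFile(new Path("/home/TestFile.txt"), new Path("D:/Downloaded.txt")); // HDFS -> local disk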
References:
- CreativeData.Atlassian.net - Java - Read & Write files with HDFS
- JavaTips.net - Apache Hadoop File Permission (FsPermission)