Chapter 3 HDFS Client Operations

3.1 Preparing the HDFS Client Environment

  1. Create a Maven project named teaching in IDEA
  2. Add the required dependencies
<dependencies>
	<dependency>
		<groupId>junit</groupId>
		<artifactId>junit</artifactId>
		<version>RELEASE</version>
	</dependency>
	<dependency>
		<groupId>org.apache.logging.log4j</groupId>
		<artifactId>log4j-core</artifactId>
		<version>2.8.2</version>
	</dependency>
	<dependency>
		<groupId>org.apache.hadoop</groupId>
		<artifactId>hadoop-common</artifactId>
		<version>2.7.2</version>
	</dependency>
	<dependency>
		<groupId>org.apache.hadoop</groupId>
		<artifactId>hadoop-client</artifactId>
		<version>2.7.2</version>
	</dependency>
	<dependency>
		<groupId>org.apache.hadoop</groupId>
		<artifactId>hadoop-hdfs</artifactId>
		<version>2.7.2</version>
	</dependency>
</dependencies>
  3. Create the package com.test.hdfs
  4. Create the HDFSClient class
package com.test.hdfs;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

public class HDFSClient {
    public static void main(String[] args){
        // 1 Create the file system configuration
        Configuration configuration = new Configuration();

        FileSystem fs = null;
        try {
            fs = FileSystem.get(new URI("hdfs://10.13.11.22:8020"), configuration,"hdfs");

            // 2 Create a directory
            fs.mkdirs(new Path("/teaching/hdfs2"));
        } catch (IOException | InterruptedException | URISyntaxException e) {
            e.printStackTrace();
        } finally {
            if(fs != null){
                // 3 Release resources
                try {
                    fs.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
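  5. Run the program
    The client needs a user identity to operate on HDFS; here it is the user hdfs, passed as the third argument to FileSystem.get.
  6. Note: if no logs are printed and the console shows only
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

create a new file named "log4j.properties" in the project's src/main/resources directory with the following content: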

log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] - %m%n
log4j.appender.logfile=org.apache.log4j.FileAppender
log4j.appender.logfile.File=target/spring.log
log4j.appender.logfile.layout=org.apache.log4j.PatternLayout
log4j.appender.logfile.layout.ConversionPattern=%d %p [%c] - %m%n

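3.2 HDFS File Operations

3.2.1. HDFS File Upload (Testing Parameter Priority)

  1. Write the source code
/**
 * Upload a local file to the HDFS file system
 * @param filePath local file path
 * @param dstPath destination path in HDFS
 */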
public static void upFile(String filePath, String dstPath){
    Configuration configuration = new Configuration();

    FileSystem fileSystem = null;

    try{
        // Set the replication factor
        configuration.set("dfs.replication","2");
        fileSystem = FileSystem.get(new URI("hdfs://10.13.11.22:8020"), configuration, "hdfs");

        fileSystem.copyFromLocalFile(new Path(filePath),new Path(dstPath));
    } catch(Exception e){
        e.printStackTrace();
    } finally {
        if(fileSystem != null){
            try {
                fileSystem.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
  2. Copy hdfs-site.xml into the project's resources directory (src/main/resources) so that it is on the classpath
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
	<property>
		<name>dfs.replication</name>
        <value>1</value>
	</property>
</configuration>
  3. Parameter priority
    Values set in client code > user-defined configuration files on the classpath > the server's default configuration (replication factor 3)
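
The priority rule can be pictured as three stacked layers, where a lookup falls through to the next layer only when the current one does not define the property. The following is a minimal sketch that models this with java.util.Properties; it does not use Hadoop's Configuration class, and the class and method names (ReplicationPriorityDemo, resolve) are made up for illustration.

```java
import java.util.Properties;

/** Models the three configuration layers with java.util.Properties (not Hadoop's Configuration). */
final class ReplicationPriorityDemo {

    static final String KEY = "dfs.replication";

    /**
     * Resolves dfs.replication according to the priority rule.
     * A null argument means "this layer does not set the property".
     */
    static String resolve(String valueSetInCode, String valueInClasspathFile) {
        // Lowest priority: the default configuration (replication factor 3).
        Properties defaults = new Properties();
        defaults.setProperty(KEY, "3");

        // Middle priority: hdfs-site.xml found on the classpath.
        Properties classpathFile = new Properties(defaults);
        if (valueInClasspathFile != null) {
            classpathFile.setProperty(KEY, valueInClasspathFile);
        }

        // Highest priority: configuration.set(...) in client code.
        Properties clientCode = new Properties(classpathFile);
        if (valueSetInCode != null) {
            clientCode.setProperty(KEY, valueSetInCode);
        }

        // getProperty falls back through the chain of defaults.
        return clientCode.getProperty(KEY);
    }

    public static void main(String[] args) {
        System.out.println(resolve("2", "1"));   // code and file both set: prints 2
        System.out.println(resolve(null, "1"));  // only the file sets it: prints 1
        System.out.println(resolve(null, null)); // nothing set: prints 3
    }
}
```

With the upload code and the hdfs-site.xml shown in this section, the first case applies, so the uploaded file should end up with 2 replicas.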

3.2.2. HDFS File Download

/**
 * Download an HDFS file to the local machine
 * @param hdfsFile path of the file in HDFS
 * @param dst local path
 */
public static void downloadFile(String hdfsFile, String dst){
    Configuration configuration = new Configuration();
    FileSystem fs = null;

    try {
        fs = FileSystem.get(new URI("hdfs://10.13.11.22:8020"), configuration, "hdfs");
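        // boolean delSrc: whether to delete the source file; default: false
        // Path src: path of the file to download
        // Path dst: local path to download the file to
        // boolean useRawLocalFileSystem: whether to write through RawLocalFileSystem; true skips the local checksum so no .crc file is produced, false (the default) also writes a .crc file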
        fs.copyToLocalFile(false,new Path(hdfsFile), new Path(dst),true);
    } catch (IOException | InterruptedException | URISyntaxException e) {
        e.printStackTrace();
    } finally {
        if(fs != null){
            try {
                fs.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
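
To give a feel for what a checksum file protects against, here is a rough sketch that records one CRC32 value per fixed-size chunk and later uses those values to locate a corrupted chunk. The 512-byte chunk size, the choice of CRC32, and all names (ChunkChecksumDemo, chunkChecksums, firstCorruptChunk) are assumptions made for illustration; the sketch does not reproduce the actual layout of Hadoop's .crc files.

```java
import java.util.zip.CRC32;

final class ChunkChecksumDemo {

    /** Returns one CRC32 value per chunk of chunkSize bytes (the last chunk may be shorter). */
    static long[] chunkChecksums(byte[] data, int chunkSize) {
        int chunks = (data.length + chunkSize - 1) / chunkSize;
        long[] sums = new long[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            int offset = i * chunkSize;
            int length = Math.min(chunkSize, data.length - offset);
            crc.reset();
            crc.update(data, offset, length);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    /** Returns the index of the first chunk whose checksum differs, or -1 if all match. */
    static int firstCorruptChunk(byte[] data, int chunkSize, long[] expected) {
        long[] actual = chunkChecksums(data, chunkSize);
        for (int i = 0; i < actual.length; i++) {
            if (i >= expected.length || actual[i] != expected[i]) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1300];                 // three chunks at 512 bytes per chunk
        long[] recorded = chunkChecksums(data, 512);  // checksums taken while the data is intact
        data[600] ^= 0x01;                            // simulate a flipped bit inside chunk 1
        System.out.println(firstCorruptChunk(data, 512, recorded)); // prints 1
    }
}
```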

3.2.3. HDFS Directory Deletion

public static void delete(String filePath){
    Configuration configuration = new Configuration();
    FileSystem fs = null;

    try {
        fs = FileSystem.get(new URI("hdfs://10.13.11.22:8020"), configuration, "hdfs");
        // Path f: the path to delete
        // boolean recursive: must be true to delete a non-empty directory; for a file, either true or false works
        fs.delete(new Path(filePath),true);
    } catch (IOException | InterruptedException | URISyntaxException e) {
        e.printStackTrace();
    } finally {
        if(fs != null){
            try {
                fs.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}

3.2.4. HDFS File Renaming

public static void rename(String srcPath, String dstPath) {
    Configuration configuration = new Configuration();
    FileSystem fs = null;

    try {
        fs = FileSystem.get(new URI("hdfs://10.13.11.22:8020"), configuration, "hdfs");
        fs.rename(new Path(srcPath), new Path(dstPath));
    } catch (IOException | InterruptedException | URISyntaxException e) {
        e.printStackTrace();
    } finally {
        if(fs != null){
            try {
                fs.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}

3.2.5. Viewing HDFS File Details

View the file name, permissions, length, and block information

public static void listFiles(String path){
    Configuration configuration = new Configuration();
    FileSystem fs = null;

    try {
        fs = FileSystem.get(new URI("hdfs://10.13.11.22:8020"), configuration, "hdfs");
        RemoteIterator<LocatedFileStatus> files = fs.listFiles(new Path(path),true);
        while(files.hasNext()){
            LocatedFileStatus locatedFileStatus = files.next();
            System.out.println(locatedFileStatus.getPath().getName());
            System.out.println(locatedFileStatus.getLen());
            System.out.println(locatedFileStatus.getPermission());
            System.out.println(locatedFileStatus.getGroup());
            BlockLocation[] blockLocations = locatedFileStatus.getBlockLocations();
            for(BlockLocation blockLocation: blockLocations){
                String[] hosts = blockLocation.getHosts();
                for(String host:hosts){
                    System.out.println(host);
                }
            }
        }
    } catch (IOException | InterruptedException | URISyntaxException e) {
        e.printStackTrace();
    } finally {
        if(fs != null){
            try {
                fs.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
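
Each BlockLocation describes one block of the file; besides getHosts(), it also exposes getOffset() and getLength(). As a hedged sketch of how a file length maps onto blocks, the following computes the offset and length of every block, assuming the 128 MB block size used elsewhere in this chapter. The class name BlockLayoutDemo and the 200 MB file length are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class BlockLayoutDemo {

    /** Block size assumed in this chapter: 128 MB. */
    static final long BLOCK_SIZE = 128L * 1024 * 1024;

    /** Returns {offset, length} pairs, one per block, for a file of the given length. */
    static List<long[]> blockRanges(long fileLength, long blockSize) {
        List<long[]> ranges = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += blockSize) {
            // Every block is full-sized except possibly the last one.
            ranges.add(new long[] {offset, Math.min(blockSize, fileLength - offset)});
        }
        return ranges;
    }

    public static void main(String[] args) {
        long fileLength = 200L * 1024 * 1024; // hypothetical 200 MB file
        for (long[] range : blockRanges(fileLength, BLOCK_SIZE)) {
            System.out.println("offset=" + range[0] + " length=" + range[1]);
        }
        // prints:
        // offset=0 length=134217728
        // offset=134217728 length=75497472
    }
}
```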

3.2.6. Distinguishing HDFS Files from Directories

public static void fileStatus(String path) {
    Configuration configuration = new Configuration();
    FileSystem fs = null;

    try {
        fs = FileSystem.get(new URI("hdfs://10.13.11.22:8020"), configuration, "hdfs");
        FileStatus[] fileStatus = fs.listStatus(new Path(path));
        for (FileStatus fileStatus1 : fileStatus) {
            if (fileStatus1.isFile()) {
                System.out.println(fileStatus1.getPath().getName() + " is file!");
            } else {
                System.out.println(fileStatus1.getPath().getName() + " is directory!");
            }
        }
    } catch (IOException | InterruptedException | URISyntaxException e) {
        e.printStackTrace();
    } finally {
        if (fs != null) {
            try {
                fs.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}

3.3 HDFS I/O Stream Operations

3.3.1. HDFS File Upload

public static void putFileToHDFS(String localFile, String hdfsFile){
    Configuration configuration = new Configuration();
    FileSystem fs = null;
    FileInputStream fis = null;
    FSDataOutputStream fos = null;

    try {
        fs = FileSystem.get(new URI("hdfs://10.13.11.22:8020"),configuration,"hdfs");
        fis = new FileInputStream(new File(localFile));
        fos = fs.create(new Path(hdfsFile));
        IOUtils.copyBytes(fis,fos,configuration);
    } catch (IOException | InterruptedException | URISyntaxException e) {
        e.printStackTrace();
    } finally {
        // Close the streams first, then the file system; closeStream ignores null
        IOUtils.closeStream(fos);
        IOUtils.closeStream(fis);
        if(fs != null){
            try {
                fs.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}

3.3.2. HDFS File Download

  1. Requirement: download a file from HDFS to the local machine.
  2. Write the code:
public static void getFileFromHDFS(String hdfsFile, String localFile){
    Configuration configuration = new Configuration();
    FileSystem fs = null;
    FSDataInputStream fis = null;
    FileOutputStream fos = null;

    try {
        fs = FileSystem.get(new URI("hdfs://10.13.11.22:8020"),configuration, "hdfs");
        fis = fs.open(new Path(hdfsFile));
        fos = new FileOutputStream(localFile);
        IOUtils.copyBytes(fis,fos,configuration);
    } catch (IOException | InterruptedException | URISyntaxException e) {
        e.printStackTrace();
    } finally {
        // Close the streams first, then the file system; closeStream ignores null
        IOUtils.closeStream(fos);
        IOUtils.closeStream(fis);
        if(fs != null){
            try {
                fs.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
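
Both stream-based methods rely on IOUtils.copyBytes to move the data. For readers who want to see the copy step itself, the following is a minimal sketch of an equivalent loop written against plain java.io streams. It is not Hadoop's implementation; the class name StreamCopyDemo and the 4096-byte buffer are illustrative choices, and in-memory streams stand in for the HDFS and local file streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopyDemo {

    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        // read() may return fewer bytes than the buffer holds; -1 signals end of stream.
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // write only the bytes that were actually read
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000];
        // try-with-resources closes both streams even if the copy fails.
        try (InputStream in = new ByteArrayInputStream(data);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            long copied = copy(in, out, 4096);
            System.out.println(copied + " bytes copied, " + out.size() + " bytes written");
        }
    }
}
```

The important detail is writing exactly n bytes per iteration rather than the whole buffer, since a read can return a partially filled buffer at any point, not only at the end of the stream.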

3.3.3. Reading a File from a Given Position

  1. Download the first block of the file
public static void readFileSeek(String hdfsFile, String localFile){
    Configuration configuration = new Configuration();
    FileSystem fs = null;
    FSDataInputStream fis = null;
    FileOutputStream fos = null;

    try {
        fs = FileSystem.get(new URI("hdfs://10.13.11.22:8020"),configuration, "hdfs");
        fis = fs.open(new Path(hdfsFile));
        fos = new FileOutputStream(localFile);
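        byte[] buf = new byte[1024];

        // Copy at most 128 MB (the first block), writing only the bytes actually read
        long remaining = 1024L * 1024 * 128;
        while (remaining > 0) {
            int n = fis.read(buf, 0, (int) Math.min(buf.length, remaining));
            if (n < 0) {
                break; // the file is shorter than one block
            }
            fos.write(buf, 0, n);
            remaining -= n;
        }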
    } catch (IOException | InterruptedException | URISyntaxException e) {
        e.printStackTrace();
    } finally {
        // Close the streams first, then the file system; closeStream ignores null
        IOUtils.closeStream(fos);
        IOUtils.closeStream(fis);
        if(fs != null){
            try {
                fs.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
  2. Download the second block of the file
public static void readFileSeek2(String hdfsFile, String localFile){
    Configuration configuration = new Configuration();
    FileSystem fs = null;
    FSDataInputStream fis = null;
    FileOutputStream fos = null;

    try {
        fs = FileSystem.get(new URI("hdfs://10.13.11.22:8020"),configuration, "hdfs");
        fis = fs.open(new Path(hdfsFile));
        // Seek to offset 128 MB, the start of the second block
        fis.seek(1024*1024*128);
        fos = new FileOutputStream(localFile);
        IOUtils.copyBytes(fis,fos,configuration);
    } catch (IOException | InterruptedException | URISyntaxException e) {
        e.printStackTrace();
    } finally {
        // Close the streams first, then the file system; closeStream ignores null
        IOUtils.closeStream(fos);
        IOUtils.closeStream(fis);
        if(fs != null){
            try {
                fs.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
  3. Merge the files

In a Windows command prompt, run the following command, which appends part2 to the end of part1:

type hadoop-2.7.2.tar.gz.part2 >> hadoop-2.7.2.tar.gz.part1
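
The same split-and-merge round trip can be tried locally without a cluster. The sketch below uses only java.io and java.nio (no Hadoop classes; Path here is java.nio.file.Path): it reads the first blockSize bytes of a file, seeks to blockSize and copies the rest, appends the second part to the first, and checks that the result matches the original. The class name SplitMergeDemo, the helper names, and the small block size are all hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

final class SplitMergeDemo {

    /** Copies up to maxBytes bytes of source, starting at offset, into target. */
    static void copyRange(Path source, long offset, long maxBytes, Path target) throws IOException {
        try (RandomAccessFile in = new RandomAccessFile(source.toFile(), "r");
             OutputStream out = Files.newOutputStream(target)) {
            in.seek(offset); // jump straight to the requested position
            byte[] buf = new byte[1024];
            long remaining = maxBytes;
            while (remaining > 0) {
                int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
                if (n < 0) {
                    break; // reached end of file before maxBytes
                }
                out.write(buf, 0, n);
                remaining -= n;
            }
        }
    }

    /** Appends part2 to the end of part1, like "type part2 >> part1". */
    static void append(Path part2, Path part1) throws IOException {
        try (OutputStream out = Files.newOutputStream(
                part1, StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            Files.copy(part2, out);
        }
    }

    /** Splits data at blockSize, merges the two parts, and reports whether the bytes survived. */
    static boolean roundTrip(byte[] data, int blockSize) {
        try {
            // Temporary files are left behind; acceptable for a throwaway sketch.
            Path dir = Files.createTempDirectory("split-merge");
            Path original = Files.write(dir.resolve("file"), data);
            Path part1 = dir.resolve("file.part1");
            Path part2 = dir.resolve("file.part2");
            copyRange(original, 0, blockSize, part1);              // first block
            copyRange(original, blockSize, Long.MAX_VALUE, part2); // everything after it
            append(part2, part1);
            return Arrays.equals(data, Files.readAllBytes(part1));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[5000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        System.out.println(roundTrip(data, 2048)); // prints true
    }
}
```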