Setting up Maven on Windows and creating a Maven project to operate HDFS with Java

杨苗苗

Article link: https://blog.csdn.net/weixin_42474930/article/details/82725360

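Why use Maven:

A larger Java project needs many jars, and those jars depend on each other in many ways. Without a build tool you have to track down every jar and each of its dependencies by hand and import them into the project yourself.

Maven exists to solve this problem: it manages a Java project's jar dependencies and its build process.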

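Installation and configuration:

Download the zip archive from the Maven website and unpack it to a directory of your choice. It is best if no directory name in the path contains spaces.

Configure the system environment variables:

Add MAVEN_HOME=<install directory>. To run mvn from the command line as well, also add %MAVEN_HOME%\bin to Path.

Test: open a new cmd window and run echo %MAVEN_HOME%; it should print the install directory. (cmd expands variables as %NAME%; the form $MAVEN_HOME belongs to Unix shells, and cmd just prints it literally.)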
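
The same check can be done from Java, which is also how the JVM started by the IDE will see the variable. The following is a minimal sketch, not part of the original setup: the class name `MavenHomeCheck` and the helper `looksLikeMavenHome` are made up for illustration, and it assumes the Windows launcher sits at `bin/mvn.cmd` (or `bin/mvn.bat` in older Maven releases).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class MavenHomeCheck {

    /**
     * Returns true if the given directory looks like an unpacked Maven distribution.
     * Assumption: the Windows launcher is bin/mvn.cmd (or bin/mvn.bat in older releases).
     */
    static boolean looksLikeMavenHome(Path home) {
        if (home == null || !Files.isDirectory(home)) {
            return false;
        }
        Path bin = home.resolve("bin");
        return Files.isRegularFile(bin.resolve("mvn.cmd"))
                || Files.isRegularFile(bin.resolve("mvn.bat"));
    }

    public static void main(String[] args) {
        // Environment variables are captured when the process starts,
        // so a change only shows up in newly started programs.
        String value = System.getenv("MAVEN_HOME");
        if (value == null || value.trim().isEmpty()) {
            System.out.println("MAVEN_HOME is not set in this process");
            return;
        }
        Path home = Paths.get(value.trim());
        System.out.println("MAVEN_HOME = " + home);
        System.out.println("Looks like a Maven install: " + looksLikeMavenHome(home));
    }
}
```

If the variable was set while the IDE or terminal was already open, restart it before running the check.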

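Now integrate Maven into IntelliJ IDEA:

Settings -> Maven -> set "Maven home directory" to the install directory -> set the second item, "User settings file", to Maven's settings.xml. In that file, configure: 1. localRepository (the location of your local repository); 2. mirror (a private server, i.e. an online repository; this can be left unconfigured for now).

That completes the configuration.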

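Create a new project:

Create a Maven project -> Maven -> fill in the GAV coordinates (groupId, artifactId, version) -> Next -> Next -> OK.

Next, add content to pom.xml to bring in the jars Hadoop needs and their dependencies (you only write the configuration; the jars and their dependencies are imported automatically).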

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.judy</groupId>
    <artifactId>MyMaven</artifactId>
    <version>1.0</version>

    <!-- Versions of certain jars used in the project -->
    <properties>
        <hadoop.version>2.7.1</hadoop.version>
    </properties>

    <!-- Private repository server configured for this specific project -->
    <repositories>
        <repository>
            <id>nexus-releases</id>
            <name>Nexus Release Repository</name>
            <url>http://10.0.88.249:8081/nexus/content/repositories/public/</url>
        </repository>
    </repositories>

    <!-- Jar dependencies of the project -->
    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
    </dependencies>


</project>

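This pulls in the Hadoop 2.7.1 jars. The private server at this address does not have the complete set of 2.9.1 jars, so I could only use 2.7 here; if your repository has them, 2.9.1 is recommended.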

Once this is configured, IDEA imports the dependencies automatically.
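
To confirm that the import really happened, you can look for the jar in the local repository. Maven stores each artifact under a path derived from its coordinates: the groupId with dots turned into directories, then the artifactId, then the version, then `artifactId-version.jar`. The sketch below computes that path; the class name `LocalRepoPath` is made up for illustration, and `main` assumes the default repository location `~/.m2/repository`, which will be wrong if you changed localRepository in settings.xml.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class LocalRepoPath {

    /** Maps Maven coordinates to the jar's location relative to the repository root. */
    static Path relativeJarPath(String groupId, String artifactId, String version) {
        // Dots in the groupId become directories: org.apache.hadoop -> org/apache/hadoop
        Path dir = Paths.get(groupId.replace('.', '/'), artifactId, version);
        return dir.resolve(artifactId + "-" + version + ".jar");
    }

    public static void main(String[] args) {
        // Assumption: default local repository; adjust if settings.xml sets localRepository.
        Path repo = Paths.get(System.getProperty("user.home"), ".m2", "repository");
        Path jar = repo.resolve(relativeJarPath("org.apache.hadoop", "hadoop-client", "2.7.1"));
        System.out.println(jar + " exists: " + Files.isRegularFile(jar));
    }
}
```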

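Now you can write Java code to operate HDFS.

For the HDFS packages, classes and their usage, see the official API documentation: http://hadoop.apache.org/docs/stable/api/index.html

Before running anything, do not forget to check that your HDFS services are actually started.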
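
The code in this post never sets the NameNode address itself; as its comment notes, it relies on a core-site.xml placed under the project's resources directory, from which Hadoop's `Configuration` picks up `fs.defaultFS`. If that file is not on the classpath, the client silently falls back to the local file system. A small sketch to check visibility before running is shown here; the class name `ClasspathCheck` is made up for illustration, and it only tests that the file can be found, not that its contents are correct.

```java
import java.net.URL;

final class ClasspathCheck {

    /** Returns the classpath location of the named resource, or null if it is not visible. */
    static URL find(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        URL url = find("core-site.xml");
        if (url == null) {
            System.out.println("core-site.xml is NOT on the classpath");
        } else {
            System.out.println("core-site.xml found at " + url);
        }
    }
}
```

In a Maven project, files under src/main/resources are copied to the classpath at build time, so that is the usual place for the file.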

package com.judy.first;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

import java.io.FileInputStream;
import java.io.IOException;

public class HdfsTest {
    public static void main(String[] args) {
        // To print an HDFS file to the console instead, use:
        // String hdfsPath = "/score.txt";
        // copyHdfsFileToConsole(hdfsPath);
        String hdfsfilePath = "/test/bb.txt";
        copyLocalFiletoHdfs(hdfsfilePath);
    }

    /**
     * Reads a file from HDFS and prints it to the console.
     *
     * @param hdfsPath path of the file on HDFS
     */
    public static void copyHdfsFileToConsole(String hdfsPath) {
        // 1. Create the configuration. The HDFS entry point is not set in code:
        //    it is read from core-site.xml, placed under the project's resources directory.
        Configuration conf = new Configuration();
        // 2. Get the HDFS object and open the file; try-with-resources closes both.
        try (FileSystem fileSystem = FileSystem.newInstance(conf);
             FSDataInputStream inputStream = fileSystem.open(new Path(hdfsPath))) {
            // 3. Copy to the console. Pass false so that System.out is not closed.
            IOUtils.copyBytes(inputStream, System.out, 4096, false);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * Uploads a local Windows file to HDFS.
     *
     * @param hdfsFilePath destination path on HDFS; an existing file is overwritten
     */
    public static void copyLocalFiletoHdfs(String hdfsFilePath) {
        Configuration conf = new Configuration();
        // Resources are closed in reverse order: output stream, input stream, file system.
        try (FileSystem fileSystem = FileSystem.newInstance(conf);
             FileInputStream fileInputStream = new FileInputStream("D:/aa.txt");
             FSDataOutputStream outputStream = fileSystem.create(new Path(hdfsFilePath), true)) {
            IOUtils.copyBytes(fileInputStream, outputStream, 4096, false);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

}
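
The call to `IOUtils.copyBytes` in the listing above copies the data through a 4096-byte buffer, and its last argument controls whether both streams are closed afterwards. For readers who have not seen that helper, here is roughly the same idea in plain java.io, with in-memory streams standing in for the local file and the HDFS stream. This is a sketch of the technique, not Hadoop's actual source; `CopyBytesSketch` and `copy` are made-up names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class CopyBytesSketch {

    /** Copies everything from in to out through a buffer of the given size; closes neither stream. */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // read() returns -1 at end of stream; otherwise the number of bytes placed in the buffer
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello hdfs".getBytes(StandardCharsets.UTF_8);
        // try-with-resources closes both streams even if copy throws
        try (InputStream in = new ByteArrayInputStream(data);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            long copied = copy(in, out, 4096);
            System.out.println(copied + " bytes copied");
        }
    }
}
```

Leaving the closing to try-with-resources keeps the copy loop reusable for streams that must stay open, such as System.out.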

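Check in the web UI whether it worked: http://cent01:50700 (that is my own host; the NameNode web UI defaults to port 50070 in Hadoop 2.x, so use the host and port of your own cluster).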

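OK, it worked!

It passed on my very first test run, no pressure at all /laughing-crying~~~~

Sure enough, as long as you think in OOP terms and can read the API, writing Java code is no pressure at all. I mean writing projects, not implementing algorithms in plain Java; that is a very different thing, haha.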
