Building a Storage System on Hadoop HDFS


lb192837


Article link: https://blog.csdn.net/liubin192837/article/details/85885057


A storage system based on Hadoop HDFS (web network drive)

1. Advantages of HDFS

1.1 The source-code comment puts it clearly:

Hadoop DFS is a multi-machine system that appears as a single
disk. It’s useful because of its fault tolerance and potentially
very large capacity.
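Explanation: HDFS is a multi-machine system that appears to the outside world as a single disk. Its fault tolerance and large capacity are what make it so useful.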

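1.2 How it appears as a single unit, and how fault tolerance works

1) Unity (virtualization)
The NameNode holds the metadata and acts as the steward of the whole system. Clients deal with the NameNode, which manages all the data nodes (DataNodes); the client library then handles the actual transfer to the DataNodes. From the client's point of view the entire storage side is therefore a single black box, and there is no need to care which server actually stores the data.
2) Fault tolerance
Multi-machine data replication, data checksums, heartbeat detection, block reports, and fault-tolerant reads and writes. Together these wrap our data in a much stronger protective layer.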
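
The two ideas above (one metadata holder that hides the storage nodes, plus replication and heartbeats) can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyNameNode` and all of its methods are hypothetical names invented for illustration, and none of this is the real NameNode implementation or the Hadoop API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of NameNode-style metadata. Hypothetical, not Hadoop code. */
final class ToyNameNode {
    private final int replication;          // desired number of copies per block
    private final long heartbeatTimeoutMs;  // silence after which a node counts as dead
    private final Map<String, List<String>> fileToBlocks = new HashMap<>();
    private final Map<String, Set<String>> blockToNodes = new HashMap<>();
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    ToyNameNode(int replication, long heartbeatTimeoutMs) {
        this.replication = replication;
        this.heartbeatTimeoutMs = heartbeatTimeoutMs;
    }

    /** A data node reports in; only the time of the report is recorded. */
    void heartbeat(String node, long nowMs) {
        lastHeartbeat.put(node, nowMs);
    }

    /** Record that one block of a file is stored on the given nodes. */
    void addBlock(String file, String blockId, List<String> nodes) {
        fileToBlocks.computeIfAbsent(file, f -> new ArrayList<>()).add(blockId);
        blockToNodes.put(blockId, new HashSet<>(nodes));
    }

    /** What a client asks for: the live nodes holding each block of a file. */
    Map<String, Set<String>> locate(String file, long nowMs) {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        List<String> blocks = fileToBlocks.getOrDefault(file, Collections.<String>emptyList());
        for (String block : blocks) {
            result.put(block, liveReplicas(block, nowMs));
        }
        return result;
    }

    /** Blocks with fewer live copies than desired; these would need re-replication. */
    List<String> underReplicated(long nowMs) {
        List<String> out = new ArrayList<>();
        for (String block : blockToNodes.keySet()) {
            if (liveReplicas(block, nowMs).size() < replication) {
                out.add(block);
            }
        }
        Collections.sort(out);
        return out;
    }

    /** Replicas of a block that sit on nodes with a recent heartbeat. */
    private Set<String> liveReplicas(String block, long nowMs) {
        Set<String> live = new TreeSet<>();
        for (String node : blockToNodes.get(block)) {
            Long seen = lastHeartbeat.get(node);
            if (seen != null && nowMs - seen <= heartbeatTimeoutMs) {
                live.add(node);
            }
        }
        return live;
    }
}
```

The client only ever asks `locate` for a file name; which machines answer, and which of them have dropped out, stays inside the metadata holder.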

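1.3 Drawbacks of building on a traditional storage platform

1) Business code has to be aware of the distribution and state of the storage nodes on the server side. On every operation it has to find out which nodes can take data and which are short on space before deciding where to put a file.
2) ZooKeeper (ZK) or a database can be used to discover storage nodes dynamically, but the business side still cannot shake off its coupling to the server side, and still needs to have ...
3) Data safety is hard to guarantee. Even where it can be guaranteed, too much logic has to be woven into the business side, and the performance cost is considerable.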
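
To make the coupling in point 1) concrete, here is a minimal sketch of the kind of placement logic the business side ends up owning in a traditional setup. `StorageNode`, `ManualPlacement`, and `chooseNode` are hypothetical names used only for illustration; the policy shown (largest free space wins) is an assumption, not something the article prescribes.

```java
import java.util.List;
import java.util.Optional;

/** A storage server as the business code would have to track it (hypothetical). */
final class StorageNode {
    final String host;
    final long freeBytes;
    final boolean online;

    StorageNode(String host, long freeBytes, boolean online) {
        this.host = host;
        this.freeBytes = freeBytes;
        this.online = online;
    }
}

final class ManualPlacement {
    /** Pick the online node with the most free space that can still hold the file. */
    static Optional<StorageNode> chooseNode(List<StorageNode> nodes, long fileSize) {
        StorageNode best = null;
        for (StorageNode node : nodes) {
            if (!node.online || node.freeBytes < fileSize) {
                continue; // offline or too full: the caller has to know this
            }
            if (best == null || node.freeBytes > best.freeBytes) {
                best = node;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

Every caller needs a current node list, and replication or recovery would have to be layered on top by hand. With HDFS this bookkeeping moves behind the NameNode.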

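2. Code implementation

Note: the backend is implemented with Spring Boot.
First we need to connect to the cluster. Through the master node (the NameNode) we obtain the state of the storage space and the data nodes that will hold the data, which determines where the data is sent next and how it is distributed.

Interaction with the cluster can go through an important open-source package, hadoop-common, the common support library for the other Hadoop modules. It includes conf, FS, IO, IPC, and other parts.
The hadoop-common source is at:
https://github.com/apache/hadoop/tree/trunk/hadoop-common-project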

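2.1 Upload

1) Receiving the user's upload request

    package com.hdlw.controls;

    import javax.servlet.http.HttpServletRequest;
    import org.springframework.stereotype.Controller;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.bind.annotation.ResponseBody;
    import org.springframework.web.multipart.MultipartFile;

    @Controller
    public class UploadMap {
        ...
        // Accepts a multipart upload; the file part must be named "file".
        @RequestMapping(value = "/upload")
        @ResponseBody
        public ReturnMessage upload(@RequestParam("file") MultipartFile file, HttpServletRequest httpServletRequest) {
            ...
            // Stream the uploaded bytes straight into HDFS; -1 signals failure.
            int result = UploadDownload.upload(file.getInputStream(), path);
            ...
        }
    }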

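2) Handling the upload storage request

    public class UploadDownload {
        ...
        // Writes the stream to the given HDFS path.
        // Returns the number of bytes copied, or -1 on failure.
        // IOUtils is assumed to be org.apache.commons.io.IOUtils, whose
        // copy() also returns -1 when more than Integer.MAX_VALUE bytes are copied.
        public static int upload(InputStream inputStream, String path) {
            int result = -1;
            FileSystem fileSystem = checkAndInit();
            if (fileSystem == null) {
                return result;
            }
            OutputStream outputStream = null;
            try {
                // true = overwrite an existing file at this path
                outputStream = fileSystem.create(new Path(path), true);
                result = IOUtils.copy(inputStream, outputStream);
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                close(inputStream, outputStream);
            }
            return result;
        }
        ...
    }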
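
The upload handler hands the actual byte shuffling to `IOUtils.copy`. As a rough sketch of what such a copy does, and assuming only the JDK, the loop below moves data through a fixed buffer and returns the count as a `long`. `StreamCopy` is a hypothetical helper, not part of hadoop-common or of the article's code. A `long` result avoids one ambiguity in the handler above: if `IOUtils` there is the Apache Commons IO class, its `int`-returning `copy` reports -1 for streams larger than 2 GB, which this code cannot tell apart from its own -1 failure value.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: buffered stream copy with a long byte count. */
final class StreamCopy {
    /** Copies everything from in to out and returns the number of bytes moved. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        // read() returns -1 only at end of stream
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }
}
```

The caller stays responsible for closing both streams, for example with try-with-resources.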

2.2 Download

1) Receiving the user's download request

    @GetMapping("/download")
    public String downloadFile(HttpServletRequest request, HttpServletResponse response, @RequestParam String filenameName) {
        .....
        try {
            // Ask the browser to save the response instead of rendering it.
            response.setContentType("application/force-download");
            response.addHeader("Content-Disposition", "attachment; filename=" + filenameName);
            // Stream the HDFS file directly into the HTTP response body.
            int result = UploadDownload.download(response.getOutputStream(), path);
            System.out.println(result);
            if (result != -1) {
                returnMessage.setMsg(constants.operate_ok);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        // The body has already been written to the response, so no view name is returned.
        return null;
    }
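
The download endpoint concatenates the raw file name into the `Content-Disposition` header, which can break for names containing spaces or non-ASCII characters (likely on a web drive with Chinese file names). One common way to handle this is the `filename*=UTF-8''...` form from RFC 6266 / RFC 5987. The sketch below assumes only the JDK; `DownloadHeaders` and `contentDisposition` are hypothetical names, not part of the article's code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper for building a download header value. */
final class DownloadHeaders {
    /** Builds a Content-Disposition value with a percent-encoded UTF-8 file name. */
    static String contentDisposition(String fileName) {
        try {
            // URLEncoder targets HTML forms: it writes spaces as '+' and leaves '*'
            // untouched, so both are adjusted for use in a header parameter.
            String encoded = URLEncoder.encode(fileName, "UTF-8")
                    .replace("+", "%20")
                    .replace("*", "%2A");
            return "attachment; filename*=UTF-8''" + encoded;
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }
}
```

In the controller this would be used as `response.addHeader("Content-Disposition", DownloadHeaders.contentDisposition(filenameName))`.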

2) Handling the download request

    // Copies the HDFS file at the given path into the output stream.
    // Returns the number of bytes copied, or -1 if the file is missing or the copy fails.
    public static int download(OutputStream outputStream, String path) {
        int result = -1;
        FileSystem fileSystem = checkAndInit();
        if (fileSystem == null) {
            return result;
        }
        InputStream inputStream = null;
        Path filePath = new Path(path);
        try {
            if (!fileSystem.exists(filePath)) {
                return result;
            }
            inputStream = fileSystem.open(filePath);
            result = IOUtils.copy(inputStream, outputStream);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Runs on the early return above as well.
            close(inputStream, outputStream);
        }
        return result;
    }

Conclusion

hadoop-common lets us interact with the NameNode to request storage space or look up files.

The NameNode is what stores the file metadata.
