Learning Big Data: HDFS Basics

I. Introduction to HDFS

  1. Overview

    • HDFS stands for Hadoop Distributed File System, Hadoop's distributed file system
    • It is a file system that lets files be shared across multiple hosts over a network, so that multiple users on multiple machines can share files and storage space
    • HDFS is a distributed file system designed for storing large files; it is not well suited to large numbers of small files
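The small-files point can be made concrete with some back-of-the-envelope arithmetic: the NameNode keeps every file and block entry in memory. The ~150 bytes per namespace object used below is a commonly cited ballpark (not an exact figure), and 128 MB is the default block size in recent Hadoop versions:

```python
# Rough sketch of NameNode memory cost; ~150 bytes/object is a commonly
# cited approximation, not an exact Hadoop constant.
BYTES_PER_OBJECT = 150
BLOCK_SIZE = 128 * 1024 * 1024  # default HDFS block size (assumption)

def namenode_bytes(num_files, file_size):
    # each file costs one file object plus one object per block
    blocks_per_file = max(1, -(-file_size // BLOCK_SIZE))  # ceiling division
    return num_files * (1 + blocks_per_file) * BYTES_PER_OBJECT

one_big = namenode_bytes(1, 1024 * 1024 * 1024)      # one 1 GB file
many_small = namenode_bytes(10_000, 100 * 1024)      # 10,000 x 100 KB files
print(one_big, many_small)
```

Under these assumptions, one 1 GB file costs on the order of a kilobyte of NameNode memory, while the same data split into 10,000 small files costs thousands of times more, which is why HDFS favors large files.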
  2. Design Philosophy

(figure: HDFS design diagram)

II. Basic HDFS Operations

  1. The HDFS shell
    • Command format: bin/hdfs dfs -xxx scheme://authority/path
      • Use the hdfs command in Hadoop's bin directory, followed by dfs, which indicates that the operation targets the distributed file system; this part of the command is fixed. (If Hadoop's bin directory is on your PATH, you can invoke hdfs directly.)
      • xxx is a placeholder: put the subcommand for whatever operation you want to perform on HDFS there
      • HDFS's scheme is hdfs; authority is the IP address and port of the node where the NameNode runs (a hostname works just as well as an IP); path is the path of the file or directory to operate on
      • The scheme://authority prefix is exactly the value of the fs.defaultFS property in the core-site.xml configuration file, which is the address of HDFS.
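The scheme/authority/path split described above can be illustrated with Python's standard urllib.parse module. The host bigdata01 matches the shell prompts later in this post, but the port 9000 is only a sample value; your cluster's fs.defaultFS may use a different one:

```python
from urllib.parse import urlparse

# Sample HDFS URI; bigdata01:9000 is an assumed fs.defaultFS value
uri = urlparse("hdfs://bigdata01:9000/README.txt")

print(uri.scheme)   # scheme    -> hdfs
print(uri.netloc)   # authority -> bigdata01:9000
print(uri.path)     # path      -> /README.txt
```

When fs.defaultFS is configured, the shell fills in the scheme and authority for you, which is why the commands below can use bare paths like /README.txt.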
  2. Basic commands
    • hdfs dfs: running it with no subcommand prints the usage/help text

      [root@bigdata01 ~]# hdfs dfs
      Usage: hadoop fs [generic options]
              [-appendToFile <localsrc> ... <dst>]
              [-cat [-ignoreCrc] <src> ...]
              [-checksum <src> ...]
              [-chgrp [-R] GROUP PATH...]
              [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
              [-chown [-R] [OWNER][:[GROUP]] PATH...]
              [-copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] <localsrc> ... <dst>]
              [-copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
              [-count [-q] [-h] [-v] [-t [<storage type>]] [-u] [-x] [-e] <path> ...]
              [-cp [-f] [-p | -p[topax]] [-d] <src> ... <dst>]
              [-createSnapshot <snapshotDir> [<snapshotName>]]
              [-deleteSnapshot <snapshotDir> <snapshotName>]
              [-df [-h] [<path> ...]]
              [-du [-s] [-h] [-v] [-x] <path> ...]
              [-expunge]
              [-find <path> ... <expression> ...]
              [-get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
              [-getfacl [-R] <path>]
              [-getfattr [-R] {-n name | -d} [-e en] <path>]
              [-getmerge [-nl] [-skip-empty-file] <src> <localdst>]
              [-head <file>]
              [-help [cmd ...]]
              [-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [-e] [<path> ...]]
              [-mkdir [-p] <path> ...]
              [-moveFromLocal <localsrc> ... <dst>]
              [-moveToLocal <src> <localdst>]
              [-mv <src> ... <dst>]
              [-put [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
              [-renameSnapshot <snapshotDir> <oldName> <newName>]
              [-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ...]
              [-rmdir [--ignore-fail-on-non-empty] <dir> ...]
              [-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
              [-setfattr {-n name [-v value] | -x name} <path>]
              [-setrep [-R] [-w] <rep> <path> ...]
              [-stat [format] <path> ...]
              [-tail [-f] <file>]
              [-test -[defsz] <path>]
              [-text [-ignoreCrc] <src> ...]
              [-touch [-a] [-m] [-t TIMESTAMP ] [-c] <path> ...]
              [-touchz <path> ...]
              [-truncate [-w] <length> <path> ...]
              [-usage [cmd ...]]
      
      Generic options supported are:
      -conf <configuration file>        specify an application configuration file
      -D <property=value>               define a value for a given property
      -fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
      -jt <local|resourcemanager:port>  specify a ResourceManager
      -files <file1,...>                specify a comma-separated list of files to be copied to the map reduce cluster
      -libjars <jar1,...>               specify a comma-separated list of jar files to be included in the classpath
      -archives <archive1,...>          specify a comma-separated list of archives to be unarchived on the compute machines
      
      The general command line syntax is:
      command [genericOptions] [commandOptions]
      
      
    • hdfs dfs -ls: list the contents of the given path

      [root@bigdata01 hadoop-3.2.0]# hdfs dfs -ls /
      Found 1 items
      -rw-r--r--   2 root supergroup       1361 2022-02-25 18:24 /README.txt
      
    • hdfs dfs -ls -R: recursively list all files and directories

      [root@bigdata01 hadoop-3.2.0]# hdfs dfs -ls -R /
      -rw-r--r--   2 root supergroup       1361 2022-02-25 18:24 /README.txt
      drwxr-xr-x   - root supergroup          0 2022-02-25 18:29 /abc
      drwxr-xr-x   - root supergroup          0 2022-02-25 18:29 /abc/xyz
      drwxr-xr-x   - root supergroup          0 2022-02-25 18:28 /test
      
      
    • hdfs dfs -put: upload a local file to HDFS

      [root@bigdata01 hadoop-3.2.0]# hdfs dfs -put README.txt /  
      
    • hdfs dfs -get: download a file from HDFS. Note that -get refuses to overwrite an existing local file, which is why the first attempt below fails with "File exists"; downloading under a different local name succeeds.

      [root@bigdata01 hadoop-3.2.0]# hdfs dfs -get /README.txt .
      get: `README.txt': File exists
      [root@bigdata01 hadoop-3.2.0]# hdfs dfs -get /README.txt README.txt.bak
      [root@bigdata01 hadoop-3.2.0]# ll
      total 188
      drwxr-xr-x. 2 1001 1002    203 Jan  8  2019 bin
      drwxr-xr-x. 3 1001 1002     20 Jan  8  2019 etc
      drwxr-xr-x. 2 1001 1002    106 Jan  8  2019 include
      drwxr-xr-x. 3 1001 1002     20 Jan  8  2019 lib
      drwxr-xr-x. 4 1001 1002   4096 Jan  8  2019 libexec
      -rw-rw-r--. 1 1001 1002 150569 Oct 19  2018 LICENSE.txt
      -rw-rw-r--. 1 1001 1002  22125 Oct 19  2018 NOTICE.txt
      -rw-rw-r--. 1 1001 1002   1361 Oct 19  2018 README.txt
      -rw-r--r--. 1 root root   1361 Feb 25 18:25 README.txt.bak
      drwxr-xr-x. 3 1001 1002   4096 Feb 25 15:53 sbin
      drwxr-xr-x. 4 1001 1002     31 Jan  8  2019 share
      
    • hdfs dfs -cat: print a file's contents

      [root@bigdata01 hadoop-3.2.0]# hdfs dfs -cat /README.txt
      For the latest information about Hadoop, please visit our website at:
      
         http://hadoop.apache.org/
      
      and our wiki, at:
      
         http://wiki.apache.org/hadoop/
      
      This distribution includes cryptographic software.  The country in 
      which you currently reside may have restrictions on the import, 
      possession, use, and/or re-export to another country, of 
      encryption software.  BEFORE using any encryption software, please 
      check your country's laws, regulations and policies concerning the
      import, possession, or use, and re-export of encryption software, to 
      see if this is permitted.  See <http://www.wassenaar.org/> for more
      information.
      
      The U.S. Government Department of Commerce, Bureau of Industry and
      Security (BIS), has classified this software as Export Commodity 
      Control Number (ECCN) 5D002.C.1, which includes information security
      software using or performing cryptographic functions with asymmetric
      algorithms.  The form and manner of this Apache Software Foundation
      distribution makes it eligible for export under the License Exception
      ENC Technology Software Unrestricted (TSU) exception (see the BIS 
      Export Administration Regulations, Section 740.13) for both object 
      code and source code.
      
      The following provides more details on the included cryptographic
      software:
        Hadoop Core uses the SSL libraries from the Jetty project written 
      by mortbay.org.
      
    • hdfs dfs -mkdir: create a directory

      [root@bigdata01 hadoop-3.2.0]# hdfs dfs -mkdir /test
      
      [root@bigdata01 hadoop-3.2.0]# hdfs dfs -ls /
      Found 2 items
      -rw-r--r--   2 root supergroup       1361 2022-02-25 18:24 /README.txt
      drwxr-xr-x   - root supergroup          0 2022-02-25 18:28 /test
      
    • hdfs dfs -mkdir -p: create nested directories recursively

      [root@bigdata01 hadoop-3.2.0]# hdfs dfs -mkdir -p /abc/xyz
      You have mail in /var/spool/mail/root
      [root@bigdata01 hadoop-3.2.0]# hdfs dfs -ls /
      Found 3 items
      -rw-r--r--   2 root supergroup       1361 2022-02-25 18:24 /README.txt
      drwxr-xr-x   - root supergroup          0 2022-02-25 18:29 /abc
      drwxr-xr-x   - root supergroup          0 2022-02-25 18:28 /test
      
    • hdfs dfs -rm: delete a file

      [root@bigdata01 hadoop-3.2.0]# hdfs dfs -rm /README.txt
      Deleted /README.txt
      
      [root@bigdata01 hadoop-3.2.0]# hdfs dfs -ls /
      Found 2 items
      drwxr-xr-x   - root supergroup          0 2022-02-25 18:29 /abc
      drwxr-xr-x   - root supergroup          0 2022-02-25 18:28 /test
      
    • hdfs dfs -rm -r: delete a directory recursively

      [root@bigdata01 hadoop-3.2.0]# hdfs dfs -rm -r /test
      Deleted /test
      
      [root@bigdata01 hadoop-3.2.0]# hdfs dfs -rm -r /abc
      Deleted /abc
      You have mail in /var/spool/mail/root
      [root@bigdata01 hadoop-3.2.0]# hdfs dfs -ls /
      [root@bigdata01 hadoop-3.2.0]# 
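
The shell commands walked through above can also be scripted. Below is a minimal Python sketch that builds the same hdfs dfs command lines and can optionally run them via subprocess; it assumes the hdfs binary is on PATH when actually executed, and the helper names (hdfs_dfs, run) are my own, not part of any Hadoop API:

```python
import subprocess

def hdfs_dfs(action, *args):
    """Build the argument list for an 'hdfs dfs' subcommand, e.g. action='ls'."""
    return ["hdfs", "dfs", f"-{action}", *args]

def run(cmd):
    """Run a command list and return its stdout; requires hdfs on PATH."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# Build the same commands used in the walkthrough above
print(hdfs_dfs("mkdir", "-p", "/abc/xyz"))  # ['hdfs', 'dfs', '-mkdir', '-p', '/abc/xyz']
print(hdfs_dfs("put", "README.txt", "/"))   # ['hdfs', 'dfs', '-put', 'README.txt', '/']
# print(run(hdfs_dfs("ls", "/")))  # uncomment on a machine with a running cluster
```

Keeping command construction separate from execution makes such scripts easy to test without a live cluster.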
      