Hadoop Installation

Hadoop
一、Distributed big-data storage system
  Components
    HDFS daemons are NameNode, SecondaryNameNode, and DataNode.
    YARN daemons are ResourceManager, NodeManager, and WebAppProxy.
    If MapReduce is to be used, the MapReduce Job History Server will also be running.
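    A quick way to see which of these daemons are actually running on a node is the JDK's jps tool, which lists JVM processes by class name (e.g. NameNode, DataNode, ResourceManager):
    $ jps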
  Open firewall ports
    firewall-cmd --zone=public --add-port=8089/tcp --permanent
    firewall-cmd --zone=public --add-port=8088/tcp --permanent
    firewall-cmd --reload
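    If the NameNode (50070) and MapReduce JobHistory Server (19888) web UIs also need to be reachable from other hosts, the same pattern applies (port numbers are the 2.x defaults listed in the table further below):
    firewall-cmd --zone=public --add-port=50070/tcp --permanent
    firewall-cmd --zone=public --add-port=19888/tcp --permanent
    firewall-cmd --reload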
二、Three supported deployment modes
  http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
  1、Local (Standalone) Mode
    Runs as a single Java process, launched from the command line:
    $ hadoop jar test.jar
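    For a concrete run in this mode, the grep example from the Single Node Setup guide can be used with the default (empty) configuration; the commands below assume they are run from the Hadoop installation directory and that the examples jar version matches the install:
    $ mkdir input
    $ cp etc/hadoop/*.xml input
    $ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.1.jar grep input output 'dfs[a-z.]+'
    $ cat output/*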
  2、Pseudo-Distributed Mode
  Execution:
    1) Configure core-site.xml
      $ vi $HADOOP_HOME/etc/hadoop/core-site.xml
      <configuration>
          <property>
              <name>fs.defaultFS</name>
              <value>hdfs://localhost:9000</value>
          </property>
      </configuration>
    2) Configure hdfs-site.xml
      $ vi $HADOOP_HOME/etc/hadoop/hdfs-site.xml
      <configuration>
          <property>
              <name>dfs.replication</name>
              <value>1</value>
          </property>
      </configuration>
    3) Set up passphraseless SSH to localhost
      $ ssh localhost
      $ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
      $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
      $ chmod 0600 ~/.ssh/authorized_keys
    4) Set JAVA_HOME in $HADOOP_HOME/etc/hadoop/hadoop-env.sh
      export JAVA_HOME=/usr/local/jdk1.8.0_91
    5) Format the filesystem
      $ $HADOOP_HOME/bin/hdfs namenode -format
    6) Start the NameNode and DataNode daemons
      $ $HADOOP_HOME/sbin/start-dfs.sh
    7) Browse the NameNode web interface
      http://localhost:50070/
    8) Create the HDFS directories required for MapReduce jobs
      $ $HADOOP_HOME/bin/hdfs dfs -mkdir /user
      $ $HADOOP_HOME/bin/hdfs dfs -mkdir /user/<username>
    9) Copy local files into HDFS
      $ bin/hdfs dfs -put /opt/hadoop /user
    10) Run the example MapReduce job
      $ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.1.jar grep /user/hadoop /user/output 'dfs[a-z.]+'
    11) Copy the output files from HDFS to the local filesystem and inspect them
      $ bin/hdfs dfs -get /user/output output
      $ cat output/*
      or view them directly on HDFS:
      $ bin/hdfs dfs -cat /user/output/*
    12) Stop the HDFS daemons
      $ sbin/stop-dfs.sh
  YARN on a Single Node:
    1)vi $HADOOP_HOME/etc/hadoop/mapred-site.xml
    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
    </configuration>
    2)vi $HADOOP_HOME/etc/hadoop/yarn-site.xml
    <configuration>
      <property>
          <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce_shuffle</value>
      </property>
    </configuration>
    3) Start the ResourceManager and NodeManager daemons
    $ $HADOOP_HOME/sbin/start-yarn.sh
    4) Browse the ResourceManager web interface
    http://localhost:8088/
    5) Run a MapReduce job
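    For example, resubmitting the grep job from step 10 now runs it on YARN (the examples jar version and the /user/output-yarn output path are assumptions; any output directory that does not yet exist will do):
    $ $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.1.jar grep /user/hadoop /user/output-yarn 'dfs[a-z.]+'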
    6) Stop the YARN daemons
    $ $HADOOP_HOME/sbin/stop-yarn.sh

  3、Fully-Distributed Mode
    http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/ClusterSetup.html
    Installation directory: /opt/hadoop-2.7.7/
    Hadoop Startup
    To start a Hadoop cluster you will need to start both the HDFS and YARN cluster.
    The first time you bring up HDFS, it must be formatted. Format a new distributed filesystem as hdfs:
    [hdfs]$ $HADOOP_PREFIX/bin/hdfs namenode -format <cluster_name>

    Start the HDFS NameNode with the following command on the designated node as hdfs:
    [hdfs]$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode

    Start a HDFS DataNode with the following command on each designated node as hdfs:
    [hdfs]$ $HADOOP_PREFIX/sbin/hadoop-daemons.sh --config $HADOOP_CONF_DIR --script hdfs start datanode

    If etc/hadoop/slaves and ssh trusted access is configured (see Single Node Setup), all of the HDFS processes can be started with a utility script. As hdfs:
    [hdfs]$ $HADOOP_PREFIX/sbin/start-dfs.sh

    Start the YARN with the following command, run on the designated ResourceManager as yarn:
    [yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager

    Run a script to start a NodeManager on each designated host as yarn:
    [yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemons.sh --config $HADOOP_CONF_DIR start nodemanager

    Start a standalone WebAppProxy server. Run on the WebAppProxy server as yarn. If multiple servers are used with load balancing it should be run on each of them:
    [yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start proxyserver

    If etc/hadoop/slaves and ssh trusted access is configured (see Single Node Setup), all of the YARN processes can be started with a utility script. As yarn:
    [yarn]$ $HADOOP_PREFIX/sbin/start-yarn.sh

    Start the MapReduce JobHistory Server with the following command, run on the designated server as mapred:
    [mapred]$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh --config $HADOOP_CONF_DIR start historyserver


    Hadoop Shutdown
    Stop the NameNode with the following command, run on the designated NameNode as hdfs:
    [hdfs]$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop namenode

    Run a script to stop a DataNode as hdfs:
    [hdfs]$ $HADOOP_PREFIX/sbin/hadoop-daemons.sh --config $HADOOP_CONF_DIR --script hdfs stop datanode

    If etc/hadoop/slaves and ssh trusted access is configured (see Single Node Setup), all of the HDFS processes may be stopped with a utility script. As hdfs:
    [hdfs]$ $HADOOP_PREFIX/sbin/stop-dfs.sh

    Stop the ResourceManager with the following command, run on the designated ResourceManager as yarn:
    [yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop resourcemanager

    Run a script to stop a NodeManager on a slave as yarn:
    [yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemons.sh --config $HADOOP_CONF_DIR stop nodemanager

    If etc/hadoop/slaves and ssh trusted access is configured (see Single Node Setup), all of the YARN processes can be stopped with a utility script. As yarn:
    [yarn]$ $HADOOP_PREFIX/sbin/stop-yarn.sh

    Stop the WebAppProxy server. Run on the WebAppProxy server as yarn. If multiple servers are used with load balancing it should be run on each of them:
    [yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop proxyserver

    Stop the MapReduce JobHistory Server with the following command, run on the designated server as mapred:
    [mapred]$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh --config $HADOOP_CONF_DIR stop historyserver


    Daemon                         Web Interface            Notes
    NameNode                       http://nn_host:port/     Default HTTP port is 50070.
    ResourceManager                http://rm_host:port/     Default HTTP port is 8088.
    MapReduce JobHistory Server    http://jhs_host:port/    Default HTTP port is 19888.
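    A quick reachability check from the command line (assuming the default ports above and that each command is run on the daemon's host; localhost is used here):
    $ curl -s -o /dev/null -w '%{http_code}\n' http://localhost:50070/
    $ curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8088/
    $ curl -s -o /dev/null -w '%{http_code}\n' http://localhost:19888/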


三、Common shell commands
    http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html

#Overview
Common shell commands for creating a Hadoop user, managing HDFS permissions, and working with HDFS


#Reference
http://blog.csdn.net/swuteresa/article/details/13767169

Add a hadoop group
sudo addgroup hadoop
Add the current user larry to the hadoop group
sudo usermod -a -G hadoop larry

Give the hadoop group sudo privileges
sudo gedit /etc/sudoers
After the line root ALL=(ALL) ALL, add: %hadoop ALL=(ALL) ALL  (the leading % marks a group)

Change the owner of the Hadoop directory
sudo chown -R larry:hadoop /home/larry/hadoop    # <owner>:<group> <path>

Change permissions on the local Hadoop directory and on HDFS
sudo chmod -R 755 /home/larry/hadoop
sudo bin/hdfs dfs -chmod -R 755 /
sudo bin/hdfs dfs -ls /

Change the owner of HDFS files
sudo bin/hadoop fs -chown -R larry /

Leave HDFS safe mode
sudo bin/hdfs dfsadmin -safemode leave
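To check the current safe-mode state (a read-only query, no superuser needed):
bin/hdfs dfsadmin -safemode get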

Copy a local file into HDFS
hadoop fs -copyFromLocal <localsrc> URI

Print the contents of the specified files to stdout
hadoop fs -cat file:///file3 /user/hadoop/file4

Change the group a file belongs to
hadoop fs -chgrp [-R] GROUP URI

Change file access permissions
hadoop fs -chmod [-R] 755 URI

Change the owner of files
hadoop fs -chown [-R] [OWNER][:[GROUP]] URI [URI ]

Copy an HDFS file to the local filesystem
hadoop fs -copyToLocal URI localdst

Copy HDFS files to another HDFS location
hadoop fs -cp URI [URI …] <dest>

Show the size of all files in a directory
hadoop fs -du URI [URI …]

Merge the files under a source directory into a single local file
hadoop fs -getmerge <src> <localdst> [addnl]
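A few concrete invocations of the commands above (the paths are purely illustrative):
hadoop fs -copyFromLocal ./access.log /user/larry/logs/access.log
hadoop fs -getmerge /user/larry/output ./output-merged.txt
hadoop fs -du -h /user/larry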

#Problem: Permission denied: user=root, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
#Fix: run the command as the hdfs user
#e.g. while logged in as root, create a directory in HDFS as the hdfs user
sudo -u hdfs hadoop fs -mkdir /nutch
#List the files in HDFS
hdfs dfs -ls /
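#An alternative, assuming simple (non-Kerberos) authentication: set HADOOP_USER_NAME so the client acts as the hdfs superuser
export HADOOP_USER_NAME=hdfs
hadoop fs -mkdir /nutch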

 

