The Spark + Hadoop combination is where big data is heading, so here is a brief walkthrough of installing Hadoop.
Download the Hadoop tarball:
http://mirrors.hust.edu.cn/apache/hadoop/core/hadoop-2.8.0/
Extract it: tar -zxvf hadoop-2.8.0.tar.gz
Configure the environment variables:
export HADOOP_HOME=/home/qpx/tool/hadoop-2.8.0
export PATH=.:$SPARK_HOME/bin:$HADOOP_HOME/bin:$JAVA_HOME/bin:$PATH
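The two lines above normally go into ~/.bashrc so every new shell picks them up. A minimal sketch (the install path is the one used above; SPARK_HOME and JAVA_HOME are assumed to be set elsewhere in your own environment):

```shell
# Minimal sketch: put Hadoop on the shell's PATH.
# /home/qpx/tool/hadoop-2.8.0 is the install path used in this post;
# adjust it to your own layout.
export HADOOP_HOME=/home/qpx/tool/hadoop-2.8.0
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

# After `source ~/.bashrc`, the hadoop command should resolve:
#   hadoop version
```

Including $HADOOP_HOME/sbin as well makes the start/stop scripts (start-all.sh etc.) available from anywhere, not just the bin tools.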
Edit hadoop-env.sh
export JAVA_HOME=${JAVA_HOME}
(Daemons launched over SSH do not always inherit the shell's JAVA_HOME, so it is safer to replace ${JAVA_HOME} here with the absolute path of your JDK.)
Edit core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://hadoop:9000</value>
<description>replace "hadoop" with your own hostname</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/tmp</value>
</property>
</configuration>
Edit hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
Edit mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hadoop:9001</value>
<description>replace "hadoop" with your own hostname</description>
</property>
</configuration>
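After editing the three files, a quick grep can confirm each one names the property it should. This is only a sketch; it assumes the configs live under $HADOOP_HOME/etc/hadoop, the default layout for Hadoop 2.x:

```shell
# Sketch: verify each Hadoop config file mentions the property set above.
# Assumes the Hadoop 2.x config directory layout ($HADOOP_HOME/etc/hadoop).
check_conf() {
  conf="$1/etc/hadoop"
  for pair in "core-site.xml fs.default.name" \
              "hdfs-site.xml dfs.replication" \
              "mapred-site.xml mapred.job.tracker"; do
    set -- $pair                        # $1 = file, $2 = property name
    grep -q "$2" "$conf/$1" || { echo "missing $2 in $1"; return 1; }
  done
  echo "config ok"
}

# Usage: check_conf "$HADOOP_HOME"
```

This only checks that the property names appear, not that the XML is valid; a typo in a tag would still slip through.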
Before starting Hadoop, format HDFS: hdfs namenode -format (the older form hadoop namenode -format still works in 2.x but prints a deprecation warning).
Start Hadoop and verify it. Start command: start-all.sh (deprecated in 2.x; running start-dfs.sh followed by start-yarn.sh is the recommended equivalent). Verification command: jps
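jps should list the running Hadoop daemons. For a pseudo-distributed 2.x install the usual set is NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager; that daemon list is an assumption about the default setup, not something Hadoop itself enforces. A small helper can scan the jps output:

```shell
# Sketch: scan jps output for the daemons a pseudo-distributed
# Hadoop 2.x cluster normally runs. The daemon list is an assumption
# about the default start-all.sh setup.
check_daemons() {
  out="$1"
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    case "$out" in
      *"$d"*) ;;                       # daemon name found in the output
      *) echo "missing: $d"; return 1 ;;
    esac
  done
  echo "all daemons running"
}

# Usage: check_daemons "$(jps)"
```

Note the substring match is crude (e.g. "SecondaryNameNode" also satisfies the "NameNode" check), so treat a pass as a hint, not proof; if a daemon is missing, check its log under $HADOOP_HOME/logs.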
Hope this helps. Next up is an introduction to Spark.