1. Extract the Hadoop tarball.
2. Set JAVA_HOME in all three env scripts: hadoop-env.sh, mapred-env.sh, and yarn-env.sh (example below).
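For example, add the same line to each of the three scripts (the JDK path here is an assumption; use the path of your own installation):
export JAVA_HOME=/home/beifeng/opt/modules/jdk1.7.0_67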
3. Configure core-site.xml (default filesystem address/port and temp directory):
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop-senior.ibeifeng.com:8020</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/beifeng/opt/cdh-5.3.6/hadoop-2.5.0-cdh5.3.6/data/tmp</value>
</property>
3.1 Create the temp directory: mkdir -p /home/beifeng/opt/cdh-5.3.6/hadoop-2.5.0-cdh5.3.6/data/tmp
4. Configure hdfs-site.xml:
<property>
<name>dfs.replication</name> <!-- replication factor -->
<value>1</value>
</property>
<property>
<name>dfs.permissions</name> <!-- disable HDFS permission checking -->
<value>false</value>
</property>
<property>
<name>dfs.namenode.http-address</name> <!-- NameNode web UI -->
<value>hadoop-senior.ibeifeng.com:50070</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name> <!-- SecondaryNameNode web UI -->
<value>hadoop-senior.ibeifeng.com:50090</value>
</property>
5. Configure yarn-site.xml:
<property>
<name>yarn.nodemanager.aux-services</name> <!-- needed so MapReduce can run on YARN -->
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name> <!-- ResourceManager address -->
<value>hadoop-senior.ibeifeng.com</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name> <!-- enable log aggregation -->
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name> <!-- how long aggregated logs are kept -->
<value>640800</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name> <!-- cluster resource settings: memory available to this NodeManager -->
<value>4096</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name> <!-- cluster resource settings: vcores available to this NodeManager -->
<value>4</value>
</property>
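With log aggregation enabled and the JobHistory server running, logs of finished applications can be pulled back with the standard YARN CLI, e.g.:
bin/yarn logs -applicationId <application_id>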
6. Rename mapred-site.xml.template to mapred-site.xml (mv etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml) and configure it:
<property>
<name>mapreduce.framework.name</name> <!-- run MapReduce on YARN -->
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name> <!-- JobHistory server -->
<value>hadoop-senior.ibeifeng.com:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name> <!-- JobHistory web UI -->
<value>hadoop-senior.ibeifeng.com:19888</value>
</property>
7. Add the slave hostname(s) to the slaves file, one per line (see the example below).
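For this single-node setup the file holds just the one hostname used above:
echo "hadoop-senior.ibeifeng.com" > etc/hadoop/slaves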
8. Format the NameNode (from the Hadoop install directory): bin/hdfs namenode -format
9. Start the daemons one at a time, beginning with: sbin/hadoop-daemon.sh start namenode (the rest follow below).
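The remaining daemons are started the same way (standard Hadoop 2.x scripts; run jps afterwards to confirm each process is up):
sbin/hadoop-daemon.sh start datanode
sbin/yarn-daemon.sh start resourcemanager
sbin/yarn-daemon.sh start nodemanager
sbin/mr-jobhistory-daemon.sh start historyserver
jps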
II. Hive configuration
1. In hive-env.sh, set HADOOP_HOME and HIVE_CONF_DIR (the directory holding Hive's conf files).
2. In hive-log4j.properties, point hive.log.dir at a logs directory under the install directory:
/home/beifeng/opt/modules/hive-0.13.1/logs
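Create that directory first so logging has somewhere to go:
mkdir -p /home/beifeng/opt/modules/hive-0.13.1/logs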
3.1 Create a personal hive-site.xml under conf: touch hive-site.xml, then vi hive-site.xml (the file must be well-formed XML).
3.2 Insert:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name> <!-- metastore database in MySQL, created automatically if missing -->
<value>jdbc:mysql://hadoop-senior.ibeifeng.com:3306/metadata?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>hive.cli.print.header</name>
<value>true</value>
<description>Whether to print the names of the columns in query output.</description>
</property>
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
<description>Whether to include the current database in the Hive prompt.</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
</configuration>
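The connection settings above assume MySQL is running on hadoop-senior.ibeifeng.com:3306 and that root/123456 may connect from this host. If the account lacks remote access, a sketch of the usual grant (MySQL 5.x syntax):
GRANT ALL PRIVILEGES ON *.* TO 'root'@'hadoop-senior.ibeifeng.com' IDENTIFIED BY '123456';
FLUSH PRIVILEGES;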
4. Copy the MySQL JDBC driver jar into Hive's lib directory (driver location as in the Sqoop step below):
cp /home/beifeng/opt/softwares/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /home/beifeng/opt/modules/hive-0.13.1/lib/
5. Create Hive's warehouse directory in HDFS, where table data is stored (run from the Hadoop install directory):
bin/hdfs dfs -mkdir -p /user/hive/warehouse
bin/hdfs dfs -chmod g+w /user/hive/warehouse
6. Test Hive:
create table student(id int,name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' ;
load data local inpath '/home/beifeng/opt/datas/students.txt' overwrite into table student ;
select count(1) from student ; -- if this MapReduce job returns a result, both Hive and Hadoop are working
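The load statement expects a tab-delimited file; a hypothetical /home/beifeng/opt/datas/students.txt could look like:
1	zhangsan
2	lisi
3	wangwu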
III. Sqoop installation and configuration
1. Extract Sqoop alongside the Hadoop directory.
2. In conf/sqoop-env.sh, set HADOOP_COMMON_HOME and HADOOP_MAPRED_HOME (the Hadoop and MapReduce install directories);
both are set to /home/beifeng/opt/cdh-5.3.6/hadoop-2.5.0-cdh5.3.6
Also set HIVE_HOME (the Hive install directory): /home/beifeng/opt/cdh-5.3.6/hive-0.13.1-cdh5.3.6
3. Try Sqoop: bin/sqoop help (lists the available tools)
4. Using MySQL as the example RDBMS, copy the JDBC driver jar into $SQOOP_HOME/lib:
cp /home/beifeng/opt/softwares/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /home/beifeng/opt/cdh-5.3.6/sqoop-1.4.5-cdh5.3.6/lib/
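A minimal import to verify the setup (the table and target directory names are assumptions; the flags are standard Sqoop 1.4.x):
bin/sqoop import \
--connect jdbc:mysql://hadoop-senior.ibeifeng.com:3306/metadata \
--username root \
--password 123456 \
--table TBLS \
--target-dir /user/beifeng/sqoop/tbls \
--num-mappers 1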