Building a Data Warehouse

I. Data Warehouse Layout

hadoop11 serves as the Hive client
hadoop12 serves as the Hive server (the metastore service)
hadoop13 serves as the MySQL server

II. Installing MySQL on hadoop13

1. Download the MySQL repo package

wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm 

2. Install the mysql-community-release-el7-5.noarch.rpm package

rpm -ivh mysql-community-release-el7-5.noarch.rpm

3. Install MySQL

yum -y install mysql-community-server

4. Reload systemd
Reload systemd so it picks up the newly installed unit files:

systemctl daemon-reload 

5. Start the service:

systemctl start mysqld

Enable it at boot:

systemctl enable mysqld

6. Reset the root password
Check whether MySQL is running, and stop it if it is:

ps -ef | grep -i mysql 
systemctl stop mysqld

Edit the MySQL configuration file my.cnf and add the following lines, which make the server skip password validation:

[mysqld]

skip-grant-tables

Start MySQL again and open a client session:

systemctl start mysqld
mysql -u root

7. Change the password

use mysql;
update user set password=PASSWORD('your-new-password') where user='root';

flush privileges;
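The UPDATE above matches MySQL 5.6, which is what the el7-5 repo installs. If you ever repeat this procedure on MySQL 5.7 or later, note that the `password` column was renamed; a hedged sketch of the 5.7 equivalent (shown for reference only, not needed here):

```shell
# Only applies to MySQL 5.7+ (NOT the 5.6 installed above) -- shown for reference.
# 5.7 renamed mysql.user's `password` column to `authentication_string`.
mysql -u root <<'SQL'
USE mysql;
UPDATE user SET authentication_string = PASSWORD('your-new-password') WHERE user = 'root';
FLUSH PRIVILEGES;
SQL
```

On MySQL 8.0 the PASSWORD() function itself is gone and `ALTER USER` is the supported route.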

8. Restart the MySQL service
Remove the skip-grant-tables lines added to my.cnf earlier, then restart the service.

9. Allow remote connections

grant all privileges on *.* to 'root'@'%' identified by 'your-new-password' with grant option;

flush privileges; 

10. Create the hive metastore database and grant hadoop12 access.

create database hive;
grant all on *.* to root@hadoop12 identified by 'zyl990708';
flush privileges;
exit
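To confirm the grant took effect, you can try connecting from hadoop12 (a sanity check, assuming the mysql client is installed there):

```shell
# Run on hadoop12: should list the freshly created `hive` database.
mysql -h hadoop13 -u root -pzyl990708 -e 'show databases;'
```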

III. Installing and Configuring Hive (required on every node)

1. Extract Hive 3.1.2

 tar -zxvf apache-hive-3.1.2-bin.tar.gz -C /opt/moudle/

Rename the extracted directory to hive:

mv /opt/moudle/apache-hive-3.1.2-bin /opt/moudle/hive

Distribute the hive directory to the other nodes:

xsync /opt/moudle/hive

2. Configure environment variables

##HIVE_HOME
export HIVE_HOME=/opt/moudle/hive
export PATH=$PATH:$HIVE_HOME/bin
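After adding the variables (assuming they went into /etc/profile, which this guide does not state explicitly), reload the profile and check that the hive launcher is on the PATH:

```shell
source /etc/profile
which hive        # should print /opt/moudle/hive/bin/hive
hive --version    # should print the Hive 3.1.2 banner
```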

3. Resolve the log4j jar conflict

mv lib/log4j-slf4j-impl-2.10.0.jar lib/log4j-slf4j-impl-2.10.0.jar.bak

4. Rename the template files in the conf directory

cp hive-env.sh.template hive-env.sh
cp hive-default.xml.template hive-site.xml

5. In hive-env.sh, set HADOOP_HOME as follows

HADOOP_HOME=/opt/moudle/hadoop

6. Copy the MySQL JDBC driver into Hive's lib directory

cp /opt/sofeware/mysql-connector-java-5.1.46-bin.jar ./lib/

7. Configure hive-site.xml on hadoop13 (the JDBC properties below belong in hive-site.xml, not core-site.xml)

<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop13:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>zyl990708</value>
    <description>password to use against metastore database</description>
  </property>

8. Configure hadoop11

Create a tmp directory under the hive directory, then add the following to hive-site.xml.

hive.metastore.local defaults to true; set it to false so Hive connects to the remote metastore service on hadoop12:
<property>
 	<name>hive.metastore.local</name>
	 <value>false</value>
</property>
<property>
 	<name>hive.metastore.uris</name>
	<value>thrift://hadoop12:9083</value>
</property> 
  
  <property>
    <name>hive.querylog.location</name>
    <value>/opt/moudle/hive/tmp/${system:user.name}</value>
    <description>Location of Hive run time structured log file</description>
  </property>
  
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/opt/moudle/hive/tmp/${user.name}</value>
    <description>Local scratch space for Hive jobs</description>
</property>

<property>
    <name>hive.downloaded.resources.dir</name>
    <value>/opt/moudle/hive/tmp/${hive.session.id}_resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
</property>

<property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/opt/moudle/hive/tmp/${system:user.name}/operation_logs</value>
    <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>

9. Configure hadoop12

Before configuring, create a tmp directory under the hive directory, then add the following to hive-site.xml:

<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop13:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>zyl990708</value>
    <description>password to use against metastore database</description>
  </property>
  
<property>
    <name>datanucleus.schema.autoCreateAll</name>
    <value>true</value>
 </property>
 
<property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
    <description>
      Enforce metastore schema version consistency.
      True: Verify that version information stored in is compatible with one from Hive jars.  Also disable automatic
            schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
            proper metastore schema migration. (Default)
      False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
    </description>
  </property>
  
  <property>
    <name>hive.metastore.event.db.notification.api.auth</name>
    <value>false</value>
    <description>
      Should metastore do authorization against database notification related APIs such as get_next_notification.
      If set to true, then only the superusers in proxy settings have the permission
    </description>
  </property>
  
  <property>
    <name>hive.querylog.location</name>
    <value>/opt/moudle/hive/tmp/${system:user.name}</value>
    <description>Location of Hive run time structured log file</description>
  </property>
  
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/opt/moudle/hive/tmp/${user.name}</value>
    <description>Local scratch space for Hive jobs</description>
</property>

<property>
    <name>hive.downloaded.resources.dir</name>
    <value>/opt/moudle/hive/tmp/${hive.session.id}_resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
</property>

<property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/opt/moudle/hive/tmp/${system:user.name}/operation_logs</value>
    <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>

10. Replace Hive's guava jar with the one from Hadoop

From the hive directory, copy Hadoop's guava jar into lib and delete Hive's older one:

cp /opt/moudle/hadoop/share/hadoop/common/lib/guava-27.0-jre.jar lib/
rm -f lib/guava-19.0.jar

11. Suppress INFO messages

Inside a Hive session, run:

set hive.server2.logging.operation.level=NONE;

12. Initialize the metastore schema (on hadoop12)

schematool -dbType mysql -initSchema
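If the initialization succeeded, the metastore tables should now exist in the hive database on hadoop13; a quick check:

```shell
# Run on any node that can reach hadoop13: the hive database should now
# contain metastore tables such as DBS, TBLS and VERSION.
mysql -h hadoop13 -u root -pzyl990708 -e 'use hive; show tables;'
```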

13. Start the metastore service on hadoop12

hive --service metastore &

The command will appear to hang; this is not an error. Press Enter to get the shell prompt back.
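If you would rather not have the service tied to your terminal, a common alternative (a sketch; the log path is just an example) is to run it under nohup and confirm the Thrift port is listening:

```shell
# Detach the metastore from the terminal and keep its output in a log file.
nohup hive --service metastore > /opt/moudle/hive/metastore.log 2>&1 &
# The metastore listens on the port configured in hive.metastore.uris (9083).
netstat -ntlp | grep 9083    # or: ss -ntlp | grep 9083
```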

14. Start Hive on hadoop11

hive
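A short smoke test (the database and table names here are arbitrary examples) verifies that the client on hadoop11 really talks to the metastore on hadoop12:

```shell
# Create and drop a throwaway database; if this round-trips, the
# client -> metastore -> MySQL chain is working end to end.
hive -e "
create database if not exists smoke_test;
use smoke_test;
create table if not exists t1 (id int, name string);
show tables;
drop database smoke_test cascade;
"
```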

Reference: National College Student Big Data Skills Competition
