I set this environment up on a single machine. Last time, once it was up I just moved on, but recently I had to build it again... and there were still pitfalls, so I decided to write down the relevant configuration.
1. Download the Linux build of the JDK and install it. Be sure to pick the Linux version, otherwise it won't be recognized.
2. Download Hadoop and Hive, and pay attention to the versions; in general, avoid the very latest release. This time I used 3.1.2 for both.
3. Configure /etc/profile
# /etc/profile
#jdk
export JAVA_HOME=/root/bigdata/jdk1.8.0
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
#hadoop
export HADOOP_HOME=/root/bigdata/hadoop-3.1.2
export PATH=$HADOOP_HOME/bin:$PATH
#hive
export HIVE_HOME=/root/bigdata/hive-3.1.2
export PATH=$PATH:$HIVE_HOME/bin:$PATH
#mysql
export PATH=/root/bigdata/mysql-8.0.19/bin:$PATH
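The file is only read by new login shells, so apply it and sanity-check the tools (assuming the paths above match where you actually unpacked everything):
# source /etc/profile
# java -version
# hadoop version
# hive --version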
4. Configure Hadoop (the files below live under $HADOOP_HOME/etc/hadoop)
4.1 vi hadoop-env.sh
export JAVA_HOME=/root/bigdata/jdk1.8.0
# sshd on this machine listens on a non-default port (2248), so the start scripts need to know
export HADOOP_SSH_OPTS="-p 2248"
4.2 vi core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<!-- put your own IP here; the port is the default -->
<value>hdfs://192.38.248.149:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<!-- put your own working directory for Hadoop here -->
<value>/root/bigdata/hadoop-3.1.2/tmp</value>
</property>
<property>
<name>hadoop.native.lib</name>
<value>false</value>
<description>Should native hadoop libraries, if present, be used.
</description>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
</configuration>
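With HADOOP_HOME on the PATH, you can confirm Hadoop actually picked up this file, since hdfs getconf prints the effective configuration:
# hdfs getconf -confKey fs.defaultFS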
4.3 vi hdfs-site.xml
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/root/bigdata/hadoop-3.1.2/dfs/name</value>
<description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
</property>
<property>
<name>dfs.data.dir</name>
<value>/root/bigdata/hadoop-3.1.2/dfs/data</value>
<description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.secondary.http.address</name>
<!-- put your own IP here (50090 is the old Hadoop 2 default port for the secondary namenode) -->
<value>192.38.248.149:50090</value>
</property>
</configuration>
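These two paths must exist and be writable by the user running the daemons; namenode -format creates the name directory itself, but creating both up front (assuming the same /root/bigdata layout as above) avoids surprises:
# mkdir -p /root/bigdata/hadoop-3.1.2/dfs/name /root/bigdata/hadoop-3.1.2/dfs/data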
4.4 vi yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.hostname</name>
<!-- your own IP; the ports stay at their defaults -->
<value>192.38.248.149</value>
</property>
<!-- how reducers fetch data -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
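Once YARN is running (step 7), a quick way to confirm the NodeManager registered with the ResourceManager:
# yarn node -list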
4.5 vi mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<!-- legacy Hadoop 1 JobTracker setting; ignored when mapreduce.framework.name is yarn -->
<name>mapred.job.tracker</name>
<value>192.38.248.149:9001</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/root/bigdata/hadoop-3.1.2/var</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>/root/bigdata/hadoop-3.1.2/share/hadoop/mapreduce/*,/root/bigdata/hadoop-3.1.2/share/hadoop/mapreduce/lib/*</value>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx1000m</value>
</property>
<property>
<name>mapreduce.task.io.sort.mb</name>
<value>300</value>
</property>
</configuration>
5. Passwordless SSH
Set up passwordless SSH login, otherwise you will be prompted for the login password over and over.
# ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# chmod 0600 ~/.ssh/authorized_keys
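Since sshd here listens on port 2248 (see hadoop-env.sh above), test with the same port; it should log you in without asking for a password:
# ssh -p 2248 localhost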
6. Format Hadoop
Format the file system (run once, from $HADOOP_HOME):
# bin/hdfs namenode -format
7. Start Hadoop
sbin/start-dfs.sh
Startup failed with:
Starting namenodes on [localhost]
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
Starting secondary namenodes [admin]
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation
Add the following near the top of both start-dfs.sh and stop-dfs.sh:
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
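The jps output below also shows ResourceManager and NodeManager, so YARN was started too (sbin/start-yarn.sh). When running as root, the Hadoop 3 YARN scripts fail the same way, so start-yarn.sh and stop-yarn.sh need the analogous lines:
YARN_RESOURCEMANAGER_USER=root
YARN_NODEMANAGER_USER=root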
After everything is up, verify with jps:
# jps
11264 NodeManager
11138 ResourceManager
11588 Jps
10442 SecondaryNameNode
10090 NameNode
10220 DataNode
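At this point a smoke test is worth the minute it takes. A minimal sketch, assuming the stock examples jar that ships with Hadoop 3.1.2; the pi job exercises HDFS plus the whole YARN/MapReduce path:
# hdfs dfs -ls /
# hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar pi 2 10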
8. Hive configuration
8.1 vi conf/hive-site.xml
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<!-- JDBC driver class, i.e. the MySQL driver -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
</property>
<!-- MySQL username -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<!-- MySQL password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.local</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/root/bigdata/hive-3.1.2/warehouse</value>
</property>
<property>
<name>hive.warehouse.subdir.inherit.perms</name>
<value>true</value>
<description>
inheriting the permission of the warehouse or database directory.
</description>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>localhost</value>
</property>
<property>
<name>beeline.hs2.connection.user</name>
<value>hive</value>
</property>
<property>
<name>beeline.hs2.connection.password</name>
<value>hive</value>
</property>
<!--
<property>
<name>beeline.hs2.connection.user</name>
<value>root</value>
</property>
<property>
<name>beeline.hs2.connection.password</name>
<value>root</value>
</property>
-->
<!-- authorization settings -->
<property>
<name>hive.security.authorization.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.security.authorization.createtable.owner.grants</name>
<value>ALL</value>
</property>
<property>
<name>hive.security.authorization.task.factory</name>
<value>org.apache.hadoop.hive.ql.parse.authorization.HiveAuthorizationTaskFactoryImpl</value>
</property>
<property>
<name>hive.metastore.authorization.storage.checks</name>
<value>true</value>
<description>Should the metastore do authorization checks against
the underlying storage for operations like drop-partition (disallow
the drop-partition if the user in question doesn't have permissions
to delete the corresponding directory on the storage).</description>
</property>
<property>
<name>hive.server2.authentication</name>
<value>CUSTOM</value>
</property>
<property>
<name>hive.server2.custom.authentication.class</name>
<!-- user-supplied class: it must implement org.apache.hive.service.auth.PasswdAuthenticationProvider and sit on HiveServer2's classpath; presumably it checks logins against the file configured below -->
<value>CustomHiveServer2Auth</value>
</property>
<property>
<name>hive.server2.custom.authentication.file</name>
<value>/root/bigdata/hive-3.1.2/conf/hive.server2.users.conf</value>
</property>
</configuration>
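Two things bite before first start. The MySQL JDBC driver jar (mysql-connector-java, providing the com.mysql.cj.jdbc.Driver class named above) must be copied into $HIVE_HOME/lib, and while datanucleus.schema.autoCreateAll=true can create the metastore tables on demand, initializing the schema explicitly is more predictable. A sketch, assuming the connector jar version matches the MySQL 8.0.19 install and sits in /root/bigdata:
# cp /root/bigdata/mysql-connector-java-8.0.19.jar $HIVE_HOME/lib/
# schematool -dbType mysql -initSchema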
8.2 vi conf/hive-env.sh
export JAVA_HOME=/root/bigdata/jdk1.8.0
export HIVE_HOME=/root/bigdata/hive-3.1.2
export HADOOP_HOME=/root/bigdata/hadoop-3.1.2
export HIVE_CONF_DIR=/root/bigdata/hive-3.1.2/conf
8.3 Start Hive
Run both services in the background, metastore first, then hiveserver2:
nohup hive --service metastore &
nohup hive --service hiveserver2 &
Connect with beeline: run bin/beeline, then issue !connect at its prompt (connect to whatever host HiveServer2 is actually bound to; hive.server2.thrift.bind.host above is localhost):
bin/beeline
beeline> !connect jdbc:hive2://172.16.145.124:10000 hive hive
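Equivalently, beeline can connect and run a query non-interactively; a quick smoke test (the hive/hive credentials are presumably the ones defined in hive.server2.users.conf):
bin/beeline -u jdbc:hive2://172.16.145.124:10000 -n hive -p hive -e "show databases;"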