Spark SQL Configuration (Hive as the Data Source)

Hive configuration (MySQL stores the metastore data, HDFS stores the table data):

1. Edit hive-env.sh (you can copy hive-env.sh.template and modify it)

# Hadoop home directory
export HADOOP_HOME=/usr/local/hadoop
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/usr/local/hive/conf
# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/usr/local/hive/lib
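
Before editing, the env file is typically created from its template, and Hive needs the MySQL JDBC driver on its classpath to reach the metastore database. A minimal sketch (paths follow the ones above; the connector jar version is only an example):

# Create hive-env.sh from its template, then apply the edits above
cd /usr/local/hive/conf
cp hive-env.sh.template hive-env.sh

# Copy the MySQL JDBC driver into Hive's lib directory
# (the version number here is illustrative)
cp mysql-connector-java-5.1.40-bin.jar /usr/local/hive/lib/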

2. Edit hive-site.xml (you can use hive-default.xml.template as a reference)
The main settings here are the MySQL connection details:
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>yourpassword</value>
    <description>password to use against metastore database</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>Username to use against metastore database</description>
  </property>
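
Alongside the URL, user name, and password, the JDBC driver class is normally set as well; for the MySQL 5.x connector that is:

  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>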
At this point the basic Hive configuration is complete.
Then run $HIVE_HOME/bin/hive to check that Hive starts successfully.
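
If the CLI complains about a missing metastore schema (required on Hive 2.x and later), initialize it first; a quick sketch using the MySQL settings above:

# Initialize the metastore schema in MySQL (needed on Hive 2.x+)
/usr/local/hive/bin/schematool -dbType mysql -initSchema

# Smoke test: run a statement non-interactively
/usr/local/hive/bin/hive -e "show databases;"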
-------------------------------------------------------
Configure Spark
1. Edit spark-env.sh
# Size the memory settings to your machine. Note: if they are set too low, jobs will fail with no-resource errors.
export SCALA_HOME=/usr/local/scala
export JAVA_HOME=/usr/local/jdk1.8.0
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
export SPARK_MASTER_IP=master
export SPARK_WORKER_MEMORY=800m
export SPARK_EXECUTOR_MEMORY=800m
export SPARK_DRIVER_MEMORY=800m
export SPARK_WORKER_CORES=4
export MASTER=spark://master:7077

2. Configure spark-defaults.conf
spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two thr"
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://master:9000/historyserverforSpark
# Web UI for browsing Spark's completed applications (history server)
spark.yarn.historyServer.address        master:18080
spark.history.fs.logDirectory   hdfs://master:9000/historyserverforSpark 
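
Note that Spark will refuse to start an application with event logging enabled if the log directory does not exist yet, so create it in HDFS before the first run, then start the history server (assuming Spark is installed under /usr/local/spark):

# Create the event log directory configured above
hdfs dfs -mkdir -p hdfs://master:9000/historyserverforSpark

# Start the history server (web UI on port 18080, as configured above)
/usr/local/spark/sbin/start-history-server.sh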

3. Configure slaves (two worker nodes are configured here, as started below)
slave1
slave2
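
With the master and both workers configured, the standalone cluster can be brought up from the master node; assuming Spark lives under /usr/local/spark:

# Start the master plus every worker listed in conf/slaves
/usr/local/spark/sbin/start-all.sh

# Verify: jps should show a Master process here and a Worker on each slave;
# the master web UI is served at http://master:8080
jps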
-------------------------------------------------------
Add a hive-site.xml under spark/conf with the following content:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://master:9083</value>
    <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
  </property>
</configuration>

4. Start the Hive metastore service
hive --service metastore
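
The metastore service runs in the foreground and blocks the terminal, so in practice it is usually backgrounded, e.g.:

# Run the metastore in the background; it listens on port 9083,
# matching the thrift://master:9083 URI configured above
nohup hive --service metastore > metastore.log 2>&1 &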
5. Start Spark SQL
./bin/spark-sql
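
Once spark-sql starts, Hive's databases and tables should be visible through the shared metastore; a quick non-interactive check:

# List the databases Spark SQL sees via the Hive metastore
./bin/spark-sql -e "show databases;"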
