[Spark] Basic Usage of Spark beeline

可乐大牛

Category: Big Data

Article link: https://blog.csdn.net/qq_44173974/article/details/113803698

Procedure

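Spark Thrift Server is a Thrift service that the Spark community implemented on top of
HiveServer2, with the goal of being seamlessly compatible with HiveServer2. Because its
interface and protocol are identical to those of HiveServer2, once Spark Thrift Server is
deployed you can use Hive's beeline directly to connect to it and run statements.
Spark Thrift Server is only meant to replace HiveServer2, so it still interacts with the
Hive Metastore to obtain Hive's metadata.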
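To connect to the Thrift Server, go through the following steps:
➢ For Spark to take over Hive, copy hive-site.xml into Spark's conf/ directory
➢ Copy the MySQL driver into Spark's jars/ directory
➢ If HDFS cannot be reached, copy core-site.xml and hdfs-site.xml into the conf/ directory
➢ Start the Thrift Server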

sbin/start-thriftserver.sh

➢ Connect to the Thrift Server with beeline

bin/beeline -u jdbc:hive2://linux1:10000 -n root
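
The file-copy steps above can be sketched as one small shell function. This is only an illustration under stated assumptions: the function name stage_spark_hive_files and its four arguments (Spark home, Hive conf directory, Hadoop conf directory, path to the MySQL driver jar) are hypothetical and do not come from the original setup, so adjust the paths to your own installation.

```shell
# Sketch: stage the files Spark needs before the Thrift Server is started.
# Every path is passed in by the caller; none of them is taken from the article.
# Usage: stage_spark_hive_files SPARK_HOME HIVE_CONF_DIR HADOOP_CONF_DIR MYSQL_JAR
stage_spark_hive_files() {
  stage_spark_home="$1"
  stage_hive_conf="$2"
  stage_hadoop_conf="$3"
  stage_mysql_jar="$4"

  # hive-site.xml lets Spark find the Hive Metastore.
  cp "$stage_hive_conf/hive-site.xml" "$stage_spark_home/conf/" || return 1

  # The MySQL JDBC driver is needed when the metastore database is MySQL.
  cp "$stage_mysql_jar" "$stage_spark_home/jars/" || return 1

  # Only needed when Spark cannot reach HDFS; files that are absent are skipped.
  for stage_file in core-site.xml hdfs-site.xml; do
    if [ -f "$stage_hadoop_conf/$stage_file" ]; then
      cp "$stage_hadoop_conf/$stage_file" "$stage_spark_home/conf/" || return 1
    fi
  done
}
```

After staging the files, start the server with sbin/start-thriftserver.sh and connect with beeline as shown in the steps.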

Note

With Spark Thrift Server running, there is no need to start the HiveServer2 service as well.
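
Since beeline talks to Spark Thrift Server through the same jdbc:hive2:// URL it would use for HiveServer2, a small helper can build that URL for scripted checks. This is a minimal sketch: the function name thrift_jdbc_url is hypothetical, and the default port 10000 is simply the one used in the connection command in this post.

```shell
# Sketch: build the JDBC URL that beeline uses to reach the Thrift Server.
# Usage: thrift_jdbc_url HOST [PORT] [DATABASE]
thrift_jdbc_url() {
  # Fall back to port 10000 when no port is given.
  jdbc_url="jdbc:hive2://$1:${2:-10000}"
  # Append the database name only when one was supplied.
  if [ -n "${3:-}" ]; then
    jdbc_url="$jdbc_url/$3"
  fi
  printf '%s\n' "$jdbc_url"
}

# Example (needs a running Thrift Server, so it is left commented out):
# bin/beeline -u "$(thrift_jdbc_url linux1 10000)" -n root -e 'show databases;'
```

The -e option makes beeline run a single statement and exit, which is a convenient way to confirm that the server answers and can see the Hive databases.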

Possible issues

If running a select statement fails with Class com.hadoop.compression.lzo.LzoCodec not found, see the linked reference.
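
One common cause of this error, though not necessarily the one in this case, is that core-site.xml lists the LZO codecs while no hadoop-lzo jar is on Spark's classpath. The sketch below checks for such a jar. The function name has_lzo_jar is hypothetical, and it assumes the jar follows the usual hadoop-lzo*.jar naming and would live in Spark's jars/ directory.

```shell
# Sketch: report whether a hadoop-lzo jar is present under SPARK_HOME/jars.
# Assumption: the jar is named hadoop-lzo*.jar.
# Usage: has_lzo_jar SPARK_HOME
has_lzo_jar() {
  for lzo_candidate in "$1"/jars/hadoop-lzo*.jar; do
    # When the glob matches nothing it stays literal, so test for a real file.
    if [ -f "$lzo_candidate" ]; then
      return 0
    fi
  done
  return 1
}

# Example:
# has_lzo_jar /path/to/spark || echo "hadoop-lzo jar not found in jars/"
```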

Configuration files

My configuration files are included below.

hive-site.xml

        <!-- Show the current database and print column headers in query results -->
        <property>
          <name>hive.cli.print.header</name>
          <value>true</value>
        </property>

        <property>
          <name>hive.cli.print.current.db</name>
          <value>true</value>
        </property>


        <!-- Disable metastore schema verification -->
        <property>
          <name>hive.metastore.schema.verification</name>
          <value>false</value>
        </property>


        <!-- Use Tez as the execution engine -->
        <property>
          <name>hive.execution.engine</name>
          <value>tez</value>
        </property>

core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>

<!-- Address of the HDFS NameNode -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop102:9000</value>
</property>

<!-- Storage directory for files Hadoop generates at runtime -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/module/hadoop-2.7.2/data/tmp</value>
</property>


<property>
<name>io.compression.codecs</name>
<value>
org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.DefaultCodec,
org.apache.hadoop.io.compress.BZip2Codec,
org.apache.hadoop.io.compress.SnappyCodec,
com.hadoop.compression.lzo.LzoCodec,
com.hadoop.compression.lzo.LzopCodec
</value>
</property>

<property>
    <name>io.compression.codec.lzo.class</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>



</configuration>
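
After copying core-site.xml into Spark's conf/ directory, it can help to confirm which NameNode address Spark will actually pick up. The following is a rough sketch of a reader for Hadoop-style site files. The function name get_site_property is hypothetical, and the sketch assumes each `<name>` and its `<value>` sit on their own single lines, as fs.defaultFS does above; it does not handle multi-line values such as io.compression.codecs.

```shell
# Sketch: print the value of one property from a Hadoop-style *-site.xml file.
# Assumption: <name>...</name> and <value>...</value> are each on a single line.
# Usage: get_site_property FILE PROPERTY_NAME
get_site_property() {
  awk -v key="$2" '
    /<name>/ {
      # Strip everything around the property name and compare it to the key.
      line = $0
      sub(/.*<name>[ \t]*/, "", line)
      sub(/[ \t]*<\/name>.*/, "", line)
      found = (line == key)
      next
    }
    found && /<value>.*<\/value>/ {
      # First single-line value after the matching name.
      line = $0
      sub(/.*<value>[ \t]*/, "", line)
      sub(/[ \t]*<\/value>.*/, "", line)
      print line
      exit
    }
  ' "$1"
}

# Example:
# get_site_property conf/core-site.xml fs.defaultFS
```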

hdfs-site.xml

<configuration>

	<property>
		<name>dfs.replication</name>
		<value>1</value>
	</property>

	<!-- Host configuration for the Hadoop secondary NameNode -->
	<property>
		<name>dfs.namenode.secondary.http-address</name>
		<value>hadoop104:50090</value>
	</property>
	<property>
		<name>dfs.permissions</name>
		<value>false</value>
	</property>
</configuration>
