Connecting to Hive from Python

万物具理

Last modified: 2023-11-27 21:12:04

Tags: hive, hadoop, data warehouse

First published: 2023-11-27 17:42:22

Article link: https://blog.csdn.net/m0_73905064/article/details/134650175

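1: I first started thinking about this over the summer break. I was writing Hive programs for data analysis out of boredom, but found that working with Hive statements alone was awkward and the programs were hard to write. So I wondered whether Hive could be connected to from Python the way MySQL can, especially since Python is better suited to data analysis. I looked for a way at the time but could not get it working; in hindsight, the Python package I tried probably needed a lot of environment setup and was fairly complicated. These past few days I have been unwell and thought it over again, and found that Python has other packages for connecting to Hive. After some trial and error it worked, so I am sharing the method here.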

2: Set up the local environment (my machine uses Anaconda; if Anaconda is installed on the server, run all of these commands on the server)

Install pure-sasl
pip install pure-sasl
Install thrift_sasl
pip install thrift_sasl==0.2.1 --no-deps
Install thrift
pip install thrift
Finally, install impyla
pip install impyla
pip install thriftpy
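
After installing, it can help to confirm that the packages are visible to the same interpreter that will run the connection code. The following is a minimal sketch and not part of the original steps: the helper name `missing_modules` is made up for illustration, and the import names listed (`puresasl`, `thrift_sasl`, `thrift`, `impala`) are the module names these packages are assumed to expose.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for pure-sasl, thrift_sasl, thrift and impyla.
required = ["puresasl", "thrift_sasl", "thrift", "impala"]
missing = missing_modules(required)
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("All required modules were found.")
```

If something shows up as missing, the pip commands were probably run in a different environment than the one the IDE is using.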

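3: Set up the server environment

# Start Hadoop
start-all.sh 
# Start the two Hive services (metastore and HiveServer2)
nohup /export/hive/bin/hive --service metastore &
nohup /export/hive/bin/hive --service hiveserver2 &

# Loosen HDFS permissions; this avoids permission errors when loading data from Python.
# Note that 777 gives every user full access.
hadoop fs -chmod -R 777 /user/hive/warehouse

hadoop fs -chmod -R 777 /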
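
HiveServer2 often needs some time after launch before it accepts connections, so a quick reachability check from the client side can save some confusion. The sketch below only tests whether a TCP port is open; the helper name `port_open` is hypothetical, and it assumes the server is reachable under the hostname and port used later in this post (`master`, 10000).

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and hostname resolution failures.
        return False
```

Calling `port_open("master", 10000)` from the client machine should return `True` once HiveServer2 is listening. An open port only shows that something is listening there; it does not prove that authentication or queries will work.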

4: Write the Python code to connect to Hive in DataSpell

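from impala.dbapi import connect

# host is the server's hostname; the database argument is optional
conn = connect(host='master', port=10000, database='db1', auth_mechanism='PLAIN')

# This form also works:
# conn = connect(host='master', port=10000, auth_mechanism='PLAIN', user="root", password="123456", database="db1")

cur = conn.cursor()

# List the databases
cur.execute('SHOW DATABASES')
print(cur.fetchall())

# List the tables
cur.execute('SHOW TABLES')
print(cur.fetchall())


# Query the rows in tb1. This works for me because the table exists and has data;
# substitute whatever database and table you have.
cur.execute('SELECT * FROM db1.tb1')
print(cur.fetchall())


# Create the database db2, wrapped in try/except in case it fails
# (with IF NOT EXISTS, an already existing database is not an error).
# Ctrl+Alt+T is the shortcut for wrapping code in a block.
try:
    cur.execute('CREATE DATABASE IF NOT EXISTS db2')
except Exception as e:
    print("Failed to create database:", e)

# Close the cursor and the connection
cur.close()
conn.close()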
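
Since the goal is data analysis, it is often handier to get rows back keyed by column name than as bare tuples. The sketch below is an assumption-laden illustration rather than part of the original code: `fetch_dicts` is a hypothetical helper that relies only on the standard DB-API 2.0 cursor interface (`execute`, `description`, `fetchall`, `close`), which impyla cursors are expected to follow. Because a Hive server is not available everywhere, the demo uses an in-memory `sqlite3` database as a stand-in.

```python
import sqlite3
from contextlib import closing


def fetch_dicts(conn, sql):
    """Run sql on a DB-API connection and return rows as dicts keyed by column name."""
    # closing() makes sure the cursor is closed even if the query raises.
    with closing(conn.cursor()) as cur:
        cur.execute(sql)
        columns = [col[0] for col in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]


# Stand-in demo: sqlite3 replaces the Hive connection here.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE tb1 (id INTEGER, name TEXT)")
demo.execute("INSERT INTO tb1 VALUES (1, 'a'), (2, 'b')")
rows = fetch_dicts(demo, "SELECT * FROM tb1 ORDER BY id")
print(rows)
demo.close()
```

With a Hive connection, the same call would look like `fetch_dicts(conn, 'SELECT * FROM db1.tb1')`. Depending on the server settings, Hive may return column names prefixed with the table name.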
