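Prerequisite: Hive's default username and password are empty and have not been changed here. They can be configured in hive-site.xml.
(1) Install the required Python libraries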
pip install sasl
pip install thrift
pip install thrift-sasl
pip install PyHive
While installing sasl, you may run into the following error:
error: command 'gcc' failed with exit status
Fix: on Ubuntu you may need to install libsasl2-dev first; on CentOS you need to install python-devel and cyrus-sasl-devel first. After that, run pip install sasl again.
yum install gcc-c++ python-devel.x86_64 cyrus-sasl-devel.x86_64
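Once the installs finish, it can help to confirm that all four packages are actually importable before moving on. The following is a minimal sketch, not part of the original steps; the helper name missing_modules is made up for illustration, and the import names (sasl, thrift, thrift_sasl, pyhive) are the module names these pip packages normally install under.

```python
import importlib.util

# Import names assumed for the four pip packages installed above.
REQUIRED = ["sasl", "thrift", "thrift_sasl", "pyhive"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example use:
#   missing = missing_modules(REQUIRED)
#   if missing:
#       print("Not installed:", ", ".join(missing))
```

If sasl shows up as missing, the gcc error above is the most likely cause.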
(2) Start the Hive metastore and hiveserver2
hive --service metastore &
hive --service hiveserver2 &
When hiveserver2 starts normally it listens on port 10000 by default. You can check whether it has started with the following command:
netstat -anp | grep 10000
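The same check can be done from Python, which is handy if the script runs on a different machine from hiveserver2. This is only a rough sketch using the standard socket module; the function name is_port_open is hypothetical, and a successful TCP connect only shows that something is listening on the port, not that it is hiveserver2.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


# Example use, with the host name and default port from this post:
#   is_port_open('hadoop000', 10000)
```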
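(3) Write the Python script
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from pyhive import hive

# Connect to hiveserver2 (no username/password, per the prerequisite above)
conn = hive.Connection(host='hadoop000', port=10000, database='***')
cursor = conn.cursor()
cursor.execute('select * from user_log limit 10')
# Print each returned row
for each in cursor.fetchall():
    print(each)
cursor.close()
conn.close()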
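For queries that return many more than 10 rows, fetchall() loads everything into memory at once. One possible variation, sketched here under the assumption that the cursor follows the standard Python DB-API (execute and fetchmany), is to read rows in batches. The names iter_rows and print_query and the batch size are made up for this example, and the connection arguments are passed in rather than guessed.

```python
def iter_rows(cursor, query, batch_size=1000):
    """Run query on a DB-API cursor and yield rows one batch at a time."""
    cursor.execute(query)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for row in rows:
            yield row


def print_query(host, port, database, query):
    """Hypothetical wrapper: connect with PyHive, print every row, then close."""
    # Imported here so that iter_rows can be used without PyHive installed.
    from pyhive import hive

    conn = hive.Connection(host=host, port=port, database=database)
    try:
        for row in iter_rows(conn.cursor(), query):
            print(row)
    finally:
        conn.close()
```

Calling print_query with the host, port and database from the script above and the same select statement should print the same rows, just without holding them all in a list first.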