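1. Kerberos is a computer network authentication protocol that lets parties communicating over a non-secure network prove their identity to each other securely. See the official website for details.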
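2. Packages to install (on CentOS). The SASL headers come from cyrus-sasl-devel; libsasl2-dev is the Debian/Ubuntu name for that package and is not available through yum.
yum install gcc-c++ python-devel.x86_64 cyrus-sasl-devel.x86_64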
yum install krb5-devel
yum install python-krbV
pip install krbcontext==0.9
pip install thrift==0.9.3
pip install thrift-sasl==0.2.1
pip install impyla==0.14.1
pip install hdfs[kerberos]
pip install pykerberos==1.2.1
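Once the installs finish, a quick sanity check is to ask Python whether it can find each package. This is a minimal sketch, assuming the usual import names (impala for impyla, thrift_sasl for thrift-sasl, kerberos for pykerberos). missing_modules is a hypothetical helper name, not part of any of these packages.

```python
import importlib.util

# Import names assumed for the pip packages installed above.
REQUIRED = ["krbcontext", "thrift", "thrift_sasl", "impala", "hdfs", "kerberos"]


def missing_modules(names):
    """Return the top-level module names that Python cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("missing:", missing if missing else "none")
```

An empty result only means the modules were found; it does not prove that the Kerberos setup itself works.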
3. /etc/krb5.conf configuration: set the realm (domain) that your server belongs to in this file.
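To give an idea of what ends up in that file, here is a small sketch that renders a minimal krb5.conf for a single realm. The realm EXAMPLE.COM and the host kdc.example.com are placeholders, render_krb5_conf is a hypothetical helper, and the sketch assumes the DNS domain is simply the lower-cased realm name, which may not hold for your site.

```python
def render_krb5_conf(realm, kdc_host):
    """Build a minimal krb5.conf body for one realm served by one KDC."""
    realm = realm.upper()    # realm names are upper case by convention
    domain = realm.lower()   # assumption: DNS domain mirrors the realm name
    return (
        "[libdefaults]\n"
        f"    default_realm = {realm}\n"
        "\n"
        "[realms]\n"
        f"    {realm} = {{\n"
        f"        kdc = {kdc_host}\n"
        f"        admin_server = {kdc_host}\n"
        "    }\n"
        "\n"
        "[domain_realm]\n"
        f"    .{domain} = {realm}\n"
        f"    {domain} = {realm}\n"
    )


if __name__ == "__main__":
    # Placeholder values; use the realm and KDC host of your own cluster.
    print(render_krb5_conf("EXAMPLE.COM", "kdc.example.com"))
```

A real cluster usually needs more settings than this, so treat the output as a starting point to compare against the file your administrator provides.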
4. /etc/hosts configuration: add entries for the cluster machines and for the machine that hosts the realm (typically the KDC).
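A quick way to confirm the /etc/hosts entries took effect is to check that every name resolves. This is a minimal sketch; unresolved_hosts is a hypothetical helper and the host names in the usage comment are placeholders.

```python
import socket


def unresolved_hosts(hostnames, resolver=socket.gethostbyname):
    """Return the hostnames that the resolver cannot map to an address."""
    failed = []
    for name in hostnames:
        try:
            resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed


# Usage (placeholder names, replace with your cluster nodes and KDC host):
#   unresolved_hosts(["namenode.example.com", "kdc.example.com"])
```

The resolver argument is there only so the lookup can be swapped out, for example when trying the function on a machine without access to the cluster.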
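5. Obtain credentials: either run kinit to generate a ccache file (ticket cache), or get a keytab file. Note that kinit itself does not create keytabs; those are produced with tools such as kadmin or ktutil, and kinit can then log in from one.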
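If you want to drive kinit from a script rather than typing it, something like the following may help. It is a sketch that assumes the MIT Kerberos command-line tools (kinit, klist) are on the PATH. The function names are hypothetical, and the principal and paths you pass in are your own.

```python
import subprocess


def kinit_command(principal, keytab_path, ccache_path=None):
    """Build the kinit argument list for a keytab-based login."""
    cmd = ["kinit", "-k", "-t", keytab_path]
    if ccache_path is not None:
        cmd += ["-c", ccache_path]  # store the ticket in this ccache file
    cmd.append(principal)
    return cmd


def login(principal, keytab_path, ccache_path=None):
    """Run kinit; raises CalledProcessError if the login fails."""
    subprocess.run(kinit_command(principal, keytab_path, ccache_path), check=True)


def has_valid_ticket(ccache_path=None):
    """True if `klist -s` exits with status 0 for the given (or default) cache."""
    cmd = ["klist", "-s"]
    if ccache_path is not None:
        cmd.append(ccache_path)
    return subprocess.run(cmd).returncode == 0
```

Keeping the command construction separate from the subprocess call makes it easy to print the exact command and run it by hand when a login fails.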
6. Code to connect to Hive:
import os
from impala.dbapi import connect
from krbcontext import krbcontext

# The keytab is expected next to this script; 'xxx' values are placeholders
keytab_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'xxx.keytab')
principal = 'xxx'
ip = 'xxx'  # HiveServer2 host (placeholder)

# krbcontext obtains a ticket from the keytab for the duration of the block
with krbcontext(using_keytab=True, principal=principal, keytab_file=keytab_path):
    conn = connect(host=ip, port=10000, auth_mechanism='GSSAPI', kerberos_service_name='hive')
    cursor = conn.cursor()
    cursor.execute('SELECT * FROM default.books')
    for row in cursor:
        print(row)
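Because impyla exposes a DB-API 2.0 (PEP 249) interface, the query part can be pulled into a small helper that also closes the cursor. The sketch below is not part of impyla. fetch_all is a hypothetical name, and sqlite3 is used only as a local stand-in so the helper can be tried without a cluster.

```python
import sqlite3  # stand-in for a local demo only; not needed with Hive
from contextlib import closing


def fetch_all(conn, sql):
    """Run one query on a DB-API connection and return all rows as a list."""
    with closing(conn.cursor()) as cursor:
        cursor.execute(sql)
        return cursor.fetchall()


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE books (title TEXT)")
    demo.execute("INSERT INTO books VALUES ('demo')")
    print(fetch_all(demo, "SELECT * FROM books"))
```

With Hive, the same call would go inside the krbcontext block, passing the connection returned by connect. Note that fetchall loads the whole result set into memory, so iterating over the cursor is the safer choice for large tables.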
7. Code to connect to HDFS:
from hdfs.ext.kerberos import KerberosClient
from krbcontext import krbcontext

# Fragment from inside a class method: host, port (a string), path and
# sso_ticket are defined elsewhere; _get_keytab and _save_keytab are my own helpers
hdfs_url = 'http://' + host + ':' + port
data = self._get_keytab(sso_ticket)
self._save_keytab(data)
with krbcontext(using_keytab=True, keytab_file=self.keytab_file, principal=self.user):
    self.client = KerberosClient(hdfs_url)
    # List the files and directories under path
    self.client._list_status(path).json()['FileStatuses']['FileStatus']
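The list returned under ['FileStatuses']['FileStatus'] follows the WebHDFS LISTSTATUS layout, where each entry carries a pathSuffix (the name) and a type of FILE or DIRECTORY. Here is a minimal sketch for separating the two. split_entries is a hypothetical helper, and the sample payload is hand-written and trimmed to the two fields used.

```python
def split_entries(payload):
    """Split a LISTSTATUS payload into (file names, directory names)."""
    statuses = payload["FileStatuses"]["FileStatus"]
    files = [s["pathSuffix"] for s in statuses if s["type"] == "FILE"]
    dirs = [s["pathSuffix"] for s in statuses if s["type"] == "DIRECTORY"]
    return files, dirs


# Hand-written sample; real responses contain more fields per entry.
sample = {"FileStatuses": {"FileStatus": [
    {"pathSuffix": "a.txt", "type": "FILE"},
    {"pathSuffix": "logs", "type": "DIRECTORY"},
]}}

print(split_entries(sample))  # (['a.txt'], ['logs'])
```

The hdfs package also has a public client.list(path, status=True) call, which may be a better fit than the underscore-prefixed _list_status, since private methods can change between releases.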
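8. Notes: the krbcontext package is officially described as supporting Python 2, but it also works on Python 3.
The hdfs_url must include the "http://" prefix; otherwise an error is raised.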
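One way to avoid forgetting the prefix is to build the URL through a tiny helper. This is only a sketch: build_hdfs_url is a hypothetical name, and the host and port in the usage comment are placeholders.

```python
def build_hdfs_url(host, port):
    """Return a WebHDFS base URL, adding http:// when the host has no scheme."""
    if not host.startswith(("http://", "https://")):
        host = "http://" + host
    return f"{host}:{port}"


# Usage (placeholder host and port):
#   build_hdfs_url("namenode.example.com", 50070)
```

Formatting the port with an f-string also means it can be passed as an int or a str, which sidesteps the TypeError that plain string concatenation raises for an integer port.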
9. I have since added more notes on the configuration files; see my newer post for details.