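I have recently been writing a MapReduce program that needs to read some data from a DB, but every DB in my company uses Kerberos authentication. In that situation, how is the Kerberos credential passed along, and how does Hadoop make use of Kerberos authentication? Answering that required some preliminary research into Hadoop's authentication mechanism. Hadoop defines two basic authentication methods, Simple and Kerberos, and the latter is what Hadoop treats as isSecurityEnabled. Since I had never worked with security technology before, I start from Simple authentication to get a grasp of the basic concepts and the workflow.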
JAAS
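First, Hadoop Security is built on JAAS (Java Authentication and Authorization Service), a standard JDK framework into which different LoginModule implementations can be plugged to perform different kinds of authentication. JAAS has two basic concepts: Principal and Subject.
- A Principal is the ID of one kind of identity, and within that identity it uniquely identifies a user. For example, the same person may hold several identities, such as citizen, student, and reader. The citizen ID, student ID, and reader ID are then each a Principal. Different authentication mechanisms can define different Principals to identify a user.
- A Subject is a container holding a set of security-related information about one user. Its main contents are Principals and Credentials, where the Principals are IDs such as the citizen ID and student ID mentioned above. A Subject can be passed between authentication mechanisms, and each mechanism can add the Principal it defines to that Subject. A Credential supplements a Principal, as explained in the passage quoted below.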
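To make the two concepts concrete, here is a minimal sketch built on the standard javax.security.auth.Subject class. The SubjectSketch and IdPrincipal classes, along with the sample IDs and credential values, are assumptions made up for illustration. They are not part of JAAS or Hadoop, and a real LoginModule would normally populate the Subject during login instead of doing it by hand.

```java
import java.security.Principal;
import java.util.Objects;
import javax.security.auth.Subject;

public class SubjectSketch {

    /** Hypothetical Principal: one ID within one kind of identity. */
    static final class IdPrincipal implements Principal {
        private final String kind; // e.g. "citizen", "student"
        private final String id;   // the ID within that kind

        IdPrincipal(String kind, String id) {
            this.kind = kind;
            this.id = id;
        }

        @Override
        public String getName() {
            return kind + ":" + id;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof IdPrincipal)) {
                return false;
            }
            IdPrincipal other = (IdPrincipal) o;
            return kind.equals(other.kind) && id.equals(other.id);
        }

        @Override
        public int hashCode() {
            return Objects.hash(kind, id);
        }
    }

    /** Builds one Subject that carries two Principals and two credentials. */
    static Subject buildSubject() {
        Subject subject = new Subject();
        // The same person, identified under two different identities.
        subject.getPrincipals().add(new IdPrincipal("citizen", "C-1001"));
        subject.getPrincipals().add(new IdPrincipal("student", "S-2002"));
        // Credentials are plain Objects; these values are placeholders.
        subject.getPrivateCredentials().add("sample-password".toCharArray());
        subject.getPublicCredentials().add("sample-public-certificate");
        return subject;
    }

    public static void main(String[] args) {
        Subject subject = buildSubject();
        for (Principal p : subject.getPrincipals()) {
            System.out.println("principal: " + p.getName());
        }
        System.out.println("private credentials: "
                + subject.getPrivateCredentials().size());
        System.out.println("public credentials: "
                + subject.getPublicCredentials().size());
    }
}
```

An authentication mechanism that only cares about its own Principal type can call subject.getPrincipals(IdPrincipal.class) to filter the set, which is how several mechanisms can share one Subject without stepping on each other.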
With somewhat more controversy, the JAAS designers concluded that Principals may have some sort of proof of identity that they need to be able to provide at a moment’s notice, and these proofs of identity may include sensitive information, so a set of public credentials and a set of private credentials were also added to Subject. Since the content of a credential may vary widely across authentication mechanisms, from a simple password to a fingerprint (to infinity and beyond!), the type of a credential was simply left as java.lang.Object. Relationships between Principals and credentials, if any, were left as an exercise for the implementer of