1. Authentication
- Kerberos
2. Proxy user
3. Secure DataNode
- Because the DataNode data transfer protocol does not use the Hadoop RPC framework, DataNodes must authenticate themselves using privileged ports which are specified by dfs.datanode.address and dfs.datanode.http.address.
- (1) Configure dfs.datanode.address to use a privileged port (the port number must be less than 1024), then set the environment variables HADOOP_SECURE_DN_USER (e.g., hdfs) and JSVC_HOME in hadoop-env.sh.
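A minimal sketch of the hadoop-env.sh settings described in (1); the user name `hdfs` and the jsvc path are assumptions for illustration and depend on your installation:

```shell
# hadoop-env.sh (privileged-port / jsvc approach)
# The DataNode starts as root to bind privileged ports, then drops to this user.
export HADOOP_SECURE_DN_USER=hdfs
# Directory containing the jsvc binary; adjust to your installation (assumption).
export JSVC_HOME=/usr/bin
```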
- (2) As of version 2.6.0, SASL (Simple Authentication and Security Layer) can be used to authenticate the data transfer protocol instead of privileged ports.
- In hdfs-site.xml, set dfs.data.transfer.protection (to authentication, integrity, or privacy), set dfs.datanode.address to a non-privileged port, and set dfs.http.policy to HTTPS_ONLY.
- Also ensure that the HADOOP_SECURE_DN_USER environment variable is not set in hadoop-env.sh.
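The SASL-based setup in (2) can be sketched as the following hdfs-site.xml fragment; the port number is an assumption, chosen only to be above 1024:

```xml
<!-- hdfs-site.xml: SASL-authenticated data transfer (Hadoop 2.6.0+) -->
<property>
  <name>dfs.data.transfer.protection</name>
  <value>integrity</value> <!-- authentication | integrity | privacy -->
</property>
<property>
  <name>dfs.datanode.address</name>
  <value>0.0.0.0:10019</value> <!-- non-privileged port (assumption) -->
</property>
<property>
  <name>dfs.http.policy</name>
  <value>HTTPS_ONLY</value>
</property>
```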
4. Data confidentiality
- Data Encryption on RPC
The data transferred between Hadoop services and clients can be encrypted on the wire. Setting hadoop.rpc.protection to privacy in core-site.xml activates data encryption.
- Data Encryption on Block data transfer
Set dfs.encrypt.data.transfer to true in hdfs-site.xml in order to activate data encryption for the block data transfer protocol.
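A combined sketch of both encryption settings above; the property names and values follow the steps described, while everything else in a real deployment remains site-specific:

```xml
<!-- core-site.xml: encrypt Hadoop RPC traffic on the wire -->
<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value> <!-- privacy = authentication + integrity + encryption -->
</property>

<!-- hdfs-site.xml: encrypt block data transfer between clients and DataNodes -->
<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>
```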