4. Hadoop HDFS 2.x High Availability Setup

Architecture Overview

HDFS 2.x HA

HDFS High Availability Using the Quorum Journal Manager

(Architecture diagram for HDFS 2.x HA with the Quorum Journal Manager; original image unavailable)

Setup Overview

Host      NN-1  NN-2  DN  ZK  ZKFC  JNN
node01    *     -     -   -   *     *
node02    -     *     *   *   *     *
node03    -     -     *   *   -     *
node04    -     -     *   *   -     -

Setup Steps

Official documentation: https://hadoop.apache.org/docs/r2.6.5/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html

  1. Install the JDK and Hadoop, and configure the environment variables (see the sketch below).
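
    A minimal sketch of the environment variables, appended to /etc/profile on every node. The Hadoop path follows the slaves path used in step 5 (/opt/hadoop-2.6.5); the JDK path is an assumption, so adjust both to your installation:

    # Assumed installation paths - adjust to your environment
    export JAVA_HOME=/usr/java/jdk1.8.0_191
    export HADOOP_HOME=/opt/hadoop-2.6.5
    export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

    # Reload so the current shell picks up the variables; daemons started over SSH
    # typically also need JAVA_HOME set in $HADOOP_HOME/etc/hadoop/hadoop-env.sh
    source /etc/profile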

  2. Set up passwordless SSH login; node01 and node02 must be able to reach each other without a password (a sketch follows).
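
    A minimal sketch for the mutual passwordless access, assuming the root user and DSA keys (the hdfs-site.xml below points dfs.ha.fencing.ssh.private-key-files at /root/.ssh/id_dsa). node01 also needs passwordless access to node02-node04 so that start-dfs.sh (step 19) can reach them:

    # On node01 and node02: generate a passphrase-less DSA key and authorize it locally
    ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
    cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
    # On node01: push the public key to the other nodes
    # (mirror this on node02, at least for node01)
    for host in node02 node03 node04; do
      ssh-copy-id -i ~/.ssh/id_dsa.pub root@$host
    done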

  3. Configure hdfs-site.xml and core-site.xml as described in the official documentation:

    Configuration details

    To configure HA NameNodes, you must add several configuration options to your hdfs-site.xml configuration file.

    The order in which you set these configurations is unimportant, but the values you choose for dfs.nameservices and dfs.ha.namenodes.[nameservice ID] will determine the keys of those that follow. Thus, you should decide on these values before setting the rest of the configuration options.

    • dfs.nameservices

    - the logical name for this new nameservice

    Choose a logical name for this nameservice, for example “mycluster”, and use this logical name for the value of this config option. The name you choose is arbitrary. It will be used both for configuration and as the authority component of absolute HDFS paths in the cluster.

    Note: If you are also using HDFS Federation, this configuration setting should also include the list of other nameservices, HA or otherwise, as a comma-separated list.

    <property>
      <name>dfs.nameservices</name>
      <value>mycluster</value>
    </property>
    
    • dfs.ha.namenodes.[nameservice ID]

    - unique identifiers for each NameNode in the nameservice

    Configure with a list of comma-separated NameNode IDs. This will be used by DataNodes to determine all the NameNodes in the cluster. For example, if you used “mycluster” as the nameservice ID previously, and you wanted to use “nn1” and “nn2” as the individual IDs of the NameNodes, you would configure this as such:

    <property>
      <name>dfs.ha.namenodes.mycluster</name>
      <value>nn1,nn2</value>
    </property>
    

    Note: Currently, only a maximum of two NameNodes may be configured per nameservice.

    • dfs.namenode.rpc-address.[nameservice ID].[name node ID]

    - the fully-qualified RPC address for each NameNode to listen on

    For both of the previously-configured NameNode IDs, set the full address and IPC port of the NameNode process. Note that this results in two separate configuration options. For example:

    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn1</name>
      <value>machine1.example.com:8020</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn2</name>
      <value>machine2.example.com:8020</value>
    </property>
    

    Note: You may similarly configure the “servicerpc-address” setting if you so desire.

    • dfs.namenode.http-address.[nameservice ID].[name node ID]

    - the fully-qualified HTTP address for each NameNode to listen on

    Similarly to rpc-address above, set the addresses for both NameNodes’ HTTP servers to listen on. For example:

    <property>
      <name>dfs.namenode.http-address.mycluster.nn1</name>
      <value>machine1.example.com:50070</value>
    </property>
    <property>
      <name>dfs.namenode.http-address.mycluster.nn2</name>
      <value>machine2.example.com:50070</value>
    </property>
    

    Note: If you have Hadoop’s security features enabled, you should also set the https-address similarly for each NameNode.

    • dfs.namenode.shared.edits.dir

    - the URI which identifies the group of JNs where the NameNodes will write/read edits

    This is where one configures the addresses of the JournalNodes which provide the shared edits storage, written to by the Active NameNode and read by the Standby NameNode to stay up-to-date with all the file system changes the Active NameNode makes. Though you must specify several JournalNode addresses, you should only configure one of these URIs. The URI should be of the form: “qjournal://host1:port1;host2:port2;host3:port3/journalId”. The Journal ID is a unique identifier for this nameservice, which allows a single set of JournalNodes to provide storage for multiple federated namesystems. Though not a requirement, it’s a good idea to reuse the nameservice ID for the journal identifier.

    For example, if the JournalNodes for this cluster were running on the machines “node1.example.com”, “node2.example.com”, and “node3.example.com” and the nameservice ID were “mycluster”, you would use the following as the value for this setting (the default port for the JournalNode is 8485):

    <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>qjournal://node1.example.com:8485;node2.example.com:8485;node3.example.com:8485/mycluster</value>
    </property>
    
    • dfs.client.failover.proxy.provider.[nameservice ID]

    - the Java class that HDFS clients use to contact the Active NameNode

    Configure the name of the Java class which will be used by the DFS Client to determine which NameNode is the current Active, and therefore which NameNode is currently serving client requests. The only implementation which currently ships with Hadoop is the ConfiguredFailoverProxyProvider, so use this unless you are using a custom one. For example:

    <property>
      <name>dfs.client.failover.proxy.provider.mycluster</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    
    • dfs.ha.fencing.methods

    - a list of scripts or Java classes which will be used to fence the Active NameNode during a failover

    It is desirable for correctness of the system that only one NameNode be in the Active state at any given time. Importantly, when using the Quorum Journal Manager, only one NameNode will ever be allowed to write to the JournalNodes, so there is no potential for corrupting the file system metadata from a split-brain scenario. However, when a failover occurs, it is still possible that the previous Active NameNode could serve read requests to clients, which may be out of date until that NameNode shuts down when trying to write to the JournalNodes. For this reason, it is still desirable to configure some fencing methods even when using the Quorum Journal Manager. However, to improve the availability of the system in the event the fencing mechanisms fail, it is advisable to configure a fencing method which is guaranteed to return success as the last fencing method in the list. Note that if you choose to use no actual fencing methods, you still must configure something for this setting, for example “shell(/bin/true)”.

    The fencing methods used during a failover are configured as a carriage-return-separated list, which will be attempted in order until one indicates that fencing has succeeded. There are two methods which ship with Hadoop: shell and sshfence. For information on implementing your own custom fencing method, see the org.apache.hadoop.ha.NodeFencer class.

    • sshfence

      - SSH to the Active NameNode and kill the process

      The sshfence option SSHes to the target node and uses fuser to kill the process listening on the service’s TCP port. In order for this fencing option to work, it must be able to SSH to the target node without providing a passphrase. Thus, one must also configure the dfs.ha.fencing.ssh.private-key-files option, which is a comma-separated list of SSH private key files. For example:

      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
      </property>
      
      <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/exampleuser/.ssh/id_rsa</value>
      </property>
      

      Optionally, one may configure a non-standard username or port to perform the SSH. One may also configure a timeout, in milliseconds, for the SSH, after which this fencing method will be considered to have failed. It may be configured like so:

      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence([[username][:port]])</value>
      </property>
      <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
      </property>
      
    • shell

      - run an arbitrary shell command to fence the Active NameNode

      The shell fencing method runs an arbitrary shell command. It may be configured like so:

      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>shell(/path/to/my/script.sh arg1 arg2 ...)</value>
      </property>
      

      The string between ‘(’ and ‘)’ is passed directly to a bash shell and may not include any closing parentheses.

      The shell command will be run with an environment set up to contain all of the current Hadoop configuration variables, with the ‘_’ character replacing any ‘.’ characters in the configuration keys. The configuration used has already had any namenode-specific configurations promoted to their generic forms – for example dfs_namenode_rpc-address will contain the RPC address of the target node, even though the configuration may specify that variable as dfs.namenode.rpc-address.ns1.nn1.

      Additionally, the following variables referring to the target node to be fenced are also available:

      $target_host           hostname of the node to be fenced
      $target_port           IPC port of the node to be fenced
      $target_address        the above two, combined as host:port
      $target_nameserviceid  the nameservice ID of the NN to be fenced
      $target_namenodeid     the namenode ID of the NN to be fenced

      These environment variables may also be used as substitutions in the shell command itself. For example:

      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>shell(/path/to/my/script.sh --nameservice=$target_nameserviceid $target_host:$target_port)</value>
      </property>
      

      If the shell command returns an exit code of 0, the fencing is determined to be successful. If it returns any other exit code, the fencing was not successful and the next fencing method in the list will be attempted.

      Note: This fencing method does not implement any timeout. If timeouts are necessary, they should be implemented in the shell script itself (eg by forking a subshell to kill its parent in some number of seconds).

    • fs.defaultFS

    - the default path prefix used by the Hadoop FS client when none is given

    Optionally, you may now configure the default path for Hadoop clients to use the new HA-enabled logical URI. If you used “mycluster” as the nameservice ID earlier, this will be the value of the authority portion of all of your HDFS paths. This may be configured like so, in your core-site.xml file:

    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://mycluster</value>
    </property>
    
    • dfs.journalnode.edits.dir

    - the path where the JournalNode daemon will store its local state

    This is the absolute path on the JournalNode machines where the edits and other local state used by the JNs will be stored. You may only use a single path for this configuration. Redundancy for this data is provided by running multiple separate JournalNodes, or by configuring this directory on a locally-attached RAID array. For example:

    <property>
      <name>dfs.journalnode.edits.dir</name>
      <value>/path/to/journal/node/local/data</value>
    </property>
    

    hdfs-site.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>2</value>
        </property>
        <property>
            <name>dfs.nameservices</name>
            <value>mycluster</value>
        </property>
        <property>
            <name>dfs.ha.namenodes.mycluster</name>
            <value>nn1,nn2</value>
        </property>
        <property>
            <name>dfs.namenode.rpc-address.mycluster.nn1</name>
            <value>node01:8020</value>
        </property>
        <property>
            <name>dfs.namenode.rpc-address.mycluster.nn2</name>
            <value>node02:8020</value>
        </property>
        <property>
            <name>dfs.namenode.http-address.mycluster.nn1</name>
            <value>node01:50070</value>
        </property>
        <property>
            <name>dfs.namenode.http-address.mycluster.nn2</name>
            <value>node02:50070</value>
        </property>
        <property>
            <name>dfs.namenode.shared.edits.dir</name>
            <value>qjournal://node01:8485;node02:8485;node03:8485/mycluster</value>
        </property>
        <property>
            <name>dfs.client.failover.proxy.provider.mycluster</name>
            <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        </property>
        <property>
            <name>dfs.ha.fencing.methods</name>
            <value>sshfence</value>
        </property>
        <property>
            <name>dfs.ha.fencing.ssh.private-key-files</name>
            <value>/root/.ssh/id_dsa</value>
        </property>
        <property>
            <name>dfs.journalnode.edits.dir</name>
            <value>/var/hadoop/ha/journalnode</value>
        </property>
    </configuration>
    
    

    core-site.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://mycluster</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/var/hadoop/ha</value>
        </property>
    </configuration>
    
  4. Per the official documentation, using ZooKeeper for automatic failover requires adding the automatic-failover settings to hdfs-site.xml and core-site.xml:

    Configuring automatic failover

    The configuration of automatic failover requires the addition of two new parameters to your configuration. In your hdfs-site.xml file, add:

    <property>
      <name>dfs.ha.automatic-failover.enabled</name>
      <value>true</value>
    </property>
    

    This specifies that the cluster should be set up for automatic failover. In your core-site.xml file, add:

    <property>
      <name>ha.zookeeper.quorum</name>
      <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
    </property>
    

    This lists the host-port pairs running the ZooKeeper service.

    As with the parameters described earlier in the document, these settings may be configured on a per-nameservice basis by suffixing the configuration key with the nameservice ID. For example, in a cluster with federation enabled, you can explicitly enable automatic failover for only one of the nameservices by setting dfs.ha.automatic-failover.enabled.my-nameservice-id.

    There are also several other configuration parameters which may be set to control the behavior of automatic failover; however, they are not necessary for most installations. Please refer to the configuration key specific documentation for details.

    hdfs-site.xml

     <property>
       <name>dfs.ha.automatic-failover.enabled</name>
       <value>true</value>
     </property>
    
    

    core-site.xml

     <property>
       <name>ha.zookeeper.quorum</name>
       <value>node02:2181,node03:2181,node04:2181</value>
     </property>
    
  5. Configure the worker (slave) nodes

    # Replace localhost in the slaves file with the worker node hostnames
    vi /opt/hadoop-2.6.5/etc/hadoop/slaves 
    node02
    node03
    node04
    
  6. Distribute core-site.xml, hdfs-site.xml, and slaves to the other nodes (see the sketch below).
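
    A minimal scp sketch, assuming the configuration directory from step 5 and the root user:

    # Run on node01: copy the edited configuration files to the other nodes
    for host in node02 node03 node04; do
      scp /opt/hadoop-2.6.5/etc/hadoop/{core-site.xml,hdfs-site.xml,slaves} \
          root@$host:/opt/hadoop-2.6.5/etc/hadoop/
    done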

  7. Install ZooKeeper

    tar xf zookeeper-3.4.6.tar.gz -C /opt/
    
  8. Edit the ZooKeeper configuration file

    cd /opt/zookeeper-3.4.6/conf
    mv zoo_sample.cfg zoo.cfg
    vi zoo.cfg
    

    In zoo.cfg, change dataDir to the data directory you want to use, and append the server addresses at the end, each with its leader-follower communication port and leader-election port:

    # The number of milliseconds of each tick
    tickTime=2000
    # The number of ticks that the initial 
    # synchronization phase can take
    initLimit=10
    # The number of ticks that can pass between 
    # sending a request and getting an acknowledgement
    syncLimit=5
    # the directory where the snapshot is stored.
    # do not use /tmp for storage, /tmp here is just 
    # example sakes.
    dataDir=/var/zookeeper
    # the port at which the clients will connect
    clientPort=2181
    # the maximum number of client connections.
    # increase this if you need to handle more clients
    #maxClientCnxns=60
    #
    # Be sure to read the maintenance section of the 
    # administrator guide before turning on autopurge.
    #
    # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
    #
    # The number of snapshots to retain in dataDir
    #autopurge.snapRetainCount=3
    # Purge task interval in hours
    # Set to "0" to disable auto purge feature
    #autopurge.purgeInterval=1
    server.1=node02:2888:3888
    server.2=node03:2888:3888
    server.3=node04:2888:3888
    
    
  9. Distribute the entire ZooKeeper directory to the other ZooKeeper nodes (a sketch follows).
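
    A minimal scp sketch, assuming zoo.cfg was edited on node02 and ZooKeeper lives at /opt/zookeeper-3.4.6 (from step 7):

    # Run on node02: copy the whole ZooKeeper directory to the other ZooKeeper nodes
    for host in node03 node04; do
      scp -r /opt/zookeeper-3.4.6 root@$host:/opt/
    done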

  10. On each ZooKeeper node, create the directory specified by dataDir and create the myid file (the node's ZooKeeper server ID, matching the server.N entries in zoo.cfg). For example, on node02:

    mkdir /var/zookeeper
    echo 1 > /var/zookeeper/myid
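
    The IDs must match the zoo.cfg entries (server.1=node02, server.2=node03, server.3=node04), so on the other two nodes:

    # On node03
    mkdir /var/zookeeper && echo 2 > /var/zookeeper/myid
    # On node04
    mkdir /var/zookeeper && echo 3 > /var/zookeeper/myid
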
  11. Add ZooKeeper to the environment variables and propagate the change to the other ZooKeeper nodes (a sketch follows).
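
    A minimal sketch, again using /etc/profile and the install path from step 7:

    # Append to /etc/profile on node02, node03 and node04, then reload it
    export ZOOKEEPER_HOME=/opt/zookeeper-3.4.6
    export PATH=$PATH:$ZOOKEEPER_HOME/bin
    source /etc/profile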

  12. Start ZooKeeper: zkServer.sh start (run on each ZooKeeper node).
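
    A quick sanity check once all three ZooKeeper servers are started:

    # One node should report "Mode: leader", the others "Mode: follower"
    zkServer.sh status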

  13. Start the JournalNodes: hadoop-daemon.sh start journalnode (run on node01, node02 and node03; a quick check follows).
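
    A quick check that each JournalNode is listening (the RPC port 8485 comes from dfs.namenode.shared.edits.dir):

    # Run on node01, node02 and node03
    ss -nal | grep 8485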

  14. Format one NameNode (node01 here): hdfs namenode -format

  15. Start the NameNode that was just formatted (a step the official documentation does not call out): hadoop-daemon.sh start namenode

  16. Per the official documentation, after formatting, run the sync operation on the standby NameNode (note: the NameNode that was NOT formatted, i.e. node02): hdfs namenode -bootstrapStandby

    Deployment details

    After all of the necessary configuration options have been set, you must start the JournalNode daemons on the set of machines where they will run. This can be done by running the command “hadoop-daemon.sh start journalnode” and waiting for the daemon to start on each of the relevant machines.

    Once the JournalNodes have been started, one must initially synchronize the two HA NameNodes’ on-disk metadata.

    • If you are setting up a fresh HDFS cluster, you should first run the format command (hdfs namenode -format) on one of NameNodes.
    • If you have already formatted the NameNode, or are converting a non-HA-enabled cluster to be HA-enabled, you should now copy over the contents of your NameNode metadata directories to the other, unformatted NameNode by running the command “hdfs namenode -bootstrapStandby” on the unformatted NameNode. Running this command will also ensure that the JournalNodes (as configured by dfs.namenode.shared.edits.dir) contain sufficient edits transactions to be able to start both NameNodes.
    • If you are converting a non-HA NameNode to be HA, you should run the command “hdfs namenode -initializeSharedEdits”, which will initialize the JournalNodes with the edits data from the local NameNode edits directories.

    At this point you may start both of your HA NameNodes as you normally would start a NameNode.

    You can visit each of the NameNodes’ web pages separately by browsing to their configured HTTP addresses. You should notice that next to the configured address will be the HA state of the NameNode (either “standby” or “active”.) Whenever an HA NameNode starts, it is initially in the Standby state.

  17. On the primary (active) NameNode, initialize the HA state in ZooKeeper: hdfs zkfc -formatZK

    Initializing HA state in ZooKeeper

    After the configuration keys have been added, the next step is to initialize required state in ZooKeeper. You can do so by running the following command from one of the NameNode hosts.

    $ hdfs zkfc -formatZK
    

    This will create a znode in ZooKeeper inside of which the automatic failover system stores its data.

  18. The registration can be verified with the ZooKeeper client zkCli.sh:

    WatchedEvent state:SyncConnected type:None path:null
    [zk: localhost:2181(CONNECTED) 0] ls /
    [hadoop-ha, zookeeper]
    [zk: localhost:2181(CONNECTED) 1] ls /hadoop-ha/mycluster
    []
    [zk: localhost:2181(CONNECTED) 2] 
    
    
  19. On the primary NameNode (the node with passwordless SSH to the others), start the whole cluster: start-dfs.sh

    [root@node01 hadoop]# start-dfs.sh
    Starting namenodes on [node01 node02]
    node02: starting namenode, logging to /opt/hadoop/hadoop-2.6.5/logs/hadoop-root-namenode-node02.out
    node01: namenode running as process 1754. Stop it first.
    node03: starting datanode, logging to /opt/hadoop/hadoop-2.6.5/logs/hadoop-root-datanode-node03.out
    node04: starting datanode, logging to /opt/hadoop/hadoop-2.6.5/logs/hadoop-root-datanode-node04.out
    node02: starting datanode, logging to /opt/hadoop/hadoop-2.6.5/logs/hadoop-root-datanode-node02.out
    Starting journal nodes [node01 node02 node03]
    node01: journalnode running as process 1464. Stop it first.
    node03: journalnode running as process 1288. Stop it first.
    node02: journalnode running as process 1565. Stop it first.
    Starting ZK Failover Controllers on NN hosts [node01 node02]
    node01: starting zkfc, logging to /opt/hadoop/hadoop-2.6.5/logs/hadoop-root-zkfc-node01.out
    node02: starting zkfc, logging to /opt/hadoop/hadoop-2.6.5/logs/hadoop-root-zkfc-node02.out
    
    
  20. After startup, use jps to check that the expected Java processes are running (a rough guide follows).
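
    A rough guide to what jps should show, based on the role table at the top (process names as reported by jps):

    # node01: NameNode, JournalNode, DFSZKFailoverController
    # node02: NameNode, DataNode, JournalNode, DFSZKFailoverController, QuorumPeerMain
    # node03: DataNode, JournalNode, QuorumPeerMain
    # node04: DataNode, QuorumPeerMain
    jps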

  21. Test failover by stopping the active NameNode: hadoop-daemon.sh stop namenode

    Alternatively, stop the active node's ZKFC with hadoop-daemon.sh stop zkfc and check that the standby NameNode takes over (a quick check is sketched below).
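
    A minimal check with the HA admin tool, assuming the NameNode IDs nn1 and nn2 from hdfs-site.xml; after the kill, the surviving NameNode should report "active":

    # Query the HA state of each NameNode
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2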

Other Commands

  • ZooKeeper

    zkServer.sh start - start the server

    zkServer.sh stop - stop the server

    zkServer.sh status - show the server status

    zkCli.sh - client shell; ls / lists the root znodes

  • Startup order: ZK >> JN >> HDFS (see the recap sketch after this list)

  • ss -nal lists listening ports

  • start-dfs.sh starts the HDFS cluster
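
  A condensed recap of the startup sequence, assuming the node roles from the table above (ZooKeeper on node02-node04, JournalNodes on node01-node03):

    # 1. Start ZooKeeper on node02, node03 and node04
    zkServer.sh start
    # 2. Start the JournalNodes on node01, node02 and node03
    #    (start-dfs.sh also starts them, but they must be up before the first format)
    hadoop-daemon.sh start journalnode
    # 3. On node01 (the node with passwordless SSH to the others), start the rest:
    #    NameNodes, DataNodes, JournalNodes and ZKFCs
    start-dfs.sh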
