Big Data Backup -- Backing Up HDFS from CDH to Azure Storage

Reposted from:

https://blogs.msdn.microsoft.com/pliu/2016/06/19/backup-cloudera-data-to-azure-storage/

 

Azure Blob Storage supports an HDFS interface, which HDFS clients can access using the wasb:// scheme. The hadoop-azure module that implements this interface is distributed with Apache Hadoop but is not configured out of the box in Cloudera. This post provides instructions on how to back up Cloudera data to Azure Storage.

The steps here have been verified on a default deployment of a Cloudera CDH cluster on Azure.

1. In Cloudera Manager, select HDFS, then Configuration, and search for "core-site". Add the following configuration to Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml, replacing the placeholders with your storage account name and access key:

Name: fs.azure.account.key.<your_storage_account>.blob.core.windows.net
Value: <your_storage_access_key>
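
If you prefer to enter the snippet as XML (assuming your version of Cloudera Manager offers an XML editor for the safety valve), the same name/value pair can be written as a standard Hadoop <property> element. The sketch below only prints that element. The helper name core_site_property and its two arguments are illustrative and not part of any Cloudera or Hadoop tool.

```shell
# Print the name/value pair from step 1 as a core-site.xml <property> element.
# Assumption: core_site_property is a hypothetical helper; pass your own
# storage account name and access key as the two arguments.
core_site_property() {
  cat <<EOF
<property>
  <name>fs.azure.account.key.$1.blob.core.windows.net</name>
  <value>$2</value>
</property>
EOF
}
```

Calling core_site_property with your account name and key prints a block you can paste into the snippet. Avoid leaving the real key in shell history or in shared scripts.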

2. Redeploy stale client configurations, then restart all Cloudera services from Cloudera Manager.
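
After the restart, one way to check that the new setting reached a node's client configuration is hdfs getconf -confKey, which looks up a single configuration key. The following is a minimal sketch. The function names are hypothetical, and it assumes that the command exits with a non-zero status when the key is not defined.

```shell
# Name of the core-site.xml property that holds the key for a storage account.
# Assumption: both function names here are illustrative helpers.
account_key_property() {
  printf 'fs.azure.account.key.%s.blob.core.windows.net\n' "$1"
}

# Succeeds if the client configuration on this node defines the property.
# The output is discarded so the access key is not echoed to the terminal.
has_account_key() {
  hdfs getconf -confKey "$(account_key_property "$1")" >/dev/null 2>&1
}
```

For example, has_account_key <your_storage_account> && echo configured reports whether the key is visible on the node where you run it.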

3. To test that Cloudera can access files in Azure storage, put some files in Azure storage.  You can do so by using command line tools like AzCopy, or UI tools such as Visual Studio Server Explorer or Azure Storage Explorer.

4. SSH into any Cloudera node and run the following command.  You may see some warnings, but make sure you can see the files in your Azure storage account.  Note that if you don't specify a destination folder name, the wasb URL must end with a trailing slash, as shown in the following example:

hdfs dfs -ls wasb://<your_container>@<your_account>.blob.core.windows.net/
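
Because the trailing slash is easy to forget, it can help to build the URL in one place. The sketch below assembles a wasb:// URL from a container name, an account name, and an optional path. The function name wasb_url is an assumption made for illustration.

```shell
# Build a wasb:// URL for a container and storage account.
# Assumption: wasb_url and its optional third argument are illustrative.
wasb_url() {
  path="${3:-}"
  # Drop one leading slash from the path so the URL has exactly one slash
  # after the host; with no path the URL ends in the required trailing slash.
  printf 'wasb://%s@%s.blob.core.windows.net/%s\n' "$1" "$2" "${path#/}"
}

# Example (placeholders as in the command above):
#   hdfs dfs -ls "$(wasb_url <your_container> <your_account>)"
```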
 

5. Run distcp on any Cloudera node to copy data from HDFS to Azure Storage.

# Run this command under a user who has access to the source HDFS files, 
# for example, the HDFS superuser hdfs
hadoop distcp /<hdfs_src> wasb://<your_container>@<your_account>.blob.core.windows.net/
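
For backups that are repeated on a schedule, distcp's -update option copies only files that are missing or different at the destination. The wrapper below is a sketch rather than part of the original procedure. The function name backup_to_wasb and the DRY_RUN variable are assumptions, and by default it only prints the command it would run.

```shell
# Sketch of a repeatable backup wrapper around the distcp command above.
# Assumptions: backup_to_wasb and DRY_RUN are illustrative names.
# Arguments: <hdfs_src> <your_container> <your_account>
backup_to_wasb() {
  src="$1"
  dest="wasb://$2@$3.blob.core.windows.net/"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Default: print the command instead of running it.
    echo "hadoop distcp -update $src $dest"
  else
    hadoop distcp -update "$src" "$dest"
  fi
}
```

Set DRY_RUN=0 to run the copy. Note that with -update, distcp copies the contents of the source directory into the destination instead of creating the source directory under it, so the resulting layout differs from the plain command.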

Now you should be able to see the source HDFS content in Azure storage.
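
One rough way to confirm the copy from the command line is to compare file counts and total bytes on both sides with hdfs dfs -count, which prints DIR_COUNT, FILE_COUNT, CONTENT_SIZE, and PATHNAME for a path. The sketch below uses hypothetical function names. Matching numbers are a sanity check only, not proof that the contents are identical.

```shell
# Extract "FILE_COUNT CONTENT_SIZE" from `hdfs dfs -count` output on stdin.
# Assumption: both function names are illustrative helpers.
files_and_bytes() {
  awk '{ print $2, $3 }'
}

# Succeeds if the two paths hold the same number of files and bytes.
# Arguments: the source HDFS path and the wasb:// URL of its copy.
same_files_and_bytes() {
  src_stats="$(hdfs dfs -count "$1" | files_and_bytes)"
  dst_stats="$(hdfs dfs -count "$2" | files_and_bytes)"
  [ "$src_stats" = "$dst_stats" ]
}
```

Pass the wasb:// URL of the directory that distcp actually created, which may be a subdirectory of the container root depending on how the copy was run.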

For more information about Hadoop support for Azure storage, see the Apache Hadoop documentation for the hadoop-azure module.

