Preface
1. How the GlusterFS distributed file system supports MapReduce computation
2. Switching MapReduce storage (HDFS, GlusterFS)
2.1 Configuration changes
<configuration>
<property>
<name>fs.glusterfs.impl</name>
<value>org.apache.hadoop.fs.glusterfs.GlusterFileSystem</value>
</property>
<property>
<name>fs.default.name</name>
<value>glusterfs://172.16.0.20:9000</value>
</property>
<property>
<name>fs.glusterfs.volname</name>
<value>hadoopvol</value> <!-- name of the storage volume created in GlusterFS -->
</property>
<property>
<name>fs.glusterfs.mount</name>
<value>/mnt/glusterfs</value> <!-- local mount directory of GlusterFS -->
</property>
<property>
<name>fs.glusterfs.server</name>
<value>fedora2</value>
</property>
<property>
<name>quick.slave.io</name>
<value>Off</value>
</property>
</configuration>
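Before restarting Hadoop with the new storage settings, it can help to confirm that the edited file is still well-formed XML and that the keys listed above are present. The following is a minimal sketch using only the JDK's XML parser. The class name `CoreSiteCheck`, its methods, and the choice to treat these five keys as mandatory are assumptions made for illustration; they are not part of Hadoop or the GlusterFS plugin.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper: sanity-checks a core-site.xml style property list. */
class CoreSiteCheck {

    // Keys taken from the configuration above; treating them as mandatory is an assumption.
    static final String[] EXPECTED_KEYS = {
        "fs.glusterfs.impl", "fs.default.name", "fs.glusterfs.volname",
        "fs.glusterfs.mount", "fs.glusterfs.server"
    };

    /** Parses <property><name>..</name><value>..</value></property> entries into a map. */
    static Map<String, String> parseProperties(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element p = (Element) nodes.item(i);
                String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
                props.put(name, value);
            }
            return props;
        } catch (Exception e) {
            // Malformed XML (for example a stray "//" comment inside a tag) ends up here.
            throw new IllegalArgumentException("configuration is not valid XML", e);
        }
    }

    /** Returns the expected keys that are absent or have an empty value. */
    static List<String> missingKeys(Map<String, String> props) {
        List<String> missing = new ArrayList<>();
        for (String key : EXPECTED_KEYS) {
            String value = props.get(key);
            if (value == null || value.isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String xml = "<configuration>"
                + "<property><name>fs.glusterfs.volname</name><value>hadoopvol</value></property>"
                + "<property><name>fs.glusterfs.mount</name><value>/mnt/glusterfs</value></property>"
                + "</configuration>";
        Map<String, String> props = parseProperties(xml);
        System.out.println("parsed:  " + props);
        System.out.println("missing: " + missingKeys(props));
    }
}
```

In practice the XML string would be read from conf/core-site.xml instead of being embedded in `main`.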
2.2 How it works
glusterfs-hadoop.jar: the Java API used by MapReduce jobs
The call flow is shown on the left side: it is initiated by the Hadoop job, passes to the GlusterFS Java library, goes through the FUSE mount, and finally reaches the servers.
NOTE: The jar to load (glusterfs-hadoop-0.20.2-0.1.jar) is specified in the Hadoop configuration file (conf/core-site.xml).
Every file read or write inside any map or reduce task goes through the jar and is submitted, via the FUSE mount mechanism, to the GlusterFS client. The GlusterFS client then talks to the GlusterFS data servers, fetches the data, and returns the result of the operation.
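The idea that file operations travel through the local FUSE mount can be illustrated with a small sketch: a `glusterfs://` URI is mapped to a path under the mount directory, and ordinary local file I/O does the rest, with the FUSE client forwarding it to the servers. This is not the plugin's actual code. The class `FuseMountSketch` and its methods are hypothetical, and a temporary directory stands in for /mnt/glusterfs so that the example runs without a GlusterFS cluster.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative sketch only; not the glusterfs-hadoop plugin's implementation. */
class FuseMountSketch {

    /** Maps a glusterfs:// URI onto a path below the local FUSE mount directory. */
    static Path toLocalPath(URI uri, Path mountDir) {
        if (!"glusterfs".equals(uri.getScheme())) {
            throw new IllegalArgumentException("not a glusterfs URI: " + uri);
        }
        // Drop leading slashes so the path resolves relative to the mount point.
        String relative = uri.getPath().replaceFirst("^/+", "");
        Path local = mountDir.resolve(relative).normalize();
        // Reject paths such as /../etc/passwd that would leave the mount.
        if (!local.startsWith(mountDir.normalize())) {
            throw new IllegalArgumentException("path escapes the mount: " + uri);
        }
        return local;
    }

    /** Writes text through the (assumed) mount; on a real mount FUSE forwards this to GlusterFS. */
    static void write(URI uri, Path mountDir, String text) throws IOException {
        Path local = toLocalPath(uri, mountDir);
        Files.createDirectories(local.getParent());
        Files.write(local, text.getBytes(StandardCharsets.UTF_8));
    }

    /** Reads the file back through the same local path. */
    static String read(URI uri, Path mountDir) throws IOException {
        return new String(Files.readAllBytes(toLocalPath(uri, mountDir)), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the fs.glusterfs.mount directory from the configuration.
        Path mountDir = Files.createTempDirectory("glusterfs-mount");
        URI file = URI.create("glusterfs://172.16.0.20:9000/user/demo/part-00000");

        write(file, mountDir, "hello from a reduce task");
        System.out.println("local path: " + toLocalPath(file, mountDir));
        System.out.println("content:    " + read(file, mountDir));
    }
}
```

Because the job only ever sees local paths, the mount directory must exist on every node that runs map or reduce tasks.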