Postgres2015全国用户大会将于11月20至21日在北京丽亭华苑酒店召开。本次大会嘉宾阵容强大,国内顶级PostgreSQL数据库专家将悉数到场,并特邀欧洲、俄罗斯、日本、美国等国家和地区的数据库方面专家助阵:
- Postgres-XC项目的发起人铃木市一(SUZUKI Koichi)
- Postgres-XL的项目发起人Mason Sharp
- pgpool的作者石井达夫(Tatsuo Ishii)
- PG-Strom的作者海外浩平(Kaigai Kohei)
- Greenplum研发总监姚延栋
- 周正中(德哥), PostgreSQL中国用户会创始人之一
- 汪洋,平安科技数据库技术部经理
- ……
|
CitusDB本身支持表的分布式存储, 同时利用PostgreSQL 9.1的FDW功能, 还可以与Hadoop结合使用.
目前CitusDB只能与Hadoop 1.x配合使用. Hadoop 2.x支持多NameNode的结构暂时还不能配合使用.
【CitusDB + Hadoop 架构图如下 : 】
下载到的实际上是修改过的PostgreSQL数据库版本.
CitusDB的运行还需要依赖两个插件, 如下 :
1. 其中FDW用做PostgreSQL和HDFS的接口, 需要另外安装. (
https://github.com/citusdata/file_fdw)
2. 同时CitusDB还需要从Hadoop的NameNode中同步HDFS的元数据到CitusDB的Master节点(PostgreSQL DB).
还需要将DataNode的元数据同步到CitusDB的worker节点(PostgreSQL DB). 也需要另外安装. (
https://github.com/citusdata/hadoop-sync)
【部署建议 : 】
1. hadoop-sync最好与Hadoop NameNode部署在同一主机.
2. CitusDB Master node最好与
Hadoop NameNode部署在同一主机.
3. CitusDB Worker node与Hadoop DataNode一对一部署在同一主机.
【本文主要讲一下CitusDB第二个插件hadoop-sync的安装方法 : 】
1. maven架构
下载maven, 用于安装hadoop-sync.
# wget http://mirror.bjtu.edu.cn/apache/maven/maven-3/3.0.5/binaries/apache-maven-3.0.5-bin.tar.gz# tar -zxvf apache-maven-3.0.5-bin.tar.gz# mv apache-maven-3.0.5 /opt/
2. maven依赖java, javac, 所以还需要安装java, javac.
# yum install java# java -versionjava version "1.6.0_24"OpenJDK Runtime Environment (IcedTea6 1.11.9) (rhel-1.36.1.11.9.el5_9-x86_64)OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)# which java/usr/bin/java
# 安装javac.
# yum install java-devel# javac -versionjavac 1.6.0_24# which javac/usr/bin/javac
3. 安装maven
配置环境变量 :
# vi ~/.bash_profile
export M2_HOME=/opt/apache-maven-3.0.5export M2=$M2_HOME/binexport MAVEN_OPTS="-Xms256m -Xmx512m"export JAVA_HOME=/usrexport PATH=$M2:$JAVA_HOME/bin:$PATH:.
# . ~/.bash_profile
# mvn --version
Apache Maven 3.0.5 (r01de14724cdef164cd33c7c8c2fe155faf9602da; 2013-02-19 21:51:28+0800)Maven home: /opt/apache-maven-3.0.5Java version: 1.6.0_24, vendor: Sun Microsystems Inc.Java home: /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jreDefault locale: en_US, platform encoding: UTF-8OS name: "linux", version: "2.6.18-274.el5", arch: "amd64", family: "unix"
4. 下载hadoop-sync
# wget --no-check-certificate https://github.com/citusdata/hadoop-sync/archive/master.zip -O master.zip# unzip master.zip
5. 安装hadoop-sync
使用maven安装hadoop-sync时需要连接外网, 下载依赖包. 所以首先要确保主机能通外网.
如果不能通外网, 那么请自行下载依赖包.
# cd hadoop-sync-master
# ll
total 12-rw-r--r-- 1 root root 2533 Feb 16 04:56 pom.xml-rw-r--r-- 1 root root 3047 Feb 16 04:56 README.mddrwxr-xr-x 3 root root 4096 Feb 16 04:56 src
# cat pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion><groupId>com.citusdata.sync</groupId><artifactId>hadoop-sync</artifactId><version>0.1</version><packaging>jar</packaging><name>hadoop-sync</name><url>http://maven.apache.org</url>
<build><plugins><plugin><groupId>org.apache.maven.plugins</groupId><artifactId>maven-compiler-plugin</artifactId><version>2.3.2</version><configuration><source>1.6</source><target>1.6</target></configuration></plugin><plugin><groupId>org.apache.maven.plugins</groupId><artifactId>maven-jar-plugin</artifactId><version>2.3.2</version><configuration><archive><manifest><addClasspath>true</addClasspath><classpathPrefix>lib/</classpathPrefix><mainClass>com.citusdata.sync.hdfs.HdfsSynchronizer</mainClass></manifest></archive></configuration></plugin><plugin><groupId>org.apache.maven.plugins</groupId><artifactId>maven-dependency-plugin</artifactId><executions><execution><id>copy</id><phase>package</phase><goals><goal>copy-dependencies</goal></goals><configuration><outputDirectory>${project.build.directory}/lib</outputDirectory></configuration></execution></executions></plugin></plugins><resources><resource><directory>src/main/config</directory></resource></resources></build>
<properties><project.build.sourceEncoding>UTF-8</project.build.sourceEncoding></properties>
<dependencies><dependency><groupId>commons-configuration</groupId><artifactId>commons-configuration</artifactId><version>1.6</version><scope>compile</scope></dependency><dependency><groupId>junit</groupId><artifactId>junit</artifactId><version>3.8.1</version><scope>test</scope></dependency><dependency><groupId>log4j</groupId><artifactId>log4j</artifactId><version>1.2.17</version><type>jar</type></dependency><dependency><groupId>org.apache.hadoop</groupId><artifactId>hadoop-core</artifactId><version>1.0.4</version><type>jar</type></dependency><dependency><groupId>postgresql</groupId><artifactId>postgresql</artifactId><version>9.1-901.jdbc4</version></dependency></dependencies>
</project>
安装hadoop-sync
# mvn install
[INFO] Scanning for projects...[INFO][INFO] ------------------------------------------------------------------------[INFO] Building hadoop-sync 0.1[INFO] ------------------------------------------------------------------------[INFO][INFO] --- maven-resources-plugin:2.5:resources (default-resources) @ hadoop-sync ---[debug] execute contextualize[INFO] Using 'UTF-8' encoding to copy filtered resources.[INFO] Copying 2 resources[INFO][INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ hadoop-sync ---[INFO] Compiling 7 source files to /root/hadoop-sync-master/target/classes[INFO][INFO] --- maven-resources-plugin:2.5:testResources (default-testResources) @ hadoop-sync ---[debug] execute contextualize[INFO] Using 'UTF-8' encoding to copy filtered resources.[INFO] skip non existing resourceDirectory /root/hadoop-sync-master/src/test/resources[INFO][INFO] --- maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @ hadoop-sync ---[INFO] No sources to compile[INFO][INFO] --- maven-surefire-plugin:2.10:test (default-test) @ hadoop-sync ---Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-api/2.0.9/maven-plugin-api-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-api/2.0.9/maven-plugin-api-2.0.9.pom (2 KB at 1.4 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven/2.0.9/maven-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven/2.0.9/maven-2.0.9.pom (19 KB at 16.5 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-parent/8/maven-parent-8.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-parent/8/maven-parent-8.pom (24 KB at 9.7 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-booter/2.10/surefire-booter-2.10.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-booter/2.10/surefire-booter-2.10.pom (3 KB at 5.3 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-api/2.10/surefire-api-2.10.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-api/2.10/surefire-api-2.10.pom (3 KB at 4.0 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/maven-surefire-common/2.10/maven-surefire-common-2.10.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/maven-surefire-common/2.10/maven-surefire-common-2.10.pom (4 KB at 1.3 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/2.1/plexus-utils-2.1.pomDownloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/2.1/plexus-utils-2.1.pom (4 KB at 4.7 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/sonatype/spice/spice-parent/16/spice-parent-16.pomDownloaded: http://repo.maven.apache.org/maven2/org/sonatype/spice/spice-parent/16/spice-parent-16.pom (9 KB at 7.3 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/sonatype/forge/forge-parent/5/forge-parent-5.pomDownloaded: http://repo.maven.apache.org/maven2/org/sonatype/forge/forge-parent/5/forge-parent-5.pom (9 KB at 9.7 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-artifact/2.0.9/maven-artifact-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-artifact/2.0.9/maven-artifact-2.0.9.pom (2 KB at 2.8 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-project/2.0.9/maven-project-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-project/2.0.9/maven-project-2.0.9.pom (3 KB at 4.7 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-settings/2.0.9/maven-settings-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-settings/2.0.9/maven-settings-2.0.9.pom (3 KB at 3.6 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-model/2.0.9/maven-model-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-model/2.0.9/maven-model-2.0.9.pom (4 KB at 5.2 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-profile/2.0.9/maven-profile-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-profile/2.0.9/maven-profile-2.0.9.pom (3 KB at 3.6 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-artifact-manager/2.0.9/maven-artifact-manager-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-artifact-manager/2.0.9/maven-artifact-manager-2.0.9.pom (3 KB at 1.6 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-repository-metadata/2.0.9/maven-repository-metadata-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-repository-metadata/2.0.9/maven-repository-metadata-2.0.9.pom (2 KB at 3.3 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-registry/2.0.9/maven-plugin-registry-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-registry/2.0.9/maven-plugin-registry-2.0.9.pom (2 KB at 3.4 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-core/2.0.9/maven-core-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-core/2.0.9/maven-core-2.0.9.pom (8 KB at 6.8 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-parameter-documenter/2.0.9/maven-plugin-parameter-documenter-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-parameter-documenter/2.0.9/maven-plugin-parameter-documenter-2.0.9.pom (2 KB at 3.4 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting-api/2.0.9/maven-reporting-api-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting-api/2.0.9/maven-reporting-api-2.0.9.pom (2 KB at 3.1 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting/2.0.9/maven-reporting-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting/2.0.9/maven-reporting-2.0.9.pom (2 KB at 2.6 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-error-diagnostics/2.0.9/maven-error-diagnostics-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-error-diagnostics/2.0.9/maven-error-diagnostics-2.0.9.pom (2 KB at 3.0 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-descriptor/2.0.9/maven-plugin-descriptor-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-descriptor/2.0.9/maven-plugin-descriptor-2.0.9.pom (3 KB at 3.6 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-monitor/2.0.9/maven-monitor-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-monitor/2.0.9/maven-monitor-2.0.9.pom (2 KB at 2.2 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-toolchain/2.0.9/maven-toolchain-2.0.9.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-toolchain/2.0.9/maven-toolchain-2.0.9.pom (4 KB at 6.0 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-common-artifact-filters/1.3/maven-common-artifact-filters-1.3.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-common-artifact-filters/1.3/maven-common-artifact-filters-1.3.pom (4 KB at 6.4 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-shared-components/12/maven-shared-components-12.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-shared-components/12/maven-shared-components-12.pom (10 KB at 6.6 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-parent/13/maven-parent-13.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-parent/13/maven-parent-13.pom (23 KB at 13.2 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/apache/6/apache-6.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/apache/6/apache-6.pom (13 KB at 14.9 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-container-default/1.0-alpha-9/plexus-container-default-1.0-alpha-9.pomDownloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-container-default/1.0-alpha-9/plexus-container-default-1.0-alpha-9.pom (2 KB at 2.1 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-booter/2.10/surefire-booter-2.10.jarDownloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/maven-surefire-common/2.10/maven-surefire-common-2.10.jarDownloading: http://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-common-artifact-filters/1.3/maven-common-artifact-filters-1.3.jarDownloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-api/2.10/surefire-api-2.10.jarDownloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/2.1/plexus-utils-2.1.jarDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-booter/2.10/surefire-booter-2.10.jar (34 KB at 19.7 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting-api/2.0.9/maven-reporting-api-2.0.9.jarDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-common-artifact-filters/1.3/maven-common-artifact-filters-1.3.jar (31 KB at 18.0 KB/sec)Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting-api/2.0.9/maven-reporting-api-2.0.9.jar (10 KB at 5.2 KB/sec)Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/maven-surefire-common/2.10/maven-surefire-common-2.10.jar (60 KB at 13.0 KB/sec)Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-api/2.10/surefire-api-2.10.jar (158 KB at 14.3 KB/sec)Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/2.1/plexus-utils-2.1.jar (220 KB at 13.6 KB/sec)[INFO] No tests to run.[INFO] Surefire report directory: /root/hadoop-sync-master/target/surefire-reportsDownloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-junit3/2.10/surefire-junit3-2.10.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-junit3/2.10/surefire-junit3-2.10.pom (2 KB at 2.9 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-providers/2.10/surefire-providers-2.10.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-providers/2.10/surefire-providers-2.10.pom (3 KB at 2.1 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-junit3/2.10/surefire-junit3-2.10.jarDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-junit3/2.10/surefire-junit3-2.10.jar (26 KB at 11.3 KB/sec)
-------------------------------------------------------T E S T S-------------------------------------------------------
Results :
Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
[INFO][INFO] --- maven-jar-plugin:2.3.2:jar (default-jar) @ hadoop-sync ---Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-archiver/2.4.2/maven-archiver-2.4.2.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-archiver/2.4.2/maven-archiver-2.4.2.pom (4 KB at 6.7 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-archiver/2.0.1/plexus-archiver-2.0.1.pomDownloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-archiver/2.0.1/plexus-archiver-2.0.1.pom (3 KB at 4.9 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/sonatype/spice/spice-parent/17/spice-parent-17.pomDownloaded: http://repo.maven.apache.org/maven2/org/sonatype/spice/spice-parent/17/spice-parent-17.pom (7 KB at 11.8 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/sonatype/forge/forge-parent/10/forge-parent-10.pomDownloaded: http://repo.maven.apache.org/maven2/org/sonatype/forge/forge-parent/10/forge-parent-10.pom (14 KB at 8.0 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/3.0/plexus-utils-3.0.pomDownloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/3.0/plexus-utils-3.0.pom (4 KB at 3.2 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-io/2.0.1/plexus-io-2.0.1.pomDownloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-io/2.0.1/plexus-io-2.0.1.pom (2 KB at 3.0 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-components/1.1.19/plexus-components-1.1.19.pomDownloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-components/1.1.19/plexus-components-1.1.19.pom (3 KB at 4.7 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus/3.0.1/plexus-3.0.1.pomDownloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus/3.0.1/plexus-3.0.1.pom (19 KB at 7.3 KB/sec)Downloading: http://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.1/commons-lang-2.1.pomDownloaded: http://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.1/commons-lang-2.1.pom (10 KB at 8.7 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-archiver/2.4.2/maven-archiver-2.4.2.jarDownloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-io/2.0.1/plexus-io-2.0.1.jarDownloading: http://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.1/commons-lang-2.1.jarDownloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/3.0/plexus-utils-3.0.jarDownloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-archiver/2.0.1/plexus-archiver-2.0.1.jarDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-archiver/2.4.2/maven-archiver-2.4.2.jar (20 KB at 9.1 KB/sec)Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-io/2.0.1/plexus-io-2.0.1.jar (57 KB at 20.5 KB/sec)Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-archiver/2.0.1/plexus-archiver-2.0.1.jar (176 KB at 19.4 KB/sec)Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/3.0/plexus-utils-3.0.jar (221 KB at 14.8 KB/sec)Downloaded: http://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.1/commons-lang-2.1.jar (203 KB at 12.7 KB/sec)[INFO] Building jar: /root/hadoop-sync-master/target/hadoop-sync-0.1.jar[INFO][INFO] --- maven-dependency-plugin:2.1:copy-dependencies (copy) @ hadoop-sync ---Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/1.5.1/plexus-utils-1.5.1.pomDownloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/1.5.1/plexus-utils-1.5.1.pom (3 KB at 4.0 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/doxia/doxia-sink-api/1.0-alpha-10/doxia-sink-api-1.0-alpha-10.pomDownloaded: http://repo.maven.apache.org/maven2/org/apache/maven/doxia/doxia-sink-api/1.0-alpha-10/doxia-sink-api-1.0-alpha-10.pom (2 KB at 2.3 KB/sec)...............................略...........................................Downloaded: http://repo.maven.apache.org/maven2/plexus/plexus-utils/1.0.2/plexus-utils-1.0.2.jar (157 KB at 22.3 KB/sec)Downloaded: http://repo.maven.apache.org/maven2/velocity/velocity/1.4/velocity-1.4.jar (353 KB at 13.5 KB/sec)Downloaded: http://repo.maven.apache.org/maven2/velocity/velocity-dep/1.4/velocity-dep-1.4.jar (506 KB at 15.0 KB/sec)[INFO] Copying ant-1.6.5.jar to /root/hadoop-sync-master/target/lib/ant-1.6.5.jar[INFO] Copying commons-beanutils-1.7.0.jar to /root/hadoop-sync-master/target/lib/commons-beanutils-1.7.0.jar[INFO] Copying commons-beanutils-core-1.8.0.jar to /root/hadoop-sync-master/target/lib/commons-beanutils-core-1.8.0.jar[INFO] Copying commons-cli-1.2.jar to /root/hadoop-sync-master/target/lib/commons-cli-1.2.jar[INFO] Copying commons-codec-1.4.jar to /root/hadoop-sync-master/target/lib/commons-codec-1.4.jar[INFO] Copying commons-collections-3.2.1.jar to /root/hadoop-sync-master/target/lib/commons-collections-3.2.1.jar[INFO] Copying commons-configuration-1.6.jar to /root/hadoop-sync-master/target/lib/commons-configuration-1.6.jar[INFO] Copying commons-digester-1.8.jar to /root/hadoop-sync-master/target/lib/commons-digester-1.8.jar[INFO] Copying commons-el-1.0.jar to /root/hadoop-sync-master/target/lib/commons-el-1.0.jar[INFO] Copying commons-httpclient-3.0.1.jar to /root/hadoop-sync-master/target/lib/commons-httpclient-3.0.1.jar[INFO] Copying commons-lang-2.4.jar to /root/hadoop-sync-master/target/lib/commons-lang-2.4.jar[INFO] Copying commons-logging-1.1.1.jar to /root/hadoop-sync-master/target/lib/commons-logging-1.1.1.jar[INFO] Copying commons-net-1.4.1.jar to /root/hadoop-sync-master/target/lib/commons-net-1.4.1.jar[INFO] Copying hsqldb-1.8.0.10.jar to /root/hadoop-sync-master/target/lib/hsqldb-1.8.0.10.jar[INFO] Copying junit-3.8.1.jar to /root/hadoop-sync-master/target/lib/junit-3.8.1.jar[INFO] Copying log4j-1.2.17.jar to /root/hadoop-sync-master/target/lib/log4j-1.2.17.jar[INFO] Copying jets3t-0.7.1.jar to /root/hadoop-sync-master/target/lib/jets3t-0.7.1.jar[INFO] Copying kfs-0.3.jar to /root/hadoop-sync-master/target/lib/kfs-0.3.jar[INFO] Copying commons-math-2.1.jar to /root/hadoop-sync-master/target/lib/commons-math-2.1.jar[INFO] Copying hadoop-core-1.0.4.jar to /root/hadoop-sync-master/target/lib/hadoop-core-1.0.4.jar[INFO] Copying jackson-core-asl-1.0.1.jar to /root/hadoop-sync-master/target/lib/jackson-core-asl-1.0.1.jar[INFO] Copying jackson-mapper-asl-1.0.1.jar to /root/hadoop-sync-master/target/lib/jackson-mapper-asl-1.0.1.jar[INFO] Copying core-3.1.1.jar to /root/hadoop-sync-master/target/lib/core-3.1.1.jar[INFO] Copying jetty-6.1.26.jar to /root/hadoop-sync-master/target/lib/jetty-6.1.26.jar[INFO] Copying jetty-util-6.1.26.jar to /root/hadoop-sync-master/target/lib/jetty-util-6.1.26.jar[INFO] Copying jsp-2.1-6.1.14.jar to /root/hadoop-sync-master/target/lib/jsp-2.1-6.1.14.jar[INFO] Copying jsp-api-2.1-6.1.14.jar to /root/hadoop-sync-master/target/lib/jsp-api-2.1-6.1.14.jar[INFO] Copying servlet-api-2.5-20081211.jar to /root/hadoop-sync-master/target/lib/servlet-api-2.5-20081211.jar[INFO] Copying servlet-api-2.5-6.1.14.jar to /root/hadoop-sync-master/target/lib/servlet-api-2.5-6.1.14.jar[INFO] Copying oro-2.0.8.jar to /root/hadoop-sync-master/target/lib/oro-2.0.8.jar[INFO] Copying postgresql-9.1-901.jdbc4.jar to /root/hadoop-sync-master/target/lib/postgresql-9.1-901.jdbc4.jar[INFO] Copying jasper-compiler-5.5.12.jar to /root/hadoop-sync-master/target/lib/jasper-compiler-5.5.12.jar[INFO] Copying jasper-runtime-5.5.12.jar to /root/hadoop-sync-master/target/lib/jasper-runtime-5.5.12.jar[INFO] Copying xmlenc-0.52.jar to /root/hadoop-sync-master/target/lib/xmlenc-0.52.jar[INFO][INFO] --- maven-install-plugin:2.3.1:install (default-install) @ hadoop-sync ---Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-digest/1.0/plexus-digest-1.0.pomDownloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-digest/1.0/plexus-digest-1.0.pom (2 KB at 1.9 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-components/1.1.7/plexus-components-1.1.7.pomDownloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-components/1.1.7/plexus-components-1.1.7.pom (5 KB at 5.8 KB/sec)Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-digest/1.0/plexus-digest-1.0.jarDownloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-digest/1.0/plexus-digest-1.0.jar (12 KB at 10.4 KB/sec)[INFO] Installing /root/hadoop-sync-master/target/hadoop-sync-0.1.jar to /root/.m2/repository/com/citusdata/sync/hadoop-sync/0.1/hadoop-sync-0.1.jar[INFO] Installing /root/hadoop-sync-master/pom.xml to /root/.m2/repository/com/citusdata/sync/hadoop-sync/0.1/hadoop-sync-0.1.pom[INFO] ------------------------------------------------------------------------[INFO] BUILD SUCCESS[INFO] ------------------------------------------------------------------------[INFO] Total time: 3:41.824s[INFO] Finished at: Mon Mar 18 22:22:00 CST 2013[INFO] Final Memory: 14M/247M[INFO] ------------------------------------------------------------------------
# 安装完后的目录结构 :
# lltotal 16-rw-r--r-- 1 root root 2533 Feb 16 04:56 pom.xml-rw-r--r-- 1 root root 3047 Feb 16 04:56 README.mddrwxr-xr-x 3 root root 4096 Feb 16 04:56 srcdrwxr-xr-x 7 root root 4096 Mar 18 22:21 target# cd target/# ll -R.:total 44drwxr-xr-x 3 root root 4096 Mar 18 22:18 classesdrwxr-xr-x 3 root root 4096 Mar 18 22:14 generated-sources-rw-r--r-- 1 root root 24084 Mar 18 22:19 hadoop-sync-0.1.jardrwxr-xr-x 2 root root 4096 Mar 18 22:21 libdrwxr-xr-x 2 root root 4096 Mar 18 22:19 maven-archiverdrwxr-xr-x 2 root root 4096 Mar 18 22:22 surefire
./classes:total 12drwxr-xr-x 3 root root 4096 Mar 18 22:18 com-rw-r--r-- 1 root root 1228 Mar 18 22:14 log4j.properties-rw-r--r-- 1 root root 266 Mar 18 22:14 sync.properties
./classes/com:total 4drwxr-xr-x 3 root root 4096 Mar 18 22:18 citusdata
./classes/com/citusdata:total 4drwxr-xr-x 3 root root 4096 Mar 18 22:18 sync
./classes/com/citusdata/sync:total 4drwxr-xr-x 2 root root 4096 Mar 18 22:18 hdfs
./classes/com/citusdata/sync/hdfs:total 60-rw-r--r-- 1 root root 10403 Mar 18 22:18 CitusMasterNode.class-rw-r--r-- 1 root root 5728 Mar 18 22:18 CitusWorkerNode.class-rw-r--r-- 1 root root 6867 Mar 18 22:18 HdfsMasterNode.class-rw-r--r-- 1 root root 12373 Mar 18 22:18 HdfsSynchronizer.class-rw-r--r-- 1 root root 1332 Mar 18 22:18 HdfsSynchronizer$MetadataDifference.class-rw-r--r-- 1 root root 2265 Mar 18 22:18 HdfsWorkerNode.class-rw-r--r-- 1 root root 642 Mar 18 22:18 MinMaxValue.class-rw-r--r-- 1 root root 1687 Mar 18 22:18 ShardPlacement.class
./generated-sources:total 4drwxr-xr-x 2 root root 4096 Mar 18 22:14 annotations
./generated-sources/annotations:total 0
./lib:total 16920-rw-r--r-- 1 root root 1034049 Mar 18 22:21 ant-1.6.5.jar-rw-r--r-- 1 root root 188671 Mar 18 22:21 commons-beanutils-1.7.0.jar-rw-r--r-- 1 root root 206035 Mar 18 22:21 commons-beanutils-core-1.8.0.jar-rw-r--r-- 1 root root 41123 Mar 18 22:21 commons-cli-1.2.jar-rw-r--r-- 1 root root 58160 Mar 18 22:21 commons-codec-1.4.jar-rw-r--r-- 1 root root 575389 Mar 18 22:21 commons-collections-3.2.1.jar-rw-r--r-- 1 root root 298829 Mar 18 22:21 commons-configuration-1.6.jar-rw-r--r-- 1 root root 143602 Mar 18 22:21 commons-digester-1.8.jar-rw-r--r-- 1 root root 112341 Mar 18 22:21 commons-el-1.0.jar-rw-r--r-- 1 root root 279781 Mar 18 22:21 commons-httpclient-3.0.1.jar-rw-r--r-- 1 root root 261809 Mar 18 22:21 commons-lang-2.4.jar-rw-r--r-- 1 root root 60686 Mar 18 22:21 commons-logging-1.1.1.jar-rw-r--r-- 1 root root 832410 Mar 18 22:21 commons-math-2.1.jar-rw-r--r-- 1 root root 180792 Mar 18 22:21 commons-net-1.4.1.jar-rw-r--r-- 1 root root 3566844 Mar 18 22:21 core-3.1.1.jar-rw-r--r-- 1 root root 3929148 Mar 18 22:21 hadoop-core-1.0.4.jar-rw-r--r-- 1 root root 706710 Mar 18 22:21 hsqldb-1.8.0.10.jar-rw-r--r-- 1 root root 136059 Mar 18 22:21 jackson-core-asl-1.0.1.jar-rw-r--r-- 1 root root 270781 Mar 18 22:21 jackson-mapper-asl-1.0.1.jar-rw-r--r-- 1 root root 405086 Mar 18 22:21 jasper-compiler-5.5.12.jar-rw-r--r-- 1 root root 76698 Mar 18 22:21 jasper-runtime-5.5.12.jar-rw-r--r-- 1 root root 377780 Mar 18 22:21 jets3t-0.7.1.jar-rw-r--r-- 1 root root 539912 Mar 18 22:21 jetty-6.1.26.jar-rw-r--r-- 1 root root 177131 Mar 18 22:21 jetty-util-6.1.26.jar-rw-r--r-- 1 root root 1024680 Mar 18 22:21 jsp-2.1-6.1.14.jar-rw-r--r-- 1 root root 134910 Mar 18 22:21 jsp-api-2.1-6.1.14.jar-rw-r--r-- 1 root root 121070 Mar 18 22:21 junit-3.8.1.jar-rw-r--r-- 1 root root 11981 Mar 18 22:21 kfs-0.3.jar-rw-r--r-- 1 root root 489884 Mar 18 22:21 log4j-1.2.17.jar-rw-r--r-- 1 root root 65261 Mar 18 22:21 oro-2.0.8.jar-rw-r--r-- 1 root root 539705 Mar 18 22:21 postgresql-9.1-901.jdbc4.jar-rw-r--r-- 1 root root 134133 Mar 18 22:21 servlet-api-2.5-20081211.jar-rw-r--r-- 1 root root 132368 Mar 18 22:21 servlet-api-2.5-6.1.14.jar-rw-r--r-- 1 root root 15010 Mar 18 22:21 xmlenc-0.52.jar
./maven-archiver:total 4-rw-r--r-- 1 root root 112 Mar 18 22:19 pom.properties
./surefire:total 0
6. 配置hadoop-sync
# 配置文件范例
# cat sync.properties
# HDFS related cluster configuration settingsHdfsMasterNodeName = localhostHdfsMasterNodePort = 9000HdfsWorkerNodePort = 50020
# CitusDB related cluster configuration settingsCitusMasterNodeName = localhostCitusMasterNodePort = 5432CitusWorkerNodePort = 9700
7. 用法
# java -jar hadoop-sync-0.1.jar table_name [--fetch-min-max]
5. hadoop-sync readme
hadoop-synchadoop-sync is a Java application that synchronizes block metadata from the HDFS NameNode to the CitusDB master node. The application also creates a colocated foreign table on CitusDB worker nodes for each block in the Hadoop File System (HDFS). Optionally, the application also collects statistics about the underlying HDFS blocks.
hadoop-sync is idempotent. You can run it multiple times; it calculates metadata differences between the HDFS NameNode and CitusDB master node, and only syncs metadata for newly added or removed HDFS blocks. If no such blocks exist, the application does nothing.
hadoop-sync also leaves metadata on the CitusDB master node in a consistent state. When you run the application, it either completes by propagating enough metadata that makes recent HDFS blocks queriable; or it errors out by reverting recent metadata updates and leaves CitusDB in the queriable state it was in before hadoop-sync was run.
hadoop-sync assumes that you are already running a Hadoop cluster and you have your CitusDB databases colocated on this cluster. For step by step instructions on how to get started, please see our documentation page at http://citusdata.com/docs/sql-on-hadoop
UsageOur documentation page explains in detail running the hadoop-sync application. The following section talks about a few details that relate to hadoop-sync's internals. Please note that these details may change over time as hadoop-sync is still in Beta form.
Now, assuming that you already have a Hadoop cluster with data loaded into it, CitusDB deployed on the Hadoop cluster, and a distributed foreign table created on the CitusDB master node, you synchronize metadata like the following:
java -jar target/hadoop-sync-0.1.jar table_name [--fetch-min-max]The distributed foreign table name is a required argument, and hadoop-sync will only synchronize block metadata for that particular table. The fetch-min-max is an optional argument that triggers the collection of min/max statistics for each new HDFS block. Since this requires going over all records in the HDFS block, passing this argument notably increases synchronization times. On the other hand, with these statistics, CitusDB can perform optimizations such as partition or join pruning that dramatically reduce query execution times.
It is worth mentioning that these min/max statistics are only relevant in the context of distributed queries. For local queries, the worker nodes themselves already use PostgreSQL's statistics collector to optimize queries, if the foreign data wrapper supports data sampling.
It is also worth noting that hadoop-sync relies on a properties file to determine node names and port numbers that Hadoop and CitusDB clusters use. This sync.properties file is bundled with the JAR file; and you can change the default settings by using emacs or vi on the JAR bundle, or by copying the file from target/classes/sync.properties to the current directory and editing it.