Deploying YDB on a minimal CentOS install, on top of Hadoop, ZooKeeper and Kafka
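About YDB
YDB (full name 延云YDB) is a real-time, multidimensional, interactive engine for query, statistics and analysis, built on the Hadoop distributed architecture. It delivers second-level response times at a scale of trillions of records, with enterprise-grade stability and reliability.
Goal of this article
Build a virtual-machine edition of YDB for trying out its features. Performance and stability are not goals here (put simply, it only has to run), and the VM should be able to run on mainstream desktop environments.
If you want an efficient and stable single-node YDB, install and use 延云YDB易捷版 instead.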
Hardware and operating system
- CPU: 1x2 core
- MEM: 4G
- HDD: 64G
- SYS: CentOS 6.6 x64 minimal
System environment configuration
Required packages
yum install openssh-clients unzip
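System settings
For the system settings, see 《YDB编程指南》 (the YDB programming guide) or the article "YDB依赖的操作系统环境详解" (a detailed description of the operating system environment YDB depends on).
Software packages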
jdk-8u60-linux-x64.tar.gz
hadoop-2.7.3.tar.gz
zookeeper-3.4.6.tar.gz
kafka_2.11-0.10.0.1.tgz
spark1.6.3_hadoop2.7.3.tar.gz <www.ycloud.net.cn>
ya100-1.1.8.11.0710.1988.413.stable.zip <www.ycloud.net.cn>
hdfs
Create the data directories
mkdir -p /data
mkdir -p /data/tmp/hadoop
mkdir -p /data/hadoop/hdfs/nn
mkdir -p /data/hadoop/hdfs/dn
mkdir -p /data/hadoop/hdfs/sn
core-site.xml
fs.defaultFS - hdfs://<hostname>
hadoop.tmp.dir - /data/tmp/hadoop
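The two settings above go into Hadoop's usual configuration/property XML layout. A minimal sketch that renders the file into a scratch directory, so nothing in the real configuration directory is overwritten. The NN_HOST variable and the scratch path are illustrative assumptions, not part of the original setup:

```shell
#!/bin/sh
# Sketch: render core-site.xml into a scratch directory, then copy it into
# the Hadoop configuration directory by hand once it looks right.
NN_HOST="${NN_HOST:-$(uname -n)}"    # assumption: the VM's resolvable hostname
OUT_DIR="$(mktemp -d)"               # scratch location, not the real conf dir

cat > "$OUT_DIR/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://$NN_HOST</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/tmp/hadoop</value>
  </property>
</configuration>
EOF

echo "wrote $OUT_DIR/core-site.xml"
```

With no port in fs.defaultFS, the NameNode RPC endpoint falls back to the default port 8020, which is the port checked in the summary.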
hdfs-site.xml
dfs.replication - 1
dfs.namenode.name.dir - /data/hadoop/hdfs/nn
dfs.datanode.data.dir - /data/hadoop/hdfs/dn
dfs.namenode.checkpoint.dir - /data/hadoop/hdfs/sn
dfs.namenode.secondary.http-address - <hostname>:50090
dfs.permissions.enabled - false
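The same name/value pairs can be turned into hdfs-site.xml with a small helper instead of hand-written XML. This is only a sketch: the prop function, the NN_HOST variable and the scratch directory are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch: emit hdfs-site.xml from the name/value pairs listed above.
NN_HOST="${NN_HOST:-$(uname -n)}"    # assumption: the VM's resolvable hostname
OUT_DIR="$(mktemp -d)"               # scratch location, not the real conf dir

# prop NAME VALUE: print one <property> element.
prop() {
  printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

{
  echo '<?xml version="1.0"?>'
  echo '<configuration>'
  prop dfs.replication 1
  prop dfs.namenode.name.dir /data/hadoop/hdfs/nn
  prop dfs.datanode.data.dir /data/hadoop/hdfs/dn
  prop dfs.namenode.checkpoint.dir /data/hadoop/hdfs/sn
  prop dfs.namenode.secondary.http-address "$NN_HOST:50090"
  prop dfs.permissions.enabled false
  echo '</configuration>'
} > "$OUT_DIR/hdfs-site.xml"

echo "wrote $OUT_DIR/hdfs-site.xml"
```

The helper does no XML escaping, which is fine for the plain paths and numbers used here but would need attention for values containing characters such as & or <.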
slaves
<hostname>
hadoop-env.sh
export JAVA_HOME=<jdk_home>
HADOOP_HEAPSIZE=64
HADOOP_NAMENODE_INIT_HEAPSIZE=64
Disable the Secondary NameNode
start-dfs.sh and stop-dfs.sh (replace the line that assigns SECONDARY_NAMENODES)
unset SECONDARY_NAMENODES # SECONDARY_NAMENODES=
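One way to read the instruction above: comment out the original SECONDARY_NAMENODES assignment and leave the variable unset, so the branch that handles secondary namenodes is never entered. The sketch below applies that edit to a stand-in copy rather than to the real script. It assumes the assignment sits on a single line beginning with SECONDARY_NAMENODES=, which should be checked against the actual files first; the sample file content is illustrative only.

```shell
#!/bin/sh
# Sketch: patch a COPY of the script, never the original in sbin/.
WORK_DIR="$(mktemp -d)"
SCRIPT="$WORK_DIR/start-dfs.sh"

# Stand-in for the relevant part of start-dfs.sh (illustrative only).
cat > "$SCRIPT" <<'EOF'
SECONDARY_NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -secondarynamenodes 2>/dev/null)
if [ -n "$SECONDARY_NAMENODES" ]; then
  echo "Starting secondary namenodes [$SECONDARY_NAMENODES]"
fi
EOF

# Replace the assignment with "unset", keeping the old line as a comment.
sed 's/^SECONDARY_NAMENODES=\(.*\)$/unset SECONDARY_NAMENODES # SECONDARY_NAMENODES=\1/' \
  "$SCRIPT" > "$SCRIPT.new" && mv "$SCRIPT.new" "$SCRIPT"

# The patched copy no longer enters the secondary-namenode branch.
PATCHED_OUTPUT="$(sh "$SCRIPT")"
echo "output after patch: [$PATCHED_OUTPUT]"
```

After reviewing the patched copy with diff, the same sed expression can be applied to the real start-dfs.sh and stop-dfs.sh, ideally after backing both up.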
Format the NameNode
hdfs namenode -format
Start (HDFS first, then YARN once it has been configured as described in the yarn section)
start-dfs.sh
start-yarn.sh
yarn
yarn-site.xml
yarn.resourcemanager.hostname - <hostname>
yarn.nodemanager.resource.memory-mb - 2048
yarn.nodemanager.resource.cpu-vcores - 4
yarn.scheduler.minimum-allocation-mb - 8
yarn.scheduler.maximum-allocation-mb - 2048
yarn.scheduler.minimum-allocation-vcores - 1
yarn.scheduler.maximum-allocation-vcores - 4
yarn.nodemanager.vmem-check-enabled - false
yarn.nodemanager.pmem-check-enabled - false
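The memory values above are related: a container larger than the node's total can never be scheduled, so the scheduler minimum should not exceed the maximum, and the maximum should not exceed the NodeManager memory. A minimal sketch of that sanity check, with the values copied from the list above; check_limits is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: sanity-check the YARN memory limits used above (values in MB).
NODE_MB=2048     # yarn.nodemanager.resource.memory-mb
MIN_MB=8         # yarn.scheduler.minimum-allocation-mb
MAX_MB=2048      # yarn.scheduler.maximum-allocation-mb

# check_limits MIN MAX NODE: succeed only if MIN <= MAX <= NODE.
check_limits() {
  [ "$1" -le "$2" ] && [ "$2" -le "$3" ]
}

if check_limits "$MIN_MB" "$MAX_MB" "$NODE_MB"; then
  LIMITS_STATUS="ok"
else
  LIMITS_STATUS="inconsistent"
fi
echo "YARN memory limits: $LIMITS_STATUS"
```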
yarn-env.sh
YARN_RESOURCEMANAGER_HEAPSIZE=64
YARN_NODEMANAGER_HEAPSIZE=64
Stop (YARN first, then HDFS)
stop-yarn.sh
stop-dfs.sh
zookeeper
zookeeper-env.sh
export JAVA_HOME=<jdk_home>
zoo.cfg
tickTime=2000
dataDir=/data/zookeeper
clientPort=2181
Start and stop
bin/zkServer.sh start
bin/zkServer.sh stop
kafka
server.properties
zookeeper.connect=<hostname>:2181
log.cleaner.dedupe.buffer.size=5242880
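Editing server.properties by hand works, but a small helper makes the change repeatable. The sketch below sets a key in a Java-style properties file, replacing an existing line for that key or appending a new one. It operates on a temporary stand-in file; set_prop, PROPS and KAFKA_ZK_HOST are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch: idempotently set key=value pairs in a properties file.
PROPS="$(mktemp)"    # stand-in for config/server.properties
printf 'broker.id=0\nzookeeper.connect=localhost:2181\n' > "$PROPS"

# set_prop FILE KEY VALUE: drop any existing line for KEY, then append KEY=VALUE.
set_prop() {
  awk -F= -v k="$2" '$1 != k' "$1" > "$1.tmp"   # keep lines whose key differs
  printf '%s=%s\n' "$2" "$3" >> "$1.tmp"
  mv "$1.tmp" "$1"
}

KAFKA_ZK_HOST="${KAFKA_ZK_HOST:-$(uname -n)}"   # assumption: ZooKeeper runs on this VM
set_prop "$PROPS" zookeeper.connect "$KAFKA_ZK_HOST:2181"
set_prop "$PROPS" log.cleaner.dedupe.buffer.size 5242880
set_prop "$PROPS" log.cleaner.dedupe.buffer.size 5242880   # repeating the call adds no duplicate

cat "$PROPS"
```

The helper moves the edited key to the end of the file, which does not matter for a properties file because key order carries no meaning.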
kafka-server-start.sh
export KAFKA_HEAP_OPTS="-Xmx64m -Xms64m"
Start and stop
bin/kafka-server-start.sh config/server.properties &
bin/kafka-server-stop.sh
ydb
ya100_env.sh
export HADOOP_CONF_DIR=<hadoop_conf_dir>
export SPARK_HOME=<spark_home>
export YA100_EXECUTORS=1
export YA100_MEMORY=512m
export YA100_CORES=1
export YA100_DRIVER_MEMORY=256m
export HDFS_USER=root
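It can help to check that these memory settings fit inside the 2048 MB given to the NodeManager. The sketch below rests on two assumptions that the original does not state: that the YDB executor and driver each run as a Spark-on-YARN container, and that Spark's default per-container memory overhead applies (the larger of 384 MB and 10% of the heap). Treat the result as a rough estimate, not a description of how YDB actually allocates memory.

```shell
#!/bin/sh
# Sketch: estimate the YARN memory YDB would request (all values in MB).
EXECUTORS=1          # YA100_EXECUTORS
EXECUTOR_MB=512      # YA100_MEMORY
DRIVER_MB=256        # YA100_DRIVER_MEMORY
NODE_MB=2048         # yarn.nodemanager.resource.memory-mb

# overhead MB: assumed Spark-on-YARN default overhead, max(384, 10% of heap).
overhead() {
  tenth=$(( $1 / 10 ))
  if [ "$tenth" -gt 384 ]; then echo "$tenth"; else echo 384; fi
}

EXECUTOR_TOTAL=$(( EXECUTORS * (EXECUTOR_MB + $(overhead "$EXECUTOR_MB")) ))
DRIVER_TOTAL=$(( DRIVER_MB + $(overhead "$DRIVER_MB") ))
REQUESTED_MB=$(( EXECUTOR_TOTAL + DRIVER_TOTAL ))

echo "estimated request: ${REQUESTED_MB} MB of ${NODE_MB} MB"
if [ "$REQUESTED_MB" -le "$NODE_MB" ]; then
  FIT="fits"
else
  FIT="does not fit"
fi
echo "$FIT"
```

Under these assumptions the fixed 384 MB overhead dominates at such small heap sizes, so raising YA100_EXECUTORS uses up the node's memory faster than the raw YA100_MEMORY value suggests.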
ydb_site.yaml
storm.zookeeper.servers: "<hostname>"
ydb.ya100.hb.connuser: root
bootstrap.servers.ydb_syslog: "<hostname>"
create topic (run from the Kafka directory)
bin/kafka-topics.sh --create --zookeeper <hostname>:2181 --replication-factor 1 --partitions 1 --topic bcp003
Start and stop
bin/start-all.sh
bin/stop-all.sh
Summary
netstat
netstat -anp | grep LISTEN | grep 50070 # NN WEB
netstat -anp | grep LISTEN | grep 8020 # NN
netstat -anp | grep LISTEN | grep 50010 # DN
netstat -anp | grep LISTEN | grep 8088 # RM WEB
netstat -anp | grep LISTEN | grep 8042 # NM
netstat -anp | grep LISTEN | grep 2181 # ZK
netstat -anp | grep LISTEN | grep 9092 # KF
netstat -anp | grep LISTEN | grep 1210 # YDB WEB
netstat -anp | grep LISTEN | grep 10009 # YDB JDBC
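The nine checks above can be folded into one pass that reports only the ports that are not listening. A minimal sketch: missing_ports is a hypothetical helper that reads netstat-style output on stdin, and the sample listing stands in for real output so the block runs without any of the services installed.

```shell
#!/bin/sh
# Sketch: list expected service ports that are absent from a socket listing.
EXPECTED_PORTS="50070 8020 50010 8088 8042 2181 9092 1210 10009"

# missing_ports: read netstat-style lines on stdin, print one missing port per line.
missing_ports() {
  listing="$(cat)"
  for port in $EXPECTED_PORTS; do
    # Match ":PORT" followed by whitespace, i.e. the local-address column,
    # so a PID or another port that merely contains the digits is not counted.
    printf '%s\n' "$listing" | grep LISTEN | grep -Eq ":$port[[:space:]]" \
      || printf '%s\n' "$port"
  done
}

# Illustrative sample: only the NameNode web UI and ZooKeeper are up.
SAMPLE='tcp  0  0 0.0.0.0:50070  0.0.0.0:*  LISTEN
tcp  0  0 :::2181  :::*  LISTEN'

MISSING="$(printf '%s\n' "$SAMPLE" | missing_ports)"
echo "not listening:"
echo "$MISSING"
```

On the VM itself the helper would be fed live data, for example with netstat -ltn | missing_ports; empty output then means every expected port is listening.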
http
http://<hostname>:50070 NN
http://<hostname>:8088 RM
http://<hostname>:8042 NM
http://<hostname>:1210 YDB