Kafka cluster

Setup ZooKeeper on a single machine running Ubuntu 14.04 LTS
This post shows how to set up ZooKeeper on a single machine running Ubuntu 14.04 LTS.

First, download ZooKeeper from the following link:

http://mirror.nus.edu.sg/apache/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz

Extract the downloaded archive and place the extracted folder in a directory of your choice, say the following:
$HOME/Documents/Works/ZooKeeper/zookeeper-3.4.6

Now go to the $HOME directory and run the following command to open the .bashrc file in a terminal:

> gedit .bashrc

In the .bashrc file, add the following line to the end:
export ZK_HOME=$HOME/Documents/Works/ZooKeeper/zookeeper-3.4.6

Save and close the .bashrc file, then run the following command to update the environment variables:

> source .bashrc
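The export step above can be scripted so it is safe to run more than once; a minimal sketch, using a temporary stand-in for .bashrc so it can be tried anywhere:

```shell
#!/bin/sh
# Idempotently append the ZK_HOME export to a .bashrc file.
# RC is a temporary stand-in for $HOME/.bashrc, so this sketch is safe to run.
RC=$(mktemp)
LINE='export ZK_HOME=$HOME/Documents/Works/ZooKeeper/zookeeper-3.4.6'

grep -qxF "$LINE" "$RC" || echo "$LINE" >> "$RC"
grep -qxF "$LINE" "$RC" || echo "$LINE" >> "$RC"  # a second run adds nothing

cat "$RC"   # the export line appears exactly once
```

The `grep -qxF` guard matches the whole line literally, so re-running the script never duplicates the export.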

Now run the following command to go to the $ZK_HOME/conf directory

> cd $ZK_HOME/conf

Run the following command to create the configuration zoo.cfg:

> touch zoo.cfg
> gedit zoo.cfg

In the opened zoo.cfg, add the following lines:

tickTime=2000
dataDir=/tmp/zookeeper
clientPort=2181
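For reference: tickTime is ZooKeeper's basic time unit in milliseconds, dataDir is where it stores its in-memory database snapshots, and clientPort is the port clients connect to. The three lines above can be written in one step; a sketch that writes to a temporary directory (standing in for $ZK_HOME/conf) so it can be run safely anywhere:

```shell
#!/bin/sh
# Write a standalone zoo.cfg; CONF_DIR stands in for $ZK_HOME/conf.
CONF_DIR=$(mktemp -d)

cat > "$CONF_DIR/zoo.cfg" <<'EOF'
tickTime=2000
dataDir=/tmp/zookeeper
clientPort=2181
EOF

cat "$CONF_DIR/zoo.cfg"
```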

Save and close zoo.cfg, go back to the ZooKeeper home directory:

> cd $ZK_HOME

Run the following command to start ZooKeeper on the machine:

> bin/zkServer.sh start

Run the following command to check the running Java processes:

> jps

Run the following command to check the status of ZooKeeper:

> bin/zkServer.sh status

Run the following command to stop ZooKeeper:

> bin/zkServer.sh stop



#############################################################################################################################

Setup ZooKeeper in a cluster
To set up ZooKeeper in a cluster, suppose we have the following computers interconnected (e.g., via a switch) with the following IP addresses:

192.168.2.2
192.168.2.4

Let's assign a unique ID to each computer (an integer from 1 to 255); say we assign 192.168.2.2 id = 1 and 192.168.2.4 id = 2.

First, set up ZooKeeper on each machine by following the instructions in http://czcodezone.blogspot.sg/2014/11/setup-zookeeper-on-single-machine.html

Once ZooKeeper is set up, log into a terminal on each computer, create a "zookeeper" folder under the /var directory, and create a myid file in it that contains the unique ID assigned to that computer, by running the following commands:

> cd /var
> sudo mkdir zookeeper
> sudo chmod -R 777 zookeeper
> cd zookeeper
> sudo touch myid
> sudo gedit myid

In the opened myid file, put the unique ID and save the file. For example, the content of the myid file on 192.168.2.2 is

1

and the content of myid file on 192.168.2.4 is

2
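Since the same steps run on both machines, the myid assignment can be expressed as a small lookup; a sketch, where the IP-to-ID table is just the mapping chosen above (in real use you would redirect the output to /var/zookeeper/myid):

```shell
#!/bin/sh
# Print the ZooKeeper ID for a given IP address (the mapping chosen above).
# Real use on each machine: write_myid "$MY_IP" > /var/zookeeper/myid
write_myid() {
    case "$1" in
        192.168.2.2) echo 1 ;;
        192.168.2.4) echo 2 ;;
        *) echo "no id assigned to $1" >&2; return 1 ;;
    esac
}

write_myid 192.168.2.2   # prints 1
write_myid 192.168.2.4   # prints 2
```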

Now navigate to the ZK_HOME directory on each computer and update zoo.cfg in the conf sub-directory by running the following commands:
> cd $ZK_HOME
> cd conf
> gedit zoo.cfg

In the zoo.cfg file, write the following as its content:

tickTime=2000
dataDir=/var/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=192.168.2.2:2888:3888
server.2=192.168.2.4:2888:3888
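For reference: initLimit is the number of ticks a follower may take to connect and sync to the leader, syncLimit is how many ticks a follower may lag behind the leader, and in each server.N=ip:2888:3888 line, N must match that machine's myid, 2888 is the peer (quorum) port and 3888 the leader-election port. The cluster config above can also be generated from a server list; a sketch writing to a temporary file:

```shell
#!/bin/sh
# Generate a cluster zoo.cfg from a list of "id:ip" pairs (the two machines above).
SERVERS="1:192.168.2.2 2:192.168.2.4"
CFG=$(mktemp)

cat > "$CFG" <<'EOF'
tickTime=2000
dataDir=/var/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
EOF

for s in $SERVERS; do
    id=${s%%:*}    # part before the first colon
    ip=${s#*:}     # part after the first colon
    echo "server.$id=$ip:2888:3888" >> "$CFG"
done

cat "$CFG"
```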

Save and close the zoo.cfg file. Start ZooKeeper on each computer by running the following commands in its terminal:

> cd $ZK_HOME
> bin/zkServer.sh start

To check the status of the ZooKeeper cluster, type the following commands in each computer's terminal:
> cd $ZK_HOME
> bin/zkServer.sh status

To stop ZooKeeper on a computer, run the following commands:
> cd $ZK_HOME
> bin/zkServer.sh stop

A common problem causes the zkServer.sh status command to display the following message:

Error contacting service. It is probably not running.

When this message appears, it may be because one of the computers is not connected to the network (e.g., the Ethernet cable is loose). In this case, try pinging each computer to see whether they are connected.

Another caution is to make sure the /var/zookeeper folder on each computer has write permission (e.g., by running the "sudo chmod -R 777 /var/zookeeper" command in the terminal).

######################################################################################################################################

Setup Kafka on a single machine

Kafka is a messaging system that can act as a buffer and feeder for messages processed by Storm spouts. It can also be used as an output buffer for Storm bolts. This post shows how to set up and test Kafka on a single machine running Ubuntu.

First, download Kafka 0.8.1.1 from the link below:

https://www.apache.org/dyn/closer.cgi?path=/kafka/0.8.1.1/kafka_2.8.0-0.8.1.1.tgz

Next "tar -xvzf" the kafka_2.8.0-0.8.1.1.tgz file and move it to a destination folder (say, /Documents/Works/Kafka folder under the user root directory):

> tar -xvzf kafka_2.8.0-0.8.1.1.tgz
> mkdir $HOME/Documents/Works/Kafka
> mv kafka_2.8.0-0.8.1.1 $HOME/Documents/Works/Kafka

Now go back to the user home folder and open the .bashrc file for editing:

> cd $HOME
> gedit .bashrc

In the .bashrc file, add the following line to the end:

export KAFKA_HOME=$HOME/Documents/Works/Kafka/kafka_2.8.0-0.8.1.1

Save and close the .bashrc and run "source .bashrc" to update the environment variables. Now navigate to the Kafka home folder and edit server.properties in its "config" sub-directory:

> cd $KAFKA_HOME/config
> gedit server.properties

In the server.properties file, search for the line "zookeeper.connect" and change it to the following:

zookeeper.connect=192.168.2.2:2181,192.168.2.4:2181

Search for the line "log.dirs" and change it to the following:

log.dirs=/var/kafka-logs

Save and close the server.properties file (192.168.2.2 and 192.168.2.4 are the zookeeper nodes). Next we go and create the folder /var/kafka-logs (which will store the topics and partitions data for kafka) with write permissions:

> sudo mkdir /var/kafka-logs
> sudo chmod -R 777 /var/kafka-logs
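The two server.properties edits can also be done non-interactively with sed; a sketch against a small stand-in file (the real server.properties has many more lines, but the same substitutions apply):

```shell
#!/bin/sh
# Apply the two edits with sed; PROPS stands in for $KAFKA_HOME/config/server.properties.
PROPS=$(mktemp)
printf 'zookeeper.connect=localhost:2181\nlog.dirs=/tmp/kafka-logs\n' > "$PROPS"

sed -i 's|^zookeeper\.connect=.*|zookeeper.connect=192.168.2.2:2181,192.168.2.4:2181|' "$PROPS"
sed -i 's|^log\.dirs=.*|log.dirs=/var/kafka-logs|' "$PROPS"

cat "$PROPS"
```

Note that sed -i edits in place as GNU sed does on Ubuntu; BSD sed would need `sed -i ''`.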

Now set up and run the ZooKeeper cluster by following the instructions in the link http://czcodezone.blogspot.sg/2014/11/setup-zookeeper-in-cluster.html. Once this is done, we are ready to start the Kafka messaging system by running the following commands:

> cd $KAFKA_HOME
> bin/kafka-server-start.sh config/server.properties

To start testing the Kafka setup, press Ctrl+Alt+T to open a new terminal and run the following commands to create a topic "verification-topic" (a topic is a named entity in Kafka which contains one or more partitions; partitions are message queues that can run in parallel and serialize to individual folders in the /var/kafka-logs folder):

> cd $KAFKA_HOME
> bin/kafka-topics.sh --create --zookeeper 192.168.2.2:2181 --topic verification-topic --partitions 1 --replication-factor 1

The above command creates a topic named "verification-topic" which contains 1 partition and has a replication factor of 1 (no replication).

Now we can check the list of topics in Kafka by running the following command:

> bin/kafka-topics.sh --zookeeper 192.168.2.2:2181 --list

To test the producer and consumer interaction in Kafka, fire up the console producer by running

> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic verification-topic

9092 is the default port for a Kafka broker node (which is localhost at the moment). The terminal now enters interactive mode. Let's open another terminal and run the console consumer:

> bin/kafka-console-consumer.sh --zookeeper 192.168.2.2:2181 --topic verification-topic

Now enter some data in the console producer terminal and you should see the data immediately displayed in the console consumer terminal.

######################################################################################################################################

Setup Kafka in a cluster

To set up Kafka in a cluster, first we must have the ZooKeeper cluster set up and running (follow this link: http://czcodezone.blogspot.sg/2014/11/setup-zookeeper-in-cluster.html). Suppose that the ZooKeeper cluster consists of the ZooKeeper servers running at the following hostname:ports:

192.168.2.2:2181
192.168.2.4:2181

As I have only two computers, I will use the same computers (but at different ports) to host the Kafka cluster. In this case, the Kafka servers/brokers will be running at the following hostname:ports:

192.168.2.2:9092
192.168.2.4:9092

To do this, follow this link (http://czcodezone.blogspot.sg/2014/11/setup-kafka-in-single-machine-running.html) to set up the Kafka server. Now navigate to the Kafka root folder of each computer and modify server.properties in the "config" sub-folder:

> cd $KAFKA_HOME
> gedit config/server.properties

In the server.properties file, search for the line "zookeeper.connect" and change it to:

zookeeper.connect=192.168.2.2:2181,192.168.2.4:2181

Then search for the line "broker.id" (a unique ID for each broker node) and change it to "broker.id=1" on computer 192.168.2.2 and to "broker.id=2" on computer 192.168.2.4.

Next, search for the line "host.name" and change it to "host.name=192.168.2.2" on computer 192.168.2.2 and to "host.name=192.168.2.4" on computer 192.168.2.4.

Make sure that the line "port=9092" is present and uncommented in server.properties.
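The per-machine settings can be applied by one script keyed on the machine's IP; a sketch against a stand-in properties file, where the IP-to-broker.id table is just the assignment chosen above:

```shell
#!/bin/sh
# set_broker FILE IP: set broker.id and host.name for the machine at IP.
set_broker() {
    f=$1
    ip=$2
    case "$ip" in
        192.168.2.2) id=1 ;;
        192.168.2.4) id=2 ;;
        *) return 1 ;;
    esac
    sed -i "s|^broker\.id=.*|broker.id=$id|" "$f"
    sed -i "s|^host\.name=.*|host.name=$ip|" "$f"
}

PROPS=$(mktemp)   # stand-in for $KAFKA_HOME/config/server.properties
printf 'broker.id=0\nhost.name=localhost\nport=9092\n' > "$PROPS"

set_broker "$PROPS" 192.168.2.2
cat "$PROPS"
```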

Save and close the server.properties. Now start the Kafka server on each computer:

> cd $KAFKA_HOME
> bin/kafka-server-start.sh config/server.properties

At this point, the Kafka cluster is set up and running. We can test the cluster by creating a topic named "v-topic":

> bin/kafka-topics.sh --create --zookeeper 192.168.2.4:2181 --partitions 2 --replication-factor 1 --topic v-topic

Now run the following command to list the topics in the Kafka brokers:

> bin/kafka-topics.sh --zookeeper 192.168.2.4:2181 --list

Now run the following command to get a description of how the topic "v-topic" is partitioned across the brokers:

> bin/kafka-topics.sh --describe --zookeeper 192.168.2.4:2181 --topic v-topic

To test the producer and consumer interaction, let's start a console producer on computer 192.168.2.2 by running the following commands in that computer's terminal:

> cd $KAFKA_HOME
> bin/kafka-console-producer.sh --broker-list 192.168.2.2:9092,192.168.2.4:9092 --topic v-topic

Now open a terminal on the other computer, 192.168.2.4, and start a console consumer:

> cd $KAFKA_HOME
> bin/kafka-console-consumer.sh --zookeeper 192.168.2.4:2181 --topic v-topic --from-beginning

Type something in the console producer on the 192.168.2.2 terminal and press ENTER; you will see the output displayed in the console consumer on the 192.168.2.4 terminal.

Note:

It is also OK to set up multiple Kafka brokers on the same computer. For example, suppose we want two Kafka brokers running at two different ports on computer 192.168.2.2, say:

192.168.2.2:9092
192.168.2.2:9093

All we need to do is duplicate the updated server.properties and rename the copy server1.properties in the same "config" folder (the name is not important; it can be anything that makes sense). Then, in server1.properties, modify it to have the following settings:

broker.id=3
log.dirs=/var/kafka1-logs
port=9093
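Deriving server1.properties from the updated server.properties can also be scripted; a sketch with a small stand-in source file (the real file has many more lines, but the same three substitutions apply):

```shell
#!/bin/sh
# Copy a server.properties stand-in and change the three settings for a second broker.
SRC=$(mktemp)
DST="$SRC.1"   # stand-in for config/server1.properties
printf 'broker.id=1\nlog.dirs=/var/kafka-logs\nport=9092\n' > "$SRC"

cp "$SRC" "$DST"
sed -i 's|^broker\.id=.*|broker.id=3|' "$DST"
sed -i 's|^log\.dirs=.*|log.dirs=/var/kafka1-logs|' "$DST"
sed -i 's|^port=.*|port=9093|' "$DST"

cat "$DST"
```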

Save and close server1.properties (remember to create the folder /var/kafka1-logs with write permission). Then open two terminals on 192.168.2.2 and run the following command in the first terminal to start a Kafka broker at port 9092:

> $KAFKA_HOME/bin/kafka-server-start.sh $KAFKA_HOME/config/server.properties

In the second terminal, run the following command to start a second Kafka broker at port 9093:

> $KAFKA_HOME/bin/kafka-server-start.sh $KAFKA_HOME/config/server1.properties

Now you will have two Kafka brokers running on 192.168.2.2 on two different ports. To include the second broker in the console producer, change its start command to:

> $KAFKA_HOME/bin/kafka-console-producer.sh --broker-list 192.168.2.2:9092,192.168.2.2:9093,192.168.2.4:9092 --topic v-topic
