Kafka & ZooKeeper | Multi Node Cluster Setup


This post explains how to set up a multi-node Kafka and ZooKeeper cluster in a distributed environment.

What is Apache Kafka?

Apache Kafka is a high-throughput distributed messaging system designed to allow a single cluster to serve as the central data backbone for a large organization. It can be elastically and transparently expanded without downtime. Data streams are partitioned and spread over a cluster of machines, which allows data streams larger than the capability of any single machine and allows clusters of coordinated consumers.

What is ZooKeeper?

ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications. Each time they are implemented, a lot of work goes into fixing the bugs and race conditions that are inevitable. Because of the difficulty of implementing these kinds of services, applications usually skimp on them at first, which makes them brittle in the presence of change and difficult to manage. Even when done correctly, different implementations of these services lead to management complexity when the applications are deployed.

Learn more about ZooKeeper on the ZooKeeper Wiki.

Prerequisites

  1. Java: install it if you do not have it already.
  2. Kafka binary files: http://kafka.apache.org/downloads.html

Installation

  • First, download the Kafka tarball on all of your instances and extract it (the later steps assume it ends up in your home directory as ~/kafka):
$ tar -xzvf kafka_2.11-0.9.0.1.tgz
$ mv kafka_2.11-0.9.0.1 kafka
  • On every instance, only two properties files need to be changed: zookeeper.properties and server.properties

Let's start by editing “zookeeper.properties” on all the instances:

$ vi ~/kafka/config/zookeeper.properties
# The number of milliseconds of each tick
tickTime=2000
 
# The number of ticks that the initial synchronization phase can take
initLimit=10
 
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5

# zoo servers
server.1=x.x.x.x:2888:3888
server.2=x.x.x.x:2888:3888
server.3=x.x.x.x:2888:3888
# add more servers here if needed
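
With more than a handful of nodes, typing the server lines by hand gets error-prone. Here is a minimal sketch that prints them from a list of hosts. The function name zk_server_lines and the 192.0.2.x addresses are placeholders made up for illustration; the ports 2888 and 3888 are the ones used in the config above.

```shell
# Hypothetical helper: print one "server.N=host:2888:3888" line per host,
# numbering the hosts from 1 in the order they are given.
zk_server_lines() {
  n=1
  for host in "$@"; do
    printf 'server.%d=%s:2888:3888\n' "$n" "$host"
    n=$((n + 1))
  done
}

# Placeholder addresses from the documentation range, not real nodes.
zk_server_lines 192.0.2.11 192.0.2.12 192.0.2.13
```

The output can be pasted into zookeeper.properties on every instance, as long as the hosts are passed in the same order everywhere so the numbering stays consistent.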

Now edit “server.properties” on every instance and update the following:

$ vi ~/kafka/config/server.properties
# Unique per broker: increase by one for each node
broker.id=1
# IP address of the current node
host.name=x.x.x.x
zookeeper.connect=x.x.x.x:2181,x.x.x.x:2181,x.x.x.x:2181
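
The zookeeper.connect value is just the ZooKeeper hosts joined with commas, each followed by the client port 2181. As a rough sketch, it can be built from the same host list; zk_connect_string is a hypothetical helper name and the addresses are placeholders.

```shell
# Hypothetical helper: join the given hosts as "host:2181,host:2181,...".
zk_connect_string() {
  out=""
  for host in "$@"; do
    # Add a comma separator only when something is already in the string.
    out="${out:+$out,}${host}:2181"
  done
  printf '%s\n' "$out"
}

# Placeholder addresses, not real nodes.
printf 'zookeeper.connect=%s\n' "$(zk_connect_string 192.0.2.11 192.0.2.12 192.0.2.13)"
```

The same line goes into server.properties on every broker; only broker.id and host.name differ from node to node.
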
  • After this, go to /tmp on every instance and create the ZooKeeper data directory (/tmp/zookeeper is the default dataDir in the zookeeper.properties shipped with Kafka) and its myid file
$ cd /tmp/
$ mkdir -p zookeeper  # ZooKeeper data directory
$ cd zookeeper
$ echo '1' > myid  # Server ID of this instance: 1 for server.1, 2 for server.2, and so on
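
The myid step is easy to get wrong when repeated across nodes, so it may be worth wrapping in a small function. This is only a sketch: write_myid is a made-up name, and the default directory assumes dataDir is still /tmp/zookeeper as in the steps above.

```shell
# Hypothetical helper: write the given server id to <data_dir>/myid.
# The file is overwritten rather than appended to, so reruns are safe.
write_myid() {
  id="$1"
  data_dir="${2:-/tmp/zookeeper}"   # assumed dataDir, override if yours differs
  case "$id" in
    ''|*[!0-9]*)
      echo "server id must be a number, got: '$id'" >&2
      return 1
      ;;
  esac
  mkdir -p "$data_dir" && printf '%s\n' "$id" > "$data_dir/myid"
}

# Demo in a throwaway directory so nothing real is touched.
demo_dir=$(mktemp -d)
write_myid 2 "$demo_dir" && cat "$demo_dir/myid"
```

On the instance listed as server.2 you would call write_myid 2 with no second argument, and so on for the other instances.
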
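  • Now everything is configured. Start ZooKeeper on all instances first, then start the Kafka server on all instances (each command runs in the foreground, so use a separate terminal for each)

$ ~/kafka/bin/zookeeper-server-start.sh ~/kafka/config/zookeeper.properties

$ ~/kafka/bin/kafka-server-start.sh ~/kafka/config/server.properties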
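
If a ZooKeeper node does not join the others after starting, one thing worth checking is that the id in its myid file actually has a matching server.N entry in zookeeper.properties. A small sketch of that check follows; check_myid is a hypothetical name and both paths are passed in by the caller.

```shell
# Hypothetical check: succeed only if the id stored in the myid file
# has a matching "server.<id>=" line in the given properties file.
check_myid() {
  props="$1"
  myid_file="$2"
  id=$(cat "$myid_file") || return 1
  # Reject empty or non-numeric content (for example a file with two lines).
  case "$id" in
    ''|*[!0-9]*) return 1 ;;
  esac
  grep -q "^server\.${id}=" "$props"
}
```

With the layout used in this post, it could be run on each instance as: check_myid ~/kafka/config/zookeeper.properties /tmp/zookeeper/myid && echo "myid OK"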

We will keep looking at how to provide more useful tutorials and will add more content over time. If you have any suggestions, feel free to share them with us 🙂 Stay tuned.
