Kafka Official Documentation

Reposted from:

http://kafka.apache.org/documentation.html#quickstart



Kafka 0.10.2 Documentation

Prior releases: 0.7.x, 0.8.0, 0.8.1.X, 0.8.2.X, 0.9.0.X, 0.10.0.X, 0.10.1.X.

1. Getting Started

1.1 Introduction

1.2 Use Cases

Here is a description of a few of the popular use cases for Apache Kafka™. For an overview of a number of these areas in action, see this blog post.

Messaging
Kafka works well as a replacement for a more traditional message broker. Message brokers are used for a variety of reasons (to decouple processing from data producers, to buffer unprocessed messages, etc.). In comparison to most messaging systems, Kafka has better throughput, built-in partitioning, replication, and fault tolerance, which makes it a good solution for large-scale message processing applications.

In our experience, messaging uses are often comparatively low-throughput, but may require low end-to-end latency and often depend on the strong durability guarantees Kafka provides.

In this domain Kafka is comparable to traditional messaging systems such as ActiveMQ or RabbitMQ.

Website Activity Tracking
The original use case for Kafka was to be able to rebuild a user activity tracking pipeline as a set of real-time publish-subscribe feeds. This means site activity (page views, searches, or other actions users may take) is published to central topics with one topic per activity type. These feeds are available for subscription for a range of use cases including real-time processing, real-time monitoring, and loading into Hadoop or offline data warehousing systems for offline processing and reporting.

Activity tracking is often very high volume as many activity messages are generated for each user page view.
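
The "one topic per activity type" layout can be illustrated with a minimal in-memory sketch. It does not use the Kafka client API: the `topics` dictionary stands in for a set of topics, and the `activity.<type>` naming scheme, the `topic_for` and `publish` helpers, and the sample events are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for Kafka topics (not the Kafka client API).
# Each key is a topic name; each value is the ordered list of published events.
topics = defaultdict(list)


def topic_for(activity_type):
    """Map an activity type to a topic name (the naming scheme is an assumption)."""
    return "activity." + activity_type


def publish(event):
    """Append the event to the topic that matches its activity type."""
    topics[topic_for(event["type"])].append(event)


# Sample site activity: page views and searches go to separate topics.
events = [
    {"type": "page_view", "user": "u1", "url": "/home"},
    {"type": "search", "user": "u1", "query": "kafka"},
    {"type": "page_view", "user": "u2", "url": "/docs"},
]
for event in events:
    publish(event)
```

With this layout, a consumer interested only in searches reads `topics["activity.search"]` and never has to filter out page views.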

Metrics
Kafka is often used for operational monitoring data. This involves aggregating statistics from distributed applications to produce centralized feeds of operational data.
Log Aggregation
Many people use Kafka as a replacement for a log aggregation solution. Log aggregation typically collects physical log files off servers and puts them in a central place (a file server or HDFS perhaps) for processing. Kafka abstracts away the details of files and gives a cleaner abstraction of log or event data as a stream of messages. This allows for lower-latency processing and easier support for multiple data sources and distributed data consumption. In comparison to log-centric systems like Scribe or Flume, Kafka offers equally good performance, stronger durability guarantees due to replication, and much lower end-to-end latency.
Stream Processing
Many users of Kafka process data in processing pipelines consisting of multiple stages, where raw input data is consumed from Kafka topics and then aggregated, enriched, or otherwise transformed into new topics for further consumption or follow-up processing. For example, a processing pipeline for recommending news articles might crawl article content from RSS feeds and publish it to an "articles" topic; further processing might normalize or deduplicate this content and publish the cleansed article content to a new topic; a final processing stage might attempt to recommend this content to users. Such processing pipelines create graphs of real-time data flows based on the individual topics. Starting in 0.10.0.0, a light-weight but powerful stream processing library called Kafka Streams is available in Apache Kafka to perform such data processing as described above. Apart from Kafka Streams, alternative open source stream processing tools include Apache Storm and Apache Samza.
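
The normalize-and-deduplicate stage of the news pipeline can be sketched in plain Python. This is only an in-memory model, not Kafka Streams: lists stand in for topics, and the `normalize` and `deduplicate` functions, their cleanup rules, and the sample articles are hypothetical.

```python
def normalize(article):
    """Stage 1: trim whitespace and lowercase the title (assumed cleanup rules)."""
    return {"id": article["id"], "title": article["title"].strip().lower()}


def deduplicate(articles):
    """Stage 2: keep the first article for each normalized title, drop repeats."""
    seen = set()
    unique = []
    for article in articles:
        if article["title"] not in seen:
            seen.add(article["title"])
            unique.append(article)
    return unique


# Stand-in for the "articles" topic: raw content as crawled from RSS feeds.
articles_topic = [
    {"id": 1, "title": "  Kafka 0.10.2 Released "},
    {"id": 2, "title": "kafka 0.10.2 released"},
    {"id": 3, "title": "Stream Processing Basics"},
]

# Stand-in for the new topic holding cleansed content for the next stage.
cleansed_topic = deduplicate([normalize(a) for a in articles_topic])
```

Each stage reads from one topic and writes to another, so a recommendation stage could consume `cleansed_topic` without knowing how the cleanup was done.
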
Event Sourcing
Event sourcing is a style of application design where state changes are logged as a time-ordered sequence of records. Kafka's support for very large stored log data makes it an excellent backend for an application built in this style.
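
As a rough illustration of the event sourcing style, the sketch below rebuilds state by replaying a time-ordered list of events. The account example, the event names, and the `apply_event` and `replay` helpers are assumptions for illustration; a real application would read the events from a Kafka topic instead of a Python list.

```python
def apply_event(balance, event):
    """Return the new balance after applying one state-change event."""
    kind, amount = event
    if kind == "deposited":
        return balance + amount
    if kind == "withdrawn":
        return balance - amount
    raise ValueError("unknown event type: " + kind)


def replay(events, initial=0):
    """Rebuild the current state by applying every event in log order."""
    balance = initial
    for event in events:
        balance = apply_event(balance, event)
    return balance


# Time-ordered log of state changes (hypothetical account events).
event_log = [("deposited", 100), ("withdrawn", 30), ("deposited", 5)]
balance = replay(event_log)
```

Because the log is the source of truth, replaying a prefix such as `event_log[:2]` reconstructs the state as it was at that earlier point.
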
Commit Log
Kafka can serve as a kind of external commit-log for a distributed system. The log helps replicate data between nodes and acts as a re-syncing mechanism for failed nodes to restore their data. The log compaction feature in Kafka helps support this usage. In this usage Kafka is similar to the Apache BookKeeper project.
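
The idea behind log compaction, keeping the latest record for each key so that a failed node can still restore its full state, can be modeled in a few lines. This is a simplified sketch, not Kafka's implementation: the `compact` and `restore` helpers and the sample log are hypothetical, and details such as delete markers and segment handling are left out.

```python
def compact(log):
    """Keep only the latest record for each key, preserving offset order."""
    last_offset = {}
    for offset, (key, _value) in enumerate(log):
        last_offset[key] = offset
    return [
        record
        for offset, record in enumerate(log)
        if last_offset[record[0]] == offset
    ]


def restore(log):
    """Rebuild a node's key-value state by replaying the log from the start."""
    state = {}
    for key, value in log:
        state[key] = value
    return state


# Keyed commit log: later records for a key supersede earlier ones.
log = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
compacted = compact(log)
```

Replaying the compacted log yields the same final state as replaying the full log, which is why compaction suits the re-syncing use described above.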

1.3 Quick Start

1.4 Ecosystem

There is a plethora of tools that integrate with Kafka outside the main distribution. The ecosystem page lists many of these, including stream processing systems, Hadoop integration, monitoring, and deployment tools.

1.5 Upgrading From Previous Versions

2. APIs

3. Configuration
