RabbitMQ Tutorial


Introduction

Where to get help

If you're having trouble going through this tutorial you can contact us through the discussion list or directly.

RabbitMQ is a message broker. The principal idea is pretty simple: it accepts and forwards messages. You can think about it as a post office: when you send mail to the post box you're pretty sure that Mr. Postman will eventually deliver the mail to your recipient. Using this metaphor RabbitMQ is a post box, a post office and a postman.

The major difference between RabbitMQ and the post office is the fact that it doesn't deal with paper; instead it accepts, stores and forwards binary blobs of data ‒ messages.

RabbitMQ, and messaging in general, uses some jargon.

  • Producing means nothing more than sending. A program that sends messages is a producer. We'll draw it like that, with "P".

  • A queue is the name for a mailbox. It lives inside RabbitMQ. Although messages flow through RabbitMQ and your applications, they can be stored only inside a queue. A queue is not bound by any limits; it can store as many messages as you like ‒ it's essentially an infinite buffer. Many producers can send messages that go to one queue, and many consumers can try to receive data from one queue. A queue will be drawn with its name above it.

  • Consuming has a similar meaning to receiving. A consumer is a program that mostly waits to receive messages. On our drawings it's shown with "C".

Hello World!

(using the pika 0.9.5 Python client)

Our "Hello world" won't be too complex ‒ let's send a message, receiveit and print it on the screen. To do so we need two programs: one thatsends a message and one that receives and prints it.

Our overall design will look like this: the producer sends messages to the "hello" queue, and the consumer receives messages from that queue.

RabbitMQ libraries

RabbitMQ speaks a protocol called AMQP. To use Rabbit you'll need a library that understands the same protocol as Rabbit. There is a choice of libraries for almost every programming language. For Python it's no different and there are a bunch of libraries to choose from.

In this tutorial series we're going to use pika. To install it you can use the pip package management tool:

$ sudo pip install pika==0.9.5

The installation depends on the pip and git-core packages; you may need to install them first.

  • On Ubuntu:

    $ sudo apt-get install python-pip git-core
    
  • On Debian:

    $ sudo apt-get install python-setuptools git-core
    $ sudo easy_install pip
    
  • On Windows: To install easy_install, run the MS Windows Installer for setuptools

    > easy_install pip
    > pip install pika==0.9.5
    

Sending

Our first program send.py will send a single message to the queue. The first thing we need to do is to establish a connection with the RabbitMQ server.

#!/usr/bin/env python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(
               'localhost'))
channel = connection.channel()

We're connected now. Next, before sending we need to make sure the recipient queue exists. If we send a message to a non-existing location, RabbitMQ will just trash the message. Let's create a queue to which the message will be delivered; let's name it hello:

channel.queue_declare(queue='hello')

At that point we're ready to send a message. Our first message will just contain a string Hello World! and we want to send it to our hello queue.

In RabbitMQ a message can never be sent directly to the queue; it always needs to go through an exchange. But let's not get dragged down by the details ‒ you can read more about exchanges in the third part of this tutorial. All we need to know now is how to use a default exchange identified by an empty string. This exchange is special ‒ it allows us to specify exactly to which queue the message should go. The queue name needs to be specified in the routing_key parameter:

channel.basic_publish(exchange='',
                      routing_key='hello',
                      body='Hello World!')
print " [x] Sent 'Hello World!'"

Before exiting the program we need to make sure the network buffers were flushed and our message was actually delivered to RabbitMQ. We can do it by gently closing the connection.

connection.close()

Receiving

Our second program receive.py will receive messages from the queue and print them on the screen.

Again, first we need to connect to the RabbitMQ server. The code responsible for connecting to Rabbit is the same as previously.

The next step, just like before, is to make sure that the queue exists. Creating a queue using queue_declare is idempotent ‒ we can run the command as many times as we like, and only one will be created.

channel.queue_declare(queue='hello')

You may ask why we declare the queue again ‒ we have already declared it in our previous code. We could avoid that if we were sure that the queue already exists, for example if the send.py program was run before. But we're not yet sure which program to run first. In such cases it's a good practice to repeat declaring the queue in both programs.
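
Since the call is idempotent, even repeating it back-to-back within one program is harmless ‒ a quick sketch:

channel.queue_declare(queue='hello')
channel.queue_declare(queue='hello')  # a no-op: the queue already exists with the same settings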

Listing queues

You may wish to see what queues RabbitMQ has and how many messages are in them. You can do it (as a privileged user) using the rabbitmqctl tool:

$ sudo rabbitmqctl list_queues
Listing queues ...
hello    0
...done.

(omit sudo on Windows)

Receiving messages from the queue is more complex. It works by subscribing a callback function to a queue. Whenever we receive a message, this callback function is called by the Pika library. In our case this function will print on the screen the contents of the message.

def callback(ch, method, properties, body):
    print " [x] Received %r" % (body,)

Next, we need to tell RabbitMQ that this particular callback function should receive messages from our hello queue:

channel.basic_consume(callback,
                      queue='hello',
                      no_ack=True)

For that command to succeed we must be sure that a queue which we want to subscribe to exists. Fortunately we're confident about that ‒ we've created a queue above using queue_declare.

The no_ack parameter will be described later on.

And finally, we enter a never-ending loop that waits for data and runs callbacks whenever necessary.

print ' [*] Waiting for messages. To exit press CTRL+C'
channel.start_consuming()

Putting it all together

Full code for send.py:

#!/usr/bin/env python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(
        host='localhost'))
channel = connection.channel()

channel.queue_declare(queue='hello')

channel.basic_publish(exchange='',
                      routing_key='hello',
                      body='Hello World!')
print " [x] Sent 'Hello World!'"
connection.close()

(send.py source)

Full receive.py code:

#!/usr/bin/env python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(
        host='localhost'))
channel = connection.channel()

channel.queue_declare(queue='hello')

print ' [*] Waiting for messages. To exit press CTRL+C'

def callback(ch, method, properties, body):
    print " [x] Received %r" % (body,)

channel.basic_consume(callback,
                      queue='hello',
                      no_ack=True)

channel.start_consuming()

(receive.py source)

Now we can try out our programs in a terminal. First, let's send a message using our send.py program:

 $ python send.py
 [x] Sent 'Hello World!'

The producer program send.py will stop after every run. Let's receive it:

 $ python receive.py
 [*] Waiting for messages. To exit press CTRL+C
 [x] Received 'Hello World!'

Hurray! We were able to send our first message through RabbitMQ. As you might have noticed, the receive.py program doesn't exit. It will stay ready to receive further messages, and may be interrupted with Ctrl-C.

Try to run send.py again in a new terminal.

We've learned how to send and receive a message from a named queue. It's time to move on to part 2 and build a simple work queue.



Work Queues

(using the pika 0.9.5 Python client)

Where to get help

If you're having trouble going through this tutorial you can contact us through the discussion list or directly.

In the first tutorial we wrote programs to send and receive messages from a named queue. In this one we'll create a Work Queue that will be used to distribute time-consuming tasks among multiple workers.

The main idea behind Work Queues (aka: Task Queues) is to avoid doing a resource-intensive task immediately and having to wait for it to complete. Instead we schedule the task to be done later. We encapsulate a task as a message and send it to the queue. A worker process running in the background will pop the tasks and eventually execute the job. When you run many workers the tasks will be shared between them.

This concept is especially useful in web applications where it's impossible to handle a complex task during a short HTTP request window.

Preparation

In the previous part of this tutorial we sent a message containing "Hello World!". Now we'll be sending strings that stand for complex tasks. We don't have a real-world task, like images to be resized or pdf files to be rendered, so let's fake it by just pretending we're busy - by using the time.sleep() function. We'll take the number of dots in the string as its complexity; every dot will account for one second of "work". For example, a fake task described by Hello... will take three seconds.

We will slightly modify the send.py code from our previous example, to allow arbitrary messages to be sent from the command line. This program will schedule tasks to our work queue, so let's name it new_task.py:

import sys

message = ' '.join(sys.argv[1:]) or "Hello World!"
channel.basic_publish(exchange='',
                      routing_key='hello',
                      body=message)
print " [x] Sent %r" % (message,)

Our old receive.py script also requires some changes: it needs to fake a second of work for every dot in the message body. It will pop messages from the queue and perform the task, so let's call it worker.py:

import time

def callback(ch, method, properties, body):
    print " [x] Received %r" % (body,)
    time.sleep( body.count('.') )
    print " [x] Done"

Round-robin dispatching

One of the advantages of using a Task Queue is the ability to easily parallelise work. If we are building up a backlog of work, we can just add more workers and that way scale easily.

First, let's try to run two worker.py scripts at the same time. Theywill both get messages from the queue, but how exactly? Let's see.

You need three consoles open. Two will run the worker.py script. These consoles will be our two consumers - C1 and C2.

shell1$ python worker.py
 [*] Waiting for messages. To exit press CTRL+C
shell2$ python worker.py
 [*] Waiting for messages. To exit press CTRL+C

In the third one we'll publish new tasks. Once you've started the consumers you can publish a few messages:

shell3$ python new_task.py First message.
shell3$ python new_task.py Second message..
shell3$ python new_task.py Third message...
shell3$ python new_task.py Fourth message....
shell3$ python new_task.py Fifth message.....

Let's see what is delivered to our workers:

shell1$ python worker.py
 [*] Waiting for messages. To exit press CTRL+C
 [x] Received 'First message.'
 [x] Received 'Third message...'
 [x] Received 'Fifth message.....'
shell2$ python worker.py
 [*] Waiting for messages. To exit press CTRL+C
 [x] Received 'Second message..'
 [x] Received 'Fourth message....'

By default, RabbitMQ will send each message to the next consumer,in sequence. On average every consumer will get the same number ofmessages. This way of distributing messages is called round-robin. Trythis out with three or more workers.

Message acknowledgment

Doing a task can take a few seconds. You may wonder what happens if one of the consumers starts a long task and dies with it only partly done. With our current code, once RabbitMQ delivers a message to the consumer it immediately removes it from memory. In this case, if you kill a worker we will lose the message it was just processing. We'll also lose all the messages that were dispatched to this particular worker but were not yet handled.

But we don't want to lose any tasks. If a worker dies, we'd like the task to be delivered to another worker.

In order to make sure a message is never lost, RabbitMQ supports message acknowledgments. An ack(nowledgement) is sent back from the consumer to tell RabbitMQ that a particular message has been received and processed and that RabbitMQ is free to delete it.

If a consumer dies without sending an ack, RabbitMQ will understand that a message wasn't processed fully and will redeliver it to another consumer. That way you can be sure that no message is lost, even if the workers occasionally die.

There aren't any message timeouts; RabbitMQ will redeliver the message only when the worker connection dies. It's fine even if processing a message takes a very, very long time.

Message acknowledgments are turned on by default. In previous examples we explicitly turned them off via the no_ack=True flag. It's time to remove this flag and send a proper acknowledgment from the worker, once we're done with a task.

def callback(ch, method, properties, body):
    print " [x] Received %r" % (body,)
    time.sleep( body.count('.') )
    print " [x] Done"
    ch.basic_ack(delivery_tag = method.delivery_tag)  # send an ack back

channel.basic_consume(callback,
                      queue='hello')

Using this code we can be sure that even if you kill a worker using CTRL+C while it was processing a message, nothing will be lost. Soon after the worker dies all unacknowledged messages will be redelivered.

Forgotten acknowledgment

It's a common mistake to miss the basic_ack. It's an easy error, but the consequences are serious. Messages will be redelivered when your client quits (which may look like random redelivery), but RabbitMQ will eat more and more memory as it won't be able to release any unacked messages.
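
As an illustration, a hypothetical consumer like this one shows exactly that behaviour ‒ acknowledgments are on, but the callback never sends one:

def bad_callback(ch, method, properties, body):
    print " [x] Received %r" % (body,)
    # basic_ack is never sent, so this delivery stays unacknowledged
    # until the consumer's connection dies

channel.basic_consume(bad_callback,
                      queue='hello')  # no no_ack=True, so an ack is expected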

In order to debug this kind of mistake you can use rabbitmqctl to print the messages_unacknowledged field:

$ sudo rabbitmqctl list_queues name messages_ready messages_unacknowledged
Listing queues ...
hello    0       0
...done.

Message durability

We have learned how to make sure that even if the consumer dies, the task isn't lost. But our tasks will still be lost if the RabbitMQ server stops.

When RabbitMQ quits or crashes it will forget the queues and messages unless you tell it not to. Two things are required to make sure that messages aren't lost: we need to mark both the queue and messages as durable.

First, we need to make sure that RabbitMQ will never lose our queue. In order to do so, we need to declare it as durable:

channel.queue_declare(queue='hello', durable=True)

Although this command is correct by itself, it won't work in our setup. That's because we've already defined a queue called hello which is not durable. RabbitMQ doesn't allow you to redefine an existing queue with different parameters and will return an error to any program that tries to do that. But there is a quick workaround - let's declare a queue with a different name, for example task_queue:

channel.queue_declare(queue='task_queue', durable=True)

This queue_declare change needs to be applied to both the producer and consumer code.
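
To make the restriction concrete: against our existing setup, a sketch like this would fail ‒ RabbitMQ refuses the second declaration and closes the channel with a PRECONDITION_FAILED error, because hello was originally declared as non-durable:

channel.queue_declare(queue='hello')                # matches the original declaration
channel.queue_declare(queue='hello', durable=True)  # refused: parameters differ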

At that point we're sure that the task_queue queue won't be lost even if RabbitMQ restarts. Now we need to mark our messages as persistent - by supplying a delivery_mode property with the value 2:

channel.basic_publish(exchange='',
                      routing_key="task_queue",
                      body=message,
                      properties=pika.BasicProperties(
                         delivery_mode = 2, # make message persistent
                      ))

Note on message persistence

Marking messages as persistent doesn't fully guarantee that a message won't be lost. Although it tells RabbitMQ to save the message to disk, there is still a short time window when RabbitMQ has accepted a message and hasn't saved it yet. Also, RabbitMQ doesn't do fsync(2) for every message -- it may be just saved to cache and not really written to the disk. The persistence guarantees aren't strong, but it's more than enough for our simple task queue. If you need a stronger guarantee you can wrap the publishing code in a transaction.
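
As a rough illustration, a transactional publish could look like the sketch below; it assumes the channel object exposes tx_select() and tx_commit() (the blocking API in later pika releases does ‒ check your client version):

channel.tx_select()              # switch the channel into transaction mode
channel.basic_publish(exchange='',
                      routing_key='task_queue',
                      body=message,
                      properties=pika.BasicProperties(
                         delivery_mode = 2, # make message persistent
                      ))
channel.tx_commit()              # completes once the broker has accepted the message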

Fair dispatch

You might have noticed that the dispatching still doesn't work exactly as we want. For example in a situation with two workers, when all odd messages are heavy and even messages are light, one worker will be constantly busy and the other one will do hardly any work. Well, RabbitMQ doesn't know anything about that and will still dispatch messages evenly.

This happens because RabbitMQ just dispatches a message when the message enters the queue. It doesn't look at the number of unacknowledged messages for a consumer. It just blindly dispatches every n-th message to the n-th consumer.

In order to defeat that we can use the basic.qos method with the prefetch_count=1 setting. This tells RabbitMQ not to give more than one message to a worker at a time. Or, in other words, don't dispatch a new message to a worker until it has processed and acknowledged the previous one. Instead, it will dispatch it to the next worker that is not still busy.

channel.basic_qos(prefetch_count=1)

Note about queue size

If all the workers are busy, your queue can fill up. You will want to keep aneye on that, and maybe add more workers, or have some other strategy.
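
A low-tech way to watch the backlog (assuming a Linux box with the standard watch utility) is to poll rabbitmqctl:

$ watch -n 1 'sudo rabbitmqctl list_queues name messages_ready messages_unacknowledged'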

Putting it all together

Final code of our new_task.py script:

#!/usr/bin/env python
import pika
import sys

connection = pika.BlockingConnection(pika.ConnectionParameters(
        host='localhost'))
channel = connection.channel()

channel.queue_declare(queue='task_queue', durable=True)

message = ' '.join(sys.argv[1:]) or "Hello World!"
channel.basic_publish(exchange='',
                      routing_key='task_queue',
                      body=message,
                      properties=pika.BasicProperties(
                         delivery_mode = 2, # make message persistent
                      ))
print " [x] Sent %r" % (message,)
connection.close()

(new_task.py source)

And our worker:

#!/usr/bin/env python
import pika
import time

connection = pika.BlockingConnection(pika.ConnectionParameters(
        host='localhost'))
channel = connection.channel()

channel.queue_declare(queue='task_queue', durable=True)
print ' [*] Waiting for messages. To exit press CTRL+C'

def callback(ch, method, properties, body):
    print " [x] Received %r" % (body,)
    time.sleep( body.count('.') )
    print " [x] Done"
    ch.basic_ack(delivery_tag = method.delivery_tag)

channel.basic_qos(prefetch_count=1)
channel.basic_consume(callback,
                      queue='task_queue')

channel.start_consuming()

(worker.py source)

Using message acknowledgments and prefetch_count you can set up a work queue. The durability options let the tasks survive even if RabbitMQ is restarted.

Now we can move on to tutorial 3 and learn how to deliver the same message to many consumers.


Publish/Subscribe

(using the pika 0.9.5 Python client)


In the previous tutorial we created a work queue. The assumption behind a work queue is that each task is delivered to exactly one worker. In this part we'll do something completely different -- we'll deliver a message to multiple consumers. This pattern is known as "publish/subscribe".

To illustrate the pattern, we're going to build a simple logging system. It will consist of two programs -- the first will emit log messages and the second will receive and print them.

In our logging system every running copy of the receiver program will get the messages. That way we'll be able to run one receiver and direct the logs to disk; and at the same time we'll be able to run another receiver and see the logs on the screen.

Essentially, published log messages are going to be broadcast to all the receivers.

Exchanges

In previous parts of the tutorial we sent and received messages to and from a queue. Now it's time to introduce the full messaging model in Rabbit.

Let's quickly go over what we covered in the previous tutorials:

  • A producer is a user application that sends messages.
  • A queue is a buffer that stores messages.
  • A consumer is a user application that receives messages.

The core idea in the messaging model in RabbitMQ is that the producer never sends any messages directly to a queue. Actually, quite often the producer doesn't even know if a message will be delivered to any queue at all.

Instead, the producer can only send messages to an exchange. An exchange is a very simple thing. On one side it receives messages from producers and on the other side it pushes them to queues. The exchange must know exactly what to do with a message it receives. Should it be appended to a particular queue? Should it be appended to many queues? Or should it get discarded? The rules for that are defined by the exchange type.

There are a few exchange types available: direct, topic, headers and fanout. We'll focus on the last one -- the fanout. Let's create an exchange of that type, and call it logs:

channel.exchange_declare(exchange='logs',
                         type='fanout')

The fanout exchange is very simple. As you can probably guess from the name, it just broadcasts all the messages it receives to all the queues it knows. And that's exactly what we need for our logger.

Listing exchanges

To list the exchanges on the server you can run the ever useful rabbitmqctl:

$ sudo rabbitmqctl list_exchanges
Listing exchanges ...
logs      fanout
amq.direct      direct
amq.topic       topic
amq.fanout      fanout
amq.headers     headers
...done.

In this list there are some amq.* exchanges. These are created by default, but it is unlikely you'll need to use them at the moment.

Nameless exchange

In previous parts of the tutorial we knew nothing about exchanges, but still were able to send messages to queues. That was possible because we were using a default exchange, which we identify by the empty string ("").

Recall how we published a message before:

channel.basic_publish(exchange='',
                      routing_key='hello',
                      body=message)

The exchange parameter is the name of the exchange. The empty string denotes the default or nameless exchange: messages are routed to the queue with the name specified by routing_key, if it exists.

Now, we can publish to our named exchange instead:

channel.basic_publish(exchange='logs',
                      routing_key='',
                      body=message)

Temporary queues

As you may remember previously we were using queues which had a specified name (remember hello and task_queue?). Being able to name a queue was crucial for us -- we needed to point the workers to the same queue. Giving a queue a name is important when you want to share the queue between producers and consumers.

But that's not the case for our logger. We want to hear about all log messages, not just a subset of them. We're also only interested in currently flowing messages, not in the old ones. To solve that we need two things.

Firstly, whenever we connect to Rabbit we need a fresh, empty queue. To do it we could create a queue with a random name, or, even better - let the server choose a random queue name for us. We can do this by not supplying the queue parameter to queue_declare:

result = channel.queue_declare()

At this point result.method.queue contains a random queue name. For example it may look like amq.gen-U0srCoW8TsaXjNh73pnVAw==.

Secondly, once we disconnect the consumer the queue should be deleted. There's an exclusive flag for that:

result = channel.queue_declare(exclusive=True)

Bindings

We've already created a fanout exchange and a queue. Now we need to tell the exchange to send messages to our queue. That relationship between an exchange and a queue is called a binding.

channel.queue_bind(exchange='logs',
                   queue=result.method.queue)

From now on the logs exchange will append messages to our queue.

Listing bindings

You can list existing bindings using, you guessed it, rabbitmqctl list_bindings.

Putting it all together

The producer program, which emits log messages, doesn't look much different from the previous tutorial. The most important change is that we now want to publish messages to our logs exchange instead of the nameless one. We need to supply a routing_key when sending, but its value is ignored for fanout exchanges. Here goes the code for the emit_log.py script:

#!/usr/bin/env python
import pika
import sys

connection = pika.BlockingConnection(pika.ConnectionParameters(
        host='localhost'))
channel = connection.channel()

channel.exchange_declare(exchange='logs',
                         type='fanout')

message = ' '.join(sys.argv[1:]) or "info: Hello World!"
channel.basic_publish(exchange='logs', # publish a message to exchange
                      routing_key='',
                      body=message)
print " [x] Sent %r" % (message,)
connection.close()

(emit_log.py source)

As you see, after establishing the connection we declared the exchange. This step is necessary as publishing to a non-existing exchange is forbidden.

The messages will be lost if no queue is bound to the exchange yet, but that's okay for us; if no consumer is listening yet we can safely discard the message.

The code for receive_logs.py:

#!/usr/bin/env python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(
        host='localhost'))
channel = connection.channel()

channel.exchange_declare(exchange='logs',
                         type='fanout')

result = channel.queue_declare(exclusive=True)
queue_name = result.method.queue

channel.queue_bind(exchange='logs', # bind our server-named queue to the exchange
                   queue=queue_name)

print ' [*] Waiting for logs. To exit press CTRL+C'

def callback(ch, method, properties, body):
    print " [x] %r" % (body,)

channel.basic_consume(callback,
                      queue=queue_name,
                      no_ack=True)

channel.start_consuming()

(receive_logs.py source)

We're done. If you want to save logs to a file, just open a console and type:

$ python receive_logs.py > logs_from_rabbit.log

If you wish to see the logs on your screen, spawn a new terminal and run:

$ python receive_logs.py

And of course, to emit logs type:

$ python emit_log.py

Using rabbitmqctl list_bindings you can verify that the code actually creates bindings and queues as we want. With two receive_logs.py programs running you should see something like:

$ sudo rabbitmqctl list_bindings
Listing bindings ...
 ...
logs    amq.gen-TJWkez28YpImbWdRKMa8sg==                []
logs    amq.gen-x0kymA4yPzAT6BoC/YP+zw==                []
...done.

The interpretation of the result is straightforward: data from exchange logs goes to two queues with server-assigned names. And that's exactly what we intended.

To find out how to listen for a subset of messages, let's move on to tutorial 4.




