This guide builds on https://github.com/MarcialRosales/rabbitmq-tracing-guide and adds some notes from my own practice and understanding.
RabbitMQ Tracing Guide
rabbitmq_tracing plugin
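The rabbitmq_tracing plugin traces messages that pass through the broker and persists them to log files on disk, which cuts the time spent locating and debugging problems.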
Who can use it
Only users with the administrator role can enable the plugin, and only users with the administrator role can add a trace to track messages.
How to enable/configure it
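Enable the plugin with the following steps:
- First, enable the plugin:
rabbitmq-plugins enable rabbitmq_tracing
- Second, edit the rabbitmq.config file to set where the log files are stored and which user the plugin uses by default to create a trace (that is, to create a queue, bind it to amq.rabbitmq.trace, and consume from that queue so that the messages are written to the log file). (Translator's note: the configuration file is optional. Without it, the plugin stores log files in its default directory and creates traces as the guest user. Production environments usually delete the guest user, in which case creating a trace fails. To avoid that, fill in Tracer connection username and Tracer connection password when adding the trace so that it is created with the specified user. A single-node RabbitMQ therefore needs neither a configuration change nor a restart, which avoids any impact on production.)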
....
{rabbitmq_tracing,
[
{directory, "/var/vcap/sys/log/rabbitmq-server/tracing"},
{username, <<"admin">>},
{password, <<"password">>}
]
},
....
- Restart the RabbitMQ cluster so that the configuration takes effect.
- Users with the administrator role can now see the Tracing option on the Admin page of the Web UI.
How to start tracing
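- Open the Web UI, click the Admin tab, then click the Tracing option.
- Select the node on which to trace messages.
- Name the trace, choose the vhost to listen on and the format of messages in the log, and optionally limit how many bytes of each message payload are recorded. (Translator's note: in practice, the JSON format stores the payload in its serialized form and puts no obvious separator between messages, so it is hard to read and is probably better suited to downstream processing. For readability choose the Text format: the payload is written back as its original text and messages are clearly separated.)
Warning: if a log file with the same name already exists, delete it first; otherwise creating the trace fails.
- Fill in the Pattern. In practice, publish is matched against an exchange name and deliver against a queue name. That is, to trace messages entering the broker, use the exchange name; to trace messages leaving the broker, use the queue name.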
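- # traces all messages entering and leaving the broker
- publish.# traces all messages entering the broker
- publish.myExchange traces all messages entering myExchange
- deliver.# traces all messages leaving the broker
- deliver.myQueue traces all messages leaving myQueue
- #.myQueue behaved the same as deliver.myQueue in testing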
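To see why these patterns behave as listed, it helps to recall that the plugin builds on the firehose, which republishes traced messages to the amq.rabbitmq.trace topic exchange with routing key publish.<exchange name> for messages entering the broker and deliver.<queue name> for messages leaving it. The following is a minimal Python sketch of AMQP topic matching ("*" for exactly one word, "#" for zero or more), not the broker's own implementation; the function name topic_matches is made up for illustration.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if an AMQP topic pattern matches a routing key.

    Words are separated by dots; '*' stands for exactly one word and
    '#' for zero or more words.
    """
    p = pattern.split(".")
    k = routing_key.split(".")

    def match(i: int, j: int) -> bool:
        if i == len(p):
            # Pattern exhausted: succeed only if the key is exhausted too.
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero words (move past it) or one more word (stay on it).
            return match(i + 1, j) or (j < len(k) and match(i, j + 1))
        if j < len(k) and p[i] in ("*", k[j]):
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


# Routing keys the firehose would use for the examples in this guide.
for pattern in ("#", "publish.#", "deliver.myQueue", "#.myQueue"):
    for key in ("publish.myExchange", "deliver.myQueue"):
        print(f"{pattern:16} {key:20} {topic_matches(pattern, key)}")
```

Under this model, #.myQueue would also match publish.myQueue, so it is equivalent to deliver.myQueue only as long as no exchange shares the queue's name.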
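Once a trace has been added successfully, we can see:
- the new trace
- a new queue
- a new connection (with a corresponding consumer channel). If no configuration file was written and no username and password were given when creating the trace, guest is used by default.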
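A trace can also be defined without the Web UI. The sketch below builds (but does not send) the HTTP request for that, assuming the plugin's PUT /api/traces/<vhost>/<name> endpoint and the JSON field names format, pattern, tracer_connection_username and tracer_connection_password as I recall them from the plugin's README; check them against your RabbitMQ version. The host, port 15672, and the admin/password credentials are placeholders, and build_trace_request is a hypothetical helper name.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_trace_request(base_url, vhost, name, pattern,
                        admin_user, admin_password,
                        tracer_user, tracer_password, fmt="text"):
    """Build a PUT request that would define a trace; nothing is sent here."""
    url = "{}/api/traces/{}/{}".format(
        base_url.rstrip("/"),
        urllib.parse.quote(vhost, safe=""),   # "/" becomes "%2F"
        urllib.parse.quote(name, safe=""),
    )
    body = {
        "format": fmt,                        # "text" or "json"
        "pattern": pattern,
        # Assumed field names: the user the tracer connection logs in as,
        # useful when the guest user has been deleted.
        "tracer_connection_username": tracer_user,
        "tracer_connection_password": tracer_password,
    }
    token = base64.b64encode(f"{admin_user}:{admin_password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


req = build_trace_request(
    "http://localhost:15672", "/", "publish.amq.direct", "publish.amq.direct",
    "admin", "password", "admin", "password",
)
print(req.get_method(), req.full_url)
# Sending needs a running broker with the plugin enabled:
# urllib.request.urlopen(req)
```

Passing the tracer credentials explicitly mirrors filling in Tracer connection username and Tracer connection password in the UI.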
TL;DR It is really important to understand that rabbitmq_tracing will ONLY trace messages that were published via the node(s) we are tracing on and likewise it will ONLY trace messages that were delivered via the node(s) we are tracing on. See section Tracing in action for further details.
TL;DR Purged messages are not delivered hence they are not traced at all.
TL;DR Binding our own queues directly to the amq.rabbitmq.trace exchange will not work when we are using the plugin. It only works when we have Firehose tracing on.
TL;DR If we have Firehose enabled and also a Trace, each one will work as expected. The Trace will write the message to the corresponding log and the Firehose will send the message to any queue bound to the amq.rabbitmq.trace exchange. The Trace will only receive the event once though.
Is it possible to capture publishing via the default exchange?
How to stop tracing
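Click the Stop button next to the trace to close the connection and delete the queue that was created along with the trace. Stop does not delete the trace's log file; to remove it, click the Delete button next to the log file.
Do not delete the log file before stopping the trace. Otherwise the trace keeps tracing messages but no longer writes them to disk, so they cannot be viewed and the trace is pointless.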
The Tracing plugin can be disabled completely with the following command:
rabbitmq-plugins disable rabbitmq_tracing
How to view traced messages
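There are three ways:
- In the Web UI, click the name of the trace log to download the log file.
- Download the log file through the HTTP API: GET /api/trace-files/<name>
- ssh to the node and read the log file directly in the log directory.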
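The API route can be scripted. Here is a minimal Python sketch that builds the GET /api/trace-files/<name> request with HTTP basic authentication and, optionally, saves the response to disk. The base URL, the credentials, and the file name my-trace.log are placeholders, and both helper names are hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def trace_file_request(base_url, file_name, user, password):
    """Build the GET request for one trace log file; nothing is sent here."""
    url = f"{base_url.rstrip('/')}/api/trace-files/{urllib.parse.quote(file_name, safe='')}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def download_trace_file(base_url, file_name, user, password, dest_path):
    """Send the request and write the response body to dest_path."""
    request = trace_file_request(base_url, file_name, user, password)
    with urllib.request.urlopen(request) as response, open(dest_path, "wb") as out:
        out.write(response.read())


# Placeholder values; the file name must be one that the broker actually lists.
request = trace_file_request("http://localhost:15672", "my-trace.log", "admin", "password")
print(request.get_method(), request.full_url)
# With a running broker:
# download_trace_file("http://localhost:15672", "my-trace.log", "admin", "password", "my-trace.log")
```

The user needs the administrator role, as with every other tracing operation.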
Tracing in action
Let’s use the rabbitmq_tracing plugin to trace publishing and delivery of messages in a 2-node (rmq/0 and rmq/1) RabbitMQ cluster.
Tracing message publishing
The scenarios below demonstrate that, in order to trace every published message, we need to have a trace on all the nodes the AMQP clients are publishing to.
Scenario 1:
- Define a trace on rmq/0 node with pattern publish.amq.direct and name publish.amq.direct
- Send message via rmq/0 node to q-rmq0 through amq.direct exchange
[AMQP Client]--publish(amq.direct#q-rmq0)--->[ ** RMQ Node rmq/0 ** ]----->{ q-rmq0 }
[ RMQ Node rmq/1 ]------{ q-rmq1 }
Run this command from rmq/0 node:
rabbitmqadmin publish routing_key=q-rmq0 exchange=amq.direct payload="publishing - scenario 1"
- Outcome: message is logged !!!
================================================================================
2018-09-26 15:51:02:262: Message published
Node: rabbit@rmq0-rmq-mrosales-20180925
Connection: <rabbit@rmq0-rmq-mrosales-20180925.1.8586.1>
Virtual host: /
User: admin
Channel: 1
Exchange: amq.direct
Routing keys: [<<"q-rmq0">>]
Routed queues: [<<"q-rmq0">>]
Properties: []
Payload:
publishing - scenario 1
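Text-format entries like the one above are easy to post-process. The following Python sketch splits a log into entries and turns each one into a dictionary. It assumes only the layout visible in this guide (a line of "=" characters, a timestamped event line, "Key: value" lines, then "Payload:" followed by the payload); it would mis-split a payload that itself contains a line of "=" characters. The function name parse_trace_log is made up for illustration.

```python
import re

SEPARATOR = re.compile(r"^={10,}\s*$")
EVENT = re.compile(r"^(\d{4}-\d\d-\d\d \d+:\d\d:\d\d:\d+): (Message \w+)$")


def parse_trace_log(text):
    """Parse a Text-format trace log into a list of dicts, one per message."""
    entries = []
    entry = None
    in_payload = False
    for line in text.splitlines():
        if SEPARATOR.match(line):
            # A separator line starts a new entry.
            entry = {"Payload": []}
            entries.append(entry)
            in_payload = False
            continue
        if entry is None:
            continue  # ignore anything before the first separator
        if in_payload:
            entry["Payload"].append(line)
            continue
        stripped = line.strip()
        event = EVENT.match(stripped)
        if event:
            entry["Timestamp"], entry["Event"] = event.groups()
        elif stripped == "Payload:":
            in_payload = True
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            entry[key.strip()] = value.strip()
    for item in entries:
        item["Payload"] = "\n".join(item["Payload"]).strip()
    return entries


sample = """\
================================================================================
2018-09-26 15:51:02:262: Message published
Node: rabbit@rmq0-rmq-mrosales-20180925
Connection: <rabbit@rmq0-rmq-mrosales-20180925.1.8586.1>
Virtual host: /
User: admin
Channel: 1
Exchange: amq.direct
Routing keys: [<<"q-rmq0">>]
Routed queues: [<<"q-rmq0">>]
Properties: []
Payload:
publishing - scenario 1
"""

entries = parse_trace_log(sample)
print(entries[0]["Event"], "|", entries[0]["Exchange"], "|", entries[0]["Payload"])
```

Filtering on the Event and Node keys then makes it straightforward to check which node logged a given publish or delivery.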
Scenario 2:
- Same as in scenario 1; define a trace on rmq/0 node with pattern publish.amq.direct
- Send message via rmq/1 node to q-rmq0 through amq.direct exchange
[ ** RMQ Node rmq/0 ** ]---->{ q-rmq0 }
[AMQP Client]--publish(amq.direct)--->[ RMQ Node rmq/1 ]-----{ q-rmq1 }
Run this command from rmq/1 node:
rabbitmqadmin publish routing_key=q-rmq0 exchange=amq.direct payload="publish - scenario 2"
- Outcome: Nothing gets logged !!!
Tracing message delivery
Scenario 1:
- Define a trace on rmq/0 node with pattern deliver.q-rmq0
- Consume message via rmq/0 node from q-rmq0
[ ** RMQ Node rmq/0 ** ]---{ q-rmq0 }------------->[AMQP Client]
[ ** RMQ Node rmq/1 ** ]
Run these commands from rmq/0:
rabbitmqadmin publish routing_key=q-rmq0 exchange=amq.direct payload="delivery - scenario 1"
rabbitmqadmin get queue=q-rmq0 count=1 ackmode=ack_requeue_false
- Outcome: Message gets logged on rmq/0 !!!
================================================================================
2018-09-26 14:22:36:558: Message received
Node: rabbit@rmq0-rmq-mrosales-20180925
Connection: <rabbit@rmq0-rmq-mrosales-20180925.1.18253.0>
Virtual host: /
User: admin
Channel: 1
Exchange: amq.direct
Routing keys: [<<"q-rmq0">>]
Queue: q-rmq0
Properties: []
Payload:
delivery - scenario 1
Scenario 2:
- Same as scenario 1; define a trace on rmq/0 node with pattern deliver.q-rmq0
- Consume message via rmq/1 node from q-rmq0
[ ** RMQ Node rmq/0 ** ]
( q-rmq0 )
|
\/
[ RMQ Node rmq/1 ]-------------->[AMQP Client]
Run these commands from rmq/1:
rabbitmqadmin publish routing_key=q-rmq0 exchange=amq.direct payload="delivery - scenario 2"
rabbitmqadmin get queue=q-rmq0 count=1
- Outcome: Message is not logged because it is delivered through a node that has no trace defined
Scenario 3:
- Define a trace on both rmq/0 and rmq/1 with pattern deliver.q-rmq0
- Consume message from q-rmq0 via rmq/1 node
[ ** RMQ Node rmq/0 ** ]
( q-rmq0 )
|
\/
[ RMQ Node rmq/1 ]-------------->[AMQP Client]
Run these commands from rmq/1:
rabbitmqadmin publish routing_key=q-rmq0 exchange=amq.direct payload="delivery - scenario 3"
rabbitmqadmin get queue=q-rmq0 count=1
- Outcome: Message is logged in both nodes !!!
Conclusion: If we don’t know which RabbitMQ node messages are being published to or delivered from, it is best to add a Trace to each node of the cluster, unless we have a very specific case such as one of these:
- the publisher application is currently connected to the rmq/0 node and we want to capture what messages it is publishing, or
- the consumer application is currently connected to the rmq/1 node and we want to capture what messages it is receiving
Implications of tracing messages with rabbitmq_tracing plugin
Having the rabbitmq_tracing plugin enabled has no negative performance impact if we have not defined any traces yet.
There is a significant throughput degradation when messages (publish and deliver) are being traced. We ran a performance test that showed a 66% throughput reduction. See details below:
- RabbitMQ 3.7.6 running with Erlang 20.3.8.1 on a MBP (2.5 GHz Intel Core i7)
- rabbitmq_tracing was enabled
- rabbitmq-perf-test-1.1.0 was used to simulate load, running on the same machine as RabbitMQ:
bin/runjava com.rabbitmq.perf.PerfTest -u test
- Baseline produced 24k msg/sec
- Added a trace (the only one in the cluster) that traced both publish and deliver messages
- Message throughput dropped to 8k msg/sec
We observe a slightly worse performance compared to using Firehose.
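For reference, the quoted reduction follows directly from the two throughput figures above; this short Python snippet only redoes that arithmetic and measures nothing itself.

```python
baseline_msgs_per_sec = 24_000  # throughput with no trace defined
traced_msgs_per_sec = 8_000     # throughput with one publish + deliver trace

reduction = (baseline_msgs_per_sec - traced_msgs_per_sec) / baseline_msgs_per_sec
slowdown = baseline_msgs_per_sec / traced_msgs_per_sec

print(f"throughput reduction: {reduction:.1%}")  # throughput reduction: 66.7%
print(f"slowdown factor: {slowdown:.1f}x")       # slowdown factor: 3.0x
```

In other words, on that test machine the broker handled roughly a third of its untraced load while the trace was active.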