Scribe是facebook开源的分布式日志系统。项目主页是https://github.com/facebook/scribe
Kestrel是Twitter开源的分布式消息队列系统,采用Scala实现,其原型是采用Ruby实现的starling。项目主页是https://github.com/robey/kestrel
Scribe的server和client之间通信采用Thrift实现,server内部有一个消息队列,接收来自client的log-ertry先进入消息队列,最终写到文件或hdfs。
example给了两种使用模式:
1)single Scribe server:服务端接收到log-entry,直接写到文件
2)multi Scribe sever:server之间有主从之分,master负责执行写操作,slaves负责将接收的log-entry转发给master
二者结合过程中,发现Scribe的最终输出是文件或hdfs,无法与Kestrel直接通信,必然要在他们之间搭个桥。
这里就利用到了上述的第二种模式,Scribe的network type。
slaves和master的通信过程同样是RPC调用,所以,解决方法依旧要靠RPC,
以Python为例,SDK的if目录下有.thrift接口描述文件,编译安装后会有现成的Scribe库,利用该库实现一个Thrift RPC server,在scribeHandler的Log方法将接收到的消息,投递到Kestrel,即完成中转。
消息类型是LogEntry,有两个重要属性category和message,category可以看出是消息队列名称,相同category的message会存放在同一个队列上。
Kestrel有三种协议:memcached、thrift、text。分别在独立的端口提供服务,默认如下:
memcached => 22133,thrift => 2229 ,text => 2222
text相对简单,模拟消息生产和消费:telnet到Kestrel server上,使用"put <queue_name>:\nmessage\n\n"进行生产和"get <queue_name>\n"进行消费,"\n"看作回车键即可。
使用过程中发现text总出奇怪的问题,比如读/写消息失败,阅读Wiki:
Text protocol
Kestrel supports a limited, text-only protocol. You are encouraged to use the memcache protocol instead.
The text protocol does not support reliable reads.
不建议使用纯文本协议,后用Thrift协议重新实现client,比较稳定,目前尚未遇到问题。
初识Scribe和Kestrel,如有疏漏,还请指出!
下面是示例代码
#!/usr/bin/env python2.6 # file: transition.py # author: caosiyang # date: 2013/04/08 import sys import socket from scribe import scribe from thrift.transport import TSocket from thrift.transport import TTransport from thrift.protocol import TBinaryProtocol from thrift.server import TServer from thrift.server import TNonblockingServer import kestrel_handler from utils import * class scribeHandler: """Scribe handler.""" def __init__(self): pass def Log(self, messages): """Log handler.""" print 'recv:' , messages for item in messages: print 'category: %s, message: %s' % (item.category, item.message) return 0 class scribeKestrelHandler(scribeHandler): """Scribe handler with kestrel. Put the log entries that scribe received into kestrel. """ def __init__(self, _kh): self._kh = _kh def Log(self, messages): """Log handler.""" print 'recv:' , messages for item in messages: print 'category: %s, message: %s' % (item.category, item.message) self._kh.put(item.category, item.message[:-1]) return 0 def main(): #kh = kestrel_handler.KestrelTextHandler('127.0.0.1', 2222) kh = kestrel_handler.KestrelThriftHandler('127.0.0.1', 2229) handler = scribeKestrelHandler(kh) #handler = scribeHandler() processor = scribe.Processor(handler) server_transport = TSocket.TServerSocket(port=1463) transport_factory = TTransport.TFramedTransportFactory() protocol_factory = TBinaryProtocol.TBinaryProtocolFactory() #server = TServer.TThreadedServer(processor, server_transport, transport_factory, protocol_factory) server = TNonblockingServer.TNonblockingServer(processor, server_transport, protocol_factory, protocol_factory) print 'Starting Transition-Server ...' server.serve() kh.close() if __name__ == '__main__': main()