Zaqar Data Flow (Newton release)

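I. Introduction

Architecture diagram

As usual, let's start with the architecture diagram:

Zaqar has only one service, zaqar-server, and most of its logic is concentrated in the transport component. The transport type (websocket or wsgi) determines how the other components are initialized. The correspondence is as follows:

When the transport is websocket, the api component is initialized, and inside it the storage and control APIs are initialized to serve as the websocket API. When the transport is wsgi, the underlying storage and control APIs are instead wrapped in an additional layer.

The analysis below follows two main threads, wsgi and websocket: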

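II. wsgi component analysis

The wsgi API covers most of Zaqar's functionality, with its endpoints mapping one-to-one to features, so we analyze the wsgi component first.

I first summarized the initialization flow for the wsgi transport as a flowchart; having the overall picture helps with the analysis that follows:

The main steps are: run -> init transport -> init conf -> init storage -> init cache -> init control

The sections below focus on transport, storage, and control:

1. init transport

First, look at the run method: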

def run(self):
    self.transport.listen()

It calls the listen method of self.transport, so the thing to look at is self.transport:

# This decorator initializes the property only on first access
@decorators.lazy_property(write=False)
def transport(self):
    transport_name = self.driver_conf.transport
    LOG.debug(u'Loading transport driver: %s', transport_name)
    if transport_name == consts.TRANSPORT_WEBSOCKET:
        args = [self.conf, self.api, self.cache]
    else:
        args = [self.conf, self.storage, self.cache, self.control]
    try:
        mgr = driver.DriverManager('zaqar.transport', transport_name,
                                   invoke_on_load=True, invoke_args=args)
        return mgr.driver
    except RuntimeError as exc:
        LOG.exception(exc)
        LOG.error(_LE(u'Failed to load transport driver zaqar.transport.'
                      u'%(driver)s with args %(args)s'),
                  {'driver': transport_name, 'args': args})
        raise errors.InvalidDriver(exc)

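Depending on transport_name, args is given different values. driver.DriverManager then loads the matching driver from the zaqar.transport namespace and passes the contents of args to its constructor.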
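To make the "initialized only on first access" behavior concrete, here is a minimal sketch of how a decorator of this kind could work. It is not Zaqar's implementation: the body of lazy_property, the Bootstrap class, and the _lazy_ attribute prefix are assumptions made for illustration.

```python
import functools


def lazy_property(write=False):
    """Cache the wrapped method's result on first access (illustrative)."""
    def wrapper(fn):
        attr_name = '_lazy_' + fn.__name__  # hypothetical cache slot

        @functools.wraps(fn)
        def getter(self):
            if not hasattr(self, attr_name):
                setattr(self, attr_name, fn(self))
            return getattr(self, attr_name)

        def setter(self, value):
            setattr(self, attr_name, value)

        # With write=False the property has no setter and is read-only.
        return property(getter, setter if write else None)
    return wrapper


class Bootstrap(object):
    def __init__(self):
        self.loads = 0

    @lazy_property(write=False)
    def transport(self):
        self.loads += 1  # runs only on the first access
        return object()


boot = Bootstrap()
first = boot.transport
second = boot.transport
```

Both accesses return the same object and the loader body runs once, which is why components such as storage and control are only built when something actually asks for them.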

zaqar.transport is defined in setup.cfg as follows:

zaqar.transport =
    wsgi = zaqar.transport.wsgi.driver:Driver
    websocket = zaqar.transport.websocket.driver:Driver
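The entries above map a short name to a "module:attribute" target. As a rough sketch of what name-based loading amounts to, the snippet below resolves such a string with importlib and instantiates the result. It is a simplified stand-in, not stevedore's DriverManager, which discovers entry points from installed package metadata. The helper names and the registry (which uses standard-library classes instead of Zaqar drivers) are assumptions.

```python
import importlib


def resolve_entry_point(spec):
    """Resolve a 'package.module:attribute' string to the object it names."""
    module_name, _, attr = spec.partition(':')
    module = importlib.import_module(module_name)
    return getattr(module, attr)


def load_driver(registry, name, invoke_args):
    """Look up `name` in a {name: spec} mapping and instantiate the target."""
    if name not in registry:
        raise RuntimeError('No driver named %r' % name)
    cls = resolve_entry_point(registry[name])
    return cls(*invoke_args)


# Stand-in registry: stdlib classes play the role of transport drivers.
registry = {
    'ordered': 'collections:OrderedDict',
    'counter': 'collections:Counter',
}

driver = load_driver(registry, 'counter', ['aab'])
```

The invoke_args list plays the same role as args in the transport property: it is whatever the selected class expects in its constructor.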

Now let's look at how wsgi is actually initialized:

class DriverBase(object):
    def __init__(self, conf, storage, cache, control):
        self._conf = conf
        self._storage = storage
        self._cache = cache
        self._control = control
        self._conf.register_opts(_GENERAL_TRANSPORT_OPTIONS)
        self._defaults = ResourceDefaults(self._conf)

class Driver(transport.DriverBase):
    def __init__(self, conf, storage, cache, control):
        super(Driver, self).__init__(conf, storage, cache, control)
        self._conf.register_opts(_WSGI_OPTIONS, group=_WSGI_GROUP)
        self._wsgi_conf = self._conf[_WSGI_GROUP]
        self._validate = validation.Validator(self._conf)
        self.app = None
        self._init_routes()
        self._init_middleware()

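The parent class is initialized first. The driver then registers its configuration options, creates the validator (which enforces the limits and checks for the various Zaqar features; in wsgi it is only used to validate queue limits, and the rest is used by websocket), sets app to None (the real initialization happens in _init_routes), initializes the routes (the paths and endpoints of each API), and initializes the middleware (mainly hooks, which make unit testing easier).

Validation is analyzed later together with websocket. wsgi only calls Validator.queue_identification, which checks the lengths of queue_name and project_id, so it is not covered in detail here. The method to focus on is _init_routes: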

def _init_routes(self):
    catalog = [
        ('/v1', v1_0.public_endpoints(self, self._conf)),
        ('/v1.1', v1_1.public_endpoints(self, self._conf)),
        ('/v2', v2_0.public_endpoints(self, self._conf)),
        ('/', [('', version.Resource())])
    ]
    if self._conf.admin_mode:
        catalog.extend([
            ('/v1', v1_0.private_endpoints(self, self._conf)),
            ('/v1.1', v1_1.private_endpoints(self, self._conf)),
            ('/v2', v2_0.private_endpoints(self, self._conf)),
        ])
    if (d_version.LooseVersion(falcon.__version__) >=
            d_version.LooseVersion("1.0.0")):
        middleware = [FuncMiddleware(hook) for hook in self.before_hooks]
        self.app = falcon.API(middleware=middleware)
    else:
        self.app = falcon.API(before=self.before_hooks)
    self.app.add_error_handler(Exception, self._error_handler)
    for version_path, endpoints in catalog:
        if endpoints:
            for route, resource in endpoints:
                self.app.add_route(version_path + route, resource)

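catalog defines the paths and endpoints; if admin_mode is true, the private endpoints are added as well. self.app is initialized from self.before_hooks, and finally each route in catalog is extracted and registered with add_route.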
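The registration loop can be sketched without falcon by using a stand-in app that only records routes. FakeApp, build_routes, and the resource strings below are placeholders chosen for illustration; in particular, the '/pools' private endpoint is an assumed example and not a listing of what private_endpoints returns.

```python
class FakeApp(object):
    """Hypothetical stand-in for falcon.API that only records routes."""

    def __init__(self):
        self.routes = {}

    def add_route(self, path, resource):
        self.routes[path] = resource


def build_routes(app, admin_mode):
    # Each catalog entry pairs a version prefix with (route, resource) tuples.
    catalog = [
        ('/v2', [('/queues', 'queues.CollectionResource'),
                 ('/queues/{queue_name}', 'queues.ItemResource')]),
        ('/', [('', 'version.Resource')]),
    ]
    if admin_mode:
        # Private endpoints are only exposed in admin mode (assumed example).
        catalog.append(('/v2', [('/pools', 'pools.Listing')]))
    for version_path, endpoints in catalog:
        for route, resource in endpoints or []:
            app.add_route(version_path + route, resource)
    return app


public = build_routes(FakeApp(), admin_mode=False)
admin = build_routes(FakeApp(), admin_mode=True)
```

The full path is simply the version prefix concatenated with the route, so '/v2' plus '/queues/{queue_name}' becomes '/v2/queues/{queue_name}'.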

Taking ('/v2', v2_0.public_endpoints(self, self._conf)) from catalog as an example, let's see how the call reaches the underlying implementation:

# /transport/wsgi/v2_0/__init__.py
@decorators.api_version_manager(VERSION)
def public_endpoints(driver, conf):
    queue_controller = driver._storage.queue_controller
    message_controller = driver._storage.message_controller
    claim_controller = driver._storage.claim_controller
    subscription_controller = driver._storage.subscription_controller
    defaults = driver._defaults
    return [
        # Home
        ('/',
         homedoc.Resource(conf)),
        # Queues Endpoints
        ('/queues',
         queues.CollectionResource(driver._validate,
                                   queue_controller)),
        ('/queues/{queue_name}',
         queues.ItemResource(driver._validate,
                             queue_controller,
                             message_controller)),
        # ... (remaining endpoints omitted)
    ]

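The driver argument is an instance of the Driver class above, whose parent DriverBase sets self._storage = storage during initialization. The initialization of storage is analyzed later; it is the channel to the concrete underlying implementation (the driver used to operate the database or other storage backend).

The returned list is not shown in full here. Taking only '/queues/{queue_name}' as an example, it maps to the variables in _init_routes() as follows:

version_path = '/v2', route = '/queues/{queue_name}', resource = queues.ItemResource(driver._validate, queue_controller, message_controller). The route is registered through falcon's falcon.API.add_route method.

Back in Driver.__init__, the next thing to look at is self._init_middleware: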

def _init_middleware(self):
    auth_app = self.app
    if self._conf.auth_strategy:
        strategy = auth.strategy(self._conf.auth_strategy)
        auth_app = strategy.install(self.app, self._conf)
    self.app = auth.SignedHeadersAuth(self.app, auth_app)
    acl.setup_policy(self._conf)

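The auth_strategy option decides whether authentication is used; if it is empty, no authentication is performed. The only authentication strategy currently available is keystone, enabled by setting auth_strategy to keystone.

Next, the policy is initialized through the oslo_policy library.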
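The conditional wrapping in _init_middleware follows the usual WSGI middleware pattern. The sketch below shows that pattern only: require_token is a hypothetical check that looks for the presence of a token header, whereas the real keystone strategy validates tokens, and SignedHeadersAuth is not modeled here at all.

```python
def base_app(environ, start_response):
    """Innermost WSGI application (stand-in for the falcon app)."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'ok']


def require_token(app):
    """Hypothetical auth middleware: reject requests without a token header."""
    def middleware(environ, start_response):
        if 'HTTP_X_AUTH_TOKEN' not in environ:
            start_response('401 Unauthorized',
                           [('Content-Type', 'text/plain')])
            return [b'unauthorized']
        return app(environ, start_response)
    return middleware


def init_middleware(app, auth_strategy):
    """Wrap `app` only when an auth strategy is configured."""
    if auth_strategy:
        return require_token(app)
    return app


def call(app, environ):
    """Invoke a WSGI app and return (status, body) for inspection."""
    captured = {}

    def start_response(status, headers):
        captured['status'] = status

    body = b''.join(app(environ, start_response))
    return captured['status'], body


open_app = init_middleware(base_app, '')
secured_app = init_middleware(base_app, 'keystone')
```

With an empty strategy the original app is returned untouched, which matches the "no authentication" case described above.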

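This concludes the initialization of the wsgi transport component. Next we analyze the other components.

2. init storage

storage is passed to driver.DriverManager as one of the invoke arguments, and that access is what triggers its initialization. Here is the storage method itself: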

@decorators.lazy_property(write=False)
def storage(self):
    LOG.debug(u'Loading storage driver')
    if self.conf.pooling:
        LOG.debug(u'Storage pooling enabled')
        storage_driver = pooling.DataDriver(self.conf, self.cache, self.control)
    else:
        storage_driver = storage_utils.load_storage_driver(
            self.conf, self.cache, control_driver=self.control)
    LOG.debug(u'Loading storage pipeline')
    return pipeline.DataDriver(self.conf, storage_driver, self.control)

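This involves the pooling option. When it is true, pooling across multiple storage backends is enabled, and the storage driver configuration is used to determine where the catalogue/control data is kept.

When pooling is true, pooling.DataDriver is instantiated; otherwise storage_utils.load_storage_driver is called. The former assigns the pooling.DataDriver instance directly to storage_driver, while the latter uses driver.DriverManager inside load_storage_driver to load the storage driver dynamically. The driver name comes from conf['drivers'], which is set in the configuration file. Here is the code: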

def load_storage_driver(conf, cache, storage_type=None,
                        control_mode=False, control_driver=None):
    if control_mode:
        mode = 'control'
        storage_type = storage_type or conf['drivers'].management_store
    else:
        mode = 'data'
        storage_type = storage_type or conf['drivers'].message_store
    driver_type = 'zaqar.{0}.storage'.format(mode)
    _invoke_args = [conf, cache]
    if control_driver is not None:
        _invoke_args.append(control_driver)
    try:
        mgr = driver.DriverManager(driver_type,
                                   storage_type,
                                   invoke_on_load=True,
                                   invoke_args=_invoke_args)
        return mgr.driver
    except Exception as exc:
        LOG.error(_LE('Failed to load "{}" driver for "{}"').format(
            driver_type, storage_type))
        LOG.exception(exc)
        raise errors.InvalidDriver(exc)

What is being loaded here is storage, so mode is data and driver_type = 'zaqar.data.storage'. The corresponding setup.cfg entries are:

zaqar.data.storage =
    mongodb = zaqar.storage.mongodb.driver:DataDriver
    mongodb.fifo = zaqar.storage.mongodb.driver:FIFODataDriver
    redis = zaqar.storage.redis.driver:DataDriver
    faulty = zaqar.tests.faulty_storage:DataDriver
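The selection logic at the top of load_storage_driver can be isolated into a small pure function, which makes it easy to see which namespace and driver name end up being requested. select_driver and the drivers_conf dictionary are illustrative assumptions; the real function reads these values from the oslo.config [drivers] group.

```python
def select_driver(drivers_conf, storage_type=None, control_mode=False):
    """Return (namespace, driver_name) the way load_storage_driver picks them."""
    if control_mode:
        mode = 'control'
        storage_type = storage_type or drivers_conf['management_store']
    else:
        mode = 'data'
        storage_type = storage_type or drivers_conf['message_store']
    return 'zaqar.{0}.storage'.format(mode), storage_type


# Hypothetical [drivers] settings, for illustration only.
drivers_conf = {'message_store': 'mongodb', 'management_store': 'sqlalchemy'}

data_choice = select_driver(drivers_conf)
control_choice = select_driver(drivers_conf, control_mode=True)
override = select_driver(drivers_conf, storage_type='redis')
```

An explicit storage_type always wins over the configured store, and control_mode only changes which namespace and which configuration option are consulted.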

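All the drivers under zaqar.data.storage, as well as pooling.DataDriver mentioned above, derive from base.DataDriverBase and implement a set of common methods such as is_alive(), _health(), capabilities(), and close(), which drive the storage backend.

Tying this back to the transport initialization: _storage.queue_controller, _storage.message_controller, _storage.claim_controller, and _storage.subscription_controller all have corresponding implementations in pipeline.DataDriver, which acts as the channel to the storage backends.

Each of these implementations performs a series of steps, analyzed in detail below.

Here is the code (taking _storage.queue_controller as the example):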

# storage/pipeline.py
class DataDriver(base.DataDriverBase):
    def __init__(self, conf, storage, control_driver):
        super(DataDriver, self).__init__(conf, None, control_driver)
        self._storage = storage

    @decorators.lazy_property(write=False)
    def queue_controller(self):
        stages = _get_builtin_entry_points('queue', self._storage,
                                           self.control_driver, self.conf)
        stages.extend(_get_storage_pipeline('queue', self.conf))
        stages.append(self._storage.queue_controller)
        return common.Pipeline(stages)

def _get_builtin_entry_points(resource_name, storage, control_driver, conf):
    builtin_entry_points = []
    namespace = '%s.%s.stages' % (storage.__module__, resource_name)
    extensions = extension.ExtensionManager(
        namespace, invoke_on_load=True,
        invoke_args=[storage, control_driver])
    if len(extensions.extensions) == 0:
        return []
    for ext in extensions.extensions:
        builtin_entry_points.append(ext.obj)
    if conf.profiler.enabled and conf.profiler.trace_message_store:
        return (profiler.trace_cls("stages_controller")
                (builtin_entry_points))
    return builtin_entry_points

def _get_storage_pipeline(resource_name, conf, *args, **kwargs):
    conf.register_opts(_PIPELINE_CONFIGS,
                       group=_PIPELINE_GROUP)
    storage_conf = conf[_PIPELINE_GROUP]
    pipeline = []
    for ns in storage_conf[resource_name + '_pipeline']:
        try:
            mgr = driver.DriverManager('zaqar.storage.stages',
                                       ns,
                                       invoke_args=args,
                                       invoke_kwds=kwargs,
                                       invoke_on_load=True)
            pipeline.append(mgr.driver)
        except RuntimeError as exc:
            LOG.warning(_(u'Stage %(stage)s could not be imported: %(ex)s'),
                        {'stage': ns, 'ex': str(exc)})
            continue
    return pipeline

_get_builtin_entry_points dynamically loads the entry points under zaqar.storage.mongodb.driver.queue.stages (MongoDB is the storage backend here). The corresponding setup.cfg entry is:

zaqar.storage.mongodb.driver.queue.stages =
    message_queue_handler = zaqar.storage.mongodb.messages:MessageQueueHandler

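MessageQueueHandler corresponds to the backend storage driver and is placed in stages as one stage. Next, the built-in list.extend method adds the result of _get_storage_pipeline('queue', self.conf) to stages. Now look back at the _get_storage_pipeline method shown above:

_PIPELINE_GROUP has the value 'storage', and the options in _PIPELINE_CONFIGS are named resource + '_pipeline', corresponding to options in the [storage] section of the conf file. The following options exist: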

  • queue_pipeline
  • message_pipeline
  • claim_pipeline
  • subscription_pipeline

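The resource passed in here is queue, so _get_storage_pipeline dynamically imports the classes named by the queue_pipeline option. Each of them is also placed in stages as a stage. Additional stages can be added to these storage pipelines as well.

Finally a common.Pipeline object is returned. It takes the stages (that is, the controller classes) in order and calls the requested method on each one, skipping any stage that does not implement it.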
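The stage-by-stage dispatch can be sketched as follows. This is a simplified illustration rather than the code in zaqar.common.pipeline: the rule that the first non-None result ends the call is an assumption, and AuditStage and QueueController are hypothetical stages.

```python
class Pipeline(object):
    """Simplified sketch: forward a method call through a list of stages."""

    def __init__(self, stages):
        self._stages = list(stages)

    def __getattr__(self, method):
        def consumer(*args, **kwargs):
            found = False
            for stage in self._stages:
                target = getattr(stage, method, None)
                if target is None:
                    continue  # this stage does not implement the method
                found = True
                result = target(*args, **kwargs)
                if result is not None:
                    return result  # assumed rule: first non-None result wins
            if not found:
                raise AttributeError(method)
            return None
        return consumer


class AuditStage(object):
    """Hypothetical stage that observes calls and lets them continue."""

    def __init__(self):
        self.seen = []

    def create(self, name):
        self.seen.append(name)  # returns None, so the next stage still runs


class QueueController(object):
    """Hypothetical final stage standing in for the storage controller."""

    def create(self, name):
        return 'created ' + name

    def delete(self, name):
        return 'deleted ' + name


audit = AuditStage()
pipeline = Pipeline([audit, QueueController()])
```

Because the storage controller is appended last, earlier stages get a chance to observe or short-circuit a call before it reaches the backend.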

3. init control

The storage control component manages the data. Like the storage component, it is loaded as an extension argument during transport initialization:

# bootstrap.py
    @decorators.lazy_property(write=False)
    def control(self):
        LOG.debug(u'Loading storage control driver')
        return storage_utils.load_storage_driver(self.conf, self.cache,
                                                 control_mode=True)

# storage/utils.py
def load_storage_driver(conf, cache, storage_type=None,
                        control_mode=False, control_driver=None):
    if control_mode:
        mode = 'control'
        storage_type = storage_type or conf['drivers'].management_store
    else:
        mode = 'data'
        storage_type = storage_type or conf['drivers'].message_store
    driver_type = 'zaqar.{0}.storage'.format(mode)
    _invoke_args = [conf, cache]
    if control_driver is not None:
        _invoke_args.append(control_driver)
    try:
        mgr = driver.DriverManager(driver_type, storage_type,
                                   invoke_on_load=True,
                                   invoke_args=_invoke_args)
        return mgr.driver
    except Exception as exc:
        LOG.error(_LE('Failed to load "{}" driver for "{}"').format(
            driver_type, storage_type))
        LOG.exception(exc)
        raise errors.InvalidDriver(exc)

As with storage, it is loaded through driver.DriverManager, under the namespace zaqar.control.storage:

zaqar.control.storage =
    sqlalchemy = zaqar.storage.sqlalchemy.driver:ControlDriver
    mongodb = zaqar.storage.mongodb.driver:ControlDriver
    redis = zaqar.storage.redis.driver:ControlDriver
    faulty = zaqar.tests.faulty_storage:ControlDriver

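The remaining calls are similar and are not repeated here.

Next, let's see how storage and control differ:

The Zaqar overview briefly describes the difference: control manages the data, while storage stores the data itself. Using mongodb as the backend, the controllers exposed by each driver are: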

ControlDriver            DataDriver
catalogue_controller     claim_controller
flavors_controller       message_controller
pools_controller         subscription_controller
queue_controller

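Clearly the two manage different objects. From the transport layer's point of view they are peers in the layer below.

III. websocket component analysis

Mirroring the wsgi analysis above, here is a summary of the initialization flow when the transport is websocket:

I wrote a simple client script; its interaction with zaqar-websocket goes roughly as follows:

In my view, parse request and process request are the most important steps. parse request analyzes the data sent by the client, and process request then handles it. Processing still reaches the underlying endpoints, which carry out the operation through the backend API.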
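The two steps can be sketched as a decode-then-dispatch pair. This is an illustration under stated assumptions, not Zaqar's websocket code: the envelope fields (action, body, headers), the 'message_post' action string, the handler table, and the use of JSON in place of the binary serialization used by the client script are all assumptions.

```python
import json


def create_request(action, body, headers):
    """Serialize a request envelope (JSON stands in for the wire format)."""
    return json.dumps({'action': action, 'body': body, 'headers': headers})


def parse_request(payload):
    """Decode the payload and check that the envelope is usable."""
    try:
        req = json.loads(payload)
    except ValueError:
        return None, 'Malformed payload'
    if not isinstance(req, dict) or 'action' not in req:
        return None, 'Missing action'
    return req, None


def process_request(req, handlers):
    """Dispatch the parsed request to the handler registered for its action."""
    handler = handlers.get(req['action'])
    if handler is None:
        return {'code': 400, 'body': 'Unknown action'}
    result = handler(req.get('body', {}), req.get('headers', {}))
    return {'code': 200, 'body': result}


# Hypothetical handler table; in Zaqar this role is played by the endpoints.
handlers = {
    'message_post': lambda body, headers: len(body.get('messages', [])),
}

payload = create_request(
    'message_post',
    {'queue_name': 'kitkat',
     'messages': [{'body': {'key': 'value'}, 'ttl': 200}]},
    {'X-Project-ID': '7e55e1a7e'})
req, error = parse_request(payload)
response = process_request(req, handlers)
```

Keeping parsing separate from processing means a malformed payload is rejected before any handler, and therefore any storage call, is reached.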

1. client script

import uuid
from autobahn.asyncio.websocket import WebSocketClientProtocol, \
    WebSocketClientFactory
from zaqar.common import consts
from zaqar.tests.unit.transport.websocket import utils as test_utils


class MyClientProtocol(WebSocketClientProtocol):
    def onConnect(self, response):
        print("Server connected: {0}".format(response.peer))

    def onOpen(self):
        print("WebSocket connection open.")

        def demo():
            self.project_id = '7e55e1a7e'
            self.headers = {
                'Client-ID': str(uuid.uuid4()),
                'X-Project-ID': self.project_id
            }
            sample_messages = [
                {'body': {'key': 'value'}, 'ttl': 200},
            ]
            body = {
                "queue_name": "kitkat",
                "messages": sample_messages
            }
            # Get serialization tools for testing websocket transport
            dumps, loads, create_req = test_utils.get_pack_tools(binary=True)
            req = create_req(consts.MESSAGE_POST, body, self.headers)
            self.sendMessage(req, isBinary=True)
            self.factory.loop.call_later(1, demo)
        # start sending messages every second ..
        demo()

    def onMessage(self, payload, isBinary):
        if isBinary:
            print("Binary message received: {0} bytes".format(len(payload)))
        else:
            print("Text message received: {0}".format(payload.decode('utf8')))

    def onClose(self, wasClean, code, reason):
        print("WebSocket connection closed: {0}".format(reason))


if __name__ == '__main__':
    try:
        import asyncio
    except ImportError:
        # Trollius >= 0.3 was renamed
        import trollius as asyncio

    factory = WebSocketClientFactory(u"ws://127.0.0.1:9000")
    factory.protocol = MyClientProtocol
    loop = asyncio.get_event_loop()
    coro = loop.create_connection(factory, '127.0.0.1', 9000)
    loop.run_until_complete(coro)
    loop.run_forever()
    loop.close()

 
