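Background: the SaaS backend for our company's mobile app is basically stable, but it never feels lean enough. Compared with mature open-source Python frameworks it lacks elegance. I keep wanting to refactor the backend code, but whenever I start it turns into a tangled mess and I don't know where to begin.
Since I have no experience implementing a framework, I plan to start with the Python frameworks I already use and study how other people designed them.
Noted for the record, March 31, 2017.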
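pony: a model implementation for an ORM (the M in ORM)
pony's models are a little unusual: a model has to inherit from a class that is a member of a Database instance. Straight to the key code: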
class Database(object):
    @cut_traceback
    def __init__(self, *args, **kwargs):
        # argument 'self' cannot be named 'database', because 'database' can be in kwargs
        self.priority = 0
        self._insert_cache = {}

        # ER-diagram related stuff:
        self._translator_cache = {}
        self._constructed_sql_cache = {}
        self.entities = {}
        self.schema = None
        self.Entity = type.__new__(EntityMeta, 'Entity', (Entity,), {})
        self.Entity._database_ = self

        # Statistics-related stuff:
        self._global_stats = {}
        self._global_stats_lock = RLock()

        self._dblocal = DbLocal()

        self.provider = None
        if args or kwargs: self._bind(*args, **kwargs)
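A user-defined model derives from the Entity attribute of a Database instance. That attribute is a class, and it takes care of collecting and converting the user-defined attributes. This design couples the model to Database: a model cannot exist on its own and must be attached to a Database instance.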
self.Entity = type.__new__(EntityMeta, 'Entity', (Entity,), {})
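To make the coupling concrete, here is a minimal sketch of the idea, not pony's actual code. It assumes Python 3, and the names MiniDatabase and MiniEntityMeta are hypothetical. Each database object creates its own Entity base class, and every model that inherits from it is registered with that database only.

```python
class MiniEntityMeta(type):
    """Hypothetical metaclass: registers each model with its owning database."""

    def __init__(cls, name, bases, cls_dict):
        super().__init__(name, bases, cls_dict)
        # The per-database base class has no _database_ yet when it is created,
        # so only user models (which inherit _database_) get registered.
        database = getattr(cls, '_database_', None)
        if database is not None:
            database.entities[name] = cls


class MiniDatabase(object):
    def __init__(self):
        self.entities = {}
        # Every database instance gets a fresh base class of its own.
        self.Entity = MiniEntityMeta('Entity', (object,), {})
        self.Entity._database_ = self


db1 = MiniDatabase()
db2 = MiniDatabase()


class Customer(db1.Entity):
    pass
```

In this sketch Customer ends up in db1.entities and nowhere else, and db1.Entity and db2.Entity are distinct classes, which is why a model cannot be defined without a database object.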
This is how a user-defined model inherits from it:
class Customer(db.Entity):
    id = PrimaryKey(int, auto=True)
    name = Required(str)
    email = Required(str, unique=True)
    orders = Set("Order")
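So db.Entity is just a member of the db instance. It only happens to be a special kind of member: a class.
Next, let's see what EntityMeta and Entity actually are.
The code is long, so only the key parts are excerpted: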
class EntityMeta(type):
    def __new__(meta, name, bases, cls_dict):
        if 'Entity' in globals():
            if '__slots__' in cls_dict: throw(TypeError, 'Entity classes cannot contain __slots__ variable')
            cls_dict['__slots__'] = ()
        return super(EntityMeta, meta).__new__(meta, name, bases, cls_dict)

    @cut_traceback
    def __init__(entity, name, bases, cls_dict):
        super(EntityMeta, entity).__init__(name, bases, cls_dict)
        .......
        # Find the user-defined attributes on the model and convert them according to
        # their type so that they fit the database; see the Attribute class for details
        direct_bases = [ c for c in entity.__bases__ if issubclass(c, Entity) and c.__name__ != 'Entity' ]
        entity._direct_bases_ = direct_bases
        base_attrs = []
        base_attrs_dict = {}  # used by the loops below; the initialization was missing from the excerpt
        for base in direct_bases:
            for a in base._attrs_:
                prev = base_attrs_dict.get(a.name)
                if prev is None:
                    base_attrs_dict[a.name] = a
                    base_attrs.append(a)
        entity._base_attrs_ = base_attrs

        new_attrs = []
        for name, attr in items_list(entity.__dict__):
            if name in base_attrs_dict: throw(ERDiagramError, "Name '%s' hides base attribute %s" % (name, base_attrs_dict[name]))
            if not isinstance(attr, Attribute): continue
            if name.startswith('_') and name.endswith('_'): throw(ERDiagramError,
                'Attribute name cannot both start and end with underscore. Got: %s' % name)
            if attr.entity is not None: throw(ERDiagramError,
                'Duplicate use of attribute %s in entity %s' % (attr, entity.__name__))
            attr._init_(entity, name)
            new_attrs.append(attr)
        # Sort by definition order
        new_attrs.sort(key=attrgetter('id'))
        # Attribute collection is complete
        entity._new_attrs_ = new_attrs
        entity._attrs_ = base_attrs + new_attrs
        entity._adict_ = {attr.name: attr for attr in entity._attrs_}
The user-facing interface:
    @cut_traceback
    def __getitem__(entity, key):
        if type(key) is not tuple: key = (key,)
        if len(key) != len(entity._pk_attrs_):
            throw(TypeError, 'Invalid count of attrs in %s primary key (%s instead of %s)'
                             % (entity.__name__, len(key), len(entity._pk_attrs_)))
        kwargs = {attr.name: value for attr, value in izip(entity._pk_attrs_, key)}
        return entity._find_one_(kwargs)
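Because __getitem__ is defined on the metaclass, the indexing syntax applies to the model class itself rather than to its instances. The following minimal sketch illustrates that mechanism only. It assumes Python 3, and LookupMeta, _pk_names_ and _rows_ are hypothetical stand-ins for pony's primary-key attributes and database lookup.

```python
class LookupMeta(type):
    """Hypothetical metaclass that lets the class itself be indexed."""

    def __getitem__(cls, key):
        # Normalize a scalar key to a 1-tuple, as the excerpt does.
        if type(key) is not tuple:
            key = (key,)
        if len(key) != len(cls._pk_names_):
            raise TypeError('Invalid count of attrs in %s primary key (%s instead of %s)'
                            % (cls.__name__, len(key), len(cls._pk_names_)))
        return cls._rows_[key]


class Customer(metaclass=LookupMeta):
    _pk_names_ = ('id',)        # hypothetical: names of the primary-key columns
    _rows_ = {(1,): 'Alice'}    # hypothetical stand-in for a database query


first_customer = Customer[1]    # calls LookupMeta.__getitem__(Customer, 1)
```

Passing a key with the wrong number of components, such as Customer[1, 2], raises TypeError in this sketch.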
Entity is a class whose metaclass is EntityMeta. It mainly handles the complex relationships in the database:
class Entity(with_metaclass(EntityMeta)):
    .......
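In Python 2, the definition above is equivalent to:
class Entity(object):
    __metaclass__ = EntityMeta
It is written with with_metaclass to smooth over the difference between Python 2 and Python 3.
In Python 3 the syntax is:
class MyClass(metaclass=Meta):
    pass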
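A helper of this kind can be sketched in a few lines. The version below is a simplified assumption modeled on the approach used by compatibility libraries such as six, and is not any library's exact code. It builds a throwaway base class whose temporary metaclass swaps in the real metaclass when the subclass is created.

```python
def with_metaclass(meta, *bases):
    """Simplified compatibility helper (a sketch, not a library's real code)."""
    class temporary_meta(type):
        def __new__(cls, name, this_bases, namespace):
            # Discard the temporary base and build the class with the
            # real metaclass and the real bases instead.
            return meta(name, bases, namespace)
    return type.__new__(temporary_meta, 'temporary_class', (), {})


class Meta(type):
    pass


class ViaHelper(with_metaclass(Meta)):   # same source works on Python 2 and 3
    pass


class ViaSyntax(metaclass=Meta):         # Python 3 only
    pass
```

Both classes end up with Meta as their metaclass, and the temporary base class does not appear among the bases of ViaHelper.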
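Because metaclasses are involved, implementation difficulty: 4 stars
Key point: capture the user-defined attributes and encapsulate the underlying storage and conversion. This is commonly used to implement the M layer of an ORM.
Summary: collecting the members declared on a subclass takes the following 3 steps:
1. Implement your own metaclass;
2. Check the type of each attribute on the subclass and merge the attributes of the relevant type (including those inherited from base classes);
3. Expose an interface to callers, e.g. __getitem__, __setattr__
For metaclass usage, see: http://blog.jobbole.com/21351/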
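The three steps can be sketched as follows. This is a minimal illustration under stated assumptions, not pony's implementation: it assumes Python 3, and Attribute, ModelMeta, Person and Student are hypothetical names.

```python
import itertools


class Attribute(object):
    """Hypothetical attribute marker; id records the definition order."""
    _counter = itertools.count()

    def __init__(self, py_type):
        self.py_type = py_type
        self.id = next(Attribute._counter)
        self.name = None


class ModelMeta(type):                       # Step 1: a custom metaclass
    def __init__(cls, name, bases, cls_dict):
        super().__init__(name, bases, cls_dict)
        # Step 2a: merge the attributes inherited from base models.
        base_attrs, seen = [], {}
        for base in bases:
            for attr in getattr(base, '_attrs_', []):
                if attr.name not in seen:
                    seen[attr.name] = attr
                    base_attrs.append(attr)
        # Step 2b: pick out the Attribute objects declared on this class.
        new_attrs = []
        for attr_name, value in cls_dict.items():
            if isinstance(value, Attribute):
                if attr_name in seen:
                    raise TypeError('Name %r hides a base attribute' % attr_name)
                value.name = attr_name
                new_attrs.append(value)
        new_attrs.sort(key=lambda a: a.id)   # keep definition order
        cls._attrs_ = base_attrs + new_attrs

    def __getitem__(cls, attr_name):         # Step 3: interface on the class
        for attr in cls._attrs_:
            if attr.name == attr_name:
                return attr
        raise KeyError(attr_name)


class Person(metaclass=ModelMeta):
    name = Attribute(str)
    age = Attribute(int)


class Student(Person):
    school = Attribute(str)


student_attr_names = [a.name for a in Student._attrs_]
```

Here Student inherits the attributes collected for Person and appends its own, and Student['age'] returns the Attribute object declared on Person.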
For comparison, here is how infi.clickhouse_orm implements the M:
Step 1: create your own metaclass
class ModelBase(type):
    '''
    A metaclass for ORM models. It adds the _fields list to model classes.
    '''

    ad_hoc_model_cache = {}

    def __new__(cls, name, bases, attrs):
        new_cls = super(ModelBase, cls).__new__(cls, name, bases, attrs)
        # Collect fields from parent classes
        base_fields = []
        for base in bases:
            if isinstance(base, ModelBase):
                base_fields += base._fields
        # Build a list of fields, in the order they were listed in the class
        fields = base_fields + [item for item in attrs.items() if isinstance(item[1], Field)]
        fields.sort(key=lambda item: item[1].creation_counter)
        setattr(new_cls, '_fields', fields)
        return new_cls
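Here, _fields stores the user-defined (class-level) attributes.
Step 2: implement the base class for M, which provides the interface callers use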
class Model(with_metaclass(ModelBase)):
    '''
    A base class for ORM models.
    '''

    engine = None
    readonly = False

    def __init__(self, **kwargs):
        '''
        Creates a model instance, using keyword arguments as field values.
        Since values are immediately converted to their Pythonic type,
        invalid values will cause a ValueError to be raised.
        Unrecognized field names will cause an AttributeError.
        '''
        super(Model, self).__init__()

        self._database = None

        # Assign field values from keyword arguments
        for name, value in kwargs.items():
            field = self.get_field(name)
            if field:
                setattr(self, name, value)
            else:
                raise AttributeError('%s does not have a field called %s' % (self.__class__.__name__, name))
        # Assign default values for fields not included in the keyword arguments
        for name, field in self._fields:
            if name not in kwargs:
                setattr(self, name, field.default)

    def __setattr__(self, name, value):
        '''
        When setting a field value, converts the value to its Pythonic type and validates it.
        This may raise a ValueError.
        '''
        field = self.get_field(name)
        # get_field() looks the name up on the class, so a declared Field is
        # still found here even after the instance attribute has been assigned
        if field:
            value = field.to_python(value, pytz.utc)
            field.validate(value)
        # Store the (converted) value on the instance. It hides the class-level
        # Field only for lookups on the instance, so later assignments are
        # converted and validated as well
        super(Model, self).__setattr__(name, value)

    def get_field(self, name):
        '''
        Get a Field instance given its name, or None if not found.
        '''
        field = getattr(self.__class__, name, None)
        return field if isinstance(field, Field) else None
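Where:
__init__ supports construction in the style of ModelSome(**kwargs). _fields serves two purposes: 1. it supplies the default values at initialization; 2. it keeps the Field objects at class level, because ModelSome(**kwargs) and __setattr__ set instance attributes with the same names, which hide the Field attributes when they are read through the instance.
__setattr__ provides an assignment interface similar to setting a dictionary key (instance.name = value), with conversion and validation applied to field values.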
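The interplay between the class-level Field objects and the instance values can be tried out with a stripped-down sketch. It is an illustration under assumptions, not the library's code: it assumes Python 3, the Field class here is hypothetical and converts with a plain Python type, and field inheritance and ordering are left out.

```python
class Field(object):
    """Hypothetical field: converts values with a plain Python type."""

    def __init__(self, py_type, default):
        self.py_type = py_type
        self.default = default

    def to_python(self, value):
        return self.py_type(value)   # may raise ValueError


class ModelBase(type):
    def __new__(mcs, name, bases, attrs):
        new_cls = super().__new__(mcs, name, bases, attrs)
        # Simplified: fields of parent classes are not merged in this sketch.
        new_cls._fields = [(n, f) for n, f in attrs.items() if isinstance(f, Field)]
        return new_cls


class Model(metaclass=ModelBase):
    def __init__(self, **kwargs):
        for name, value in kwargs.items():
            if self.get_field(name) is None:
                raise AttributeError('%s does not have a field called %s'
                                     % (self.__class__.__name__, name))
            setattr(self, name, value)
        for name, field in self._fields:
            if name not in kwargs:
                setattr(self, name, field.default)

    def __setattr__(self, name, value):
        field = self.get_field(name)
        if field is not None:
            value = field.to_python(value)
        super().__setattr__(name, value)

    def get_field(self, name):
        # Looked up on the class, so instance values never hide the Field here.
        field = getattr(self.__class__, name, None)
        return field if isinstance(field, Field) else None


class Visit(Model):
    count = Field(int, default=0)
    page = Field(str, default='')


visit = Visit(count='3')   # '3' is converted to 3, page falls back to ''
visit.count = '7'          # converted again on a later assignment
```

In this sketch the conversion runs on every assignment, because get_field reads the Field from the class while the converted value lives in the instance.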
Scrapy, a big name in the crawling world, also applies the ORM-model idea to its user-defined Items.
Here is how Scrapy implements its metaclass:
class ItemMeta(ABCMeta):

    def __new__(mcs, class_name, bases, attrs):
        classcell = attrs.pop('__classcell__', None)
        new_bases = tuple(base._class for base in bases if hasattr(base, '_class'))
        _class = super(ItemMeta, mcs).__new__(mcs, 'x_' + class_name, new_bases, attrs)

        fields = getattr(_class, 'fields', {})
        new_attrs = {}
        for n in dir(_class):
            v = getattr(_class, n)
            if isinstance(v, Field):
                fields[n] = v
            elif n in attrs:
                new_attrs[n] = attrs[n]

        new_attrs['fields'] = fields
        new_attrs['_class'] = _class
        if classcell is not None:
            new_attrs['__classcell__'] = classcell
        return super(ItemMeta, mcs).__new__(mcs, class_name, bases, new_attrs)
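The metaclass inherits from ABCMeta. It tells Item base classes apart from other bases by checking whether they have a _class attribute.
The parent classes of Item in Scrapy: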
live_refs = defaultdict(weakref.WeakKeyDictionary)


class object_ref(object):
    """Inherit from this class (instead of object) to a keep a record of live
    instances"""

    __slots__ = ()

    def __new__(cls, *args, **kwargs):
        obj = object.__new__(cls)
        live_refs[cls][obj] = time()
        return obj


class BaseItem(object_ref):
    """Base class for all scraped items."""
    pass
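The effect of this bookkeeping is easiest to see in a self-contained run. The sketch below repeats the tracking class with its imports and adds a hypothetical Page subclass for demonstration. Because the registry holds only weak references, an instance disappears from it once nothing else refers to it.

```python
import gc
import weakref
from collections import defaultdict
from time import time

live_refs = defaultdict(weakref.WeakKeyDictionary)


class object_ref(object):
    """Records every live instance together with its creation time."""
    __slots__ = ()

    def __new__(cls, *args, **kwargs):
        obj = object.__new__(cls)
        # Weak keys: the registry itself never keeps an instance alive.
        live_refs[cls][obj] = time()
        return obj


class Page(object_ref):   # hypothetical subclass, for demonstration only
    pass


first, second = Page(), Page()
count_before = len(live_refs[Page])

del first
gc.collect()              # make sure the dropped instance is really gone
count_after = len(live_refs[Page])
```

count_before covers both instances, while count_after covers only the one that is still referenced.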
The Item implementation:
class DictItem(MutableMapping, BaseItem):

    fields = {}

    def __init__(self, *args, **kwargs):
        self._values = {}
        if args or kwargs:  # avoid creating dict for most common case
            for k, v in six.iteritems(dict(*args, **kwargs)):
                self[k] = v
    ...........


@six.add_metaclass(ItemMeta)
class Item(DictItem):
    pass
Item reuses the MutableMapping class, so it behaves much like a native Python dict.
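What MutableMapping buys can be shown with a minimal sketch. It is a simplified illustration rather than Scrapy's code: it assumes Python 3, SimpleItem and Product are hypothetical names, and the fields dictionary is written by hand instead of being collected by a metaclass. Only five methods are implemented; the rest of the dict interface (get, keys, items, update, ...) comes from the abstract base class.

```python
from collections.abc import MutableMapping


class Field(dict):
    """Container for field metadata."""


class SimpleItem(MutableMapping):
    fields = {}   # declared field names; subclasses override this by hand here

    def __init__(self, *args, **kwargs):
        self._values = {}
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def __getitem__(self, key):
        return self._values[key]

    def __setitem__(self, key, value):
        # Only declared fields may be set.
        if key not in self.fields:
            raise KeyError('%s does not support field: %s'
                           % (self.__class__.__name__, key))
        self._values[key] = value

    def __delitem__(self, key):
        del self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


class Product(SimpleItem):
    fields = {'name': Field(), 'price': Field()}


product = Product(name='pen')
product['price'] = 2
as_dict = dict(product)              # works through the mapping protocol
missing = product.get('discount')    # get() is inherited from MutableMapping
```

Assigning to an undeclared key such as product['color'] raises KeyError in this sketch, which is the main behavioral difference from a plain dict.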
How pony loads different provider implementations (creating instances dynamically)
def _bind(self, *args, **kwargs):
    if self.provider is not None:
        throw(TypeError, 'Database object was already bound to %s provider' % self.provider.dialect)
    if args: provider, args = args[0], args[1:]
    elif 'provider' not in kwargs: throw(TypeError, 'Database provider is not specified')
    else: provider = kwargs.pop('provider')
    if isinstance(provider, type) and issubclass(provider, DBAPIProvider):
        provider_cls = provider
    else:
        if not isinstance(provider, basestring): throw(TypeError)
        if provider == 'pygresql': throw(TypeError,
            'Pony no longer supports PyGreSQL module. Please use psycopg2 instead.')
        provider_module = import_module('pony.orm.dbproviders.' + provider)
        provider_cls = provider_module.provider_cls
    self.provider = provider = provider_cls(*args, **kwargs)
The key code:
provider_module = import_module('pony.orm.dbproviders.' + provider)
provider_cls = provider_module.provider_cls
self.provider = provider = provider_cls(*args, **kwargs)
provider is an identifier string supplied by the user, and every provider module exposes the same interface name: provider_cls
Taking the SQLite module as an example:
provider_cls = SQLiteProvider
Example call:
db = Database("sqlite", "demo.sqlite", create_db=True)
Implementation difficulty: 1 star
Use case: creating the required object from an identifier, i.e. the factory pattern.
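The same pattern can be tried without installing pony. In the sketch below, the standard library's encodings package stands in for pony's provider package and the getregentry function that each encodings module exposes plays the role of the uniform provider_cls name. That mapping is an assumption made for illustration, and make_codec is a hypothetical helper.

```python
from importlib import import_module


def make_codec(name):
    """Build a codec object from an identifier string.

    Sketch only: 'encodings.<name>' stands in for 'pony.orm.dbproviders.<name>',
    and getregentry stands in for provider_cls.
    """
    if not isinstance(name, str):
        raise TypeError('codec name must be a string')
    # Import the module selected by the identifier ...
    module = import_module('encodings.' + name)
    # ... and call the entry point that every module in the package exposes.
    return module.getregentry()


codec = make_codec('utf_8')
encoded, consumed = codec.encode('hello')
```

The caller only passes an identifier such as 'utf_8' or 'ascii'. An identifier with no matching module fails with ImportError at import time, which is a natural place to report an unsupported backend.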