Source Code Analysis of Django's Model save()

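This article analyzes how the save method of a Django Model updates and creates database records. It starts with an example model and view function, then walks through the source code to show that, when a primary key is set, save first attempts an UPDATE and falls back to an INSERT if no row matches. It closes with a concrete example of save updating a record when the primary key already exists.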

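Environment:

Windows 10, Python 3.6.5, Django 1.11.8

Background:

1. Updating a database record; 2. creating a database record; 3. creating a record whose primary key already exists.

        Cases 1 and 3 end up updating a record, and case 2 creates one. How does save() implement this?

Analysis:

1. Model: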

from django.db import models


class DelayTest(models.Model):
    id = models.AutoField(auto_created=True, primary_key=True)
    name = models.CharField(max_length=20, null=True, verbose_name="名称")
    createdate = models.DateTimeField(auto_now_add=True)
    updatedate = models.DateTimeField(auto_now=True)

    class Meta:
        db_table = "delaytest"

2. View function:

from datetime import datetime

# Create a record
blogs = DelayTest(name="hello")
blogs.save()
# Create a record whose primary key already exists
blogs = DelayTest(id=1, name="hello", createdate=datetime.now())
blogs.save()
# Update a record
blogs = DelayTest.objects.get(id=2)
blogs.save()

Source code analysis:

        Creating a record

        Calling save invokes the save() method defined in django\db\models\base.py. The source is:

    def save(self, force_insert=False, force_update=False, using=None,
             update_fields=None):
        """
        Saves the current instance. Override this in a subclass if you want to
        control the saving process.

        The 'force_insert' and 'force_update' parameters can be used to insist
        that the "save" must be an SQL insert or update (or equivalent for
        non-SQL backends), respectively. Normally, they should not be set.
        """
        # Ensure that a model instance without a PK hasn't been assigned to
        # a ForeignKey or OneToOneField on this model. If the field is
        # nullable, allowing the save() would result in silent data loss.
        for field in self._meta.concrete_fields:
            if field.is_relation:
                # If the related field isn't cached, then an instance hasn't
                # been assigned and there's no need to worry about this check.
                try:
                    getattr(self, field.get_cache_name())
                except AttributeError:
                    continue
                obj = getattr(self, field.name, None)
                # A pk may have been assigned manually to a model instance not
                # saved to the database (or auto-generated in a case like
                # UUIDField), but we allow the save to proceed and rely on the
                # database to raise an IntegrityError if applicable. If
                # constraints aren't supported by the database, there's the
                # unavoidable risk of data corruption.
                if obj and obj.pk is None:
                    # Remove the object from a related instance cache.
                    if not field.remote_field.multiple:
                        delattr(obj, field.remote_field.get_cache_name())
                    raise ValueError(
                        "save() prohibited to prevent data loss due to "
                        "unsaved related object '%s'." % field.name
                    )

        using = using or router.db_for_write(self.__class__, instance=self)
        if force_insert and (force_update or update_fields):
            raise ValueError("Cannot force both insert and updating in model saving.")

        deferred_fields = self.get_deferred_fields()
        if update_fields is not None:
            # If update_fields is empty, skip the save. We do also check for
            # no-op saves later on for inheritance cases. This bailout is
            # still needed for skipping signal sending.
            if len(update_fields) == 0:
                return

            update_fields = frozenset(update_fields)
            field_names = set()

            for field in self._meta.fields:
                if not field.primary_key:
                    field_names.add(field.name)

                    if field.name != field.attname:
                        field_names.add(field.attname)

            non_model_fields = update_fields.difference(field_names)

            if non_model_fields:
                raise ValueError("The following fields do not exist in this "
                                 "model or are m2m fields: %s"
                                 % ', '.join(non_model_fields))

        # If saving to the same database, and this model is deferred, then
        # automatically do a "update_fields" save on the loaded fields.
        elif not force_insert and deferred_fields and using == self._state.db:
            field_names = set()
            for field in self._meta.concrete_fields:
                if not field.primary_key and not hasattr(field, 'through'):
                    field_names.add(field.attname)
            loaded_fields = field_names.difference(deferred_fields)
            if loaded_fields:
                update_fields = frozenset(loaded_fields)

        self.save_base(using=using, force_insert=force_insert,
                       force_update=force_update, update_fields=update_fields)
    save.alters_data = True

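Note the check at the top of save() for relation fields. It loops over the model's concrete fields, so it covers ForeignKey and OneToOneField (many-to-many fields are not concrete fields and are not examined here). The source comment reads: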

        # Ensure that a model instance without a PK hasn't been assigned to
        # a ForeignKey or OneToOneField on this model. If the field is
        # nullable, allowing the save() would result in silent data loss.

In the end save() calls save_base. The source is:

    def save_base(self, raw=False, force_insert=False,
                  force_update=False, using=None, update_fields=None):
        """
        Handles the parts of saving which should be done only once per save,
        yet need to be done in raw saves, too. This includes some sanity
        checks and signal sending.

        The 'raw' argument is telling save_base not to save any parent
        models and not to do any changes to the values before save. This
        is used by fixture loading.
        """
        using = using or router.db_for_write(self.__class__, instance=self)
        assert not (force_insert and (force_update or update_fields))
        assert update_fields is None or len(update_fields) > 0
        cls = origin = self.__class__
        # Skip proxies, but keep the origin as the proxy model.
        if cls._meta.proxy:
            cls = cls._meta.concrete_model
        meta = cls._meta
        if not meta.auto_created:
            pre_save.send(
                sender=origin, instance=self, raw=raw, using=using,
                update_fields=update_fields,
            )
        with transaction.atomic(using=using, savepoint=False):
            if not raw:
                self._save_parents(cls, using, update_fields)
            updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
        # Store the database on which the object was saved
        self._state.db = using
        # Once saved, this is no longer a to-be-added instance.
        self._state.adding = False

        # Signal that the save is complete
        if not meta.auto_created:
            post_save.send(
                sender=origin, instance=self, created=(not updated),
                update_fields=update_fields, raw=raw, using=using,
            )

    save_base.alters_data = True

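save_base first sends the pre_save signal, then writes to the database inside a transaction (the with transaction.atomic(...) block), and finally sends the post_save signal, at which point the save is complete. The core of the work is this line: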
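The order of events in save_base can be sketched with plain callables. This is a simplified model, not Django code: save_base_sketch and the three callbacks are hypothetical names, and the transaction is only indicated by a comment.

```python
def save_base_sketch(save_table, on_pre_save, on_post_save):
    """Mimic the event order of save_base with plain callables.

    save_table is a hypothetical stand-in for _save_table and must return
    True when an UPDATE matched a row, False when an INSERT was needed.
    """
    on_pre_save()
    # In Django this call runs inside transaction.atomic(using=using, savepoint=False).
    updated = save_table()
    # post_save receives created=(not updated), as in the real save_base.
    on_post_save(created=not updated)
    return updated


log = []


def record_pre_save():
    log.append("pre_save")


def insert_only_save_table():
    log.append("_save_table")
    return False  # no row was updated, so an INSERT happened


def record_post_save(created):
    log.append("post_save(created=%s)" % created)


save_base_sketch(insert_only_save_table, record_pre_save, record_post_save)
print(log)
```

The created flag that post_save receivers see is derived purely from whether the UPDATE matched a row, not from any prior existence check.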

updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)

This calls the _save_table method. The source is:

    def _save_table(self, raw=False, cls=None, force_insert=False,
                    force_update=False, using=None, update_fields=None):
        """
        Does the heavy-lifting involved in saving. Updates or inserts the data
        for a single table.
        """
        meta = cls._meta
        non_pks = [f for f in meta.local_concrete_fields if not f.primary_key]

        if update_fields:
            non_pks = [f for f in non_pks
                       if f.name in update_fields or f.attname in update_fields]

        pk_val = self._get_pk_val(meta)
        if pk_val is None:
            pk_val = meta.pk.get_pk_value_on_save(self)
            setattr(self, meta.pk.attname, pk_val)
        pk_set = pk_val is not None
        if not pk_set and (force_update or update_fields):
            raise ValueError("Cannot force an update in save() with no primary key.")
        updated = False
        # If possible, try an UPDATE. If that doesn't update anything, do an INSERT.
        if pk_set and not force_insert:
            base_qs = cls._base_manager.using(using)
            values = [(f, None, (getattr(self, f.attname) if raw else f.pre_save(self, False)))
                      for f in non_pks]
            forced_update = update_fields or force_update
            updated = self._do_update(base_qs, using, pk_val, values, update_fields,
                                      forced_update)
            if force_update and not updated:
                raise DatabaseError("Forced update did not affect any rows.")
            if update_fields and not updated:
                raise DatabaseError("Save with update_fields did not affect any rows.")
        if not updated:
            if meta.order_with_respect_to:
                # If this is a model with an order_with_respect_to
                # autopopulate the _order field
                field = meta.order_with_respect_to
                filter_args = field.get_filter_kwargs_for_object(self)
                order_value = cls._base_manager.using(using).filter(**filter_args).count()
                self._order = order_value

            fields = meta.local_concrete_fields
            if not pk_set:
                fields = [f for f in fields if f is not meta.auto_field]

            update_pk = meta.auto_field and not pk_set
            result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
            if update_pk:
                setattr(self, meta.pk.attname, result)
        return updated
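
base_qs is a queryset over the table obtained through the model's base manager (it is lazy, so nothing is queried yet); pk_val is the primary key value; updated is a flag that records whether the UPDATE matched a row. When a primary key is set and force_insert is False, save first tries an UPDATE (_do_update), and if no row was updated it falls back to an INSERT (_do_insert). When no primary key is set (pk_set is False), the UPDATE branch is skipped and the INSERT runs directly. The source comment says as much:
# If possible, try an UPDATE. If that doesn't update anything, do an INSERT.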
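The branching in _save_table can be reduced to a small pure-Python sketch. The function plan_save and its row_exists parameter are hypothetical: row_exists stands in for whether the UPDATE's WHERE clause would match a row, and RuntimeError stands in for Django's DatabaseError. The update_fields handling and the order_with_respect_to logic are left out.

```python
def plan_save(pk_val, row_exists, force_insert=False, force_update=False):
    """Return the SQL statements a save() would attempt, in order.

    Simplified model of the branching in _save_table; not Django code.
    """
    pk_set = pk_val is not None
    if not pk_set and force_update:
        raise ValueError("Cannot force an update in save() with no primary key.")
    statements = []
    updated = False
    # An UPDATE is only attempted when a primary key value is present.
    if pk_set and not force_insert:
        statements.append("UPDATE")
        updated = row_exists
        if force_update and not updated:
            # Django raises DatabaseError here; RuntimeError is a stand-in.
            raise RuntimeError("Forced update did not affect any rows.")
    # Fall back to INSERT when nothing was updated.
    if not updated:
        statements.append("INSERT")
    return statements


print(plan_save(pk_val=None, row_exists=False))   # new object, no pk
print(plan_save(pk_val=1, row_exists=True))       # pk set, row exists
print(plan_save(pk_val=100, row_exists=False))    # pk set, no such row
```

Under these assumptions the three calls correspond to a plain create, a save on an existing row, and a save with an explicit primary key that is not yet in the table.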

This calls _do_update. The source is:

    def _do_update(self, base_qs, using, pk_val, values, update_fields, forced_update):
        """
        This method will try to update the model. If the model was updated (in
        the sense that an update query was done and a matching row was found
        from the DB) the method will return True.
        """
        filtered = base_qs.filter(pk=pk_val)
        if not values:
            # We can end up here when saving a model in inheritance chain where
            # update_fields doesn't target any field in current model. In that
            # case we just say the update succeeded. Another case ending up here
            # is a model with just PK - in that case check that the PK still
            # exists.
            return update_fields is not None or filtered.exists()
        if self._meta.select_on_save and not forced_update:
            if filtered.exists():
                # It may happen that the object is deleted from the DB right after
                # this check, causing the subsequent UPDATE to return zero matching
                # rows. The same result can occur in some rare cases when the
                # database returns zero despite the UPDATE being executed
                # successfully (a row is matched and updated). In order to
                # distinguish these two cases, the object's existence in the
                # database is again checked for if the UPDATE query returns 0.
                return filtered._update(values) > 0 or filtered.exists()
            else:
                return False
        return filtered._update(values) > 0

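Here the earlier base_qs is filtered by pk (the primary key). If the primary key value is not in the table yet (for example, an instance constructed with an explicit id for which no row exists), the filtered queryset matches nothing. An instance with no primary key at all never reaches _do_update, because pk_set is False in _save_table. With the default select_on_save = False, the method ends at: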

return filtered._update(values) > 0

This calls the _update method, which lives in django\db\models\query.py. The source is:

    def _update(self, values):
        """
        A version of update that accepts field objects instead of field names.
        Used primarily for model saving and not intended for use by general
        code (it requires too much poking around at model internals to be
        useful at that level).
        """
        assert self.query.can_filter(), \
            "Cannot update a query once a slice has been taken."
        query = self.query.clone(sql.UpdateQuery)
        query.add_update_fields(values)
        self._result_cache = None
        return query.get_compiler(self.db).execute_sql(CURSOR)
    _update.alters_data = True
    _update.queryset_only = False

This is where the SQL actually gets executed, via execute_sql.

Path: django\db\models\sql\compiler.py

Source:

    def execute_sql(self, result_type=MULTI, chunked_fetch=False):
        """
        Run the query against the database and returns the result(s). The
        return value is a single data item if result_type is SINGLE, or an
        iterator over the results if the result_type is MULTI.

        result_type is either MULTI (use fetchmany() to retrieve all rows),
        SINGLE (only retrieve a single row), or None. In this last case, the
        cursor is returned if any query is executed, since it's used by
        subclasses such as InsertQuery). It's possible, however, that no query
        is needed, as the filters describe an empty set. In that case, None is
        returned, to avoid any unnecessary database interaction.
        """
        if not result_type:
            result_type = NO_RESULTS
        try:
            sql, params = self.as_sql()
            if not sql:
                raise EmptyResultSet
        except EmptyResultSet:
            if result_type == MULTI:
                return iter([])
            else:
                return
        if chunked_fetch:
            cursor = self.connection.chunked_cursor()
        else:
            cursor = self.connection.cursor()
        try:
            cursor.execute(sql, params)
        except Exception:
            try:
                # Might fail for server-side cursors (e.g. connection closed)
                cursor.close()
            except Exception:
                # Ignore clean up errors and raise the original error instead.
                # Python 2 doesn't chain exceptions. Remove this error
                # silencing when dropping Python 2 compatibility.
                pass
            raise

        if result_type == CURSOR:
            # Caller didn't specify a result_type, so just give them back the
            # cursor to process (and close).
            return cursor
        if result_type == SINGLE:
            try:
                val = cursor.fetchone()
                if val:
                    return val[0:self.col_count]
                return val
            finally:
                # done with the cursor
                cursor.close()
        if result_type == NO_RESULTS:
            cursor.close()
            return

        result = cursor_iter(
            cursor, self.connection.features.empty_fetchmany_value,
            self.col_count
        )
        if not chunked_fetch and not self.connection.features.can_use_chunked_reads:
            try:
                # If we are using non-chunked reads, we return the same data
                # structure as normally, but ensure it is all read into memory
                # before going any further. Use chunked_fetch if requested.
                return list(result)
            finally:
                # done with the cursor
                cursor.close()
        return result

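result_type here is CURSOR, as passed in the _update call above.

The statement executes, but it matches no row because no record has that id. Running the equivalent SQL directly: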

update delaytest set name="hello" where id=100;

Because that id does not exist, the result is:

mysql> update delaytest set name="hello" where id=100;
Query OK, 0 rows affected (0.00 sec)
Rows matched: 0  Changed: 0  Warnings: 0
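The same behavior can be reproduced outside Django with the standard-library sqlite3 module. This is a minimal sketch assuming an in-memory table shaped like delaytest (only id and name); update_then_insert is a hypothetical helper that imitates the update-first, insert-as-fallback pattern and is not part of Django.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE delaytest (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO delaytest (id, name) VALUES (1, 'old')")


def update_then_insert(conn, pk, name):
    """Try an UPDATE by primary key; INSERT only if no row was affected."""
    cursor = conn.execute(
        "UPDATE delaytest SET name = ? WHERE id = ?", (name, pk)
    )
    updated = cursor.rowcount > 0  # 0 > 0 is False when the id is absent
    if not updated:
        conn.execute(
            "INSERT INTO delaytest (id, name) VALUES (?, ?)", (pk, name)
        )
    return updated


print(update_then_insert(conn, 100, "hello"))  # id 100 is absent
print(update_then_insert(conn, 1, "hello"))    # id 1 exists
rows = conn.execute("SELECT id, name FROM delaytest ORDER BY id").fetchall()
print(rows)
```

The return value plays the role of the updated flag: False means the fallback INSERT ran, True means the UPDATE alone was enough.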

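So the UPDATE changes nothing, which matches the earlier comment:

If possible, try an UPDATE. If that doesn't update anything, do an INSERT.

UPDATE is attempted first, and INSERT is the fallback. The base execute_sql shown above returns the cursor; for an UpdateQuery the compiler in use is SQLUpdateCompiler, whose execute_sql wraps that call and returns cursor.rowcount, so _update yields the number of matched rows. Back in

return filtered._update(values) > 0

0 > 0 is False, so updated in _save_table is also False, and the only remaining path is
result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)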

Once the INSERT succeeds, the post_save signal is sent and save() finishes.

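Creating a record whose primary key already exists

The difference from the previous run is the filter in _do_update: a matching row exists, so the UPDATE executed through execute_sql affects 1 row. 1 > 0 is True, so updated = True is returned, post_save is sent, and save() finishes.

Updating a record

Same as creating a record whose primary key already exists.

Conclusion: save() does not check in advance whether it is updating or creating. When the instance has a primary key, it tries an UPDATE first and, if 0 rows are affected, goes on to INSERT; when there is no primary key, it inserts directly.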

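PS:

A colleague asked me how save is meant to be used. I said it can be used on an object both to create and to update, and at the time I believed save could tell which of the two it was doing. He thought I was wrong. My test then failed, and the cause was that createdate had not been supplied (the test used blogs = DelayTest(id=1, name="hello"), while this post uses blogs = DelayTest(id=1, name="hello", createdate=datetime.now())). A failed test meant my explanation had a problem, which is what led to this source-level analysis of save.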
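One plausible explanation for the failed test, offered as a guess since the original error message is not given: on the UPDATE path _save_table builds its values with f.pre_save(self, False), so a field declared with auto_now_add=True is not filled in automatically. If createdate was never assigned, the UPDATE would try to write NULL into a column that does not allow it. The function below is a simplified stand-in for that behavior, not Django's actual implementation.

```python
from datetime import datetime


def auto_now_add_pre_save(current_value, add):
    """Simplified stand-in for how an auto_now_add field picks its value.

    add is True on the INSERT path and False on the UPDATE path.
    """
    if add:
        return datetime.now()
    # On UPDATE the field keeps whatever the instance currently holds.
    return current_value


# DelayTest(id=1, name="hello"): createdate was never assigned, so it is None.
update_value = auto_now_add_pre_save(None, add=False)
insert_value = auto_now_add_pre_save(None, add=True)
print(update_value)
print(insert_value is not None)
```

Passing createdate=datetime.now() explicitly, as the version in this post does, gives the UPDATE a non-null value to write.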
