Asynchronous Python and Databases

Asynchronous programming is a difficult topic to cover. These days, it’s not just about one thing, and I’m mostly an outsider to it. However, because I deal a lot with relational databases and the Python stack’s interaction with them, I have to field a lot of questions and issues regarding asynchronous IO and database programming, both for SQLAlchemy specifically and for OpenStack.

As I don’t have a simple opinion on the matter, I’ll try to give a spoiler for the rest of the blog post here. I think that the Python asyncio library is very neat, promising, and fun to use, and organized well enough that it’s clear that some level of SQLAlchemy compatibility is feasible, most likely including most parts of the ORM. As asyncio is now a standard part of Python, this compatibility layer is something I am interested in producing at some point.

All of that said, I still think that asynchronous programming is just one potential approach to have on the shelf, and is by no means the one we should be using all the time or even most of the time. The exceptions are HTTP servers, chat servers, and other applications that specifically need to concurrently maintain large numbers of arbitrarily slow or idle TCP connections (where by “arbitrarily” we mean that we don’t care whether individual connections are slow, fast, or idle; throughput can be maintained regardless). For standard business-style, CRUD-oriented database code, the approach given by asyncio is never necessary and will almost certainly hinder performance, and arguments that it promotes “correctness” are very questionable in terms of relational database programming. Applications that need to do non-blocking IO on the front end should leave the business-level CRUD code behind a thread pool.
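
As a rough illustration of keeping CRUD code behind a thread pool, here is a minimal sketch using only the standard library. The function names (`create_and_count_users`, `handle_request`, `run_thread_pool_demo`), the `users` table, and the use of an in-memory SQLite database are all assumptions made for the example and are not taken from SQLAlchemy or any particular application. The blocking database function knows nothing about asyncio; the async front end hands it to a `ThreadPoolExecutor` via `loop.run_in_executor()`.

```python
import asyncio
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def create_and_count_users(names):
    """Ordinary blocking CRUD code; it has no knowledge of asyncio."""
    # The connection is created and used inside the worker thread,
    # since sqlite3 connections are bound to their creating thread by default.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        conn.executemany(
            "INSERT INTO users (name) VALUES (?)", [(name,) for name in names]
        )
        conn.commit()
        (count,) = conn.execute("SELECT COUNT(*) FROM users").fetchone()
        return count
    finally:
        conn.close()


async def handle_request(executor, names):
    """Hypothetical non-blocking front end: awaits the blocking work."""
    loop = asyncio.get_running_loop()
    # The event loop stays free to serve other connections while
    # the CRUD function runs in a pool thread.
    return await loop.run_in_executor(executor, create_and_count_users, names)


def run_thread_pool_demo():
    with ThreadPoolExecutor(max_workers=4) as executor:
        return asyncio.run(handle_request(executor, ["ed", "wendy"]))
```

Calling `run_thread_pool_demo()` runs one simulated request end to end. In a real application the blocking function would more likely use a pooled connection than open a fresh database each time.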

With my presumably entirely unsurprising viewpoint revealed, let’s get underway!

What is Asynchronous IO?

Asynchronous IO is an approach used to achieve concurrency by allowing processing to continue while responses from IO operations are still being waited upon. To achieve this, IO function calls are made to be non-blocking, so that they return immediately, before the actual IO operation is complete or has even begun. A typically OS-dependent polling system (such as epoll) is used within a loop in order to query a set of file descriptors in search of the next one which has data available; when located, it is acted upon, and when the operation is complete, control goes back to the polling loop in order to act upon the next descriptor with data available.
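
The polling loop described above can be sketched with Python’s standard `selectors` module, which wraps epoll, kqueue, or select depending on the platform. This is only an illustration under simple assumptions: the function name `poll_demo` and the labels `"a"` and `"b"` are made up for the example, and local socket pairs stand in for real network peers. A real server would wrap the `select()` call in a long-running loop.

```python
import selectors
import socket


def poll_demo():
    # DefaultSelector picks the best polling mechanism available
    # on the platform (for example epoll on Linux).
    sel = selectors.DefaultSelector()

    # Two connected socket pairs stand in for two network peers.
    a_send, a_recv = socket.socketpair()
    b_send, b_recv = socket.socketpair()

    for label, sock in (("a", a_recv), ("b", b_recv)):
        sock.setblocking(False)  # reads return immediately
        sel.register(sock, selectors.EVENT_READ, data=label)

    # Only peer "b" sends anything; peer "a" stays idle.
    b_send.sendall(b"hello")

    received = {}
    # One pass of the polling loop: ask which descriptors have data,
    # then act on each of them in turn.
    for key, _events in sel.select(timeout=1.0):
        received[key.data] = key.fileobj.recv(1024)

    sel.close()
    for sock in (a_send, a_recv, b_send, b_recv):
        sock.close()
    return received
```

The idle connection costs nothing beyond its registration: the loop only touches the descriptor that actually has data.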

Non-blocking IO in its classical use case is for those cases where it’s not efficient to dedicate a thread of execution to waiting for a socket to have results. It’s an essential technique when you need to listen to lots of TCP sockets that are arbitrarily “sleepy” or slow. The best example is a chat server, or some similar kind of messaging system, where you have lots of persistent connections that send data only very occasionally; that is, when a connection actually sends data, we consider it to be an “event” to be responded to.

In recent years, the asynchronous IO approach has also been successfully applied to HTTP related servers and applications. The theory of operation is that a very large number of HTTP connections can be efficiently serviced without the need for the server to dedicate threads to wait on each connection individually; in particular, slow HTTP clients need not get in the way of the server being able to serve lots of other clients at the same time. Combine this with the renewed popularity of so-called long polling approaches, and non-blocking web servers like nginx have proven to work very well.

Asynchronous IO and Scripting

Asynchronous IO programming in scripting languages is heavily centered on the notion of an event loop, which in its most classic form uses callback functions that receive a call once their corresponding IO request has data available. A critical aspect of this type of programming is that, since the event loop has the effect of providing scheduling for a series of functions waiting for IO, a scripting language in particular can replace the need for threads and OS-level scheduling entirely, at least within a single CPU. It can in fact be a little bit awkward to integrate multithreaded, blocking IO code with code that uses non-blocking IO, as they necessarily use different programming approaches when IO-oriented methods are invoked.
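
To make the callback style concrete, here is a small sketch built on asyncio’s low-level `loop.add_reader()` API. The names `run_callback_demo` and `on_readable` are hypothetical, and the example assumes a selector-based event loop (the default on Unix; the Windows proactor loop does not support `add_reader()`). The callback is registered up front and only receives a call once its socket has data available.

```python
import asyncio
import socket


def run_callback_demo():
    loop = asyncio.new_event_loop()
    send_sock, recv_sock = socket.socketpair()
    recv_sock.setblocking(False)
    messages = []

    def on_readable():
        # Invoked by the event loop once recv_sock has data available.
        messages.append(recv_sock.recv(1024))
        loop.remove_reader(recv_sock)
        loop.stop()

    # Register the callback, then schedule the write that will trigger it.
    loop.add_reader(recv_sock, on_readable)
    loop.call_soon(send_sock.sendall, b"ping")
    # Safety net so the demo cannot hang if the callback never fires.
    loop.call_later(2.0, loop.stop)

    try:
        loop.run_forever()
    finally:
        loop.close()
        send_sock.close()
        recv_sock.close()
    return messages
```

Note that no thread is involved: the event loop itself decides when `on_readable` runs, which is the scheduling role described above.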

The relationship of asynchronous IO to event loops, combined with its growing popularity in web-server-oriented applications and its ability to provide concurrency in an intuitive and obvious way, created a perfect storm of factors that made it popular on one platform in particular: Javascript. Javascript was designed to be a client-side scripting language for browsers. Browsers, like any other GUI app, are essentially event machines; all they do is respond to user-initiated events such as button pushes, key presses, and mouse moves. As a result, Javascript has a very strong concept of an event loop with callbacks and, until recently, had no concept at all of multithreaded programming.

As an army of front-end developers from the 1990s through the 2000s mastered the use of these client-side callbacks, and began to use them not just for user-initiated events but for network-initiated events via AJAX connections, the stage was set for a new player to come along, which would transport the ever-growing community of Javascript programmers to a new place…

The Server

Node.js is not the first attempt to make Javascript a server-side language. However, a key reason for its success was that there were plenty of sophisticated and experienced Javascript programmers around by the time it was released, and that it fully embraces the event-driven programming paradigm that client-side Javascript programmers are already well-versed in and comfortable with.

In order to sell this model on the server side, it followed that the “non-blocking IO” approach needed to be established as appropriate not just for the classic case of “tending to lots of usually asleep or arbitrarily slow connections”, but as the de facto style in which all web-oriented software should be written. This meant that any network IO of any kind now had to be interacted with in a non-blocking fashion, and this of course includes database connections. Yet database connections are normally relatively few per process, with numbers of 10-50 being common; they are usually pooled, so that the latency associated with TCP startup is not much of an issue; and the response times for a well-architected database, naturally served over the local network behind the firewall and often clustered, are extremely fast and predictable. In every way, this is the exact opposite of the use case for which non-blocking IO was first intended. The PostgreSQL database supports an asynchronous command API in libpq, and states as a primary rationale for it (surprise!) its use in GUI applications.

Node.js already benefits from an extremely performant JIT-enabled engine, so it’s likely that, despite this repurposing of non-blocking IO for a case for which it was not intended, scheduling among database connections using non-blocking IO works acceptably well. (Author’s note: the comment here regarding libuv’s thread pool has been removed, as it only applies to file IO.)

The Spectre of Threads

Well before node.js was turning masses of client-side Javascript developers into async-only server-side programmers, academic theorists had begun to complain that the multithreaded programming model produces non-deterministic programs. Asynchronous programming, as a side effect of the event-driven paradigm, effectively provides an alternative model of concurrency (at least for any program with a sufficient proportion of IO to keep context switches high enough), and it quickly became one of several hammers used to beat multithreaded programming over the head. The criticism centered on two points: first, that threads are expensive to create and maintain in an application, making them inappropriate for applications that wish to tend to hundreds or thousands of connections simultaneously; and second, that multithreaded programming is difficult and non-deterministic. In the Python world, continued confusion over what the GIL does and does not do provided fertile ground for the async model to take root more strongly than it might have in other scenarios.
