Python laziness: lazy, a Python library for rapidly developing lazy interfaces

weixin_39787628

Tags: python, lazy

lazy

Python library for rapidly developing lazy interfaces. This is currently a prototype built for playing with the paradigm.

By deferring the execution of your code until the last possible moment (when you actually request the data with .get()), you can optimize its execution while preserving simple imperative semantics.

Optimizations include:

Minimal execution by tracing dependencies and executing only the operations needed to produce the data

Automatic output caching and invalidation

Automatic parallelization of the induced dataflow graph

How it works

This library works by modifying annotated functions to record when they are called, along with their inputs and outputs. Once .get() is invoked on an output, a minimal dataflow graph is generated by inspecting all of its dependencies (including cached outputs). This dataflow graph can optionally be parallelized automatically.
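To make the record-then-evaluate idea concrete, here is a minimal sketch using only the standard library. It is not the library's implementation: the names Deferred, deferred and executed are hypothetical, and it only illustrates how recording calls, resolving dependencies on .get() and caching outputs could fit together.

```python
import functools


class Deferred:
    """Hypothetical stand-in for a lazy output: remembers a call, runs it on .get()."""

    def __init__(self, func, args):
        self.func = func
        self.args = args        # may contain other Deferred objects (dependencies)
        self._cached = False
        self._value = None

    def get(self):
        if not self._cached:
            # Resolve dependencies first; already-cached ones return immediately.
            resolved = [a.get() if isinstance(a, Deferred) else a for a in self.args]
            self._value = self.func(*resolved)
            self._cached = True
        return self._value


def deferred(func):
    """Decorator: calling the function records the call instead of running it."""
    @functools.wraps(func)
    def wrapper(*args):
        return Deferred(func, args)
    return wrapper


executed = []  # log of which functions actually ran, in order


@deferred
def square(x):
    executed.append("square")
    return x ** 2


@deferred
def mul(x, y):
    executed.append("mul")
    return x * y


@deferred
def add(x, y):
    executed.append("add")
    return x + y


a, b = square(2), square(3)
c, d = mul(a, b), add(a, b)
nothing_ran_yet = list(executed)  # still empty: only the calls were recorded
c_value = c.get()                 # runs square, square, mul (add is skipped)
after_c = list(executed)
d_value = d.get()                 # runs only add; a and b come from the cache
```

In this sketch the dependency graph is implicit in the args of each Deferred object; a fuller version would presumably keep an explicit graph so it can be inspected, drawn and scheduled.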

A key requirement of this library is that all annotated functions be stateless and synchronous.

See the execution example at the bottom for details, or try it out yourself!

Usage

Decorate stateless and synchronous functions with @lazy.synchronous

    import time

    import lazy

    @lazy.synchronous
    def Square(x):
        time.sleep(0.1)
        return x ** 2

    @lazy.synchronous
    def Mul(x, y):
        time.sleep(0.1)
        return x * y

    @lazy.synchronous
    def Add(x, y):
        time.sleep(0.1)
        return x + y

Write your program and access the output of annotated functions with .get()

    a = Square(2)
    b = Square(3)
    c = Mul(a, b)
    d = Add(a, b)

    t = time.time()
    # The code isn't run until you call .get()
    print(c.get())
    print(time.time() - t)

    t = time.time()
    print(d.get())
    print(time.time() - t)

Run things in parallel automatically with lazy.parallelize = True

    lazy.parallelize = True

    a = Square(2)
    b = Square(3)
    c = Mul(a, b)

    t = time.time()
    print(c.get())
    print(time.time() - t)
    # Should only take 0.2s instead of 0.3s by automatic parallelism
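The 0.2s figure follows from the shape of the graph: the two Square calls do not depend on each other, so they can overlap, and only Mul has to wait. The sketch below shows one way such scheduling could work with the standard library's ThreadPoolExecutor. The run_in_waves helper and the dictionary-based graph format are assumptions made for illustration, not the library's scheduler.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_square(x):
    time.sleep(0.05)  # stand-in for real work
    return x ** 2


def slow_mul(x, y):
    time.sleep(0.05)
    return x * y


# Toy dataflow graph: name -> (function, names of the nodes it consumes).
graph = {
    "a": (lambda: slow_square(2), ()),
    "b": (lambda: slow_square(3), ()),
    "c": (slow_mul, ("a", "b")),
}


def run_in_waves(graph):
    """Run every node whose inputs are ready at the same time, wave by wave."""
    results, waves = {}, []
    remaining = dict(graph)
    with ThreadPoolExecutor() as pool:
        while remaining:
            ready = sorted(
                name for name, (_, deps) in remaining.items()
                if all(dep in results for dep in deps)
            )
            if not ready:
                raise ValueError("dependency cycle or missing node")
            # Submit the whole wave before waiting on any single result.
            futures = {
                name: pool.submit(remaining[name][0],
                                  *(results[dep] for dep in remaining[name][1]))
                for name in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
                del remaining[name]
            waves.append(ready)
    return results, waves


results, waves = run_in_waves(graph)
```

Here the graph resolves in two waves, a and b together and then c, so the wall-clock time is roughly two sleeps rather than three.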

Asynchronous execution can be made synchronous with locking primitives. Functions annotated with @lazy.asynchronous are fed an extra input t of type Task which has a spin primitive. See below:

    import random

    @lazy.asynchronous
    def Recv(t, ptr):
        # r == 7 with probability 1/11, so about 11 spins on average before we break
        for _ in t.spin():
            r = random.randint(0, 10)
            if r == 7:
                break
        return ptr  # pretend we actually received something from the network

    ptr = 0x123123
    d = Recv(ptr)
    print(d.get())  # prints the returned ptr, 1192227 (0x123123)

The idea here is that spin will periodically run the body of the loop until it is broken. The rate at which spin loops is determined by the runtime. After a couple of iterations of the same function, we can actually track how many spins it typically takes for the lock condition to be met and further optimize the rate at which spins happen. As an example, if it takes on average 100ms for the network to respond we can make the first spin take exactly 100ms and speed up all subsequent spins. This frees up cycles to work on other tasks in parallel.
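The adaptive part of this idea can be sketched without the library. The SpinTask class below is hypothetical and only mimics the behaviour described above: the first pause of each run is the average total wait seen on earlier runs, and later pauses are shorter. Times are in integer milliseconds, and the sleep function is injectable so the example can run against a fake clock instead of really waiting.

```python
import time


class SpinTask:
    """Hypothetical task whose spin() pauses adaptively between iterations."""

    def __init__(self, interval=10, sleep=lambda ms: time.sleep(ms / 1000)):
        self.interval = interval  # default pause between spins, in milliseconds
        self.sleep = sleep        # injectable so the pauses can be simulated
        self.history = []         # total wait of each run so far

    def spin(self):
        # First pause: the average wait of earlier runs, else the default interval.
        if self.history:
            pause = sum(self.history) / len(self.history)
        else:
            pause = self.interval
        self.history.append(0)
        while True:
            self.sleep(pause)
            self.history[-1] += pause
            yield
            pause = self.interval // 2  # after the first guess, poll faster


# Fake clock: "sleeping" just advances a counter, so the example is instant.
clock = [0]


def fake_sleep(ms):
    clock[0] += ms


def recv(task, ready_after=100):
    """Spin until the simulated network has had ready_after ms to respond."""
    start = clock[0]
    spins = 0
    for _ in task.spin():
        spins += 1
        if clock[0] - start >= ready_after:
            break
    return spins


task = SpinTask(interval=10, sleep=fake_sleep)
first_run = recv(task)   # no history yet: many short polls
second_run = recv(task)  # first pause is the learned 100 ms, so one spin suffices
```

With these numbers the first run polls 19 times, while the second run sleeps once for the learned 100 ms and finds the condition already met.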

TODO

Support functions that operate in-place and have multiple outputs

Support maximal trace length (to automatically force calls to get())

Execution example

The graphs described below were generated with calls to lazy.draw().

Before calling c.get() in the above example, only the input data is valid.

After calling c.get(), only Mul has been invoked (and not Add).

Once we call d.get(), Add is executed using the cached intermediate values calculated when we called c.get().

Other small things

data.dump_cf() to get the calculated control-flow graph (networkx format) of data (i.e. what needs to be executed to generate it)

data.executor = func to set a specific executor for the node. The executor must be of the form func(data : Data) -> None

lazy.dump() to get the full known dataflow graph (networkx format)

lazy.draw() to draw the full known dataflow graph (with colors as in the above example)

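If you really want to play with this, I'd recommend attacking most ideas with networkx. As an example, to get a subgraph of all the data dependencies of d (with networkx imported as nx), you can simply do the following. Note that nx.ancestors does not include d itself.

    graph = lazy.dump()
    subgraph = graph.subgraph(nx.ancestors(graph, d))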
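For readers who want to see what an ancestors query computes, here is a dependency-free sketch on a toy graph shaped like the Square/Mul/Add example. The depends_on dictionary and the node names are made up for illustration, and the real graph returned by lazy.dump() will have its own node objects. Like networkx's ancestors, the helper returns every node the target depends on, directly or indirectly, and excludes the target itself.

```python
# Toy dataflow graph: node -> nodes it directly depends on.
depends_on = {
    "a": ["x2"],      # a = Square(2)
    "b": ["x3"],      # b = Square(3)
    "c": ["a", "b"],  # c = Mul(a, b)
    "d": ["a", "b"],  # d = Add(a, b)
}


def ancestors(depends_on, node):
    """Return every node that `node` depends on, directly or transitively."""
    seen = set()
    stack = list(depends_on.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(depends_on.get(current, []))
    return seen


d_dependencies = ancestors(depends_on, "d")  # {"a", "b", "x2", "x3"}
```

Neither c nor d appears in the result: c is a sibling of d rather than one of its dependencies, and the target node is never its own ancestor.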
