In this article, we continue inspecting the quality of software. Instead of manually selecting the packages to check, we will use a component called “Dependency Monkey”, which can resolve software stacks following programmed rules and verify application correctness.
Why different combinations of packages?
In the previous article, and mainly in the introductory article of the “How to beat Python’s pip” series, we described the state space of all the software stacks that can be resolved for an application given its requirements on libraries. Each resolved software stack in this state space can be scored by a scoring function that computes “how good the given software is”. In the figure below, we can see an interpolated scoring function for resolved software stacks created out of two libraries, simplelib and anotherlib.
The interpolated function above shows a score for a two-dimensional state space (one dimension for each package). As we add more packages to an application, this state space grows larger and larger, especially once we count the transitive dependencies that must also be added to form a valid software stack.
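To make the idea of scoring a two-dimensional state space concrete, here is a minimal sketch. The version lists, the penalties, and the `score` function are all hypothetical and only illustrate the shape of the problem; they are not the scoring function used in the series.

```python
from itertools import product

# Hypothetical version lists; the article does not state the real releases.
SIMPLELIB_VERSIONS = ["1.0.0", "1.1.0", "2.0.0"]
ANOTHERLIB_VERSIONS = ["0.9.0", "1.0.0"]

# Made-up penalties standing in for observations such as failed builds.
KNOWN_PENALTIES = {
    ("simplelib", "1.0.0"): 0.3,
    ("anotherlib", "0.9.0"): 0.5,
}


def score(stack):
    """Return a score in [0, 1]; a higher score means a better stack."""
    penalty = sum(KNOWN_PENALTIES.get(package, 0.0) for package in stack)
    return max(0.0, 1.0 - penalty)


def enumerate_state_space():
    """Yield every (stack, score) pair of the two-dimensional state space."""
    for simple, another in product(SIMPLELIB_VERSIONS, ANOTHERLIB_VERSIONS):
        stack = (("simplelib", simple), ("anotherlib", another))
        yield stack, score(stack)


scored = dict(enumerate_state_space())
best_stack = max(scored, key=scored.get)
```

Each key of `scored` is one point of the state space and each value is its score, so plotting the values over the two version axes would give a discrete counterpart of the interpolated surface.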
For real-world applications, we can very easily get tens of dimensions. For example, installing tensorflow==2.3.0 brings in 36 distinct packages in different versions, which means 36 dimensions plus one dimension for the scoring function. These dimensions introduce distinct input features that affect application behavior as reflected by the scoring function. As we already know from our last article, an issue in any of these packages can introduce a problem in our application, either at run time or at build time.
Testing all the possible version combinations (all the possible 36-dimensional vectors in our example) is impossible in a reasonable time, so we need to pick smartly which versions should be included in the final resolved stack. One slicing mechanism is the resolver itself: it narrows the possible resolutions by respecting the version range specifications of packages in the dependency graph. But how do we limit the number of possible stacks even further, down to a reasonable sample?
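A quick back-of-the-envelope calculation shows why exhaustive testing is out of reach. The sketch below assumes 36 packages as in the example above, while the number of candidate versions per package (five before the resolver applies version ranges, two after) and the one-minute inspection time are made-up figures for illustration.

```python
from math import prod


def state_space_size(versions_per_package):
    """Number of stacks when every version combination is considered."""
    return prod(len(versions) for versions in versions_per_package.values())


# Assumption: 36 packages with five candidate versions each.
candidates = {
    f"package-{index}": [f"1.{minor}.0" for minor in range(5)]
    for index in range(36)
}
full_size = state_space_size(candidates)

# Assumption: version ranges leave only two versions of each package.
narrowed = {name: versions[-2:] for name, versions in candidates.items()}
narrowed_size = state_space_size(narrowed)

# Assumption: one inspection takes a minute and inspections run one by one.
MINUTES_PER_YEAR = 60 * 24 * 365
years_needed = narrowed_size / MINUTES_PER_YEAR
```

Even the narrowed state space would keep a single worker busy for far longer than any project lifetime, which is why additional, programmable filtering and sampling are needed.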
A smart offline resolver
Besides removing packages based on version range specifications in the resolver, a component called Dependency Monkey can use “pipeline units”. The whole resolution process is treated as a pipeline made out of pipeline units of different types that decide whether packages should be considered during the resolution. In other words, they decide whether resolved stacks formed out of the selected packages should be inspected.
An example is the inspection of a TensorFlow software stack. If we want to test a specific TensorFlow version for compatibility with different NumPy versions, we can skip software stack combinations that were already tested (e.g. based on queries to our database of previous test results).
Pipeline units create a programmable interface to the resolver, which acts on the decisions the pipeline units make.
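The idea of pipeline units can be sketched as plain callables that a resolver consults. This is only an illustration under stated assumptions: the unit names, the `resolve` function, and the version data are hypothetical and do not reflect Dependency Monkey’s actual API, and this toy resolver ignores dependency metadata entirely.

```python
from itertools import product


def no_release_candidates(name, version):
    """Package-level unit (hypothetical): reject release candidates."""
    return "rc" not in version


def make_skip_already_tested(tested_stacks):
    """Build a stack-level unit (hypothetical) that rejects inspected stacks."""
    def unit(stack):
        return frozenset(stack.items()) not in tested_stacks
    return unit


def resolve(candidates, package_units, stack_units):
    """Yield stacks accepted by every unit in the pipeline."""
    # Package-level units prune versions before any stack is formed.
    filtered = {
        name: [v for v in versions if all(u(name, v) for u in package_units)]
        for name, versions in candidates.items()
    }
    names = list(filtered)
    for combination in product(*(filtered[name] for name in names)):
        stack = dict(zip(names, combination))
        # Stack-level units decide whether a whole stack is worth inspecting.
        if all(unit(stack) for unit in stack_units):
            yield stack


# Illustrative data only; a stand-in for results stored in a database.
candidates = {
    "tensorflow": ["2.1.0"],
    "numpy": ["1.17.5", "1.18.0rc1", "1.18.1"],
}
already_tested = {frozenset({"tensorflow": "2.1.0", "numpy": "1.17.5"}.items())}

stacks = list(
    resolve(
        candidates,
        package_units=[no_release_candidates],
        stack_units=[make_skip_already_tested(already_tested)],
    )
)
```

Here only the stack with the NumPy version that is neither a release candidate nor previously tested remains. Keeping the units separate from the resolver loop is the point of the design: new rules can be added without touching the resolution itself.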
Amun inspections: revisited
In the previous article, “How to beat Python’s pip: Inspecting the quality of machine learning software”, we introduced a service called Amun that runs software according to a specification stating how the application is assembled and run. Besides information about the operating system and hardware used, it also accepts a list of packages that should be installed in order to build and run the software.
As Dependency Monkey can resolve Python software stacks, it becomes one of the users of the Amun service. Simply put, if Dependency Monkey resolves a Python software stack that it considers a valid candidate for testing, it submits the stack to Amun to inspect its quality.
We use “quality” to describe a certain aspect of the software. One such quality aspect can be performance or other runtime behavior. The fact that an application fails to build is also an indicator of the software stack quality.
Dependency Monkey’s resolution pipeline
One can see Dependency Monkey as a resolver that accepts an input vector and resolves one or multiple software stacks, considering the input vector and aggregated knowledge about the software and the packages forming the software stacks. This aggregated knowledge can accumulate information about packages or package combinations seen in the software stacks.
Checking different package combinations in TensorFlow stacks
Let’s check some dependencies of a TensorFlow stack (I used TensorFlow in version 2.1.0; the dependency listing will differ across versions). If we take a look at the direct dependencies of TensorFlow, we will find packages such as h5py, opt-einsum, scipy, Keras-Preprocessing, and tensorboard in specific versions. They share a common dependency, NumPy, which is also a direct dependency of TensorFlow itself (see this GitHub gist for the listing, which can change over time with new package releases). All the packages stated can be installed in different versions, and these can have different version range requirements on NumPy. The actual version of NumPy installed depends on the resolver and on the resolution process, which can also take into account other libraries that the user requested to install (besides TensorFlow as a single direct dependency). It is worth pointing out here that any issue in NumPy (even incompatibilities introduced by overpinning or underpinning) can lead to a broken application. So let’s try to test the TensorFlow stack with different NumPy versions.
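The way several packages jointly constrain NumPy can be sketched as an intersection of version ranges. The ranges and the release list below are illustrative assumptions, not the real metadata of these packages, and the simplified `parse` helper handles only plain `X.Y.Z` versions rather than the full Python versioning rules.

```python
def parse(version):
    """Turn a plain 'X.Y.Z' version string into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, lower, upper):
    """Check lower <= version < upper (lower inclusive, upper exclusive)."""
    return parse(lower) <= parse(version) < parse(upper)


# Hypothetical requirements on NumPy, written as (lower, upper) bounds.
NUMPY_REQUIREMENTS = {
    "tensorflow": ("1.16.0", "2.0.0"),
    "h5py": ("1.7.0", "2.0.0"),
    "scipy": ("1.13.3", "2.0.0"),
    "opt-einsum": ("1.7.0", "2.0.0"),
}

# Hypothetical list of NumPy releases available to the resolver.
NUMPY_RELEASES = ["1.13.3", "1.15.4", "1.16.6", "1.17.5", "1.18.1"]


def admissible(releases, requirements):
    """Return the releases accepted by every package's version range."""
    return [
        version
        for version in releases
        if all(satisfies(version, low, high) for low, high in requirements.values())
    ]


numpy_candidates = admissible(NUMPY_RELEASES, NUMPY_REQUIREMENTS)
```

Every entry of `numpy_candidates` is a NumPy version that could end up in a valid stack, so each of them is a separate combination worth testing. Changing the version of any dependent package can shift its range and therefore the candidate list.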
In the following video, you can see a brief walk-through of Dependency Monkey together with a service called Amun. In the first part of the demo (starting at 19:25), Dependency Monkey resolves software stacks considering aggregated knowledge (one example of such knowledge is the dependency information needed during the resolution) and submits these software stacks to Amun to inspect the quality of the software. The tested software stack is TensorFlow in version 2.1.0, using the build published on PyPI, resolved with different NumPy versions (every stack is formed from the same package versions, except that the NumPy version is adjusted while respecting the dependency graph).
A note on the video: dependencies that should be locked could also be stated in the direct dependency listing. Note, however, that the dependency would then always be present in all the stacks, even when it would not otherwise be used, and it could affect the dependency graph. That’s why dependencies are pinned at the unit level.
The second part of the demo (starting at 28:13) shows a Dependency Monkey resolution that randomly samples the state space of all the possible TensorFlow stacks. As we already know, this state space is too large, so checking all the combinations is impossible in a reasonable time. Dependency Monkey randomly generates software stacks that are valid resolutions of the TensorFlow software and submits them to Amun, which verifies that each software stack builds and runs correctly.
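Random sampling of the state space can be sketched as follows. This is a simplified stand-in, not Dependency Monkey’s implementation: the `sample_stacks` function, the `is_valid` rule, and all version data are hypothetical, and a real resolver would validate stacks against the dependency graph instead of a hand-written predicate.

```python
import random


def sample_stacks(candidates, is_valid, count, seed=None, max_attempts=10_000):
    """Randomly draw up to `count` distinct stacks accepted by `is_valid`."""
    rng = random.Random(seed)
    seen = set()
    stacks = []
    for _ in range(max_attempts):
        if len(stacks) == count:
            break
        stack = tuple(
            (name, rng.choice(versions)) for name, versions in candidates.items()
        )
        if stack in seen:
            continue  # each combination is examined at most once
        seen.add(stack)
        if is_valid(dict(stack)):
            stacks.append(dict(stack))
    return stacks


# Illustrative candidates; not the real dependency listing of TensorFlow.
candidates = {
    "tensorflow": ["2.1.0"],
    "numpy": ["1.16.6", "1.17.5", "1.18.1"],
    "urllib3": ["1.24.3", "1.25.7", "1.25.8"],
}


def is_valid(stack):
    """Made-up rule standing in for version range checks in the resolver."""
    return not (stack["numpy"] == "1.16.6" and stack["urllib3"] == "1.24.3")


sampled = sample_stacks(candidates, is_valid, count=4, seed=42)
```

Each stack in `sampled` would then be submitted for inspection. Fixing the seed makes a sampling run reproducible, which helps when a failing stack needs to be regenerated.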
Such random state space sampling can spot issues. One interesting issue in the TensorFlow 2.1 stack involves the dependency urllib3, which, when installed in a specific version, can cause runtime errors on TensorFlow imports. See this document for a detailed overview. Note that the version installed can also depend on other libraries that an application uses besides TensorFlow, so some applications can be affected by this issue.
Project Thoth
Project Thoth is an application that aims to help Python developers. If you wish to be updated on any improvements and progress we make in project Thoth, feel free to subscribe to our YouTube channel, where we post updates as well as recordings from scrum demos.
Stay tuned for any updates!