[AI Testing] An Overall Introduction to Artificial Intelligence Testing, Part 7

Latest recommended article published on 2024-08-14 16:00:00

凌晨点点


Category: AI Testing

Article link: https://blog.csdn.net/lhh08hasee/article/details/105702297


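The main contents are:
Part 1: Introduction to artificial intelligence and testing
Part 2: Characteristics and acceptance criteria of AI systems
Part 3: Machine learning
Part 4: Performance metrics and benchmarks for machine learning
Part 5: Introduction to testing AI systems
Part 6: Black-box testing of AI systems
Part 7: White-box testing of AI systems
Part 8: Test environments for testing AI
Part 9: Using AI for testing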

7. White-Box Testing of AI Systems

White-box testing of deep learning (neural networks)

7.1 Structure of a Neural Network

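A neural network is a computational model inspired by the neural networks in the human brain. It consists of many layers of connected nodes, or neurons, as shown in Figure 2. Note that this section uses a feedforward neural network as its example, which was the first and simplest type of artificial neural network. The only extra complexity we add is that we consider a network with multiple layers, known as a multi-layer perceptron (or a deep neural network with hidden layers).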
[Figure 2: image not included]

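Input nodes receive information from the outside world (for example, each input could be the value of a pixel in an image), and output nodes provide information to the outside world (for example, a classification). Nodes in the hidden layers have no connection to the outside world; they perform the computations that pass information from the input nodes to the output nodes.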

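Each neuron accepts input values and generates an output value, known as its activation value (or output value), which can be positive, negative, or zero (when the value is zero, the neuron has no influence on downstream neurons).
Each connection has a weight, and each neuron has a bias.
The activation value is calculated by a formula from the activation values of the inputs, the weights of the input connections, and the bias of the neuron.
For supervised learning, the network learns by using backpropagation.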
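
The text above does not give the formula, so the following is only a minimal sketch under a common assumption: the neuron computes the weighted sum of its inputs, adds its bias, and passes the result through an activation function. The function names and the choice of tanh (which can yield positive, negative, or zero values) are illustrative assumptions, not taken from the source.

```python
import math


def neuron_activation(inputs, weights, bias, activation_fn=math.tanh):
    """Activation value of one neuron (assumed form: f(sum(w_i * x_i) + bias))."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation_fn(weighted_sum)


def layer_activations(inputs, layer_weights, layer_biases, activation_fn=math.tanh):
    """Activation values of every neuron in one layer, all fed by the same inputs."""
    return [
        neuron_activation(inputs, weights, bias, activation_fn)
        for weights, bias in zip(layer_weights, layer_biases)
    ]
```

Feeding the output of `layer_activations` for one layer in as the `inputs` of the next layer gives the forward pass of a feedforward network.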

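Initially, all nodes are set to initial values, and the first item of training data is fed into the network and passed through it.
The output is compared with the known correct answer, and the difference between the calculated output and the correct answer (the error) is fed back to the previous layer of the network and used to modify the weights. This backward propagation of the error continues through the whole network, and each connection weight is updated accordingly. As more training data is fed into the network, it gradually learns from its errors until it is considered ready to be evaluated with validation data, which determines the performance of the trained network.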
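
The training loop described above can be sketched for a very small network. This is an illustrative sketch only: the 2-input, 2-hidden-neuron, 1-output layout, the sigmoid activation, the squared-error measure, the learning rate, and all function names are assumptions chosen for brevity, not details given in the text.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, params):
    """Return (hidden activations, output activation) for one input."""
    w_h, b_h, w_o, b_o = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b) for ws, b in zip(w_h, b_h)]
    output = sigmoid(sum(w * h for w, h in zip(w_o, hidden)) + b_o)
    return hidden, output


def mean_squared_error(data, params):
    return sum((forward(x, params)[1] - t) ** 2 for x, t in data) / len(data)


def train_step(x, target, params, lr):
    """One backpropagation update for a single training example."""
    w_h, b_h, w_o, b_o = params
    hidden, output = forward(x, params)
    # Error at the output, scaled by the derivative of the sigmoid.
    delta_o = (output - target) * output * (1.0 - output)
    # Error fed back to each hidden neuron through its outgoing weight.
    delta_h = [delta_o * w_o[j] * hidden[j] * (1.0 - hidden[j]) for j in range(len(hidden))]
    # Update every weight and bias in proportion to its share of the error.
    for j in range(len(hidden)):
        w_o[j] -= lr * delta_o * hidden[j]
        b_h[j] -= lr * delta_h[j]
        for i in range(len(x)):
            w_h[j][i] -= lr * delta_h[j] * x[i]
    b_o -= lr * delta_o
    return (w_h, b_h, w_o, b_o)


def train(data, epochs=2000, lr=0.5, seed=0):
    """Train on the data and return (error before training, error after training)."""
    rng = random.Random(seed)
    n_in, n_hidden = len(data[0][0]), 2
    params = (
        [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
        [0.0] * n_hidden,
        [rng.uniform(-1, 1) for _ in range(n_hidden)],
        0.0,
    )
    initial_error = mean_squared_error(data, params)
    for _ in range(epochs):
        for x, target in data:
            params = train_step(x, target, params, lr)
    return initial_error, mean_squared_error(data, params)


# Toy training data: the logical OR of two inputs.
OR_DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
```

Calling `train(OR_DATA)` returns the error before and after training, so the two values can be compared to see whether the network has learned from its errors. In practice the final evaluation would use separate validation data, as the text notes.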

7.2 Test Coverage Measures for Neural Networks

Traditional coverage measures are not really useful for neural networks, as 100% statement coverage is typically achieved with a single test case. The defects are normally hidden in the neural network itself.
Thus, different coverage measures have been proposed based on the activation values of the neurons (or pairs of neurons) in the neural network when it is tested.

7.2.1 Neuron Coverage

Neuron coverage for a set of tests is defined as the number of activated neurons divided by the total number of neurons in the neural network (normally expressed as a percentage). For neuron coverage, a neuron is considered to be activated if its activation value exceeds zero.
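
As a rough illustration of this definition, the sketch below computes neuron coverage from recorded activation values. The data layout (one list per test, holding one activation value per neuron) and the function name are assumptions made for the example.

```python
def neuron_coverage(activations):
    """Fraction of neurons whose activation value exceeds zero in at least one test.

    activations: one list per test, each holding one activation value per neuron.
    """
    if not activations:
        return 0.0
    num_neurons = len(activations[0])
    covered = sum(
        1 for n in range(num_neurons) if any(test[n] > 0 for test in activations)
    )
    return covered / num_neurons
```

With two tests over four neurons, `neuron_coverage([[0.5, 0.0, -0.2, 0.0], [0.0, 0.3, -0.1, 0.0]])` counts the first two neurons as activated, giving 0.5 (50%).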

7.2.2 Threshold Coverage

Threshold coverage for a set of tests is defined as the number of neurons whose activation value exceeds a threshold divided by the total number of neurons in the neural network (normally expressed as a percentage). For threshold coverage, a threshold activation value between 0 and 1 must be chosen. Note that this threshold coverage corresponds to 'neuron coverage' in the DeepXplore tool.
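
A minimal sketch of threshold coverage follows, assuming the same hypothetical data layout of one list of activation values per test. It is not the DeepXplore implementation, only an illustration of the definition.

```python
def threshold_coverage(activations, threshold):
    """Fraction of neurons whose activation value exceeds `threshold` in at least one test.

    activations: one list per test, each holding one activation value per neuron.
    """
    if not activations:
        return 0.0
    num_neurons = len(activations[0])
    covered = sum(
        1 for n in range(num_neurons) if any(test[n] > threshold for test in activations)
    )
    return covered / num_neurons
```

Raising the threshold can only keep the coverage the same or lower it for a fixed set of tests, which is why a higher threshold generally demands more tests to reach the same coverage level.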

7.2.3 Sign Change Coverage

Sign change coverage for a set of tests is defined as the number of neurons activated with both positive and negative activation values divided by the total number of neurons in the neural network (normally expressed as a percentage). An activation value of zero is considered to be a negative activation value.
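
The definition can be sketched as follows, again assuming a hypothetical layout of one list of activation values per test. Zero is treated as negative, as stated above.

```python
def sign_change_coverage(activations):
    """Fraction of neurons seen with both a positive and a negative activation value.

    activations: one list per test, each holding one activation value per neuron.
    A value of zero counts as negative.
    """
    if not activations:
        return 0.0
    num_neurons = len(activations[0])
    covered = 0
    for n in range(num_neurons):
        seen_positive = any(test[n] > 0 for test in activations)
        seen_negative = any(test[n] <= 0 for test in activations)
        if seen_positive and seen_negative:
            covered += 1
    return covered / num_neurons
```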

7.2.4 Value Change Coverage

Value change coverage for a set of tests is defined as the number of neurons whose activation values differ by more than a chosen change amount divided by the total number of neurons in the neural network (normally expressed as a percentage).
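
The sketch below shows one possible reading of this definition: a neuron counts as covered when any two tests in the set produce activation values for it that differ by more than the change amount, which is the same as its maximum minus its minimum exceeding that amount. This interpretation, the data layout, and the function name are assumptions.

```python
def value_change_coverage(activations, change_amount):
    """Fraction of neurons whose activation values vary by more than `change_amount`.

    activations: one list per test, each holding one activation value per neuron.
    """
    if not activations:
        return 0.0
    num_neurons = len(activations[0])
    covered = 0
    for n in range(num_neurons):
        values = [test[n] for test in activations]
        # The largest difference between any two tests is max minus min.
        if max(values) - min(values) > change_amount:
            covered += 1
    return covered / num_neurons
```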

7.2.5 Sign-Sign Coverage

Sign-sign coverage for a set of tests is achieved if each neuron, by changing sign, can be shown to individually cause one neuron in the next layer to change sign while all other neurons in the next layer stay the same (i.e. they do not change sign). In concept, this level of neuron coverage is similar to modified condition/decision coverage (MC/DC) [Sun et al, Testing Deep Neural Nets paper].
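
The check for a single pair of tests might be sketched as below. This is a simplified reading, not the definition from the cited paper: a neuron pair is counted when exactly one neuron in a layer changes sign between the two tests and exactly one neuron in the next layer changes sign. Requiring that only one neuron flips in the earlier layer, treating zero as negative, the data layout, and the function names are all assumptions.

```python
def sign(value):
    # Assumption: zero is treated as negative.
    return 1 if value > 0 else -1


def changed_neurons(layer_a, layer_b):
    """Indices of neurons whose sign differs between the two tests."""
    return [i for i, (a, b) in enumerate(zip(layer_a, layer_b)) if sign(a) != sign(b)]


def sign_sign_pairs(test_a, test_b):
    """Return the set of (layer, neuron, next-layer neuron) pairs covered by two tests.

    test_a, test_b: lists of layers, each layer a list of activation values.
    """
    covered = set()
    for k in range(len(test_a) - 1):
        flipped_here = changed_neurons(test_a[k], test_b[k])
        flipped_next = changed_neurons(test_a[k + 1], test_b[k + 1])
        # Count the pair only when a single sign change in layer k coincides
        # with a single sign change in layer k + 1.
        if len(flipped_here) == 1 and len(flipped_next) == 1:
            covered.add((k, flipped_here[0], flipped_next[0]))
    return covered
```

Collecting these sets over all pairs of tests and comparing them against every possible neuron pair in adjacent layers would give a coverage figure; the number of pairs grows quickly with layer width, which hints at why this criterion needs many tests.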

7.2.6 Layer Coverage

Coverage measures can also be defined based on whole layers of the neural network and how the activation values for the set of neurons in a whole layer change (e.g. absolutely or relative to each other). Further research is needed in this area.

7.2.7 Test Effectiveness of the White Box Measures

There is currently little data on the test effectiveness of the different coverage measures for the white-box testing of neural networks. However, it is generally true that criteria requiring more tests will find more defects than those requiring fewer tests, which allows the relative effectiveness of the measures to be deduced. In this respect, the coverage measures described in sections 7.2.1 to 7.2.5 are listed in increasing order of rigour.

Although neuron coverage is easy to understand, high levels of it can normally be achieved using only a few test cases, which limits its test effectiveness. Early results for threshold coverage appear to show that it may be a useful measure for generating tests that cover defect-inducing corner cases, but the threshold value may need to be set individually for each neural network. For value change coverage, higher values for the change amount will naturally require more test cases. Sign-sign coverage is normally the most rigorous of the coverage criteria specified here.

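7.3 White-Box Testing Tools for Neural Networks

There is currently no commercial tool support for the white-box testing of neural networks, but there are several research tools, including:
• DeepXplore: specifically for testing deep neural networks; proposes a white-box differential testing (back-to-back) algorithm to systematically generate adversarial examples that cover all neurons in the network (threshold coverage).
• DeepTest: a systematic testing tool for automatically detecting erroneous behaviour of cars driven by deep neural networks. Supports sign-sign coverage for DNNs.
• DeepCover: provides all the levels of coverage defined in this section.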
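
The idea of differential (back-to-back) testing mentioned above can be illustrated with a very small sketch: the same inputs are run through two implementations, and any input on which they disagree is flagged for investigation. This is not the algorithm of any of the tools listed; the two classifier functions are hypothetical stand-ins for trained networks.

```python
def differential_test(model_a, model_b, inputs):
    """Return the inputs on which the two models give different outputs."""
    return [x for x in inputs if model_a(x) != model_b(x)]


# Hypothetical stand-ins for two independently built classifiers.
def classifier_a(x):
    return "high" if x >= 0.5 else "low"


def classifier_b(x):
    return "high" if x > 0.6 else "low"
```

Here `differential_test(classifier_a, classifier_b, [0.1, 0.5, 0.55, 0.6, 0.9])` returns `[0.5, 0.55, 0.6]`, the inputs that fall between the two decision boundaries. A disagreement does not say which model is wrong, only that at least one of them deserves a closer look.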