Melt pool, deposition
This article is part 3 of the AI for 3-D printing series. Read part 1 and part 2.
An autoencoder provides a way to describe melt pool images with fewer parameters; essentially, it is a form of data compression. However, as illustrated in part 2, the latent vectors encoded by the autoencoder are tightly packed and appear in clusters. As a result, the latent space is neither smooth nor continuous. Severe overfitting is possible in the latent space: two latent vectors that lie close to each other can look very different when reconstructed. This is because the autoencoder's loss function has no regularisation term to control how the data should be compressed.
Well, you can say that the autoencoder compresses data for the sake of compressing and hence it does not necessarily preserve the structure of data in the latent space.
This is an issue if we want both of the following when describing melt pool geometry:
- Fewer parameters (achievable with an autoencoder), and
- Quality data compression (an autoencoder is not optimised for this purpose).
It is possible to resolve this issue with a variant of the autoencoder. This article presents an automated feature extraction methodology for anomalous melt pool detection and classification, centred on a disentangled variational autoencoder. Specifically, the quality data compression property of a variational autoencoder is used to extract different melt pool representations.
Introduction
A variational autoencoder (VAE) has a similar structure to an autoencoder, except that it is a probabilistic variant of the latter. Using a VAE assumes that there are several unobserved data-generating factors (also known as representations), each controlling a different aspect of the input. The goal here is to train a VAE to approximate the distributions of these representations so that the encoder can be used effectively as a feature extractor.
The generic architecture of a VAE is shown above. Notice that instead of an ordinary encoder and decoder, a VAE is made up of two probabilistic components. The probabilistic encoder maps the input data X to a latent vector z. The decoder, in turn, maps any vector sampled from the latent space back into the original dimension, X’.
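As a rough illustration of how a probabilistic encoder yields a latent vector z, the sketch below assumes the common choice of a Gaussian posterior with diagonal covariance, where the encoder outputs a mean and a log-variance per latent component. The article does not state this detail, so `sample_latent` and the example values are hypothetical.

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Draw z ~ N(mu, diag(exp(log_var))) via z = mu + sigma * eps."""
    eps = rng.standard_normal(mu.shape)      # eps ~ N(0, I)
    return mu + np.exp(0.5 * log_var) * eps  # sigma = exp(log_var / 2)

rng = np.random.default_rng(0)
mu = np.array([[0.5, -1.0, 2.0]])         # hypothetical encoder output (means)
log_var = np.array([[-2.0, -2.0, -2.0]])  # hypothetical encoder output (log-variances)
z = sample_latent(mu, log_var, rng)       # one latent vector with three components
```

Sampling this way keeps the randomness in `eps`, so the mean and log-variance remain ordinary deterministic outputs of the encoder.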
The loss function of a VAE has two terms. The reconstruction loss ensures that the reconstructed data is similar to the input. The second term, also known as the KL-divergence term, measures the difference between the latent distribution of the data and the prior distribution of the latent encoding. It penalises the network when it encodes the input into a highly packed region, thus encouraging the encodings to follow a distribution similar to the prior, which is assumed to be N(0, I).
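For reference, a common way of writing these two terms is given here. The notation (encoder q with parameters φ, decoder p with parameters θ) is an assumption, since the original equation is not reproduced in this text.

```latex
\mathcal{L}(\theta, \phi; X) =
  \underbrace{-\,\mathbb{E}_{q_\phi(z \mid X)}\big[\log p_\theta(X \mid z)\big]}_{\text{reconstruction loss}}
  + \underbrace{D_{\mathrm{KL}}\big(q_\phi(z \mid X) \,\|\, p(z)\big)}_{\text{KL-divergence term}},
\qquad p(z) = \mathcal{N}(0, I)
```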
Joseph provides a very intuitive and clear explanation of VAEs in his Medium article. See the article attached below:
To obtain the loss function of a β-VAE, the KL divergence regularising term in the VAE loss function is multiplied by a factor β.
Essentially, this modification allows us to control the amount of disentanglement between the encoded latent components. A set of latent components is said to be disentangled when each component is relatively sensitive to changes in a single aspect of the representations while being insensitive to the others. In the context of melt pool geometry, when the latent encodings are perfectly disentangled, varying one latent component changes only one aspect of the melt pool geometry. With a normal VAE, by contrast, it is difficult to understand which aspect of the melt pool geometry each latent component captures, because varying one latent component changes several melt pool representations at once.
In this project, β was chosen to be 4 as this value of β empirically gives the best disentanglement effects.
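A minimal sketch of the β-weighted loss is shown here, assuming a diagonal Gaussian posterior (so the KL term has a closed form) and a summed squared error as the reconstruction term. Both choices, and the function name `beta_vae_loss`, are assumptions made for illustration; the article only fixes β = 4.

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Batch-averaged beta-VAE loss: reconstruction term + beta * KL term."""
    # Reconstruction term: summed squared error per image (an assumed choice;
    # binary cross-entropy is another common option).
    recon = np.sum((x - x_recon) ** 2, axis=1)
    # Closed-form KL divergence between N(mu, diag(exp(log_var))) and N(0, I).
    kl = -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1)
    return float(np.mean(recon + beta * kl)), float(np.mean(recon)), float(np.mean(kl))

# Toy batch: perfect reconstruction, posterior mean shifted to 1 in all 3 components.
x = np.ones((2, 4))
x_recon = np.ones((2, 4))
mu = np.ones((2, 3))
log_var = np.zeros((2, 3))
total, recon, kl = beta_vae_loss(x, x_recon, mu, log_var)
# Here recon = 0.0, kl = 1.5 and total = 4 * 1.5 = 6.0.
```

Setting `beta=1.0` recovers the ordinary VAE loss; larger values put more weight on matching the prior.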
Data Pre-processing
To describe the melt pool geometries more precisely, the melt pool images are cropped from 128x128 to 32x32, because the melt pool's surroundings contain little information about its geometry. We also ensure that the centroid of the melt pool is aligned with the image's centre.
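One way to implement such a centred crop is sketched here. It assumes a single-channel frame and uses the intensity-weighted centroid; the article does not say how the centroid is computed, so `crop_about_centroid` is a hypothetical helper.

```python
import numpy as np

def crop_about_centroid(frame, size=32):
    """Crop a size x size window centred on the intensity-weighted centroid."""
    h, w = frame.shape
    total = frame.sum()
    if total > 0:
        rows, cols = np.indices(frame.shape)
        cy = (rows * frame).sum() / total
        cx = (cols * frame).sum() / total
    else:
        cy, cx = (h - 1) / 2, (w - 1) / 2  # empty frame: fall back to the image centre
    # Top-left corner that puts the centroid at the middle of the window,
    # clamped so the window stays inside the frame.
    top = min(max(int(round(cy - (size - 1) / 2)), 0), h - size)
    left = min(max(int(round(cx - (size - 1) / 2)), 0), w - size)
    return frame[top:top + size, left:left + size]

# Toy 128x128 frame with a bright 10x10 blob away from the centre.
frame = np.zeros((128, 128))
frame[60:70, 80:90] = 1.0
crop = crop_about_centroid(frame)  # 32x32 window with the blob in the middle
```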
Min-max normalisation is carried out on all cropped video frames. Unlike the one-class learning framework, this framework does not rely on any profiling of the normal melt pool. Hence, the data does not need to be pre-sieved.
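A small sketch of min-max normalisation follows. It assumes each frame is scaled to [0, 1] using its own minimum and maximum; the article does not say whether the scaling is per frame or over the whole dataset.

```python
import numpy as np

def min_max_normalise(frame, eps=1e-12):
    """Scale pixel values to [0, 1] using the frame's own minimum and maximum."""
    frame = frame.astype(np.float64)
    lo, hi = frame.min(), frame.max()
    if hi - lo < eps:
        return np.zeros_like(frame)  # constant frame: avoid division by zero
    return (frame - lo) / (hi - lo)

raw = np.array([[10, 60], [110, 210]], dtype=np.uint8)
scaled = min_max_normalise(raw)  # values now lie between 0.0 and 1.0
```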
Feature Extraction
By training a β-VAE on the melt pool video frames, the encoder learns the probability distribution of the melt pool latent representations. To explore the encodings, one could sample from specified ranges in the latent dimensions and decode the sampled latent vectors for visualisation.
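The exploration step can be sketched as a latent traversal: sweep one latent component over a range while holding the others at zero, then decode each vector. The range of -3 to 3 and the `toy_decoder` below are assumptions; the toy function only stands in for the trained β-VAE decoder so that the example runs on its own.

```python
import numpy as np

def latent_traversal(dim, latent_size=3, low=-3.0, high=3.0, steps=7):
    """Latent vectors that sweep one component while the others stay at 0."""
    z = np.zeros((steps, latent_size))
    z[:, dim] = np.linspace(low, high, steps)
    return z

def toy_decoder(z, size=32):
    """Stand-in for the trained decoder: a blob whose radius grows with z[0]."""
    rows, cols = np.indices((size, size))
    centre = (size - 1) / 2
    r2 = (rows - centre) ** 2 + (cols - centre) ** 2
    radius = 6.0 + z[0]
    return np.exp(-r2 / (2.0 * radius ** 2))

# One row of a grid plot: decode a sweep of the first latent component.
grid = np.stack([toy_decoder(z) for z in latent_traversal(dim=0)])
```

With a real model, `toy_decoder` would be replaced by the decoder's forward pass, and one such row would be produced per latent component.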
Based on the grid plots of generated images, we observed that:
- The first latent component captures the size of the melt pool.
- The second latent component captures the roundness of the melt pool: the melt pool images get squashed vertically or horizontally.
- The third latent component captures the tail length of the melt pool. Its sign also captures the travelling direction of the melt pool.
We can also verify that the compression is a quality one, as melt pool images smoothly “morph” from one form into another in the latent space. Moreover, we can almost perfectly isolate the changes in melt pool geometry when we alter a single latent component. For example, increasing the first latent component changes the size but does almost nothing to the other two aspects. This is what I meant by disentangled representations earlier on.
Learn more about the disentangled variational autoencoder from Arxiv Insights:
If we examine the distributions of the latent components of the training data:
The first and second latent components resemble Gaussian distributions, whereas the third latent component resembles a bimodal distribution (a superposition of two Gaussian distributions). The bimodal distribution of the third latent component is due to the meander scanning strategy employed, which causes the melt pool to travel in a zigzag pattern. All distributions are centred around 0 and have close to unit standard deviation, owing to the regularising force applied by the KL divergence term in the β-VAE's loss function.
Anomaly Detection and Classification
Next, the encoded melt pools are presented in the form of scatter plots.
Based on the scatter plots, the encodings seem to agree qualitatively with the latent representations. In addition, several anomalies are clearly encoded far away from the denser region(s) of the latent space. This suggests using the Euclidean distance from some reference point(s) as an anomaly measure. Based on the visualisation, it is also sensible to group the data points into clusters. The distance of each melt pool from its cluster's centroid can then be computed and used as the anomaly metric.
Conceptually, this means that the more a given set of melt pool characteristics deviates from their average values, the more anomalous the melt pool is.
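The distance-based metric can be sketched as follows, taking each melt pool's cluster to be the one with the nearest centroid. The centroid values are made up for illustration and `anomaly_score` is a hypothetical helper.

```python
import numpy as np

def anomaly_score(z, centroids):
    """Euclidean distance from each encoding to its nearest cluster centroid."""
    # Pairwise distances, shape (n_points, n_centroids).
    d = np.linalg.norm(z[:, None, :] - centroids[None, :, :], axis=2)
    return d.min(axis=1), d.argmin(axis=1)

centroids = np.array([[0.0, 0.0, -1.0], [0.0, 0.0, 1.0]])  # made-up centroids
z = np.array([[0.0, 0.0, 1.0],   # sits exactly on the second centroid
              [3.0, 4.0, 1.0]])  # far from both centroids
scores, cluster = anomaly_score(z, centroids)
# scores is [0.0, 5.0]; both points are nearest to the second centroid (index 1).
```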
A density-based clustering algorithm, DBSCAN, is used to clean up the data, and K-Means clustering with k=2 is then fitted on the cleaned dataset. The centroids of those clusters are stored for Euclidean distance computation during testing. The distance metric is used as a measure of the melt pool's degree of anomaly.
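The two-stage procedure could look roughly like this with scikit-learn. The data is synthetic, and the DBSCAN settings (`eps=0.8`, `min_samples=10`) are illustrative assumptions rather than the values used in the project.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans

rng = np.random.default_rng(0)
# Synthetic stand-in for encoded melt pools: two dense groups plus a few outliers.
dense = np.vstack([
    rng.normal([0.0, 0.0, -2.0], 0.3, size=(200, 3)),
    rng.normal([0.0, 0.0, 2.0], 0.3, size=(200, 3)),
])
outliers = np.array([[6.0, 6.0, 0.0], [-7.0, 5.0, 0.0], [5.0, -8.0, 1.0]])
latents = np.vstack([dense, outliers])

# Step 1: DBSCAN labels low-density points as noise (label -1); drop them.
noise_mask = DBSCAN(eps=0.8, min_samples=10).fit_predict(latents) == -1
cleaned = latents[~noise_mask]

# Step 2: fit K-Means with k=2 on the cleaned encodings and keep the centroids.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(cleaned)
centroids = kmeans.cluster_centers_

# At test time, the anomaly metric is the distance to the nearest stored centroid.
scores = np.linalg.norm(
    latents[:, None, :] - centroids[None, :, :], axis=2
).min(axis=1)
```

Removing the noise points first keeps far-away encodings from pulling the K-Means centroids towards them.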
Three different supervised classifiers, Support Vector Machine (SVM), K-Nearest Neighbours (KNN) and Random Forest (RF), are used for the melt pool anomaly classification task. This time, we explicitly classify the melt pools into a few categories: melt pool with unstable tail, plume, and large melt pool.
The optimal hyperparameters of the classifiers were obtained by grid search with five-fold cross-validation. Finally, the testing dataset is used to quantify the performance of the classifiers. The classification results are summarised below:
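For readers who want to reproduce the tuning step, a hedged sketch with scikit-learn is given here. It runs on synthetic three-component vectors with made-up class labels, and the hyperparameter grids are illustrative assumptions, so its scores are not the article's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in for labelled latent vectors (3 components, 3 classes).
centres = np.array([[0.0, 0.0, 2.5], [0.0, 2.5, 0.0], [2.5, 0.0, 0.0]])
X = np.vstack([rng.normal(c, 0.5, size=(60, 3)) for c in centres])
y = np.repeat([0, 1, 2], 60)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Candidate models with illustrative hyperparameter grids.
searches = {
    "SVM": (SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", 0.1]}),
    "KNN": (KNeighborsClassifier(), {"n_neighbors": [3, 5, 7]}),
    "RF": (RandomForestClassifier(random_state=0),
           {"n_estimators": [50, 100], "max_depth": [None, 5]}),
}

results = {}
for name, (model, grid) in searches.items():
    # cv=5 gives the five-fold cross-validated grid search.
    search = GridSearchCV(model, grid, cv=5).fit(X_train, y_train)
    # Score the refitted best model on the held-out test split.
    results[name] = (search.best_params_, search.score(X_test, y_test))
```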
Some of the correct classifications are shown below:
Our VAE framework can now assign an anomaly metric, and it can additionally classify melt pools based on their geometries.
Example usage of this all-in-one anomaly classification and detection framework on three scan lines:
Some Thoughts
The β-VAE framework has a few advantages over an end-to-end supervised deep learning approach (that is, training a supervised neural network on labelled data without any feature extraction):
- For an anomaly classification problem, it is hard to proceed without supervised models. The β-VAE framework works by first extracting melt pool representations, which gives the subsequent clustering and classical supervised models a proper feature space to operate on. This means the first half of the framework is completely unsupervised. Compared with an end-to-end supervised deep learning approach, which overfits easily without a sufficient amount of training data, the second half of the β-VAE framework requires less labelled data for training.
- In terms of interpretability, the β-VAE decomposes the melt pool images into just three clearly disentangled representations. We have seen and verified that the distributions of the representations agree with the grid plots of generated melt pool images. With end-to-end deep learning, it is harder to explain why certain melt pools are classified into certain categories.
Future Work
Future work is likely to involve training the β-VAE on a more curated training dataset. With a known limitation (working within a specified set of printing parameters), this framework can then be incorporated into an existing LPBF monitoring system. Tackling this problem with sequential models is another interesting approach that has yet to be tried. Essentially, we can model melt pool dynamics as a time series problem. This makes sense, as anomalous instances such as plumes are characterised by shapes that change rapidly across multiple frames. Perhaps viewing the problem from this new perspective will be the key to building a more accurate framework.
Concluding Remark
In this article, we further explored the data compression capability of a disentangled variational autoencoder. From various visualisations, we verified that the resulting latent space is smooth and continuous, in the sense that similar melt pool images are encoded close to each other. Furthermore, with this framework we can extract useful and interpretable melt pool representations for anomaly detection and classification. With a few supervised classifiers trained on the latent representations, we also showed that this framework can classify anomalies and quantify their degree of anomaly simultaneously.
Thanks for reading :)