Development of a classification algorithm for efficient handling of multiple classes in sorting systems based on hyperspectral imaging



When dealing with practical applications of hyperspectral imaging, the development of efficient, fast and flexible classification algorithms is of the utmost importance.
Indeed, the optimal classification method should be able, in a reasonable time, to maximise the separation between the classes of interest and, at the same time, to correctly reject possible outlier samples.
To this aim, a new extension of Partial Least Squares Discriminant Analysis (PLS-DA), namely Soft PLS-DA, has been implemented.
The basic engine of Soft PLS-DA is the same as PLS-DA, but class assignment is subjected to some additional criteria which allow samples not belonging to the target classes to be identified and rejected.
The proposed approach was tested on a real case study of plastic waste sorting based on near infrared hyperspectral imaging.
Household plastic waste objects made of the six recyclable plastic polymers commonly used for packaging were collected and imaged using a hyperspectral camera mounted on an industrial sorting system.
In addition, paper and non-recyclable plastics were considered as potential foreign materials commonly found in plastic waste.
For classification purposes, the Soft PLS-DA algorithm was integrated into a hierarchical classification tree for the discrimination of the different plastic polymers.
Furthermore, Soft PLS-DA was also coupled with sparse-based variable selection to identify the relevant variables involved in the classification and to speed up the sorting process.
The tree-structured classification model was successfully validated both on a test set of representative spectra of each material for a quantitative evaluation, and at the pixel level on a set of hyperspectral images for a qualitative assessment.

Keywords: PLS-DA, multivariate classification, hierarchical classification, sparse methods, feature selection, plastic sorting


Over the past decades, Hyperspectral Imaging (HSI) has gained increasing attention from industries interested in the implementation of automated sorting systems to solve a number of different problems.

Indeed, HSI has found a wide range of applications in the food industry, including the quality evaluation and safety assessment of several food products, such as fruits and vegetables, meat, cereals and dairy products.

Moreover, other manufacturing environments, such as the pharmaceutical industry, have employed real-time HSI systems for quality control and process monitoring within the framework of process analytical technology.

Another relevant field of application of HSI is represented by the recycling industry, where hyperspectral sensors are used to separate end-of-life objects, such as plastic, paper or electronic waste, according to material type.

In these contexts, HSI can be considered as a step forward with respect to traditional spectroscopic techniques, which allow fast and non-destructive characterisation of the chemical properties of the analysed samples.

In fact, HSI systems couple these advantages with the possibility of also visualising the spatial distribution of the chemical features of interest within the sample surface.

Furthermore, in sorting systems, HSI can also be employed to quickly identify the chemical composition of homogeneous objects moving on a conveyor belt, and to distinguish them from samples with different composition.

In practical situations, hyperspectral imaging can be applied to address complex classification issues, where the sorting problem under investigation requires the discrimination of several classes at the same time, with some classes sharing similar features.

This can be easily managed by using HSI systems, since with a single measurement, i.e. with the acquisition of a single hyperspectral image, it is possible to have a wide range of information.

However, in order to meet the needs of real-time applications, it is necessary to identify classification strategies able to handle a huge amount of spectral data, providing reliable results in short computational times.

When dealing with multiple classes, this issue can be addressed using a tree-structured classification model, where each branching (tree node) corresponds to a local classification model.

In this manner, classification is performed considering a top-down approach, where the samples are initially assigned to general macro-categories, and then each macro-class is split into increasingly specific categories, until reaching the classes of interest.
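The top-down traversal described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Node` class, the `predict` function, the class labels and the toy threshold rules standing in for the local models are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Sequence

Spectrum = Sequence[float]


@dataclass
class Node:
    """One branching of the tree: a local model plus one child per outcome."""
    # Local classifier mapping a spectrum to a branch label (assumed interface).
    classify: Callable[[Spectrum], str]
    # Child nodes keyed by branch label; a label without a child is a final class.
    children: Dict[str, "Node"] = field(default_factory=dict)


def predict(root: Node, spectrum: Spectrum) -> str:
    """Walk the tree top-down until a label with no child node is reached."""
    node: Optional[Node] = root
    label = ""
    while node is not None:
        label = node.classify(spectrum)
        node = node.children.get(label)
    return label


# Toy local models: made-up thresholds on made-up band positions.
polymer_node = Node(classify=lambda s: "polymer_A" if s[1] > 0.5 else "polymer_B")
root = Node(
    classify=lambda s: "plastic" if s[0] > 0.5 else "paper",
    children={"plastic": polymer_node},
)

print(predict(root, [0.9, 0.8]))
```

Each sample only passes through the local models on its own path, so a deep tree does not require evaluating every model for every spectrum.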

Another relevant issue to be faced in practical applications of HSI in sorting systems is related to the fact that, generally, it is not easy to have a strict control of the input stream in order to avoid the presence of foreign objects, i.e. objects not belonging to the target classes of the specific application.

In this context, the availability of algorithms able to maximise the discrimination between the categories of interest and, at the same time, to identify possible foreign materials is of the utmost importance.

Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most widely used methods for multivariate classification of hyperspectral data.

Basically, PLS-DA is an extension of the PLS algorithm, which aims at identifying a new set of variables, named Latent Variables (LVs), by maximising the between-classes variance.

Class membership is coded using a dummy Y matrix, and the assignment of unknown samples is based on the a posteriori probability associated with the corresponding Y predicted values.

The standard PLS-DA approach assigns a sample to the class for which it has the highest a posteriori probability, resulting in unknown samples always being assigned to one of the target classes.
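A minimal sketch of the dummy coding and of the standard assignment rule is given below. The function names are hypothetical, and the per-class scores are taken as given: how the a posteriori probabilities are derived from the Y predicted values is not shown here.

```python
from typing import List, Sequence


def dummy_matrix(labels: Sequence[str], classes: Sequence[str]) -> List[List[int]]:
    """Code class membership as one row per sample and one 0/1 column per class."""
    return [[1 if label == c else 0 for c in classes] for label in labels]


def assign_hard(scores: Sequence[float], classes: Sequence[str]) -> str:
    """Standard rule: pick the class with the largest score, whatever its value."""
    best = max(range(len(classes)), key=lambda k: scores[k])
    return classes[best]


classes = ["A", "B", "C"]
print(dummy_matrix(["A", "C", "B"], classes))
# Even a sample with uniformly low scores is forced into one of the classes.
print(assign_hard([0.05, 0.04, 0.06], classes))
```

The second call illustrates the limitation discussed in the text: the rule has no way to return "none of the above".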

Conversely, the possibility of having unassigned samples is one of the major advantages of the so-called class-modelling techniques, which are essentially based on describing each single class independently from the others, and then verifying whether an unknown sample is compliant or not with the characteristics of each class of interest.

In this manner, it is possible that a new unknown sample is rejected from all the class models, resulting in an unassigned sample.

Soft Independent Modelling of Class Analogy (SIMCA) is the most common class-modelling method.

It calculates local Principal Component Analysis (PCA) models for each considered class, which are used to define class boundaries based on the distances both in the score space (Hotelling’s T²) and in the residual space (Q residuals).
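For reference, the two distances are commonly written as below. The notation is an assumption and is not taken from the text: x is a sample spectrum preprocessed and centred in the same way as the training data of the class, P is the loading matrix (with orthonormal columns) of the class PCA model with A components, and λ_a is the variance of the training scores on component a.

```latex
\begin{aligned}
\mathbf{t} &= \mathbf{P}^{\mathsf{T}} \mathbf{x}, \\
T^2 &= \sum_{a=1}^{A} \frac{t_a^2}{\lambda_a}, \\
Q &= \lVert \mathbf{x} - \mathbf{P}\mathbf{t} \rVert^2 .
\end{aligned}
```

In a common formulation, a sample is accepted by a class model when both statistics fall below limits estimated from the training samples of that class.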

Notwithstanding the advantages of class-modelling methods like SIMCA, they can provide poor classification results when the modelled classes overlap considerably, since the model is not oriented towards the discrimination of the considered categories.

Given these considerations, it is reasonable to assume that a classification algorithm to be efficiently employed in sorting systems should combine the advantages of both classification techniques and class-modelling methods, i.e. it should be able to maximise the discrimination between the categories of interest and to recognise and reject outlier samples at the same time.

To this aim, in the present paper a modified version of the PLS-DA algorithm, namely Soft PLS-DA, is proposed.

The basic principle of Soft PLS-DA is the same as PLS-DA, but class assignment is performed by fixing additional limits both on the Y predicted values and on the Q residuals.
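One way such a rule could look is sketched below. The exact criteria are not given in this passage, so the function `assign_soft`, the lower limit `y_min` on the winning Y predicted value and the upper limit `q_max` on the Q residual are all assumptions made for illustration only.

```python
from typing import Optional, Sequence


def assign_soft(
    y_pred: Sequence[float],
    q_residual: float,
    classes: Sequence[str],
    y_min: float,
    q_max: float,
) -> Optional[str]:
    """Return a class name, or None when the sample is rejected."""
    # Assumed criterion 1: reject samples that the model describes poorly.
    if q_residual > q_max:
        return None
    best = max(range(len(classes)), key=lambda k: y_pred[k])
    # Assumed criterion 2: the winning predicted value must reach a lower limit.
    if y_pred[best] < y_min:
        return None
    return classes[best]


classes = ["A", "B"]
print(assign_soft([0.9, 0.1], 0.5, classes, y_min=0.5, q_max=1.0))
print(assign_soft([0.9, 0.1], 2.0, classes, y_min=0.5, q_max=1.0))
```

Unlike the standard rule, this function can return `None`, which corresponds to an unassigned sample.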

In this manner, the classification model is built by maximising the differences between the modelled classes; at the same time, the additional limits allow the rejection of samples belon





