Paper translation: 《A Novel Convolutional Neural Network Based Model for Recognition and Classification of App》

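  • Paper title: 《A Novel Convolutional Neural Network Based Model for Recognition and Classification of Apple Leaf Diseases》
  • Authors: Yadav, D., Akanksha, and A. K. Yadav.
  • Journal: Traitement du Signal 37.6 (2020): 1093-1101.
  • Summary:
  1. Research Gap:
    Detecting apple leaf diseases with data augmentation and a CNN
  2. Importance:
    Uses contrast stretching for pre-processing and the FCM clustering algorithm for segmentation
    Reports an accuracy (ACC) of 98.06% on 4 classes of apple leaves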

Abstract

   Plants play a major role in sustaining biodiversity. Plant products are in demand not only for agriculture but also for the manufacture of medical products, cosmetics, and much more. The apple is known for its excellent nutritional properties and is therefore recommended for daily intake. However, diseases in apple plants cause farmers heavy losses: they harm the health of the fruit and reduce overall productivity, quantity, and quality. This paper proposes a novel convolutional neural network (CNN) based model for the recognition and classification of apple leaf diseases. The model applies a contrast stretching based pre-processing technique and the fuzzy c-means (FCM) clustering algorithm to identify plant diseases. These techniques help improve the accuracy of the CNN model even with a smaller dataset. A total of 400 image samples of apple leaves (200 healthy, 200 diseased) were used to train and validate the proposed model, which achieved an accuracy of 98%. It reaches this accuracy with a smaller dataset than other existing models, without compromising performance, thanks to the combination of contrast stretching pre-processing and FCM clustering.
Keywords: plants, apple, contrast stretching, fuzzy c-means, CNN, disease diagnosis

1. INTRODUCTION

   The Indian economy depends heavily on efficient agriculture, so detecting plant diseases plays an important role [1]. Automated disease detection techniques help identify plant diseases quickly [2]. Black rot, for instance, is one of the most prevalent and serious diseases of apple trees. It appears as brown spots that expand in concentric circles and finally turn black, rotting the fruit. Later, the disease spreads to the roots of the tree, causing cankers that can ultimately kill it. Detecting such diseases at an early stage would help in these situations.
   To prevent large losses, various techniques for diagnosing diseases have been developed. Techniques from microbiology and immunology identify the causative agents correctly. For many farmers, however, these approaches are out of reach: they require thorough knowledge of the field or a large amount of money and effort. According to the Food and Agriculture Organization of the United Nations, most farms in the world are small and run by families in developing countries such as India [3]. These families grow food for a large share of their country's population. Even so, hunger and food scarcity are not unusual, and market access and resources are limited. For these reasons, much work has gone into developing methods that are sufficiently reliable and accessible to most farmers. Digital image processing techniques increase the chance of identifying plant diseases early, so that the necessary preventive steps can be taken [4].
   Researchers have worked rigorously to identify plant diseases using methods such as RNA/DNA-based and sensor techniques [5], but the use of machine vision to identify the symptoms of fruit leaf diseases remains less explored. Apples are among the most widely consumed fruits and a rich source of phytochemicals, most of which show relevant antioxidant activity in vitro; scientific studies have linked apple consumption to a lower risk of certain cancers, cardiovascular disease, asthma, and diabetes [6, 7]. Table 1 lists the abbreviations used in the manuscript.

[Table 1: list of abbreviations (image not reproduced)]

   The main contributions of this work are as follows:
   1. A novel convolutional neural network based model for the recognition and classification of apple leaf diseases is proposed. The model uses contrast stretching based pre-processing and fuzzy c-means clustering for image segmentation. Both boost the performance of the CNN classifier even with less training data than other state-of-the-art methods use.
   2. A comprehensive discussion of existing work is presented to highlight the research gaps.
   3. Extensive computer simulations are performed to assess the effectiveness of the proposed system. A benchmark dataset from Kaggle comprising four types of apple leaves is used for the simulations. The results show that the proposed system performs competitively against other state-of-the-art methods.

2. RELATED WORK

   Traditionally, diseases in fruit leaves are identified and analyzed manually. These manual processes are slow, cumbersome, and highly subjective [8]. To address these problems, several methods incorporating computer vision have been developed in recent years to detect and identify diseases of agricultural and horticultural crops [9, 10]. Their fundamental steps are image acquisition, feature extraction, feature selection, and classification with parametric or non-parametric statistics. Image processing techniques and classification mechanisms are the main factors in the efficient functioning of a computer vision system.
   Research on identifying plant diseases with machine learning is on the rise. The main reason may be that visual inspection by experts has often proved impractical; moreover, it requires constant surveillance, which is very costly on large farms [11]. In some areas, farmers lack appropriate facilities or do not even know that they can consult experts. Under these conditions, automatically detecting diseases from the signs on plant leaves makes the process much faster, easier, and cheaper. It also encourages the use of machine vision for automated image-based process control, inspection, and robot guidance [12].
   Selvaraj et al. [13] suggested a four-step scheme: first, a color transformation structure is created for the input RGB image; the green pixels are then masked and replaced using different threshold values before segmentation. Texture statistics are calculated for the useful segments, and the derived features are finally passed to an SVM classifier. Pujari et al. [14] proposed support vector machine and artificial neural network based identification and classification of fungal diseases in cereals. The regions of interest are segmented using k-means segmentation; color and texture features are extracted from the affected regions and used as classifier inputs. Vishnu et al. [15] used k-means clustering for leaf segmentation and then calculated texture features for the segmented infected objects, which were finally processed by a neural network model. Muthukannan and Latha [16] proposed a novel solution to image segmentation based on PSO (particle swarm optimization), an efficient, self-regulating unsupervised algorithm used for improved segmentation and feature extraction. The hybrid feature coefficients were then obtained from the gray level co-occurrence matrices of various leaves. Chung et al. [17] suggested an approach using support vector machine (SVM) classifiers to distinguish healthy rice seedlings from Bakanae-infected ones.
   Zhang et al. [18] suggested a technique for identifying cucumber disease based on global-local singular value decomposition to increase the detection rate; unidentified diseased leaf images were classified with an SVM classifier. Ashourloo et al. [19] used regression techniques to identify rust disease in wheat plants and later evaluated the effect of the dataset on the results. Ali et al. [20] used the ∆E color difference algorithm to isolate the affected area of the leaf, together with color histogram and textural features, to identify diseases. They applied principal component analysis to reduce the dimension of the feature set and a bagged tree classifier for classification. Kaur et al. [21] used a desegregated particle swarm optimization (PSO) technique together with a support vector machine (SVM) to identify and classify plant leaf diseases. The main aim of that study was to separate the portion of the leaf affected by the disease from the unaffected portion. Ma et al. [22] employed a comprehensive color function and a detection method based on it that can segment images of disease spots captured under real field conditions. The approach provides reliable input to CNN-based disease recognition.
   Mondal et al. [23] used forty-three morphological features of okra and bitter gourd leaves to identify disease symptoms from their images. In this approach, feature selection was performed using the Pearson correlation coefficient, and entropy-based discretization was used to improve the classification success rate. Zhang et al. [24] selected useful features by combining the merits of the genetic algorithm (GA) with correlation-based feature selection (CFS). Here, GA and CFS played a key role in reducing the dimensions of the feature space. Finally, an SVM classifier was used to identify the diseases. Singh et al. [25] described a method of image segmentation using GA, followed by classification with an SVM. B. Liu et al. [26] proposed a model for the accurate identification of apple leaf diseases that requires generating many pathological images, and developed a novel AlexNet-based deep CNN architecture for disease detection. Dechant et al. [27] proposed a computational pipeline of CNNs to tackle the limitations of restricted data and the countless variations that occur in images of field-grown leaves. Multiple CNNs were trained to classify small image regions, and their predictions were combined into separate heat maps, which were then fed into a final CNN trained to classify the entire image as diseased or not.
   Hanson et al. [28] proposed a new method for detecting plant diseases using a deep convolutional neural network that was trained and fine-tuned to fit a database of plant leaves collected independently for various plant diseases. Yao et al. [29] proposed an efficient three-layer detection system for characterizing the growth stages of white-backed planthoppers on rice crops in paddy fields. Sethy et al. [30] proposed fuzzy logic together with the k-means segmentation method to measure the extent of disease in rice crops. Fuentes et al. [31] proposed a deep learning method to identify diseases and pests in tomato plants from photographs taken in place by camera devices at various resolutions. The research illustrated the efficiency of deep meta-architectures and feature extractors.
   Sunny et al. [32] proposed a two-stage solution to enhance image clarity. The first stage uses Contrast Limited Adaptive Histogram Equalization (CLAHE) to pre-process the leaf image, followed by segmentation with k-means clustering and extraction of texture features via the statistical Gray Level Co-occurrence Matrix (GLCM). The second stage uses a support vector machine to classify the plant as healthy or diseased. Zhang et al. [33] suggested a clustering algorithm that splits the color image of a diseased leaf into many small superpixels and then uses the k-means clustering method to segment the lesion from every superpixel. Finally, Pyramid Histogram of Oriented Gradients (PHOG) features were extracted from the three color components of each segmented lesion image and from its grayscale image, and the four PHOG descriptors were concatenated into a single vector.
   Alsuwaidi et al. [34] used an analytical classification system that combines adaptive feature selection, novelty detection, and ensemble learning on hyperspectral datasets. Singh et al. [35] presented an automated approach that distinguishes Neem from Bakain using the texture features of their leaves and then uses a tree classifier to assign them to separate classes. Brahimi et al. [36] suggested using saliency maps as a visualization to interpret the CNN classification process. This visualization improves the transparency of deep learning frameworks and offers further insight into plant disease symptoms.
   Yue et al. [37] proposed a super-resolution model that relies on a deep recursive residual network and provides state-of-the-art performance compared with traditional methods. Iqbal et al. [38] presented a review of the different approaches to identifying and classifying diseases specific to citrus plants.
   Dhingra et al. [39] outlined a neutrosophic approach based on computer vision for plant disease analysis. The system uses a fuzzy set extension technique based on neutrosophic logic segmentation to analyze the region of interest; a new feature subset is then evaluated on the segmented region to classify a basil leaf as healthy or diseased. Picon et al. [40] used a deep residual neural network based algorithm to detect plant diseases under real acquisition conditions, and suggested various adaptations for early disease detection. That research analyzes the early identification of three relevant European endemic diseases of wheat: Septoria, Tan Spot, and Rust.
   Wu et al. [41] proposed a new solution for crop disease detection built around a multi-feature sparse constraint system with three main phases: lesion segmentation, feature extraction, and disease detection. They applied this model to images of diseased cucumbers and achieved an accuracy of 88.05%. Waheed et al. [42] proposed a DenseNet (dense convolutional neural network) based model for recognizing and identifying diseases in corn leaves. They claimed that their method uses significantly fewer parameters than other state-of-the-art methods.

3. MODELS AND METHODS

   This section describes the proposed method for classifying and identifying apple leaf diseases in detail, including dataset collection, pre-processing, segmentation, feature extraction, training, and testing. It is organized into the subsections below.

3.1 Data set

   The dataset, downloaded from Kaggle [43], contains four types of apple leaves: healthy leaves and leaves with one of three infections, namely apple scab, black rot, and apple rust. Figure 1 shows healthy and unhealthy sample images from the dataset used in this article.

[Figure 1: healthy and unhealthy sample images from the dataset (image not reproduced)]

3.2 Basic model design

   The proposed system is designed in the following fundamental steps:
   1. Pre-processing of data: The main objective of pre-processing is to bring out concealed information by improving contrast with the contrast stretching approach [44].
   2. Segmentation: After pre-processing, the region of interest is identified using FCM clustering segmentation.
   3. Extraction of features: The classification model used in the proposed work is a convolutional neural network [45], which is also used to extract features.
   4. Classification: A CNN is used to identify the leaf disease.
   These steps are discussed in more depth in the following subsections. The flow diagram in Figure 2 presents the proposed approach. The primary purpose of this work is to enable the system to learn the characteristics that distinguish one class from another. To this end, the dataset can be enlarged with augmented images to improve the network's chance of learning the correct features [46]. Figure 3 shows the result of augmentation on unhealthy apple leaf images.

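[Figure 2: flow diagram of the proposed approach (image not reproduced)]
[Figure 3: augmentation results on unhealthy apple leaf images (image not reproduced)]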

3.3 Preprocessing

   Image preprocessing is an important step in examining and manipulating a digital photo [47], particularly for highlighting problem regions, i.e., the diseased regions, once the data has been gathered. Visual enhancement, however, is one of the harder problems in image processing [48] and is highly task-specific. In this work, the contrast stretching algorithm is used to enhance the image. Contrast enhancement methods extend the range of brightness levels in an image so that it can be viewed effectively in the way the analyst wishes. The contrast level of an image can vary because of poor lighting or inappropriate settings in the capture device.
   Contrast stretching examines the distribution of pixel intensities in an image and then rescales the image so that the intensities between the 2nd and 98th percentiles span the full output range. Global contrast stretching is governed by the following equation, where inRGB(x, y) is the original RGB value of the pixel, outRGB(x, y) is the new RGB value of the pixel, and minRGB and maxRGB are the minimum and maximum values among the RGB components (red, green, and blue) of the original image.

[Equation: global contrast stretching formula (image not reproduced)]
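The formula itself is not available in this post. A minimal sketch of percentile-based contrast stretching is given below, assuming the usual min-max form outRGB = 255 × (inRGB − minRGB) / (maxRGB − minRGB), with the minimum and maximum replaced by the 2nd and 98th percentiles. The function names, the simple percentile rule, and the 0..255 output range are illustrative assumptions, not taken from the paper.

```python
def percentile(sorted_vals, p):
    """Simple nearest-index percentile of an already sorted list (p in 0..100)."""
    k = round(p / 100 * (len(sorted_vals) - 1))
    return sorted_vals[k]


def contrast_stretch(channel, low_pct=2, high_pct=98, out_max=255):
    """Stretch one colour channel (2-D list of ints) to the range 0..out_max.

    Intensities at or below the low percentile map to 0, those at or above
    the high percentile map to out_max, and the rest are scaled linearly:
        out = out_max * (in - lo) / (hi - lo)
    """
    flat = sorted(v for row in channel for v in row)
    lo, hi = percentile(flat, low_pct), percentile(flat, high_pct)
    if hi == lo:                       # flat image: nothing to stretch
        return [row[:] for row in channel]
    stretched = []
    for row in channel:
        new_row = []
        for v in row:
            clipped = min(max(v, lo), hi)            # clip to [lo, hi]
            new_row.append(round(out_max * (clipped - lo) / (hi - lo)))
        stretched.append(new_row)
    return stretched


# A low-contrast patch whose values only cover 100..150.
low_contrast = [[100, 110, 120], [130, 140, 150]]
print(contrast_stretch(low_contrast, low_pct=0, high_pct=100))
# [[0, 51, 102], [153, 204, 255]]
```

For an RGB image, the same function would be applied to each channel, or to all channels with shared limits, depending on how minRGB and maxRGB are computed.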

3.4 Segmentation technique

   Image segmentation is difficult because of the complexity and variety of images [49]. Factors such as lighting [50], contrast, and interference affect the outcome. The aim of segmentation here is to identify the regions of interest needed to determine the disease. We adopt the FCM clustering approach for segmentation. FCM is a clustering technique that lets a data point be a member of more than one cluster, so it belongs to the class of soft segmentation techniques. Such techniques are popular for image segmentation because they retain more detail from the original image than hard segmentation approaches. FCM assigns pixels to unlabelled clusters with varying degrees of membership. The FCM clustering segmentation algorithm is outlined in Algorithm 1 [51]. Unlike clustering methods in which each data point must belong exclusively to one cluster center, FCM assigns each data point a membership in every cluster center, so a point may belong to more than one cluster. Figure 4 shows the result for an unhealthy leaf sample after the preprocessing and segmentation steps.

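[Algorithm 1 (FCM clustering segmentation) and Figure 4 (preprocessing and segmentation result on an unhealthy leaf sample): images not reproduced]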
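Since Algorithm 1 is not reproduced here, the following is a minimal sketch of standard fuzzy c-means on one-dimensional pixel intensities. It is not the paper's Algorithm 1: the fuzzifier m = 2, the fixed iteration count, the random initialisation, and all names are assumptions made for illustration.

```python
import random


def fuzzy_c_means(values, n_clusters=2, m=2.0, n_iter=50, seed=0):
    """Cluster 1-D values (e.g. pixel intensities) with fuzzy c-means.

    Returns (centers, memberships), where memberships[i][j] is the degree
    to which values[i] belongs to cluster j; each row sums to 1.
    """
    rng = random.Random(seed)
    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in values:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([r / total for r in row])

    centers = [0.0] * n_clusters
    for _ in range(n_iter):
        # Centre update: mean of the data weighted by membership ** m.
        for j in range(n_clusters):
            weights = [u[i][j] ** m for i in range(len(values))]
            centers[j] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
        # Membership update from relative distances to all centres.
        for i, x in enumerate(values):
            dists = [abs(x - c) + 1e-9 for c in centers]
            for j in range(n_clusters):
                u[i][j] = 1.0 / sum((dists[j] / d) ** (2 / (m - 1)) for d in dists)
    return centers, u


pixels = [10, 12, 11, 13, 200, 205, 198, 202]    # dark vs. bright intensities
centers, u = fuzzy_c_means(pixels)
# Hard labels, if needed, come from the largest membership of each pixel.
labels = [max(range(len(row)), key=row.__getitem__) for row in u]
```

Every pixel keeps a membership in both clusters, which is the "soft" property described above; thresholding or taking the largest membership turns the result into a segmentation mask.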

3.5 Feature extraction and classification

   The remarkable performance improvements that deep neural networks have achieved on different tasks inspired us to use them for image classification in the present work. We used a CNN to extract features and classify images. The output of the segmentation step is given to the CNN, which classifies whether the image shows a healthy leaf or not. A CNN is a kind of feedforward network and an end-to-end pipeline that can automatically learn the discriminative features needed for image classification. Because the features in a CNN are not designed and extracted on the basis of human knowledge, the model is much less dependent on manual feature engineering. A CNN consists of layers, but these layers are not fully interconnected. Instead, they apply filters: sets of cube-shaped weights that are slid across the whole image. Each two-dimensional slice of a filter is called a "kernel". The major component of a CNN is the convolutional layer. For every convolutional layer, the output feature map is determined by convolving the feature maps of the previous layer with the convolution kernels of the current layer. The output feature map is given by Eq. (5) [26].

[Eq. (5): image not reproduced]
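The image for Eq. (5) is missing from this post. A plausible reconstruction, assuming the standard form of a convolutional layer and the symbols defined in the text that follows, is shown here. The activation function f, the convolution operator *, and the index set M_b of input feature maps are assumptions that do not appear in the surviving text.

```latex
X_b^{l} = f\left( \sum_{a \in M_b} X_a^{l-1} * c_{ab}^{l} + B_b^{l} \right) \tag{5}
```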

   where l is the index of the layer, B_b denotes the bias, c_ab is the convolutional kernel, and X_b is the set of input feature maps. The ReLU activation function influences the network's learning capacity and converges quickly, so it is applied to the output of each convolutional layer. Mathematically, it can be expressed as in Eq. (6) [26]:

[Eq. (6): image not reproduced]
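ReLU is conventionally defined as f(x) = max(0, x), which is presumably what Eq. (6) states. As a rough illustration of how a convolution and ReLU combine into one output feature map, here is a small dependency-free sketch. The single-channel input, stride 1, no padding, and all names are simplifying assumptions rather than the paper's configuration.

```python
def relu(x):
    """ReLU activation: f(x) = max(0, x)."""
    return max(0.0, x)


def conv2d_relu(image, kernel, bias=0.0):
    """Valid (no padding, stride 1) 2-D convolution followed by ReLU.

    As in most deep learning libraries, the kernel is not flipped, so this
    is strictly a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            s = bias
            for i in range(kh):
                for j in range(kw):
                    s += image[r + i][c + j] * kernel[i][j]
            row.append(relu(s))
        feature_map.append(row)
    return feature_map


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]           # responds to change along the main diagonal
print(conv2d_relu(image, kernel))             # every sum is -4, so ReLU gives zeros
print(conv2d_relu(image, kernel, bias=5.0))   # the bias shifts every sum to +1
```

A real convolutional layer sums such responses over all input feature maps and learns the kernel weights and biases during training.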

   Pooling is another crucial layer of a CNN. It is a form of non-linear downsampling. This layer reduces the size of the feature maps obtained from the convolutional layers and helps achieve spatial invariance. Fully connected layers are inserted before the CNN's classification output and are used to flatten the result before classification. The output of the final pooling or convolutional layer is the input to the fully connected layer. Figure 5 shows the CNN model parameters used in this work.

[Figure 5: CNN model parameters (image not reproduced)]
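To make the pooling and flattening steps concrete, here is a small sketch. The text does not say which pooling type or window size the model uses, so max pooling with non-overlapping 2×2 windows is assumed purely for illustration, and the function names are hypothetical.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


def flatten(feature_map):
    """Turn a 2-D feature map into the 1-D vector a dense layer expects."""
    return [v for row in feature_map for v in row]


fmap = [[1, 3, 2, 0],
        [4, 6, 1, 1],
        [0, 2, 9, 8],
        [1, 1, 7, 5]]
pooled = max_pool2d(fmap)      # [[6, 2], [2, 9]]
vector = flatten(pooled)       # [6, 2, 2, 9]
```

Each 2×2 window is reduced to its largest value, so the 4×4 map shrinks to 2×2 while the strongest responses are kept.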

4. EXPERIMENTAL OBSERVATION

4.1 Dataset of apple leaf images

   The dataset consists of 400 images: 200 are healthy leaf samples, and the rest are diseased leaf samples from several categories, namely apple scab, black rot, and apple rust. All experiments are performed using the Keras framework on top of TensorFlow. The dataset is expanded using the ImageDataGenerator class of TensorFlow.
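The text does not list which transformations were configured in ImageDataGenerator. The idea of label-preserving augmentation can still be sketched without any framework; the flips and 90-degree rotation below are illustrative choices, not the paper's settings, and the function names are hypothetical.

```python
def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Mirror the image top to bottom."""
    return [list(row) for row in img[::-1]]


def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def augment(img):
    """Return the original image plus three transformed copies."""
    return [img, flip_horizontal(img), flip_vertical(img), rotate90(img)]


leaf = [[1, 2],
        [3, 4]]               # stand-in for a leaf image
variants = augment(leaf)
print(len(variants))          # 4 samples from 1 original
```

Each variant keeps the label of the original image, which is how a 400-image dataset can be enlarged without collecting new photographs.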

4.2 Evaluation metrics for system model

   The performance of the proposed system was assessed using several evaluation metrics [52, 53]. A brief overview of these metrics is provided below.

[Evaluation metric definitions: image not reproduced]
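The metric definitions are not available in this post. Assuming they are the usual confusion-matrix metrics (accuracy, precision, recall, and F1 score), which is a guess, they could be computed as in the sketch below. The counts in the example are made up for illustration and are not results from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion counts.

    tp/fp: true/false positives, tn/fn: true/false negatives.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only.
m = classification_metrics(tp=45, fp=5, tn=40, fn=10)
print(m["accuracy"], m["precision"])   # 0.85 0.9
```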

4.3 System model evaluation (Training-Validation observation)

   In this section, we examine the predictive performance of the CNN model on the training-testing datasets listed in Table 2 and Table 3. For the training-validation analysis, the data is split 80% for training and 20% for validation.

[Tables 2 and 3: training-testing datasets (image not reproduced)]

   Figures 6-9 show the CNN model's classification performance as each proposed step is added, with respect to the assessment parameters defined in Section 4.2. Because a model trained on a larger dataset can be expected to perform more accurately on test data, the assessment parameters are also evaluated on the augmented dataset. An augmented dataset is a version of the existing dataset enlarged in size and variety without explicitly gathering new data. Figure 6 shows that the proposed model achieves 94% accuracy on the smaller original dataset, which improves to 98% with the augmented dataset.

[Figures 6-9: classification performance of the CNN model for each assessment parameter (images not reproduced)]

4.4 Analysis on various Training-Validation partitions

   In addition, accuracy is measured for several training-validation partitions, i.e., 50-50, 60-40, 70-30, 80-20, and 90-10, as shown in Tables 4 and 5 for the original and augmented datasets respectively.

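[Tables 4 and 5: accuracy on the training-validation partitions for the original and augmented datasets (image not reproduced)]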

   Accuracy is evaluated on both the original and the augmented dataset to check the robustness of the proposed model. The accuracy of a CNN model depends strongly on the size of its training dataset, yet the presented system achieves sufficiently high accuracy even with a small training dataset. The results in Table 4 show that the presented method achieves an accuracy of 95% when 360 training and 40 validation image samples are used. Table 5 shows that with the augmented dataset and a higher training ratio, the proposed system achieves an accuracy of up to 98%.
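The partition sizes can be checked with a short sketch: a 90-10 split of 400 images yields the 360 training and 40 validation samples mentioned above. The shuffling, the fixed seed, and the function name are assumptions; the text does not describe how the samples were assigned to each partition.

```python
import random


def train_val_split(items, train_pct, seed=0):
    """Shuffle items and split them into training and validation lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    return shuffled[:n_train], shuffled[n_train:]


image_ids = range(400)                  # stand-ins for the 400 leaf images
for pct in (50, 60, 70, 80, 90):
    train, val = train_val_split(image_ids, pct)
    print(f"{pct}-{100 - pct}: {len(train)} training, {len(val)} validation")
```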

4.5 Comparison with the existing systems

   In this experiment, we compare the classification accuracy of the proposed model with that of conventional methods [26, 28]. As outlined in Table 6, the proposed approach outperforms the other methods while using a far smaller augmented dataset. The comparison with these systems is made on the basis of their common use of augmented datasets, and shows that the proposed model performs better even though its augmented set is much smaller. The main idea of Liu et al. [26] is to generate ample pathological images of apple leaves and build a novel AlexNet-based deep CNN model for disease identification. Hanson et al. [28] also achieved good results by collecting more data and tuning the system variables. Waheed et al. [42] proposed an optimized DenseNet model for corn leaf disease and achieved an accuracy of 98.06%. In this work, the main aim was to improve system accuracy with a minimal dataset, which is achieved by using contrast stretching for pre-processing and the FCM clustering algorithm for segmentation.

[Table 6: comparison with existing systems (image not reproduced)]

5. CONCLUSION

   In this work, we proposed a novel approach that combines contrast stretching based preprocessing and fuzzy c-means segmentation with a CNN to identify disease in apple leaves. The entire pipeline was described, from image collection through segmentation to feature extraction and classification by the CNN. Building on preprocessing followed by segmentation, a new deep convolutional neural network model was developed that discovers distinctive features automatically and identifies apple leaf diseases accurately. The proposed approach was compared with existing state-of-the-art methods, and the results were encouraging. The developed system reaches a 98% accuracy rate with a much smaller dataset.
   Future research could extend this work to characterize each category of disease separately and to estimate the severity of the diseases identified. Unexplored combinations of feature extraction, feature selection, and learning approaches could also be analysed to improve the effectiveness of disease diagnosis and identification models.