Making Predictions from Images with a CNN
In this article, we use an image of a polymer's structure to predict its glass transition temperature. The approach follows a methodology similar to the one published in a recent research paper by Luis A. Miccio of the Materials Physics Center and the Donostia International Physics Center in Spain.
Introduction
The glass transition temperature is one of the crucial properties of a polymer. It marks the temperature range below which the atoms of a supercooled liquid are temporarily frozen (without crystallizing) upon cooling. Predicting the glass transition temperature (Tg) provides valuable insight into the properties of polymers whose synthesis may otherwise be costly and time-consuming. Researchers have traditionally been keener to build machine learning models that predict one property from others (for instance, using several other properties to predict tensile strength). Over the last few years, the emphasis has shifted to Quantitative Structure-Property Relationships. This opens up the possibility of predicting various properties from the structure of the molecular compound alone (that is, just its image), with no need for additional experimental properties or tedious calculations. In this article, we use Convolutional Neural Networks to predict the Tg of unknown polymer compounds from an image of the polymer. This is a pretty cool idea: it means that drawing the monomeric unit on a whiteboard would be enough to predict its Tg. We do not need any other external information or properties for the polymer.
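The text above does not spell out how a structure becomes an image that a CNN can read. One possibility, assumed here purely for illustration, is to one-hot encode the SMILES string of the repeat unit into a binary grid, with one row per character position and one column per symbol. The function names, the two example SMILES strings, and the padding length of 16 are hypothetical and are not taken from the dataset or from the paper's code.

```python
def build_vocabulary(smiles_list):
    """Collect the distinct characters that occur in the SMILES strings."""
    return sorted(set("".join(smiles_list)))


def smiles_to_matrix(smiles, vocabulary, max_length):
    """Encode one SMILES string as a max_length x len(vocabulary) grid of 0/1.

    Row i marks which vocabulary symbol sits at position i of the string.
    Rows past the end of the string stay all-zero (padding).
    """
    if len(smiles) > max_length:
        raise ValueError("SMILES string is longer than max_length")
    column = {symbol: i for i, symbol in enumerate(vocabulary)}
    matrix = [[0] * len(vocabulary) for _ in range(max_length)]
    for row, symbol in enumerate(smiles):
        if symbol not in column:
            raise ValueError(f"symbol {symbol!r} is not in the vocabulary")
        matrix[row][column[symbol]] = 1
    return matrix


# Illustrative repeat units only (assumption): polyethylene and polystyrene,
# with '*' marking the attachment points of the repeat unit.
example_smiles = ["*CC*", "*CC(*)c1ccccc1"]
vocabulary = build_vocabulary(example_smiles)
image = smiles_to_matrix(example_smiles[0], vocabulary, max_length=16)
```

Splitting on single characters is a simplification: two-letter element symbols such as Cl or Br would be split in two, so a real pipeline would probably need a proper SMILES tokenizer. The resulting grid can be wrapped in an array and fed to a CNN as a single-channel image.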
Importing Relevant Packages
Dataset
The dataset used in our study was gathered from a popular polymer database. It comprises 351 polymers, with their SMILES codes and molecular names as input attributes and their glass transition temperatures as the output variable. A subset of 300 polymers and their Tg values was used for training and validation, whereas the remaining 51 unseen polymers were used to test the results of both models, the CNN and the proposed ANN. The figure below shows the top 5 rows of the dataset. The dataset for this study can be found here.
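The 300/51 split described above can be sketched as follows. This is a minimal illustration, not the article's own code: the text does not say how the 51 test polymers were chosen, so a seeded random shuffle is assumed, and the records are placeholders standing in for the real rows.

```python
import random


def split_dataset(records, n_test=51, seed=0):
    """Shuffle a copy of the records and hold out n_test of them for testing.

    Returns (train_val, test). The random, seeded selection is an assumption.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder rows only: 351 entries with made-up identifiers and no Tg values.
records = [{"id": i, "smiles": f"placeholder_{i}", "tg": None} for i in range(351)]
train_val, test = split_dataset(records)
```

Fixing the seed keeps the held-out polymers the same from run to run, so the CNN and the ANN can be compared on an identical test set.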
