GSoC 2020 with CERN-HSF: Dark Matter and Deep Learning

Original article: https://towardsdatascience.com/gsoc-2020-with-cern-hsf-dark-matter-and-deep-learning-eb611850bb79

This blog is a very brief summary of my Google Summer of Code (GSoC) 2020 project under CERN-HSF. This year marks the 16th anniversary of Google Summer of Code, which received 8,902 proposals from 6,626 students; 1,198 of those students were given the opportunity to work with 199 organizations.

[Image: Google Summer of Code]

The DeepLense Project

Project description

DeepLense is a deep learning pipeline for particle dark matter searches with strong gravitational lensing, and it is part of the umbrella organization CERN-HSF. Specifically, my project extends the work published in the paper “Deep Learning the Morphology of Dark Matter Substructure,” in which my mentors explored state-of-the-art supervised deep learning models such as ResNet for the multiclass classification of strong lensing images.

Gravitational lensing has been a cornerstone of many cosmology experiments and studies since it was discussed in Einstein’s calculations in 1936 and discovered in 1979. One area of particular interest is the study of dark matter via substructure in strong lensing images. While statistical and supervised machine learning algorithms have been implemented for this task, the potential of unsupervised deep learning algorithms is yet to be explored and could prove crucial in the analysis of LSST data. The primary aim of this GSoC 2020 project is to design a Python-based framework for implementing unsupervised deep learning architectures to study strong lensing images.

Refer to the paper “Decoding Dark Matter Substructure without Supervision” for more details.

Repositories

I have compiled my work into two open-source repositories. The first, PyLensing, is a tool for generating lensing images based on PyAutoLens simulations. The second, Unsupervised Lensing, is a PyTorch-based tool for unsupervised deep learning applications in strong lensing cosmology.

About Me

I am K Pranath Reddy, an M.Sc. (Hons.) Physics and B.E. (Hons.) Electrical and Electronics Engineering major at Birla Institute of Technology and Science (BITS) Pilani, Hyderabad Campus, India.

Why DeepLense?

As a physics student, I am familiar with the operations of CERN and have a fundamental understanding of many projects associated with the organization. I have also worked extensively on applications of deep learning in cosmology. This experience motivated me to contribute to the DeepLense project.

The Data

Our dataset consists of three classes: strong lensing images with no substructure, with vortex substructure, and with spherical substructure. Treating the samples with substructure as outliers, we train our unsupervised models on a set of strong lensing images with no substructure, which frames the problem as anomaly detection.

We have generated two sets of lensing images, Model A and Model B, using the Python package PyAutoLens for our simulations. All simulated images for Model A are held at a fixed redshift, while Model B allows the lensed and lensing galaxy redshifts to float over a range of values. The two models also differ in signal-to-noise ratio (SNR): images for Model A have SNR ≈ 20, whereas Model B is constructed so that the simulations produce images whose SNR varies from 10 to 30. More details about the simulation can be found in the paper.

Unsupervised Models

I have studied and implemented various unsupervised models in the context of anomaly detection. In this section, I discuss four models, namely the Deep Convolutional Autoencoder (DCAE), Convolutional Variational Autoencoder (VAE), Adversarial Autoencoder (AAE), and Restricted Boltzmann Machine (RBM), along with the code for implementing them using my PyTorch tool, Unsupervised Lensing.

Deep Convolutional Autoencoder (DCAE)

An autoencoder is a type of neural network that learns its own representation and consists of an encoder network and a decoder network. The encoder learns to map the input samples to a latent vector whose dimensionality is lower than the dimensionality of the input samples, and the decoder network learns to reconstruct the input from the latent dimension. Thus, autoencoders can be understood qualitatively as algorithms for finding the optimal compressed representation of a given class.
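To make the encoder/decoder idea concrete, here is a minimal PyTorch sketch of a convolutional autoencoder and a single training step. It is not the architecture used in Unsupervised Lensing (see the paper's appendix for that); the class name `TinyConvAutoencoder`, the 64×64 single-channel input size, the layer widths, and the latent size of 32 are all assumptions chosen only for illustration, and the batch is random noise standing in for lensing images.

```python
import torch
from torch import nn


class TinyConvAutoencoder(nn.Module):
    """Hypothetical minimal convolutional autoencoder for 1x64x64 images."""

    def __init__(self, latent_dim=32):
        super().__init__()
        # Encoder: compress the image into a low-dimensional latent vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, stride=2, padding=1),   # 64 -> 32
            nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1),  # 32 -> 16
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(16 * 16 * 16, latent_dim),
        )
        # Decoder: reconstruct the image from the latent vector.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 16 * 16 * 16),
            nn.ReLU(),
            nn.Unflatten(1, (16, 16, 16)),
            nn.ConvTranspose2d(16, 8, kernel_size=4, stride=2, padding=1),  # 16 -> 32
            nn.ReLU(),
            nn.ConvTranspose2d(8, 1, kernel_size=4, stride=2, padding=1),   # 32 -> 64
            nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z


torch.manual_seed(0)
model = TinyConvAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=2e-3)

batch = torch.rand(4, 1, 64, 64)  # random stand-in for lensing images
recon, latent = model(batch)
loss = nn.functional.mse_loss(recon, batch)  # reconstruction loss

optimizer.zero_grad()
loss.backward()
optimizer.step()
print(recon.shape, latent.shape)
```

The latent vector here has 32 entries against 4,096 input pixels, which is the "compressed representation" the paragraph above refers to.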

We first consider a deep convolutional autoencoder, which is primarily used for feature extraction and reconstruction of images. During training, we make use of the mean squared error (MSE),

$$L(\theta, \theta') = \frac{1}{N} \sum_{i=1}^{N} \left(\theta_i - \theta'_i\right)^2,$$

as our reconstruction loss, where θ and θ′ are the real and reconstructed samples.

Implementation using the PyTorch tool:

from unsupervised_lensing.models import Convolutional_AE
from unsupervised_lensing.models.DCAE_Nets import *
from unsupervised_lensing.utils import loss_plotter as plt
from unsupervised_lensing.utils.EMD_Lensing import EMD

# Model Training
out = Convolutional_AE.train(data_path='./Data/no_sub_train.npy',
                             epochs=100,
                             learning_rate=2e-3,
                             optimizer='Adam',
                             checkpoint_path='./Weights',
                             pretrain=True,
                             pretrain_mode='transfer',
                             pretrain_model='A')

# Plot the training loss
plt.plot_loss(out)

# Model Validation
recon_loss = Convolutional_AE.evaluate(data_path='./Data/no_sub_test.npy',
                                       checkpoint_path='./Weights',
                                       out_path='./Results')

# Plot the reconstruction loss
plt.plot_dist(recon_loss)

# Calculate Wasserstein distance
print(EMD(data_path='./Data/no_sub_test.npy', recon_path='./Results/Recon_samples.npy'))
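The internals of the tool's `EMD` helper are not shown in this post, so the following is only an illustration of the metric itself, not of that function. For two one-dimensional samples of equal size, the Wasserstein-1 (earth mover's) distance reduces to the mean absolute difference between the sorted values. The helper name `wasserstein_1d` and the toy pixel values are assumptions made up for the example.

```python
def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-sized 1D empirical samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    a_sorted, b_sorted = sorted(a), sorted(b)
    # With equal sizes, the optimal transport plan matches sorted values pairwise.
    return sum(abs(x - y) for x, y in zip(a_sorted, b_sorted)) / len(a)


# Made-up flattened pixel intensities for an image and its reconstruction.
real_pixels = [0.0, 0.2, 0.5, 0.9]
recon_pixels = [0.1, 0.2, 0.4, 1.0]

distance = wasserstein_1d(real_pixels, recon_pixels)
print(distance)
```

A smaller value means the intensity distribution of the reconstruction is closer to that of the input; a value of zero means the two samples contain the same values up to ordering.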

Convolutional Variational Autoencoder (VAE)

We also consider a variational autoencoder, which introduces an additional constraint on the representation of the latent dimension in the form of the Kullback-Leibler (KL) divergence,

$$D_{KL}(P \,\|\, Q) = \mathbb{E}_{x \sim P}\left[-\log Q(x)\right] - H(P(x)),$$

where P(x) is the target distribution and Q(x) is the distribution learned by the algorithm. The first term on the r.h.s. is the cross-entropy between P and Q and the second term is the entropy of P. Thus the KL divergence encodes information of how far the distribution Q is from P. In the context of variational autoencoders, the KL divergence serves as a regularization to impose a prior on the latent space. For our purposes, P is chosen to take the form of a Gaussian prior on the latent space z and Q corresponds to the approximate posterior q(z|x) represented by the encoder. The total loss of the model is the sum of reconstruction (MSE) loss and the KL divergence.
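The decomposition described above can be checked numerically. The sketch below uses only the standard library and two made-up discrete distributions to confirm that the KL divergence equals the cross-entropy minus the entropy of P. It also includes the closed form that VAE implementations commonly use for a diagonal Gaussian posterior against a standard normal prior; whether Unsupervised Lensing uses exactly this expression is an assumption, and all function names are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy H(P) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Cross-entropy E_{x~P}[-log Q(x)] in nats."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """D_KL(P || Q) computed directly from its definition."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


# Toy distributions, made up for illustration.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

kl_direct = kl_divergence(p, q)
kl_from_terms = cross_entropy(p, q) - entropy(p)
print(kl_direct, kl_from_terms)

# A posterior equal to the prior costs nothing; shifting one mean by 1 costs 0.5 nats.
kl_zero = gaussian_kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])
kl_shifted = gaussian_kl_to_standard_normal([1.0, 0.0], [0.0, 0.0])
print(kl_zero, kl_shifted)
```

In a VAE training loop, the Gaussian term would be added to the MSE reconstruction loss for each batch, which is the "sum of reconstruction loss and KL divergence" mentioned above.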

Implementation using the PyTorch tool:

from unsupervised_lensing.models import Variational_AE
from unsupervised_lensing.models.VAE_Nets import *
from unsupervised_lensing.utils import loss_plotter as plt
from unsupervised_lensing.utils.EMD_Lensing import EMD

# Model Training
out = Variational_AE.train(data_path='./Data/no_sub_train.npy',
                           epochs=100,
                           learning_rate=2e-3,
                           optimizer='Adam',
                           checkpoint_path='./Weights',
                           pretrain=True,
                           pretrain_mode='transfer',
                           pretrain_model='A')

# Plot the training loss
plt.plot_loss(out)

# Model Validation
recon_loss = Variational_AE.evaluate(data_path='./Data/no_sub_test.npy',
                                     checkpoint_path='./Weights',
                                     out_path='./Results')

# Plot the reconstruction loss
plt.plot_dist(recon_loss)

# Calculate Wasserstein distance
print(EMD(data_path='./Data/no_sub_test.npy', recon_path='./Results/Recon_samples.npy'))

Adversarial Autoencoder (AAE)

Finally, we consider an adversarial autoencoder, which replaces the KL divergence of the variational autoencoder with adversarial learning. We train a discriminator network D to classify between the samples generated by the autoencoder G and samples taken from a prior distribution P(z) corresponding to our training data. The total loss of the model is the sum of the reconstruction (MSE) loss and the loss of the discriminator network.

We additionally add a regularization term to the autoencoder.

As the autoencoder becomes proficient at reconstructing its inputs, the ability of the discriminator degrades. The discriminator network then iterates by improving its performance at distinguishing the real and generated data.
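The alternating updates described above can be sketched as one training iteration in PyTorch. This follows the common textbook formulation of an adversarial autoencoder, in which the discriminator operates on latent codes and binary cross-entropy is used for the adversarial terms; the exact loss expressions used in Unsupervised Lensing may differ. The fully connected layers, the sizes, and the random input batch are all assumptions made to keep the example short.

```python
import torch
from torch import nn

torch.manual_seed(0)
input_dim, latent_dim = 64, 8  # hypothetical sizes for flattened toy inputs

encoder = nn.Sequential(nn.Linear(input_dim, 32), nn.ReLU(), nn.Linear(32, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, input_dim))
discriminator = nn.Sequential(nn.Linear(latent_dim, 16), nn.ReLU(), nn.Linear(16, 1))

opt_ae = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=2e-3)
opt_disc = torch.optim.Adam(discriminator.parameters(), lr=2e-3)
bce = nn.BCEWithLogitsLoss()

x = torch.rand(16, input_dim)  # random stand-in for a batch of images
ones, zeros = torch.ones(16, 1), torch.zeros(16, 1)

# 1) Reconstruction phase: update encoder and decoder with the MSE loss.
recon_loss = nn.functional.mse_loss(decoder(encoder(x)), x)
opt_ae.zero_grad()
recon_loss.backward()
opt_ae.step()

# 2) Discriminator phase: prior samples count as "real", encoded latents as "fake".
z_prior = torch.randn(16, latent_dim)
z_fake = encoder(x).detach()  # detach so only the discriminator is updated
disc_loss = bce(discriminator(z_prior), ones) + bce(discriminator(z_fake), zeros)
opt_disc.zero_grad()
disc_loss.backward()
opt_disc.step()

# 3) Regularization phase: update the encoder so its latents fool the discriminator.
gen_loss = bce(discriminator(encoder(x)), ones)
opt_ae.zero_grad()
gen_loss.backward()
opt_ae.step()

print(recon_loss.item(), disc_loss.item(), gen_loss.item())
```

Repeating these three phases over many batches is what drives the back-and-forth between the autoencoder and the discriminator.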

Implementation using the PyTorch tool:

from unsupervised_lensing.models import Adversarial_AE
from unsupervised_lensing.models.AAE_Nets import *
from unsupervised_lensing.utils import loss_plotter as plt
from unsupervised_lensing.utils.EMD_Lensing import EMD

# Model Training
out = Adversarial_AE.train(data_path='./Data/no_sub_train.npy',
                           epochs=100,
                           learning_rate=2e-3,
                           optimizer='Adam',
                           checkpoint_path='./Weights',
                           pretrain=True,
                           pretrain_mode='transfer',
                           pretrain_model='A')

# Plot the training loss
plt.plot_loss(out)

# Model Validation
recon_loss = Adversarial_AE.evaluate(data_path='./Data/no_sub_test.npy',
                                     checkpoint_path='./Weights',
                                     out_path='./Results')

# Plot the reconstruction loss
plt.plot_dist(recon_loss)

# Calculate Wasserstein distance
print(EMD(data_path='./Data/no_sub_test.npy', recon_path='./Results/Recon_samples.npy'))

Restricted Boltzmann Machine (RBM)

To compare with our three autoencoder models, we also train a restricted Boltzmann machine (RBM), a generative artificial neural network realized as a bipartite graph that learns a probability distribution over its inputs. An RBM consists of two layers, a hidden layer and a visible layer, and is trained with a process called contrastive divergence.
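As a rough illustration of contrastive divergence, the sketch below performs a single CD-1 update for a binary RBM in PyTorch: a positive phase on the data, one Gibbs step to obtain a reconstruction, and a weight update from the difference of the two correlations. The layer sizes, learning rate, and random binary batch are assumptions for the example, and this is not the RBM implementation from Unsupervised Lensing.

```python
import torch

torch.manual_seed(0)
n_visible, n_hidden, lr = 16, 8, 0.1  # hypothetical toy sizes and learning rate

W = torch.randn(n_visible, n_hidden) * 0.01  # visible-hidden weights
b_visible = torch.zeros(n_visible)
b_hidden = torch.zeros(n_hidden)


def sample_hidden(v):
    """Return activation probabilities and a binary sample of the hidden layer."""
    prob = torch.sigmoid(v @ W + b_hidden)
    return prob, torch.bernoulli(prob)


def sample_visible(h):
    """Return activation probabilities and a binary sample of the visible layer."""
    prob = torch.sigmoid(h @ W.t() + b_visible)
    return prob, torch.bernoulli(prob)


v0 = torch.bernoulli(torch.full((32, n_visible), 0.5))  # random binary batch

# Positive phase: hidden activations driven by the data.
h0_prob, h0 = sample_hidden(v0)
# Negative phase: one Gibbs step gives a reconstruction and its hidden activations.
v1_prob, v1 = sample_visible(h0)
h1_prob, _ = sample_hidden(v1)

# CD-1 update: data correlations minus reconstruction correlations.
batch_size = v0.shape[0]
W += lr * (v0.t() @ h0_prob - v1.t() @ h1_prob) / batch_size
b_visible += lr * (v0 - v1).mean(dim=0)
b_hidden += lr * (h0_prob - h1_prob).mean(dim=0)

recon_error = torch.mean((v0 - v1_prob) ** 2).item()
print(recon_error)
```

The reconstruction error printed at the end plays the same role as the reconstruction loss of the autoencoders when the RBM is used for anomaly detection.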

A detailed architecture of all the models can be found in Appendix B of the paper.

Implementation using the PyTorch tool:

from unsupervised_lensing.models import RBM_Model
from unsupervised_lensing.models.RBM_Nets import *
from unsupervised_lensing.utils import loss_plotter as plt
from unsupervised_lensing.utils.EMD_Lensing import EMD

# Model Training
out = RBM_Model.train(data_path='./Data/no_sub_train.npy',
                      epochs=100,
                      learning_rate=2e-3,
                      optimizer='Adam',
                      checkpoint_path='./Weights',
                      pretrain=True,
                      pretrain_mode='transfer',
                      pretrain_model='A')

# Plot the training loss
plt.plot_loss(out)

# Model Validation
recon_loss = RBM_Model.evaluate(data_path='./Data/no_sub_test.npy',
                                checkpoint_path='./Weights',
                                out_path='./Results')

# Plot the reconstruction loss
plt.plot_dist(recon_loss)

# Calculate Wasserstein distance
print(EMD(data_path='./Data/no_sub_test.npy', recon_path='./Results/Recon_samples.npy'))

Results

I used 25,000 samples with no substructure for training and 2,500 validation samples per class for evaluating the unsupervised models. The models are implemented using the PyTorch package and run on a single NVIDIA Tesla K80 GPU for 500 epochs. We use the area under the ROC curve (AUC) as the metric for classifier performance for all our models. For the unsupervised models, the ROC values are calculated for a set threshold of the reconstruction loss. We also use the Wasserstein distance to compare the fidelity of reconstructions. A more detailed set of results can be found in the paper.
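To show how a reconstruction loss turns into an AUC, here is a small standard-library sketch. It treats each sample's reconstruction loss as an anomaly score and computes the AUC as the probability that a randomly chosen sample with substructure scores higher than a randomly chosen sample without it, counting ties as one half. The function name `roc_auc` and the loss values are made up for illustration and are not results from the project.

```python
def roc_auc(scores, labels):
    """AUC as the probability that an anomaly (label 1) outscores a normal sample (label 0)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy reconstruction losses: label 0 = no substructure, label 1 = substructure.
recon_losses = [0.010, 0.012, 0.011, 0.030, 0.025, 0.011]
labels = [0, 0, 0, 1, 1, 1]

auc = roc_auc(recon_losses, labels)
print(round(auc, 4))
```

An AUC of 1.0 would mean every image with substructure is reconstructed worse than every image without it, while 0.5 corresponds to random guessing.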

Future Work and Final Thoughts

Although we obtained some very promising results for our unsupervised models, there is still room for improvement in their performance compared to the supervised results of the ResNet model. I am currently exploring the application of graph-based models, since they have been successful in tasks involving sparse datasets such as sparse 3D point clouds and sparse detector data. Another future task is to use transfer learning to train our architecture on real data, starting from our models that have been trained on simulations.

I want to thank my mentors Michael Toomey, Sergei Gleyzer, Stephon Alexander, and Emanuele Usai, and the entire CERN-HSF community for their support. I had a great summer working on my GSoC project. I also want to thank Ali Hariri, Hanna Parul, and Ryker Von Klar for their useful discussions.

To students who want to participate in GSoC in the future, don’t view GSoC as a competition or an exam that needs to be “cracked”. GSoC is about open-source development and becoming a part of a wonderful community of developers. Find projects that you are passionate about and understand the requirements of the organization. Most importantly, stay active on community forums and interact with your project mentors regularly.

Thank you, Google, for giving me such an amazing opportunity.
