Fourier Features for Learning High Frequencies: Related Work + Experimental Analysis + Code Implementation

Personally, I think this task has a distinctive property: the input data is (roughly) uniformly distributed over the coordinate domain.

Related experiments with coordinate-based MLPs

Coordinate-based MLPs are appealing because they are amenable to gradient-based optimization and machine learning, and they can be more compact than grid-sampled representations. This strategy has achieved state-of-the-art results across a wide range of tasks, including:

  • Image representation (compositional pattern producing networks)
    • Anh Nguyen, Jason Yosinski, and Jeff Clune. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. CVPR, 2015.
    • Kenneth O. Stanley. Compositional pattern producing networks: A novel abstraction of development. Genetic Programming and Evolvable Machines, 2007.
  • Volume density
    • Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. NeRF: Representing scenes as neural radiance fields for view synthesis. arXiv preprint arXiv:2003.08934, 2020.
  • Occupancy
    • Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, and Andreas Geiger. Occupancy networks: Learning 3D reconstruction in function space. CVPR, 2019.
  • Signed distance
    • Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. DeepSDF: Learning continuous signed distance functions for shape representation. CVPR, 2019.
  • Shape representation
    • Zhiqin Chen and Hao Zhang. Learning implicit fields for generative shape modeling. CVPR, 2019.
    • Boyang Deng, JP Lewis, Timothy Jeruzalski, Gerard Pons-Moll, Geoffrey Hinton, Mohammad Norouzi, and Andrea Tagliasacchi. Neural articulated shape approximation. arXiv preprint arXiv:1912.03207, 2019.
    • Kyle Genova, Forrester Cole, Aaron Sarna, Daniel Vlasic, William T. Freeman, and Thomas Funkhouser. Learning shape templates with structured implicit functions. ICCV, 2019.
    • Kyle Genova, Forrester Cole, Avneesh Sud, Aaron Sarna, and Thomas Funkhouser. Local deep implicit functions for 3D shape. CVPR, 2020.
    • Chiyu Jiang, Avneesh Sud, Ameesh Makadia, Jingwei Huang, Matthias Nießner, and Thomas Funkhouser. Local implicit grid representations for 3D scenes. CVPR, 2020.
    • Mateusz Michalkiewicz, Jhony K Pontes, Dominic Jack, Mahsa Baktashmotlagh, and Anders Eriksson. Implicit surface representations as layers in neural networks. ICCV, 2019.
    • Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. DeepSDF: Learning continuous signed distance functions for shape representation. CVPR, 2019.
  • Texture synthesis
    • Philipp Henzler, Niloy J Mitra, and Tobias Ritschel. Learning a neural 3d texture space from 2d exemplars. CVPR, 2020.
    • Michael Oechsle, Lars Mescheder, Michael Niemeyer, Thilo Strauss, and Andreas Geiger. Texture fields: Learning texture representations in function space. ICCV, 2019.
  • Shape inference from images
    • Shaohui Liu, Yinda Zhang, Songyou Peng, Boxin Shi, Marc Pollefeys, and Zhaopeng Cui. DIST: Rendering deep implicit signed distance function with differentiable sphere tracing. CVPR, 2020.
    • Shichen Liu, Shunsuke Saito, Weikai Chen, and Hao Li. Learning to infer implicit surfaces without 3D supervision. NeurIPS, 2019.
  • Novel view synthesis
    • Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi,and Ren Ng. NeRF: Representing scenes as neural radiance fields for view synthesis. arXiv preprint arXiv:2003.08934, 2020.
    • Michael Niemeyer, Lars Mescheder, Michael Oechsle, and Andreas Geiger. Differentiable volumetric rendering: Learning implicit 3D representations without 3D supervision. CVPR, 2020.
    • Shunsuke Saito, Zeng Huang, Ryota Natsume, Shigeo Morishima, Angjoo Kanazawa, and Hao Li. PIFu: Pixel-aligned implicit function for high-resolution clothed human digitization. ICCV, 2019.
    • Vincent Sitzmann, Michael Zollhoefer, and Gordon Wetzstein. Scene representation networks: Continuous 3D-structure-aware neural scene representations. NeurIPS, 2019.

In Section 6 of the paper, the higher-dimensional experiments use a Gaussian distribution for the Fourier feature mapping and treat its scale as a hyperparameter tuned on a validation dataset. Constructing the mapping requires randomly sampling its parameters from that Gaussian; the effect of the distribution's standard deviation on the results is shown below:

[Figures: reconstruction quality for different standard deviations of the Gaussian Fourier feature mapping. All reported metrics are PSNR, except 3D shape, which uses IoU (higher is better for all).]
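
For reference, PSNR here is just the mean squared reconstruction error expressed in decibels; a minimal helper of my own (assuming images scaled to [0, 1]), not code from the paper:

import numpy as np

def psnr(pred, target, max_val=1.0):
    # Peak signal-to-noise ratio in dB for images in [0, max_val]; higher is better.
    mse = np.mean((np.asarray(pred) - np.asarray(target)) ** 2)
    return 20.0 * np.log10(max_val) - 10.0 * np.log10(mse)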

Notes on "On the Spectral Bias of Deep Neural Networks"
Why are deep neural networks not prone to overfitting? The inherent spectral bias: On the Spectral Bias of Neural Networks
GitHub code
Paper, 18 Jun 2020
Paper notes 1
Paper notes 2
video
Related links:
NeRF
https://dellaert.github.io/NeRF/
What are the highlights of ICLR 2019 worth paying attention to? We finally proved that, as long as it is wide enough, a randomly initialized neural network trained with gradient descent really can fit all the data!
Gradient Descent Provably Optimizes Over-parameterized Neural Networks

Related work

Our work is motivated by the widespread use of coordinate-based MLPs to represent a variety of visual signals, including images [38] and 3D scenes [24, 27, 32]. In particular, our analysis is meant to explain the experimental results showing that an input mapping of coordinates using sinusoids with log-spaced, axis-aligned frequencies (which those works call a "positional encoding") improves the performance of coordinate-based MLPs on the tasks of novel view synthesis from 2D images [27] and protein structure modeling from cryo-electron microscopy [44]. We analyze this technique to show that it corresponds to a modification of the MLP's NTK, and we show that other, non-axis-aligned frequency distributions can outperform this positional encoding.
Previous work in natural language processing and time series analysis [18, 39, 42] has used similar positional encodings to represent time or 1D positions. In particular, Xu et al. [42] use random Fourier features (RFF) [1] to approximate stationary kernels with a sinusoidal input mapping and propose techniques for tuning the mapping parameters. Our work extends this by interpreting the mapping directly as a modification of the resulting network's NTK. In addition, we address the embedding of multidimensional coordinates, which is necessary for vision and graphics tasks.
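
As a quick reminder of the RFF idea from Rahimi and Recht [1], here is a small numpy sketch of my own (not code from [42]): with frequencies B drawn from a Gaussian, the inner product of the sin/cos features approximates a stationary Gaussian kernel.

import numpy as np

def rff_features(x, B):
    # x: (n, d) inputs; B: (m, d) frequencies drawn from N(0, sigma^2).
    proj = 2 * np.pi * x @ B.T
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1) / np.sqrt(B.shape[0])

rng = np.random.default_rng(0)
sigma = 1.0
B = rng.normal(0.0, sigma, size=(4096, 2))
x, y = rng.random((1, 2)), rng.random((1, 2))
approx = float(rff_features(x, B) @ rff_features(y, B).T)
exact = float(np.exp(-2 * np.pi**2 * sigma**2 * np.sum((x - y)**2)))
# approx converges to exact as the number of sampled frequencies m grows.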

To analyze the effect of applying a Fourier feature mapping to the input coordinates before they are passed through an MLP, we rely on recent theoretical work that models neural networks, in the limit of infinite width and infinitesimally small learning rate, as kernel regression with the NTK [2, 5, 11, 16, 20].
In particular, we use the analyses of Lee et al. [20] and Arora et al. [2], which show that during gradient descent the network's output stays close to that of a linear dynamical system whose convergence rate is governed by the eigenvalues of the NTK matrix [2, 3, 5, 20, 43]. Analysis of the NTK's eigendecomposition shows that its eigenvalue spectrum decays rapidly with frequency, which explains the widely observed "spectral bias" of deep networks toward learning low-frequency functions [3, 4, 33].
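
To make this convergence statement concrete, here is a toy illustration of my own (not from the paper): under the linearized-training view, the error component along the i-th NTK eigenvector decays roughly like exp(-eta * lambda_i * t), so directions with small eigenvalues (the high-frequency ones) are fit much more slowly.

import numpy as np

# Toy NTK eigenvalues: large values correspond to low-frequency directions.
eigenvalues = np.array([1.0, 1e-1, 1e-2, 1e-3])
eta = 1.0                               # learning rate
t = np.arange(500)                      # training steps
# Residual error along each eigenvector under linearized gradient-descent dynamics.
residuals = np.exp(-eta * eigenvalues[:, None] * t[None, :])   # shape (4, 500)
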
We build on this analysis to consider the effect of adding a Fourier feature mapping before the network, and we show that such a mapping has a significant effect on the NTK's eigenvalue spectrum and, correspondingly, on the network's convergence properties in practice.
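
For reference, the log-spaced, axis-aligned "positional encoding" discussed above can be sketched as follows. This is my own minimal PyTorch version of the standard form gamma(v) = (sin(2^0*pi*v), cos(2^0*pi*v), ..., sin(2^(L-1)*pi*v), cos(2^(L-1)*pi*v)), not the official implementation:

import math
import torch

def positional_encoding(x, num_freqs=10):
    # x: (..., d) coordinates in [0, 1]; returns (..., 2 * num_freqs * d) features.
    # Frequencies are axis-aligned and log-spaced: 2^0, 2^1, ..., 2^(num_freqs-1).
    freqs = 2.0 ** torch.arange(num_freqs, dtype=x.dtype, device=x.device)
    xp = math.pi * x[..., None] * freqs                      # (..., d, num_freqs)
    return torch.cat([torch.sin(xp), torch.cos(xp)], dim=-1).flatten(start_dim=-2)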

Model and training code

Official code

2D image regression

The core code of the official implementation is shown below (this example). For each input image coordinate (x, y), the model predicts the associated color (r, g, b).
In this task, we train an MLP to regress from a 2D input pixel coordinate to the corresponding RGB value of an image. For each test image, we train the MLP on a regularly spaced grid containing 1/4 of the pixels and report the test error on the remaining pixels (the PyTorch scripts below realize this split with img[::2, ::2]). We compare the input mappings on a dataset of natural images and a dataset of text images.

B_dict = {}
# Standard network - no mapping
B_dict['none'] = None
# Basic mapping: B is the identity, i.e. gamma(v) = [sin(2*pi*v), cos(2*pi*v)]
B_dict['basic'] = np.eye(2)
# Three different scales of Gaussian Fourier feature mappings
B_gauss = random.normal(rand_key, (mapping_size, 2))
for scale in [1., 10., 100.]:
    B_dict[f'gauss_{scale}'] = B_gauss * scale

# This should take about 2-3 minutes
outputs = {}
for k in tqdm(B_dict):
    outputs[k] = train_model(network_size, learning_rate, iters, B_dict[k], train_data, test_data)
# Inside train_model, the network is created and applied to the mapped coordinates:
init_fn, apply_fn = make_network(*network_size)
apply_fn(params, input_mapping(x, B))
# JAX network definition
def make_network(num_layers, num_channels):
    layers = []
    for i in range(num_layers - 1):
        layers.append(stax.Dense(num_channels))
        layers.append(stax.Relu)
    layers.append(stax.Dense(3))
    layers.append(stax.Sigmoid)
    return stax.serial(*layers)
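
The training loop above calls input_mapping, which the snippet does not show. In the official demo it is, roughly, the following (my paraphrase of the Colab code, written with plain numpy as in the snippet above; the original uses jax.numpy). When B is None the coordinates are passed through unchanged; otherwise they are projected with B and expanded into sines and cosines:

def input_mapping(x, B):
    # No mapping: feed the raw coordinates to the MLP.
    if B is None:
        return x
    # Fourier feature mapping: gamma(x) = [sin(2*pi*x B^T), cos(2*pi*x B^T)].
    x_proj = (2. * np.pi * x) @ B.T
    return np.concatenate([np.sin(x_proj), np.cos(x_proj)], axis=-1)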

Project: https://github.com/Atom-101/FourierFeat-Siren

Pytorch implementation and comparison of Fourier Feature Networks and Sinusoidal Representation Networks

MLP(
  (layers): Sequential(
    (0): Linear(in_features=512, out_features=256, bias=True)
    (1): ReLU(inplace=True)
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU(inplace=True)
    (4): Linear(in_features=256, out_features=256, bias=True)
    (5): ReLU(inplace=True)
    (6): Linear(in_features=256, out_features=3, bias=True)
  )
)
# Fourier feature mapping: project coordinates with B, then concatenate sin and cos
# (doubles the feature dimension, e.g. a 2x256 B gives 512 features).
def map_x(x,B):
    xp = torch.matmul(2*math.pi*x,B)
    return torch.cat([torch.sin(xp),torch.cos(xp)],dim=-1)

# Alternative mapping: sine of the projection only (no cosine term).
def map_x(x,B):
    xp = torch.matmul(2*math.pi*x,B)
    return torch.sin(xp)

====================================================

# Image taken from authors' colab demo: https://colab.research.google.com/github/tancik/fourier-feature-networks/blob/master/Demo.ipynb
import numpy as np
import matplotlib.pyplot as plt
# from tqdm.notebook import tqdm as tqdm
from tqdm import *
import os, imageio
from imageio import imread,imsave
import torch
import torch.nn as nn
import torch.nn.functional as F
import math

# Download image, take a square crop from the center
image_url = 'https://live.staticflickr.com/7492/15677707699_d9d67acf9d_b.jpg'
img = imageio.imread(image_url)[..., :3] / 255.
c = [img.shape[0]//2, img.shape[1]//2]
r = 256
img = img[c[0]-r:c[0]+r, c[1]-r:c[1]+r]

# plt.imshow(img)
# plt.show()

# Create input pixel coordinates in the unit square
coords = np.linspace(0, 1, img.shape[0], endpoint=False)
x_test = np.stack(np.meshgrid(coords, coords), -1)
test_data = [x_test, img]
train_data = [x_test[::2,::2], img[::2,::2]]

#%%

class MLP(nn.Module):
    def __init__(self,depth=4,mapping_size=512,hidden_size=256):
        super().__init__()
        layers = []
        layers.append(nn.Linear(mapping_size,hidden_size))
        layers.append(nn.ReLU(inplace=True))
        # for _ in range(depth-2):
        #     layers.append(nn.Linear(hidden_size,hidden_size))
        #     layers.append(nn.ReLU(inplace=True))
        layers.append(nn.Linear(hidden_size,3))
        self.layers = nn.Sequential(*layers)
    def forward(self,x):
        return torch.sigmoid(self.layers(x))

#%%

xb,yb = torch.tensor(train_data[0]).reshape(-1,2),torch.tensor(train_data[1]).reshape(-1,3)
x_test,y_test = torch.tensor(test_data[0]).reshape(-1,2),torch.tensor(test_data[1]).reshape(-1,3)
xb,yb,x_test,y_test = xb.float().cuda(),yb.float().cuda(),x_test.float().cuda(),y_test.float().cuda()


#%% md

# Original Mapping

#%%

def map_x(x,B):
    xp = torch.matmul(2*math.pi*x,B)
    return torch.cat([torch.sin(xp),torch.cos(xp)],dim=-1)

#%%

model = MLP().cuda()
opt = torch.optim.Adam(model.parameters(),lr=1e-4)
loss = nn.MSELoss()
B = torch.randn(2, 256).cuda() * 10            # Gaussian frequency matrix, scale 10
xt = map_x(xb, B)                              # [65536, 2] @ [2, 256] -> sin/cos -> [65536, 512]
os.makedirs('gaussian', exist_ok=True)         # directory for per-iteration predictions
for i in tqdm(range(1000)):
    ypred = model(xt)
    l = loss(ypred, yb)
    opt.zero_grad()
    l.backward()
    opt.step()
    model.eval()
    with torch.no_grad():
        ypreds = model(map_x(x_test, B))
        ypreds = ypreds.reshape(512, 512, 3)
        imsave('gaussian/gaussian' + str(i) + '.png', (ypreds * 255).cpu().numpy().astype(np.uint8))


# Final predictions on the full-resolution grid (move everything to CPU first)
model.cpu().eval()
with torch.no_grad():
    ypreds = model(map_x(x_test.cpu(), B.cpu()))
    ypreds = ypreds.reshape(512, 512, 3)
    imsave('gaussian.png', (ypreds * 255).numpy().astype(np.uint8))
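
y_test is prepared above but never used; a small follow-up of my own (assuming the model and ypreds from the script above) reports the test PSNR on the full-resolution grid:

# My addition, not part of the original script: PSNR over the full 512x512 grid.
with torch.no_grad():
    mse = F.mse_loss(ypreds.reshape(-1, 3), y_test.cpu()).item()
print('test PSNR: %.2f dB' % (-10.0 * math.log10(mse)))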

========================================================================

# Image taken from authors' colab demo: https://colab.research.google.com/github/tancik/fourier-feature-networks/blob/master/Demo.ipynb
import numpy as np
import matplotlib.pyplot as plt
# from tqdm.notebook import tqdm as tqdm
from tqdm import *
import os, imageio
from imageio import imread,imsave
import torch
import torch.nn as nn
import torch.nn.functional as F
import math

# Download image, take a square crop from the center
image_url = 'https://live.staticflickr.com/7492/15677707699_d9d67acf9d_b.jpg'
img = imageio.imread(image_url)[..., :3] / 255.
c = [img.shape[0]//2, img.shape[1]//2]
r = 256
img = img[c[0]-r:c[0]+r, c[1]-r:c[1]+r]

# plt.imshow(img)
# plt.show()

# Create input pixel coordinates in the unit square
coords = np.linspace(0, 1, img.shape[0], endpoint=False)
x_test = np.stack(np.meshgrid(coords, coords), -1)
test_data = [x_test, img]
train_data = [x_test[::2,::2], img[::2,::2]]

#%%

class MLP(nn.Module):
    def __init__(self,depth=4,mapping_size=512,hidden_size=256):
        super().__init__()
        layers = []
        layers.append(nn.Linear(mapping_size,hidden_size))
        layers.append(nn.ReLU(inplace=True))
        # for _ in range(depth-2):
        #     layers.append(nn.Linear(hidden_size,hidden_size))
        #     layers.append(nn.ReLU(inplace=True))
        layers.append(nn.Linear(hidden_size,3))
        self.layers = nn.Sequential(*layers)
    def forward(self,x):
        return torch.sigmoid(self.layers(x))

#%%

xb,yb = torch.tensor(train_data[0]).reshape(-1,2),torch.tensor(train_data[1]).reshape(-1,3)
x_test,y_test = torch.tensor(test_data[0]).reshape(-1,2),torch.tensor(test_data[1]).reshape(-1,3)
xb,yb,x_test,y_test = xb.float().cuda(),yb.float().cuda(),x_test.float().cuda(),y_test.float().cuda()


#%% md

# Original Mapping

#%%

def map_x(x,B):
    xp = torch.matmul(2*math.pi*x,B)
    return torch.cat([torch.sin(xp),torch.cos(xp)],dim=-1)

#%%

model = MLP().cuda()
opt = torch.optim.Adam(model.parameters(),lr=1e-4)
loss = nn.MSELoss()

# Modification for this experiment: instead of Fourier-mapped coordinates,
# feed the network a fixed random code per pixel and ask it to memorize the colors.
xb = torch.randn(65536, 2).cuda()               # random "coordinates" (unused below)
B = torch.randn(2, 256).cuda() * 10
xt = torch.randn(65536, 256 * 2).cuda()         # random features in place of map_x(xb, B) -> [65536, 512]
os.makedirs('gaussian', exist_ok=True)
for i in tqdm(range(1000)):
    ypred = model(xt)
    l = loss(ypred, yb)
    opt.zero_grad()
    l.backward()
    opt.step()
    model.eval()
    with torch.no_grad():
        ypreds = model(xt)                      # evaluate on the same random inputs (training fit)
        ypreds = ypreds.reshape(256, 256, 3)
        imsave('gaussian/gaussian' + str(i) + '.png', (ypreds * 255).cpu().numpy().astype(np.uint8))





# A slightly more general MLP with a configurable output dimension (otherwise the same as above).
class MLP(nn.Module):
    def __init__(self,outdim=3,mapping_size=512,hidden_size=256):
        super().__init__()
        layers = []
        layers.append(nn.Linear(mapping_size,hidden_size))
        layers.append(nn.ReLU(inplace=True))
        # for _ in range(depth-2):
        #     layers.append(nn.Linear(hidden_size,hidden_size))
        #     layers.append(nn.ReLU(inplace=True))
        layers.append(nn.Linear(hidden_size,outdim))
        self.layers = nn.Sequential(*layers)
    def forward(self,x):
        return torch.sigmoid(self.layers(x))

[1] Ali Rahimi and Benjamin Recht. Random features for large-scale kernel machines. NeurIPS, 2007.
