0. Preface
This post is based on the tutorial in the official documentation. Its goal is to explain how hyperparameters work in GPyTorch and how to tune them, what options are available for constraints and priors, and how this differs from other packages.
import math
import numpy as np
import torch
from matplotlib import pyplot as plt
from gpytorch import kernels, means, models, mlls, settings, likelihoods, constraints, priors
from gpytorch import distributions as distr
1. Defining the model
The model is the same as in the basic tutorial. We will use it to demonstrate some operations on the hyperparameters.
# Training data is 100 points in [0,1] inclusive regularly spaced
train_x = torch.linspace(0, 1, 100)
# True function is sin(2*pi*x) with Gaussian noise
train_y = torch.sin(train_x * (2 * math.pi)) + torch.randn(train_x.size()) * math.sqrt(0.04)
# We will use the simplest form of GP model, exact inference
class ExactGPModel(models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super(ExactGPModel, self).__init__(train_x, train_y, likelihood)
        self.mean_module = means.ConstantMean()
        self.covar_module = kernels.ScaleKernel(kernels.RBFKernel())

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return distr.MultivariateNormal(mean_x, covar_x)
# initialize likelihood and model
likelihood = likelihoods.GaussianLikelihood()
model = ExactGPModel(train_x, train_y, likelihood)
2. Inspecting the model parameters
Let's take a look at the model's parameters. Every parameter mentioned here is a torch.nn.Parameter whose gradient is populated during automatic differentiation. There are two ways to inspect them; one is model.state_dict(), which is also what we would use to save the model, as sketched below.
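As a side note, here is a minimal sketch (my addition, not in the original post; the filename is arbitrary) of how state_dict can be used to save and restore the hyperparameters:
# Save all raw hyperparameters (a plain dict of tensors) to disk
torch.save(model.state_dict(), 'model_state.pth')

# Later: rebuild the model skeleton, then load the saved values back in
state_dict = torch.load('model_state.pth')
model = ExactGPModel(train_x, train_y, likelihood)
model.load_state_dict(state_dict)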
In the rest of this section we use the other method: iterating over the model.named_parameters() generator.
for param_name, param in model.named_parameters():
    print(f'Parameter name: {param_name:42} value = {param.item()}')
2.1. Raw vs. actual parameters
Note in particular that the parameters the model actually learns are the raw ones, such as raw_noise, raw_outputscale, and raw_lengthscale. The reason is that the corresponding actual values must be positive. This brings us to the next topic about parameters: constraints, and the difference between raw and actual parameter values.
To enforce positivity and other conditions, GPyTorch maps raw parameters to actual values through some constraint. Let's look at the raw outputscale, its constraint, and the resulting actual value.
raw_outputscale = model.covar_module.raw_outputscale
print('raw_outputscale, ', raw_outputscale)
# Three ways of accessing the raw outputscale constraint
print('\nraw_outputscale_constraint1', model.covar_module.raw_outputscale_constraint)
print('\n\n**Printing all model constraints...**\n')
for constraint_name, constraint in model.named_constraints():
    print(f'Constraint name: {constraint_name:55} constraint = {constraint}')
print('\n**Getting raw outputscale constraint from model...**')
print(model.constraint_for_parameter_name("covar_module.raw_outputscale"))
print('\n**Getting raw outputscale constraint from model.covar_module...**')
print(model.covar_module.constraint_for_parameter_name("raw_outputscale"))
(Translator's note: the lengthscale controls the frequency of variation along the input axis; I believe the outputscale controls the amplitude, i.e., variation along the output axis.)
2.2. How constraints work
Constraints define transform and inverse_transform methods that convert raw parameters into actual ones. For a positive constraint, we expect the transformed parameter to always be greater than zero.
# Use the outputscale constraint explicitly (the loop variable above may hold a different constraint)
constraint = model.covar_module.raw_outputscale_constraint
print('Transformed outputscale', constraint.transform(raw_outputscale))
print(constraint.inverse_transform(constraint.transform(raw_outputscale)))
print(torch.equal(constraint.inverse_transform(constraint.transform(raw_outputscale)), raw_outputscale))
print('Transform a bunch of negative tensors: ', constraint.transform(torch.tensor([-1., -2., -3.])))
(Translator's note: I used to assume that $\mathrm{Positive}(\cdot)$ was the absolute-value function, but it turns out to be the softplus function, written $\zeta(x)$: $\zeta(x)=\log(1+e^x)$. This function is always positive and monotonically increasing, with $\lim\limits_{x\rightarrow\infty}\zeta(x)=x$ and $\lim\limits_{x\rightarrow-\infty}\zeta(x)=0$.)
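A quick check (my own addition, not in the original tutorial) that the Positive constraint's transform really is softplus:
import torch.nn.functional as F

x = torch.tensor([-1., 0., 1.])
print(constraints.Positive().transform(x))  # softplus applied elementwise
print(F.softplus(x))                        # identical values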
2.3. Convenient getters/setters for the actual values
Working with raw parameters is inconvenient (we can easily interpret a noise variance of 0.01, but it is hard to say what a raw_noise of -2.791 means). In fact, every built-in GPyTorch module that defines raw parameters also defines getters and setters that operate directly on the transformed actual values.
Below we contrast the cumbersome way of getting/setting a parameter value with the convenient one.
# Recreate model to reset outputscale
model = ExactGPModel(train_x, train_y, likelihood)
# Inconvenient way of getting true outputscale
raw_outputscale = model.covar_module.raw_outputscale
constraint = model.covar_module.raw_outputscale_constraint
outputscale = constraint.transform(raw_outputscale)
print(f'Actual outputscale: {outputscale.item()}')
# Inconvenient way of setting true outputscale
model.covar_module.raw_outputscale.data.fill_(constraint.inverse_transform(torch.tensor(2.)))
raw_outputscale = model.covar_module.raw_outputscale
outputscale = constraint.transform(raw_outputscale)
print(f'Actual outputscale after setting: {outputscale.item()}')
Phew, what a hassle! Fortunately, GPyTorch gives us a better way:
# Recreate model to reset outputscale
model = ExactGPModel(train_x, train_y, likelihood)
# Convenient way of getting true outputscale
print(f'Actual outputscale: {model.covar_module.outputscale}')
# Convenient way of setting true outputscale
model.covar_module.outputscale = 2.
print(f'Actual outputscale after setting: {model.covar_module.outputscale}')
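The same convenience properties exist on other modules too; for example (my own addition), the base kernel's lengthscale can be read and set directly:
# Convenient getter/setter for the RBF kernel's lengthscale
print(f'Actual lengthscale: {model.covar_module.base_kernel.lengthscale}')
model.covar_module.base_kernel.lengthscale = 0.5
print(f'Actual lengthscale after setting: {model.covar_module.base_kernel.lengthscale}')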
3. Changing the parameter constraints
If we look at the model's actual noise, we will find that GPyTorch places a lower bound of 1e-4 on it.
print(f'Actual noise value: {likelihood.noise}')
print(f'Noise constraint: {likelihood.noise_covar.raw_noise_constraint}')
We can change the noise constraint on the fly, or set it when the likelihood object is created.
likelihood = likelihoods.GaussianLikelihood(noise_constraint=constraints.GreaterThan(1e-3))
print(f'Noise constraint: {likelihood.noise_covar.raw_noise_constraint}')
# Changing the constraint after the module has been created
likelihood.noise_covar.register_constraint("raw_noise", constraints.Positive())
print(f'Noise constraint: {likelihood.noise_covar.raw_noise_constraint}')
(Translator's note: besides Positive and GreaterThan, constraints also provides LessThan and Interval constraints.)
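For instance (my own addition), an Interval constraint can box the lengthscale into a fixed range:
# Constrain the lengthscale to lie in [0.1, 1.0]
kernel = kernels.RBFKernel(lengthscale_constraint=constraints.Interval(0.1, 1.0))
print(kernel.raw_lengthscale_constraint)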
4. Priors
In GPyTorch, priors are stored in the model and can be placed on arbitrary functions of any parameter. Like constraints, they can be set when the model is created, or changed dynamically afterwards.
# Registers a prior on the sqrt of the noise parameter
# (e.g., a prior for the noise standard deviation instead of variance)
likelihood.noise_covar.register_prior(
    "noise_std_prior",
    priors.NormalPrior(0, 1),
    lambda module: module.noise.sqrt()
)

# Create a GaussianLikelihood with a normal prior for the noise
likelihood = likelihoods.GaussianLikelihood(
    noise_constraint=constraints.GreaterThan(1e-3),
    noise_prior=priors.NormalPrior(0, 1)
)
5. Putting it all together
Below we round out our model definition by placing some hand-picked priors and tighter constraints on the hyperparameters.
class FancyGPWithPriors(models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super(FancyGPWithPriors, self).__init__(train_x, train_y, likelihood)
        self.mean_module = means.ConstantMean()

        lengthscale_prior = priors.GammaPrior(3.0, 6.0)
        outputscale_prior = priors.GammaPrior(2.0, 0.15)

        self.covar_module = kernels.ScaleKernel(
            kernels.RBFKernel(
                lengthscale_prior=lengthscale_prior,
            ),
            outputscale_prior=outputscale_prior
        )

        # Initialize lengthscale and outputscale to mean of priors
        self.covar_module.base_kernel.lengthscale = lengthscale_prior.mean
        self.covar_module.outputscale = outputscale_prior.mean

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return distr.MultivariateNormal(mean_x, covar_x)

likelihood = likelihoods.GaussianLikelihood(
    noise_constraint=constraints.GreaterThan(1e-2),
)
model = FancyGPWithPriors(train_x, train_y, likelihood)
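With priors registered, the natural next step is training. Here is a minimal sketch (my own addition, following the standard GPyTorch pattern; the iteration count and learning rate are arbitrary) showing where the mlls import comes in:
model.train()
likelihood.train()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
mll = mlls.ExactMarginalLogLikelihood(likelihood, model)

for i in range(50):
    optimizer.zero_grad()
    output = model(train_x)
    loss = -mll(output, train_y)  # registered priors contribute log-prior terms to this loss
    loss.backward()
    optimizer.step()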
6. Initializing everything at once
For convenience, GPyTorch also defines an initialize method that lets us update a module's parameters from a dictionary.
hypers = {
    'likelihood.noise_covar.noise': torch.tensor(1.),
    'covar_module.base_kernel.lengthscale': torch.tensor(0.5),
    'covar_module.outputscale': torch.tensor(2.),
}

model.initialize(**hypers)
print(
    model.likelihood.noise_covar.noise.item(),
    model.covar_module.base_kernel.lengthscale.item(),
    model.covar_module.outputscale.item()
)