激活函数之ReLU/softplus介绍及C++实现

最新推荐文章于 2024-08-01 19:40:18 发布

fengbingchun

最新推荐文章于 2024-08-01 19:40:18 发布

阅读量1.2w

点赞数 2

本文链接：https://blog.csdn.net/fengbingchun/article/details/73872828

版权

Deep Learning 同时被 3 个专栏收录

134 篇文章 122 订阅

订阅专栏

Caffe

51 篇文章 18 订阅

订阅专栏

Math Knowledge

31 篇文章 2 订阅

订阅专栏

softplus函数(softplus function)：ζ(x)=ln(1+exp(x)).

softplus函数可以用来产生正态分布的β和σ参数，因为它的范围是(0,∞)。当处理包含sigmoid函数的表达式时它也经常出现。softplus函数名字来源于它是另外一个函数的平滑(或”软化”)形式，这个函数是x⁺=max(0,x)。softplus 是对 ReLU 的平滑逼近的解析函数形式。

softplus函数别设计成正部函数(positive part function)的平滑版本，这个正部函数是指x⁺=max{0,x}。与正部函数相对的是负部函数(negative part function)x^-=max{0, -x}。为了获得类似负部函数的一个平滑函数，我们可以使用ζ(-x)。就像x可以用它的正部和负部通过等式x⁺-x^-=x恢复一样，我们也可以用同样的方式对ζ(x)和ζ(-x)进行操作，就像下式中那样：ζ(x) -ζ(-x)=x.

Rectifier:In the context of artificial neural networks, the rectifier is an activation function defined as:

f(x)=max(0,x)

where x is the input to a neuron. This activation function was first introduced to a dynamical network by Hahnloser et al. in a 2000 paper in Nature. It has been used in convolutional networks more effectively than the widely used logistic sigmoid (which is inspired by probability theory; see logistic regression) and its more practical counterpart, the hyperbolic tangent. The rectifier is, as of 2015, the most popular activation function for deep neural networks.

A unit employing the rectifier is also called a rectified linear unit (ReLU).

A smooth approximation to the rectifier is the analytic function: f(x)=ln(1+e^x), which is called the softplus function. The derivative of softplus is: f’(x)=e^x/(e^x+1)=1/(1+e^-x), i.e. the logistic function.

Rectified linear units(ReLU) find applications in computer vision and speech recognition using deep neural nets.

Noisy ReLUs: Rectified linear units can be extended to include Gaussian noise, making them noisy ReLUs, giving: f(x)=max(0, x+Y), with Y∽N(0, σ(x)). Noisy ReLUs have been used with some success in restricted Boltzmann machines for computer vision tasks.

Leaky ReLUs：allow a small, non-zero gradient when the unit is not active：

Parametric ReLUs take this idea further by making the coefficient of leakage into a parameter that is learned along with the other neural network parameters:

Note that for a≤1, this is equivalent to: f(x)=max(x, ax), and thus has a relation to "maxout" networks.

ELUs：Exponential linear units try to make the mean activations closer to zero which speeds up learning. It has been shown that ELUs can obtain higher classification accuracy than ReLUs：

a is a hyper-parameter to be tuned and a≥0 is a constraint.

以上内容摘自：《深度学习中文版》和维基百科

以下是C++测试code：

#include "funset.hpp"
#include <math.h>
#include <iostream>
#include <string>
#include <vector>
#include <opencv2/opencv.hpp>
#include "common.hpp"

// ========================= Activation Function: ELUs ========================
template<typename _Tp>
int activation_function_ELUs(const _Tp* src, _Tp* dst, int length, _Tp a = 1.)
{
	if (a < 0) {
		fprintf(stderr, "a is a hyper-parameter to be tuned and a>=0 is a constraint\n");
		return -1;
	}

	for (int i = 0; i < length; ++i) {
		dst[i] = src[i] >= (_Tp)0. ? src[i] : (a * (exp(src[i]) - (_Tp)1.));
	}

	return 0;
}

// ========================= Activation Function: Leaky_ReLUs =================
template<typename _Tp>
int activation_function_Leaky_ReLUs(const _Tp* src, _Tp* dst, int length)
{
	for (int i = 0; i < length; ++i) {
		dst[i] = src[i] > (_Tp)0. ? src[i] : (_Tp)0.01 * src[i];
	}

	return 0;
}

// ========================= Activation Function: ReLU =======================
template<typename _Tp>
int activation_function_ReLU(const _Tp* src, _Tp* dst, int length)
{
	for (int i = 0; i < length; ++i) {
		dst[i] = std::max((_Tp)0., src[i]);
	}

	return 0;
}

// ========================= Activation Function: softplus ===================
template<typename _Tp>
int activation_function_softplus(const _Tp* src, _Tp* dst, int length)
{
	for (int i = 0; i < length; ++i) {
		dst[i] = log((_Tp)1. + exp(src[i]));
	}

	return 0;
}

int test_activation_function()
{
	std::vector<double> src{ 1.23f, 4.14f, -3.23f, -1.23f, 5.21f, 0.234f, -0.78f, 6.23f };
	int length = src.size();
	std::vector<double> dst(length);

	fprintf(stderr, "source vector: \n");
	fbc::print_matrix(src);
	fprintf(stderr, "calculate activation function:\n");
	fprintf(stderr, "type: sigmoid result: \n");
	fbc::activation_function_sigmoid(src.data(), dst.data(), length);
	fbc::print_matrix(dst);
	fprintf(stderr, "type: sigmoid fast result: \n");
	fbc::activation_function_sigmoid_fast(src.data(), dst.data(), length);
	fbc::print_matrix(dst);
	fprintf(stderr, "type: softplus result: \n");
	fbc::activation_function_softplus(src.data(), dst.data(), length);
	fbc::print_matrix(dst);
	fprintf(stderr, "type: ReLU result: \n");
	fbc::activation_function_ReLU(src.data(), dst.data(), length);
	fbc::print_matrix(dst);
	fprintf(stderr, "type: Leaky ReLUs result: \n");
	fbc::activation_function_Leaky_ReLUs(src.data(), dst.data(), length);
	fbc::print_matrix(dst);
	fprintf(stderr, "type: Leaky ELUs result: \n");
	fbc::activation_function_ELUs(src.data(), dst.data(), length);
	fbc::print_matrix(dst);

	return 0;
}

GitHub： https://github.com/fengbingchun/NN_Test