Fitting 10,000 points with a degree-10 polynomial
import matplotlib.pyplot as plt
import random
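1. Data generation
# First make sure the 10000 x values are all distinct: sample distinct integers from 1 to 19999, then divide by 100 so that x lies in (0, 200)
num = random.sample(range(1,20000),10000)
num_x = []
for i in range(10000):
    num_x.append(num[i] / 100)
# Generate y the same way, except that the y values need not be distinct
num_y = [random.randint(0,100) for i in range(10000)]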
print(len(num_y))
10000
# Plot the points
fig,ax = plt.subplots()
ax.set_ylabel('Y')
ax.set_xlabel('X')
ax.set_title('Polynomial Fitting')
ax.plot(num_x,num_y)
[<matplotlib.lines.Line2D at 0x25fd50f8048>]
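- These points cannot be fitted at all, since x and y were drawn independently of each other. The data has to be regenerated. One idea: first compute the 10000 points from a degree-10 polynomial, then add random noise.
- Construct the polynomial
# Construct a random polynomial; the list holds the coefficients from the highest power down. Note that 10 coefficients give degree 9; a degree-10 polynomial needs 11, i.e. range(11)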
polynomial = [random.random() for i in range(10)]
print(polynomial)
# Evaluate the polynomial at x (Horner's rule)
def calculate(polynomial,x):
    n = len(polynomial)
    total = polynomial[0]
    for i in range(1,n):
        total *= x
        total += polynomial[i]
    return total
[0.053340626373557076, 0.2183335740410205, 0.9030944136075595, 0.7048816191727242, 0.18165101344430157, 0.9319508221357489, 0.4854520866475778, 0.5540870360221756, 0.5109581693655714, 0.5327543213626857]
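The evaluation function above is Horner's rule: multiply the running value by x, then add the next coefficient. As a small sanity check, here is a sketch that compares it with a term-by-term evaluation. The names `horner` and `naive` and the sample coefficients are illustrative assumptions, not part of the original notebook.

```python
def horner(coeffs, x):
    """Evaluate a polynomial whose coefficients run from the highest power down."""
    result = coeffs[0]
    for c in coeffs[1:]:
        result = result * x + c
    return result


def naive(coeffs, x):
    """Evaluate the same polynomial term by term with explicit powers."""
    degree = len(coeffs) - 1
    return sum(c * x ** (degree - i) for i, c in enumerate(coeffs))


# Illustrative polynomial: 2x^3 - 3x^2 + 0x + 5
coeffs = [2, -3, 0, 5]
value_horner = horner(coeffs, 2)
value_naive = naive(coeffs, 2)
print(value_horner, value_naive)  # both evaluate to 9
```

Horner's rule needs only one multiplication and one addition per coefficient, which matters when the same polynomial is evaluated at 10000 points in every training epoch.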
- Regenerate the data
n = 10000
polynomial_true = [0.02,0.012,-0.06,0.03,-0.04,0.05,0.4,0.1,-0.2,0.3,10] # true polynomial: 11 coefficients, degree 10
# As before, the x values must be distinct; here they are drawn uniformly from [-2.5, 2.5)
num_x = set()
while len(num_x)<n:
    k = (random.random()-0.5)*5
    num_x.add(k)
print(len(num_x))
num_x = list(num_x)
num_y = []
for i in range(n):
    # temp = calculate(polynomial_true,num_x[i]) + random.randint(0,10) # true value plus random noise
    temp = calculate(polynomial_true,num_x[i]) # on reflection, the random noise is not needed
    num_y.append(temp)
print(len(num_y))
10000
10000
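The notebook ends up dropping the noise, but the original idea (evaluate a known polynomial, then perturb the result) could look roughly like the sketch below. Everything here is an assumption made for illustration: the helper `poly_eval`, the low-degree `true_coeffs`, and the use of zero-mean Gaussian noise via `random.gauss` in place of the commented-out `random.randint(0,10)`, which would shift every point upward by 5 on average.

```python
import random


def poly_eval(coeffs, x):
    """Horner evaluation; coefficients run from the highest power down."""
    result = coeffs[0]
    for c in coeffs[1:]:
        result = result * x + c
    return result


random.seed(42)
true_coeffs = [0.5, -1.0, 2.0]  # illustrative: 0.5x^2 - x + 2
noise_std = 0.1                 # assumed noise level

xs = [random.uniform(-2.5, 2.5) for _ in range(1000)]
clean = [poly_eval(true_coeffs, x) for x in xs]
# Zero-mean noise keeps the noisy points centred on the true curve.
noisy = [y + random.gauss(0.0, noise_std) for y in clean]

max_dev = max(abs(a - b) for a, b in zip(noisy, clean))
print(len(noisy), max_dev)
```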
# Plot the points
fig,ax = plt.subplots()
ax.set_ylabel('Y')
ax.set_xlabel('X')
ax.set_title('Polynomial Fitting')
ax.scatter(num_x,num_y)
<matplotlib.collections.PathCollection at 0x25fd8486320>
2. Using the idea behind backpropagation (BP): build a predicted polynomial and reduce its error
Let the predicted value be $o$ and the true value be $y$. The error is then $loss = \frac{1}{2}(o-y)^2$. Define $l = (o-y)$, where the prediction $o$ is the polynomial
$o = a_1x^{10}+a_2x^9+a_3x^8+a_4x^7+a_5x^6+a_6x^5+a_7x^4+a_8x^3+a_9x^2+a_{10}x+a_{11}$
The coefficients should be adjusted in the direction that reduces the error fastest. Differentiating $loss$ with respect to $a_1, a_2, \dots$ gives $x^{10}l, x^9l, x^8l, \dots$, so the direction of steepest descent is $-x^{10}l, -x^9l, -x^8l, \dots$
With learning rate $lr$, the adjustments to $a_1, a_2, \dots$ are $-lr\,x^{10}l, -lr\,x^9l, \dots$
In summary, the change applied at each step to the $i$-th element of the list is $-lr\,x^{10-i}l,\ i=0,1,\dots,10$
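To make the derivation concrete, the sketch below computes the per-coefficient gradient $x^{p}l$ for one sample, checks it against a central finite difference, and applies a single update $-lr\,x^{p}l$. The function names, the quadratic example, and the values of `x`, `y` and `lr` are assumptions chosen for illustration only.

```python
def predict(coeffs, x):
    """Horner evaluation; coefficients run from the highest power down."""
    result = coeffs[0]
    for c in coeffs[1:]:
        result = result * x + c
    return result


def half_squared_loss(coeffs, x, y):
    return 0.5 * (predict(coeffs, x) - y) ** 2


def analytic_grad(coeffs, x, y):
    """d(loss)/d(a_i) = l * x^(degree - i), with l = o - y."""
    l = predict(coeffs, x) - y
    degree = len(coeffs) - 1
    return [l * x ** (degree - i) for i in range(len(coeffs))]


def numeric_grad(coeffs, x, y, eps=1e-6):
    """Central finite difference, used only to check the formula."""
    grads = []
    for i in range(len(coeffs)):
        up = list(coeffs)
        up[i] += eps
        down = list(coeffs)
        down[i] -= eps
        diff = half_squared_loss(up, x, y) - half_squared_loss(down, x, y)
        grads.append(diff / (2 * eps))
    return grads


coeffs = [0.3, -0.2, 0.1]  # illustrative quadratic
x, y = 1.5, 2.0            # one assumed sample
lr = 0.01                  # assumed learning rate

g_a = analytic_grad(coeffs, x, y)
g_n = numeric_grad(coeffs, x, y)
updated = [a - lr * g for a, g in zip(coeffs, g_a)]

print(g_a)
print(g_n)
print(half_squared_loss(coeffs, x, y), half_squared_loss(updated, x, y))
```

The two gradient lists should agree to within rounding error, and the loss after the update should be smaller than before.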
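- Implementing the derivation above directly blew up. The improved idea is to add up the loss over all points and divide by 10000 before adjusting, and to use the absolute value instead of the quadratic to prevent overflow.
Tune the learning rate so that each step learns a suitable amount, and, for the terms with different powers of x, multiply by a weight in place of the power $x^{10-i}$, i.e. use $lr\times x\times l\times (10-i)$. Add dropout.
At best the error came down to about 35. The fitted curve roughly matches the original.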
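One plausible reason for the blow-up, offered here as an assumption rather than a diagnosis of the original run: for a single sample, one exact gradient step multiplies the residual $l$ at that sample by $1 - lr\sum_k x^{2k}$. When $|x|$ is near 2.5 and the degree is 10, that sum is very large, so even a small learning rate overshoots. The helper name `residual_factor` is hypothetical.

```python
def residual_factor(x, degree, lr):
    """Factor by which the residual (o - y) at x is multiplied after one
    exact gradient step on that single sample: 1 - lr * sum_k x^(2k)."""
    s = sum(x ** (2 * k) for k in range(degree + 1))
    return 1.0 - lr * s


x, degree = 2.5, 10
s = sum(x ** (2 * k) for k in range(degree + 1))
max_stable_lr = 2.0 / s  # above this, |factor| > 1 and the error grows

print(max_stable_lr)
print(abs(residual_factor(x, degree, 1e-6)))  # larger than 1: the step overshoots
print(abs(residual_factor(x, degree, 1e-9)))  # smaller than 1: the step shrinks the error
```

Under these assumptions, the step size has to be tiny for points near the edge of the interval, which is consistent with the very small `lr` the notebook settles on.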
random.seed(0)
polynomial_pre = [random.random()-0.5 for i in range(10)] # predicted polynomial (10 coefficients, so degree 9), initialised at random
print(polynomial_pre)
# Feed every point through the current fit
epoch = 800
lr = 0.0000005
l = 0
for i in range(epoch):
    loss = []
    for j in range(n):
        err = calculate(polynomial_pre,num_x[j]) - num_y[j]
        loss.append(err)
        l += abs(err)
    l/=n # mean absolute error; l is not reset, so the previous value carries over scaled by 1/n
    if i%100 == 0:
        print('epoch',i,'error',l)
    for k in range(10):
        # drop = random.randint(0,11)
        # if k == drop:
        #     continue
        # Heuristic update: it uses only sample k (the first 10 points) and subtracts num_y[j] (the last point) once more, so it is not the gradient derived above
        polynomial_pre[k] -= lr * (num_x[k]) * (loss[k] - num_y[j])*(10-k)
print(polynomial_pre)
[0.3444218515250481, 0.2579544029403025, -0.079428419169155, -0.24108324970703665, 0.01127472136860852, -0.09506586254958571, 0.2837985890347726, -0.19668727392107255, -0.02340304584764419, 0.0833820394550312]
epoch 0 error 125.56605977264955
epoch 100 error 96.63328691501964
epoch 200 error 76.40355209636932
epoch 300 error 61.9613787759121
epoch 400 error 51.27587641809833
epoch 500 error 43.080682824405415
epoch 600 error 37.97494996982369
epoch 700 error 35.637086267932055
[0.3076666374487756, 0.4425229539123861, -1.095939035078886, -1.161408091549099, -1.335224244129828, -0.001029364227706257, 0.32750621887204023, -2.3130008991325552, 0.018007058085728295, -0.16715429895537884]
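For comparison, here is a minimal sketch of full-batch gradient descent that averages the true gradient $l\,x^{p}$ over every sample, instead of touching only the first ten points. It is not the notebook's method. The quadratic target, the grid of x values in [-1, 1], and the values of `lr` and `epochs` are assumptions picked so that plain gradient descent converges quickly; a degree-10 fit on [-2.5, 2.5] is far worse conditioned and would likely need rescaled inputs or a much smaller step.

```python
def predict(coeffs, x):
    """Horner evaluation; coefficients run from the highest power down."""
    result = coeffs[0]
    for c in coeffs[1:]:
        result = result * x + c
    return result


def fit(xs, ys, n_coeffs, lr, epochs):
    """Full-batch gradient descent on the mean of 0.5 * (o - y)^2."""
    coeffs = [0.0] * n_coeffs
    degree = n_coeffs - 1
    n = len(xs)
    for _ in range(epochs):
        grads = [0.0] * n_coeffs
        for x, y in zip(xs, ys):
            l = predict(coeffs, x) - y          # residual o - y
            for i in range(n_coeffs):
                grads[i] += l * x ** (degree - i)
        # Step against the gradient averaged over all samples.
        coeffs = [c - lr * g / n for c, g in zip(coeffs, grads)]
    return coeffs


true_coeffs = [1.5, -2.0, 0.5]                   # illustrative: 1.5x^2 - 2x + 0.5
xs = [-1 + 2 * i / 199 for i in range(200)]      # evenly spaced in [-1, 1]
ys = [predict(true_coeffs, x) for x in xs]

fitted = fit(xs, ys, n_coeffs=3, lr=0.5, epochs=1000)
mean_abs_err = sum(abs(predict(fitted, x) - y) for x, y in zip(xs, ys)) / len(xs)
print(fitted, mean_abs_err)
```

Because the data here is noise-free and the problem is small, the fitted coefficients should land very close to `true_coeffs`.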
# Now plot the fitted curve and compare it with the original
num_pre_y = [calculate(polynomial_pre,num_x[i]) for i in range(n)]
# Plot the points
fig,ax = plt.subplots()
ax.set_ylabel('Y')
ax.set_xlabel('X')
ax.set_title('Polynomial Fitting')
ax.scatter(num_x,num_pre_y)
<matplotlib.collections.PathCollection at 0x25fd8662e10>