Problem 1
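1. Given an arbitrary set of numbers value (0 ≤ value ≤ 100.00) containing 2n elements (5 ≤ n ≤ 10), split it into two groups A and B that satisfy the following conditions:
a) A and B each contain n elements;
b) array B satisfies an ordering condition (the formula is missing here; the solution below takes B in ascending order);
c) given SUM = s and WEA = wa, find arrays A and B.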
2. Data structure
No. | A | B |
1 | A1 | B1 |
2 | A2 | B2 |
3 | A3 | B3 |
4 | A4 | B4 |
5 | A5 | B5 |
6 | A6 | B6 |
7 | A7 | B7 |
8 | A8 | B8 |
SUM | WEA |
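3. A sample input: value=[1.62,2.13,2.55,3.46,5.17,8.23,10.69,11.55,15.50,20.14,25.50,34.28,42.94,46.15,52.91,79.64];
a) Given s=100.00 and wa=20.50324, find arrays A and B that satisfy the conditions above.
b) A result counts as satisfying the conditions when |1-SUM/s| ≤ e1 and |1-WEA/wa| ≤ e2, where m is the number of decimal places in value. The tolerance expressions are missing here; the code below uses e1 = 10^-(2m+1) and e2 = 10^-(3m+1).
c) The result may not be unique; finding one valid pair is enough.
4. You may also simulate your own set of values that satisfies the conditions above and then write the algorithm.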
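For item 4, a checker is useful before writing any search. The sketch below assumes, based on the solution code that follows, that SUM is the sum of A, that WEA is sum(A_i * B_i) / SUM, that B must be in ascending order, and that the tolerances are 10^-(2m+1) and 10^-(3m+1). The function names `sum_and_wea` and `check_split` and the toy data are hypothetical, and the toy instance is smaller than the 5 ≤ n ≤ 10 required by the task.

```python
def sum_and_wea(A, B):
    """Return (SUM, WEA) under the assumed definitions:
    SUM = sum of A, WEA = sum(A_i * B_i) / SUM."""
    total = sum(A)
    weighted = sum(a * b for a, b in zip(A, B))
    return total, weighted / total


def check_split(A, B, s, wa, m=2):
    """Check a candidate split against conditions a) to c).
    The tolerance formulas are an assumption taken from the solution code."""
    e1 = 10 ** -(2 * m + 1)
    e2 = 10 ** -(3 * m + 1)
    if len(A) != len(B):
        return False
    if list(B) != sorted(B):  # B is assumed to be in ascending order
        return False
    total, wea = sum_and_wea(A, B)
    return abs(1 - total / s) <= e1 and abs(1 - wea / wa) <= e2


# Toy instance (made up for illustration): derive s and wa from a known split
A = [3.0, 1.0, 2.0]
B = [4.0, 5.0, 6.0]
s, wa = sum_and_wea(A, B)
print(check_split(A, B, s, wa))        # the split that produced s and wa
print(check_split(A[::-1], B, s, wa))  # same elements, different order of A
```

Building s and wa from a known split, as in the toy instance, guarantees that at least one solution exists when testing a search routine.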
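Solution: first split value into two groups so that group A sums to s = 100. B is then determined, because it is sorted in ascending order. Then iterate over the orderings of A until the weighted sum with B matches the required value.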
import itertools

value = [1.62,2.13,2.55,3.46,5.17,8.23,10.69,11.55,15.50,20.14,25.50,34.28,42.94,46.15,52.91,79.64]
s = 100
wa = 20.50324
# wa = 30
m = 2  # number of decimal places in value
e1 = pow(10, -(2*m+1))  # tolerance for |1 - SUM/s|
e2 = pow(10, -(3*m+1))  # tolerance for |1 - WEA/wa|
n = len(value) // 2  # A and B each hold n elements


def array_diff(a, b):
    # Return b together with the elements of a that are not in b
    # (assumes the values are distinct)
    return b, [x for x in a if x not in b]


st = False
# Try every n-element candidate for A whose sum matches s
for val in itertools.combinations(value, n):
    sum_ = sum(val)
    if abs(1 - sum_ / s) > e1:
        continue
    print("Combination {} sums to {}".format(val, s))
    value_sub1, value_sub2 = array_diff(value, list(val))
    value_sub2 = sorted(value_sub2)  # B in ascending order
    # Try every ordering of A against the fixed B
    for p in itertools.permutations(value_sub1):
        mul_add = sum(a * b for a, b in zip(p, value_sub2))
        if abs(1 - mul_add / sum_ / wa) <= e2:
            print('Difference:', abs(1 - mul_add / sum_ / wa))
            st = True
            break
    if st:
        break

if st:
    print("Found!", p)
    print("Final result:", "A=", p, ", B=", value_sub2)
else:
    print("Not found!")
Exhaustive search is clearly not a good method, but it does work. If anyone has a better solution, feel free to send me a message.
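One modest improvement, offered as a sketch rather than a full answer: because the values have m = 2 decimal places, the SUM condition can be tested exactly by scaling everything to integers, and a depth-first search over the sorted values can drop branches that can no longer reach the target. This assumes the SUM tolerance is tighter than one unit in the last decimal place, so that "within tolerance" means "equal". The helper name `subsets_with_sum` and the toy data are hypothetical, and the permutation step for WEA is not addressed here.

```python
def subsets_with_sum(values, k, target, m=2):
    """Yield k-element subsets of values whose sum equals target exactly,
    comparing in integer units of 10**-m to avoid floating-point error."""
    scale = 10 ** m
    items = sorted((round(v * scale), v) for v in values)
    goal = round(target * scale)

    def search(start, remaining, need, chosen):
        if remaining == 0:
            if need == 0:
                yield tuple(chosen)
            return
        for i in range(start, len(items) - remaining + 1):
            units, original = items[i]
            # Items are sorted ascending, so every later pick is >= units.
            # If even that minimum overshoots, no later item can work either.
            if units * remaining > need:
                break
            chosen.append(original)
            yield from search(i + 1, remaining - 1, need - units, chosen)
            chosen.pop()

    yield from search(0, k, goal, [])


# Toy data (made up for illustration)
for subset in subsets_with_sum([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 3, 10.0):
    print(subset)
```

Each subset it yields is a candidate for A; the remaining elements, sorted, form B.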
Problem 2
1.
No. | A | B | C | D |
1 | A1 | B1 | C1 | D1 |
2 | A2 | B2 | C2 | D2 |
3 | A3 | B3 | C3 | D3 |
4 | A4 | B4 | C4 | D4 |
5 | A5 | B5 | C5 | D5 |
6 | A6 | B6 | C6 | D6 |
7 | A7 | B7 | C7 | D7 |
8 | A8 | B8 | C8 | D8 |
n | An | Bn | Cn | Dn |
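- Given 1000-10000 records of A, B, C, D, train a model. A further 20 records of A, B, C are provided; predict D for them to validate the model.
- Data: link: https://pan.baidu.com/s/1aWYNskLHURoX8ITrOC6JZw extraction code: 94nf.
Solution: fit a multiple linear regression model with sklearn.
Training:
#1. Import the required modules and libraries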
from sklearn.linear_model import LinearRegression as LR
from sklearn.model_selection import train_test_split
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
#2. Load and explore the data
data_file = 'data.xlsx'
df_train = pd.read_excel(data_file)[0:10000]
x = df_train[['A', 'B', 'C']]
y = df_train[['D']]
print(x.shape)
#3. Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=420)
print('X_train.shape={}\n y_train.shape ={}\n X_test.shape={}\n, y_test.shape={}'.format(X_train.shape,y_train.shape,X_test.shape, y_test.shape))
#4. Build the model
model = LR().fit(X_train, y_train)
print(model)
y_pred = model.predict(X_test)
sum_mean = 0
for i in range(len(y_pred)):
    sum_mean += (y_pred[i] - y_test.values[i]) ** 2
sum_erro = np.sqrt(sum_mean / len(y_pred))  # divide by the number of test samples
# calculate RMSE
print("RMSE by hand:", sum_erro)
# Plot the predicted values against the test values
plt.figure()
plt.plot(range(len(y_pred)), y_pred, 'b', label="predict")
plt.plot(range(len(y_pred)), y_test, 'r', label="test")
plt.legend(loc="upper right")  # show the line labels
plt.xlabel("test sample index")
plt.ylabel("value of D")
plt.show()
#5. Inspect the fitted model
# model.coef_: the weights w1, w2, ..., wn
print(model.coef_)
# model.intercept_: the intercept
print(model.intercept_)
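The hand-computed RMSE above is the square root of the mean squared difference between predictions and test values. As a library-free sketch of the same quantity (the function name `rmse` and the sample numbers are illustrative and not taken from the data set):

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up numbers: only the last prediction is off, by 2
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

An RMSE close to zero on the test set suggests that D is very nearly a linear function of the inputs.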
Prediction:
import csv
import pandas as pd
import matplotlib.pyplot as plt
## Prediction
data_file = 'data.xlsx'
df = pd.read_excel(data_file)[10009:]
print(df)
# Apply the fitted equation, compare with the actual D, and save the results to data2.csv
pd_data = df
sam = []
a = ['A', 'B', 'C', 'D']
dic = {}
for i in a:
    y = pd_data.loc[:, i]
    dic[i] = list(y)  # convert the column to a plain list
print(dic)
for i in range(len(dic['D'])):
    # fitted equation: intercept plus one coefficient per feature
    x = 7856. + float(dic['A'][i])*-1.05000000e+02 + float(dic['B'][i])*5.32907052e-15 + float(dic['C'][i]) * -8.90000000e+01
    sam.append(x)
# newline='' keeps csv.writer from emitting blank rows on Windows
with open('data2.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(['D', 'Predictive value'])
    for i in range(len(sam)):
        writer.writerow([dic['D'][i], sam[i]])
print('Done')
pd_data=pd.read_csv('data2.csv')
pd_data.plot()
plt.show()
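The hard-coded expression in the prediction loop is the fitted linear equation D = intercept + w_A*A + w_B*B + w_C*C. The sketch below keeps those numbers in one place; the values are copied from that line, while the names `INTERCEPT`, `COEF`, and `predict_d` are hypothetical. Note that the coefficient on B is about 5e-15, which is effectively zero, so the fitted D depends only on A and C.

```python
# Coefficients copied from the hard-coded expression in the prediction script
INTERCEPT = 7856.0
COEF = {'A': -1.05000000e+02, 'B': 5.32907052e-15, 'C': -8.90000000e+01}


def predict_d(a, b, c):
    """Evaluate the fitted linear equation for one row of A, B, C."""
    return INTERCEPT + COEF['A'] * a + COEF['B'] * b + COEF['C'] * c


# Made-up input row, for illustration only
print(predict_d(1.0, 2.0, 3.0))
```

When the trained model object is still available, calling `model.predict` on the new rows avoids copying coefficients by hand.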