Introduction
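Linear regression is one of the most basic models in machine learning; the core of it is computing the parameters w and b by least squares. In Python it is trivial: you just call a library and the results look impressive. MATLAB is convenient too, since it is good at matrix operations. I'm just starting out with Java and C, so I wrote a simple two-dimensional linear regression (one input x, one output y) in both for reference. Extending it to more dimensions is not hard either. At the end, model performance is tested with the hold-out method.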
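For reference, these are the closed-form least-squares estimates that the code comments refer to, written for a single input variable. The notation is my own shorthand (m is the number of training samples and \bar{x} is the mean of x); the numerator and denominator of w correspond to the variables fz and fm in the code.

```latex
\[
w = \frac{\sum_{i=1}^{m} y_i \,(x_i - \bar{x})}
         {\sum_{i=1}^{m} x_i^2 - \frac{1}{m}\left(\sum_{i=1}^{m} x_i\right)^2},
\qquad
b = \frac{1}{m}\sum_{i=1}^{m} \left(y_i - w\,x_i\right),
\qquad
\bar{x} = \frac{1}{m}\sum_{i=1}^{m} x_i
\]
```

Note that the squared-sum term in the denominator is subtracted once, after the sum over all samples, not once per sample.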
Java code
public class studentdemi {
    public static void main(String[] args) {
        // Training inputs
        double[] x = { 7, 8, 9.5, 10.5, 14, 16, 18, 22, 15 };
        // Training labels
        double[] y = { 14, 14, 14, 17, 17, 17, 17, 17, 17 };
        // Compute w and b from the closed-form least-squares estimates
        int size_x = x.length; // number of training samples
        double avg_x = 0;
        double sum_x = 0;
        for (int i = 0; i < size_x; i++) // sum every sample, including the last one
        {
            sum_x = sum_x + x[i];
        }
        avg_x = sum_x / size_x;
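        double fz = 0; // numerator of w: sum of y_i * (x_i - avg_x)
        double fm = 0; // denominator of w: sum of x_i^2 - (sum of x_i)^2 / m
        for (int i = 0; i <= size_x - 1; i++)
        {
            fz = fz + y[i] * (x[i] - avg_x);
            fm = fm + x[i] * x[i];
        }
        fm = fm - sum_x * sum_x / size_x; // subtract the squared-sum term once, not per sample
        double w = fz / fm;
        double b = 0;
        for (int i = 0; i <= size_x - 1; i++)
        {
            b = b + y[i] - w * x[i];
        }
        b = b / size_x;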
        // Test inputs
        double[] x1 = { 9, 10, 7, 22, 14 };
        // Test labels
        double[] y1 = { 14, 14, 14, 17, 17 };
        // Mean squared error of the predictions w * x + b on the test set
        double mse = 0;
        for (int i = 0; i <= x1.length - 1; i++)
        {
            double err = y1[i] - (w * x1[i] + b);
            mse = mse + err * err;
        }
        mse = mse / x1.length;
        System.out.println("MSE: " + mse);
    }
}
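If you want to check the fit by hand, here is the arithmetic for the nine training points above, using the least-squares estimates for one input variable. These rounded values are my own calculation, not output copied from a run.

```latex
\[
m = 9,\quad \sum_i x_i = 120,\quad \sum_i x_i^2 = 1798.5,\quad
\sum_i y_i = 144,\quad \sum_i x_i y_i = 1966.5
\]
\[
w = \frac{\sum_i x_i y_i - \bar{x}\sum_i y_i}{\sum_i x_i^2 - \frac{1}{m}\left(\sum_i x_i\right)^2}
  = \frac{1966.5 - \tfrac{120}{9}\cdot 144}{1798.5 - \tfrac{120^2}{9}}
  = \frac{46.5}{198.5} \approx 0.234
\]
\[
b = \bar{y} - w\,\bar{x} = 16 - \frac{46.5}{198.5}\cdot\frac{120}{9} \approx 12.88
\]
```

Here the numerator of w uses the identity \sum_i y_i (x_i - \bar{x}) = \sum_i x_i y_i - \bar{x} \sum_i y_i, and b is the same quantity as the average of y_i - w x_i.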
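Java is really a lot like C. In Java you can also split the work into helper methods declared as public static double (that is, returning a double), which makes the code more modular, easier to read, and nicer to look at.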
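A minimal sketch of what that modular version could look like. The class and method names (LinearRegressionSketch, mean, fit, predict, mse) are hypothetical and chosen only for illustration; the data is the same made-up data as above.

```java
public class LinearRegressionSketch {

    // Arithmetic mean of an array.
    public static double mean(double[] v) {
        double sum = 0;
        for (double e : v) {
            sum += e;
        }
        return sum / v.length;
    }

    // Least-squares fit of y = w * x + b for one input variable.
    // Returns a two-element array { w, b }.
    public static double[] fit(double[] x, double[] y) {
        int m = x.length;
        double sumX = 0;
        double sumXX = 0;
        for (double e : x) {
            sumX += e;
            sumXX += e * e;
        }
        double avgX = sumX / m;
        double fz = 0; // numerator: sum of y_i * (x_i - avgX)
        for (int i = 0; i < m; i++) {
            fz += y[i] * (x[i] - avgX);
        }
        double fm = sumXX - sumX * sumX / m; // denominator
        double w = fz / fm;
        double b = mean(y) - w * avgX;
        return new double[] { w, b };
    }

    // Prediction of the fitted line at a single point.
    public static double predict(double w, double b, double x) {
        return w * x + b;
    }

    // Mean squared error of the line (w, b) on the given samples.
    public static double mse(double[] x, double[] y, double w, double b) {
        double sum = 0;
        for (int i = 0; i < x.length; i++) {
            double err = y[i] - predict(w, b, x[i]);
            sum += err * err;
        }
        return sum / x.length;
    }

    public static void main(String[] args) {
        double[] x = { 7, 8, 9.5, 10.5, 14, 16, 18, 22, 15 };
        double[] y = { 14, 14, 14, 17, 17, 17, 17, 17, 17 };
        double[] x1 = { 9, 10, 7, 22, 14 };
        double[] y1 = { 14, 14, 14, 17, 17 };

        double[] wb = fit(x, y);
        System.out.println("w = " + wb[0] + ", b = " + wb[1]);
        System.out.println("MSE: " + mse(x1, y1, wb[0], wb[1]));
    }
}
```

Each step (mean, fit, evaluation) can now be tested on its own, for example by fitting points that lie exactly on a known line and checking that the MSE is zero.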
C code
#include <stdio.h>

int main(void)
{
    // Training inputs
    float x[] = { 7, 8, 9.5, 10.5, 14, 16, 18, 22, 15 };
    // Training labels
    float y[] = { 14, 14, 14, 17, 17, 17, 17, 17, 17 };
    // Compute w and b from the closed-form least-squares estimates
    int size_x = sizeof(x) / sizeof(x[0]); // number of training samples
    float avg_x = 0;
    float sum_x = 0;
    for (int i = 0; i < size_x; i++) // sum every sample, including the last one
    {
        sum_x = sum_x + x[i];
    }
    avg_x = sum_x / size_x;
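    float fz = 0; // numerator of w: sum of y_i * (x_i - avg_x)
    float fm = 0; // denominator of w: sum of x_i^2 - (sum of x_i)^2 / m
    for (int i = 0; i <= size_x - 1; i++)
    {
        fz = fz + y[i] * (x[i] - avg_x);
        fm = fm + x[i] * x[i];
    }
    fm = fm - sum_x * sum_x / size_x; // subtract the squared-sum term once, not per sample
    float w = fz / fm;
    float b = 0;
    for (int i = 0; i <= size_x - 1; i++)
    {
        b = b + y[i] - w * x[i];
    }
    b = b / size_x;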
    // Test inputs
    float x1[] = { 9, 10, 7, 22, 14 };
    // Test labels
    float y1[] = { 14, 14, 14, 17, 17 };
    int size_x1 = sizeof(x1) / sizeof(x1[0]); // number of test samples
    // Mean squared error of the predictions w * x + b on the test set
    float mse = 0;
    for (int i = 0; i < size_x1; i++)
    {
        float err = y1[i] - (w * x1[i] + b);
        mse = mse + err * err;
    }
    mse = mse / size_x1;
    printf("MSE: %f\n", mse);
    return 0;
}
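The performance metric here is the MSE. The 2.8 I first got is really just mean(y1) - mean(x1) = 15.2 - 12.4, because that calculation never used w and b; the mean of the squared errors of w*x + b on the test set works out to about 0.90. That is still not great when the labels are only ever 14 or 17, probably because there is so little data, and the data is all made up anyway, haha. (Then again, what is the point of a model like this? I'll be more rigorous next time.)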
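On being more rigorous: the hold-out method normally means splitting one dataset into disjoint training and test parts, whereas the hand-written test set above reuses some training x values (7, 14 and 22). Below is a rough sketch of a random split over sample indices. The class name HoldOutSketch, the method split, and the 0.3 test ratio and seed are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class HoldOutSketch {

    // Shuffles the indices 0..n-1 and splits them into two disjoint groups.
    // Returns { trainIndices, testIndices }.
    public static int[][] split(int n, double testRatio, long seed) {
        List<Integer> idx = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            idx.add(i);
        }
        // A fixed seed keeps the split reproducible between runs.
        Collections.shuffle(idx, new Random(seed));

        int testSize = (int) Math.round(n * testRatio);
        int[] test = new int[testSize];
        int[] train = new int[n - testSize];
        for (int i = 0; i < n; i++) {
            if (i < testSize) {
                test[i] = idx.get(i);
            } else {
                train[i - testSize] = idx.get(i);
            }
        }
        return new int[][] { train, test };
    }

    public static void main(String[] args) {
        // Nine samples, roughly 30% held out for testing.
        int[][] parts = split(9, 0.3, 42L);
        System.out.println("train size: " + parts[0].length
                + ", test size: " + parts[1].length);
    }
}
```

The returned indices can then be used to pick the matching x and y values for fitting and for computing the MSE, so that no test sample is ever seen during training.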