Linear Regression Model for Machine Learning

Abstract

This article shares the usage and basic concepts of the Linear Regression model with classmates and teachers in the Machine Learning class. The first part covers the mathematical model of Linear Regression and its matrix expression in a neural network model. The second part shows how to implement the Linear Regression model from scratch, without any third-party library (e.g. PyTorch, TensorFlow, and so on). The last part introduces the CUDA GPU library.

Table of Contents

Basic Concept of Linear Regression Model usage in Neural Network Model

Implementing Simple Linear Model from Scratch in Neural Network Model

Mathematical proof of Jacobian Matrix

C++ Program which implements the SSE regression model

Introduction of CUDA library and Parallel Programming

Advanced questions of Machine Learning


Basic Concept of Linear Regression Model usage in Neural Network Model

 

Deep learning includes two major parts: the feedforward process and the backpropagation process. The picture above shows the feedforward process, where X1, X2, ..., Xp are the independent variables and Y1, Y2, ..., Yn are the dependent variables. If the training model uses a linear regression cost (i.e. MSE/SSE) and the activation function is chosen as f(x) = x, then the gradient used in the backpropagation process for the output-layer coefficients (the weighting variables YWi) can be calculated simply, as in the formula below:
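A sketch of that formula, assuming the SSE cost E = sum_j (y_j - t_j)^2 with target values t_j, the identity activation, and writing w_ji for the output-layer weight that connects the layer's i-th input a_i to output y_j (so y_j = sum_i w_ji a_i):

\frac{\partial E}{\partial w_{ji}} = 2\,(y_j - t_j)\,a_i

(When there is no hidden layer, a_i is simply the input X_i.)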

 

 

Implementing Simple Linear Model from Scratch in Neural Network Model

If we simplify the neural network model to just one input layer with a single output node, as in the following figure, then it implies that we are calculating the linear regression model of the formula below. In other words, we are trying to find a hyperplane that minimizes the distance to the samples along the Z axis.
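A sketch of that formula, writing the two independent variables as x and y, the plane coefficients as w_1 and w_2, and the intercept as b (matching the (Xi, Yi, Zi) samples used later):

z = w_1 x + w_2 y + b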

 

For the simplified Linear Regression model, we can use the gradient method to update the weighting values in the backpropagation process. The cost function F(X) to be minimized is the SSE below, and the weights are updated along its negative gradient:
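With n samples (x_i, y_i, z_i) and the plane model above, the SSE to be minimized can be sketched as:

\mathrm{SSE}(w_1, w_2, b) = \sum_{i=1}^{n} \left( w_1 x_i + w_2 y_i + b - z_i \right)^2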

 

We can use matrix notation to express the data, as in the first picture below, and the gradient of the partial derivatives with respect to w can then be written in Jacobian matrix form, as in the second one.
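A sketch of that matrix form, stacking the samples into a design matrix X whose rows are (x_i, y_i, 1), a parameter vector w = (w_1, w_2, b)^T and a target vector z:

X = \begin{pmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ \vdots & \vdots & \vdots \\ x_n & y_n & 1 \end{pmatrix}, \qquad
w = \begin{pmatrix} w_1 \\ w_2 \\ b \end{pmatrix}, \qquad
z = \begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{pmatrix}, \qquad
\hat{z} = X w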

 

Jacobian Matrix:
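In this notation, the Jacobian of the residual vector r = Xw - z with respect to w is simply the design matrix (a sketch):

J = \frac{\partial r}{\partial w} =
\begin{pmatrix}
\partial r_1 / \partial w_1 & \partial r_1 / \partial w_2 & \partial r_1 / \partial b \\
\vdots & \vdots & \vdots \\
\partial r_n / \partial w_1 & \partial r_n / \partial w_2 & \partial r_n / \partial b
\end{pmatrix}
= X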

 

 Cost function derivative expressed in Matrix form:
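A sketch in the same notation, together with the gradient-descent update of the weights:

\nabla_w \mathrm{SSE} = 2\,X^{T}(Xw - z), \qquad
w \leftarrow w - \alpha \cdot 2\,X^{T}(Xw - z)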

 

where alpha is the learning rate.

Mathematical proof of Jacobian Matrix
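A sketch of the derivation in the notation above, with residual r = Xw - z:

\mathrm{SSE}(w) = r^{T} r = (Xw - z)^{T}(Xw - z) = w^{T}X^{T}Xw - 2\,z^{T}Xw + z^{T}z

\nabla_w \mathrm{SSE} = 2\,X^{T}Xw - 2\,X^{T}z = 2\,X^{T}(Xw - z)

Equivalently, since the Jacobian of r with respect to w is J = X, the chain rule gives \nabla_w (r^{T} r) = 2\,J^{T} r = 2\,X^{T}(Xw - z), which matches the matrix-form derivative above.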

 

 

 

C++ Program which implements the SSE regression model

Below are the C++ source code and screenshots for reading 100 random (Xi, Yi, Zi) data points from a CSV file, calculating the interpolation plane, and drawing it in a 3D model. I also use the Python scikit-learn LinearRegression API to verify that the coefficients and intercept equal the results calculated by the C++ program.

[Regression Plane is located between the scatter sample points]:

 

[Coefficients Comparison between Python and C++ program]:

 

 [Screenshots of Python and C++ source code]:

C++ Part:
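A minimal sketch of such a program, assuming the CSV file is named data.csv with one "x,y,z" triple per line, and using plain gradient descent on the SSE cost (the learning rate and iteration count are illustrative assumptions; the 3D plotting part is omitted):

```cpp
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

struct Sample { double x, y, z; };

// Read comma-separated "x,y,z" rows; malformed rows (e.g. a header) are skipped.
static std::vector<Sample> readCsv(const std::string& path) {
    std::vector<Sample> data;
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ss(line);
        Sample s;
        char comma;
        if (ss >> s.x >> comma >> s.y >> comma >> s.z) data.push_back(s);
    }
    return data;
}

int main() {
    const std::vector<Sample> data = readCsv("data.csv");  // assumed file name
    if (data.empty()) { std::cerr << "no data\n"; return 1; }

    double w1 = 0.0, w2 = 0.0, b = 0.0;   // plane coefficients and intercept
    const double alpha = 1e-3;            // learning rate (assumption)
    const int epochs = 10000;             // iteration count (assumption)
    const double n = static_cast<double>(data.size());

    for (int it = 0; it < epochs; ++it) {
        // Accumulate the gradient of SSE = sum_i (w1*x_i + w2*y_i + b - z_i)^2.
        double g1 = 0.0, g2 = 0.0, gb = 0.0;
        for (const Sample& s : data) {
            const double r = w1 * s.x + w2 * s.y + b - s.z;  // residual
            g1 += 2.0 * r * s.x;
            g2 += 2.0 * r * s.y;
            gb += 2.0 * r;
        }
        // Gradient-descent update, with the gradient averaged over the samples.
        w1 -= alpha * g1 / n;
        w2 -= alpha * g2 / n;
        b  -= alpha * gb / n;
    }

    std::cout << "coefficients: " << w1 << ", " << w2
              << "  intercept: " << b << "\n";
    return 0;
}
```

The printed coefficients and intercept can then be compared against the scikit-learn LinearRegression result, as in the comparison above.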

 

Introduction of CUDA library and Parallel Programming

The following statements come from the Nvidia official website, the CUDA C++ Programming Guide:

The advent of multicore CPUs and manycore GPUs means that mainstream processor chips are now parallel systems. The challenge is to develop application software that transparently scales its parallelism to leverage the increasing number of processor cores, much as 3D graphics applications transparently scale their parallelism to manycore GPUs with widely varying numbers of cores. The CUDA parallel programming model is designed to overcome this challenge while maintaining a low learning curve for programmers familiar with standard programming languages such as C. At its core are three key abstractions - a hierarchy of thread groups, shared memories, and barrier synchronization - that are simply exposed to the programmer as a minimal set of language extensions.
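To make these three abstractions concrete, below is a minimal CUDA C++ sketch (an illustration written for this article's plane model, not code from the guide): each thread computes the squared residual of one sample, and the threads of each block cooperate through shared memory and the __syncthreads() barrier to reduce them to a per-block partial sum.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Kernel: one thread per sample computes its squared residual; each block
// reduces its 256 values in shared memory behind __syncthreads() barriers.
__global__ void sseKernel(const double* x, const double* y, const double* z,
                          int n, double w1, double w2, double b,
                          double* blockSums) {
    __shared__ double cache[256];                  // assumes blockDim.x == 256
    const int i = blockIdx.x * blockDim.x + threadIdx.x;

    double r2 = 0.0;
    if (i < n) {
        const double r = w1 * x[i] + w2 * y[i] + b - z[i];  // plane residual
        r2 = r * r;
    }
    cache[threadIdx.x] = r2;
    __syncthreads();                               // barrier: cache is filled

    // Tree reduction within the block.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride)
            cache[threadIdx.x] += cache[threadIdx.x + stride];
        __syncthreads();                           // barrier after each step
    }
    if (threadIdx.x == 0) blockSums[blockIdx.x] = cache[0];
}

int main() {
    const int n = 100;                             // sample count (assumption)
    std::vector<double> hx(n, 1.0), hy(n, 2.0), hz(n, 3.0);  // dummy samples

    double *dx, *dy, *dz, *dSums;
    const int blocks = (n + 255) / 256;
    cudaMalloc((void**)&dx, n * sizeof(double));
    cudaMalloc((void**)&dy, n * sizeof(double));
    cudaMalloc((void**)&dz, n * sizeof(double));
    cudaMalloc((void**)&dSums, blocks * sizeof(double));
    cudaMemcpy(dx, hx.data(), n * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy.data(), n * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(dz, hz.data(), n * sizeof(double), cudaMemcpyHostToDevice);

    // Launch one 256-thread block per 256 samples.
    sseKernel<<<blocks, 256>>>(dx, dy, dz, n, 0.5, 0.5, 0.0, dSums);

    std::vector<double> hSums(blocks);
    cudaMemcpy(hSums.data(), dSums, blocks * sizeof(double),
               cudaMemcpyDeviceToHost);
    double sse = 0.0;
    for (double s : hSums) sse += s;               // final sum on the CPU
    std::printf("SSE = %f\n", sse);

    cudaFree(dx); cudaFree(dy); cudaFree(dz); cudaFree(dSums);
    return 0;
}
```

Here the per-block partial sums are added on the CPU; a second reduction kernel could perform that final sum on the GPU as well.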

 

Advanced questions of Machine Learning

Question 1: What is the detailed mathematical proof of the backpropagation process? The hidden-layer part of the proof seems to involve d/dx of f(g(h(x))). What is the matrix form of the hidden-layer Jacobian matrix?
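As a sketch of what the question is asking, the chain rule for such a composition, written with the Jacobian matrices of generic differentiable maps f, g and h, is:

J_{f \circ g \circ h}(x) = J_f\bigl(g(h(x))\bigr)\; J_g\bigl(h(x)\bigr)\; J_h(x)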


Question 2: What technique should be used to normalize the independent variables when large matrix operations are needed?
