Logistic Regression Using the MATLAB Toolbox

Introduction

Often, the analyst is required to construct a model which estimates probabilities. This is common in many fields: medical diagnosis (probability of recovery, relapse, etc.), credit scoring (probability of a loan being repaid), sports (probability of a team beating a competitor- wait... maybe that belongs in the "investment" category?).

Many people are familiar with linear regression- why not just use that? There are several good reasons not to, but probably the most obvious is that a linear model is unbounded: its predictions will eventually fall below 0.0 or poke out above 1.0, yielding answers which do not make sense as probabilities.

Many different classification models have been devised which estimate the probability of class membership, such as linear and quadratic discriminant analysis, neural networks and tree induction. The technique covered in this article is logistic regression- one of the simplest modeling procedures.

Logistic Regression

Logistic regression is a member of the family of methods called generalized linear models ("GLM"). Such models include a linear part followed by some "link function". If you are familiar with neural networks, think of "transfer functions" or "squashing functions". So, the linear function of the predictor variables is calculated, and the result of this calculation is run through the link function. In the case of logistic regression, the linear result is run through a logistic function (see figure 1), which rises monotonically from 0.0 (at negative infinity) to 1.0 (at positive infinity), passing through 0.5 when the input value is exactly zero. Among other desirable properties, note that this logistic function only returns values between 0.0 and 1.0. Other GLMs operate similarly, but employ different link functions- some of which are also bound by 0.0 - 1.0, and some of which are not.

[Figure 1: The Most Interesting Part of the Logistic Function]
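For readers without the original image, a curve like the one in figure 1 is easy to reproduce (a minimal sketch; the input range of -6 to 6 is an arbitrary choice):

>> x = linspace(-6, 6, 201);
>> plot(x, 1 ./ (1 + exp(-x))), grid on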

While calculating the optimal coefficients of a least-squares linear regression has a direct, closed-form solution, this is not the case for logistic regression. Instead, some iterative fitting procedure is needed, in which successive "guesses" at the right coefficients are incrementally improved. Again, if you are familiar with neural networks, this is much like the various training rules used with the simplest "single neuron" models. Hopefully, you are lucky enough to have a routine handy to perform this process for you, such as glmfit, from the Statistics Toolbox.
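To make the iterative idea concrete, here is a minimal sketch of one such procedure: plain gradient ascent on the binomial log-likelihood. This is for illustration only- it is not the algorithm glmfit uses (typically iteratively reweighted least squares), and the learning rate and iteration count are arbitrary choices. It assumes a predictor matrix X (examples in rows) and a 0/1 target vector Y:

% Illustration only: gradient ascent for logistic regression coefficients
Xa    = [ones(size(X,1),1) X];       % prepend a column of 1s for the constant term
B     = zeros(size(Xa,2),1);         % initial guess at the coefficients
alpha = 0.1;                         % learning rate (arbitrary choice)
for i = 1:5000
    P = 1 ./ (1 + exp(-Xa * B));     % current probability estimates
    B = B + alpha * (Xa' * (Y - P)); % step up the log-likelihood gradient
end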

glmfit

The glmfit function is easy to apply. The syntax for logistic regression is:

B = glmfit(X, [Y N], 'binomial', 'link', 'logit');

B will contain the discovered coefficients for the linear portion of the logistic regression (the link function has no coefficients). X contains the predictor data, with examples in rows and variables in columns. Y contains the target variable, usually a 0 or a 1 representing the outcome. Last, the variable N contains the count of events for each row of the example data- most often, this will be a column of 1s, the same size as Y. The count parameter, N, will be set to values greater than 1 for grouped data. As an example, think of medical cases summarized by country: each country will have averaged input values, an outcome rate (between 0.0 and 1.0), and the count of cases from that country. When the counts are greater than one, the target variable Y holds the count of target-class observations in each group (the rate times the count), rather than a 0/1 indicator.
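A grouped call might look like the following (a hypothetical sketch: Deaths holds the count of target-class outcomes in each group, and Cases the total number of cases in that group):

>> B = glmfit(X, [Deaths Cases], 'binomial', 'link', 'logit');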

Here is a very small example:

>> X = [0.0 0.1 0.7 1.0 1.1 1.3 1.4 1.7 2.1 2.2]';
>> Y = [0 0 1 0 0 0 1 1 1 1]';
>> B = glmfit(X, [Y ones(10,1)], 'binomial', 'link', 'logit')

B =

   -3.4932
    2.9402

The first element of B is the constant term, and the second element is the coefficient for the lone input variable. We apply the linear part of this logistic regression thus:

>> Z = B(1) + X * B(2)

Z =

   -3.4932
   -3.1992
   -1.4350
   -0.5530
   -0.2589
    0.3291
    0.6231
    1.5052
    2.6813
    2.9753

To finish, we apply the logistic function to the output of the linear part:

>> Z = Logistic(B(1) + X * B(2))

Z =

    0.0295
    0.0392
    0.1923
    0.3652
    0.4356
    0.5815
    0.6509
    0.8183
    0.9359
    0.9514

Despite the simplicity of the logistic formula, I built it into a small function, Logistic, so that I wouldn't have to repeatedly write it out:

% Logistic: calculates the logistic function of the input
% by Will Dwinnell
%
% Last modified: Sep-02-2006

function Output = Logistic(Input)

Output = 1 ./ (1 + exp(-Input));

% EOF
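As an aside, the Statistics Toolbox also provides glmval, which applies the inverse link to the linear part for you; the call below should reproduce the probabilities computed above:

>> Z = glmval(B, X, 'logit');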

Conclusion

Though it is structurally very simple, logistic regression still finds wide use today in many fields. It is quick to fit, the discovered model is easy to implement, and recall on new data is fast. Frequently, it yields better performance than competing, more complex techniques. I recently built a logistic regression model which beat out a neural network, decision trees and two types of discriminant analysis. If nothing else, it is worth fitting a simple model such as logistic regression early in a modeling project, just to establish a benchmark.

Logistic regression is closely related to another GLM procedure, probit regression, which differs only in its link function (specified in glmfit by replacing 'logit' with 'probit'). I believe that probit regression has been losing popularity, since its results are typically very similar to those from logistic regression, but the formula for the logistic link function is simpler than that of the probit link function.
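For reference, the probit version of the earlier fit requires only that one change:

>> B = glmfit(X, [Y ones(10,1)], 'binomial', 'link', 'probit');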

References

Generalized Linear Models, by McCullagh and Nelder (ISBN-13: 978-0412317606)
