Machine Learning Classification Algorithms 1: The Naive Bayes Classifier

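I need this for graduation, so here are some notes on machine learning classification algorithms. Copy, copy, copy!!!

Some terminology up front:

(1) Naive Bayes Classifier:

Objective (LaTeX is a hassle, so I'll use images for now):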

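R(c_i \mid \mathbf{x}) = \sum_{j=1}^{N} \lambda_{ij} P(c_j \mid \mathbf{x}), \qquad h^*(\mathbf{x}) = \mathop{\arg\min}\limits_{c \in Y} R(c \mid \mathbf{x})

Here \lambda_{ij} is the loss incurred by classifying a sample whose true class is c_j as c_i, N is the number of classes, and h^* is the classifier that minimizes the conditional risk R for each sample \mathbf{x}.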
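As a worked step, assuming the common 0/1 loss (an assumption, since the post does not say which loss it uses), the minimum-risk rule reduces to picking the class with the largest posterior. That is consistent with the `max(Pt)` step in the MATLAB code further down:

```latex
\lambda_{ij} =
\begin{cases}
0, & i = j \\
1, & i \neq j
\end{cases}
\;\Longrightarrow\;
R(c_i \mid \mathbf{x}) = \sum_{j \neq i} P(c_j \mid \mathbf{x}) = 1 - P(c_i \mid \mathbf{x})
\;\Longrightarrow\;
h^*(\mathbf{x}) = \mathop{\arg\max}\limits_{c \in Y} P(c \mid \mathbf{x})
```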

Bayes' formula:
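A standard statement of the formula, offered as a likely match for what belongs here rather than a recovered original. The second equality adds the "naive" assumption that the d features are conditionally independent given the class; the symbol d (the number of features) is introduced here for illustration:

```latex
P(c \mid \mathbf{x})
= \frac{P(c)\, P(\mathbf{x} \mid c)}{P(\mathbf{x})}
= \frac{P(c)}{P(\mathbf{x})} \prod_{k=1}^{d} P(x_k \mid c)
```

Since P(\mathbf{x}) is the same for every class, it can be dropped when comparing classes and restored afterwards by normalizing.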

Handling continuous features:
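For continuous features, the MATLAB code below models each feature within each class as a Gaussian. Written out, with \mu_{jk} and \sigma_{jk} standing for the entries `u(j,k)` and `s(j,k)` estimated from the training data (the Greek symbols are notation chosen here, not taken from the original), the class-conditional density would be:

```latex
p(x_k \mid c_j) = \frac{1}{\sqrt{2\pi}\,\sigma_{jk}}
\exp\!\left( -\frac{(x_k - \mu_{jk})^2}{2\sigma_{jk}^2} \right)
```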

An example in MATLAB: X = randn(datanum, featurenum), y = sum(X, 2) (the sum of each row), and Y assigns each y to one of 4 classes: y < -2, -2 <= y < 0, 0 <= y < 2, y >= 2.

clc
clear
close all;
% Initialization
datanum  = 50000;
featurenum = 12;
classification = [-2,0,2]; % class boundaries
% Generate training data
[X, Y] = CreateFun(datanum, featurenum, classification);
% Build the Bayes classifier
[P, u, s] = BayesBulid(X,Y);
% Generate test data
testnum  = 3000;
[testX, testY] = CreateFun(testnum, featurenum, classification);
% Test
[outY, Acc] = BayesTest(testX,testY,P,u,s);

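function [X, Y] = CreateFun(datanum, featurenum, classification)
% Draw standard-normal features and label each sample by its row sum.
% classification must be sorted in ascending order.
X = randn(datanum, featurenum);
TempY = sum(X, 2); % target function: sum of each row
Y = ones(datanum, 1); % class 1: TempY < classification(1)
for i = 1:length(classification)
    Y(TempY>=classification(i)) = i+1; % class i+1: TempY >= i-th boundary
end
end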

function [P, u, s] = BayesBulid(X,Y)
% Estimate the class priors P and, for each class and feature,
% the Gaussian mean u and standard deviation s.
datanum = length(Y);
classification = unique(Y);
classificnum  = length(classification);
featurenum = size(X, 2);
P = zeros(classificnum,1);
u = zeros(classificnum,featurenum);
s = zeros(classificnum,featurenum);
for i = 1:classificnum
    tempX = X(Y == classification(i), :); % samples belonging to class i
    P(i,1) = size(tempX, 1)/datanum;      % prior P(c_i)
    u(i,:) = mean(tempX, 1);              % per-feature mean
    s(i,:) = std(tempX, 1, 1);            % per-feature std, normalized by N
end
end

function [outY, Acc] = BayesTest(X,Y,P,u,s)
% outY(:,1): normalized posterior of the predicted class
% outY(:,2): predicted label
% Assumes the labels are 1..K, in the same order as the rows of P, u, s.
testnum = size(X, 1);
classificnum  = length(P); % class count comes from the model, not the test labels
featurenum = size(X, 2);
Pc = zeros(classificnum,featurenum);
Pt = zeros(classificnum,1);
outY = zeros(testnum,2);
for i = 1:testnum
    for j = 1:classificnum
        Pt(j,1) = 1;
        for k = 1:featurenum
            % Gaussian density of feature k under class j
            Pc(j,k) = 1/(sqrt(2*pi)*s(j,k))*exp(-(X(i,k)-u(j,k))^2/(2*s(j,k)^2));
            Pt(j,1) = Pt(j,1)*Pc(j,k);
        end
        Pt(j,1) = Pt(j,1)*P(j,1); % multiply by the prior
    end
    [outY(i,1),outY(i,2)] = max(Pt);
    outY(i,1) = outY(i,1)/sum(Pt); % normalize to a posterior probability
end
Acc = mean(outY(:,2)==Y);
end
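BayesTest multiplies featurenum densities together, which is fine for 12 features but could underflow to zero with many more. A minimal sketch of the same decision rule computed in the log domain follows. The function name `BayesTestLog` and its outputs are hypothetical and not part of the original post. It assumes P, u, s come from BayesBulid, that labels are 1..K, and that MATLAB R2016b or later is used (for implicit expansion).

```matlab
% Hypothetical helper, not from the original post: log-domain scoring.
function [label, post] = BayesTestLog(X, P, u, s)
n = size(X, 1);          % number of test samples
d = size(X, 2);          % number of features
K = length(P);           % number of classes
logPt = zeros(n, K);     % log of prior times likelihood, per sample and class
for j = 1:K
    z = (X - u(j,:)) ./ s(j,:);              % standardized residuals, n-by-d
    logPt(:,j) = log(P(j)) ...               % log prior
               - sum(log(s(j,:))) ...        % log of the 1/sigma factors
               - 0.5*d*log(2*pi) ...         % Gaussian normalizing constant
               - 0.5*sum(z.^2, 2);           % exponent terms summed over features
end
[~, label] = max(logPt, [], 2);              % predicted class per sample
% Recover normalized posteriors with the log-sum-exp trick.
m = max(logPt, [], 2);
post = exp(logPt - m);
post = post ./ sum(post, 2);
end
```

With the variables from the script above, this could be called as `[label, post] = BayesTestLog(testX, P, u, s);` and the accuracy computed as `mean(label == testY)`. Summing logs instead of multiplying densities keeps the scores in a representable range while leaving the arg max unchanged.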

Acc over three runs: 0.89, 0.89, 0.90.

Increasing the number of classes:

classification = [-4,-2,0,2,4]; % class boundaries

Acc over three runs: 0.56, 0.58, 0.57, so performance drops.

A quick tally of how Acc changes as each parameter increases:

datanum: Acc increases;

classificnum: Acc decreases;

featurenum: Acc stays the same.
