机器学习——监督分类小项目(识别作者)


整理了六种监督算法在同一项目上实现的方法。

朴素贝叶斯小项目——识别作者

参见文章 机器学习(1)——贝叶斯网络分类算法 最后一部分

SVM小项目——识别作者

参见 文章 机器学习——SVM

决策树小项目——识别作者

参见 文章 机器学习——决策树

KNN小项目——识别作者

#!/usr/bin/python

import matplotlib.pyplot as plt
from prep_terrain_data import makeTerrainData
from class_vis import prettyPicture

features_train, labels_train, features_test, labels_test = makeTerrainData()


### the training data (features_train, labels_train) have both "fast" and "slow"
### points mixed together--separate them so we can give them different colors
### in the scatterplot and identify them visually
grade_fast = [features_train[ii][0] for ii in range(0, len(features_train)) if labels_train[ii]==0]
bumpy_fast = [features_train[ii][1] for ii in range(0, len(features_train)) if labels_train[ii]==0]
grade_slow = [features_train[ii][0] for ii in range(0, len(features_train)) if labels_train[ii]==1]
bumpy_slow = [features_train[ii][1] for ii in range(0, len(features_train)) if labels_train[ii]==1]


#### initial visualization
plt.xlim(0.0, 1.0)
plt.ylim(0.0, 1.0)
plt.scatter(bumpy_fast, grade_fast, color = "b", label="fast")
plt.scatter(grade_slow, bumpy_slow, color = "r", label="slow")
plt.legend()
plt.xlabel("bumpiness")
plt.ylabel("grade")
plt.show()
################################################################################



from time import time
from sklearn.neighbors import KNeighborsClassifier

clf = KNeighborsClassifier()

t0 = time()
clf.fit(features_train, labels_train)
print "training time:", round(time()-t0, 3), "s"
t1 = time()

pred = clf.predict(features_test)
print "testing time:", round(time()-t1, 3), "s"

from sklearn.metrics import accuracy_score

acc = accuracy_score(pred, labels_test)
print acc




try:
    prettyPicture(clf, features_test, labels_test)
except NameError:
    pass

training time: 0.164 s
testing time: 0.012 s
0.92

Adaboost小项目——识别作者

#!/usr/bin/python

import matplotlib.pyplot as plt
from prep_terrain_data import makeTerrainData
from class_vis import prettyPicture

features_train, labels_train, features_test, labels_test = makeTerrainData()


### the training data (features_train, labels_train) have both "fast" and "slow"
### points mixed together--separate them so we can give them different colors
### in the scatterplot and identify them visually
grade_fast = [features_train[ii][0] for ii in range(0, len(features_train)) if labels_train[ii]==0]
bumpy_fast = [features_train[ii][1] for ii in range(0, len(features_train)) if labels_train[ii]==0]
grade_slow = [features_train[ii][0] for ii in range(0, len(features_train)) if labels_train[ii]==1]
bumpy_slow = [features_train[ii][1] for ii in range(0, len(features_train)) if labels_train[ii]==1]


#### initial visualization
plt.xlim(0.0, 1.0)
plt.ylim(0.0, 1.0)
plt.scatter(bumpy_fast, grade_fast, color = "b", label="fast")
plt.scatter(grade_slow, bumpy_slow, color = "r", label="slow")
plt.legend()
plt.xlabel("bumpiness")
plt.ylabel("grade")
plt.show()
################################################################################


from time import time
from sklearn.ensemble import AdaBoostClassifier

clf = AdaBoostClassifier()

t0 = time()
clf.fit(features_train, labels_train)
print "adaboost training time:", round(time()-t0, 3), "s"
t1 = time()

pred = clf.predict(features_test)
print "adaboost testing time:", round(time()-t1, 3), "s"

from sklearn.metrics import accuracy_score

acc = accuracy_score(pred, labels_test)
print "adaboost accuracy: ", acc



try:
    prettyPicture(clf, features_test, labels_test)
except NameError:
    pass

adaboost training time: 0.196 s
adaboost testing time: 0.019 s
adaboost accuracy: 0.924

RandomForest小项目——识别作者

#!/usr/bin/python

import matplotlib.pyplot as plt
from prep_terrain_data import makeTerrainData
from class_vis import prettyPicture

features_train, labels_train, features_test, labels_test = makeTerrainData()


### the training data (features_train, labels_train) have both "fast" and "slow"
### points mixed together--separate them so we can give them different colors
### in the scatterplot and identify them visually
grade_fast = [features_train[ii][0] for ii in range(0, len(features_train)) if labels_train[ii]==0]
bumpy_fast = [features_train[ii][1] for ii in range(0, len(features_train)) if labels_train[ii]==0]
grade_slow = [features_train[ii][0] for ii in range(0, len(features_train)) if labels_train[ii]==1]
bumpy_slow = [features_train[ii][1] for ii in range(0, len(features_train)) if labels_train[ii]==1]


#### initial visualization
plt.xlim(0.0, 1.0)
plt.ylim(0.0, 1.0)
plt.scatter(bumpy_fast, grade_fast, color = "b", label="fast")
plt.scatter(grade_slow, bumpy_slow, color = "r", label="slow")
plt.legend()
plt.xlabel("bumpiness")
plt.ylabel("grade")
plt.show()
################################################################################




from time import time
from sklearn.ensemble import RandomForestClassifier
import math

n_features = len(features_train[0])

clf = RandomForestClassifier(n_estimators=1)

t0 = time()
clf.fit(features_train, labels_train)
print "RandomForest training time:", round(time()-t0, 3), "s"
t1 = time()

pred = clf.predict(features_test)
print "RandomForest testing time:", round(time()-t1, 3), "s"

from sklearn.metrics import accuracy_score

acc = accuracy_score(pred, labels_test)
print "RandomForest accuracy: ", acc


try:
    prettyPicture(clf, features_test, labels_test)
except NameError:
    pass

RandomForest training time: 0.04 s
RandomForest testing time: 0.009 s
RandomForest accuracy: 0.92

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
本文首先研究使用Python语言和OpenCV对数字高程图DEM,运用高斯核、阈值化、以及形态学中的开运算、闭运算、对图像进行分割、边缘提取、提取建筑物轮廓,并将其组织成顶点和线段的矢量格式,输出成为矢量化数字地图。然后输出XML格式文件保存矢量化数据[2]。 关键词:图像矢量化;数字高程图;python;OpenCV;轮廓处理 1 前言 1.1图像与图像处理简介 在社会生活中,图像处理有着广泛的应用,人们可以通过图像获取很多日常生活中的一些重要信息,而由于社会的科技迅速发展,图像信息处理的技术也发展得越来越快,并且涉及到了日常生活中的方方面面,如科研、工业、农业现代化、军事国防、医疗卫生、公安机关、制造业等等多个领域,并且图像处理也涉及到了很多学科、如计算机科学、数学、物理、模式识别、人工智能、深度学习等,所以图像处理是一门具有交叉性很强的学科、并且能够与很多学科的融合并促进相关学科的发展。 2 相关工作 1. 工欲善其事必先利其器,由于需要使用Python OpenCV进行图像的操作,所以一开始需要先了解python的相关的使用语法以及编译器方面的使用,还有OpenCV的安装,然后才能更加顺利的开始进行接下来的步骤。 2. 需要对模式识别与图像处理的项目内的具体要求的方法进行初步了解,然后再进行深入的学习和掌握,所以需要对《OpenCV-Python-Tutorial-中文版》本书的相关知识进行学习并自己实践一番。如图像的基本操作、平滑、阈值化、形态学转换中的开运算、闭运算,以及轮廓处理的相关知识。 3. 对相关要求使用Python进行实现,并调试运行,查看相应的效果,最后输出XML文件存矢量化数据。

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值