任务描述
本关任务:编写一个例子讲解决策树如何预测患者需要佩戴的隐形眼镜类型。使用小数据集,我们就可以利用决策树学到很多知识:眼科医生是如何判断患者需要佩戴的镜片类型,一旦理解了决策树的工作原理,我们甚至也可以帮助人们判断需要佩戴的镜片类型。
相关知识
为了完成本关任务,你需要掌握:1.如何处理隐形眼镜数据集,2.如何使用决策树来进行预测如何处理隐形眼镜数据集
隐形眼镜数据集包含很多患者眼部状况的观察条件以及医生推荐的隐形眼镜类型。隐形眼镜类型包括硬材质、软材质以及不适合佩戴隐形眼镜。数据来源于UCI数据库,为了更容易显示数据,我么对数据做了简单的更改。
代码如下:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from six import StringIO
from sklearn import tree
if __name__ == '__main__':
with open('./src/step3/lenses.txt', 'r') as fr: #加载文件
lenses = [inst.strip().split('\t') for inst in fr.readlines()]#处理文件
lenses_target = [] #提取每组数据的类别,保存在列表里
for each in lenses:
lenses_target.append(each[-1])
print(lenses_target)
lensesLabels = ['age', 'prescript', 'astigmatic', 'tearRate'] #特征标签
lenses_list = [] #保存lenses数据的临时列表
lenses_dict = {} #保存lenses数据的字典,用于生成pandas
for each_label in lensesLabels: #提取信息,生成字典
for each in lenses:
lenses_list.append(each[lensesLabels.index(each_label)])
lenses_dict[each_label] = lenses_list
lenses_list = []
# print(lenses_dict) #打印字典信息
# print(lenses_dict) #打印字典信息
###########
lenses_pd = pd.DataFrame(lenses_dict) #生成pandas.DataFrame
lenses_pd = lenses_pd[lensesLabels]
print(lenses_pd) #打印pandas.DataFrame
le = LabelEncoder() #创建LabelEncoder()对象,用于序列化
for col in lenses_pd.columns: #为每一列序列化
lenses_pd[col] = le.fit_transform(lenses_pd[col])
print(lenses_pd)
clf = tree.DecisionTreeClassifier(max_depth = 4) #创建DecisionTreeClassifier()类
clf = clf.fit(lenses_pd.values.tolist(), lenses_target) #使用数据,构建决策树
#############
很多同学会在six 这个库这里报错,这个库之前属于sklearn,目前已经独立出来了,需要独立引用!!!!!!!!!!把sklearn.externals.six换掉就好啦~
附数据集:lenses.txt
young myope no reduced no lenses
young myope no normal soft
young myope yes reduced no lenses
young myope yes normal hard
young hyper no reduced no lenses
young hyper no normal soft
young hyper yes reduced no lenses
young hyper yes normal hard
pre myope no reduced no lenses
pre myope no normal soft
pre myope yes reduced no lenses
pre myope yes normal hard
pre hyper no reduced no lenses
pre hyper no normal soft
pre hyper yes reduced no lenses
pre hyper yes normal no lenses
presbyopic myope no reduced no lenses
presbyopic myope no normal no lenses
presbyopic myope yes reduced no lenses
presbyopic myope yes normal hard
presbyopic hyper no reduced no lenses
presbyopic hyper no normal soft
presbyopic hyper yes reduced no lenses
presbyopic hyper yes normal no lenses