决策树python3.6示例

最新推荐文章于 2024-04-29 23:11:29 发布

renzao_ai

最新推荐文章于 2024-04-29 23:11:29 发布

阅读量287

点赞数

分类专栏：深度学习 Python 文章标签： graphviz decision tree sklearn csv dictVectorizer

本文链接：https://blog.csdn.net/zt5169/article/details/90734092

版权

Python 同时被 2 个专栏收录

58 篇文章 5 订阅

订阅专栏

深度学习

18 篇文章 1 订阅

订阅专栏

from sklearn.feature_extraction import DictVectorizer
import csv
from sklearn import tree
from sklearn import preprocessing
import numpy as np
from sklearn.externals.six import StringIO

# Read in the csv file and put features into list of dict and list of class label
# adaption to PYTHON 3.6 1: rb --> rt  (\r\n --> \n)
# adaption to PYTHON 3.6 2: reader.next()  --> next(reader)
allElectronicsData = open(r'd:/dl_code/01DTree/AllElectronics.csv', 'rt')
reader = csv.reader(allElectronicsData)
headers = next(reader)

print(headers)

featureList = []
labelList = []

for row in reader:
    labelList.append(row[len(row)-1])
    rowDict = {}
    for i in range(1, len(row)-1):
        rowDict[headers[i]] = row[i]
    featureList.append(rowDict)

print(featureList)

# Vetorize features
vec = DictVectorizer()
dummyX = vec.fit_transform(featureList) .toarray()

print("dummyX: " + str(dummyX))
print(vec.get_feature_names())

print("labelList:" + str(labelList))

# vectorize class labels
lb = preprocessing.LabelBinarizer()
dummyY = lb.fit_transform(labelList)
print("dummyY:\n" + str(dummyY))

# Using decision tree for classification
# clf = tree.DecisionTreeClassifier()
clf = tree.DecisionTreeClassifier(criterion='entropy')
clf = clf.fit(dummyX, dummyY)
print("clf: " + str(clf))


# Visualize model
with open("allElectronicInformationGainOri.dot", 'w') as f:
    f = tree.export_graphviz(clf, feature_names=vec.get_feature_names(), out_file=f)

oneRowX = dummyX[0, :]
print("oneRowX: " + str(oneRowX))

newRowX = oneRowX
newRowX[0] = 1
newRowX[2] = 0
print("newRowX: " + str(newRowX))

# adaption to PYTHON 3.6 3: reshape to 2D
newRowX_2D = np.array(newRowX).reshape(1,-1)
predictedY = clf.predict(newRowX_2D)
print("predictedY: " + str(predictedY))

pycharm run result:

[‘RID’, ‘age’, ‘income’, ‘student’, ‘credit_rating’, ‘class_buys_computer’]
[{‘age’: ‘youth’, ‘income’: ‘high’, ‘student’: ‘no’, ‘credit_rating’: ‘fair’}, {‘age’: ‘youth’, ‘income’: ‘high’, ‘student’: ‘no’, ‘credit_rating’: ‘excellent’}, {‘age’: ‘middle_aged’, ‘income’: ‘high’, ‘student’: ‘no’, ‘credit_rating’: ‘fair’}, {‘age’: ‘senior’, ‘income’: ‘medium’, ‘student’: ‘no’, ‘credit_rating’: ‘fair’}, {‘age’: ‘senior’, ‘income’: ‘low’, ‘student’: ‘yes’, ‘credit_rating’: ‘fair’}, {‘age’: ‘senior’, ‘income’: ‘low’, ‘student’: ‘yes’, ‘credit_rating’: ‘excellent’}, {‘age’: ‘middle_aged’, ‘income’: ‘low’, ‘student’: ‘yes’, ‘credit_rating’: ‘excellent’}, {‘age’: ‘youth’, ‘income’: ‘medium’, ‘student’: ‘no’, ‘credit_rating’: ‘fair’}, {‘age’: ‘youth’, ‘income’: ‘low’, ‘student’: ‘yes’, ‘credit_rating’: ‘fair’}, {‘age’: ‘senior’, ‘income’: ‘medium’, ‘student’: ‘yes’, ‘credit_rating’: ‘fair’}, {‘age’: ‘youth’, ‘income’: ‘medium’, ‘student’: ‘yes’, ‘credit_rating’: ‘excellent’}, {‘age’: ‘middle_aged’, ‘income’: ‘medium’, ‘student’: ‘no’, ‘credit_rating’: ‘excellent’}, {‘age’: ‘middle_aged’, ‘income’: ‘high’, ‘student’: ‘yes’, ‘credit_rating’: ‘fair’}, {‘age’: ‘senior’, ‘income’: ‘medium’, ‘student’: ‘no’, ‘credit_rating’: ‘excellent’}]
dummyX: [[0. 0. 1. 0. 1. 1. 0. 0. 1. 0.]
[0. 0. 1. 1. 0. 1. 0. 0. 1. 0.]
[1. 0. 0. 0. 1. 1. 0. 0. 1. 0.]
[0. 1. 0. 0. 1. 0. 0. 1. 1. 0.]
[0. 1. 0. 0. 1. 0. 1. 0. 0. 1.]
[0. 1. 0. 1. 0. 0. 1. 0. 0. 1.]
[1. 0. 0. 1. 0. 0. 1. 0. 0. 1.]
[0. 0. 1. 0. 1. 0. 0. 1. 1. 0.]
[0. 0. 1. 0. 1. 0. 1. 0. 0. 1.]
[0. 1. 0. 0. 1. 0. 0. 1. 0. 1.]
[0. 0. 1. 1. 0. 0. 0. 1. 0. 1.]
[1. 0. 0. 1. 0. 0. 0. 1. 1. 0.]
[1. 0. 0. 0. 1. 1. 0. 0. 0. 1.]
[0. 1. 0. 1. 0. 0. 0. 1. 1. 0.]]
[‘age=middle_aged’, ‘age=senior’, ‘age=youth’, ‘credit_rating=excellent’, ‘credit_rating=fair’, ‘income=high’, ‘income=low’, ‘income=medium’, ‘student=no’, ‘student=yes’]
labelList:[‘no’, ‘no’, ‘yes’, ‘yes’, ‘yes’, ‘no’, ‘yes’, ‘no’, ‘yes’, ‘yes’, ‘yes’, ‘yes’, ‘yes’, ‘no’]
dummyY:
[[0]
[0]
[1]
[1]
[1]
[0]
[1]
[0]
[1]
[1]
[1]
[1]
[1]
[0]]
clf: DecisionTreeClassifier(class_weight=None, criterion=‘entropy’, max_depth=None,
max_features=None, max_leaf_nodes=None,
min_impurity_decrease=0.0, min_impurity_split=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, presort=False, random_state=None,
splitter=‘best’)
oneRowX: [0. 0. 1. 0. 1. 1. 0. 0. 1. 0.]
newRowX: [1. 0. 0. 0. 1. 1. 0. 0. 1. 0.]
predictedY: [1]

dos cmd: dot -Tpdf a.dot -o 1.pdf

源码来自麦子学院

附件一: AllElectronics.csv

RID,age,income,student,credit_rating,class_buys_computer
1,youth,high,no,fair,no
2,youth,high,no,excellent,no
3,middle_aged,high,no,fair,yes
4,senior,medium,no,fair,yes
5,senior,low,yes,fair,yes
6,senior,low,yes,excellent,no
7,middle_aged,low,yes,excellent,yes
8,youth,medium,no,fair,no
9,youth,low,yes,fair,yes
10,senior,medium,yes,fair,yes
11,youth,medium,yes,excellent,yes
12,middle_aged,medium,no,excellent,yes
13,middle_aged,high,yes,fair,yes
14,senior,medium,no,excellent,no

renzao_ai

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
决策树python3.6示例

from sklearn.feature_extraction import DictVectorizerimport csvfrom sklearn import treefrom sklearn import preprocessingimport numpy as npfrom sklearn.externals.six import StringIO# Read in the...
复制链接

扫一扫