Scikit_Learn
Assignment
This assignment trains three algorithms on a dataset and then evaluates each one with three metrics: Accuracy, F1-score, and AUC ROC.
Step1
Create a classification dataset (n_samples >= 1000, n_features >= 10):
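from sklearn import datasets
# sklearn.cross_validation was removed in scikit-learn 0.20; KFold now lives in model_selection
from sklearn.model_selection import KFold
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics
# make_classification returns a (features, labels) tuple
dataset = datasets.make_classification(n_samples=1000, n_features=10)
X, y = dataset
Step2
Split the dataset using 10-fold cross validation:
# KFold takes n_splits (formerly n_folds); the data is passed later, when iterating over kf.split(X)
kf = KFold(n_splits=10, shuffle=True)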
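As a rough sketch of how the 10 folds can be consumed, the snippet below iterates over the splitter and slices out the training and test portions for each fold. It assumes scikit-learn 0.20 or later, where `KFold` is imported from `sklearn.model_selection` and yields index arrays through `kf.split(X)`. The `random_state=0` arguments and the `fold_sizes` list are not part of the assignment; they are added here only to make the illustration reproducible and easy to inspect.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold

# Same dataset shape as in Step1; random_state is an assumption for reproducibility.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# 10 shuffled folds; random_state is again an illustrative assumption.
kf = KFold(n_splits=10, shuffle=True, random_state=0)

fold_sizes = []  # hypothetical helper list, used only to inspect the split
for train_index, test_index in kf.split(X):
    # Each iteration yields integer index arrays for one train/test partition.
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    fold_sizes.append((len(train_index), len(test_index)))

print(fold_sizes[0])
```

With 1000 samples and 10 folds, every fold holds out 100 samples for testing and keeps 900 for training. Each classifier in Step3 would be fitted on `X_train, y_train` and scored on `X_test, y_test` inside this loop.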
Step3
Train the algorithms:
GaussianNB:
The naive Bayes classifier rests on one simple assumption: