K-fold cross-validation in Python: K-fold cross-validation on a pandas DataFrame with NLTK classification

I want to use 10-fold cross-validation to evaluate an NLTK classification model. This is the pandas DataFrame, named data (it has 10k rows and 10 classes).

Features: hello_variant, goodbye_variant, wh_question, yesNo_question, conjuction_start, No_of_tokens

[Image: screenshot of the data DataFrame (YarQK.jpg)]

I tried the code below, but it raises an error.

extract_features = data.drop(['class'],axis=1)
documents = data['class']

import nltk
from sklearn import cross_validation

training_set = nltk.classify.apply_features(extract_features, documents)
cv = cross_validation.KFold(len(training_set), n_folds=10, shuffle=False, random_state=None)

for traincv, testcv in cv:
    classifier = nltk.NaiveBayesClassifier.train(training_set[traincv[0]:traincv[len(traincv)-1]])
    print 'accuracy:', nltk.classify.util.accuracy(classifier, training_set[testcv[0]:testcv[len(testcv)-1]])

Error:

> ---------------------------------------------------------------------------
> ValueError Traceback (most recent call last)
> in ()
> 1 import nltk
> 2 from sklearn import cross_validation
> ----> 3 training_set = nltk.classify.apply_features(extract_features, documents)
> 4 cv = cross_validation.KFold(len(training_set), n_folds=10, shuffle=False, random_state=None)
> 5
>
> C:\Users\SampathR\Anaconda2\envs\dato-env\lib\site-packages\nltk\classify\util.pyc in apply_features(feature_func, toks, labeled)
> 60 """
> 61 if labeled is None:
> ---> 62 labeled = toks and isinstance(toks[0], (tuple, list))
> 63 if labeled:
> 64 def lazy_func(labeled_token):
>
> C:\Users\SampathR\Anaconda2\envs\dato-env\lib\site-packages\pandas\core\generic.pyc in __nonzero__(self)
> 712 raise ValueError("The truth value of a {0} is ambiguous. "
> 713 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
> --> 714 .format(self.__class__.__name__))
> 715
> 716 __bool__ = __nonzero__
>
> ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
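The traceback points at `toks and isinstance(toks[0], ...)` inside `apply_features`: that function expects a feature-extraction function and a list of tokens, whereas the code above passes a DataFrame and a Series, and pandas refuses to evaluate a Series as a boolean. One possible way around this is to skip `apply_features` and build the `(feature_dict, label)` pairs directly from the rows. The following is a minimal sketch under several assumptions: it is written for Python 3 and a recent scikit-learn (where `sklearn.cross_validation` has been replaced by `sklearn.model_selection`), the small DataFrame is made-up stand-in data that reuses only three of the column names from the question, and `to_featuresets` is a hypothetical helper, not part of NLTK or pandas.

```python
import nltk
import pandas as pd
from sklearn.model_selection import KFold

# Made-up stand-in for `data`: a few column names from the question, invented values.
data = pd.DataFrame({
    "hello_variant":   [1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0],
    "goodbye_variant": [0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0],
    "wh_question":     [0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1],
    "class": ["greeting", "closing", "question"] * 4,
})


def to_featuresets(frame, label_column="class"):
    """Hypothetical helper: turn each row into a (feature_dict, label) pair."""
    feature_rows = frame.drop(columns=[label_column]).to_dict(orient="records")
    return list(zip(feature_rows, frame[label_column]))


featuresets = to_featuresets(data)

# 4 folds for the toy data; n_splits=10 would match the question's 10-fold setup.
kfold = KFold(n_splits=4, shuffle=True, random_state=0)

accuracies = []
for train_idx, test_idx in kfold.split(featuresets):
    # Select by index rather than slicing from the first to the last index,
    # so every row of each fold is used even when the folds are shuffled.
    train_set = [featuresets[i] for i in train_idx]
    test_set = [featuresets[i] for i in test_idx]
    classifier = nltk.NaiveBayesClassifier.train(train_set)
    accuracies.append(nltk.classify.util.accuracy(classifier, test_set))

print("per-fold accuracy:", accuracies)
print("mean accuracy:", sum(accuracies) / len(accuracies))
```

Indexing with the fold's index arrays also avoids a side effect of the original slices: `training_set[testcv[0]:testcv[len(testcv)-1]]` stops one element short of the last test index.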

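In addition, I would like to get the precision, recall, and F-score for each dialogue act (class) in the corpus, as well as the classifier's accuracy and confusion matrix. Does NLTK offer any way to compute these (other than sklearn)?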
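For the per-class metrics, `nltk.metrics` does provide `precision`, `recall`, `f_measure`, and `ConfusionMatrix`. The score functions compare sets rather than label lists, so a common pattern is to collect, for each label, the indices of the test items that carry it in the gold data and in the predictions. The sketch below assumes Python 3 and uses invented toy featuresets; the variable names (`ref_sets`, `pred_sets`, `scores`) are illustrative only.

```python
import collections

import nltk
from nltk.metrics import ConfusionMatrix, f_measure, precision, recall

# Invented toy data in NLTK's (feature_dict, label) format.
train_set = [
    ({"hello_variant": 1, "wh_question": 0}, "greeting"),
    ({"hello_variant": 0, "wh_question": 1}, "question"),
] * 3
test_set = [
    ({"hello_variant": 1, "wh_question": 0}, "greeting"),
    ({"hello_variant": 0, "wh_question": 1}, "question"),
    ({"hello_variant": 0, "wh_question": 0}, "greeting"),
]

classifier = nltk.NaiveBayesClassifier.train(train_set)

gold = [label for _, label in test_set]
predicted = [classifier.classify(features) for features, _ in test_set]

# Group test-item indices by gold label and by predicted label.
ref_sets = collections.defaultdict(set)
pred_sets = collections.defaultdict(set)
for i, (gold_label, pred_label) in enumerate(zip(gold, predicted)):
    ref_sets[gold_label].add(i)
    pred_sets[pred_label].add(i)

# A score is None when it is undefined, e.g. precision for a label
# that was never predicted.
scores = {}
for label in sorted(ref_sets):
    scores[label] = {
        "precision": precision(ref_sets[label], pred_sets[label]),
        "recall": recall(ref_sets[label], pred_sets[label]),
        "f_measure": f_measure(ref_sets[label], pred_sets[label]),
    }
    print(label, scores[label])

print("accuracy:", nltk.classify.util.accuracy(classifier, test_set))
print(ConfusionMatrix(gold, predicted))
```

In a cross-validation loop, the gold and predicted labels of every fold can be appended to two running lists, and the metrics computed once at the end over all folds.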
