python - Getting TypeError when trying to use idxmax()...

I keep getting this error when using the idxmax() function in Pandas.

Traceback (most recent call last):
  File "/Users/username/College/year-4/fyp-credit-card-fraud/code/main.py", line 20, in <module>
    best_c_param = classify.print_kfold_scores(X_training_undersampled, y_training_undersampled)
  File "/Users/username/College/year-4/fyp-credit-card-fraud/code/Classification.py", line 39, in print_kfold_scores
    best_c_param = results.loc[results['Mean recall score'].idxmax()]['C_parameter']
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/series.py", line 1369, in idxmax
    i = nanops.nanargmax(_values_from_object(self), skipna=skipna)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/nanops.py", line 74, in _f
    raise TypeError(msg.format(name=f.__name__.replace('nan', '')))
TypeError: reduction operation 'argmax' not allowed for this dtype

I am using pandas version 0.22.0.

main.py

import ExploratoryDataAnalysis as eda
import Preprocessing as processor
import Classification as classify
import pandas as pd

data_path = '/Users/username/college/year-4/fyp-credit-card-fraud/data/'

if __name__ == '__main__':
    df = pd.read_csv(data_path + 'creditcard.csv')
    # eda.init(df)
    # eda.check_null_values()
    # eda.view_data()
    # eda.check_target_classes()
    df = processor.noramlize(df)
    X_training, X_testing, y_training, y_testing, X_training_undersampled, X_testing_undersampled, \
        y_training_undersampled, y_testing_undersampled = processor.resample(df)
    best_c_param = classify.print_kfold_scores(X_training_undersampled, y_training_undersampled)

Classification.py

from sklearn.linear_model import LogisticRegression
from sklearn.cross_validation import KFold, cross_val_score
from sklearn.metrics import confusion_matrix, precision_recall_curve, auc, \
    roc_auc_score, roc_curve, recall_score, classification_report
import pandas as pd
import numpy as np


def print_kfold_scores(X_training, y_training):
    print('\nKFold\n')

    fold = KFold(len(y_training), 5, shuffle=False)

    c_param_range = [0.01, 0.1, 1, 10, 100]

    results = pd.DataFrame(index=range(len(c_param_range), 2), columns=['C_parameter', 'Mean recall score'])
    results['C_parameter'] = c_param_range

    j = 0
    for c_param in c_param_range:
        print('-------------------------------------------')
        print('C parameter: ', c_param)
        print('\n-------------------------------------------')

        recall_accs = []
        for iteration, indices in enumerate(fold, start=1):
            lr = LogisticRegression(C=c_param, penalty='l1')
            lr.fit(X_training.iloc[indices[0], :], y_training.iloc[indices[0], :].values.ravel())
            y_prediction_undersampled = lr.predict(X_training.iloc[indices[1], :].values)

            recall_acc = recall_score(y_training.iloc[indices[1], :].values, y_prediction_undersampled)
            recall_accs.append(recall_acc)
            print('Iteration ', iteration, ': recall score = ', recall_acc)

        results.ix[j, 'Mean recall score'] = np.mean(recall_accs)
        j += 1
        print('\nMean recall score ', np.mean(recall_accs))
        print('\n')

    best_c_param = results.loc[results['Mean recall score'].idxmax()]['C_parameter']  # Error occurs on this line

    print('*****************************************************************')
    print('Best model to choose from cross validation is with C parameter = ', best_c_param)
    print('*****************************************************************')

    return best_c_param

The line causing the problem is this one:

best_c_param = results.loc[results['Mean recall score'].idxmax()]['C_parameter']

The output of the program is as follows:

/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/username/College/year-4/fyp-credit-card-fraud/code/main.py
/Users/username/Library/Python/3.6/lib/python/site-packages/sklearn/cross_validation.py:41: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
  "This module will be removed in 0.20.", DeprecationWarning)

Dataset Ratios
Percentage of genuine transactions: 0.5
Percentage of fraudulent transactions 0.5
Total number of transactions in resampled data: 984

Whole Dataset Split
Number of transactions in training dataset: 199364
Number of transactions in testing dataset: 85443
Total number of transactions in dataset: 284807

Undersampled Dataset Split
Number of transactions in training dataset 688
Number of transactions in testing dataset: 296
Total number of transactions in dataset: 984

KFold

-------------------------------------------
C parameter: 0.01
-------------------------------------------
Iteration 1 : recall score = 0.931506849315
Iteration 2 : recall score = 0.917808219178
Iteration 3 : recall score = 1.0
Iteration 4 : recall score = 0.959459459459
Iteration 5 : recall score = 0.954545454545
Mean recall score 0.9526639965

-------------------------------------------
C parameter: 0.1
-------------------------------------------
Iteration 1 : recall score = 0.849315068493
Iteration 2 : recall score = 0.86301369863
Iteration 3 : recall score = 0.915254237288
Iteration 4 : recall score = 0.945945945946
Iteration 5 : recall score = 0.909090909091
Mean recall score 0.89652397189

-------------------------------------------
C parameter: 1
-------------------------------------------
Iteration 1 : recall score = 0.86301369863
Iteration 2 : recall score = 0.86301369863
Iteration 3 : recall score = 0.983050847458
Iteration 4 : recall score = 0.945945945946
Iteration 5 : recall score = 0.924242424242
Mean recall score 0.915853322981

-------------------------------------------
C parameter: 10
-------------------------------------------
Iteration 1 : recall score = 0.849315068493
Iteration 2 : recall score = 0.876712328767
Iteration 3 : recall score = 0.983050847458
Iteration 4 : recall score = 0.945945945946
Iteration 5 : recall score = 0.939393939394
Mean recall score 0.918883626012

-------------------------------------------
C parameter: 100
-------------------------------------------
Iteration 1 : recall score = 0.86301369863
Iteration 2 : recall score = 0.876712328767
Iteration 3 : recall score = 0.983050847458
Iteration 4 : recall score = 0.945945945946
Iteration 5 : recall score = 0.924242424242
Mean recall score 0.918593049009

Traceback (most recent call last):
  File "/Users/username/College/year-4/fyp-credit-card-fraud/code/main.py", line 20, in <module>
    best_c_param = classify.print_kfold_scores(X_training_undersampled, y_training_undersampled)
  File "/Users/username/College/year-4/fyp-credit-card-fraud/code/Classification.py", line 39, in print_kfold_scores
    best_c_param = results.loc[results['Mean recall score'].idxmax()]['C_parameter']
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/series.py", line 1369, in idxmax
    i = nanops.nanargmax(_values_from_object(self), skipna=skipna)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/nanops.py", line 74, in _f
    raise TypeError(msg.format(name=f.__name__.replace('nan', '')))
TypeError: reduction operation 'argmax' not allowed for this dtype

Process finished with exit code 1
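
One possible explanation, offered as a sketch rather than a confirmed diagnosis: the results frame is created with column names but no data, so the 'Mean recall score' column most likely starts out with dtype object, and filling it one cell at a time does not change that. idxmax() in pandas 0.22 rejects object columns, which would be consistent with the TypeError above. Separately, range(len(c_param_range), 2) is range(5, 2), which is empty, so the index argument probably does not do what was intended. The snippet below is a stand-alone imitation, not the original function: the explicit dtype=object column and the scores variable are assumptions made for illustration, and the mean recall values are copied from the output above. It converts the column with pd.to_numeric before calling idxmax().

```python
import pandas as pd

c_param_range = [0.01, 0.1, 1, 10, 100]

# Mean recall scores copied from the program output in the question.
mean_recalls = [0.9526639965, 0.89652397189, 0.915853322981,
                0.918883626012, 0.918593049009]

# range(5, 2) counts up from 5 and stops before 2, so it yields nothing.
print(list(range(len(c_param_range), 2)))

# Assumption: dtype=object imitates a column that was created empty
# and then filled cell by cell, as in print_kfold_scores.
results = pd.DataFrame({
    'C_parameter': c_param_range,
    'Mean recall score': pd.Series(mean_recalls, dtype=object),
})
print(results.dtypes)

# Convert the scores to a numeric dtype before asking for the row label
# of the maximum.
scores = pd.to_numeric(results['Mean recall score'])
best_c_param = results.loc[scores.idxmax(), 'C_parameter']

print(scores.dtype)
print('Best C parameter:', best_c_param)  # selects C = 0.01 for these scores
```

In the original function, the equivalent change would be to convert the column just before the failing line, for example with results['Mean recall score'] = pd.to_numeric(results['Mean recall score']) or .astype(float). Whether that is the whole story for the asker's environment is untested here.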
