Python data classification example: using pandas to sample instances of one class from training data

Latest recommended article published on 2024-09-22 20:54:19

weixin_39636608

Tags: python data classification example

Without further ado, here is the code.

# -*- coding: utf-8 -*-
import numpy
import pandas as pd
import scipy as sp
from sklearn import linear_model, metrics, preprocessing
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import LinearSVC
# sklearn.cross_validation was removed in scikit-learn 0.20;
# its contents now live in sklearn.model_selection.
from sklearn import model_selection
from sklearn.model_selection import train_test_split
# Only pandas is actually used by test() below.
#import iris_data

'''

creativeID,userID,positionID,clickTime,conversionTime,connectionType,

telecomsOperator,appPlatform,sitesetID,positionType,age,gender,

education,marriageStatus,haveBaby,hometown,residence,appID,appCategory,label

'''

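def test():
    # Load the raw data (the path is specific to the author's machine).
    df = pd.read_csv("/var/lib/mysql-files/data1.csv", sep=",")
    # Keep only the feature columns and the label.
    df1 = df[["connectionType", "telecomsOperator", "appPlatform", "sitesetID",
              "positionType", "age", "gender", "education", "marriageStatus",
              "haveBaby", "hometown", "residence", "appCategory", "label"]]
    # Show how many rows each class has before sampling.
    print(df1["label"].value_counts())
    # Split the rows by class: label 0 (negative) and label 1 (positive).
    N_data = df1[df1["label"] == 0]
    P_data = df1[df1["label"] == 1]
    # Undersample the negatives, without replacement, to the number of positives.
    # This raises an error if there are fewer negatives than positives.
    N_data = N_data.sample(n=P_data.shape[0], replace=False, random_state=2, axis=0)
    # print(df1.loc[:, "label"] == 0)
    print(P_data.shape)
    print(N_data.shape)
    # Recombine the two classes into one balanced dataset.
    data = pd.concat([N_data, P_data])
    print(data.shape)
    # Shuffle the rows and rebuild the index.
    data = data.sample(frac=1).reset_index(drop=True)
    print(data[["label"]])
    return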
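The function above depends on a CSV file that is not included with the post, so here is a minimal sketch of the same undersampling idea on made-up data. The helper name `undersample_majority` and the toy DataFrame are assumptions for illustration only and are not part of the original code.

```python
import pandas as pd


def undersample_majority(df, label_col="label", random_state=2):
    """Downsample every class to the size of the smallest class, then shuffle."""
    # Size of the rarest class; every class is reduced to this many rows.
    n_min = df[label_col].value_counts().min()
    parts = [
        group.sample(n=n_min, replace=False, random_state=random_state)
        for _, group in df.groupby(label_col)
    ]
    balanced = pd.concat(parts)
    # Shuffle so the classes are not stored in contiguous blocks.
    return balanced.sample(frac=1, random_state=random_state).reset_index(drop=True)


# Toy data (hypothetical): 90 negatives and 10 positives.
toy = pd.DataFrame({"age": range(100), "label": [0] * 90 + [1] * 10})
balanced = undersample_majority(toy)
print(balanced["label"].value_counts())
```

Fixing `random_state` makes the sample reproducible. Undersampling discards majority-class rows, so it is most reasonable when the majority class is large enough that the loss is acceptable.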

Supplement: sampling a DataFrame with pandas

Random sampling

import pandas as pd
# Randomly draw 2000 rows from the DataFrame.
# sample is a DataFrame method; there is no top-level pd.sample function.
df.sample(n=2000)
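As a small sketch of the most commonly used `DataFrame.sample` options, the example below draws a fixed number of rows, a fraction of the rows, and a sample with replacement. The DataFrame here is placeholder data, not the dataset from the post.

```python
import pandas as pd

# Placeholder data for illustration only.
df = pd.DataFrame({"x": range(10000)})

# A fixed number of rows, without replacement (the default).
fixed_n = df.sample(n=2000, random_state=0)

# A fraction of the rows; n and frac cannot be given together.
fraction = df.sample(frac=0.1, random_state=0)

# Drawing more rows than exist requires replace=True.
with_replacement = df.sample(n=20000, replace=True, random_state=0)

print(len(fixed_n), len(fraction), len(with_replacement))
```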

Stratified sampling

The functions in sklearn make it easy to sample flexibly.

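from sklearn.model_selection import train_test_split
# y is one of the attribute columns of X; the split is stratified on it,
# so both parts keep the same proportions of each value of y.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y)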
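To show what `stratify=y` does, here is a minimal self-contained sketch on made-up data with a 9:1 class ratio. The column names and values are assumptions for illustration. Unlike the snippet above, the label is dropped from `X` here, which is the usual setup when `y` is the prediction target.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data: 900 rows of class 0 and 100 rows of class 1.
data = pd.DataFrame({"age": range(1000), "label": [0] * 900 + [1] * 100})
X = data.drop(columns="label")
y = data["label"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Both parts keep the original 10% share of class 1.
print(y_train.mean(), y_test.mean())
```

Without `stratify`, the share of the rare class in the test set would vary from one split to the next, which matters most when one class is small.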

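That is everything I have to share about using pandas to sample instances of one class from training data. I hope it serves as a useful reference, and I hope you will continue to support 脚本之家.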
