Some Common Operations for Modeling in Python

OverTheMoon

Article link: https://blog.csdn.net/qq_17377865/article/details/78677204

1. Splitting data into training and test sets

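# sklearn.cross_validation was removed in scikit-learn 0.20; train_test_split now lives in sklearn.model_selection
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x_data, y_data, test_size=0.2, random_state=45)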
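To make clear what the split does, here is a minimal standard-library sketch of the idea: shuffle the row indices with a fixed seed, then hold out a fraction of them. The function name simple_train_test_split and the toy data are made up for illustration, and this is not scikit-learn's implementation (its rounding and shuffling differ), so the exact rows selected will not match.

```python
import random

def simple_train_test_split(x, y, test_size=0.2, seed=45):
    """Shuffle row indices with a fixed seed and hold out the first test_size fraction."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)      # fixed seed gives a reproducible split
    n_test = int(round(len(x) * test_size))   # number of rows held out for testing
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    # Index x and y with the same shuffled indices so rows stay aligned
    x_train = [x[i] for i in train_idx]
    x_test = [x[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return x_train, x_test, y_train, y_test

# Toy data (hypothetical): 10 rows, label is the parity of the first feature
x_data = [[i, i * 2] for i in range(10)]
y_data = [i % 2 for i in range(10)]
x_train, x_test, y_train, y_test = simple_train_test_split(x_data, y_data)
```

The key point is that features and labels are indexed by the same shuffled positions, and that fixing the seed (random_state in scikit-learn) makes the split repeatable across runs.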

2. Cross tabulation

pd.crosstab(temp, y_result)
# reshape(-1) flattens y_train to 1-D without hard-coding the row count (1048377 in the original data)
pd.crosstab(np.array(y_train).reshape(-1), clf.predict(x_train))

http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.crosstab.html

A similar operation:

x_test_loan[['overdue_flag_30', 'result']].groupby(by=['overdue_flag_30', 'result']).agg(len).unstack()
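Both snippets above boil down to counting how often each pair of values occurs. A rough standard-library sketch of that counting, with a made-up helper name cross_table and toy labels, could look like this. It is only meant to show the idea, not how pandas implements crosstab.

```python
from collections import Counter

def cross_table(actual, predicted):
    """Count how often each (actual, predicted) pair occurs."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = Counter(zip(actual, predicted))
    rows = sorted(set(actual))
    cols = sorted(set(predicted))
    # Nested dict: table[row][col] is the count, 0 for pairs that never occur
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

# Toy labels (hypothetical)
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
table = cross_table(y_true, y_pred)
# table == {0: {0: 2, 1: 1}, 1: {0: 1, 1: 2}}
```

When the two inputs are true labels and model predictions, as in the second crosstab call, the result is a confusion matrix.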

3. Using map

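Sometimes you have a list and want to transform every value in it. The map function does this.

map(func, seq1[, seq2...]): applies func to each element of the given sequences and returns the results as a list. If func is None, it behaves as the identity function and returns a list of n-tuples holding the corresponding elements of each sequence. This describes Python 2. In Python 3, map returns a lazy iterator (wrap it in list() to get a list), and passing None as func is no longer supported; use zip for that case.

For example: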

>>> list(map(lambda x: None, [1, 2, 3, 4]))
[None, None, None, None]
>>> list(map(lambda x: x * 2, [1, 2, 3, 4]))
[2, 4, 6, 8]
>>> list(map(lambda x: x * 2, [1, 2, 3, 4, [5, 6, 7]]))
[2, 4, 6, 8, [5, 6, 7, 5, 6, 7]]

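A pure-Python equivalent of the built-in map for a single sequence (returning a list, as in Python 2):

def my_map(func, seq):
    # Named my_map so it does not shadow the built-in map
    mapped_seq = []
    for each_item in seq:
        mapped_seq.append(func(each_item))
    return mapped_seq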
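For readers on Python 3, the short sketch below shows the points that most often cause confusion: map is lazy and can only be consumed once, it accepts several sequences at once (the seq2... part of the signature), and a list comprehension gives the same result. The variable names are only for illustration.

```python
# In Python 3, map returns a lazy iterator rather than a list
doubled = map(lambda x: x * 2, [1, 2, 3, 4])
doubled_list = list(doubled)   # consume the iterator into a list
exhausted = list(doubled)      # the iterator is now used up, so this is empty

# With several sequences, func receives one element from each per call
sums = list(map(lambda a, b: a + b, [1, 2, 3], [10, 20, 30]))

# A list comprehension is the usual idiomatic alternative
same_as_doubled = [x * 2 for x in [1, 2, 3, 4]]
```

If the mapped result is needed more than once, convert it to a list right away, as done for doubled_list.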

Ref:

http://blog.csdn.net/prince2270/article/details/4681299

4. Another kind of map: mapping values

# Keys are the raw city-tier labels in the data (一线 = tier 1 ... 五线 = tier 5)
city_type_mapping = {'一线': 1, '二线': 2, '三线': 3, '四线': 4, '五线': 5}
sample_all['address_city_type'] = sample_all['address_city_type'].map(city_type_mapping)
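One thing to watch for: when a pandas Series is mapped with a dict, any value that is not a key of the dict becomes NaN. The plain-Python sketch below mimics that lookup with dict.get, using made-up column values (including '未知', "unknown", which is deliberately absent from the mapping), to show how unmapped values can be spotted before they silently turn into missing data.

```python
city_type_mapping = {'一线': 1, '二线': 2, '三线': 3, '四线': 4, '五线': 5}

# Hypothetical raw column values; '未知' ("unknown") has no entry in the mapping
raw_values = ['一线', '三线', '未知', '二线']

# Find values that the mapping does not cover
unmapped = sorted({v for v in raw_values if v not in city_type_mapping})

# dict.get returns None for missing keys, the plain-Python analogue of NaN here
mapped = [city_type_mapping.get(v) for v in raw_values]
```

Checking unmapped first makes it an explicit decision whether to extend the mapping or to accept missing values.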

5. Long to wide (zip)

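# Weight each record by recency: the age is converted to whole days (doc['timestamp'] appears to be in milliseconds, hence the / 1000)
sample = [(Sample(**doc), 1 / (round((int(time.time()) - doc['timestamp'] / 1000) / 60 / 60 / 24) + 5)) for doc in records]
# zip(*sample) transposes the list of (item, weight) pairs into two parallel tuples
population, weights = list(zip(*sample))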
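The following self-contained sketch reproduces the pattern on toy data. The Sample namedtuple, the records, and the fixed now value are all assumptions made for illustration (the original does not show them), and the final random.choices call is only a guess at what population and weights were meant for.

```python
import random
from collections import namedtuple

# Hypothetical record type; the original Sample class is not shown
Sample = namedtuple("Sample", ["name", "timestamp"])

SECONDS_PER_DAY = 60 * 60 * 24
now = 1_700_000_000  # fixed "current time" in seconds, so the example is reproducible

# Toy records with millisecond timestamps: 0, 5 and 15 days old
records = [
    {"name": "a", "timestamp": (now - 0 * SECONDS_PER_DAY) * 1000},
    {"name": "b", "timestamp": (now - 5 * SECONDS_PER_DAY) * 1000},
    {"name": "c", "timestamp": (now - 15 * SECONDS_PER_DAY) * 1000},
]

def recency_weight(doc, now_s):
    """Newer records get larger weights; the +5 avoids division by zero."""
    age_days = round((now_s - doc["timestamp"] / 1000) / SECONDS_PER_DAY)
    return 1 / (age_days + 5)

# "Long" form: one (item, weight) pair per record
sample = [(Sample(**doc), recency_weight(doc, now)) for doc in records]

# "Wide" form: zip(*pairs) transposes into two parallel tuples
population, weights = zip(*sample)

# Assumed use: weighted sampling with replacement
picked = random.Random(0).choices(population, weights=weights, k=2)
```

Here population holds the three Sample objects and weights is (0.2, 0.1, 0.05), so the newest record is four times as likely to be drawn as the oldest.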
