[Data Mining][01] Heartbeat Signal Prediction

Latest recommended article published on 2023-05-31 15:14:50

李博清

Category: Team Learning

Article link: https://blog.csdn.net/weixin_44454670/article/details/114905292

Table of Contents

Preface
I. Data Processing
- 1. Reducing DataFrame memory usage
II. Building the Model
- lgb
- Results
Summary

Preface

First, thanks to Datawhale for organizing this team-learning program. The study materials are here:
GitHub: https://github.com/datawhalechina/team-learning-data-mining/tree/master/HeartbeatClassification
Alibaba Cloud Tianchi: https://tianchi.aliyun.com/competition/entrance/531883/notebook

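I. Data Processing

1. Reducing DataFrame memory usage

This trick is common in Kaggle competitions. It is most useful for data made up largely of numeric columns: the idea is to store int64/float64 values in the smallest narrower type that still holds them (int8/int16/int32, or float16/float32). The function can be used as-is and often cuts memory usage by 50% or more.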
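To make the saving concrete, here is a small sketch that stores the same values once as int64 and once as int8 and compares the bytes used. The data (one million integers between 0 and 99) is made up for illustration and is not from the competition dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one million small integers, stored as int64.
s64 = pd.Series(np.arange(1_000_000) % 100, dtype=np.int64)

# Every value lies in 0..99, so it fits in int8 (range -128..127).
s8 = s64.astype(np.int8)

# Bytes used by the values only (index excluded).
bytes64 = s64.memory_usage(index=False)
bytes8 = s8.memory_usage(index=False)

print('int64: {:.2f} MB'.format(bytes64 / 1024**2))
print('int8:  {:.2f} MB'.format(bytes8 / 1024**2))
```

Each int64 value takes 8 bytes and each int8 value takes 1 byte, so this column shrinks by a factor of 8 with no change in the stored values.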

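import numpy as np

def reduce_mem_usage(df):
    # Report the DataFrame's memory footprint in MB before downcasting.
    start_mem = df.memory_usage().sum() / 1024**2
    print('Memory usage of dataframe is {:.2f} MB'.format(start_mem))

    for col in df.columns:
        col_type = df[col].dtype

        # Skip object (e.g. string) columns.
        if col_type != object:
            c_min = df[col].min()
            c_max = df[col].max()
            if str(col_type)[:3] == 'int':
                # Use the smallest integer type whose range holds the column.
                if c_min > np.iinfo(np.int8).min and c_max < np.iinfo(np.int8).max:
                    df[col] = df[col].astype(np.int8)
                elif c_min > np.iinfo(np.int16).min and c_max < np.iinfo(np.int16).max:
                    df[col] = df[col].astype(np.int16)
                # The remaining branches are cut off in the source.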
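The listing above is cut off in the source, so the rest of the author's function is not available. Below is a minimal sketch of how the same idea could be completed. It is not the author's code: the name `downcast_numeric`, the loop over candidate types, the float branch, and the choice to return a copy are all assumptions of this sketch. It only handles plain NumPy signed-integer and float columns and leaves everything else untouched.

```python
import numpy as np
import pandas as pd


def downcast_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with int/float columns cast to the smallest
    NumPy dtype whose range covers the column's min and max."""
    out = df.copy()
    int_types = [np.int8, np.int16, np.int32, np.int64]
    # Note: float16/float32 keep the range but round the values.
    float_types = [np.float16, np.float32, np.float64]

    for col in out.columns:
        dtype = out[col].dtype
        # Assumption: only plain NumPy dtypes are handled; pandas
        # extension dtypes (category, nullable Int64, ...) are skipped.
        if not isinstance(dtype, np.dtype):
            continue
        if dtype.kind == 'i':
            candidates, info = int_types, np.iinfo
        elif dtype.kind == 'f':
            candidates, info = float_types, np.finfo
        else:
            continue

        c_min, c_max = out[col].min(), out[col].max()
        # Try the candidate types from narrowest to widest.
        for t in candidates:
            if info(t).min <= c_min and c_max <= info(t).max:
                out[col] = out[col].astype(t)
                break
    return out


# Usage on a small made-up frame.
demo = pd.DataFrame({
    'a': [1, 2, 3],
    'b': [1000, -1000, 0],
    'c': [0.5, 1.5, 2.5],
    'd': ['x', 'y', 'z'],
})
small = downcast_numeric(demo)
print(small.dtypes)
```

Because the float check only looks at the value range, casting to float16 or float32 can lose precision. If the exact values matter, as they may for signal data, it is safer to drop the narrower float types from the candidate list.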
