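Heartbeat Signal Classification with Google Colab: Task 1
Datawhale Introduction to Data Mining for Beginners, Task 1 (Understanding the Competition Task): study notes check-in
Selected code examples
- Reading the data with Google Colaboratory
# Import pandas and numpy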
import pandas as pd
import numpy as np
- Link a Google account and read the data from Google Drive.
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')
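- Click the link to authenticate the account.
- Choose the account and grant access.
- Copy the code and paste it into the notebook.
# Change to the dataset folder HeartbeatClassification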
%cd /content/drive/My Drive/HeartbeatClassification
# Read the data
train=pd.read_csv('train.csv')
test=pd.read_csv('testA.csv')
# Show the first 5 rows (the default for head())
train.head()
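After loading, it can help to confirm that the table has the expected shape and that the signal column can be turned into numbers. The following is only a minimal sketch under stated assumptions: it builds a tiny stand-in DataFrame instead of reading train.csv, and it assumes columns named id, heartbeat_signals (a comma-separated string of samples), and label, none of which this note confirms. The helper signals_to_array is a hypothetical name introduced for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train.csv; the column names are assumed,
# not confirmed by this note.
demo = pd.DataFrame({
    "id": [0, 1],
    "heartbeat_signals": ["0.9,0.5,0.1", "1.0,0.4,0.0"],
    "label": [0.0, 2.0],
})

def signals_to_array(frame, column="heartbeat_signals"):
    """Split each comma-separated string into floats and stack the rows into a 2-D array."""
    rows = [
        np.array([float(x) for x in s.split(",")], dtype=np.float32)
        for s in frame[column]
    ]
    return np.vstack(rows)

signals = signals_to_array(demo)
print(demo.shape)     # rows and columns of the table
print(demo.dtypes)    # type of each column
print(signals.shape)  # (number of rows, samples per signal)
```

If the real file follows the same layout, calling signals_to_array(train) would give one row per heartbeat record.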
- Packages needed to build the baseline
# Import packages
import os
import gc
import math
import lightgbm as lgb
import xgboost as xgb
from catboost import CatBoostRegressor
from sklearn.linear_model import SGDRegressor, LinearRegression, Ridge
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import StratifiedKFold, KFold
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
from tqdm import tqdm
import matplotlib.pyplot as plt
import time
import warnings
warnings.filterwarnings('ignore')
Note: if the catboost library is missing, install it with pip install catboost (in a Colab cell, prefix the command with !).
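Before running the import cell, one way to see which third-party packages still need installing is to ask the import system directly. This is a small sketch using only the standard library; missing_packages is a hypothetical helper name, and the list contains the top-level module names from the imports above (the pip name for sklearn is scikit-learn).

```python
import importlib.util

def missing_packages(names):
    """Return the names from `names` that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Top-level module names used by the baseline imports.
required = ["lightgbm", "xgboost", "catboost", "sklearn", "tqdm", "matplotlib"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All baseline packages are importable.")
```

Anything reported as missing can then be installed with pip in a separate cell.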
For the data preprocessing and model training code, see the Datawhale open-source repository.
- Finally, upload the results.
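Uploading results usually means writing the predictions to a CSV file first. The sketch below is an illustration only, not the format required by the competition: it assumes one probability column per class named label_0, label_1, and so on, next to an id column, and it uses dummy ids and probabilities. The function build_submission and the file name submission_demo.csv are hypothetical.

```python
import numpy as np
import pandas as pd

def build_submission(ids, probabilities):
    """Put per-class probabilities next to the sample ids, one column per class."""
    probabilities = np.asarray(probabilities, dtype=float)
    columns = [f"label_{k}" for k in range(probabilities.shape[1])]
    submission = pd.DataFrame(probabilities, columns=columns)
    submission.insert(0, "id", list(ids))
    return submission

# Dummy values for illustration only; real ids would come from testA.csv
# and real probabilities from a trained model.
demo_ids = [100000, 100001]
demo_probs = [[0.7, 0.1, 0.1, 0.1], [0.2, 0.2, 0.5, 0.1]]

submission = build_submission(demo_ids, demo_probs)
submission.to_csv("submission_demo.csv", index=False)  # hypothetical file name
print(submission)
```

The column layout should be checked against the sample submission file provided with the competition before uploading.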