Python: a one-hot encoding example

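This post shows how to preprocess data in Python. Features such as gender, age, height, and weight are encoded column by column: categorical columns are one-hot encoded with pd.get_dummies (OneHotEncoder and LabelEncoder are imported but never called), and numeric columns are kept as they are. The script reads ObesityDataSet_raw_and_data_sinthetic.csv, assembles the encoded columns into a DataFrame, and saves the result as a CSV file.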
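Before the full script, here is a minimal sketch of what the two kinds of encoding produce. The toy column and its values are made up for illustration and are not taken from the dataset; the scikit-learn calls are shown only for comparison, since the script itself relies on pd.get_dummies.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Toy column, invented for illustration (not read from the CSV).
toy = pd.DataFrame({"MTRANS": ["Walking", "Bike", "Walking", "Automobile"]})

# One-hot encoding with pandas: one 0/1 column per category, sorted by name.
dummies = pd.get_dummies(toy["MTRANS"], dtype=int)
print(dummies.columns.tolist())
print(dummies.to_numpy())

# The same encoding with scikit-learn; toarray() densifies the sparse result.
encoder = OneHotEncoder()
onehot = encoder.fit_transform(toy[["MTRANS"]]).toarray()
print(onehot)

# Label encoding: a single integer per category instead of one column each.
labels = LabelEncoder().fit_transform(toy["MTRANS"])
print(labels)
```

One-hot encoding avoids implying an order between categories, which is why it suits feature columns; label encoding yields a single integer column and is usually reserved for the target.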
import pandas as pd
import numpy as np
import os
os.getcwd()
# Raw string, so the backslash is not treated as an (invalid) escape sequence.
os.chdir(r"D:\OCdata")
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import LabelEncoder

data = pd.read_csv("ObesityDataSet_raw_and_data_sinthetic.csv")

# print(data["family_history_with_overweight"].value_counts())

# Categorical column: one-hot encode it, giving one column per category.
Gender = np.array(pd.get_dummies(data["Gender"]))

print("Gender",Gender.shape)


# Numeric column: keep the values and reshape to a column vector of shape (n, 1).
Age = np.array(data["Age"]).reshape(-1,1)
print("Age",Age.shape)

Height = np.array(data["Height"]).reshape(-1,1)
print("height",Height.shape)

Weight = np.array(data["Weight"]).reshape(-1,1)
print("Weight",Weight.shape)

family = np.array(pd.get_dummies(data["family_history_with_overweight"]))
print("family",family.shape)

FCVC = np.array(data["FCVC"]).reshape(-1,1)
print("FCVC",FCVC.shape)


NCP = np.array(data["NCP"]).reshape(-1,1)
print("NCP",NCP.shape)

CAEC = np.array(pd.get_dummies(data["CAEC"]))
print("CAEC",CAEC.shape)

smoke = np.array(pd.get_dummies(data["SMOKE"]))
print("smoke",smoke.shape)

CH2O = np.array(data["CH2O"]).reshape(-1,1)
print("CH2O",CH2O.shape)

SCC = np.array(pd.get_dummies(data["SCC"]))
print("SCC",SCC.shape)

FAF = np.array(data["FAF"]).reshape(-1,1)
print("FAF ",FAF.shape)

TUE = np.array(data["TUE"]).reshape(-1,1)
print("TUE ",TUE.shape)

CALC = np.array(pd.get_dummies(data["CALC"]))
print("CALC",CALC.shape)

MTRANS = np.array(pd.get_dummies(data["MTRANS"]))
print("MTRANS",MTRANS.shape)


print(data["NObeyesdad"].value_counts())

NObeyesdad = np.array(data["NObeyesdad"])

# Collapse the seven original labels into four classes:
# 0 = insufficient weight, 1 = normal weight, 2 = overweight, 3 = obese.
label_to_class = {
    "Insufficient_Weight": 0,
    "Normal_Weight": 1,
    "Overweight_Level_I": 2,
    "Overweight_Level_II": 2,
    "Obesity_Type_I": 3,
    "Obesity_Type_II": 3,
    "Obesity_Type_III": 3,
}
# An unexpected label raises KeyError instead of being skipped silently,
# which would leave decision shorter than the feature arrays.
decision = [label_to_class[ele] for ele in NObeyesdad]

decision = np.array(decision).reshape(-1,1)
print("decision",decision.shape)

Data = np.concatenate((Gender,Age,Height,Weight,family,FCVC,NCP,CAEC,smoke,CH2O,SCC,FAF,TUE,CALC,MTRANS,decision),axis=1)
print(Data.shape)
Data = pd.DataFrame(Data)
# Write the values only: no index column and no header row.
Data.to_csv(r"D:\OCdata\Obesity.csv",index=False,header=False)
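The column-by-column approach above can also be written more compactly. The following is a rough sketch rather than a drop-in replacement: it runs on a tiny invented DataFrame instead of the real CSV, and it assumes that keeping column names is acceptable. Note that pd.get_dummies with the columns argument moves the encoded columns to the end, so the column order differs from the script's output.

```python
import pandas as pd

# Tiny stand-in for the CSV; the values are invented for illustration.
df = pd.DataFrame({
    "Gender": ["Female", "Male", "Male"],
    "Age": [21.0, 23.0, 27.0],
    "SMOKE": ["no", "yes", "no"],
    "NObeyesdad": ["Normal_Weight", "Overweight_Level_I", "Obesity_Type_II"],
})

# Same four-class grouping as in the script above.
label_to_class = {
    "Insufficient_Weight": 0,
    "Normal_Weight": 1,
    "Overweight_Level_I": 2,
    "Overweight_Level_II": 2,
    "Obesity_Type_I": 3,
    "Obesity_Type_II": 3,
    "Obesity_Type_III": 3,
}

# Map the target labels to class ids in one vectorized step.
target = df["NObeyesdad"].map(label_to_class)

# One-hot encode the listed categorical columns; numeric columns pass through.
features = pd.get_dummies(
    df.drop(columns="NObeyesdad"), columns=["Gender", "SMOKE"], dtype=int
)

# Append the target as the last column, as the script does.
encoded = features.assign(decision=target)
print(encoded.columns.tolist())
print(encoded.shape)
```

Keeping the names (for example Gender_Male) makes the saved file easier to inspect; passing header=False to to_csv would reproduce the headerless output of the original script.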



 
