4-tabular

Working with tabular data

Import the package and load the dataset

from fastai.tabular import *
path = untar_data(URLs.ADULT_SAMPLE)
df = pd.read_csv(path/'adult.csv')
  • Specify which columns are categorical variables and which are continuous variables
  • Define the data preprocessing steps (procs)
dep_var = 'salary'
cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race']
cont_names = ['age', 'fnlwgt', 'education-num']
procs = [FillMissing, Categorify, Normalize]
test = TabularList.from_df(df.iloc[800:1000].copy(), path=path, cat_names=cat_names, cont_names=cont_names)
data = (TabularList.from_df(df, path=path, cat_names=cat_names, cont_names=cont_names, procs=procs)
                           .split_by_idx(list(range(800,1000)))
                           .label_from_df(cols=dep_var)
                           .add_test(test)
                           .databunch())
data.show_batch(rows=10)
workclass         education     marital-status      occupation       relationship   race                education-num_na  age      fnlwgt   education-num  target
Local-gov         Bachelors     Never-married       Prof-specialty   Not-in-family  White               False             -0.1896  -0.7476  1.1422         <50k
Self-emp-not-inc  HS-grad       Married-civ-spouse  Craft-repair     Husband        White               False             -0.0430  0.0063   -0.4224        <50k
Private           HS-grad       Married-civ-spouse  Sales            Husband        White               False             -1.1425  -1.4272  -0.4224        <50k
Private           HS-grad       Divorced            Sales            Own-child      White               False             -0.2629  2.4893   -0.4224        <50k
Private           Some-college  Never-married       Tech-support     Own-child      White               False             -1.4357  -0.2975  -0.0312        <50k
Self-emp-not-inc  Bachelors     Never-married       Sales            Own-child      Asian-Pac-Islander  False             0.0303   1.1938   1.1422         <50k
Private           HS-grad       Never-married       Adm-clerical     Unmarried      White               False             -0.9959  -0.1439  -0.4224        <50k
Private           Bachelors     Married-civ-spouse  Exec-managerial  Husband        White               False             0.0303   -0.2085  1.1422         >=50k
?                 7th-8th       Divorced            ?                Unmarried      White               False             1.2030   0.0396   -2.3781        <50k
?                 Some-college  Never-married       ?                Own-child      White               False             -1.3624  0.4335   -0.0312        <50k
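To get a feel for what the three procs in the list above do, here is a rough sketch in plain pandas. It is not fastai's implementation: the toy DataFrame and its values are made up, and the median fill is an assumption about the default strategy. It shows where a column like education-num_na comes from and why the continuous columns in the batch are centred around zero.

```python
import pandas as pd

# Toy frame; the values are invented for illustration, not taken from adult.csv.
df = pd.DataFrame({
    "workclass": ["Private", "Local-gov", "Private", "?"],
    "education-num": [9.0, None, 13.0, 10.0],
})

# FillMissing-like step: flag the missing rows, then fill them with the median
# (the median fill is an assumption for this sketch).
df["education-num_na"] = df["education-num"].isna()
median = df["education-num"].median()
df["education-num"] = df["education-num"].fillna(median)

# Categorify-like step: convert the string column to a categorical,
# so every distinct value is backed by an integer code.
df["workclass"] = df["workclass"].astype("category")
codes = df["workclass"].cat.codes

# Normalize-like step: subtract the mean and divide by the standard deviation.
mean, std = df["education-num"].mean(), df["education-num"].std()
df["education-num"] = (df["education-num"] - mean) / std

print(df)
print(codes.tolist())
```

In the real pipeline the statistics (median, categories, mean, std) are computed on the training rows and then reused for the validation and test rows.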
  • Network architecture: layers=[200,100], i.e. two hidden layers with 200 and 100 units
learn = tabular_learner(data, layers=[200,100], metrics=accuracy)
learn.fit(1, 1e-2)
epoch  train_loss  valid_loss  accuracy  time
0      0.359038    0.390357    0.805000  00:52
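As a quick sanity check on the accuracy figure: the validation set here is the 200 rows selected by split_by_idx(list(range(800,1000))), so the reported accuracy can be turned back into a count of correct predictions. A small sketch of that arithmetic (the variable names are just for illustration):

```python
# Validation indices used in split_by_idx above.
valid_idx = list(range(800, 1000))
n_valid = len(valid_idx)

# Accuracy reported after one epoch.
accuracy = 0.805

# Number of validation rows the model classified correctly.
n_correct = round(accuracy * n_valid)
print(n_valid, n_correct)
```

With such a small validation set, one row more or less moves the accuracy by 0.5 percentage points.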
row = df.iloc[0]
learn.predict(row)
(Category <50k, tensor(0), tensor([0.5156, 0.4844]))
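The tuple returned by predict holds the predicted category, its class index, and the per-class probabilities. A minimal sketch of how the three pieces relate, using plain Python lists in place of tensors; the class order is an assumption inferred from the output above, where index 0 maps to <50k:

```python
# Assumed class order, inferred from the output above (index 0 -> "<50k").
classes = ["<50k", ">=50k"]

# Probabilities copied from the prediction above.
probs = [0.5156, 0.4844]

# The predicted index is the position of the largest probability.
pred_idx = max(range(len(probs)), key=probs.__getitem__)
pred_label = classes[pred_idx]
confidence = probs[pred_idx]

print(pred_label, pred_idx, confidence)
```

The two probabilities are nearly equal here, so the model is only weakly confident about this row after a single epoch.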