pandas.read_csv(): notes on the names parameter

pandas official documentation

 

names : array-like, default None

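List of column names to use for the resulting DataFrame. If the data file has no header row, pass header=None explicitly. Duplicates in this list are not allowed. Older pandas documentation listed mangle_dupe_cols=True as an exception, but that parameter was deprecated in pandas 1.5 and removed in 2.0.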
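As a minimal sketch of the description above, the snippet below reads a small headerless CSV held in memory and assigns column names through names. The data string and the three column names are made up for illustration and are not taken from train.csv.

```python
import io

import pandas as pd

# Hypothetical headerless data: the very first line is already a record.
raw = "37,Male,4\n54,Female,4\n34,Male,3\n"

# header=None tells pandas there is no header line; names supplies the labels.
df = pd.read_csv(io.StringIO(raw), header=None, names=["Age", "Gender", "Education"])
print(df)

# Without names, the columns are simply numbered 0, 1, 2.
unnamed = pd.read_csv(io.StringIO(raw), header=None)
print(list(unnamed.columns))
```

All three lines are kept as data, and only the column labels differ between the two calls.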

 

Age  Gender  Education  EducationField  MaritalStatus  Income  OverTime
37   Male    4          Life Sciences   Divorced       5993    No
54   Female  4          Life Sciences   Divorced       10502   No
34   Male    3          Life Sciences   Single         6074    Yes
39   Female  1          Life Sciences   Married        12742   No
28   Male    3          Medical         Divorced       2596    No
24   Female  1          Medical         Married        4162    Yes
29   Male    5          Other           Single         3983    No
36   Male    2          Medical         Married        7596    No
33   Female  4          Medical         Married        2622    No

import pandas as pd

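1.1  names=['new_0','new_1','new_2','new_3','new_4','new_5','new_6']

When names is passed and header is not, pandas behaves as if header=None: the file's own header line is read as the first data row.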

data = pd.read_csv('./train.csv',
                   names=['new_0','new_1','new_2','new_3','new_4','new_5','new_6']
                   )

print(data.head(5))

Output:

  new_0   new_1      new_2           new_3          new_4   new_5     new_6
0   Age  Gender  Education  EducationField  MaritalStatus  Income  OverTime
1    37    Male          4   Life Sciences       Divorced    5993        No
2    54  Female          4   Life Sciences       Divorced   10502        No
3    34    Male          3   Life Sciences         Single    6074       Yes
4    39  Female          1   Life Sciences        Married   12742        No
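Row 0 above holds the file's original header line. If the intent is to replace that header rather than keep it as data, header=0 can be passed together with names. The sketch below assumes train.csv is comma-separated and uses an in-memory copy of its first rows instead of the real file.

```python
import io

import pandas as pd

# In-memory stand-in for the first lines of train.csv (assumed comma-separated).
csv_text = """Age,Gender,Education,EducationField,MaritalStatus,Income,OverTime
37,Male,4,Life Sciences,Divorced,5993,No
54,Female,4,Life Sciences,Divorced,10502,No
"""

new_names = ['new_0', 'new_1', 'new_2', 'new_3', 'new_4', 'new_5', 'new_6']

# header=0 marks line 0 as the header; names then replaces the labels read from it.
replaced = pd.read_csv(io.StringIO(csv_text), header=0, names=new_names)
print(replaced)
```

With this call the original header line is consumed as a header, so the first data row is the record for age 37 and the numeric columns keep a numeric dtype.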

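1.2  header=None,  names=['new_0','new_1','new_2','new_3','new_4','new_5','new_6']

Passing header=None explicitly gives the same result as 1.1.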

data = pd.read_csv('./train.csv',
                   header=None,
                   names=['new_0','new_1','new_2','new_3','new_4','new_5','new_6']
                   )

print(data.head(5))

Output:

  new_0   new_1      new_2           new_3          new_4   new_5     new_6
0   Age  Gender  Education  EducationField  MaritalStatus  Income  OverTime
1    37    Male          4   Life Sciences       Divorced    5993        No
2    54  Female          4   Life Sciences       Divorced   10502        No
3    34    Male          3   Life Sciences         Single    6074       Yes
4    39  Female          1   Life Sciences        Married   12742        No
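The outputs of 1.1 and 1.2 look identical, and this can be checked directly. The sketch below compares the two calls on an in-memory copy of the first rows (assuming a comma-separated file) and also shows one side effect: because the header line is kept as data, the values in each column are read as strings.

```python
import io

import pandas as pd

# In-memory stand-in for the first lines of train.csv (assumed comma-separated).
csv_text = """Age,Gender,Education,EducationField,MaritalStatus,Income,OverTime
37,Male,4,Life Sciences,Divorced,5993,No
54,Female,4,Life Sciences,Divorced,10502,No
"""

new_names = ['new_0', 'new_1', 'new_2', 'new_3', 'new_4', 'new_5', 'new_6']

# 1.1: names only; 1.2: names plus an explicit header=None.
implicit = pd.read_csv(io.StringIO(csv_text), names=new_names)
explicit = pd.read_csv(io.StringIO(csv_text), header=None, names=new_names)

print(implicit.equals(explicit))

# The header text "Age" sits in the same column as the numbers,
# so the ages come back as strings rather than integers.
print(implicit["new_0"].tolist())
```

If numeric columns are needed, the header line has to be consumed as a header or skipped instead of being kept as a data row.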

1.3  header=2,  names=['new_0','new_1','new_2','new_3','new_4','new_5','new_6']

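With header=2, line 2 of the file (counting from 0) is treated as the header row and the DataFrame starts at line 3. The column names passed in names then override the names read from line 2.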

data = pd.read_csv('./train.csv',
                   header=2,
                   names=['new_0','new_1','new_2','new_3','new_4','new_5','new_6']
                   )

print(data.head(5))

Output:

   new_0   new_1  new_2          new_3     new_4  new_5 new_6
0     34    Male      3  Life Sciences    Single   6074   Yes
1     39  Female      1  Life Sciences   Married  12742    No
2     28    Male      3        Medical  Divorced   2596    No
3     24  Female      1        Medical   Married   4162   Yes
4     29    Male      5          Other    Single   3983    No
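Since names overrides whatever is read from line 2, that line is effectively discarded along with the two lines before it. One way to see this is to compare against skiprows=3 with header=None, which skips the same three lines. The sketch below uses an in-memory copy of the first rows and assumes a comma-separated file.

```python
import io

import pandas as pd

# In-memory stand-in for the first lines of train.csv (assumed comma-separated).
csv_text = """Age,Gender,Education,EducationField,MaritalStatus,Income,OverTime
37,Male,4,Life Sciences,Divorced,5993,No
54,Female,4,Life Sciences,Divorced,10502,No
34,Male,3,Life Sciences,Single,6074,Yes
39,Female,1,Life Sciences,Married,12742,No
28,Male,3,Medical,Divorced,2596,No
"""

new_names = ['new_0', 'new_1', 'new_2', 'new_3', 'new_4', 'new_5', 'new_6']

# header=2: lines 0-2 are dropped (line 2 as the header), data starts at line 3.
with_header = pd.read_csv(io.StringIO(csv_text), header=2, names=new_names)

# skiprows=3 drops the same three lines; header=None keeps everything after them as data.
with_skip = pd.read_csv(io.StringIO(csv_text), skiprows=3, header=None, names=new_names)

print(with_header)
print(with_header.equals(with_skip))
```

Note that the records for ages 37 and 54 are lost in both variants, which is rarely what is wanted when the file has a single header line.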

 

 

 
