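Learning the pandas DataFrame

Getting started with pandas

Using Series:

  • Definition of a Series
  • Creating a basic Series object and accessing its values and index
  • Creating a Series object from a dict
  • Checking whether a Series object contains missing values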
import pandas as pd
from pandas import Series, DataFrame
import numpy as np
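The four Series topics listed above can be sketched roughly as follows. The sample values and the `populations` dict are made up for illustration; they do not come from the original notes.

```python
import pandas as pd

# Basic creation: values plus an explicit index
s = pd.Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c'])
values = s.values          # underlying NumPy array
labels = list(s.index)     # ['d', 'b', 'a', 'c']
picked = s['a']            # label-based access to a single value

# Creation from a dict: the keys become the index
populations = {'ohio': 35000, 'texas': 71000, 'utah': 5000}  # hypothetical data
s2 = pd.Series(populations)

# Passing an index with a label the dict lacks gives NaN for that label
s3 = pd.Series(populations, index=['ohio', 'texas', 'california'])
missing = s3.isnull()              # boolean Series, True where a value is missing
has_missing = bool(missing.any())  # True if any value is missing
```

`isnull()` returns a boolean Series, so `.any()` is one way to reduce it to a single yes/no answer.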

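The pandas DataFrame

  • Definition: a DataFrame is a tabular data structure containing an ordered collection of columns, each of which can hold a different value type (boolean, string, numeric, and so on).
  • A DataFrame can be thought of as a dict of Series that all share one index.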
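To make the "dict of Series sharing one index" idea concrete, here is a minimal sketch. The two small Series are assumptions chosen for illustration; note that they deliberately have different lengths.

```python
import pandas as pd

# Two Series with overlapping but not identical indexes (illustrative data)
pop = pd.Series([1.5, 1.7, 3.5], index=['a', 'b', 'c'])
years = pd.Series([2000, 2001], index=['a', 'b'])

# Each dict key becomes a column; the row index is the union of both indexes
frame = pd.DataFrame({'pop': pop, 'years': years})

# Pulling a column back out gives a Series carrying the frame's index
col = frame['pop']
shared = frame.index.equals(col.index)
```

Because `years` has no entry for `'c'`, that cell is filled with NaN rather than raising an error.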

Creating a DataFrame: the most common way is to pass in a dict of equal-length lists or NumPy arrays.

data = {'state':['ohic','ohic','ohic','nevada','nevada'],
        'years':[2000,2001,2002,2003,2004],
        'pop':[1.5, 1.7, 3.5, 2.4, 1.9]}
frame = DataFrame(data)
frame
    state  years  pop
0    ohic   2000  1.5
1    ohic   2001  1.7
2    ohic   2002  3.5
3  nevada   2003  2.4
4  nevada   2004  1.9

Specifying the column order of a DataFrame

frame = DataFrame(data, columns = ['pop', 'state', 'years'])
frame
   pop   state  years
0  1.5    ohic   2000
1  1.7    ohic   2001
2  3.5    ohic   2002
3  2.4  nevada   2003
4  1.9  nevada   2004
fm = DataFrame(data, index = ['one','two', 'thress','four', 'five'], columns=['pop','state','years'])
# specify both the row index and the column order
fm
        pop   state  years
one     1.5    ohic   2000
two     1.7    ohic   2001
thress  3.5    ohic   2002
four    2.4  nevada   2003
five    1.9  nevada   2004
fm.columns
Index(['pop', 'state', 'years'], dtype='object')
fm.index
Index(['one', 'two', 'thress', 'four', 'five'], dtype='object')
# fm.index.name = 'a'  # name is an attribute, so it is assigned rather than called

Working with DataFrame columns: viewing a column's values and assigning to a column

fm['state']  # view a column's values by column label
one         ohic
two         ohic
thress      ohic
four      nevada
five      nevada
Name: state, dtype: object
fm.state  # view a column's values by attribute access
one         ohic
two         ohic
thress      ohic
four      nevada
five      nevada
Name: state, dtype: object
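One caveat about attribute access is worth a small sketch: it only works when the column name does not clash with an existing DataFrame attribute or method. The column `pop` used in these notes happens to share its name with the `DataFrame.pop` method, so the two access styles behave differently for it. The two-row frame below is a cut-down copy of the data, for illustration only.

```python
import pandas as pd

frame = pd.DataFrame({'state': ['ohic', 'nevada'], 'pop': [1.5, 2.4]})

by_key = frame['pop']      # bracket access always returns the column
by_attr = frame.pop        # attribute access finds the DataFrame.pop method instead
is_method = callable(by_attr)

state_attr = frame.state   # no name clash here, so this is the 'state' column
```

For that reason bracket access is usually the safer default, with attribute access kept for quick interactive work.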
fm1 = DataFrame(data, index = ['one','two', 'thress','four', 'five'], columns=['pop','state','years','det'])
fm1
        pop   state  years  det
one     1.5    ohic   2000  NaN
two     1.7    ohic   2001  NaN
thress  3.5    ohic   2002  NaN
four    2.4  nevada   2003  NaN
five    1.9  nevada   2004  NaN
fm1.det = 19  # assign to a column via attribute access (exercise: how would you assign to one specific row and column?)
fm1
        pop   state  years  det
one     1.5    ohic   2000   19
two     1.7    ohic   2001   19
thress  3.5    ohic   2002   19
four    2.4  nevada   2003   19
five    1.9  nevada   2004   19
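As for the exercise of assigning to one specific row and column, one possible answer is label-based or position-based indexing. This is a sketch on a small stand-in frame, not the `fm1` object from the notes.

```python
import pandas as pd

frame = pd.DataFrame(
    {'pop': [1.5, 1.7, 3.5], 'det': [19, 19, 19]},
    index=['one', 'two', 'three'],
)

frame.loc['two', 'det'] = 0     # one cell, addressed by row label and column label
frame.at['three', 'det'] = 7    # .at is a scalar-only variant of .loc
frame.iloc[0, 1] = 5            # one cell, addressed by integer positions
```

Chained forms such as `frame['det']['two'] = 0` are best avoided, since they may write to a temporary copy instead of the frame itself.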
# When a list or array is assigned to a column, its length must match the length of the DataFrame; otherwise an error is raised
v = [1,2,3,4,5]
fm1.det = v
fm1
        pop   state  years  det
one     1.5    ohic   2000    1
two     1.7    ohic   2001    2
thress  3.5    ohic   2002    3
four    2.4  nevada   2003    4
five    1.9  nevada   2004    5
# v = [1, 2, 3, 5]  # ValueError: Length of values does not match length of index
# fm1.det = v
# fm1
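The length-mismatch case can be run safely by catching the exception. This is a minimal sketch on a stand-in frame; the exact wording of the error message may differ between pandas versions.

```python
import pandas as pd

frame = pd.DataFrame({'pop': [1.5, 1.7, 3.5, 2.4, 1.9]})

try:
    frame['det'] = [1, 2, 3, 5]   # four values for a five-row frame
    error = None
except ValueError as exc:
    error = str(exc)              # message describing the length mismatch
```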
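# When a Series is assigned to a column, its labels are matched exactly against the DataFrame's index, and any positions without a match are filled with missing values
# val = Series([1.2, 1, 3], index=['two', 'four', 'five'])  # one value per index label
# # fm1.det = val
# fm1['det'] = val
# fm1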
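A runnable version of the Series assignment might look like the sketch below. It uses a stand-in frame and three illustrative values, one per label; a Series needs as many values as index labels.

```python
import pandas as pd

frame = pd.DataFrame(
    {'pop': [1.5, 1.7, 3.5, 2.4, 1.9]},
    index=['one', 'two', 'three', 'four', 'five'],
)

# Only three of the five row labels are present in the Series
val = pd.Series([1.2, 1.0, 3.0], index=['two', 'four', 'five'])

# Values are placed by label, not by position; unmatched rows become NaN
frame['det'] = val
n_missing = int(frame['det'].isnull().sum())
```

Unlike a plain list, a Series shorter than the frame does not raise an error here, because alignment happens on labels rather than on length.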
# Assigning to a column that does not exist creates a new column; del removes a column
fm1['east'] = fm1.state=='ohic'
fm1
        pop   state  years  det   east
one     1.5    ohic   2000    1   True
two     1.7    ohic   2001    2   True
thress  3.5    ohic   2002    3   True
four    2.4  nevada   2003    4  False
five    1.9  nevada   2004    5  False
del fm1['east']

fm1
        pop   state  years  det
one     1.5    ohic   2000    1
two     1.7    ohic   2001    2
thress  3.5    ohic   2002    3
four    2.4  nevada   2003    4
five    1.9  nevada   2004    5
fm1.columns
Index(['pop', 'state', 'years', 'det'], dtype='object')
# Create a DataFrame from a nested dict: outer keys become column names, inner keys become row labels
data1 = {'a':{'x':1, 'y':2,'z':3},
         'b':{'x':11,'y':12, 'z':13}}
fm2 = DataFrame(data1)
fm2
   a   b
x  1  11
y  2  12
z  3  13
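The inner dicts do not have to share the same keys. As a sketch under that assumption (the `nested` dict below drops one entry from the example above), pandas takes the union of the inner keys as the row index and fills the gaps with NaN; transposing swaps the roles of the outer and inner keys.

```python
import pandas as pd

nested = {'a': {'x': 1, 'y': 2, 'z': 3},
          'b': {'x': 11, 'y': 12}}      # no 'z' entry for column 'b'

frame = pd.DataFrame(nested)   # rows x, y, z; columns a, b
transposed = frame.T           # rows a, b; columns x, y, z
```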

DataFrame name and values attributes: name, values

fm1.index.name = 'Uid'
fm1.columns.name = 'co_name'
fm1
co_name  pop   state  years  det
Uid
one      1.5    ohic   2000    1
two      1.7    ohic   2001    2
thress   3.5    ohic   2002    3
four     2.4  nevada   2003    4
five     1.9  nevada   2004    5
fm1.values
array([[1.5, 'ohic', 2000, 1],
       [1.7, 'ohic', 2001, 2],
       [3.5, 'ohic', 2002, 3],
       [2.4, 'nevada', 2003, 4],
       [1.9, 'nevada', 2004, 5]], dtype=object)
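The `dtype=object` in the output above comes from the mix of strings and numbers: `values` returns one 2-D array, so pandas picks a dtype that can hold every column. A small sketch with two stand-in frames shows the difference.

```python
import pandas as pd

mixed = pd.DataFrame({'pop': [1.5, 1.7], 'state': ['ohic', 'nevada']})
numeric = pd.DataFrame({'pop': [1.5, 1.7], 'years': [2000, 2001]})

mixed_dtype = mixed.values.dtype       # object: floats and strings together
numeric_dtype = numeric.values.dtype   # float64: ints are upcast to fit the floats
arr = numeric.to_numpy()               # to_numpy() is the method form of .values
```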