1.1 Data Analysis: pandas

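This article looks in detail at how pandas is used in data analysis, covering core functionality such as data cleaning, data merging, slicing, and aggregation, with the aim of helping readers analyze data more efficiently.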
import pandas as pd
a=[1,4,7,9]
s=pd.Series(a)
s
0    1
1    4
2    7
3    9
dtype: int64
s.dtypes
dtype('int64')
s.values
array([1, 4, 7, 9], dtype=int64)
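The attributes above can be checked in a standalone script. The following is a minimal sketch, not part of the original post; the variable names are illustrative. It uses `to_numpy()`, which the pandas documentation recommends as an alternative to `.values`.

```python
import numpy as np
import pandas as pd

values = [1, 4, 7, 9]
s = pd.Series(values)

# Without an explicit index, pandas assigns default integer labels 0..n-1.
default_index = list(s.index)

# to_numpy() returns the data as a NumPy array, like the .values attribute.
arr = s.to_numpy()

print(default_index)  # [0, 1, 2, 3]
print(arr.tolist())   # [1, 4, 7, 9]
print(s.dtype)        # int64
```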
# Set a custom index
a=[1,4,7,9]
index=['a','b','c','d']
s=pd.Series(a,index)
s
a    1
b    4
c    7
d    9
dtype: int64
s.index
Index(['a', 'b', 'c', 'd'], dtype='object')
s
a    1
b    4
c    7
d    9
dtype: int64
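Once a Series has string labels, elements can be reached either by label or by position. The sketch below is an illustration added here, not from the original post; it contrasts `loc` (label-based) with `iloc` (position-based).

```python
import pandas as pd

s = pd.Series([1, 4, 7, 9], index=["a", "b", "c", "d"])

by_label = s.loc["b"]          # label-based lookup
by_position = s.iloc[1]        # position-based lookup (second element)
label_slice = s.loc["b":"c"]   # label slices include both endpoints

print(by_label, by_position)   # 4 4
print(label_slice.tolist())    # [4, 7]
```

Note that a label slice such as `"b":"c"` includes the end label, unlike ordinary Python positional slicing.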

a=[1,4,7,9]
index=list('rtyu')
s=pd.Series(a,index)
s
r    1
t    4
y    7
u    9
dtype: int64
s[s>5]
y    7
u    9
dtype: int64
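The expression `s[s>5]` works because `s>5` produces a boolean Series that is then used as a mask. A small sketch of the same idea, with combined and negated conditions added for illustration (these extra conditions are not in the original post):

```python
import pandas as pd

s = pd.Series([1, 4, 7, 9], index=list("rtyu"))

mask = s > 5                     # boolean Series aligned with s
between = s[(s > 1) & (s < 9)]   # combine conditions with & and parentheses
not_large = s[~mask]             # ~ negates the mask

print(mask.tolist())       # [False, False, True, True]
print(between.tolist())    # [4, 7]
print(not_large.tolist())  # [1, 4]
```

The parentheses around each comparison are needed because `&` binds more tightly than `>` and `<` in Python.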
dic = {'beijing': 35000, 'shanghai': 71000, 'guangzhou': 16000, 'shenzhen': 5000}
s1=pd.Series(dic)
s1
beijing      35000
shanghai     71000
guangzhou    16000
shenzhen      5000
dtype: int64
s1['beijing']
35000
s1.keys()
Index(['beijing', 'shanghai', 'guangzhou', 'shenzhen'], dtype='object')
list(s1.items())
[('beijing', 35000),
 ('shanghai', 71000),
 ('guangzhou', 16000),
 ('shenzhen', 5000)]
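A Series built from a dict behaves much like an ordered mapping from labels to values. The following sketch shows a few dict-like operations; the variable names and the label `"hangzhou"` are made up for illustration and do not appear in the original data.

```python
import pandas as pd

populations = {"beijing": 35000, "shanghai": 71000,
               "guangzhou": 16000, "shenzhen": 5000}
s1 = pd.Series(populations)

has_beijing = "beijing" in s1      # membership tests the index labels
missing = s1.get("hangzhou", 0)    # get() returns the default for an absent label
largest = s1.idxmax()              # label of the largest value
back_to_dict = s1.to_dict()        # convert back to a plain dict

print(has_beijing)  # True
print(missing)      # 0
print(largest)      # shanghai
```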
s1.notnull()
beijing      True
shanghai     True
guangzhou    True
shenzhen     True
dtype: bool
pd.notnull(s1)
beijing      True
shanghai     True
guangzhou    True
shenzhen     True
dtype: bool
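Every value in `s1` is present, so `notnull()` returns `True` throughout. To see it flag a missing value, one option is to pass an explicit index containing a label that is absent from the dict. This is a sketch under that assumption; `"hangzhou"` is a hypothetical label chosen for illustration.

```python
import pandas as pd

dic = {"beijing": 35000, "shanghai": 71000,
       "guangzhou": 16000, "shenzhen": 5000}

# "hangzhou" is not a key in dic, so its value becomes NaN.
cities = ["beijing", "shanghai", "hangzhou"]
s2 = pd.Series(dic, index=cities)

present = s2.notnull()
n_missing = int(s2.isnull().sum())
cleaned = s2.dropna()              # drop the missing entries

print(present.tolist())      # [True, True, False]
print(n_missing)             # 1
print(list(cleaned.index))   # ['beijing', 'shanghai']
```

Because NaN is a floating-point value, `s2` has dtype `float64` rather than `int64`.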
s1.name='xxxxxxx'
s1
beijing      35000
shanghai     71000
guangzhou    16000
shenzhen      5000
Name: xxxxxxx, dtype: int64
s1.index.name='qqqqqqqqqqq'
s1
qqqqqqqqqqq
beijing      35000
shanghai     71000
guangzhou    16000
shenzhen      5000
Name: xxxxxxx, dtype: int64
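The Series name and the index name are more than display labels: they become column names when the Series is converted to a DataFrame. A minimal sketch, using the descriptive names `"population"` and `"city"` (chosen here for illustration) in place of the placeholder strings above:

```python
import pandas as pd

dic = {"beijing": 35000, "shanghai": 71000,
       "guangzhou": 16000, "shenzhen": 5000}
s1 = pd.Series(dic)
s1.name = "population"     # illustrative name for the values
s1.index.name = "city"     # illustrative name for the index

# reset_index() moves the index into a regular column,
# using the index name and the Series name as column headers.
df = s1.reset_index()

print(list(df.columns))    # ['city', 'population']
print(df.shape)            # (4, 2)
```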
data = {
    'city': ['beijing', 'beijing', 'beijing', 'shanghai', 'shanghai', 'shanghai'],
    'year': [2000, 2001, 2002, 2001, 2002, 2003],
    'pop': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2],
}
data
{'city': ['beijing', 'beijing', 'beijing', 'shanghai', 'shanghai', 'shanghai'],
 'year': [2000, 2001, 2002, 2001, 2002, 2003],
 'pop': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]}
df=pd.DataFrame(data)
df
       city  year  pop
0   beijing  2000  1.5
1   beijing  2001  1.7
2   beijing  2002  3.6
3  shanghai  2001  2.4
4  shanghai  2002  2.9
5  shanghai  2003  3.2
df.index
RangeIndex(start=0, stop=6, step=1)
df.columns
Index(['city', 'year', 'pop'], dtype='object')
df.dtypes
city     object
year      int64
pop     float64
dtype: object
df.values
array([['beijing', 2000, 1.5],
       ['beijing', 2001, 1.7],
       ['beijing', 2002, 3.6],
       ['shanghai', 2001, 2.4],
       ['shanghai', 2002, 2.9],
       ['shanghai', 2003, 3.2]], dtype=object)
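The `dtype=object` in the `df.values` output appears because the columns have different types, so NumPy falls back to a generic object array. The sketch below, added for illustration, shows that selecting only the numeric columns yields a numeric array instead.

```python
import pandas as pd

data = {
    "city": ["beijing", "beijing", "beijing", "shanghai", "shanghai", "shanghai"],
    "year": [2000, 2001, 2002, 2001, 2002, 2003],
    "pop": [1.5, 1.7, 3.6, 2.4, 2.9, 3.2],
}
df = pd.DataFrame(data)

mixed = df.to_numpy()                     # strings and numbers: object array
numeric = df[["year", "pop"]].to_numpy()  # int64 and float64 upcast to float64

print(df.shape)        # (6, 3)
print(mixed.dtype)     # object
print(numeric.dtype)   # float64
```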
df.head(1)
      city  year  pop
0  beijing  2000  1.5
df.tail(2)
       city  year  pop
4  shanghai  2002  2.9
5  shanghai  2003  3.2
df.columns
Index(['city', 'year', 'pop'], dtype='object')
df1 = pd.DataFrame(data, columns=['year', 'city', 'pop'])
df1
   year      city  pop
0  2000   beijing  1.5
1  2001   beijing  1.7
2  2002   beijing  3.6
3  2001  shanghai  2.4
4  2002  shanghai  2.9
5  2003  shanghai  3.2
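The `columns` argument controls column order, and it can also name a column that is not in the dict. In that case pandas creates the column and fills it with NaN. The sketch below illustrates this; `"debt"` is a hypothetical column name that does not appear in the original data.

```python
import pandas as pd

data = {
    "city": ["beijing", "beijing", "beijing", "shanghai", "shanghai", "shanghai"],
    "year": [2000, 2001, 2002, 2001, 2002, 2003],
    "pop": [1.5, 1.7, 3.6, 2.4, 2.9, 3.2],
}

# "debt" is not a key in data, so the column is created and filled with NaN.
df_extra = pd.DataFrame(data, columns=["year", "city", "pop", "debt"])

all_missing = bool(df_extra["debt"].isna().all())

print(list(df_extra.columns))  # ['year', 'city', 'pop', 'debt']
print(all_missing)             # True
```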
df2 = pd.DataFrame(data,index=['a','b'