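Pandas Data Structures
- Series: a one-dimensional array structure
- DataFrame: a tabular (two-dimensional) array structure
Import the modules pandas, numpy, and matplotlib.pyplot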
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
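- Series structure (one-dimensional array)
- Definition: similar to a one-dimensional array; it consists of a set of index labels (data labels, i.e. the row index) and a set of data values.
- Creating a Series:
pd.Series()
- Pass a list to the Series constructor
- 1. Without index: the data labels default to integers starting from 0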
s1 = pd.Series(["a","b","c","d"])
- 2. With an explicit index: use "index=" to supply the labels
s3 = pd.Series([1,2,3,4],index=["a","b","c","d"])
- 3. Pass a dict {"key1": "value1", ...}: the keys become the index labels and the values become the data
s4 = pd.Series({"a":1,"b":2,"c":3,"d":4})
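The three ways of creating a Series listed above can be compared side by side. This is a minimal sketch that reuses the names s1, s3, and s4 from the notes; the checks at the end are illustrative additions, not part of the original notes.

```python
import pandas as pd

# 1. List without an index: labels default to 0, 1, 2, ...
s1 = pd.Series(["a", "b", "c", "d"])

# 2. List with explicit labels supplied through index=
s3 = pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"])

# 3. Dict: keys become the labels, values become the data
s4 = pd.Series({"a": 1, "b": 2, "c": 3, "d": 4})

print(list(s1.index))   # [0, 1, 2, 3]
print(s3["b"])          # 2 (lookup by label)
print(s3.equals(s4))    # True: same labels and same data
```

Here the list-plus-index form and the dict form produce the same Series, so the choice between them is mostly a matter of which shape the input data already has.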
- Get a Series's index labels (the index attribute) and its data (the values attribute; both are attributes, so no parentheses)
s1.index; s1.values
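As a small sketch of reading the labels and the data back out of a Series: the variable names s, labels, and data below are chosen for illustration only.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"])

labels = s.index    # a pandas Index object holding the labels
data = s.values     # a NumPy array holding the data (attribute, no parentheses)

print(labels.tolist())               # ['a', 'b', 'c', 'd']
print(isinstance(data, np.ndarray))  # True
print(data.tolist())                 # [1, 2, 3, 4]
```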
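- DataFrame: tabular data structure
- Definition: consists of a pair of indexes (a row index and a column index) and a set of data
- Creating a DataFrame:
pd.DataFrame()
- Without row or column indexes: both default to integers starting from 0
- Pass a single flat list to DataFrame (this produces one column):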
d1 = pd.DataFrame(["a","b","c","d"])
- Pass a nested list to DataFrame (each inner list becomes one row):
d2 = pd.DataFrame([["a","A"],["b","B"],["c","C"],["d","D"]])
or, equivalently, with a list of tuples:
d2 = pd.DataFrame([("a","A"),("b","B"),("c","C"),("d","D")])
- Set the row index (index) and the column index (columns):
d3 = pd.DataFrame([["a","A"],["b","B"],["c","C"],["d","D"]],index=["一","二","三","四"],columns=["小写","大写"])
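The list-based constructions above can be checked with a short sketch. It reuses d1, d2, and d3 from the notes; the helper variable rows and the checks at the end are added here for illustration. The labels "小写" and "大写" mean "lowercase" and "uppercase".

```python
import pandas as pd

rows = [["a", "A"], ["b", "B"], ["c", "C"], ["d", "D"]]

# Flat list: one column, default row and column labels
d1 = pd.DataFrame(["a", "b", "c", "d"])

# Nested list: each inner list is a row, labels still default to 0, 1, ...
d2 = pd.DataFrame(rows)

# Same data with explicit row labels (index) and column labels (columns)
d3 = pd.DataFrame(rows, index=["一", "二", "三", "四"], columns=["小写", "大写"])

print(d1.shape)               # (4, 1)
print(d2.shape)               # (4, 2)
print(list(d2.columns))       # [0, 1]
print(d3.loc["二", "大写"])    # B (lookup by row label and column label)
```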
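- Pass a dict to DataFrame (the keys become the column labels)
data = {"小写":["a","b","c","d"],"大写":["A","B","C","D"]}
d4 = pd.DataFrame(data)
- Note on custom indexes: because the dict keys already supply the column labels, only the row index is passed, via index=
data = {"小写":["a","b","c","d"],"大写":["A","B","C","D"]}
d5 = pd.DataFrame(data,index = ["一","二","三","四"])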
- Get a DataFrame's row index (d5.index) and column index (d5.columns)
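A minimal sketch of the dict-based construction and of reading the row and column indexes back, reusing data, d4, and d5 from the notes; the printed checks are illustrative additions.

```python
import pandas as pd

data = {"小写": ["a", "b", "c", "d"], "大写": ["A", "B", "C", "D"]}

# Dict keys become column labels; rows get the default 0, 1, 2, ... labels
d4 = pd.DataFrame(data)

# Same dict, with custom row labels supplied through index=
d5 = pd.DataFrame(data, index=["一", "二", "三", "四"])

print(list(d4.index))     # [0, 1, 2, 3]
print(list(d5.index))     # ['一', '二', '三', '四']
print(list(d5.columns))   # ['小写', '大写']
```

The list passed to index= needs one label per row (four here), otherwise pandas raises a ValueError.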