Converting between NumPy ndarray and pandas Series and DataFrame
https://blog.csdn.net/jinguangliu/article/details/78538748?biz_id=102&utm_term=pandas%E5%92%8Cnumpy%E6%A0%BC%E5%BC%8F%E8%BD%AC%E6%8D%A2&utm_medium=distribute.pc_search_result.none-task-blog-2allsobaiduweb~default-1-78538748&spm=1018.2118.3001.4187
Converting between pandas and NumPy data types (continuously updated)
https://blog.csdn.net/qq_29027865/article/details/81904403?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522159749206119725211954788%2522%252C%2522scm%2522%253A%252220140713.130102334.pc%255Fall.%2522%257D&request_id=159749206119725211954788&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2allfirst_rank_ecpm_v3~pc_rank_v3-4-81904403.pc_ecpm_v3_pc_rank_v3&utm_term=pandas%E5%92%8Cnumpy%E6%A0%BC%E5%BC%8F%E8%BD%AC%E6%8D%A2&spm=1018.2118.3001.4187
Converting between NumPy ndarray and pandas Series and DataFrame
https://blog.csdn.net/qq_38486203/article/details/80527029?biz_id=102&utm_term=pandas%E5%92%8Cnumpy%E6%A0%BC%E5%BC%8F%E8%BD%AC%E6%8D%A2&utm_medium=distribute.pc_search_result.none-task-blog-2allsobaiduweb~default-5-80527029&spm=1018.2118.3001.4187
pandas operations.
data['ounces'] = data['ounces'].map(lambda x: x + 2)  # map here works much like apply
data  # operate on a single column
data['animal'] = data['food'].map(meat_to_animal)  # map a dict over a column to add a new column
data
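The two `map` calls above rely on a `data` frame and a `meat_to_animal` dict that these notes do not show. A minimal sketch follows, assuming made-up sample values for both (the food names and weights are hypothetical). It also shows that a key missing from the dict comes back as NaN.

```python
import pandas as pd

# Hypothetical sample data; the notes do not show how `data` was built.
data = pd.DataFrame({
    "food": ["bacon", "honey ham", "corned beef", "tofu"],
    "ounces": [4.0, 3.0, 5.0, 6.0],
})

# Hypothetical lookup table standing in for `meat_to_animal`.
meat_to_animal = {"bacon": "pig", "honey ham": "pig", "corned beef": "cow"}

# Element-wise function on one column.
data["ounces"] = data["ounces"].map(lambda x: x + 2)

# Dict lookup on one column; "tofu" has no entry, so it maps to NaN.
data["animal"] = data["food"].map(meat_to_animal)

print(data)
```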
…
df = pd.DataFrame({'A': ['a', 'b', 'a', 'c', 'a', 'c', 'b', 'c'], 'B': [2, 8, 1, 4, 3, 2, 5, 9], 'C': [102, 98, 107, 104, 115, 87, 92, 123]})
df
# Convert a DataFrame to an array: array = dataframe_name.values
a = df.iloc[:, 1:].values
a
# Convert an array to a DataFrame: df = pd.DataFrame(array)
a = pd.DataFrame(a)
a
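One thing the round trip above does not make obvious: the array carries no labels, so `pd.DataFrame(array)` comes back with default integer column names. A small sketch, assuming a shortened version of the same frame, that passes the labels back in explicitly (`to_numpy()` is used here as the method form of `.values`):

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["a", "b", "a", "c"],
    "B": [2, 8, 1, 4],
    "C": [102, 98, 107, 104],
})

# Numeric columns only, as a 2-D ndarray.
arr = df.iloc[:, 1:].to_numpy()

# Without labels the columns are just 0, 1.
plain = pd.DataFrame(arr)

# Reattach the original column names and index.
restored = pd.DataFrame(arr, columns=df.columns[1:], index=df.index)

print(plain.columns.tolist())
print(restored.columns.tolist())
```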
…
Converting data formats between list, NumPy, and pandas
https://blog.csdn.net/weixin_43818850/article/details/85394310?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.edu_weight&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.edu_weight
5. DataFrame to list
First use np.array() to convert the DataFrame into an np.ndarray (an array is roughly R's matrix, and a DataFrame is roughly R's data.frame), then use tolist() to convert the np.ndarray into a list.
# 5. DataFrame to list
data = np.array(df).tolist()
data
# DataFrame to a (NumPy) array
data = df.values
data
# equivalent to data = np.array(df)
# 6. NumPy array to DataFrame
# the column names and row index can be specified when converting
data = np.array(df)
df = pd.DataFrame(data, index=list(range(data.shape[0])), columns=['title', 'content', 'pub_date'])
df
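A caveat worth sketching for the round trip above: when the columns have mixed types, the intermediate array is a single object-dtype array, so the numeric columns of the rebuilt frame no longer have numeric dtypes. The example below uses a small made-up frame (column names borrowed from later in these notes) and `infer_objects()` as one possible way to recover them.

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-type frame.
df = pd.DataFrame(
    [["animal", 2000, 1.5], ["ambition", 2001, 1.7]],
    columns=["words", "year", "number"],
)

# DataFrame -> ndarray -> nested list (one inner list per row).
rows = np.array(df).tolist()

# ndarray -> DataFrame, with explicit index and column names.
rebuilt = pd.DataFrame(np.array(df), index=range(len(df)), columns=df.columns)

# Ask pandas to re-infer better dtypes for the object columns.
fixed = rebuilt.infer_objects()

print(rows)
print(fixed.dtypes)
```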
…
Converting between list, dict, array, Series, and DataFrame
https://blog.csdn.net/weixin_44056331/article/details/89165013?utm_medium=distribute.pc_relevant_t0.none-task-blog-BlogCommendFromMachineLearnPai2-1.edu_weight&depth_1-utm_source=distribute.pc_relevant_t0.none-task-blog-BlogCommendFromMachineLearnPai2-1.edu_weight
# dict to DataFrame
df = pd.DataFrame(data)
print(df)
# a dict is similar to a Series
# dict to Series
# A Series is a one-dimensional array-like object holding an array of data (of any NumPy dtype) and an associated array of data labels, called the index.
# If index is not specified, the keys of data become the index of the Series
ser = pd.Series(data)
# Series to dict
data_series = ser.to_dict()  # Convert Series to {label -> value} dict or dict-like object.
data_series
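The dict and Series conversions above can be checked with a small round trip. This is only a sketch with a made-up dict; it also shows what happens when `index` is passed explicitly: values are looked up by key, and labels missing from the dict become NaN.

```python
import pandas as pd

# Hypothetical input dict.
data = {"a": 1, "b": 2, "c": 3}

# Keys become the index when no index is given.
ser = pd.Series(data)

# With an explicit index, values are selected by key; "x" is missing -> NaN.
ser2 = pd.Series(data, index=["c", "a", "x"])

# Back to a plain {label: value} dict.
back = ser.to_dict()

print(ser.index.tolist())
print(ser2)
print(back)
```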
# list to DataFrame
df = pd.DataFrame(data, index=["A", "B", "C", "D", "E"], columns=['words', 'year', 'number'])
print(df)
# Output:
#       words  year  number
# A    animal  2000     1.5
# B  ambition  2001     1.7
# C   balance  2002     3.6
# D      city  2001     2.4
# E    decade  2002     2.9
…
Very useful
4. DataFrame to array/dict
import numpy as np
import pandas as pd
from pandas import DataFrame
data = [["animal", 2000, 1.5], ["ambition", 2001, 1.7], ["balance", 2002, 3.6], ["city", 2001, 2.4], ["decade", 2002, 2.9]]
# named df rather than pd, so the pandas module alias is not shadowed
df = DataFrame(data, index=["A", "B", "C", "D", "E"], columns=['words', 'year', 'number'])
# DataFrame to array
ndarray = np.array(df)
print(ndarray)
print(ndarray.shape)
# Output:
# [['animal' 2000 1.5]
#  ['ambition' 2001 1.7]
#  ['balance' 2002 3.6]
#  ['city' 2001 2.4]
#  ['decade' 2002 2.9]]
# (5, 3)
# DataFrame to dict
# orient="dict" returns a dict of dicts; "list" returns a dict of lists; "series" returns a dict of Series; "records" returns a list of dicts
dict_data1 = df.to_dict(orient="dict")
print(dict_data1)
# Output: {'words': {'A': 'animal', 'B': 'ambition', 'C': 'balance', 'D': 'city', 'E': 'decade'}, 'year': {'A': 2000, 'B': 2001, 'C': 2002, 'D': 2001, 'E': 2002}, 'number': {'A': 1.5, 'B': 1.7, 'C': 3.6, 'D': 2.4, 'E': 2.9}}
dict_data2 = df.to_dict(orient="list")
print(dict_data2)
# Output: {'words': ['animal', 'ambition', 'balance', 'city', 'decade'], 'year': [2000, 2001, 2002, 2001, 2002], 'number': [1.5, 1.7, 3.6, 2.4, 2.9]}
dict_data3 = df.to_dict(orient="series")
print(dict_data3)
# Output:
# {'words': A      animal
# B    ambition
# C     balance
# D        city
# E      decade
# Name: words, dtype: object, 'year': A    2000
# B    2001
# C    2002
# D    2001
# E    2002
# Name: year, dtype: int64, 'number': A    1.5
# B    1.7
# C    3.6
# D    2.4
# E    2.9
# Name: number, dtype: float64}
dict_data4 = df.to_dict(orient='records')
print(dict_data4)
# Output: [{'words': 'animal', 'year': 2000, 'number': 1.5}, {'words': 'ambition', 'year': 2001, 'number': 1.7}, {'words': 'balance', 'year': 2002, 'number': 3.6}, {'words': 'city', 'year': 2001, 'number': 2.4}, {'words': 'decade', 'year': 2002, 'number': 2.9}]
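The `to_dict` variants above go one way only. As a sketch of the reverse direction, assuming the same sample data: a list of records can be passed straight back to `pd.DataFrame`, but the row labels are lost. The `"index"` orient (not used in the notes above) keeps them and pairs with `DataFrame.from_dict(..., orient="index")`.

```python
import pandas as pd

data = [["animal", 2000, 1.5], ["ambition", 2001, 1.7], ["balance", 2002, 3.6]]
df = pd.DataFrame(data, index=["A", "B", "C"], columns=["words", "year", "number"])

# records: list of {column: value} dicts; row labels are not included.
records = df.to_dict(orient="records")
from_records = pd.DataFrame(records)  # gets a fresh 0..n-1 index

# index: {row label: {column: value}}; row labels survive the round trip.
by_index = df.to_dict(orient="index")
from_index = pd.DataFrame.from_dict(by_index, orient="index")

print(from_records.index.tolist())
print(from_index.index.tolist())
```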
For a multi-column DataFrame, use the first column as the key and the remaining columns as the value
import numpy as np
import pandas as pd
from pandas import DataFrame
data = [["animal", 2000, 1.5], ["ambition", 2001, 1.7], ["balance", 2002, 3.6], ["city", 2001, 2.4], ["decade", 2002, 2.9]]
# named df rather than pd, so the pandas module alias is not shadowed
df = DataFrame(data, index=["A", "B", "C", "D", "E"], columns=['words', 'year', 'number'])
dict_data1 = df.set_index("words").T.to_dict("list")
print(dict_data1)
# Output (the transpose upcasts year to float): {'animal': [2000.0, 1.5], 'ambition': [2001.0, 1.7], 'balance': [2002.0, 3.6], 'city': [2001.0, 2.4], 'decade': [2002.0, 2.9]}
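In the output above the years come back as floats, because transposing puts the int and float columns into a single float dtype. If the integer years matter, two possible alternatives are sketched below on the same sample data: `to_dict("index")` without the transpose, or building the dict row by row with `itertuples`. Neither comes from the original notes.

```python
import pandas as pd

data = [["animal", 2000, 1.5], ["ambition", 2001, 1.7], ["balance", 2002, 3.6]]
df = pd.DataFrame(data, columns=["words", "year", "number"])

keyed = df.set_index("words")

# Original approach: transpose first, so year is upcast to float.
as_lists = keyed.T.to_dict("list")

# Alternative 1: one inner dict per key; column dtypes are not mixed.
as_dicts = keyed.to_dict("index")

# Alternative 2: explicit row iteration, values kept as tuples.
as_tuples = {row.words: (row.year, row.number) for row in df.itertuples(index=False)}

print(as_lists["animal"])
print(as_dicts["animal"])
print(as_tuples["animal"])
```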
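# 2. Rename all columns
c.columns = ['A', 'B', 'C', 'D']
c
# Rename specific columns (rename returns a new DataFrame; c itself is unchanged)
c.rename(columns={'A': 'AA', 'C': 'CC'})
# Replace the whole index
c.index = ['A', 'B', 'C', 'D', 'E']
c
# Rename specific index labels
c.rename(index={'A': 'AA', 'C': 'CC'})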
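The renaming snippets above use a frame `c` that is not defined in these notes. A sketch with a made-up 5x4 frame shows the difference between the two styles: assigning to `.columns` or `.index` changes `c` in place, while `rename` leaves `c` alone and returns a renamed copy.

```python
import numpy as np
import pandas as pd

# Hypothetical 5x4 frame standing in for `c`.
c = pd.DataFrame(np.arange(20).reshape(5, 4))

# Assignment replaces every label in place (lengths must match).
c.columns = ["A", "B", "C", "D"]
c.index = ["A", "B", "C", "D", "E"]

# rename changes only the listed labels and returns a new DataFrame.
renamed = c.rename(columns={"A": "AA", "C": "CC"}, index={"A": "AA", "C": "CC"})

print(c.columns.tolist())
print(renamed.columns.tolist())
print(renamed.index.tolist())
```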
# 4. Select specific data
df1 = c.loc[['A', 'B', 'C'], :]  # rows A, B, C with all columns
df1
df2 = c.loc[:, ['A', 'B']]  # all rows with columns A and B
df2
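For reference, the two `loc` selections above can be tried on a small made-up frame (again a hypothetical stand-in for `c`). The sketch also selects rows and columns together and picks a single cell.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with both row and column labels.
c = pd.DataFrame(
    np.arange(20).reshape(5, 4),
    index=["A", "B", "C", "D", "E"],
    columns=["A", "B", "C", "D"],
)

df1 = c.loc[["A", "B", "C"], :]        # selected rows, all columns
df2 = c.loc[:, ["A", "B"]]             # all rows, selected columns
block = c.loc[["A", "B"], ["A", "B"]]  # rows and columns together
cell = c.loc["B", "C"]                 # single value by labels

print(df1.shape, df2.shape, block.shape, cell)
```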
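# 5. Reset the index
# Add a new default index and keep the old index as a column
a = c
a.reset_index()
# Add a new default index and drop the old index
b = c
b.reset_index(drop=True)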
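Two details of the snippet above are easy to miss: `a = c` only creates another name for the same object, and `reset_index` returns a new DataFrame rather than modifying it. A small sketch, assuming a made-up three-row frame:

```python
import pandas as pd

# Hypothetical frame with a labelled index.
c = pd.DataFrame({"x": [10, 20, 30]}, index=["A", "B", "C"])

a = c  # alias, not a copy
same_object = a is c

kept = c.reset_index()              # old index becomes a column named "index"
dropped = c.reset_index(drop=True)  # old index is discarded

print(same_object)
print(kept.columns.tolist())
print(dropped.columns.tolist())
print(c.index.tolist())  # c still has its original index
```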
c = pd.merge(left, df3, left_on='key1', right_on='key3')  # join on keys with different names
# Rename specific columns
c.rename(columns={'lval_x': 'valuekey1', 'lval_y': 'valuekey3'})
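The merge above refers to `left` and `df3`, which are not defined in these notes. The sketch below assumes both frames carry a column named `lval`, which would explain the `lval_x` / `lval_y` names: pandas adds the `_x` and `_y` suffixes when non-key column names collide. The frame contents are made up.

```python
import pandas as pd

# Hypothetical inputs: differently named key columns, same-named value column.
left = pd.DataFrame({"key1": ["foo", "bar"], "lval": [1, 2]})
df3 = pd.DataFrame({"key3": ["foo", "bar"], "lval": [4, 5]})

# Join rows where left.key1 equals df3.key3 (inner join by default).
c = pd.merge(left, df3, left_on="key1", right_on="key3")
merged_columns = c.columns.tolist()

# rename returns a new frame, so keep the result.
c = c.rename(columns={"lval_x": "valuekey1", "lval_y": "valuekey3"})

print(merged_columns)
print(c)
```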