Handling unstructured (unequal-length) output when processing data with pandas

pandas data can be processed in two ways: convert each column to a Python list and work on the list, or apply a function to the data with the apply() method.
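The examples in this post read their input from ./test.xlsx, which is not included. As a minimal sketch, assuming the sheet holds only the two integer columns shown in the printed output, an equivalent DataFrame can be built in memory. The helper name make_sample_df is hypothetical and used only for illustration.

```python
import pandas as pd


def make_sample_df():
    """Build the two-column sample data shown in the printed output (hypothetical helper)."""
    return pd.DataFrame(
        {
            "num1": [1, 2, 3, 4, 6, 5, 7, 9],
            "num2": [11, 33, 22, 44, 55, 33, 88, 99],
        }
    )


df = make_sample_df()
print(df)
```

Replacing the pd.read_excel(...) line with df = make_sample_df() should let the examples run without the spreadsheet.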

Method 1: convert each column to a list, then process it

Example code 1:

import pandas as pd

#  Read the data from the Excel file
df = pd.read_excel('./test.xlsx')
print(df)
print("*" * 50)

#  Convert each column to a Python list
df_lst1 = df['num1'].tolist()
print(df_lst1)
df_lst2 = df['num2'].tolist()
print(df_lst2)
print("*" * 50)

#  Process each column separately: split it into odd and even values
df_lst1_odd = [i for i in df_lst1 if i % 2 == 1]
df_lst1_even = [i for i in df_lst1 if i % 2 == 0]
print(df_lst1_odd)
print(df_lst1_even)
df_lst2_odd = [i for i in df_lst2 if i % 2 == 1]
df_lst2_even = [i for i in df_lst2 if i % 2 == 0]
print(df_lst2_odd)
print(df_lst2_even)
print("*" * 50)

#  The lists have different lengths, so build the DataFrame row-wise with from_dict;
#  shorter rows are padded with NaN (labels: 奇数 = odd, 偶数 = even)
df = pd.DataFrame.from_dict(
    {"num1奇数": df_lst1_odd, "num1偶数": df_lst1_even, "num2奇数": df_lst2_odd, "num2偶数": df_lst2_even}, orient='index')
print(df)
print(df.T)

#  Save the transposed DataFrame to a CSV file
df.T.to_csv("./new_test.csv", index=False)

Output:

   num1  num2
0     1    11
1     2    33
2     3    22
3     4    44
4     6    55
5     5    33
6     7    88
7     9    99
**************************************************
[1, 2, 3, 4, 6, 5, 7, 9]
[11, 33, 22, 44, 55, 33, 88, 99]
**************************************************
[1, 3, 5, 7, 9]
[2, 4, 6]
[11, 33, 55, 33, 99]
[22, 44, 88]
**************************************************
         0   1   2     3     4
num1奇数   1   3   5   7.0   9.0
num1偶数   2   4   6   NaN   NaN
num2奇数  11  33  55  33.0  99.0
num2偶数  22  44  88   NaN   NaN
   num1奇数  num1偶数  num2奇数  num2偶数
0     1.0     2.0    11.0    22.0
1     3.0     4.0    33.0    44.0
2     5.0     6.0    55.0    88.0
3     7.0     NaN    33.0     NaN
4     9.0     NaN    99.0     NaN

[Figure: contents of the output CSV file]
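Example code 1 goes through from_dict(..., orient='index') and a transpose because the four lists have different lengths. A minimal sketch of the difference, using short made-up lists (the names ragged, direct_ok and padded are illustrative): passing unequal-length lists directly as columns raises ValueError, while building them as rows pads the short ones with NaN.

```python
import pandas as pd

ragged = {"odd": [1, 3, 5, 7, 9], "even": [2, 4, 6]}

# Columns of unequal length are rejected by the plain constructor.
try:
    pd.DataFrame(ragged)
    direct_ok = True
except ValueError:
    direct_ok = False
print(direct_ok)

# As rows (orient='index'), short lists are padded with NaN;
# .T then turns the rows back into columns.
padded = pd.DataFrame.from_dict(ragged, orient="index").T
print(padded)
```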

Example code 2 (unlike example code 1, the output also keeps the original data):

import pandas as pd

#  Read the data from the Excel file
df = pd.read_excel('./test.xlsx')
print(df)
print("*" * 50)

#  Convert each column to a Python list
df_lst1 = df['num1'].tolist()
print(df_lst1)
df_lst2 = df['num2'].tolist()
print(df_lst2)
print("*" * 50)

#  Process each column separately: split it into odd and even values
df_lst1_odd = [i for i in df_lst1 if i % 2 == 1]
df_lst1_even = [i for i in df_lst1 if i % 2 == 0]
print(df_lst1_odd)
print(df_lst1_even)
df_lst2_odd = [i for i in df_lst2 if i % 2 == 1]
df_lst2_even = [i for i in df_lst2 if i % 2 == 0]
print(df_lst2_odd)
print(df_lst2_even)
print("*" * 50)

#  Build the DataFrame row-wise with from_dict, this time including the original columns
#  (labels: 奇数 = odd, 偶数 = even)
df = pd.DataFrame.from_dict(
    {"num1": df_lst1, "num2": df_lst2, "num1奇数": df_lst1_odd, "num1偶数": df_lst1_even, "num2奇数": df_lst2_odd,
     "num2偶数": df_lst2_even}, orient='index')
print(df)
print(df.T)

#  Save the transposed DataFrame to a CSV file
df.T.to_csv("./new_test.csv", index=False)

Output:

   num1  num2
0     1    11
1     2    33
2     3    22
3     4    44
4     6    55
5     5    33
6     7    88
7     9    99
**************************************************
[1, 2, 3, 4, 6, 5, 7, 9]
[11, 33, 22, 44, 55, 33, 88, 99]
**************************************************
[1, 3, 5, 7, 9]
[2, 4, 6]
[11, 33, 55, 33, 99]
[22, 44, 88]
**************************************************
         0   1   2     3     4     5     6     7
num1     1   2   3   4.0   6.0   5.0   7.0   9.0
num2    11  33  22  44.0  55.0  33.0  88.0  99.0
num1奇数   1   3   5   7.0   9.0   NaN   NaN   NaN
num1偶数   2   4   6   NaN   NaN   NaN   NaN   NaN
num2奇数  11  33  55  33.0  99.0   NaN   NaN   NaN
num2偶数  22  44  88   NaN   NaN   NaN   NaN   NaN
   num1  num2  num1奇数  num1偶数  num2奇数  num2偶数
0   1.0  11.0     1.0     2.0    11.0    22.0
1   2.0  33.0     3.0     4.0    33.0    44.0
2   3.0  22.0     5.0     6.0    55.0    88.0
3   4.0  44.0     7.0     NaN    33.0     NaN
4   6.0  55.0     9.0     NaN    99.0     NaN
5   5.0  33.0     NaN     NaN     NaN     NaN
6   7.0  88.0     NaN     NaN     NaN     NaN
7   9.0  99.0     NaN     NaN     NaN     NaN

[Figure: contents of the output CSV file]
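In the transposed output every value is printed as a float (1.0, 2.0, ...) because the NaN padding forces a floating-point dtype. One possible way to keep whole numbers in the CSV, assuming a pandas version that supports the nullable integer dtype "Int64", is to cast before writing. The data and the English column names below are illustrative stand-ins, and the sketch writes to an in-memory buffer rather than to ./new_test.csv.

```python
import io

import pandas as pd

columns = {"num1": [1, 2, 3, 4], "num1_odd": [1, 3], "num1_even": [2, 4]}
wide = pd.DataFrame.from_dict(columns, orient="index").T

# Cast the NaN-padded columns to the nullable integer dtype.
wide_int = wide.astype("Int64")

# Write to an in-memory buffer; missing cells become empty fields.
buffer = io.StringIO()
wide_int.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

The cast only works when every non-missing value is a whole number, which holds for the integer data used in this post.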

Method 2: use apply() with user-defined functions (called here on each column, i.e. Series.apply)

Example code:

import pandas as pd

#  Read the data from the Excel file
df = pd.read_excel('./test.xlsx')
print(df)
print("*" * 50)


#  Return x if it is odd; otherwise return None (stored as NaN)
def odd(x):
    if x % 2 == 1:
        return x


#  Return x if it is even; otherwise return None (stored as NaN)
def even(x):
    if x % 2 == 0:
        return x


#  Get each column as a Series
df_lst1 = df['num1']
df_lst2 = df['num2']

#  Apply the user-defined functions element by element
df_lst1_odd = df['num1'].apply(odd)
df_lst1_even = df['num1'].apply(even)
df_lst2_odd = df['num2'].apply(odd)
df_lst2_even = df['num2'].apply(even)

#  Build the DataFrame row-wise with from_dict (labels: 奇数 = odd, 偶数 = even)
df = pd.DataFrame.from_dict(
    {"num1": df_lst1, "num2": df_lst2, "num1奇数": df_lst1_odd, "num1偶数": df_lst1_even, "num2奇数": df_lst2_odd,
     "num2偶数": df_lst2_even}, orient='index')
print(df)
print(df.T)

#  Save the transposed DataFrame to a CSV file
df.T.to_csv("./new_test.csv", index=False)

Output:

   num1  num2
0     1    11
1     2    33
2     3    22
3     4    44
4     6    55
5     5    33
6     7    88
7     9    99
**************************************************
           0     1     2     3     4     5     6     7
num1     1.0   2.0   3.0   4.0   6.0   5.0   7.0   9.0
num2    11.0  33.0  22.0  44.0  55.0  33.0  88.0  99.0
num1奇数   1.0   NaN   3.0   NaN   NaN   5.0   7.0   9.0
num1偶数   NaN   2.0   NaN   4.0   6.0   NaN   NaN   NaN
num2奇数  11.0  33.0   NaN   NaN  55.0  33.0   NaN  99.0
num2偶数   NaN   NaN  22.0  44.0   NaN   NaN  88.0   NaN
   num1  num2  num1奇数  num1偶数  num2奇数  num2偶数
0   1.0  11.0     1.0     NaN    11.0     NaN
1   2.0  33.0     NaN     2.0    33.0     NaN
2   3.0  22.0     3.0     NaN     NaN    22.0
3   4.0  44.0     NaN     4.0     NaN    44.0
4   6.0  55.0     NaN     6.0    55.0     NaN
5   5.0  33.0     5.0     NaN    33.0     NaN
6   7.0  88.0     7.0     NaN     NaN    88.0
7   9.0  99.0     9.0     NaN    99.0     NaN

[Figure: contents of the output CSV file]
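The gaps in this output come from index alignment: each apply() result is a Series that keeps the original row labels 0 to 7, with NaN wherever odd() or even() returned None. As a sketch of the same selection without a Python-level function, Series.where with a boolean mask should give an equivalent result. The sample values are taken from the num1 column above; via_apply and via_where are illustrative names.

```python
import pandas as pd

num1 = pd.Series([1, 2, 3, 4, 6, 5, 7, 9], name="num1")


def odd(x):
    # Same function as in the post: returns None for even values.
    if x % 2 == 1:
        return x


# Element-wise Python function, as in Method 2.
via_apply = num1.apply(odd)

# Vectorized alternative: keep values where the mask is True, NaN elsewhere.
via_where = num1.where(num1 % 2 == 1)

print(via_where)
```

Both results keep eight rows, so the odd values stay in their original positions instead of moving to the top of the column.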

Fixing the scattered (sparse) data in the CSV file above:

Example code:

import pandas as pd

#  Read the data from the Excel file
df = pd.read_excel('./test.xlsx')
print(df)
print("*" * 50)


#  Return x if it is odd; otherwise return None (stored as NaN)
def odd(x):
    if x % 2 == 1:
        return x


#  Return x if it is even; otherwise return None (stored as NaN)
def even(x):
    if x % 2 == 0:
        return x


#  Get each column as a Series
df_lst1 = df['num1']
df_lst2 = df['num2']

#  Apply the user-defined functions, then drop the NaN entries
df_lst1_odd = df['num1'].apply(odd).dropna()
df_lst1_even = df['num1'].apply(even).dropna()
df_lst2_odd = df['num2'].apply(odd).dropna()
df_lst2_even = df['num2'].apply(even).dropna()


#  Build the DataFrame row-wise with from_dict (labels: 奇数 = odd, 偶数 = even)
#  Note: the two original columns must be converted with list(); otherwise the
#  processed output is still scattered
df = pd.DataFrame.from_dict(
    {"num1": list(df_lst1), "num2": list(df_lst2), "num1奇数": df_lst1_odd, "num1偶数": df_lst1_even,
     "num2奇数": df_lst2_odd, "num2偶数": df_lst2_even}, orient='index')
print(df)
print(df.T)

#  Save the transposed DataFrame to a CSV file
df.T.to_csv("./new_test.csv", index=False)

Output:

   num1  num2
0     1    11
1     2    33
2     3    22
3     4    44
4     6    55
5     5    33
6     7    88
7     9    99
**************************************************
           0     1     2     3     4     5     6     7
num1     1.0   2.0   3.0   4.0   6.0   5.0   7.0   9.0
num2    11.0  33.0  22.0  44.0  55.0  33.0  88.0  99.0
num1奇数   1.0   3.0   5.0   7.0   9.0   NaN   NaN   NaN
num1偶数   2.0   4.0   6.0   NaN   NaN   NaN   NaN   NaN
num2奇数  11.0  33.0  55.0  33.0  99.0   NaN   NaN   NaN
num2偶数  22.0  44.0  88.0   NaN   NaN   NaN   NaN   NaN
   num1  num2  num1奇数  num1偶数  num2奇数  num2偶数
0   1.0  11.0     1.0     2.0    11.0    22.0
1   2.0  33.0     3.0     4.0    33.0    44.0
2   3.0  22.0     5.0     6.0    55.0    88.0
3   4.0  44.0     7.0     NaN    33.0     NaN
4   6.0  55.0     9.0     NaN    99.0     NaN
5   5.0  33.0     NaN     NaN     NaN     NaN
6   7.0  88.0     NaN     NaN     NaN     NaN
7   9.0  99.0     NaN     NaN     NaN     NaN

[Figure: contents of the output CSV file]
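An alternative sketch for packing the values to the top of each column, which is not the approach used in this post: drop the NaN entries, renumber each Series from 0, and let pd.concat(axis=1) align on that fresh index. With this route the original columns need no list() conversion. The sample values come from the num1 column; the helper compact and the English column names are illustrative assumptions.

```python
import pandas as pd

num1 = pd.Series([1, 2, 3, 4, 6, 5, 7, 9], name="num1")


def compact(series, name):
    """Drop NaN rows and renumber from 0 so values stack at the top (illustrative helper)."""
    return series.dropna().reset_index(drop=True).rename(name)


odd = compact(num1.where(num1 % 2 == 1), "num1_odd")
even = compact(num1.where(num1 % 2 == 0), "num1_even")

# Columns are aligned on the renumbered index; shorter ones are padded with NaN.
packed = pd.concat([num1, odd, even], axis=1)
print(packed)
```

Because each Series is renumbered before concatenation, alignment happens on position rather than on the original row labels, which is what removes the gaps.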
