1. Swapping two columns of data in pandas

Latest recommended article published on 2024-05-17 22:11:37

看大海

Article link: https://blog.csdn.net/yiersab/article/details/109389722

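Today I needed to reorder the columns of some CSV data. It took more effort than expected, since I am still not fluent with pandas.

The CSV data looks like the following; the goal is to move column d to right after b, which swaps c and d: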

         a     b    c       d  
0      196   242    3     881250949
1      186   302    3     891717742
2       22   377    1     878887116
3      244    51    2     880606923
4      166   346    1     886397596

import pandas as pd
import numpy as np

file = 'ppp.csv'
df = pd.read_csv(file, sep='\t', header=None, names=['a', 'b', 'c', 'd'])
cols = list(df)
cols.insert(2, cols.pop(cols.index('d')))  # 2 is the target position for d; cols.pop(cols.index('d')) removes d from its current position
df = df.loc[:, cols]  # reorder the columns according to the new list
print(df)

Result

         a     b          d  c
0      196   242  881250949  3
1      186   302  891717742  3
2       22   377  878887116  1
3      244    51  880606923  2
4      166   346  886397596  1
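
The same reordering can be tried without a file on disk. Below is a minimal sketch that rebuilds the sample rows in memory (the DataFrame literal is an assumption standing in for ppp.csv) and also shows that selecting with an explicit column list gives the same result as the pop/insert approach:

```python
import pandas as pd

# Sample rows from above, built in memory instead of read from ppp.csv (assumption)
df = pd.DataFrame({
    'a': [196, 186, 22, 244, 166],
    'b': [242, 302, 377, 51, 346],
    'c': [3, 3, 1, 2, 1],
    'd': [881250949, 891717742, 878887116, 880606923, 886397596],
})

# Approach from the post: pop 'd' out of the column list and re-insert it at position 2
cols = list(df)
cols.insert(2, cols.pop(cols.index('d')))
swapped = df.loc[:, cols]

# Alternative: spell out the desired column order directly
explicit = df[['a', 'b', 'd', 'c']]

print(swapped)
```

The pop/insert version is handy when there are many columns and only one needs to move; the explicit list is easier to read when there are only a few.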

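Some other things I used along the way:

Number of columns: df.shape[1]

Number of rows: df.shape[0], or len(df)

Generate a list of consecutive integers and insert it as a column: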

a = df.shape[0]
b = list(range(a))
df.insert(0, 'ind', pd.Series(b))  # 0 is the insert position, 'ind' is the column name, pd.Series(b) holds the values to insert
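
As a quick check of the row/column counts and the insert call, here is a small self-contained sketch. The three-row DataFrame is made up for illustration; only the 'ind' column name comes from the snippet above:

```python
import pandas as pd

# Small illustrative frame (hypothetical data)
df = pd.DataFrame({'a': [196, 186, 22], 'b': [242, 302, 377]})

n_rows = df.shape[0]  # same value as len(df)
n_cols = df.shape[1]

# Insert a 0..n_rows-1 counter as the first column, named 'ind'.
# DataFrame.insert modifies df in place and returns None.
df.insert(0, 'ind', list(range(n_rows)))

print(n_rows, n_cols)
print(df)
```

Note that insert raises a ValueError if a column with the same name already exists, unless allow_duplicates=True is passed.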

Example

import glob, os
import pandas as pb

path = '/mnt/temp/tep/test/test'
result_csv = '/mnt/temp/tep/test/test_cp'
for file in glob.glob(os.path.join(path, '*')):

    df = None
    if file.endswith(".csv"):
        print(file)
        df = pb.read_csv(file, index_col=False)
        # Replace 2 with 'aaa' and 1 with 'bbb' in column activateFlag_x
        # (Series.replace avoids chained assignment, which may not modify df)
        df['activateFlag_x'] = df['activateFlag_x'].replace({2: 'aaa', 1: 'bbb'})
        # Move the rightmost column to the far left, one step at a time
        cols = list(df)
        cols.insert(1, cols.pop(cols.index('activateFlag_x')))  # move from position 2 to 1
        df = df.loc[:, cols]
        cols.insert(0, cols.pop(cols.index('activateFlag_x')))  # move from position 1 to 0
        df = df.loc[:, cols]

        # a = df.shape[0]
        # b = list(range(a))
        # df.insert(0, 'ind', pb.Series(b))

        # df.to_csv(os.path.join(result_csv, os.path.basename(file)), header=False, index=False)  # drop both the header row and the leftmost index column
        df.to_csv(os.path.join(result_csv, os.path.basename(file)), header=False)  # drop only the header row

    print(df)
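
The two-step move in the example can probably be collapsed into one step. The sketch below is one way to do it, not the original method: move_to_front is a hypothetical helper, and the 'id' and 'score' columns are invented so the snippet runs without any files. Only activateFlag_x and the 'aaa'/'bbb' labels come from the example above.

```python
import pandas as pd

def move_to_front(df, name):
    """Return df with column `name` moved to the first position (hypothetical helper)."""
    cols = [name] + [c for c in df.columns if c != name]
    return df.loc[:, cols]

# Invented sample data; activateFlag_x starts as the rightmost column
df = pd.DataFrame({
    'id': [10, 11, 12],
    'score': [0.5, 0.7, 0.9],
    'activateFlag_x': [2, 1, 2],
})

# Map the flag values to labels in a single call
df['activateFlag_x'] = df['activateFlag_x'].replace({2: 'aaa', 1: 'bbb'})

df = move_to_front(df, 'activateFlag_x')
print(df)
```

Building the new column list in one pass keeps the relative order of the remaining columns and works no matter where the column started.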