Sorting in pandas with df1.sort_values

Latest recommended article published on 2024-04-06 22:37:43

愚昧之山绝望之谷开悟之坡

Article link: https://blog.csdn.net/qq_15821487/article/details/120411471

import pandas as pd

path1 = './ml_work1/result/extract_json.csv'
path2 = './ml_work2/result/extract_json.csv'

# Sort each file by 'link' with a stable sort and renumber the index from 0.
df1 = pd.read_csv(path1)
print(df1)
df1_s = df1.sort_values(by=['link'], axis=0, kind='mergesort', ignore_index=True)
print(df1_s)  # sort_values returns a new frame; df1 itself is unchanged
print('******************************************************************************************')
df2 = pd.read_csv(path2)
print(df2)
df2_s = df2.sort_values(by=['link'], axis=0, kind='mergesort', ignore_index=True)
print(df2_s)

# Count the positions at which both sorted frames hold the same link.
j = 0
for i in range(min(len(df1_s), len(df2_s))):  # stay in range if the lengths differ
    link1 = df1_s.loc[i, 'link']
    link2 = df2_s.loc[i, 'link']
    print(link1)
    print('*****************')
    print(link2)
    if link1 == link2:
        j += 1
print(j)
print(len(df1))
print(len(df2))
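The position-by-position count above can also be written without an explicit loop. The following is only a minimal sketch: the two small frames are made-up stand-ins for the CSV files, and the name matches is introduced here for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the two CSV files; only the 'link' column matters here.
df1 = pd.DataFrame({'link': ['b', 'a', 'c']})
df2 = pd.DataFrame({'link': ['c', 'a', 'd']})

df1_s = df1.sort_values(by='link', kind='mergesort', ignore_index=True)
df2_s = df2.sort_values(by='link', kind='mergesort', ignore_index=True)

# Compare only the rows that both frames have, position by position.
n = min(len(df1_s), len(df2_s))
left = df1_s['link'].to_numpy()[:n]
right = df2_s['link'].to_numpy()[:n]
matches = int((left == right).sum())
print(matches)
```

Note that this, like the loop, only counts links that land at the same position after sorting; it does not compute the size of the intersection of the two link sets.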

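You should be very familiar with the basic operations on Python's built-in data structures and on third-party ones such as numpy and pandas.
Conceptually, the third-party structures are just combinations of the basic ones, much like dictionaries plus lists.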

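kind='mergesort' selects a stable merge sort: rows with equal keys keep their original relative order, so the result is the same on every run even when the column contains duplicates.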
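To make the stability guarantee concrete, here is a small sketch on made-up data. The seq column is an assumption added only to record the original row order.

```python
import pandas as pd

df = pd.DataFrame({
    'link': ['b', 'a', 'b', 'a'],
    'seq': [0, 1, 2, 3],  # original row order, kept for reference
})

out = df.sort_values(by='link', kind='mergesort', ignore_index=True)
print(out)
# Within each group of equal links, seq is still in ascending order.
```

With an unstable algorithm the relative order of the two 'a' rows (and of the two 'b' rows) is not guaranteed.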

ignore_index=True discards the original index labels after sorting and relabels the rows 0, 1, …, n - 1.
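The difference is easiest to see by comparing the index with and without the option. A minimal sketch on a made-up frame (the names kept and reset are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'link': ['c', 'a', 'b']})

kept = df.sort_values(by='link')                      # original labels travel with the rows
reset = df.sort_values(by='link', ignore_index=True)  # labels replaced by 0..n-1

print(kept.index.tolist())
print(reset.index.tolist())
```

Because the script looks rows up with .loc[i], it relies on the relabelled index: without ignore_index=True, .loc[0] would return the row that was first in the file, not the first row in sorted order.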


    def sort_values(
        self,
        axis=0,
        ascending=True,
        inplace: bool = False,
        kind: str = "quicksort",
        na_position: str = "last",
        ignore_index: bool = False,
        key: ValueKeyFunc = None,
    ):
        """
        Sort by the values.

        Sort a Series in ascending or descending order by some
        criterion.

        Parameters
        ----------
        axis : {0 or 'index'}, default 0
            Axis to direct sorting. The value 'index' is accepted for
            compatibility with DataFrame.sort_values.
        ascending : bool, default True
            If True, sort values in ascending order, otherwise descending.
        inplace : bool, default False
            If True, perform operation in-place.
        kind : {'quicksort', 'mergesort' or 'heapsort'}, default 'quicksort'
            Choice of sorting algorithm. See also :func:`numpy.sort` for more
            information. 'mergesort' is the only stable  algorithm.
        na_position : {'first' or 'last'}, default 'last'
            Argument 'first' puts NaNs at the beginning, 'last' puts NaNs at
            the end.
        ignore_index : bool, default False
            If True, the resulting axis will be labeled 0, 1, …, n - 1.

            .. versionadded:: 1.0.0

        key : callable, optional
            If not None, apply the key function to the series values
            before sorting. This is similar to the `key` argument in the
            builtin :meth:`sorted` function, with the notable difference that
            this `key` function should be *vectorized*. It should expect a
            ``Series`` and return an array-like.

            .. versionadded:: 1.1.0
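The key and na_position parameters from the docstring excerpt can be tried on a small Series. This is a sketch on made-up data and assumes a pandas version that supports key (1.1.0 or later, per the docstring):

```python
import pandas as pd

s = pd.Series(['b', 'A', None, 'a'])

# key receives the whole Series and must return an array-like of the same
# length, so a vectorized string method is used instead of a per-element function.
out = s.sort_values(
    key=lambda col: col.str.lower(),
    na_position='first',
    kind='mergesort',
)
print(out)
```

The missing value comes first, and 'A' stays ahead of 'a' because both map to the same key and the sort is stable.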

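pandas offers several ways to sort; sort_values sorts by the values of one or more columns.

df.sort_values("xxx", inplace=True)

This sorts the DataFrame df by the xxx column. inplace defaults to False; in that case the order of the original df is left unchanged and a sorted copy is returned. With inplace=True, df itself is sorted and the call returns None.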
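The effect of inplace can be checked on a small frame. A sketch, assuming a made-up DataFrame named df with a single column xxx:

```python
import pandas as pd

df = pd.DataFrame({'xxx': [3, 1, 2]})

# Default (inplace=False): a sorted copy is returned and df is untouched.
sorted_copy = df.sort_values('xxx')
before = df['xxx'].tolist()
print(before)
print(sorted_copy['xxx'].tolist())

# inplace=True: df itself is reordered and the call returns None.
result = df.sort_values('xxx', inplace=True)
print(result)
print(df['xxx'].tolist())
```

A common mistake is writing df = df.sort_values('xxx', inplace=True), which rebinds df to None.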
