pandas

Latest recommended article published on 2023-12-26 11:29:20

grllery

Category: python. Tags: pandas

Article link: https://blog.csdn.net/grllery/article/details/82355645

apply map

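map() applies a function to each element of a Series.

apply() works on the rows or columns of a DataFrame, that is, on one-dimensional vectors, for operations such as computing the mean of each column.

applymap() applies a function to every element of a DataFrame. pandas 2.1 renamed it to DataFrame.map() and deprecated applymap().

Count the distinct elements in a column: df[df.columns[1]].nunique()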
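The three methods can be compared side by side on a small frame. This is a minimal sketch: the DataFrame and its column names `a` and `b` are made up for illustration, and the `hasattr` check is only there so the snippet runs on pandas versions both before and after the applymap() rename.

```python
import pandas as pd

# Hypothetical example data; the column names are illustrative only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 20]})

# Series.map: element by element over one column.
squared = df["a"].map(lambda x: x ** 2)

# DataFrame.apply: the function receives a whole column (default axis=0) ...
col_means = df.apply(lambda col: col.mean())
# ... or a whole row when axis=1.
row_sums = df.apply(lambda row: row.sum(), axis=1)

# Element-wise over the whole frame: DataFrame.map on newer pandas,
# applymap on older versions.
elementwise = df.map if hasattr(df, "map") else df.applymap
doubled = elementwise(lambda x: x * 2)

# Number of distinct values in the second column.
n_unique = df[df.columns[1]].nunique()

print(squared.tolist(), row_sums.tolist(), n_unique)
```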

sort

df.sort_values(by=['col1'])
df.sort_values(by=['col1', 'col2'])
df.sort_values(by='col1', ascending=False)
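As a rough illustration of the calls above, the sketch below sorts a small made-up frame. The data is hypothetical. The last call shows that `ascending` also accepts a list with one flag per sort key.

```python
import pandas as pd

# Hypothetical data with ties in col1 so the second key matters.
df = pd.DataFrame({"col1": [2, 1, 2, 1], "col2": [1, 4, 3, 2]})

# Sort by col1, breaking ties with col2 (both ascending).
by_two = df.sort_values(by=["col1", "col2"])

# Sort by a single column in descending order.
desc = df.sort_values(by="col2", ascending=False)

# Mixed directions: col1 ascending, col2 descending.
mixed = df.sort_values(by=["col1", "col2"], ascending=[True, False])

# sort_values returns a new frame; the original index labels travel with the rows.
print(by_two.index.tolist())
```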

df.to_excel

If passing an existing ExcelWriter object, then the sheet will be added to the existing workbook. This can be used to save different DataFrames to one workbook:

>>> # ExcelWriter.save() was removed in pandas 2.0; the context manager saves on exit
>>> with pd.ExcelWriter('output.xlsx') as writer:
...     df1.to_excel(writer, sheet_name='Sheet1')
...     df2.to_excel(writer, sheet_name='Sheet2')

iterrows

In Python, use iterrows() to iterate over the rows of a DataFrame:

for index, row in df.iterrows():
    pass

Each row is the corresponding pandas Series.
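A small sketch of the loop in use, on made-up data (the column names `name` and `score` are assumptions for illustration). It records the type of each `row` to confirm that iterrows() yields `(index, Series)` pairs.

```python
import pandas as pd

# Hypothetical frame used only for this illustration.
df = pd.DataFrame({"name": ["a", "b"], "score": [1, 2]})

seen = []
for index, row in df.iterrows():
    # row is a Series whose index is the DataFrame's column labels.
    seen.append((index, type(row).__name__, row["name"]))

print(seen)
```

Because every row is packed into a single Series, values from columns of different dtypes may be upcast, so this loop is best kept for small frames or quick inspection.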

How do I get the row count of a Pandas dataframe?

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: df = pd.DataFrame(np.arange(12).reshape(4,3))

In [4]: df
Out[4]: 
   0   1   2
0  0   1   2
1  3   4   5
2  6   7   8
3  9  10  11

In [5]: df.shape
Out[5]: (4, 3)

In [6]: timeit df.shape
2.77 µs ± 644 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [7]: timeit df[0].count()
348 µs ± 1.31 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [8]: len(df.index)
Out[8]: 4

In [9]: timeit len(df.index)
990 ns ± 4.97 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

https://stackoverflow.com/questions/15943769/how-do-i-get-the-row-count-of-a-pandas-dataframe
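One caveat when reading the timings above: `count()` is not only slower, it also answers a different question, because it counts non-null values rather than rows. A minimal sketch on made-up data with one missing value:

```python
import numpy as np
import pandas as pd

# Hypothetical column containing one NaN.
df = pd.DataFrame({"x": [1.0, np.nan, 3.0, 4.0]})

n_rows_len = len(df.index)    # number of rows
n_rows_shape = df.shape[0]    # same number, taken from the shape tuple
n_non_null = df["x"].count()  # NaN is excluded from the count

print(n_rows_len, n_rows_shape, n_non_null)
```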

pandas.DataFrame.sample

DataFrame.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None)

n : int, optional

Number of items from axis to return. Cannot be used with frac. Default = 1 if frac = None.

frac : float, optional

Fraction of axis items to return. Cannot be used with n.
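A short sketch of both parameters on a made-up frame. Passing `frac=1` returns every row in random order, which makes sample() a pandas-only way to shuffle. The `random_state=0` seed is an arbitrary choice to make the run repeatable.

```python
import pandas as pd

# Hypothetical data: ten rows labelled 0..9.
df = pd.DataFrame({"x": range(10)})

# Draw three rows without replacement.
subset = df.sample(n=3, random_state=0)

# frac=1 returns all rows in random order, i.e. a shuffle.
shuffled = df.sample(frac=1, random_state=0)

# The original index labels are kept; reset them if a fresh 0..n-1 index is wanted.
shuffled_reset = shuffled.reset_index(drop=True)

print(len(subset), len(shuffled))
```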

Alternatively, to shuffle all the rows, use scikit-learn:

from sklearn.utils import shuffle

df = shuffle(df)

to numpy

landmarks = landmarks_frame.iloc[65, 1:].values
print(type(landmarks))  # <class 'numpy.ndarray'>
landmarks = landmarks.astype('float').reshape(-1, 2)
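The snippet above depends on a `landmarks_frame` that is not defined in these notes. The sketch below assumes it holds an image name in the first column followed by alternating x and y coordinates. The frame, its column names, and its values are all hypothetical. It uses `to_numpy()`, which the pandas documentation recommends over `.values`.

```python
import pandas as pd

# Hypothetical stand-in for landmarks_frame: image name, then x/y pairs.
landmarks_frame = pd.DataFrame(
    [["img0.jpg", 10, 20, 30, 40], ["img1.jpg", 1, 2, 3, 4]],
    columns=["image_name", "x0", "y0", "x1", "y1"],
)

# Take one row, skipping the name column, as a NumPy array.
landmarks = landmarks_frame.iloc[1, 1:].to_numpy()

# Cast explicitly to float, then reshape into (n_points, 2) coordinate pairs.
landmarks = landmarks.astype("float").reshape(-1, 2)

print(type(landmarks), landmarks.shape)
```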