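Source:
def apply(self, func, axis=0, broadcast=False, raw=False, reduce=None, args=(), **kwds):
"""
Applies function along input axis of DataFrame.
Objects passed to functions are Series objects having index
either the DataFrame's index (axis=0) or the columns (axis=1).
Return type depends on whether passed function aggregates, or the
reduce argument if the DataFrame is empty.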
Parameters
----------
func : function
Function to apply to each column/row
axis : {0 or 'index', 1 or 'columns'}, default 0
* 0 or 'index': apply function to each column
* 1 or 'columns': apply function to each row
broadcast : boolean, default False
For aggregation functions, return object of same size with values
propagated
raw : boolean, default False
If False, convert each row or column into a Series. If raw=True the
passed function will receive ndarray objects instead. If you are
just applying a NumPy reduction function this will achieve much
better performance
reduce : boolean or None, default None
Try to apply reduction procedures. If the DataFrame is empty,
apply will use reduce to determine whether the result should be a
Series or a DataFrame. If reduce is None (the default), apply's
return value will be guessed by calling func on an empty Series (note:
while guessing, exceptions raised by func will be ignored). If
reduce is True a Series will always be returned, and if False a
DataFrame will always be returned.
args : tuple
Positional arguments to pass to function in addition to the
array/series
Additional keyword arguments will be passed as keywords to the function
Notes
-----
In the current implementation apply calls func twice on the
first column/row to decide whether it can take a fast or slow
code path. This can lead to unexpected behavior if func has
side-effects, as they will take effect twice for the first
column/row.
Examples
--------
>>> df.apply(numpy.sqrt) # returns DataFrame
>>> df.apply(numpy.sum, axis=0) # equiv to df.sum(0)
>>> df.apply(numpy.sum, axis=1) # equiv to df.sum(1)
See also
--------
DataFrame.applymap: For elementwise operations
Returns
-------
applied : Series or DataFrame
"""
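As a small illustration of the axis and raw parameters described in the docstring above, the sketch below applies one reducing function three ways. The frame `df`, the helper `value_range`, and the `seen_types` set are made up for this example; they are not part of pandas. Only `axis` and `raw` are used, since those are the parameters this sketch can exercise without assuming a specific pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical demo frame, not taken from the article's data.
df = pd.DataFrame(
    {"a": [1.0, 4.0, 9.0], "b": [16.0, 25.0, 36.0]},
    index=["x", "y", "z"],
)

seen_types = set()

def value_range(values):
    # Record what apply hands to the function, then reduce to one number.
    seen_types.add(type(values).__name__)
    return values.max() - values.min()

col_range = df.apply(value_range, axis=0)  # one result per column
row_range = df.apply(value_range, axis=1)  # one result per row
default_types = set(seen_types)

seen_types.clear()
raw_range = df.apply(value_range, axis=0, raw=True)  # func receives ndarrays
raw_types = set(seen_types)

print(col_range.to_dict(), row_range.to_dict())
print(default_types, raw_types)
```

With axis=0 the result is indexed by column name, with axis=1 by row label. Using a set to record the types keeps the check independent of how many times apply happens to call the function.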
Examples:
Data
Yr Mo Dy    RPT    VAL    ROS    KIL    SHA    BIR    DUB    CLA    MUL    CLO    BEL    MAL
61  1  1  15.04  14.96  13.17   9.29    NaN   9.87  13.67  10.25  10.83  12.58  18.50  15.04
61  1  2  14.71    NaN  10.83   6.50  12.62   7.67  11.50  10.04   9.79   9.67  17.54  13.83
61  1  3  18.50  16.88  12.33  10.13  11.17   6.17  11.25    NaN   8.50   7.67  12.75  12.71
61  1  4  10.58   6.63  11.75   4.58   4.54   2.88   8.63   1.79   5.83   5.88   5.46  10.88
61  1  5  13.33  13.25  11.42   6.17  10.71   8.21  11.92   6.54  10.92  10.34  12.92  11.83
61  1  6  13.21   8.12   9.96   6.67   5.37   4.50  10.67   4.42   7.17   7.50   8.12  13.17
61  1  7  13.50  14.29   9.50   4.96  12.29   8.33   9.17   9.29   7.58   7.96  13.96  13.79
import numpy as np
import pandas as pd

# Whitespace-separated file; the first three columns are combined into one date column
wind = pd.read_csv('../wind.csv', sep=r'\s+', parse_dates=[[0, 1, 2]])
print(wind.apply(np.sqrt))
print(wind.apply(np.sum, axis=0))
print(wind.apply(np.sum, axis=1))

Output (the two np.sum calls):
RPT     81200.10
VAL     69943.79
ROS     76632.98
KIL     41427.19
SHA     68715.74
BIR     46624.48
DUB     64378.34
CLA     55829.49
MUL     55811.38
CLO     57233.29
BEL     86257.50
MAL    102485.95
dtype: float64
Yr_Mo_Dy
1961-01-01    143.20
1961-01-02    124.70
1961-01-03    128.06
1961-01-04     79.43
1961-01-05    127.56
1961-01-06     98.88
1961-01-07    124.62
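The three calls above differ in what they return: an element-wise function such as np.sqrt gives back a DataFrame of the same shape, while a reducing function such as np.sum gives back a Series. A minimal sketch of that distinction, assuming a tiny frame named `small` (a hypothetical name) built from three columns of the 4 and 5 January rows of the sample data, so it runs without wind.csv:

```python
import numpy as np
import pandas as pd

# Three columns from two rows of the sample data above (no missing values).
small = pd.DataFrame(
    {"RPT": [10.58, 13.33], "VAL": [6.63, 13.25], "ROS": [11.75, 11.42]}
)

roots = small.apply(np.sqrt)            # element-wise: DataFrame, same shape
col_sums = small.apply(np.sum, axis=0)  # reduces each column: Series indexed by column name
row_sums = small.apply(np.sum, axis=1)  # reduces each row: Series indexed by row label

print(type(roots).__name__, roots.shape)
print(col_sums)
print(row_sums)
```

For plain sums the built-in `small.sum(axis=0)` and `small.sum(axis=1)` give the same numbers and are usually the simpler choice.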
def plus(row, n, m):
    # With axis=1, apply passes each row as a Series; n and m come from args
    row['c'] = (row['a'] + row['b']) * m
    row['d'] = (row['a'] + row['b']) * n
    return row

list_plus = [[1, 3], [7, 8], [4, 5]]
df1 = pd.DataFrame(list_plus, columns=['a', 'b'])
print(df1)
df1 = df1.apply(plus, axis=1, args=(3, 2))
print(df1)
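The extra arguments in the example above can be supplied either positionally through args or as keywords, which apply forwards to the function. The sketch below shows both, using a hypothetical helper `scaled_sums` that returns a new Series instead of modifying the row it receives, plus a column-wise version for comparison. The names `scaled_sums`, `by_position`, `by_keyword`, and `vectorized` are illustrative only.

```python
import pandas as pd

def scaled_sums(row, n, m):
    # row is one DataFrame row as a Series (axis=1); build a new Series
    # rather than writing into the object that apply passed in.
    total = row["a"] + row["b"]
    return pd.Series({"c": total * m, "d": total * n})

df = pd.DataFrame([[1, 3], [7, 8], [4, 5]], columns=["a", "b"])

# Extra positional arguments go through args ...
by_position = df.apply(scaled_sums, axis=1, args=(3, 2))
# ... and extra keyword arguments are forwarded to the function as-is.
by_keyword = df.apply(scaled_sums, axis=1, n=3, m=2)

# The same arithmetic expressed on whole columns, without apply.
total = df["a"] + df["b"]
vectorized = pd.DataFrame({"c": total * 2, "d": total * 3})

print(by_position)
```

Because this arithmetic only combines whole columns, the column-wise form avoids calling a Python function once per row; apply with axis=1 is mainly useful when the per-row logic cannot be written that way.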