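I have a two-column DataFrame (volume and price), and I want to create 20 bins based on the volume column, with an equal number of data points in each bin.
For example, given volume = [1,6,8,2,6,9,3,6] and 4 bins, I would like the data cut into: 1st bin 1:2, 2nd 3:6, 3rd 6:8, 4th 8:9.
I then want to plot a histogram of the mean of the corresponding y values (price) in each bin.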
My data: df = pd.DataFrame({'Volume_norm' : [0.92, 2.31, 0.92, 0.018, 0.0454, 0.43, 0.43, 0.943, 0.543, 0.543, 0.43], 'Price' : [2, 4, 5, 1, 5, 1, 2, 4, 2, 3, 6]})
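One possible way to get equal-count bins, sketched here under some assumptions: the volumes are ranked first so that tied values fall into different bins, the ranks are cut with `pd.qcut`, and the price is averaged per bin. The names `n_bins`, `bin`, and `summary` are illustrative, and 5 bins are used only because the sample has 11 rows (the real data would use 20).

```python
import pandas as pd

df = pd.DataFrame({
    'Volume_norm': [0.92, 2.31, 0.92, 0.018, 0.0454, 0.43, 0.43,
                    0.943, 0.543, 0.543, 0.43],
    'Price': [2, 4, 5, 1, 5, 1, 2, 4, 2, 3, 6],
})

n_bins = 5  # assumption for this small sample; the question asks for 20

# Rank with method='first' so that tied volumes receive distinct ranks;
# cutting the ranks (not the raw values) keeps the bin sizes nearly equal.
ranks = df['Volume_norm'].rank(method='first')
df['bin'] = pd.qcut(ranks, q=n_bins, labels=False)

# Per-bin mean volume, mean price, and number of rows.
summary = df.groupby('bin').agg(
    volume_mean=('Volume_norm', 'mean'),
    price_mean=('Price', 'mean'),
    count=('Price', 'size'),
)
print(summary)
```

The `price_mean` column could then be drawn as bars, for instance with `plt.bar(summary.index, summary['price_mean'])`. Because 11 rows do not divide evenly into 5 bins, the counts differ by at most one.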
My code:
^{pr2}$
It only gives the sum of x (volume) rather than the mean price of y.
============= Updated code ===========
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'Volume_norm' : [0.92,2.31,0.92,0.018,0.0454,0.43,0.43,0.943,0.543,0.543,0.43],
                   'Price' : [2,4,5,1,5,1,2,4,2,3,6]})
x = df['Volume_norm']
y = df['Price']
nbins = 5
binsize = x.size // nbins
indices = x.argsort()
means = np.zeros((nbins,))
xaxis = np.zeros((nbins,))
for k in range(nbins):
    xaxis[k] = x[indices[i * binsize : (i + 1) * binsize]].mean()
for i in range(nbins):
    means[i] = y[indices[i * binsize : (i + 1) * binsize]].mean()
plt.loglog(xaxis,means,'r-')
plt.show()
But xaxis comes back as: array([0.9315, 0.9315, 0.9315, 0.9315, 0.9315])
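A likely explanation, assuming `i` was still defined as 4 from an earlier run of the second loop: the first loop iterates over `k` but slices with `i`, so every entry of `xaxis` receives the mean of the last bin, (0.92 + 0.943) / 2 = 0.9315. A minimal sketch of the same computation with a single loop variable follows; converting to NumPy arrays and the name `order` are choices made here, not part of the original code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Volume_norm': [0.92, 2.31, 0.92, 0.018, 0.0454, 0.43, 0.43,
                    0.943, 0.543, 0.543, 0.43],
    'Price': [2, 4, 5, 1, 5, 1, 2, 4, 2, 3, 6],
})

# Plain arrays make the indexing purely positional.
x = df['Volume_norm'].to_numpy()
y = df['Price'].to_numpy()

nbins = 5
binsize = x.size // nbins  # 11 // 5 == 2, so the largest volume is left out
order = np.argsort(x, kind='stable')

xaxis = np.zeros(nbins)
means = np.zeros(nbins)
for k in range(nbins):
    # Use the same loop variable for both the slice and the assignment.
    sel = order[k * binsize:(k + 1) * binsize]
    xaxis[k] = x[sel].mean()
    means[k] = y[sel].mean()

print(xaxis)
print(means)
```

Note that with integer division the leftover points (here the single value 2.31) are not assigned to any bin.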
Also, is it possible to use `Counter` to count the number of data points in each interval?
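As a rough sketch of how `collections.Counter` could be used for this, assuming each point has first been given a bin label: the sorted indices are split into nearly equal chunks with `np.array_split`, and the labels are then counted. The names `bin_of` and `counts` are illustrative.

```python
from collections import Counter

import numpy as np

volume = [0.92, 2.31, 0.92, 0.018, 0.0454, 0.43, 0.43,
          0.943, 0.543, 0.543, 0.43]
nbins = 5

# Indices that would sort the volumes; a stable sort keeps tied values in input order.
order = np.argsort(volume, kind='stable')

# np.array_split accepts uneven splits, so chunk sizes differ by at most one.
bin_of = np.empty(len(volume), dtype=int)
for b, chunk in enumerate(np.array_split(order, nbins)):
    bin_of[chunk] = b

# Count how many points landed in each bin.
counts = Counter(bin_of.tolist())
print(sorted(counts.items()))
```

With pandas, `Series.value_counts()` on the bin labels would give the same tally without `Counter`.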