MemoryError in Python, but not in IPython

In general, can you think of any reason why a MemoryError would occur in plain Python but not in IPython (the console, not the notebook)?

To be more specific, I'm using sklearn's SGDClassifier in the multiclass and multilabel case. The following code raises the error:

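    from sklearn.linear_model import SGDClassifier
    from sklearn.multiclass import OneVsRestClassifier

    # niter, alpha, X and y are defined elsewhere in the script
    model = SGDClassifier(
        loss='hinge',
        penalty='l2',
        n_iter=niter,
        alpha=alpha,
        fit_intercept=True,
        n_jobs=1)
    mc = OneVsRestClassifier(model)
    mc.fit(X, y)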

On calling mc.fit(X, y), the following error occurs:

      File "train12-3b.py", line 411, in buildmodel
        mc.fit(X, y)
      File "/usr/local/lib/python2.7/dist-packages/sklearn/multiclass.py", line 201, in fit
        n_jobs=self.n_jobs)
      File "/usr/local/lib/python2.7/dist-packages/sklearn/multiclass.py", line 88, in fit_ovr
        Y = lb.fit_transform(y)
      File "/usr/local/lib/python2.7/dist-packages/sklearn/base.py", line 408, in fit_transform
        return self.fit(X, **fit_params).transform(X)
      File "/usr/local/lib/python2.7/dist-packages/sklearn/preprocessing/label.py", line 272, in transform
        neg_label=self.neg_label)
      File "/usr/local/lib/python2.7/dist-packages/sklearn/preprocessing/label.py", line 394, in label_binarize
        Y = np.zeros((len(y), len(classes)), dtype=np.int)
    MemoryError

Y is a matrix with 6 million rows and k columns, where the gold labels are 1 and the rest are 0 (in this case k = 21, but I'd like to go above 2000). sklearn converts Y to a dense matrix even if it is passed in as sparse, which is why the MemoryError is raised at Y = np.zeros((len(y), len(classes)), dtype=np.int).

I have 60 GB of RAM, and with 21 columns Y shouldn't take more than 8 GB at most (6 million * 21 * 64 bits is about 8 gigabits, i.e. roughly 1 GB), so I'm confused. I changed the line Y = np.zeros((len(y), len(classes)), dtype=np.int) to use dtype=bool, but no luck.
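The size estimate can be checked with a quick calculation. The sketch below is only an illustration: the helper name dense_label_matrix_bytes is made up for this example, and it assumes that np.int maps to a 64-bit (8-byte) integer, as it typically does on 64-bit Linux, and that a bool element takes 1 byte.

```python
def dense_label_matrix_bytes(n_samples, n_classes, itemsize_bytes):
    """Bytes needed for a dense (n_samples, n_classes) array.

    itemsize_bytes is the size of one element: 8 for a 64-bit integer
    (assumed here for np.int), 1 for bool.
    """
    return n_samples * n_classes * itemsize_bytes


n_samples = 6000000  # 6 million rows, as in the question

for n_classes in (21, 2000):
    for name, itemsize in (("int64", 8), ("bool", 1)):
        size_gb = dense_label_matrix_bytes(n_samples, n_classes, itemsize) / 1e9
        print("k=%d, dtype=%s: %.2f GB" % (n_classes, name, size_gb))
```

Under these assumptions the dense matrix for k = 21 is about 1 GB with 64-bit integers, which is well within 60 GB, so the size of this one array alone does not obviously explain the error. For k = 2000 the same layout would need about 96 GB with 64-bit integers and about 12 GB with bool, so a dense Y stops being practical with 64-bit integers long before k reaches 2000.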

Any thoughts?

Solution

It sounds like you are hitting a limitation of the current implementation of the label binarizer: see issue #2441. There is PR #2458 to fix it.

Please feel free to try that branch and report your results as a comment to that PR.
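To illustrate why a sparse label matrix avoids the dense allocation, here is a minimal sketch that builds a sparse one-vs-rest indicator matrix with scipy.sparse. It is not the code from PR #2458, and the function name sparse_label_indicator is hypothetical. It assumes one label per sample; whether a given sklearn version accepts such a matrix as a target is not something this sketch can confirm.

```python
import numpy as np
from scipy import sparse


def sparse_label_indicator(y):
    """Return (Y, classes) for a 1-D sequence of labels y.

    Y is a CSR matrix of shape (n_samples, n_classes) with
    Y[i, j] == 1 exactly when y[i] == classes[j].
    Assumes a single label per sample (hypothetical helper, not sklearn API).
    """
    # classes is the sorted array of unique labels; col_idx maps each
    # sample to the column of its label.
    classes, col_idx = np.unique(np.asarray(y), return_inverse=True)
    col_idx = col_idx.ravel()
    n_samples = col_idx.shape[0]
    row_idx = np.arange(n_samples)
    data = np.ones(n_samples, dtype=np.int8)
    Y = sparse.csr_matrix((data, (row_idx, col_idx)),
                          shape=(n_samples, len(classes)))
    return Y, classes


Y, classes = sparse_label_indicator(["cat", "dog", "cat", "bird"])
print(classes)
print(Y.toarray())
```

Only one nonzero entry is stored per sample, so the memory used grows with the number of rows rather than with rows times classes.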
