python itertools.product: ways to speed up itertools.product in Python

I'm trying to create a numpy array consisting of all possible asset allocations using itertools.product.

The conditions are that the allocation for each asset ranges from 0 to 100% and rises in increments of (100% / number of assets). The allocations must sum to 100%.

The calculation takes a very long time as the number of assets grows (10 seconds for 7 assets, 210 seconds for 8 assets, and so on).

Is there a way to speed up the code?

Maybe I should try using it.takewhile or multiprocessing?

    import itertools as it
    import numpy as np

    def CreateMatrix(Increments):
        inputs = it.product(np.arange(0, 1 + Increments, Increments), repeat = int(1/Increments))
        matrix = np.ndarray((1, int(1/Increments)))
        x = 0
        for i in inputs:
            if np.sum(i, axis = 0) == 1:
                if x > 0:
                    matrix = np.r_[matrix, np.ndarray((1, int(1/Increments)))]
                matrix[x] = i
                x = x + 1
        return matrix

    Assets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
    Increments = 1.0 / len(Assets)
    matrix = CreateMatrix(Increments)
    print matrix

Solution

Use the stdlib sum instead of numpy.sum. According to cProfile, this code spends most of its time computing that sum: about 268 of the 294 seconds in the profile below.
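To see the per-call cost in isolation, the two functions can be timed on a single short row. This is only a rough sketch written for Python 3: the `row` value and the call count are assumptions chosen for illustration, and absolute timings will vary by machine and numpy version.

```python
import timeit

import numpy as np

# One candidate row similar to what itertools.product yields over np.arange:
# a tuple of 8 numpy float scalars (illustrative value, not from the post).
row = tuple(np.full(8, 0.125))

# Both functions agree on the total for this row.
builtin_total = sum(row)
numpy_total = np.sum(row, axis=0)

calls = 20000
builtin_seconds = timeit.timeit(lambda: sum(row), number=calls)
numpy_seconds = timeit.timeit(lambda: np.sum(row, axis=0), number=calls)

print("builtin sum: %.4f s for %d calls" % (builtin_seconds, calls))
print("numpy.sum:   %.4f s for %d calls" % (numpy_seconds, calls))
```

numpy.sum has to convert the tuple to an array and go through its reduction machinery on every call, which is overhead that only pays off on large arrays.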

Profiling Code

    import cProfile, pstats, StringIO
    import itertools as it
    import numpy as np

    def CreateMatrix(Increments):
        inputs = it.product(np.arange(0, 1 + Increments, Increments), repeat = int(1/Increments))
        matrix = np.ndarray((1, int(1/Increments)))
        x = 0
        for i in inputs:
            if np.sum(i, axis=0) == 1:
                if x > 0:
                    matrix = np.r_[matrix, np.ndarray((1, int(1/Increments)))]
                matrix[x] = i
                x += 1
        return matrix

    pr = cProfile.Profile()
    pr.enable()

    Assets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
    Increments = 1.0 / len(Assets)
    matrix = CreateMatrix(Increments)
    print matrix

    pr.disable()
    s = StringIO.StringIO()
    sortby = 'cumulative'
    ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
    ps.print_stats()
    print s.getvalue()

Truncated Output

    301565912 function calls (301565864 primitive calls) in 294.255 seconds

    Ordered by: cumulative time

      ncalls  tottime  percall  cumtime  percall filename:lineno(function)
           1   26.294   26.294  294.254  294.254 product.py:7(CreateMatrix)
    43046721   41.948    0.000  267.762    0.000 Library/Python/2.7/lib/python/site-packages/numpy/core/fromnumeric.py:1966(sum)
    43046723   60.071    0.000  217.863    0.000 Library/Python/2.7/lib/python/site-packages/numpy/core/fromnumeric.py:69(_wrapreduction)
    43046723  124.341    0.000  124.341    0.000 {method 'reduce' of 'numpy.ufunc' objects}
    43046723   14.630    0.000   14.630    0.000 Library/Python/2.7/lib/python/site-packages/numpy/core/fromnumeric.py:70()
    43046721   12.629    0.000   12.629    0.000 {getattr}
    43098200    7.958    0.000    7.958    0.000 {isinstance}
    43046724    6.191    0.000    6.191    0.000 {method 'items' of 'dict' objects}
        6434    0.047    0.000    0.199    0.000 Library/Python/2.7/lib/python/site-packages/numpy/lib/index_tricks.py:316(__getitem__)
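The call counts in the profile can be checked with a little arithmetic. With 8 assets each position takes one of 9 levels (0, 1/8, ..., 8/8), so itertools.product yields 9**8 tuples and sum is called once per tuple. The number of tuples that actually total 100% is a standard stars-and-bars count. The variable names below are illustrative, and `math.comb` assumes Python 3.8 or later.

```python
from math import comb

n_assets = 8
levels = n_assets + 1  # 0, 1/8, 2/8, ..., 8/8

# Every tuple that itertools.product generates and that gets summed.
candidates = levels ** n_assets

# Tuples whose entries add up to exactly 8 steps (stars and bars).
valid = comb(2 * n_assets - 1, n_assets - 1)

print(candidates)  # 43046721
print(valid)       # 6435
```

That matches the 43046721 calls to sum, and the 6434 calls into index_tricks.py are the np.r_ appends for every valid row after the first. Only about 0.015% of the tuples examined are kept, which is why the filter check dominates the runtime.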

Timing Experiments

numpy.sum

    import itertools as it
    import numpy as np

    def CreateMatrix(Increments):
        inputs = it.product(np.arange(0, 1 + Increments, Increments), repeat = int(1/Increments))
        matrix = np.ndarray((1, int(1/Increments)))
        x = 0
        for i in inputs:
            if np.sum(i, axis = 0) == 1:
                if x > 0:
                    matrix = np.r_[matrix, np.ndarray((1, int(1/Increments)))]
                matrix[x] = i
                x = x + 1
        return matrix

    Assets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
    Increments = 1.0 / len(Assets)
    matrix = CreateMatrix(Increments)

    $ python -m timeit --number=3 --verbose "$(cat product.py)"
    raw times: 738 696 697
    3 loops, best of 3: 232 sec per loop

Stdlib sum

    import itertools as it
    import numpy as np

    def CreateMatrix(Increments):
        inputs = it.product(np.arange(0, 1 + Increments, Increments), repeat = int(1/Increments))
        matrix = np.ndarray((1, int(1/Increments)))
        x = 0
        for i in inputs:
            # The only change: builtin sum instead of np.sum
            if sum(i) == 1:
                if x > 0:
                    matrix = np.r_[matrix, np.ndarray((1, int(1/Increments)))]
                matrix[x] = i
                x = x + 1
        return matrix

    Assets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
    Increments = 1.0 / len(Assets)
    matrix = CreateMatrix(Increments)

    $ python -m timeit --number=3 --verbose "$(cat product.py)"
    raw times: 90.5 84.3 85.3
    3 loops, best of 3: 28.1 sec per loop

There are many more ways to make your solution faster, as other folks have said in their comments. Take a look at How do I "multi-process" the itertools product module? for an idea of how to use multiprocessing to speed this up. Whatever you do, whether a clever algorithm, concurrency, or both, replace the sum function: in the timings above it takes the run from 232 to 28.1 seconds per loop, a roughly 8x speedup for very little effort.
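As one possible example of the "clever algorithm" route, the valid rows can be generated directly instead of filtering the full product. The sketch below is not from the original question or answer: `allocations` is a hypothetical helper that uses the stars-and-bars correspondence with itertools.combinations and works in whole steps, which also avoids comparing a floating-point sum to 1.

```python
import itertools as it

import numpy as np


def allocations(n_assets):
    """Return all allocations of n_assets equal steps across n_assets assets.

    Hypothetical helper for illustration. Each row holds fractions that
    sum to 1 (up to floating-point rounding).
    """
    steps = n_assets              # each step is worth 1 / n_assets
    slots = steps + n_assets - 1  # positions for steps plus dividers
    rows = []
    # Stars and bars: pick where the n_assets - 1 dividers go; the gaps
    # between consecutive dividers are the step counts per asset.
    for bars in it.combinations(range(slots), n_assets - 1):
        edges = (-1,) + bars + (slots,)
        rows.append([b - a - 1 for a, b in zip(edges, edges[1:])])
    # Convert once at the end instead of growing the array row by row.
    return np.array(rows, dtype=float) / steps


matrix = allocations(8)
print(matrix.shape)  # (6435, 8)
```

This visits only the 6435 valid rows for 8 assets rather than all 43046721 tuples, so no sum check is needed at all.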
