Parallel Python是一个Python模块,它提供了Python代码的在SMP(系统有多个处理器或内核)和集群(计算机通过网络连接)上并行执行的机制。能够将计算压力分布到多核CPU或集群的多台计算机上,能够非常方便的在内网中搭建一个自组织的分布式计算平台。先从多核计算开始,普通的Python应用程序只能够使用一个CPU进程,而通过Parallel Python能够很方便的将计算扩展到多个CPU进程。
特性:
- Parallel execution of python code on SMP and clusters
- Easy to understand and implement job-based parallelization technique (easy to convert serial application in parallel)
- Automatic detection of the optimal configuration (by default the number of worker processes is set to the number of effective processors)
- Dynamic processors allocation (number of worker processes can be changed at runtime)
- Low overhead for subsequent jobs with the same function (transparent caching is implemented to decrease the overhead)
- Dynamic load balancing (jobs are distributed between processors at runtime)
- Fault-tolerance (if one of the nodes fails tasks are rescheduled on others)
- Auto-discovery of computational resources
- Dynamic allocation of computational resources (consequence of auto-discovery and fault-tolerance)
- SHA based authentication for network connections
- Cross-platform portability and interoperability (Windows, Linux, Unix, Mac OS X)
- Cross-architecture portability and interoperability (x86, x86-64, etc.)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
|
Example
#1: sum_primes.py
#!/usr/bin/python
# File: sum_primes.py
# Author: VItalii Vanovschi
# Desc: This program demonstrates parallel computations with pp module
# It calculates the sum of prime numbers below a given integer in parallel
# Parallel Python Software: http://www.parallelpython.com
import
math, sys, time
import
pp
def
isprime(n):
"""Returns True if n is prime and False otherwise"""
if
not
isinstance
(n,
int
):
raise
TypeError(
"argument passed to is_prime is not of 'int' type"
)
if
n <
2
:
return
False
if
n
=
=
2
:
return
True
max
=
int
(math.ceil(math.sqrt(n)))
i
=
2
while
i <
=
max
:
if
n
%
i
=
=
0
:
return
False
i
+
=
1
return
True
def
sum_primes(n):
"""Calculates sum of all primes below given integer n"""
return
sum
([x
for
x
in
xrange
(
2
,n)
if
isprime(x)])
print
"""Usage: python sum_primes.py [ncpus]
[ncpus] - the number of workers to run in parallel,
if omitted it will be set to the number of processors in the system
"""
# tuple of all parallel python servers to connect with
ppservers
=
()
#ppservers = ("10.0.0.1",)
if
len
(sys.argv) >
1
:
ncpus
=
int
(sys.argv[
1
])
# Creates jobserver with ncpus workers
job_server
=
pp.Server(ncpus, ppservers
=
ppservers)
else
:
# Creates jobserver with automatically detected number of workers
job_server
=
pp.Server(ppservers
=
ppservers)
print
"Starting pp with"
, job_server.get_ncpus(),
"workers"
# Submit a job of calulating sum_primes(100) for execution.
# sum_primes - the function
# (100,) - tuple with arguments for sum_primes
# (isprime,) - tuple with functions on which function sum_primes depends
# ("math",) - tuple with module names which must be imported before sum_primes execution
# Execution starts as soon as one of the workers will become available
job1
=
job_server.submit(sum_primes, (
100
,), (isprime,), (
"math"
,))
# Retrieves the result calculated by job1
# The value of job1() is the same as sum_primes(100)
# If the job has not been finished yet, execution will wait here until result is available
result
=
job1()
print
"Sum of primes below 100 is"
, result
start_time
=
time.time()
# The following submits 8 jobs and then retrieves the results
inputs
=
(
100000
,
100100
,
100200
,
100300
,
100400
,
100500
,
100600
,
100700
)
jobs
=
[(
input
, job_server.submit(sum_primes,(
input
,), (isprime,), (
"math"
,)))
for
input
in
inputs]
for
input
, job
in
jobs:
print
"Sum of primes below"
,
input
,
"is"
, job()
print
"Time elapsed: "
, time.time()
-
start_time,
"s"
job_server.print_stats()
# Parallel Python Software: http://www.parallelpython.com
|