Demonstrating the multiprocessing speed-up is quite simple:
import multiprocessing
import time

# high-resolution clock on every platform
# (time.clock was removed in Python 3.8, so it can no longer be used on Windows)
get_timer = time.perf_counter

def cube_function(num):
    time.sleep(0.01)  # simulate ~10 ms of work for a CPU core to cube the number
    return num**3

if __name__ == "__main__":  # multiprocessing guard
    # we'll test multiprocessing with pools from one to the number of CPU cores on the system;
    # it won't show significant improvements after that and it will soon start going
    # downhill due to the underlying OS context switches
    for workers in range(1, multiprocessing.cpu_count() + 1):
        pool = multiprocessing.Pool(processes=workers)
        # let's 'warm up' our pool so it doesn't affect our measurements
        pool.map(cube_function, range(multiprocessing.cpu_count()))
        # now to business: we have 10000 numbers to cube via our expensive function
        print("Cubing 10000 numbers over {} processes:".format(workers))
        timer = get_timer()  # time measuring starts now
        results = pool.map(cube_function, range(10000))  # map our range to cube_function
        timer = get_timer() - timer  # get our delta time as soon as it finishes
        print("\tTotal: {:.2f} seconds".format(timer))
        print("\tAvg. per process: {:.2f} seconds".format(timer / workers))
        pool.close()  # let's clear out our pool for the next run
        time.sleep(1)  # wait a second to make sure everything is cleaned up
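Of course, here we only simulate 10 ms of computation per number; you can replace cube_function with any CPU-heavy function for a real-world demonstration. The results are as expected: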
Cubing 10000 numbers over 1 processes:
    Total: 100.01 seconds
    Avg. per process: 100.01 seconds
Cubing 10000 numbers over 2 processes:
    Total: 50.02 seconds
    Avg. per process: 25.01 seconds
Cubing 10000 numbers over 3 processes:
    Total: 33.36 seconds
    Avg. per process: 11.12 seconds
Cubing 10000 numbers over 4 processes:
    Total: 25.00 seconds
    Avg. per process: 6.25 seconds
Cubing 10000 numbers over 5 processes:
    Total: 20.00 seconds
    Avg. per process: 4.00 seconds
Cubing 10000 numbers over 6 processes:
    Total: 16.68 seconds
    Avg. per process: 2.78 seconds
Cubing 10000 numbers over 7 processes:
    Total: 14.32 seconds
    Avg. per process: 2.05 seconds
Cubing 10000 numbers over 8 processes:
    Total: 12.52 seconds
    Avg. per process: 1.57 seconds
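To make the scaling easier to read, the totals listed above can be converted into a speed-up factor (single-process time divided by the N-process time) and a parallel efficiency (speed-up divided by N). The sketch below is only illustrative arithmetic over the numbers already printed; the names `totals` and `scaling_report` are made up for this example and are not part of the script above.

```python
# Total wall-clock times (seconds) copied from the output above, keyed by worker count.
totals = {1: 100.01, 2: 50.02, 3: 33.36, 4: 25.00,
          5: 20.00, 6: 16.68, 7: 14.32, 8: 12.52}

def scaling_report(totals):
    """Return (workers, speedup, efficiency) tuples relative to the 1-worker run."""
    baseline = totals[1]
    report = []
    for workers in sorted(totals):
        speedup = baseline / totals[workers]   # how many times faster than one process
        efficiency = speedup / workers         # 1.0 would be perfectly linear scaling
        report.append((workers, speedup, efficiency))
    return report

if __name__ == "__main__":
    for workers, speedup, efficiency in scaling_report(totals):
        print("{} workers: {:.2f}x speed-up, {:.1%} efficiency".format(
            workers, speedup, efficiency))
```

For these particular measurements the efficiency stays within about 1% of the ideal value at every pool size, which is what you would hope for when each task is an independent 10 ms sleep.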
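Now, why isn't it 100% linear? Well, first of all, it takes some time to map/distribute the data to the sub-processes and to get it back; there is some cost to context switching; other tasks use my CPUs from time to time; and time.sleep() is not exactly precise (nor could it be on a non-RT OS)... But the results are roughly in the ballpark expected for parallel processing.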
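As mentioned earlier, the sleep-based cube_function can be swapped for something that actually keeps a core busy. The following is a minimal sketch of one way to do that, not a tuned benchmark: `cpu_heavy_cube`, `timed_map`, and the loop size of 200000 are hypothetical choices made for illustration, and the timings you get will depend entirely on your machine.

```python
import multiprocessing
import time

def cpu_heavy_cube(num):
    # Hypothetical stand-in for cube_function: burn real CPU cycles
    # instead of sleeping, then return the cube as before.
    total = 0
    for i in range(200000):  # arbitrary amount of busy work (an assumption)
        total += i * i
    return num ** 3

def timed_map(workers, numbers):
    """Map cpu_heavy_cube over numbers with a pool of the given size.

    Returns (results, elapsed_seconds); only the map call itself is timed,
    not the creation of the pool.
    """
    with multiprocessing.Pool(processes=workers) as pool:
        start = time.perf_counter()
        results = pool.map(cpu_heavy_cube, numbers)
        elapsed = time.perf_counter() - start
    return results, elapsed

if __name__ == "__main__":  # multiprocessing guard
    for workers in range(1, multiprocessing.cpu_count() + 1):
        _, elapsed = timed_map(workers, range(1000))
        print("{} processes: {:.2f} seconds".format(workers, elapsed))
```

With real CPU-bound work the curve will usually flatten sooner than in the sleep-based simulation, because the worker processes compete for physical cores rather than simply waiting.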