好吧,当函数的输出太大时,似乎用来填充Queue的管道被塞住了(我的粗略理解?这是一个未解决/关闭的错误?http://bugs.python.org/issue8237)。我已经修改了问题中的代码,以便有一些缓冲(进程运行时会定期清空队列),这解决了我所有的问题。因此,现在需要执行任务(函数及其参数)的集合,启动它们并收集输出。我希望它看起来更简单/更干净。
编辑(2014年9月; 2017年11月更新:为了可读性而重写):我正在使用自那时以来所做的增强功能更新代码。新代码(功能相同,但功能更好)在此处:https ://gitlab.com/cpbl/cpblUtilities/blob/master/parallel.py
调用说明也在下面。
def runFunctionsInParallel(*args, **kwargs):
""" This is the main/only interface to class cRunFunctionsInParallel. See its documentation for arguments.
"""
return cRunFunctionsInParallel(*args, **kwargs).launch_jobs()
###########################################################################################
###
class cRunFunctionsInParallel():
###
#######################################################################################
"""Run any list of functions, each with any arguments and keyword-arguments, in parallel.
The functions/jobs should return (if anything) pickleable results. In order to avoid processes getting stuck due to the output queues overflowing, the queues are regularly collected and emptied.
You can Now pass os.system or etc to this as the function, in order to parallelize at the OS level, with no need for a wrapper: I made use of hasattr(builtinfunction,'func_name') to check for a name.
Parameters
----------
listOf_FuncAndArgLists : a list of lists
List of up-to-three-element-lists, like [function, args, kwargs],
specifying the set of functions to be launched in parallel. If an
element is just a function, rather than a list, then it is assumed
to have no arguments or keyword arguments. Thus, possible formats
for elements of the outer list are:
function
[function, list]
[function, list, dict]
kwargs: dict
One can also supply the kwargs once, for all jobs (or for those
without their own non-empty kwargs specified in the list)
names: an optional list of names to identify the processes.
If omitted, the function name is used, so if all the functions are
the same (ie merely with different arguments), then they would be
named indistinguishably
offsetsSeconds: int or list of ints
delay some functions' start times
expectNonzeroExit: True/False
Normal behavIoUr is to not proceed if any function exits with a
Failed exit code. This can be used to override this behavIoUr.
parallel: True/False
Whenever the list of functions is longer than one, functions will
be run in parallel unless this parameter is passed as False
maxAtOnce: int
If nonzero, this limits how many jobs will be allowed to run at
once. By default, this is set according to how many processors
the hardware has available.
showFinished : int
Specifies the maximum number of successfully finished jobs to show
in the text interface (before the last report, which should always
show them all).
Returns
-------
Returns a tuple of (return codes, return values), each a list in order of the jobs provided.
Issues
-------
Only tested on POSIX OSes.
Examples
--------
See the testParallel() method in this module
"""