Python，使用多进程比不使用它慢

预计到达时间：现在您已经发布了代码，我可以告诉您有一种简单的方法可以更快地完成您正在做的事情（快100倍以上）。

我看到您正在执行的操作是在字符串列表中的每个项目的括号中添加一个频率。不必每次都计算所有元素（您可以使用cProfile确认这是迄今为止代码中最大的瓶颈），您只需创建一个字典即可将每个元素映射到其频率。这样，您只需要遍历该列表两次- 一次创建频率字典，一次使用它添加频率。

在这里，我将展示我的新方法，对其进行计时，并使用生成的测试用例将其与旧方法进行比较。测试用例甚至表明新结果与旧结果 完全相同 。 注意： 下面您真正需要注意的就是new_method。

import random
import time
import collections
import cProfile

LIST_LEN = 14000

def timefunc(f):
    t = time.time()
    f()
    return time.time() - t


def random_string(length=3):
    """Return a random string of given length"""
    return "".join([chr(random.randint(65, 90)) for i in range(length)])


class Profiler:
    def __init__(self):
        self.original = [[random_string() for i in range(LIST_LEN)]
                            for j in range(4)]

    def old_method(self):
        self.ListVar = self.original[:]
        for b in range(len(self.ListVar)):
            self.list1 = []
            self.temp = []
            for n in range(len(self.ListVar[b])):
                if not self.ListVar[b][n] in self.temp:
                    self.list1.insert(n, self.ListVar[b][n] + '(' +    str(self.ListVar[b].count(self.ListVar[b][n])) + ')')
                    self.temp.insert(0, self.ListVar[b][n])

            self.ListVar[b] = list(self.list1)
        return self.ListVar

    def new_method(self):
        self.ListVar = self.original[:]
        for i, inner_lst in enumerate(self.ListVar):
            freq_dict = collections.defaultdict(int)
            # create frequency dictionary
            for e in inner_lst:
                freq_dict[e] += 1
            temp = set()
            ret = []
            for e in inner_lst:
                if e not in temp:
                    ret.append(e + '(' + str(freq_dict[e]) + ')')
                    temp.add(e)
            self.ListVar[i] = ret
        return self.ListVar

    def time_and_confirm(self):
        """
        Time the old and new methods, and confirm they return the same value
        """
        time_a = time.time()
        l1 = self.old_method()
        time_b = time.time()
        l2 = self.new_method()
        time_c = time.time()

        # confirm that the two are the same
        assert l1 == l2, "The old and new methods don't return the same value"

        return time_b - time_a, time_c - time_b

p = Profiler()
print p.time_and_confirm()

当我运行此命令时，它得到的时间为（15.963812112808228，0.05961179733276367），这意味着它快了250倍，尽管这一优势取决于列表的时长和每个列表中的频率分布。我相信您会同意，凭借这种速度优势，您可能不需要使用多处理功能：）

（我的原始答案留在后头，以供后代参考）

ETA：顺便说一句，值得注意的是，该算法在列表长度上大致是线性的，而您使用的代码是二次的。这意味着，元素数量越多，它的优势就越明显。例如，如果将每个列表的长度增加到1000000，则只需要5秒钟即可运行。根据推断，旧代码将花费一天的时间：)

这取决于您正在执行的操作。例如：

import time
NUM_RANGE = 100000000

from multiprocessing  import Process

def timefunc(f):
    t = time.time()
    f()
    return time.time() - t

def multi():
    class MultiProcess(Process):
        def __init__(self):
            Process.__init__(self)

        def run(self):
            # Alter string + test processing speed
            for i in xrange(NUM_RANGE):
                a = 20 * 20

    thread1 = MultiProcess()
    thread2 = MultiProcess()
    thread1.start()
    thread2.start()
    thread1.join()
    thread2.join()

def single():
    for i in xrange(NUM_RANGE):
        a = 20 * 20

    for i in xrange(NUM_RANGE):
        a = 20 * 20

print timefunc(multi) / timefunc(single)

在我的机器上，多进程操作仅占单线程操作时间的60％。

python 2022/1/1 18:50:39 有476人围观

撰写回答

你尚未登录，登录后可以

和开发者交流问题的细节

关注并接收问题和回答的更新提醒

参与内容的编辑和改进，让解决方法与时俱进

请先登录

Python，使用多进程比不使用它慢

撰写回答

推荐问题

尝试使用selenium和python登录网页时出错

从Python访问errno？

gcloud compute copy-files：复制文件时拒绝权限

在服务器上运行selenium浏览器（Flask / Python / Heroku）

ImportError：没有使用Python2的名为mysql.connector的模块

Python：无法在网页中使用selenium下载

带有Selenium的Python“元素未附加到页面文档中”

在Jenkins中设置特定的Python

Python：从文件中选择随机行，然后删除该行

从Python字符串中删除不在允许列表中的HTML标签

从python读取json文件

通过Python3使用Selenium和WebDriver切换选项卡时，“ NoSuchWindowException：没有这样的窗口：窗口已经关闭”

pythonselenium多个测试用例

连接所有PostgreSQL表并创建一个Python字典

带有selenium的Python：无法找到真正存在的元素

列出用户和组的Python脚本

在capybara中选择具有多个类的元素

如何以正确的顺序导入Scrapy项目密钥？

如何确定是否为Selenium + Python加载了某些HTML元素？

使用Java与Python的Selenium Webdriver

分类汇总

您的鼓励是对我最大的支持