Fast calculation of the Pareto front in Python

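If you care about actual speed, you should definitely use numpy: clever algorithmic tweaks will probably gain far less than array operations do. Here are three solutions that all compute the same function. is_pareto_efficient_dumb is slower in most situations but becomes comparatively faster as the number of costs grows; is_pareto_efficient_simple is much more efficient than the dumb solution when there are many points; and the final is_pareto_efficient function is less readable but the fastest (so all three are Pareto efficient!).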

import numpy as np


# Very slow for many datapoints.  Fastest for many costs, most readable
def is_pareto_efficient_dumb(costs):
    """
    Find the pareto-efficient points
    :param costs: An (n_points, n_costs) array
    :return: A (n_points, ) boolean array, indicating whether each point is Pareto efficient
    """
    is_efficient = np.ones(costs.shape[0], dtype = bool)
    for i, c in enumerate(costs):
        is_efficient[i] = np.all(np.any(costs[:i]>c, axis=1)) and np.all(np.any(costs[i+1:]>c, axis=1))
    return is_efficient


# Fairly fast for many datapoints, less fast for many costs, somewhat readable
def is_pareto_efficient_simple(costs):
    """
    Find the pareto-efficient points
    :param costs: An (n_points, n_costs) array
    :return: A (n_points, ) boolean array, indicating whether each point is Pareto efficient
    """
    is_efficient = np.ones(costs.shape[0], dtype = bool)
    for i, c in enumerate(costs):
        if is_efficient[i]:
            is_efficient[is_efficient] = np.any(costs[is_efficient]<c, axis=1)  # Keep any point with a lower cost
            is_efficient[i] = True  # And keep self
    return is_efficient


# Faster than is_pareto_efficient_simple, but less readable.
def is_pareto_efficient(costs, return_mask = True):
    """
    Find the pareto-efficient points
    :param costs: An (n_points, n_costs) array
    :param return_mask: True to return a mask
    :return: An array of indices of pareto-efficient points.
        If return_mask is True, this will be an (n_points, ) boolean array
        Otherwise it will be a (n_efficient_points, ) integer array of indices.
    """
    is_efficient = np.arange(costs.shape[0])
    n_points = costs.shape[0]
    next_point_index = 0  # Next index in the is_efficient array to search for
    while next_point_index<len(costs):
        nondominated_point_mask = np.any(costs<costs[next_point_index], axis=1)
        nondominated_point_mask[next_point_index] = True
        is_efficient = is_efficient[nondominated_point_mask]  # Remove dominated points
        costs = costs[nondominated_point_mask]
        next_point_index = np.sum(nondominated_point_mask[:next_point_index])+1
    if return_mask:
        is_efficient_mask = np.zeros(n_points, dtype = bool)
        is_efficient_mask[is_efficient] = True
        return is_efficient_mask
    else:
        return is_efficient
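To make the convention concrete (lower cost is better, and a point is dropped when some other point is at least as good in every cost), here is a tiny worked example. It is only a sketch: pareto_mask_bruteforce and example_costs are illustrative names that are not part of the answer, and the O(n^2) reference check is meant for sanity-testing the functions above on small inputs, not for real use.

```python
import numpy as np


def pareto_mask_bruteforce(costs):
    """Hypothetical reference check: keep a point unless another point is
    <= in every cost and strictly < in at least one."""
    costs = np.asarray(costs, dtype=float)
    n_points = costs.shape[0]
    mask = np.ones(n_points, dtype=bool)
    for i in range(n_points):
        for j in range(n_points):
            if i != j and np.all(costs[j] <= costs[i]) and np.any(costs[j] < costs[i]):
                mask[i] = False  # point j dominates point i
                break
    return mask


example_costs = np.array([
    [1.0, 5.0],  # efficient: lowest first cost
    [2.0, 3.0],  # efficient
    [3.0, 4.0],  # dominated by [2, 3]
    [4.0, 1.0],  # efficient: lowest second cost
    [5.0, 5.0],  # dominated by [1, 5]
])

print(pareto_mask_bruteforce(example_costs).tolist())
# [True, True, False, True, False]
```

On inputs without duplicate rows, the three functions above should return this same mask. With exact duplicates they can differ: is_pareto_efficient_dumb rejects every copy of a repeated point, while the other two keep the first copy.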

Profiling tests (using points drawn from a Normal distribution):

With 10,000 samples, 2 costs:

is_pareto_efficient_dumb: Elapsed time is 1.586s
is_pareto_efficient_simple: Elapsed time is 0.009653s
is_pareto_efficient: Elapsed time is 0.005479s

With 1,000,000 samples, 2 costs:

is_pareto_efficient_dumb: Really, really, slow
is_pareto_efficient_simple: Elapsed time is 1.174s
is_pareto_efficient: Elapsed time is 0.4033s

With 10,000 samples, 15 costs:

is_pareto_efficient_dumb: Elapsed time is 4.019s
is_pareto_efficient_simple: Elapsed time is 6.466s
is_pareto_efficient: Elapsed time is 6.41s
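The answer does not show how these timings were taken. A minimal harness along the following lines would produce comparable measurements; time_pareto, the seed, and the reduced problem sizes in the demo loop are assumptions made for this sketch, and absolute numbers will vary by machine.

```python
import time

import numpy as np


def is_pareto_efficient_simple(costs):
    # Copy of the function from the answer, repeated so the sketch runs on its own.
    is_efficient = np.ones(costs.shape[0], dtype=bool)
    for i, c in enumerate(costs):
        if is_efficient[i]:
            is_efficient[is_efficient] = np.any(costs[is_efficient] < c, axis=1)
            is_efficient[i] = True
    return is_efficient


def time_pareto(fn, n_points, n_costs, seed=0):
    """Hypothetical helper: time one call of fn on normally distributed costs."""
    rng = np.random.default_rng(seed)
    costs = rng.normal(size=(n_points, n_costs))
    start = time.perf_counter()
    mask = fn(costs)
    elapsed = time.perf_counter() - start
    return elapsed, mask


if __name__ == "__main__":
    # Smaller than the sizes quoted above so the demo finishes quickly.
    for n_points, n_costs in [(10_000, 2), (1_000, 15)]:
        elapsed, mask = time_pareto(is_pareto_efficient_simple, n_points, n_costs)
        print(f"{n_points} samples, {n_costs} costs: "
              f"{elapsed:.4g}s, {mask.sum()} efficient points")
```

Printing the number of efficient points alongside the time helps explain the 15-cost results: in high dimensions most random points are non-dominated, so the pruning in the faster functions removes very little.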

Note that if efficiency is a concern, you can gain maybe a further 2x or so speedup by reordering your data beforehand, see here.
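The text above does not spell out which reordering is meant. One plausible choice, shown here purely as an assumption, is to visit points in ascending order of their summed standardized costs, so that points likely to dominate many others are processed first and the working set shrinks early. The function name is_pareto_efficient_presorted is hypothetical, and the 2x figure is not verified by this sketch.

```python
import numpy as np


def is_pareto_efficient_presorted(costs):
    """Sketch: same pruning loop as is_pareto_efficient_simple, run on reordered data.

    :param costs: An (n_points, n_costs) array
    :return: A (n_points, ) boolean mask in the ORIGINAL point order
    """
    costs = np.asarray(costs)
    n_points = costs.shape[0]

    # Assumed heuristic: standardize each cost column, then sort by row sum.
    scale = costs.std(axis=0)
    scale[scale == 0] = 1.0  # avoid dividing by zero for constant columns
    score = ((costs - costs.mean(axis=0)) / scale).sum(axis=1)
    order = np.argsort(score, kind="stable")
    sorted_costs = costs[order]

    is_efficient = np.ones(n_points, dtype=bool)
    for i, c in enumerate(sorted_costs):
        if is_efficient[i]:
            # Keep only points that beat c in at least one cost, plus c itself.
            is_efficient[is_efficient] = np.any(sorted_costs[is_efficient] < c, axis=1)
            is_efficient[i] = True

    # Map the mask from sorted order back to the original order.
    mask = np.zeros(n_points, dtype=bool)
    mask[order[is_efficient]] = True
    return mask
```

The pruning loop gives the same result for any visiting order when there are no duplicate rows, so the sort affects speed only; whether it pays for itself depends on the data, and it is worth timing both variants before adopting it.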

Tagged python. Posted 2022/1/1 18:43:41.
