How to get a case-insensitive Python set

If you need to preserve case, use a dictionary instead. Case-fold the keys, then extract the values into a set:

 set({v.casefold(): v for v in l}.values())

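The str.casefold() method uses the Unicode case folding rules to normalize strings for case-insensitive comparison. This matters especially for non-ASCII alphabets and for text with ligatures. For example, the German ß (sharp s) is folded to ss, and the ſ (long s), from the same language, is folded to s: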

>>> print(s := 'Waſſerſchloß', s.lower(), s.casefold(), sep=" - ")
Waſſerſchloß - waſſerſchloß - wasserschloss
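
As a minimal sketch of why this matters for deduplication, the snippet below folds the same word list with lower() and with casefold(). The word list is made up for illustration and is not from the original answer.

```python
# Illustrative data: three spellings of the same German word.
words = ["Straße", "STRASSE", "strasse"]

# lower() leaves the sharp s untouched, so "straße" and "strasse" stay distinct.
by_lower = {w.lower() for w in words}

# casefold() expands ß to ss, so all three spellings collapse to one entry.
by_casefold = {w.casefold() for w in words}

print(len(by_lower))     # 2
print(len(by_casefold))  # 1
```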

You can wrap this in a class.

If you don't care about preserving case, a set comprehension is enough:

{v.casefold() for v in l}

Note that Python 2 does not have this method; use str.lower() in that case.

Demo:

>>> l = ['#Trending', '#Trending', '#TrendinG', '#Yax', '#YAX', '#Yax']
>>> set({v.casefold(): v for v in l}.values())
{'#Yax', '#TrendinG'}
>>> {v.lower() for v in l}
{'#trending', '#yax'}
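
The dictionary comprehension keeps the last spelling seen for each folded key (here '#TrendinG'). If the first spelling should win instead, one possible variation is to fill the dictionary with setdefault(). The helper name dedupe_keep_first is hypothetical, and the ordering of the result relies on dictionaries preserving insertion order (Python 3.7+).

```python
def dedupe_keep_first(items):
    """Return one spelling per case-folded key, keeping the first one seen."""
    seen = {}
    for item in items:
        # setdefault only stores the value if the folded key is not present yet.
        seen.setdefault(item.casefold(), item)
    return list(seen.values())

l = ['#Trending', '#Trending', '#TrendinG', '#Yax', '#YAX', '#Yax']
print(dedupe_keep_first(l))  # ['#Trending', '#Yax']
```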

Wrapping the first approach in a class would look like this:

try:
    # Python 3
    from collections.abc import MutableSet
except ImportError:
    # Python 2
    from collections import MutableSet

class CasePreservingSet(MutableSet):
    """String set that preserves case but tests for containment by case-folded value

    E.g. 'Foo' in CasePreservingSet(['FOO']) is True. Preserves case of *last*
    inserted variant.

    """
    def __init__(self, *args):
        self._values = {}
        if len(args) > 1:
            raise TypeError(
                # str.format instead of f-strings, so the module still parses on Python 2
                "{} expected at most 1 argument, got {}".format(
                    type(self).__name__, len(args))
            )
        values = args[0] if args else ()
        try:
            self._fold = str.casefold  # Python 3
        except AttributeError:
            self._fold = str.lower     # Python 2
        for v in values:
            self.add(v)

    def __repr__(self):
        return '<{}{} at {:x}>'.format(
            type(self).__name__, tuple(self._values.values()), id(self))

    def __contains__(self, value):
        return self._fold(value) in self._values

    def __iter__(self):
        try:
            # Python 2
            return self._values.itervalues()
        except AttributeError:
            # Python 3
            return iter(self._values.values())

    def __len__(self):
        return len(self._values)

    def add(self, value):
        self._values[self._fold(value)] = value

    def discard(self, value):
        try:
            del self._values[self._fold(value)]
        except KeyError:
            pass

Usage demo:

>>> cps = CasePreservingSet(l)
>>> cps
<CasePreservingSet('#TrendinG', '#Yax') at 1047ba290>
>>> '#treNdinG' in cps
True
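
Because the class derives from MutableSet, it also inherits mixin methods such as remove(), the in-place union |= and the intersection &. The sketch below uses a compact, Python 3 only stand-in named FoldedSet (a hypothetical name, not part of the original answer) so that it runs on its own; CasePreservingSet should behave the same way for these calls.

```python
from collections.abc import MutableSet

class FoldedSet(MutableSet):
    """Compact Python 3 stand-in for CasePreservingSet (illustration only)."""
    def __init__(self, iterable=()):
        self._values = {}
        for v in iterable:
            self.add(v)
    def __contains__(self, value):
        return value.casefold() in self._values
    def __iter__(self):
        return iter(self._values.values())
    def __len__(self):
        return len(self._values)
    def add(self, value):
        self._values[value.casefold()] = value
    def discard(self, value):
        self._values.pop(value.casefold(), None)

tags = FoldedSet(['#Trending', '#Yax'])
tags |= ['#YAX', '#New']      # inherited __ior__ calls add() for each item
print(sorted(tags))           # ['#New', '#Trending', '#YAX']
tags.remove('#trending')      # inherited remove() uses __contains__ and discard()
print(len(tags))              # 2
common = tags & ['#yax', '#other']
print(list(common))           # ['#yax']
```

Note that the inherited intersection builds its result from the right-hand operand, so the spelling in common comes from that list rather than from tags.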
python 2022/1/1 18:33:56 217 views
