Python Power Tips: Know the Differences Between bytes and str - 清零世界

Overview

a = b'h\x65llo'
print(list(a))
print(a)
>>>
[104, 101, 108, 108, 111]
b'hello'

a = 'a\u0300 propos'
print(list(a))
print(a)
>>>
['a', '̀', ' ', 'p', 'r', 'o', 'p', 'o', 's']
à propos

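When you call the conversion methods (str.encode to turn Unicode data into bytes, bytes.decode to turn bytes into Unicode data), you can explicitly specify the encoding you want or accept the system default, which is commonly UTF-8 (but not always, as discussed below).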
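As a minimal sketch of these two methods, the snippet below encodes a str with two explicit encodings and decodes it back. The variable names are illustrative only. Note that when no argument is passed, the methods themselves use UTF-8.

```python
text = '\u00e0 propos'  # 'à propos' with a precomposed à

# Explicit encodings: the same text becomes different byte sequences.
utf8_bytes = text.encode('utf-8')
latin1_bytes = text.encode('latin-1')
print(utf8_bytes)
print(latin1_bytes)

# Decoding with the matching encoding restores the original text.
print(utf8_bytes.decode('utf-8') == text)

# With no argument, str.encode() and bytes.decode() use UTF-8.
print(text.encode() == utf8_bytes)

# Decoding with the wrong encoding does not always fail; here it
# silently produces different text instead of raising.
mismatched = utf8_bytes.decode('latin-1')
print(mismatched == text)
```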
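When writing Python programs, do the decoding and encoding at the outermost boundary of your interfaces so that the core of the program can work with Unicode data. This approach is often called the Unicode sandwich. The core of the program should use the str type for Unicode data and should not assume any particular character encoding.
This lets the program accept many text encodings (such as Latin-1, Shift JIS, and Big5) and convert them all to Unicode, while guaranteeing that the text it outputs is always encoded in a single standard (ideally UTF-8).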
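The Unicode sandwich can be sketched roughly as follows. The function names read_text, shout, and write_text are hypothetical and exist only for this illustration; the point is that bytes appear only at the two boundaries, and the core function sees nothing but str.

```python
def read_text(raw: bytes, encoding: str) -> str:
    """Input boundary: decode incoming bytes exactly once."""
    return raw.decode(encoding)


def shout(text: str) -> str:
    """Core logic: works only with str and knows nothing about encodings."""
    return text.upper()


def write_text(text: str) -> bytes:
    """Output boundary: always encode as UTF-8."""
    return text.encode('utf-8')


# The same word arriving in two different encodings.
word = 'caf\u00e9'  # 'café'
latin1_input = word.encode('latin-1')
utf8_input = word.encode('utf-8')

out1 = write_text(shout(read_text(latin1_input, 'latin-1')))
out2 = write_text(shout(read_text(utf8_input, 'utf-8')))

# Both inputs yield identical UTF-8 output.
print(out1 == out2)
print(out1.decode('utf-8'))
```

In this sketch the caller has to know the encoding of each input. That knowledge lives at the boundary, which is exactly where the sandwich approach wants it.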
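The two character types correspond to two common situations in Python code: operating on raw 8-bit sequences (bytes), and operating on Unicode strings (str).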

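You will often need two helper functions to convert between these two cases and to ensure that the type of each input value matches what the code expects.
The first helper takes a bytes or str instance and always returns a str: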

def to_str(bytes_or_str):
    if isinstance(bytes_or_str, bytes):
        value = bytes_or_str.decode('utf-8')
    else:
        value = bytes_or_str
    return value  # Instance of str

print(repr(to_str(b'foo')))
print(repr(to_str('bar')))
>>>
'foo'
'bar'

The second helper takes a bytes or str instance and always returns bytes:

def to_bytes(bytes_or_str):
    if isinstance(bytes_or_str, str):
        value = bytes_or_str.encode('utf-8')
    else:
        value = bytes_or_str
    return value  # Instance of bytes

print(repr(to_bytes(b'foo')))
print(repr(to_bytes('bar')))
>>>
b'foo'
b'bar'

print(b'one' + b'two')
print('one' + 'two')
>>>
b'onetwo'
onetwo

b'one' + 'two'
>>>
Traceback ...
TypeError: can't concat str to bytes

'one' + b'two'
>>>
Traceback ...
TypeError: can only concatenate str (not "bytes") to str

assert b'red' > b'blue'
assert 'red' > 'blue'

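# Ordering comparisons between str and bytes raise an error
assert 'red' > b'blue'
>>>
Traceback ...
TypeError: '>' not supported between instances of 'str' and 'bytes'

assert b'blue' < 'red'
>>>
Traceback ...
TypeError: '<' not supported between instances of 'bytes' and 'str'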

print(b'foo' == 'foo')
>>>
False
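
Because this comparison evaluates to False rather than raising an error, a mixed-type equality check can hide a bug. One possible remedy, sketched here, is to normalize both sides to str before comparing. The same_text helper is hypothetical and assumes any bytes passed in are UTF-8 encoded unless another encoding is given.

```python
def same_text(left, right, encoding='utf-8'):
    """Compare two values as text, decoding any bytes first.

    Hypothetical helper for illustration; assumes bytes use `encoding`.
    """
    if isinstance(left, bytes):
        left = left.decode(encoding)
    if isinstance(right, bytes):
        right = right.decode(encoding)
    return left == right


# Direct comparison of bytes and str is always unequal.
print(b'foo' == 'foo')

# After normalizing to str, the contents are compared.
print(same_text(b'foo', 'foo'))
print(same_text('foo', b'bar'))
```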

print(b'red %s' % b'blue')
print('red %s' % 'blue')
>>>
b'red blue'
red blue

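# A str cannot be formatted into a bytes template
print(b'red %s' % 'blue')
>>>
Traceback ...
TypeError: %b requires a bytes-like object, or an object that implements __bytes__, not 'str'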

print('red %s' % b'blue')
>>>
red b'blue'
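
The last case raises no error, which makes it easy to overlook: the bytes value is inserted as its repr, prefix and quotes included. A minimal sketch of one way to avoid this is to decode the bytes before formatting. The UTF-8 encoding here is an assumption about how the bytes were produced.

```python
color = b'blue'

# Formatting bytes into a str template inserts repr(color), not its text.
unexpected = 'red %s' % color
print(unexpected)

# Decoding first gives the intended text. UTF-8 is assumed here.
intended = 'red %s' % color.decode('utf-8')
print(intended)
```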

with open('data.bin', 'w') as f:
    f.write(b'\xf1\xf2\xf3\xf4\xf5')
>>>
Traceback ...
TypeError: write() argument must be str, not bytes

with open('data.bin', 'wb') as f:
    f.write(b'\xf1\xf2\xf3\xf4\xf5')

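# Text mode decodes with the default encoding; this fails when it is UTF-8
with open('data.bin', 'r') as f:
    data = f.read()
>>>
Traceback ...
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 0: invalid continuation byte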

with open('data.bin', 'rb') as f:
    data = f.read()

assert data == b'\xf1\xf2\xf3\xf4\xf5'

with open('data.bin', 'r', encoding='cp1252') as f:
    data = f.read()

assert data == 'ñòóôõ'
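
Opening a file in text mode without an encoding argument relies on a platform default, which is not always UTF-8. As a rough sketch, the snippet below prints the preferred encoding reported by the locale module, which is typically what text-mode open() falls back to, and then reads the same bytes with an explicit encoding so the result does not depend on the platform. The temporary file path is created only for this illustration.

```python
import locale
import os
import tempfile

# The preferred encoding for this platform; text-mode open() typically
# uses it when no encoding argument is given. The value varies by system.
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)

# Write the raw bytes to a scratch file (path is illustrative only).
path = os.path.join(tempfile.mkdtemp(), 'data.bin')
with open(path, 'wb') as f:
    f.write(b'\xf1\xf2\xf3\xf4\xf5')

# Passing encoding explicitly makes the result platform-independent.
with open(path, 'r', encoding='cp1252') as f:
    explicit = f.read()

print(explicit == '\xf1\xf2\xf3\xf4\xf5')
```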
