Python 3 CSV file giving UnicodeDecodeError: 'utf-8' codec can't decode byte error when I print

We know the file contains the byte b'\x96', because the error message mentions it:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 7386: invalid start byte
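To see what surrounds the offending byte, the exception object itself can be inspected. The following is a minimal sketch; the helper name describe_decode_failure and the sample bytes are made up for illustration and are not part of the original answer. It relies only on the standard start, end, and reason attributes of UnicodeDecodeError.

```python
def describe_decode_failure(raw, encoding='utf-8', context=10):
    """Return details about the first byte that fails to decode, or None.

    Hypothetical helper for illustration only.
    """
    try:
        raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return {
            'position': exc.start,                # offset of the bad byte
            'byte': raw[exc.start:exc.end],       # the byte(s) that failed
            'reason': exc.reason,                 # the codec's explanation
            # a few bytes on either side, to help guess the intended text
            'context': raw[max(0, exc.start - context):exc.end + context],
        }
    return None  # the data decoded cleanly


# Sample bytes standing in for the real file contents (an assumption).
info = describe_decode_failure(b'Espa\x96a')
print(info)
```

For a real file, the bytes could be obtained with open('my_file.csv', 'rb').read() and passed to the same helper.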

Now we can write a small script to find out whether there are any encodings in which b'\x96' decodes to ñ:

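import pkgutil
import encodings
import encodings.aliases
import os

def all_encodings():
    # Codec modules shipped in the standard library's encodings package
    modnames = {modname for importer, modname, ispkg in pkgutil.walk_packages(
        path=[os.path.dirname(encodings.__file__)], prefix='')}
    # Plus every canonical codec name that an alias maps to
    aliases = set(encodings.aliases.aliases.values())
    return modnames.union(aliases)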

text = b'\x96'
for enc in all_encodings():
    try:
        msg = text.decode(enc)
    except Exception:
        continue
    if msg == 'ñ':
        print('Decoding {t} with {enc} is {m}'.format(t=text, enc=enc, m=msg))

This produces:

Decoding b'\x96' with mac_roman is ñ
Decoding b'\x96' with mac_farsi is ñ
Decoding b'\x96' with mac_croatian is ñ
Decoding b'\x96' with mac_arabic is ñ
Decoding b'\x96' with mac_romanian is ñ
Decoding b'\x96' with mac_iceland is ñ
Decoding b'\x96' with mac_turkish is ñ

So try changing

with open('my_file.csv', 'r', newline='') as csvfile:

to use one of these encodings, for example:

with open('my_file.csv', 'r', encoding='mac_roman', newline='') as csvfile:
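
Putting this together with the csv module, a minimal sketch might look like the following. The helper name read_csv_rows and the sample file contents are assumptions made for illustration; the temporary file simply stands in for my_file.csv so that the example is self-contained.

```python
import csv
import os
import tempfile


def read_csv_rows(path, encoding):
    """Read every row of a CSV file, decoding it with the given encoding."""
    with open(path, 'r', encoding=encoding, newline='') as csvfile:
        return list(csv.reader(csvfile))


# Build a throwaway file whose bytes are Mac Roman, so 'ñ' is stored as 0x96.
with tempfile.NamedTemporaryFile(delete=False, suffix='.csv') as tmp:
    tmp.write('name,country\r\nPeña,España\r\n'.encode('mac_roman'))
    sample_path = tmp.name

try:
    rows = read_csv_rows(sample_path, 'mac_roman')
finally:
    os.remove(sample_path)  # clean up the temporary file

print(rows)
```

If the printed text still looks wrong with one candidate encoding, the same call can be repeated with another name from the list above.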
python  2022/1/1 18:31:51  201 views
