Python: How do I add the string "ub" before every pronounced vowel in a string?

This is more complex than a simple regex can handle. For example, the desired result is:

"Hi, how are you?" → "Hubi, hubow ubare yubou?"

A simple regex cannot tell that the e in "are" is not pronounced.
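
To make the limitation concrete, here is a minimal sketch of the spelling-based approach. The function name naive_ub is hypothetical and is not part of the original answer; it simply prefixes every run of vowel letters with "ub".

```python
import re

def naive_ub(text):
    """Prefix every run of vowel letters with 'ub' (spelling-based, not sound-based)."""
    # naive_ub is an illustrative name; it looks only at letters, never at pronunciation.
    return re.sub(r"[aeiou]+", lambda m: "ub" + m.group(0), text, flags=re.IGNORECASE)

print(naive_ub("Hi, how are you?"))  # Hubi, hubow ubarube yubou?
```

The silent e in "are" gets its own "ub", so the word becomes "ubarube" instead of "ubare".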

You need a library that provides a pronunciation dictionary, such as nltk.corpus.cmudict:

from nltk.corpus import cmudict # $ pip install nltk
# $ python -c "import nltk; nltk.download('cmudict')"

def spubeak(word, pronunciations=cmudict.dict()):
    """Insert 'ub' before each vowel sound, spelling the word from its phonemes."""
    istitle = word.istitle() # remember, to preserve titlecase
    w = word.lower() # note: ignores Unicode case-folding
    for phonemes in pronunciations.get(w, []):
        parts = []
        for phoneme in phonemes:
            if phoneme[:1] == phoneme[1:2]:
                phoneme = phoneme[1:] # drop a doubled first letter, e.g. 'HH' -> 'H'
            isvowel = phoneme[-1].isdigit() # vowel phonemes end in a stress digit
            # prefix each vowel sound with 'ub' and drop its stress digit
            parts.append('ub'+phoneme[:-1] if isvowel else phoneme)
        result = ''.join(map(str.lower, parts))
        return result.title() if istitle else result # only the first pronunciation is used
    return word # word not found in the dictionary
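
The vowel test in spubeak relies on the CMUdict convention that vowel phonemes carry a trailing stress digit (0, 1, or 2) while consonants do not. The sketch below illustrates that convention without downloading the corpus. TINY_DICT and vowel_phonemes are hypothetical names, and the entries are hand-typed in the CMUdict style for illustration only, so they may not match the real corpus exactly.

```python
# Hypothetical mini-dictionary in the CMUdict style (an assumption for illustration):
# word -> list of pronunciations, each a list of ARPAbet phonemes.
TINY_DICT = {
    "hi": [["HH", "AY1"]],
    "how": [["HH", "AW1"]],
    "are": [["AA1", "R"], ["ER0"]],
    "you": [["Y", "UW1"]],
}

def vowel_phonemes(word, pronunciations=TINY_DICT):
    """Return the vowel phonemes of the first listed pronunciation, or [] if unknown."""
    for phonemes in pronunciations.get(word.lower(), []):
        # A phoneme is treated as a vowel when it ends in a stress digit.
        return [p for p in phonemes if p[-1].isdigit()]
    return []

print(vowel_phonemes("are"))  # ['AA1']
```

Under these assumed entries, "are" has a single vowel sound in its first pronunciation, which is why a dictionary lookup can skip the silent e that a letter-based regex cannot.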

Example:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re

sent = "Hi, how are you?"
subent = " ".join(["".join(map(spubeak, re.split(r"(\W+)", nonblank)))
                   for nonblank in sent.split()])
print('"{}" → "{}"'.format(sent, subent))

Tag: python. Posted 2022/1/1 18:48:35. 340 views.