Overview
    In [1]: from pattern.en import singularize

    In [2]: singularize('patterns')
    Out[2]: 'pattern'

    In [3]: singularize('gases')
    Out[3]: 'gase'
I worked around the problem in the second example by defining:
    from pattern.en import singularize

    def my_singularize(strn):
        '''
        Return the singular of a noun. Add special cases to correct
        pattern's generic rules.
        '''
        exceptionDict = {'gases': 'gas', 'spectra': 'spectrum',
                         'cross': 'cross', 'nuclei': 'nucleus'}
        # Known exceptions take priority; everything else goes to pattern.
        try:
            return exceptionDict[strn]
        except KeyError:
            return singularize(strn)
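The same workaround can be written so that the exception table and the fallback function are passed in rather than hard-coded. The following is only a sketch: `make_singularizer` and `strip_s` are hypothetical names introduced here, and `strip_s` is a naive stand-in for pattern's `singularize` so that the example runs without pattern installed.

```python
from typing import Callable, Mapping


def make_singularizer(base: Callable[[str], str],
                      exceptions: Mapping[str, str]) -> Callable[[str], str]:
    """Wrap `base` so that words listed in `exceptions` bypass it."""
    table = dict(exceptions)  # copy, so later changes to the caller's dict have no effect

    def singularize_with_exceptions(word: str) -> str:
        # Exceptions win; `base` is only called for words not in the table.
        if word in table:
            return table[word]
        return base(word)

    return singularize_with_exceptions


def strip_s(word: str) -> str:
    """Hypothetical stand-in for pattern's singularize: drop one trailing 's'."""
    return word[:-1] if word.endswith('s') else word


EXCEPTIONS = {'gases': 'gas', 'spectra': 'spectrum',
              'cross': 'cross', 'nuclei': 'nucleus'}

my_singularize = make_singularizer(strip_s, EXCEPTIONS)

for w in ['patterns', 'gases', 'cross', 'nuclei']:
    print(w, '->', my_singularize(w))
```

With pattern installed, passing `singularize` from `pattern.en` as `base` instead of `strip_s` should give the behaviour of the function above.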
Is there a better way to do this, for example by adding to pattern's rules, or by making exceptionDict somehow internal to pattern?
An alternative is to lemmatize with NLTK's WordNetLemmatizer:

    from nltk.stem import WordNetLemmatizer

    wnl = WordNetLemmatizer()
    test_words = ['gases', 'spectrum', 'cross', 'nuclei']

    %timeit [wnl.lemmatize(wrd) for wrd in test_words]
    10000 loops, best of 3: 60.5 µs per loop
Compared with your function:
    %timeit [my_singularize(wrd) for wrd in test_words]
    1000 loops, best of 3: 162 µs per loop
The NLTK lemmatizer is faster on this test: 60.5 µs versus 162 µs per loop.
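The `%timeit` magic only works inside IPython. Outside it, a comparable measurement can be made with the standard `timeit` module. This is a minimal sketch: `best_time_per_loop` is a hypothetical helper, and the two candidate functions are stand-ins so that the code runs without pattern or NLTK. Absolute timings depend on the machine.

```python
import timeit
from typing import Callable, Sequence


def best_time_per_loop(func: Callable[[str], str],
                       words: Sequence[str],
                       repeat: int = 3,
                       number: int = 1000) -> float:
    """Best-of-`repeat` time, in seconds, for one pass of `func` over `words`."""
    timer = timeit.Timer(lambda: [func(w) for w in words])
    return min(timer.repeat(repeat=repeat, number=number)) / number


test_words = ['gases', 'spectrum', 'cross', 'nuclei']

# Stand-in candidates (assumptions for illustration, not the real libraries).
LOOKUP = {'gases': 'gas', 'nuclei': 'nucleus'}


def lookup_only(word: str) -> str:
    return LOOKUP.get(word, word)


def lookup_then_strip(word: str) -> str:
    if word in LOOKUP:
        return LOOKUP[word]
    return word[:-1] if word.endswith('s') else word


for name, func in [('lookup_only', lookup_only),
                   ('lookup_then_strip', lookup_then_strip)]:
    seconds = best_time_per_loop(func, test_words)
    print(f'{name}: {seconds * 1e6:.2f} µs per loop')
```

To reproduce the comparison above, pass `wnl.lemmatize` and `my_singularize` as `func` with the same `test_words`.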