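BeautifulSoup stores data internally as Unicode, so you don't need to handle character encodings manually.
To find keywords case-insensitively in the text (not in attribute values or tag names):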
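#!/usr/bin/env python
# Python 2 script using BeautifulSoup 3.
import urllib2
from contextlib import closing

import regex  # pip install regex
from BeautifulSoup import BeautifulSoup

URL = 'http://example.com/'  # placeholder: replace with the page to search

with closing(urllib2.urlopen(URL)) as page:
    soup = BeautifulSoup(page)
    # (?fi) = full case-folding + ignore case; \L<keywords> matches any item of the named list.
    # text=... searches only text nodes, so tag names and attribute values are skipped.
    print soup(text=regex.compile(ur'(?fi)\L<keywords>',
                                  keywords=['your', 'keywords', 'go', 'here']))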
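For readers on Python 3 without BeautifulSoup installed, the same idea (match keywords only in text nodes, ignoring tag names and attribute values) can be sketched with the standard library alone. This is a substitute technique, not the BeautifulSoup approach above: it uses `html.parser` and `re`, and `re.IGNORECASE` does not perform the full case-folding that `(?f)` gives in the `regex` module. The names `TextKeywordFinder`, `find_keyword_text`, and `SAMPLE_HTML` are hypothetical, chosen only for illustration.

```python
import re
from html.parser import HTMLParser


class TextKeywordFinder(HTMLParser):
    """Collect text nodes that contain any of the given keywords."""

    def __init__(self, keywords):
        super().__init__(convert_charrefs=True)
        # Escape each keyword so it is matched literally, ignoring case.
        self.pattern = re.compile("|".join(map(re.escape, keywords)),
                                  re.IGNORECASE)
        self.matches = []

    def handle_data(self, data):
        # handle_data only receives text between tags, never tag names
        # or attribute values.
        if self.pattern.search(data):
            self.matches.append(data.strip())


def find_keyword_text(html, keywords):
    """Return the stripped text chunks of `html` that mention a keyword."""
    finder = TextKeywordFinder(keywords)
    finder.feed(html)
    finder.close()
    return finder.matches


# Illustrative input: "keywords" appears only as an attribute value.
SAMPLE_HTML = ('<p class="keywords">Nothing to see.</p>'
               '<p>Put YOUR text <b>here</b>.</p>')
print(find_keyword_text(SAMPLE_HTML, ["your", "keywords"]))
```

Note that this parser also passes the contents of `script` and `style` elements to `handle_data`, so a real page may need those filtered out.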