You can use Gensim's built-in show_topic method to get the highest-probability words for each topic of an LDA model.
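from gensim import models

lda = models.LdaModel.load('lda.model')
# Open the file once, outside the loop; reopening it with 'w' per topic would overwrite earlier topics.
with open('output_file.txt', 'w', encoding='utf-8') as outfile:
    for i in range(lda.num_topics):
        outfile.write('Topic #{}:\n'.format(i + 1))
        # show_topic returns (word, probability) pairs for topic i
        for word, prob in lda.show_topic(i, topn=20):
            outfile.write('{}\n'.format(word))
        outfile.write('\n')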
This writes a file in a format similar to the following:
Topic #69:
pet
dental
tooth
adopt
animal
puppy
rescue
dentist
adoption
animal
shelter
pet
dentistry
vet
paw
pup
patient
mix
foster
owner
Topic #70:
periscope
disneyland
disney
snapchat
brandon
britney
periscope
periscope
replay
britneyspear
buffaloexchange
britneyspear
https
meerkat
blab
periscope
kxci
toni
disneyland
location
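
The listing above keeps only the words and discards the probability that show_topic returns with each one. If the weights are also of interest, a minimal sketch along the following lines may help. The helper name write_topics_with_weights and the tab-separated layout are assumptions made for illustration, not part of Gensim; the function relies only on num_topics and show_topic(i, topn=...) as used in the code above.

```python
def write_topics_with_weights(model, outfile, topn=20):
    """Write each topic's top words and their probabilities to an open text file.

    `model` is assumed to expose `num_topics` and `show_topic(i, topn=...)`,
    as a loaded Gensim LdaModel does. The helper name is hypothetical.
    """
    for i in range(model.num_topics):
        outfile.write('Topic #{}:\n'.format(i + 1))
        for word, prob in model.show_topic(i, topn=topn):
            # One word per line, followed by its probability within the topic
            outfile.write('{}\t{:.4f}\n'.format(word, float(prob)))
        outfile.write('\n')
```

It could be called as write_topics_with_weights(lda, outfile) inside a with open('output_file.txt', 'w', encoding='utf-8') as outfile: block.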