First, install jieba: run pip install jieba, or download it from http://pypi.python.org/pypi/jieba/, unpack it, cd into the unpacked directory, and install it from there. Alternatively, put the package in your site-packages directory.
Second, you can use it to segment words:
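import jieba

file_path = 'd:/平凡的世界.txt'

# Read the whole source text; bytes that are not valid UTF-8 are skipped.
with open(file_path, 'r', encoding='utf-8', errors='ignore') as file:
    file_context = file.read()

# cut_all=False selects precise mode; jieba.cut returns a generator of tokens.
seg_list = jieba.cut(file_context, cut_all=False)

# Open the output file once (append mode) and write each token followed by a tab.
with open('d:/result.txt', 'a', encoding='utf-8', errors='ignore') as f:
    for word in seg_list:
        f.write(word + '\t')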
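If you need the same step for more than one file, a possible variation is to wrap it in a small function. This is only a sketch: join_tokens and segment_to_file are hypothetical helper names, not part of jieba, and the segmenter is passed in as a parameter so that jieba.cut (or any other function returning tokens) can be plugged in. It also drops whitespace-only tokens, on the assumption that you do not want the spaces and line breaks from the source text in the result.

```python
from typing import Callable, Iterable


def join_tokens(tokens: Iterable[str]) -> str:
    """Join tokens with tabs, skipping empty and whitespace-only tokens."""
    return '\t'.join(t for t in tokens if t.strip())


def segment_to_file(text: str, out_path: str,
                    cut: Callable[[str], Iterable[str]]) -> int:
    """Segment `text` with `cut` and write the tab-separated tokens to `out_path`.

    `cut` is any callable that returns an iterable of tokens; with jieba
    installed you would pass jieba.cut. Returns the number of tokens written.
    """
    tokens = [t for t in cut(text) if t.strip()]
    # 'w' overwrites an existing file; use 'a' to append as the script above does.
    with open(out_path, 'w', encoding='utf-8') as f:
        f.write(join_tokens(tokens))
    return len(tokens)
```

With jieba installed, a call would look like segment_to_file(file_context, 'd:/result.txt', jieba.cut).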
You can find more help at https://github.com/fxsjy/jieba