NLTK - problems solved

1. Installation
http://www.nltk.org/download
numpy and yaml need to be installed first; I installed all three modules from source.
Installing them is quite easy: download the package, unzip it, and then run the command: sudo python setup.py install.
Usually, everything should be ready at this point.

However, while installing numpy, it turned out that numpy needs to compile some files with gcc, and the build needs Python's dynamic library.
There are two Python versions on my system: the one in /usr/bin/ is 2.4, and the one I actually use is 2.5, located in /usr/self/bin/.
When compiling, numpy correctly detected version 2.5, but gcc couldn't find libpython2.5.so, which is at /usr/self/libpython2.5.so.

My solution was to create a symbolic link to /usr/self/libpython2.5.so in a directory that gcc searches.
After that, everything worked.
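
Before creating the link, it can help to ask the interpreter itself where it expects its shared library to be. The snippet below is only a sketch: the helper name python_shared_library_info is made up for illustration, and it assumes the sysconfig module, which exists from Python 2.7 onwards (on 2.5, distutils.sysconfig offers a similar get_config_var function). On builds without a shared library the values may be missing or point to a static archive.

```python
import os
import sysconfig


def python_shared_library_info():
    """Report where the running interpreter expects its library to live.

    Hypothetical helper for illustration; the values depend on how
    Python was configured and may be None on some platforms.
    """
    libdir = sysconfig.get_config_var("LIBDIR")        # library directory of the install prefix
    ldlibrary = sysconfig.get_config_var("LDLIBRARY")  # library file name chosen at build time
    candidate = os.path.join(libdir, ldlibrary) if libdir and ldlibrary else None
    return {
        "libdir": libdir,
        "ldlibrary": ldlibrary,
        "candidate": candidate,
        "exists": bool(candidate) and os.path.exists(candidate),
    }
```

Printing the returned dictionary with the Python that numpy is being built for shows which file gcc would have to find.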


2. Download data package
The first time I downloaded the data package, I followed the instructions: I called nltk.download() and then entered "d book".

While downloading the first data package, I got this error message:
"Unzipping corpora/brown.zip"
"Error with downloaded zip file"
Of course I was confused, so I tried again... and got the same error.

Then I found the downloaded zip file and tried to unzip it myself, but unzip complained that the zip file was corrupted.
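
The same check can be done from Python with the standard zipfile module. This is a minimal sketch, not what NLTK does internally; zip_is_intact is a name I made up, and the exception name BadZipFile assumes Python 3 (Python 2 spells it BadZipfile).

```python
import zipfile


def zip_is_intact(path):
    """Return True if `path` is a zip archive whose members pass their CRC check."""
    if not zipfile.is_zipfile(path):
        # A truncated download usually fails here already.
        return False
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the name of the first bad member, or None.
            return archive.testzip() is None
    except (zipfile.BadZipFile, OSError):
        return False
```

Calling it on the downloaded brown.zip would tell whether the file on disk is broken, which points at the download rather than at the unzip step.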

So I had to check the source code of download.py. Based on that file, I tried downloading the data another way: nltk.download('book').
Er... this time it worked at first, then failed on another package with the same error message.
After that, I tried both ways a few more times, but they all failed.

So I went on reading the source code. After reading through the whole download flow, I found nothing special. So I tried the second way again, and a few more packages were downloaded...
Then it failed with a new error message: "Error downloading 'state_union' from <http://nltk.googlecode.com/svn/trunk/nltk_data/packages/corpora/state_union.zip>: <urlopen error (-3, 'Temporary failure in name resolution')>"

This error message is much clearer: the problems had always been related to the download itself.
Maybe the network was unstable at the time, or the server was, or ...

After that, I kept trying for a while longer, and now all the data has been downloaded. :)
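
Retrying by hand gets tedious, so the loop can be scripted. The sketch below is only an illustration: retry and download_book_collection are names I made up, the attempt count and delay are arbitrary, and it assumes that nltk.download('book') returns a true value on success and a false value on failure.

```python
import time


def retry(action, attempts=5, delay=10.0):
    """Call action() until it returns a true value or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            if action():
                return True
        except IOError as error:
            # urlopen errors are IOError subclasses, so DNS failures land here.
            print("attempt %d failed: %s" % (attempt, error))
        if attempt < attempts:
            time.sleep(delay)  # give the network or server time to recover
    return False


def download_book_collection():
    """Hypothetical wrapper; nltk is imported lazily so retry() works without it."""
    import nltk
    return nltk.download("book")
```

Usage would be retry(download_book_collection). Packages that were already downloaded completely should not need to be fetched again, so each attempt mostly continues where the last one stopped.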

3. Import nltk module
Error message: TypeError: walk() got an unexpected keyword argument 'followlinks'.
When importing the nltk book module, nltk calls os.walk() with the named argument 'followlinks', which was added in Python 2.6. However, my Python version was 2.5.
Solution: install a newer version of Python (2.7.2 in my case).
yaml, numpy and nltk then have to be reinstalled for the new Python.
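
With more than one Python on the system, it is worth checking which interpreter is actually running before importing nltk. A small sketch, with supports_followlinks being a made-up helper name:

```python
import sys


def supports_followlinks(version_info=sys.version_info):
    """os.walk() accepts the followlinks argument from Python 2.6 onwards."""
    return tuple(version_info[:2]) >= (2, 6)


# Show which interpreter is in use and whether its os.walk() is new enough.
print(sys.executable)
print(supports_followlinks())
```

Passing a tuple such as (2, 5, 0) checks another version without running it.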

4. Tkinter
When I tried the command text4.dispersion_plot(["python"]), it failed, complaining: "nltk.draw package not loaded (please install Tkinter library)".
The cause turned out to be that my system is Red Hat, and tcl/tk was installed without the devel packages. So install the corresponding devel packages, then rebuild and reinstall Python. Done!
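
A quick way to see whether a given Python build has Tk support is to try the import directly. This is a minimal sketch; tkinter_available is a made-up name, and the module is called Tkinter on Python 2 and tkinter on Python 3.

```python
def tkinter_available():
    """Return True if this interpreter can import its Tk bindings."""
    try:
        try:
            import tkinter  # Python 3 name
        except ImportError:
            import Tkinter  # Python 2 name
    except ImportError:
        # Neither name worked, so Python was probably built without Tk.
        return False
    return True


print(tkinter_available())
```

If this prints False after rebuilding Python, the build most likely still did not find the tcl/tk headers.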

5. Matplotlib
When running dispersion_plot(), it complains: "ValueError: The plot function requires the matplotlib package (aka pylab)" if you have not installed matplotlib.
When installing matplotlib, I hit this error:
"File "/build/buildd/matplotlib-1.0.1/setupext.py", line 832, in check_for_tk (Tkinter.__version__.split()[-2], Tkinter.TkVersion, Tkinter.TclVersion)) IndexError: list index out of range"

This is a known error; you can find the patch that fixes it here:
https://trac.macports.org/ticket/29893
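
For the curious, the failing expression is easy to reproduce in isolation. The version strings below are assumptions for illustration, not values read from my system: the expression works when Tkinter.__version__ looks like "$Revision: 81008 $" and breaks when the string has no spaces, such as "$Revision$".

```python
def revision_number(version_string):
    """Mimic the expression in setupext.py: take the second-to-last field."""
    return version_string.split()[-2]


# Assumed example values, not taken from a real Tkinter build.
print(revision_number("$Revision: 81008 $"))  # prints 81008

try:
    revision_number("$Revision$")  # split() yields a single field
except IndexError as error:
    print("IndexError: %s" % error)  # prints IndexError: list index out of range
```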
