This post just records one boring but successful experiment, as a small farewell to 2010, which is almost over~~
pylucene: http://lucene.apache.org/pylucene/index.html
PyLucene is a Python extension for accessing Java Lucene. Its goal is to allow you to use Lucene's text indexing and searching capabilities from Python.
Base environment (my test machine):
pylucene-3.0.3-1 http://mirror.bjtu.edu.cn/apache//lucene/pylucene/pylucene-3.0.3-1-src.tar.gz (bundles JCC 2.7)
Python 2.5.4
Java 1.6.0_16
Red Hat 4.1.2-46
GCC 4.1.2
Ant 1.6.5
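
To compare your own machine against the list above, something like the following sketch can print the versions it finds. It assumes java, gcc and ant are on PATH under those names. The helper names (tool_version, report) are made up for illustration and are not part of PyLucene or JCC.

```python
import subprocess
import sys


def tool_version(command):
    """Return the first line a tool prints for its version flag, or None."""
    try:
        process = subprocess.Popen(command, stdout=subprocess.PIPE,
                                   stderr=subprocess.STDOUT)
        output = process.communicate()[0]
    except OSError:
        # The executable could not be started (typically: not on PATH).
        return None
    lines = output.decode("utf-8", "replace").strip().splitlines()
    if not lines:
        return None
    return lines[0]


def report():
    """Collect one version line per prerequisite."""
    checks = [
        ("java", ["java", "-version"]),  # java prints its version on stderr
        ("gcc", ["gcc", "--version"]),
        ("ant", ["ant", "-version"]),
    ]
    found = {"python": sys.version.split()[0]}
    for name, command in checks:
        found[name] = tool_version(command)
    return found


if __name__ == "__main__":
    versions = report()
    for name in sorted(versions):
        print("%-6s %s" % (name, versions[name] or "not found"))
```

A tool reported as "not found" only means it is not on PATH for this shell. It may still be installed somewhere else, as my JDK was.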
Set up the environment above, then download the pylucene tarball and unpack it.
pushd jcc
Edit setup.py and review that values in the INCLUDES, CFLAGS, DEBUG_CFLAGS, LFLAGS and JAVAC are correct for your system (I installed my JDK myself, so I changed the linux2 entry of JDK; everything else was fine after review)
$ python setup.py build
$ sudo python setup.py install (JCC is now installed)
popd
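
If you point the JDK entry in setup.py at a self-installed JDK, it may be worth checking first that the directory really contains the JNI headers JCC compiles against. The sketch below assumes the usual Linux JDK layout (include/jni.h and include/linux/jni_md.h). The path /opt/jdk1.6.0_16 is a hypothetical example, and the function names are mine, not JCC's.

```python
import os


def jdk_include_dirs(jdk_home):
    """Directories a Linux JDK is expected to provide for JNI headers."""
    include = os.path.join(jdk_home, "include")
    return [include, os.path.join(include, "linux")]


def missing_jni_headers(jdk_home):
    """Return the JNI header paths that do not exist under jdk_home."""
    include, include_linux = jdk_include_dirs(jdk_home)
    expected = [
        os.path.join(include, "jni.h"),
        os.path.join(include_linux, "jni_md.h"),
    ]
    return [path for path in expected if not os.path.isfile(path)]


if __name__ == "__main__":
    # Hypothetical location; replace with wherever your JDK is unpacked.
    jdk_home = os.environ.get("JAVA_HOME", "/opt/jdk1.6.0_16")
    missing = missing_jni_headers(jdk_home)
    if missing:
        print("missing under %s: %s" % (jdk_home, ", ".join(missing)))
    else:
        print("JNI headers found under %s" % jdk_home)
```

An empty result suggests the path is a reasonable value for the JDK entry. It does not prove the rest of setup.py is right.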
Edit Makefile to match your environment (just uncomment the section for your Linux setup, then change PREFIX_PYTHON if you installed Python yourself)
make
sudo make install
make test (look for failures) (pylucene is now installed)
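
Besides make test, a quick way to see whether both pieces landed in the right interpreter is to try importing them. This is only a minimal sketch: can_import is a helper name I made up, and it must be run with the same python that did the install.

```python
def can_import(module_name):
    """Return True if module_name can be imported by this interpreter."""
    try:
        __import__(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # jcc comes from the setup.py install, lucene from make install.
    for module_name in ("jcc", "lucene"):
        if can_import(module_name):
            print("%s: ok" % module_name)
        else:
            print("%s: not importable" % module_name)
```

If jcc imports but lucene does not, the likely suspect is the Makefile step (for example, PREFIX_PYTHON pointing at a different Python).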
cd samples
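python IndexFiles.py <doc_directory>   indexes the text files under <doc_directory> and stores the result as an index named "index" in the current directory
python SearchFiles.py   interactively queries the index that was just built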
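
For a non-interactive query against the same index, a sketch along the lines of the bundled SearchFiles.py could look like this. It assumes the flat "lucene" module of the PyLucene 3.0.x series and the field names used by the samples in that release ("contents", "path", "name"). search_index is my own wrapper, not a PyLucene function, and I have not verified it against other versions.

```python
try:
    import lucene
except ImportError:
    lucene = None  # PyLucene is not installed for this interpreter


def search_index(index_dir, query_text, max_hits=10):
    """Return (path, name) pairs for documents matching query_text.

    Assumes lucene.initVM() has already been called in this process.
    """
    if lucene is None:
        raise RuntimeError("PyLucene is not importable")
    directory = lucene.SimpleFSDirectory(lucene.File(index_dir))
    searcher = lucene.IndexSearcher(directory, True)  # True = read-only
    try:
        analyzer = lucene.StandardAnalyzer(lucene.Version.LUCENE_CURRENT)
        parser = lucene.QueryParser(lucene.Version.LUCENE_CURRENT,
                                    "contents", analyzer)
        query = parser.parse(query_text)
        hits = []
        for score_doc in searcher.search(query, max_hits).scoreDocs:
            doc = searcher.doc(score_doc.doc)
            hits.append((doc.get("path"), doc.get("name")))
        return hits
    finally:
        searcher.close()


if __name__ == "__main__":
    import sys
    if lucene is None or len(sys.argv) < 2:
        print("usage: python <this script> <query>  (requires PyLucene)")
    else:
        lucene.initVM()  # start the embedded JVM once per process
        for path, name in search_index("index", sys.argv[1]):
            print("%s\t%s" % (path, name))
```

Run it from the directory where IndexFiles.py wrote its "index" directory, passing the query as the first argument.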
~.~