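Before reading this article, you may want to skim "Solr 5.0 Quick Start" first.
1. Start Solr in standalone mode
Change to the Solr 5.0 installation directory and run: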
[root@datanode-4 solr-5.0.0]# ./bin/solr start
Waiting to see Solr listening on port 8983 [|]
Started Solr server on port 8983 (pid=46859). Happy searching!
Then open http://10.51.121.10:8983/solr/ in a browser.
2. Create a core named "mycore1".
[root@datanode-4 solr-5.0.0]# ./bin/solr create -c mycore1
Setup new core instance directory:
/root/nutch/solr-5.0.0/server/solr/mycore1
Creating new core 'mycore1' using command:
http://localhost:8983/solr/admin/cores?action=CREATE&name=mycore1&instanceDir=mycore1
{
"responseHeader":{
"status":0,
"QTime":5533},
"core":"mycore1"}
Open http://10.51.121.10:8983/solr/#/mycore1 to see the new core in the admin UI.
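The core can also be checked from the command line instead of the browser. The sketch below assumes Solr is reachable at http://localhost:8983/solr, as in the output above, and queries the CoreAdmin STATUS action; `SOLR_URL`, `core_status_url`, and `core_status` are hypothetical names introduced only for this example.

```shell
# Base URL of the Solr instance (assumption: same host and port as above).
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Print the CoreAdmin STATUS URL for the core named in $1.
core_status_url() {
  printf '%s/admin/cores?action=STATUS&core=%s&wt=json\n' "$SOLR_URL" "$1"
}

# Fetch the status of the core named in $1 as JSON.
core_status() {
  curl -s "$(core_status_url "$1")"
}
```

Running `core_status mycore1` should return a JSON document whose "status" section describes mycore1 if the core was created successfully.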
3. Build the index. The directory /root/nutch/data contains Word, Excel, PowerPoint, PDF, TXT, and other files. Index everything in it with bin/post:
[root@datanode-4 solr-5.0.0]# ./bin/post -c mycore1 /root/nutch/data/
/usr/lib/jdk1.7.0_75/bin/java -classpath /root/nutch/solr-5.0.0/dist/solr-core-5.0.0.jar -Dauto=yes -Dc=mycore1 -Ddata=files -Drecursive=yes org.apache.solr.util.SimplePostTool /root/nutch/data/
SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/mycore1/update...
Entering auto mode. File endings considered are xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
Entering recursive mode, max depth=999, delay=0s
Indexing directory /root/nutch/data (8 files, depth=0)
POSTing file Oracle Database 12c 数据库服务技术.ppt (application/vnd.ms-powerpoint) to [base]/extract
POSTing file index.html (text/html) to [base]/extract
POSTing file Oracle_Database_12c_的新_PLSQL_功能.pptx (application/vnd.openxmlformats-officedocument.presentationml.presentation) to [base]/extract
POSTing file Apache Hive Guideline.docx (application/vnd.openxmlformats-officedocument.wordprocessingml.document) to [base]/extract
POSTing file Excel测试.xlsx (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet) to [base]/extract
POSTing file 文本文件测试1.txt (text/plain) to [base]/extract
POSTing file Hadoop YARN 基本架构和发展趋势.pdf (application/pdf) to [base]/extract
POSTing file Oracle NoSQL Guideline.docx (application/vnd.openxmlformats-officedocument.wordprocessingml.document) to [base]/extract
8 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/mycore1/update...
Time spent: 0:00:02.335
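To confirm that the documents actually reached the index, the core can be asked for its document count. This is a minimal sketch, assuming Solr at http://localhost:8983/solr and a core that was empty before the run; `SOLR_URL`, `num_found`, and `count_docs` are hypothetical names, and the `sed` extraction is a shortcut rather than a real JSON parser.

```shell
# Base URL of the Solr instance (assumption: same host and port as above).
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Read a Solr JSON response on stdin and print the value of "numFound".
num_found() {
  sed -n 's/.*"numFound":[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Match all documents in the core named in $1, return no rows,
# and print only the total hit count.
count_docs() {
  curl -s -G "$SOLR_URL/$1/select" \
    --data-urlencode 'q=*:*' \
    --data-urlencode 'rows=0' \
    --data-urlencode 'wt=json' | num_found
}
```

With the run shown above, `count_docs mycore1` would be expected to print the same number as the "8 files indexed." line, provided each posted file became one document.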
4. Search
In the q field on the Query tab, enter "总部" ("headquarters") and click the "Execute Query" button to retrieve the documents that contain "总部".
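The same query can be sent without the admin UI. The sketch below assumes Solr at http://localhost:8983/solr and uses the core's select handler; `SOLR_URL`, `select_endpoint`, and `search_core` are hypothetical names for this example. curl's `--data-urlencode` percent-encodes the Chinese search term, so it can be passed as-is.

```shell
# Base URL of the Solr instance (assumption: same host and port as above).
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Print the select (query) endpoint for the core named in $1.
select_endpoint() {
  printf '%s/%s/select\n' "$SOLR_URL" "$1"
}

# Query the core named in $1 for the term in $2 and print the JSON response.
search_core() {
  curl -s -G "$(select_endpoint "$1")" \
    --data-urlencode "q=$2" \
    --data-urlencode "wt=json"
}
```

For example, `search_core mycore1 '总部'` issues the query that the Query tab sends when "总部" is entered in the q field.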