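To make your content findable through a search box, you need to index all of it with the Lucene engine built into dotCMS.
There are a few prerequisites, which I'll spell out here because the software's design is a bit awkward:
1. First, make sure your site is reachable at 127.0.0.1.
2. Set your port to 80. On any other port, spindle will not crawl the content served there. Pretty dumb~
Sometimes port 80 is already taken when I install software, and yet dotCMS insists on using 80...
3. Make sure your site has at least one link, so the crawler can follow it from page to page.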
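The three prerequisites above can be checked by hand, but here is a minimal Python sketch of the same checks. It is only an illustration, not part of dotCMS or spindle: the function names are made up for this post, and the loopback check assumes dotCMS runs on the same machine you run the script on.

```python
import ipaddress
import socket
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects href values from plain <a> tags, the kind a crawler can follow."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            # javascript: pseudo-links give a crawler nothing to follow
            if href and not href.lower().startswith("javascript:"):
                self.links.append(href)


def resolves_to_loopback(hostname):
    """Prerequisite 1: the hostname resolves to a loopback address (e.g. 127.0.0.1)."""
    try:
        return ipaddress.ip_address(socket.gethostbyname(hostname)).is_loopback
    except OSError:
        return False


def uses_port_80(url):
    """Prerequisite 2: the URL is plain http on port 80 (explicit or implied)."""
    parts = urlparse(url)
    return parts.scheme == "http" and (parts.port or 80) == 80


def crawlable_links(html):
    """Prerequisite 3: return the plain links found in a page's HTML."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links
```

For example, uses_port_80("http://127.0.0.1:8080/") is False, which matches the case where spindle would not index anything. To check the third prerequisite, pass the HTML of your homepage to crawlable_links and make sure the list is not empty.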
Once these prerequisites are met, log in with an administrator account and go to:
http://{yoursite.com}/dotScheduledJobs
For example:
http://127.0.0.1:80/dotScheduledJobs
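Then run com.dotmarketing.quartz.job.BuildSearch
The crawler will start crawling and building the index. You can watch the progress in your logs.
When it finishes, you can display the search results by writing Velocity code in a content item. See home/search-result.dot for reference.
The index build can also be set to run automatically.
Find dotmarketing-config.properties under the installation directory.
Set the following and you're done.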
#BuildSearchServletThread
ENABLE_BUILD_SEARCH_SERVLET_THREAD=true
BUILD_SEARCH_SERVLET_THREAD_INIT_DELAY=60
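If you want to double-check that the two settings above really made it into the file, here is a rough Python sketch. It is my own helper, not something dotCMS provides, and the parser is deliberately simplified: it only handles plain key=value lines and # comments, not every feature of the Java properties format.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def scheduled_reindex_enabled(props):
    """True if the automatic BuildSearch thread is switched on."""
    return props.get("ENABLE_BUILD_SEARCH_SERVLET_THREAD", "").lower() == "true"


# Sample text copied from the settings above; in practice, read the file from
# your own installation directory instead.
sample = """#BuildSearchServletThread
ENABLE_BUILD_SEARCH_SERVLET_THREAD=true
BUILD_SEARCH_SERVLET_THREAD_INIT_DELAY=60
"""
props = parse_properties(sample)
print(scheduled_reindex_enabled(props), props["BUILD_SEARCH_SERVLET_THREAD_INIT_DELAY"])
```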
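I tested it and found that Chinese text is searchable, but the word segmentation isn't great. If you want to use a dictionary for that, see my next post. ^_^
My example: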
Search Results
-
塘洞老年协会 | dotCMS Core
-
Score: 18.042368%
...
简介 | dotCMS Core
-
Score: 15.911873000000002%
this is zhaowei page. 塘洞村介绍 塘洞村地处广西资源县两水苗族乡老山界下。塘洞村有400多户,1800多人,竹林与森林资源丰富,田园广阔,水旱不忧,堪称鱼米之乡。这是一个风光秀丽,有连绵的竹林与林海,有高山瀑布,清澈见底的小河,还有层层梯田等。沿着山路,可以直通猫儿山主峰,一路秀色。村中还有几百年的月月桂(广西仅有六棵),每月绽放!同时,该村具有光荣革命传统的古村落,桂北瑶民曾在此举起义旗,反抗暴政。根据老一辈革命家陈云同志的日记记载,中国工农红军长征翻越老山界后下宿的第一地点,就是当 ...
协会介绍 | dotCMS Core
-
Score: 15.625148999999999%
...
协会活动 | dotCMS Core
Reference:
http://www.dotcms.org/documentation/EnablingSiteSearch
Enabling Site Search
Spindle is a search engine built on top of lucene that can be used to perform site wide searching in dotCMS. Setting up spindle can be a little tricky, or stinky as the case may be, and we are looking into more robust options to enable site-wide searching. But for now, here are the steps to getting spindle initially set up:
- Make sure that your dotCMS hostname correctly resolves to your dotCMS instance. This means that if your dotCMS hostname is mysite.com and dotCMS is running on 127.0.0.1, you need to make sure that when your dotCMS server does a DNS lookup, mysite.com resolves to 127.0.0.1. Many times mysite.com will resolve to an external load balancer / NAT address which spindle will not be able to crawl. Edit your "hosts" file on your dotCMS server to force correct resolution.
- Spindle will only crawl on port 80. If your dotCMS is running on port 8080, your content will not be indexed.
- Make sure that your "homepage" has spiderable links. If your homepage is all javascript, spindle will not be able to crawl the urls.
- Log into the dotCMS backend as a CMS administrator. Then, in the same browser window, go to http://{yoursite.com}/dotScheduledJobs .
- Run the job called: com.dotmarketing.quartz.job.BuildSearch
- If you tail the log files, you should be able to see spindle indexing urls. Once that is complete, you should get search results when you search on your site.
To automatically schedule a re-index of your site, you can edit the dotmarketing-config.properties. Again, for this to work, all the above conditions need to be true.
#BuildSearchServletThread
ENABLE_BUILD_SEARCH_SERVLET_THREAD=true
BUILD_SEARCH_SERVLET_THREAD_INIT_DELAY=60