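I have been learning Oracle SES for a while now, so here is a summary.
The Oracle Secure Enterprise Search site offers a white paper, and the docs bundled with the installer explain SES configuration and usage in detail. What I downloaded, though, was a quick-start tutorial, http://stcurriculum.oracle.com/tutorial/SESAdminTutorial/index.htm. If you are interested, download it and get a feel for it.
First, the source types SES can search, eight in total: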
Web: A Web source represents the content on a specific Web site. Web sources facilitate maintenance crawling of specific Web sites.
Table: A table source represents content in an Oracle database table or view.
File: A file source is the set of documents that can be accessed through the file protocol.
E-mail: An e-mail source derives its content from e-mails sent to a specific e-mail address. When Oracle SES crawls an e-mail source, it collects e-mail from all folders set up in the e-mail account, including Drafts, Sent Items, and Trash e-mails.
Mailing list: A mailing list source derives its content from e-mails sent to a specific mailing list.
OracleAS Portal: An OracleAS Portal source allows users to search across multiple OracleAS Portal repositories, such as Web pages, files on disk, and pages on other OracleAS Portal instances.
Federated: A federated source is a repository that maintains its own index. Oracle SES can issue a search, and the repository can return results.
User-defined: You can implement a crawler plug-in to crawl and index a proprietary document repository, such as Lotus Notes or Documentum.
Next, the features:
Secure Search
Federated Search
Web Services API
Extensible Crawler Plug-in Framework
As the name suggests, the whole point of SES is searching your repositories securely, so Secure Search is the main attraction. Secure Search comes in four forms:
Admin-based Authorization
Custom Crawler Plug-in
Query Time Authorization
Self Service Authorization
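The first form is built on a global, static access control list (ACL): any access a user has to a resource must be defined in that ACL.
The second form is more flexible: depending on your needs, you implement the API that SES provides to write a custom crawler and index your own repository.
The third form, unlike the first, is dynamic: it checks authorization every time a user searches, and all it takes is implementing the QueryTimeFilter interface.
The fourth form is a self-defined kind of authorization, which lets SES validate searches against repositories outside the ones it defines itself.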
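To make the difference between the first and third forms concrete, here is a minimal sketch in plain Java. It does not use the SES API at all: QueryTimeCheck is a hypothetical stand-in for the idea behind QueryTimeFilter (whose real method signatures are not shown here), and the URLs and user names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class AuthorizationSketch {

    /** Hypothetical callback, NOT the real SES QueryTimeFilter interface. */
    interface QueryTimeCheck {
        boolean isAuthorized(String user, String documentUrl);
    }

    /** Static style: the ACL (document URL -> allowed users) is fixed ahead of time. */
    static List<String> filterByStaticAcl(String user, List<String> hits,
                                          Map<String, Set<String>> acl) {
        List<String> visible = new ArrayList<>();
        for (String url : hits) {
            Set<String> allowed = acl.get(url);
            // A document with no ACL entry is hidden from everyone.
            if (allowed != null && allowed.contains(user)) {
                visible.add(url);
            }
        }
        return visible;
    }

    /** Dynamic style: the callback is asked about every hit on every search. */
    static List<String> filterAtQueryTime(String user, List<String> hits,
                                          QueryTimeCheck check) {
        List<String> visible = new ArrayList<>();
        for (String url : hits) {
            if (check.isAuthorized(user, url)) {
                visible.add(url);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        // Made-up hits and ACL entries, for illustration only.
        List<String> hits = List.of("http://intranet/hr/salary.html",
                                    "http://intranet/wiki/home.html");
        Map<String, Set<String>> acl = Map.of(
                "http://intranet/hr/salary.html", Set.of("alice"),
                "http://intranet/wiki/home.html", Set.of("alice", "bob"));
        System.out.println(filterByStaticAcl("bob", hits, acl));

        // The rule runs at search time, so it can consult live data.
        QueryTimeCheck check = (user, url) -> !url.contains("/hr/") || user.equals("alice");
        System.out.println(filterAtQueryTime("bob", hits, check));
    }
}
```

The trade-off is the usual one: the static list is cheap to evaluate but goes stale until someone updates it, while the per-query check is always current but costs a call for every hit.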
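Next, the third feature, the Web Services API. This is the one I care about most, because it is how you bring SES into your own project. I think Oracle SES does Web Services very well and very flexibly: you can write the XML yourself to talk to SES, or you can use the Java Proxy Libraries it provides, which makes SES usable even for people who are not familiar with Web Services. And if you do go through the Java Proxy Libraries, SES offers a friendly entry point: fill in the call parameters, submit, and you can view the complete XML source. Here is a piece of my code: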
public OracleSearchResult getSearchResult(String searchTerm, String locale, DataGroup[] groups) {
    OracleSearchService oss = new OracleSearchService();
    // SOAP endpoint of my SES instance; replace the host and port with your own
    oss.setSoapURL("http://shane-cfca9ec81:7777/search/query/OracleSearch");
    try {
        // searchResult and the default_* values are fields of the enclosing class
        searchResult = oss.doOracleSearch(searchTerm, default_startIndex, default_docsRequested,
                default_dupRemoved, default_dupMarked, groups, locale, default_docLang,
                default_returnCount, default_filterConnector, default_filters,
                default_fetchAttributes);
    } catch (Exception e) {
        // Log the exception itself (the original called log.equals(...), which logged nothing)
        log.error("Can't get the search results from the SES server: " + e.getMessage(), e);
    }
    return searchResult;
}

Iterating over searchResult then gives you all of the search results. So cool!
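One caveat on "all of the results": judging by the default_startIndex and default_docsRequested arguments, a single call returns one page of hits, so getting everything presumably means calling again with a larger start index. Below is a minimal paging sketch under that assumption. PageFetcher is a hypothetical stand-in for a wrapper around the search call (it is not part of the SES libraries), and the 1-based start index is my assumption too.

```java
import java.util.ArrayList;
import java.util.List;

public class PagingSketch {

    /**
     * Hypothetical wrapper around one search call: returns up to docsRequested
     * hits, starting at startIndex (assumed 1-based). A real implementation
     * would call the SES proxy here and handle its exceptions.
     */
    interface PageFetcher {
        List<String> fetch(int startIndex, int docsRequested);
    }

    /** Fetches page after page until a short page arrives or maxResults is reached. */
    static List<String> fetchAll(PageFetcher fetcher, int pageSize, int maxResults) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<String> all = new ArrayList<>();
        int startIndex = 1;
        while (all.size() < maxResults) {
            int wanted = Math.min(pageSize, maxResults - all.size());
            List<String> page = fetcher.fetch(startIndex, wanted);
            all.addAll(page);
            if (page.size() < wanted) {
                break; // a short (or empty) page means there is nothing left
            }
            startIndex += page.size();
        }
        return all;
    }

    public static void main(String[] args) {
        // Fake in-memory "index" standing in for the search server.
        List<String> corpus = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            corpus.add("doc-" + i);
        }
        PageFetcher fake = (start, n) -> {
            int from = Math.min(start - 1, corpus.size());
            int to = Math.min(from + n, corpus.size());
            return corpus.subList(from, to);
        };
        System.out.println(fetchAll(fake, 10, 100).size());
    }
}
```

The maxResults cap is there on purpose: a broad query against a large index can match far more documents than anyone wants to pull over SOAP in one go.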
That's it for today's summary.