Chapter 1. Getting started

某先生xxxx

1.3. Configuration

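First, set up the development environment.

This section covers only the configuration related to Hibernate Search; for Spring and JPA configuration, see the relevant articles.

Configuring Hibernate Search is fairly simple and involves three steps (this article is based on JPA):

1. Configure Hibernate Search by editing the JPA configuration file, persistence.xml: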

<?xml version="1.0" encoding="utf-8"?>
<persistence ...>
  <persistence-unit ...>
    <properties>
      ...
      <!-- hibernate search -->
      <property name="hibernate.search.default.directory_provider" value="filesystem"/>
      <property name="hibernate.search.default.indexBase" value="path/to/index/storage/location"/>
    </properties>
  </persistence-unit>
</persistence>

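The hibernate.search.default.directory_provider property tells Hibernate Search which DirectoryProvider implementation to use. Apache Lucene has the notion of a Directory, which stores the index files. Hibernate Search initializes and configures a Lucene Directory instance through a DirectoryProvider. In this example we use a DirectoryProvider that stores the index files in the file system. With the index on the file system, you can inspect the Lucene index files at any time with the Luke tool. Besides the DirectoryProvider, you also need to tell Hibernate Search which directory the index files are stored in; this is configured with the hibernate.search.default.indexBase property.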
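As a rough illustration of how the two properties fit together, the following Java sketch builds the same settings as a map and derives the directory in which the index of one entity would be expected. It assumes that the index name defaults to the fully qualified class name of the entity and that each index lives in a subdirectory of indexBase with that name. The class IndexLocationSketch, its methods, and the sample path are hypothetical and not part of Hibernate Search.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

final class IndexLocationSketch {

    // Builds the same two settings that persistence.xml declares above.
    static Map<String, String> searchProperties(String indexBase) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("hibernate.search.default.directory_provider", "filesystem");
        props.put("hibernate.search.default.indexBase", indexBase);
        return props;
    }

    // Assumption: the index of an entity is stored under
    // <indexBase>/<fully qualified entity class name>.
    static Path indexDirectoryFor(Map<String, String> props, String entityClassName) {
        String indexBase = props.get("hibernate.search.default.indexBase");
        return Paths.get(indexBase, entityClassName);
    }

    public static void main(String[] args) {
        // "/var/lucene/indexes" is a made-up sample location.
        Map<String, String> props = searchProperties("/var/lucene/indexes");
        System.out.println(indexDirectoryFor(props, "example.Book"));
    }
}
```

Under these assumptions, the printed path is the directory you would open in Luke to inspect the index of the Book entity.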

2. Add the required jar dependencies

hibernate-search-4.1.1.Final\dist:
hibernate-search-engine-4.1.1.Final.jar
hibernate-search-orm-4.1.1.Final.jar

hibernate-search-4.1.1.Final\dist\lib\required:
antlr-2.7.7.jar
avro-1.5.1.jar
dom4j-1.6.1.jar
hibernate-commons-annotations-4.0.1.Final.jar
hibernate-core-4.1.3.Final.jar
jackson-core-asl-1.9.2.jar
jackson-mapper-asl-1.9.2.jar
javassist-3.15.0-GA.jar
jboss-logging-3.1.0.GA.jar
lucene-core-3.5.0.jar
paranamer-2.3.jar
slf4j-api-1.6.1.jar
snappy-java-1.0.1-rc3.jar

3. Add the Hibernate Search annotations to your entities

With these three steps, Hibernate Search is fully configured. The entity annotations are explained next.

package example;
...
@Entity
@Indexed
public class Book {

  @Id @GeneratedValue
  private Integer id;

  @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
  private String title;

  @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
  private String subtitle;

  @Field(index = Index.YES, analyze=Analyze.NO, store = Store.YES)
  @DateBridge(resolution = Resolution.DAY)
  private Date publicationDate;

  @IndexedEmbedded
  @ManyToMany
  private Set<Author> authors = new HashSet<Author>();

  public Book() { }

  // standard getters/setters follow here
  ...
}

package example;
...
@Entity
public class Author {

  @Id @GeneratedValue
  private Integer id;

  @Field
  private String name;

  public Author() { }

  // standard getters/setters follow here
  ...
}

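The code above defines two entity classes, Book and Author. The annotations work as follows:

@Indexed marks an entity that should be indexed.
If the entity has an @Id annotation, you do not need to declare @DocumentId explicitly; Hibernate Search uses the entity identifier as the unique document id in the index.
@Field marks a property as searchable. The default parameters of @Field are index=Index.YES, analyze=Analyze.YES and store=Store.NO, meaning the property is indexed and analyzed, but its value is not stored in the index. Note: neither store=Store.NO nor store=Store.YES affects the ability to search. Store.YES makes it possible to read the complete field value straight from the index after a search. In Hibernate Search, with store=Store.NO the values in the search results are loaded from the database; with store=Store.YES the value can be read directly from the index document (through a projection).
Lucene indexes are based on strings (numeric indexing has been supported since Lucene 2.9), so Hibernate Search uses field bridges to convert data types; @DateBridge, for example, converts date values.
A Lucene index is a flat data structure and cannot represent object associations such as @ManyToMany, @*ToOne, @Embedded and @ElementCollection. To make the authors of the book above searchable, Hibernate Search provides the @IndexedEmbedded annotation, which indexes the associated entities as part of the owning entity.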
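The effect of the bridge and of @IndexedEmbedded can be pictured as flattening a Book into a single document whose fields all hold strings. The following Java sketch is only a model of that idea, not Hibernate Search code. It assumes that day resolution corresponds to a yyyyMMdd string in UTC and that embedded author names end up in a field called authors.name (the name used in the search query later in this chapter). The class FlatDocumentSketch, its methods, and the sample values are hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TimeZone;

final class FlatDocumentSketch {

    // Models a date bridge at day resolution: the date becomes a
    // sortable string such as 20130115 (assumed format, in UTC).
    static String dayResolution(Date date) {
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd", Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(date);
    }

    // Flattens a book and its associated authors into one map of
    // field name -> string values, with no nested objects left.
    static Map<String, List<String>> toDocument(String title, String subtitle,
            Date publicationDate, List<String> authorNames) {
        Map<String, List<String>> document = new LinkedHashMap<>();
        document.put("title", Collections.singletonList(title));
        document.put("subtitle", Collections.singletonList(subtitle));
        document.put("publicationDate",
                Collections.singletonList(dayResolution(publicationDate)));
        // The association is represented by a dotted field name.
        document.put("authors.name", new ArrayList<>(authorNames));
        return document;
    }

    public static void main(String[] args) {
        // Sample data invented for this sketch.
        System.out.println(toDocument("Java in Action", "A guide",
                new Date(), Arrays.asList("Jane Doe", "John Roe")));
    }
}
```

Because the author names are copied into the document of the book, a query on authors.name can find a book without any join at search time.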

1.4. Indexing

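Hibernate Search transparently indexes every annotated entity: whenever an entity is persisted, updated or removed through Hibernate Core, the index is updated automatically.

If the database already contains data, however, you must build the initial index manually, as follows (see also Section 6.3, “Rebuilding the whole index”):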

 
// Build the initial index using the Hibernate Session
FullTextSession fullTextSession = Search.getFullTextSession(session);
fullTextSession.createIndexer().startAndWait();

// Build the initial index using JPA
EntityManager em = entityManagerFactory.createEntityManager();
FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(em);
fullTextEntityManager.createIndexer().startAndWait();

After this, the newly created index appears in the index directory. Inspecting the index with Luke is a good way to understand how Lucene works.

1.5. Searching

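There are two ways to create a search query: use the Lucene API directly, or use the Hibernate Search query DSL. Either way the result is a Lucene query, which is then wrapped into an org.hibernate.Query so that you get the full functionality of the Hibernate API. Using the Lucene API directly is not recommended.

Creating and running a search through the Hibernate Session, using the Hibernate Search query DSL:
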
FullTextSession fullTextSession = Search.getFullTextSession(session);
Transaction tx = fullTextSession.beginTransaction();
// create native Lucene query using the query DSL
// alternatively you can write the Lucene query using the Lucene query parser
// or the Lucene programmatic API. The Hibernate Search DSL is recommended though
QueryBuilder qb = fullTextSession.getSearchFactory()
    .buildQueryBuilder().forEntity( Book.class ).get();
org.apache.lucene.search.Query query = qb
    .keyword()
    .onFields("title", "subtitle", "authors.name", "publicationDate")
    .matching("Java rocks!")
    .createQuery();
// wrap Lucene query in a org.hibernate.Query
org.hibernate.Query hibQuery =
    fullTextSession.createFullTextQuery(query, Book.class);
// execute search
List result = hibQuery.list();

tx.commit();
session.close();

Creating and running a search through JPA:

EntityManager em = entityManagerFactory.createEntityManager();
FullTextEntityManager fullTextEntityManager =
    org.hibernate.search.jpa.Search.getFullTextEntityManager(em);
em.getTransaction().begin();
// create native Lucene query using the query DSL
// alternatively you can write the Lucene query using the Lucene query parser
// or the Lucene programmatic API. The Hibernate Search DSL is recommended though
QueryBuilder qb = fullTextEntityManager.getSearchFactory()
    .buildQueryBuilder().forEntity( Book.class ).get();
org.apache.lucene.search.Query query = qb
    .keyword()
    .onFields("title", "subtitle", "authors.name", "publicationDate")
    .matching("Java rocks!")
    .createQuery();
// wrap Lucene query in a javax.persistence.Query
javax.persistence.Query persistenceQuery =
    fullTextEntityManager.createFullTextQuery(query, Book.class);
// execute search
List result = persistenceQuery.getResultList();
em.getTransaction().commit();
em.close();
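To give a feeling for what a keyword query over several fields does, here is a toy model in plain Java. It is not the Hibernate Search or Lucene implementation. It assumes that a document matches when at least one analyzed query term occurs in at least one of the listed fields, and it uses a deliberately crude analysis (lowercasing and splitting on non-alphanumeric characters). The class KeywordQuerySketch, its methods, and the sample documents are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class KeywordQuerySketch {

    // Crude stand-in for an analyzer: lowercase, then split on
    // every character that is not a letter or a digit.
    static List<String> terms(String text) {
        List<String> result = new ArrayList<>();
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                result.add(term);
            }
        }
        return result;
    }

    // Assumed semantics: true if any query term occurs in any listed field.
    static boolean matches(Map<String, String> document, List<String> fields, String queryText) {
        List<String> queryTerms = terms(queryText);
        for (String field : fields) {
            String value = document.get(field);
            if (value == null) {
                continue; // this document has no such field
            }
            List<String> fieldTerms = terms(value);
            for (String queryTerm : queryTerms) {
                if (fieldTerms.contains(queryTerm)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Sample flat document invented for this sketch.
        Map<String, String> book = new LinkedHashMap<>();
        book.put("title", "Java in Action");
        book.put("authors.name", "Jane Doe");
        List<String> fields = Arrays.asList("title", "subtitle", "authors.name");
        System.out.println(matches(book, fields, "Java rocks!"));
    }
}
```

In the real API the matching and relevance ranking are done by Lucene; the sketch only shows why the sample query text can find a book through its title or through the name of one of its authors.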
 
     

1.6. Analyzer

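Suppose a Book entity has the title "Refactoring: Improving the Design of Existing Code", and you want searches for "refactor", "refactors", "refactored" and "refactoring" to find it. In Lucene, you do this by choosing an analyzer class that applies word stemming during indexing and searching. Hibernate Search offers several ways to configure the analyzer (see Section 4.3.1, “Default analyzer and analyzer by class”):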

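Set the hibernate.search.analyzer property in the configuration file to define the default analyzer.
Set the @Analyzer annotation at the entity level.
Set the @Analyzer annotation at the field level.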

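You can also define a custom analyzer with @AnalyzerDef. The following example combines it with the analyzers and factories of the Solr framework. For the other available factory classes, see the Solr documentation and the Solr Wiki.

The example below uses a StandardTokenizerFactory followed by two filter factories, LowerCaseFilterFactory and SnowballPorterFilterFactory. The standard tokenizer splits words at punctuation characters and hyphens while keeping email addresses and internet hostnames intact. The lowercase filter converts all letters to lowercase, and finally the snowball filter applies stemming.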

Generally, when using the Solr framework, you declare a tokenizer first, followed by any number of filters.
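The tokenizer-then-filters pipeline can be sketched in plain Java to show why all four search words from the beginning of this section find the same title. This is a toy model, not the Solr or Lucene implementation: the tokenizer simply splits on non-alphanumeric characters (so, unlike the standard tokenizer, it does not keep email addresses or hostnames intact), and the stemmer only strips a few English suffixes instead of running the Snowball algorithm. The class AnalyzerChainSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class AnalyzerChainSketch {

    // Step 1, tokenizer: split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Step 3, toy stemmer: strip one common English suffix, but keep
    // at least three characters. A real Snowball stemmer is far more careful.
    static String stem(String token) {
        for (String suffix : new String[] {"ing", "ed", "s"}) {
            if (token.endsWith(suffix) && token.length() - suffix.length() >= 3) {
                return token.substring(0, token.length() - suffix.length());
            }
        }
        return token;
    }

    // Full chain: tokenizer, then lowercase filter (step 2), then stemmer.
    static List<String> analyze(String text) {
        List<String> result = new ArrayList<>();
        for (String token : tokenize(text)) {
            result.add(stem(token.toLowerCase(Locale.ROOT)));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Refactoring: Improving the Design of Existing Code"));
        System.out.println(analyze("refactor refactors refactored refactoring"));
    }
}
```

Since the same chain is applied to the indexed title and to the query text, "refactor", "refactors", "refactored" and "refactoring" all reduce to the term that was stored for "Refactoring" in the title.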

Example 1.11. Using @AnalyzerDef and the Solr framework to define and use an analyzer

@Entity
@Indexed
@AnalyzerDef(name = "customanalyzer",
  tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
  filters = {
    @TokenFilterDef(factory = LowerCaseFilterFactory.class),
    @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = {
      @Parameter(name = "language", value = "English")
    })
  })
public class Book {

  @Id @GeneratedValue @DocumentId
  private Integer id;

  @Field @Analyzer(definition = "customanalyzer")
  private String title;

  @Field @Analyzer(definition = "customanalyzer")
  private String subtitle;

  @IndexedEmbedded @ManyToMany
  private Set<Author> authors = new HashSet<Author>();

  @Field(index = Index.YES, analyze = Analyze.NO, store = Store.YES)
  @DateBridge(resolution = Resolution.DAY)
  private Date publicationDate;

  public Book() { }

  // standard getters/setters follow here
  ...
}