Lucene
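Lucene is a powerful full-text search library. Hibernate Search combines Lucene core with JPA/Hibernate ORM: whenever an entity is added or modified through JPA, it is indexed in Lucene automatically, and searches run against the Lucene core search engine and return JPA entity objects.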
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-search-orm</artifactId>
<version>5.9.1.Final</version>
</dependency>
Configuring Hibernate Search
In the context configuration:
@Bean
public LocalContainerEntityManagerFactoryBean entityManagerFactory() throws PropertyVetoException{
Map<String, Object> properties = new Hashtable<>();
properties.put("javax.persistence.schema-generation.database.action","none");
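/* Enable Hibernate Search for Hibernate ORM, using standalone Lucene (no Solr) with the index stored on the local file system.
* Whenever Hibernate ORM in this WAR writes to the database, Hibernate Search indexes the affected content and writes it to these files.
* This setup does not suit a Tomcat cluster, because each node would write its own local index; a cluster needs a centralized search server such as Solr. */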
properties.put("hibernate.search.default.directory_provider", "filesystem");
/* This example uses ../searchIndexes, which in the development environment is a first-level directory under Eclipse */
properties.put("hibernate.search.default.indexBase", "../searchIndexes");
properties.put("hibernate.show_sql", "true");
properties.put("hibernate.dialect", "org.hibernate.dialect.MySQL5InnoDBDialect");
LocalContainerEntityManagerFactoryBean factory = new LocalContainerEntityManagerFactoryBean();
factory.setJpaVendorAdapter(new HibernateJpaVendorAdapter());
factory.setDataSource(this.springJpaDataSource());
factory.setPackagesToScan("cn.wei.flowingflying.chapter23.entities");
factory.setSharedCacheMode(SharedCacheMode.ENABLE_SELECTIVE);
factory.setValidationMode(ValidationMode.NONE);
factory.setJpaPropertyMap(properties);
return factory;
}
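The indexBase above is a relative path, so where the index files end up depends on the directory the JVM was started from. The following is a minimal sketch in plain Java, assuming the path is resolved against the JVM working directory like any other relative file path; the class IndexBaseDemo and its resolve method are hypothetical helpers, not part of Hibernate Search.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only.
class IndexBaseDemo {
    // Resolves a relative indexBase against the current working directory
    // and removes "." and ".." segments.
    static Path resolve(String indexBase) {
        return Paths.get(indexBase).toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        System.out.println("working directory: " + Paths.get("").toAbsolutePath());
        System.out.println("index directory:   " + resolve("../searchIndexes"));
    }
}
```

Running main from the application server's start directory is a quick way to check where ../searchIndexes points before the first index write.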
The example's entity data and goal
This is the database table UserPrincipal_23, mapped to the entity User:
mysql> select * from UserPrincipal_23;
+--------+----------+
| UserId | Username |
+--------+----------+
| 4 | John |
| 3 | Mike |
| 1 | Nicholas |
| 2 | Sarah |
+--------+----------+
The table Post_23, mapped to the entity ForumPost:
mysql> select * from Post_23;
+--------+--------+----------------+--------------------------------------+------------+
| PostId | UserId | Title | Body | Keywords |
+--------+--------+----------------+--------------------------------------+------------+
| 1 | 3 | Title One | Test One. Hello world! Java! | one java |
| 2 | 1 | Title Two | Hello, my friend! This is title two. | two friend |
| 3 | 1 | Hello Nicholas | My name is Nicholas! Hi, Nicholas | Nicholas |
+--------+--------+----------------+--------------------------------------+------------+
mysql> select Post_23.*,UserPrincipal_23.Username from Post_23 left join UserPrincipal_23 on Post_23.UserId = UserPrincipal_23.UserId;
+--------+--------+----------------+--------------------------------------+------------+----------+
| PostId | UserId | Title | Body | Keywords | Username |
+--------+--------+----------------+--------------------------------------+------------+----------+
| 1 | 3 | Title One | Test One. Hello world! Java! | one java | Mike |
| 2 | 1 | Title Two | Hello, my friend! This is title two. | two friend | Nicholas |
| 3 | 1 | Hello Nicholas | My name is Nicholas! Hi, Nicholas | Nicholas | Nicholas |
+--------+--------+----------------+--------------------------------------+------------+----------+
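The example searches title, body, keywords, and username.
The search spans a join of the two tables. On the ORM side the join is mapped with @ManyToOne, which is covered later; on the Lucene side it is declared with Hibernate Search annotations.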
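To make the goal concrete, here is a toy in-memory sketch of the idea: each post becomes one flat document whose fields include the author's name under user.username, and a term matches if it appears in any field. This is plain Java, not Lucene or Hibernate Search; the class FlattenedSearchDemo, its methods, and the crude tokenizer are made up for illustration, and no ranking is attempted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical toy model, for illustration only.
class FlattenedSearchDemo {
    // One "document" per post: field name -> text. The author's name is copied in
    // under "user.username", mimicking what an embedded association contributes.
    static Map<String, String> doc(String title, String body, String keywords, String username) {
        Map<String, String> d = new LinkedHashMap<>();
        d.put("title", title);
        d.put("body", body);
        d.put("keywords", keywords);
        d.put("user.username", username);
        return d;
    }

    // The three rows of the joined query shown above.
    static List<Map<String, String>> sampleDocs() {
        return Arrays.asList(
            doc("Title One", "Test One. Hello world! Java!", "one java", "Mike"),
            doc("Title Two", "Hello, my friend! This is title two.", "two friend", "Nicholas"),
            doc("Hello Nicholas", "My name is Nicholas! Hi, Nicholas", "Nicholas", "Nicholas"));
    }

    // Returns the titles of all documents in which any field contains the term
    // as a whole word, ignoring case.
    static List<String> search(List<Map<String, String>> docs, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map<String, String> d : docs) {
            boolean match = false;
            for (String text : d.values()) {
                // Split on anything that is not a letter or digit: a crude stand-in for an analyzer.
                for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
                    if (token.equals(needle)) {
                        match = true;
                    }
                }
            }
            if (match) {
                hits.add(d.get("title"));
            }
        }
        return hits;
    }
}
```

Searching for nicholas returns Title Two as well as Hello Nicholas: post 2 never mentions the name in its own columns and matches only through the joined username.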
The indexed entity
@Entity
@Table(name="UserPrincipal_23")
public class User {
private long id;
private String username;
@Basic
@Field // marks this property as an indexable (searchable) field in Lucene
public String getUsername() {
return username;
}
... ...
}
The main entity
@Entity
@Table(name="Post_23")
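/*【1】@Indexed: marks this class for full-text search; Hibernate Search automatically creates or updates a Lucene document for each instance.
* The document ID is marked with @DocumentId; if it is omitted, the entity's @Id is used. */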
@Indexed
public class ForumPost {
private long id;
private User user; // the tables are linked via the foreign key UserId
private String title;
private String body;
private String keywords;
@Id
@Column(name="PostId")
@GeneratedValue(strategy=GenerationType.IDENTITY)
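/*【2】Document ID: Hibernate Search creates and updates a document for this entity automatically, and @DocumentId marks the document ID.
* Here it is placed on the @Id property as the unique identifier; if it is omitted, the @Id is used automatically. */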
@DocumentId
public long getId() { ... }
@ManyToOne(fetch=FetchType.EAGER,optional=false)
@JoinColumn(name="UserId")
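/*【4】Index into the root entity (here, User is indexed into this entity's document): tells Hibernate Search that this property is an association to another entity.
* The associated object's indexed properties are indexed as well, here User's @Field String username. This works somewhat like a cascade setting. */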
@IndexedEmbedded
public User getUser() { ... }
@Basic
@Field //【3】this property is full-text searchable
public String getTitle() { ... }
@Lob
@Field //【3】this property is full-text searchable
public String getBody() { ... }
@SuppressWarnings("deprecation")
@Basic
/* Deprecated. Index-time boosting will not be possible anymore starting from Lucene 7.
* You should use query-time boosting instead, for instance by calling boostedTo(float)
* when building queries with the Hibernate Search query DSL.
* @Boost: relevance weighting */
@Field(boost = @Boost(2.0F))
public String getKeywords() { ... }
... ...
}
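The @Boost(2.0F) on keywords means a hit in that field should count for more than a hit elsewhere. As a rough illustration of what a per-field boost does, the toy sketch below scores a document by counting term occurrences per field and multiplying each count by the field's weight. It is plain Java with invented names (FieldBoostDemo, count, score, postOne); Lucene's real scoring formula is more involved, so treat the numbers only as a picture of the weighting idea.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical toy scorer, for illustration only; this is not Lucene's scoring.
class FieldBoostDemo {
    // Counts whole-word, case-insensitive occurrences of term in text.
    static int count(String text, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        int n = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.equals(needle)) {
                n++;
            }
        }
        return n;
    }

    // Toy score: occurrences in each field times that field's boost (1.0 if none is given).
    static double score(Map<String, String> doc, Map<String, Double> boosts, String term) {
        double total = 0.0;
        for (Map.Entry<String, String> field : doc.entrySet()) {
            double boost = boosts.getOrDefault(field.getKey(), 1.0);
            total += count(field.getValue(), term) * boost;
        }
        return total;
    }

    // Post 1 from the sample data.
    static Map<String, String> postOne() {
        Map<String, String> d = new LinkedHashMap<>();
        d.put("title", "Title One");
        d.put("body", "Test One. Hello world! Java!");
        d.put("keywords", "one java");
        return d;
    }
}
```

For the term java, post 1 has one hit in body and one in keywords; weighting keywords by 2.0 raises the toy score from 2.0 to 3.0.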
Search-related code
As before, we provide SearchResult to hold an entity together with its relevance score.
public class SearchResult<T> {
private final T entity;
private final double relevance;
......
}
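The class body is elided above. One plausible shape, given that the search code later constructs it from an entity and a Float score, is sketched below. The two-argument constructor follows from that call; the getter names getEntity and getRelevance are assumptions, and the class is package-private here only so the sketch compiles in any file.

```java
// A sketch of SearchResult under the assumptions stated above.
class SearchResult<T> {
    private final T entity;
    private final double relevance;

    // Accepts a double; a Float argument is unboxed and widened automatically.
    SearchResult(T entity, double relevance) {
        this.entity = entity;
        this.relevance = relevance;
    }

    // Assumed accessor name.
    T getEntity() {
        return entity;
    }

    // Assumed accessor name.
    double getRelevance() {
        return relevance;
    }
}
```

Keeping both fields final makes each result an immutable pair, which suits a value that is created once per search hit and then only read.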
The repository interfaces
public interface SearchableRepository<T>{
Page<SearchResult<T>> search(String query, Pageable pageable);
}
public interface ForumPostRepository extends JpaRepository<ForumPost, Long>,SearchableRepository<ForumPost>{
}
Hibernate Search works with Lucene documents through an API that resembles the JPA API; the Lucene API can also be used directly. Here is the code:
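public class ForumPostRepositoryImpl implements SearchableRepository<ForumPost>{
//【1】Obtain Hibernate Search's full-text entity manager, the counterpart of JPA's EntityManager; all Lucene full-text search methods
// go through this FullTextEntityManager. Note that the entityManagerFactory configured in the root context already includes the Hibernate Search settings.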
@PersistenceContext EntityManager entityManager;
EntityManagerProxy entityManagerProxy;
// 1.1) The @PersistenceContext EntityManager that Spring injects is actually an EntityManager proxy
// (it delegates to a new EntityManager for each transaction); initialize() captures that proxy.
@PostConstruct
public void initialize(){
if(!(this.entityManager instanceof EntityManagerProxy))
throw new FatalBeanException("Entity manager " + this.entityManager + " was not a proxy.");
this.entityManagerProxy = (EntityManagerProxy) entityManager;
}
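// 1.2) FullTextEntityManager is a real object, not a proxy, so we must create a new one for every search
// (nothing does this behind the scenes the way Spring's injected EntityManager proxy does). A FullTextEntityManager
// must also be obtained from a real Hibernate ORM EntityManager implementation, not from the proxy.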
private FullTextEntityManager getFullTextEntityManager(){
return Search.getFullTextEntityManager(this.entityManagerProxy.getTargetEntityManager());
}
//【2】The Hibernate Search full-text search implementation.
@Override
public Page<SearchResult<ForumPost>> search(String query, Pageable pageable) {
// 2.1) Obtain the FullTextEntityManager within the transaction
FullTextEntityManager manager = getFullTextEntityManager();
// 2.2) Run the search. The Hibernate Search API resembles the JPA API, since both belong to the Hibernate family.
QueryBuilder builder = manager.getSearchFactory().buildQueryBuilder().forEntity(ForumPost.class).get();
Query lucene = builder.keyword()
.onFields("title", "body", "keywords", "user.username") // the fields to search; note user.username
.matching(query) // the text to search for
.createQuery();
FullTextQuery q = manager.createFullTextQuery(lucene, ForumPost.class);
q.setProjection(FullTextQuery.THIS,FullTextQuery.SCORE); // return the ForumPost and its relevance score
// 2.3) Get the number of matching results
long total = q.getResultSize();
// 2.4) Fetch the actual results
@SuppressWarnings("unchecked")
List<Object[]> results = q.setFirstResult((int) pageable.getOffset()) // cast needed where getOffset() returns long (Spring Data 2.x)
.setMaxResults(pageable.getPageSize())
.getResultList();
List<SearchResult<ForumPost>> list = new ArrayList<>();
results.forEach(o -> list.add(
new SearchResult<>((ForumPost)o[0], (Float)o[1])) );
return new PageImpl<>(list, pageable, total);
}
}
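The paging in step 2.4 boils down to skipping the first offset hits and returning at most one page of them, while the total from step 2.3 tells the caller how many pages exist. Here is a minimal sketch of that arithmetic on an in-memory list, in plain Java; the class PagingDemo and its methods are hypothetical, and the sketch assumes the offset equals page number times page size.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class PagingDemo {
    // Skips `offset` items and returns at most `pageSize` items,
    // clamping both ends to the bounds of the list.
    static <T> List<T> page(List<T> ranked, int offset, int pageSize) {
        int from = Math.min(Math.max(offset, 0), ranked.size());
        int to = (int) Math.min((long) from + Math.max(pageSize, 0), ranked.size());
        return new ArrayList<>(ranked.subList(from, to));
    }

    // Number of pages needed for `total` hits, rounding up; pageSize must be positive.
    static int totalPages(long total, int pageSize) {
        return (int) ((total + pageSize - 1) / pageSize);
    }
}
```

With the three sample posts and a page size of 2, page 0 holds two hits, page 1 holds the remaining one, and totalPages reports 2.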