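Elasticsearch CRUD operations in Scala (complete)
As a new hire and a complete beginner, I found while learning Elasticsearch CRUD operations from Scala that the existing articles online all mix Java and Scala. None of them use Scala alone, and they are still a bit hard for a beginner to follow. So I wrote this more detailed article for other beginners to refer to.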
1. Environment dependencies
<dependencies>
<!-- selected excerpt -->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.10</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>transport</artifactId>
<version>6.2.0</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>2.11.12</version><!-- must be a 2.11.x release to match the _2.11 Spark artifacts below -->
</dependency>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch-hadoop</artifactId>
<version>6.2.4</version>
</dependency>
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.70</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.4.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>2.4.1</version>
</dependency>
<dependency>
<groupId>org.json</groupId>
<artifactId>json</artifactId>
<version>20180813</version>
</dependency>
</dependencies>
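Since the goal here is a Scala-only project, the same dependencies can also be declared in sbt instead of Maven. The following is only a sketch: the versions are copied from the Maven excerpt above, and the `scalaVersion` value is my assumption (any 2.11.x release should match the `_2.11` Spark artifacts).

```scala
// build.sbt (sketch; versions copied from the Maven excerpt above)
scalaVersion := "2.11.12" // assumption: a 2.11.x release, to match the _2.11 Spark artifacts

libraryDependencies ++= Seq(
  "org.elasticsearch.client" % "transport"            % "6.2.0",
  "org.elasticsearch"        % "elasticsearch-hadoop" % "6.2.4",
  "com.alibaba"              % "fastjson"             % "1.2.70",
  "org.json"                 % "json"                 % "20180813",
  // %% appends the Scala binary version, giving spark-sql_2.11 and spark-streaming_2.11
  "org.apache.spark"        %% "spark-sql"            % "2.4.1",
  "org.apache.spark"        %% "spark-streaming"      % "2.4.1",
  "junit"                    % "junit"                % "4.10" % Test
)
```

With sbt the `scala-library` dependency does not need to be listed; it follows from `scalaVersion`.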
2. Creating the ES connection
I keep the connection settings in a JSON config file, which looks like this:
{
"host": "192.168.0.184",
"port": "9300",
"clusterName": "elasticsearch",
"index":"library",
"t": "books"
}
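Note that the transport port used by the ES Java client is 9300, while the HTTP port is 9200, so the code must use 9300. clusterName is the name of the cluster (not the host name); it is shown as cluster_name in the JSON you get when you open the ES HTTP port in a browser. index is the index to operate on, and t is the type (roughly, the "table" inside the index).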
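Mixing up the two ports is an easy mistake, so here is a minimal sketch of a sanity check for the values in the config file. `EsConfigCheck`, `EsConfig` and `problems` are hypothetical names I made up for illustration; they are not part of Elasticsearch or fastjson.

```scala
object EsConfigCheck {
  // Hypothetical holder for the fields of elasticSearch.json
  final case class EsConfig(host: String, port: Int, clusterName: String, index: String, t: String)

  // Returns a list of readable problems; an empty list means the config looks plausible
  def problems(c: EsConfig): List[String] = {
    val checks = List(
      (c.host.trim.isEmpty, "host is empty"),
      (c.port == 9200, "port 9200 is the HTTP port; the transport client needs the transport port (9300 by default)"),
      (c.port < 1 || c.port > 65535, s"port ${c.port} is out of range"),
      (c.clusterName.trim.isEmpty, "clusterName is empty"),
      (c.index != c.index.toLowerCase, "index names must be lowercase")
    )
    // Keep only the messages whose condition is true
    checks.collect { case (true, msg) => msg }
  }

  def main(args: Array[String]): Unit = {
    // Same values as the config file above, but with the HTTP port by mistake
    val cfg = EsConfig("192.168.0.184", 9200, "elasticsearch", "library", "books")
    problems(cfg).foreach(println)
  }
}
```

Running a check like this before building the client gives a readable message instead of a connection error from the client.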
The connection code is then:
package elasticSearch
import java.net.InetAddress
import com.alibaba.fastjson.{JSON, JSONObject}
import org.elasticsearch.client.transport.TransportClient
import org.elasticsearch.common.settings.Settings
import org.elasticsearch.common.transport.TransportAddress
import org.elasticsearch.transport.client.PreBuiltTransportClient
class Constants {
  // Name of the config file on the classpath
  val configName = "elasticSearch.json"
  // Parse the JSON file into a JSONObject
  val elastic: JSONObject = JSON.parseObject(ClassLoader.getSystemResourceAsStream(configName), classOf[JSONObject])
  // Read the individual settings
  val host = elastic.getString("host")
  val port = elastic.getInteger("port")
  val clusterName = elastic.getString("clusterName")
  val index = elastic.getString("index")
  val t = elastic.getString("t")
  // Connection settings for ES
  val settings = Settings.builder()
    .put("cluster.name", clusterName)
    .put("client.transport.sniff", true)
    .build()
  val client: TransportClient = new PreBuiltTransportClient(settings)
  val ta = new TransportAddress(InetAddress.getByName(host), port)
  client.addTransportAddresses(ta)
  // Returns the client, for use by the CRUD operations
  def getTransportClient(): TransportClient = client
  // Closes the client
  def close(client: TransportClient): Unit = {
    if (client != null) {
      client.close()
    }
  }
}
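3. Creating an index
This part is much the same as what other bloggers have written. I will put the original author's link below; his article is more complete, so I only give two examples: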
package elasticSearch
import com.alibaba.fastjson.JSONObject
import org.elasticsearch.action.index.IndexResponse
import org.elasticsearch.common.xcontent.{XContentBuilder, XContentType}
import org.elasticsearch.common.xcontent.json.JsonXContent
// Extends Constants to reuse the connection settings
object CreateIndex extends Constants {
  // Get a client
  val cli = getTransportClient()
  def createIndexByJson(): Unit = {
    val json = new JSONObject()
    json.put("name", "我爱中国")
    json.put("author", "周迅")
    json.put("date", "2018-06-06")
    // Send the document to ES
    val response: IndexResponse = cli.prepareIndex(index, t, "9")
      .setSource(json.toString, XContentType.JSON).get()
    // Print the document version
    println(response.getVersion)
  }
  def createIndexByXContentBuilder(): Unit = {
    val builder: XContentBuilder = JsonXContent.contentBuilder()
    builder.startObject()
      .field("name", "西游记")
      .field("author", "吴承恩")
      .field("version", "1.0")
      .endObject()
    // Send the document to ES
    val response: IndexResponse = cli.prepareIndex(index, t, "4").setSource(builder)
      .get()
    // Print the document version
    println(response.getVersion)
  }
  def main(args: Array[String]): Unit = {
    // createIndexByJson()
    // Call the method
    createIndexByXContentBuilder()
    // Close the client
    close(cli)
  }
}
4. CRUD and bulk operations
package elasticSearch
import java.util
import com.alibaba.fastjson.JSONObject
import org.elasticsearch.action.bulk.BulkResponse
import org.elasticsearch.action.delete.DeleteResponse
import org.elasticsearch.action.update.UpdateResponse
import org.elasticsearch.common.xcontent.{XContentBuilder, XContentType}
import org.elasticsearch.common.xcontent.json.JsonXContent
object ElasticsearchCRUD extends Constants {
  // Get a client
  val cli = getTransportClient()
  def main(args: Array[String]): Unit = {
    // Call the method
    Bulk()
    // Close the client
    close(cli)
  }
  // Delete
  def Delete(): Unit = {
    val response: DeleteResponse = cli.prepareDelete(index, t, "2").get()
    println(response.getVersion)
  }
  // Update
  def Update(): Unit = {
    val builder: XContentBuilder = JsonXContent.contentBuilder()
    builder.startObject()
      .field("version", "3.0")
      .endObject()
    val response: UpdateResponse = cli.prepareUpdate(index, t, "2").setDoc(builder).get()
    println(response.getVersion)
  }
  // Bulk operation: index two documents in one request
  def Bulk(): Unit = {
    val map = new util.HashMap[String, String]()
    map.put("name", "无双")
    map.put("author", "周润发")
    map.put("version", "2")
    val json = new JSONObject
    json.put("name", "红楼梦")
    json.put("author", "曹雪芹")
    json.put("version", "1.0")
    val responses: BulkResponse = cli.prepareBulk()
      .add(cli.prepareIndex(index, t, "7").setSource(map))
      .add(cli.prepareIndex(index, t, "8").setSource(json.toString(), XContentType.JSON))
      .get()
    // Print the version of each indexed document
    for (response <- responses.getItems) {
      println(response.getVersion)
    }
  }
}
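To sum up: every ES operation follows the same pattern. Once the connection is set up and you have a client, you build a request with one of the client's prepareXxx methods, send it to ES with get(), and read the outcome from the response that comes back. Seen this way, it all looks a lot simpler.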
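The code above covers create, update, delete and bulk, but not reading a single document by id. A minimal sketch of that lookup follows the same pattern, using `prepareGet` from the transport client. The host, cluster name, index, type and id are assumptions copied from the config file and examples earlier in this article, and `GetById` and `describe` are hypothetical names of my own.

```scala
import java.net.InetAddress
import org.elasticsearch.action.get.GetResponse
import org.elasticsearch.client.transport.TransportClient
import org.elasticsearch.common.settings.Settings
import org.elasticsearch.common.transport.TransportAddress
import org.elasticsearch.transport.client.PreBuiltTransportClient

object GetById {
  // Pure helper: turn the outcome of a lookup into a printable line
  def describe(id: String, source: Option[String]): String =
    source.map(s => s"$id -> $s").getOrElse(s"$id not found")

  def main(args: Array[String]): Unit = {
    // Assumed values, copied from the config file shown earlier
    val settings = Settings.builder().put("cluster.name", "elasticsearch").build()
    val client: TransportClient = new PreBuiltTransportClient(settings)
    client.addTransportAddress(new TransportAddress(InetAddress.getByName("192.168.0.184"), 9300))
    try {
      // Build the get request and send it; the response says whether the document exists
      val response: GetResponse = client.prepareGet("library", "books", "4").get()
      val source = if (response.isExists) Some(response.getSourceAsString) else None
      println(describe(response.getId, source))
    } finally {
      client.close()
    }
  }
}
```

Inside an object that extends Constants, the lookup itself would shrink to `cli.prepareGet(index, t, "4").get()`.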
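5. Full-text search, paged search, and highlighting
This is again based on the code by the author mentioned above; after all, writing code is often copying and then making a few small changes. I won't repeat the author's link here; just follow the link above. My own code is below (if you have followed my steps, you can use it directly):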
package elasticSearch
import java.util
import org.elasticsearch.action.search.{SearchResponse, SearchType}
import org.elasticsearch.index.query.QueryBuilders
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder
import org.elasticsearch.search.{SearchHit, SearchHits}
import org.json.JSONObject
import scala.collection.JavaConversions
object Search extends Constants {
  val cli = getTransportClient()
  def main(args: Array[String]): Unit = {
    // Full-text search
    // fullTextSearch()
    // Paged search
    // pagingSearch()
    // Highlighted search
    highlightSearch()
    // Close the client
    close(cli)
  }
  // Full-text search
  def fullTextSearch(): Unit = {
    val json = new JSONObject()
    val response = cli.prepareSearch(index) // index to search
      .setSearchType(SearchType.DEFAULT) // search type
      .setQuery(QueryBuilders.matchQuery("author", "天蚕土豆")) // query
      .get()
    val hits = response.getHits // search result
    println("totals:" + hits.getTotalHits) // number of matching documents
    println("maxScore:" + hits.getMaxScore) // highest score
    // Print each hit
    val myhits = hits.getHits
    for (hit <- myhits) {
      val index = hit.getIndex
      val id = hit.getId
      val t = hit.getType
      val source = hit.getSourceAsString
      val score = hit.getScore
      json.put("_index", index)
      json.put("_id", id)
      json.put("_type", t)
      json.put("_score", score)
      json.put("_source", new JSONObject(source))
      println(json.toString())
    }
  }
  // Paged search
  // from is the offset of the first hit and size is the page length;
  // to fetch page num with count hits per page, pass from = (num - 1) * count and size = count
  def pagingSearch(from: Int = 0, size: Int = 10): Unit = {
    val response: SearchResponse = cli.prepareSearch(index)
      .setSearchType(SearchType.QUERY_THEN_FETCH)
      .setQuery(QueryBuilders.matchQuery("name", "西游记"))
      .setFrom(from)
      .setSize(size)
      .get()
    val myhits: SearchHits = response.getHits
    val total = myhits.totalHits
    println("zzy found " + total + " record(s):")
    val hits: Array[SearchHit] = myhits.getHits
    for (hit <- hits) {
      val map: util.Map[String, AnyRef] = hit.getSourceAsMap
      val author = map.get("author")
      val name = map.get("name")
      val version = map.get("version")
      print(
        s"""
           |author:${author}
           |name:${name}
           |version:${version}
         """.stripMargin)
    }
  }
  // Highlighted search
  def highlightSearch(): Unit = {
    val response = cli.prepareSearch(index)
      .setSearchType(SearchType.DEFAULT)
      .setQuery(QueryBuilders.matchQuery("author", "周润发"))
      .highlighter(new HighlightBuilder()
        .field("author") // field to highlight
        .preTags("<font color='red' size='20px'>") // tag inserted before each match
        .postTags("</font>")) // tag inserted after each match
      .get()
    val myHits = response.getHits
    val total = myHits.totalHits
    println("zzy found " + total + " record(s):")
    val hits: Array[SearchHit] = myHits.getHits
    for (hit <- hits) {
      // Highlighted text is not in the source; it must be read from the highlight fields
      val HLfields = hit.getHighlightFields
      // field is the name of the highlighted field (author);
      // highlight holds that field's fragments, including the highlight tags
      for ((field, highlight) <- JavaConversions.mapAsScalaMap(HLfields)) {
        // Concatenate the fragments of this field
        val data = highlight.getFragments.mkString
        println(data)
      }
    }
  }
}
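pagingSearch takes a raw offset (from) rather than a page number. The conversion from a 1-based page number is simple arithmetic, sketched below; `Paging` and `offset` are hypothetical names of my own, not part of the ES client.

```scala
object Paging {
  // Offset of the first hit on a 1-based page with pageSize hits per page
  def offset(page: Int, pageSize: Int): Int = {
    require(page >= 1, "page numbers start at 1")
    require(pageSize >= 1, "pageSize must be positive")
    (page - 1) * pageSize
  }

  def main(args: Array[String]): Unit = {
    // Page 1 starts at hit 0, page 3 of 10-per-page starts at hit 20
    println(offset(1, 10)) // 0
    println(offset(3, 10)) // 20
  }
}
```

To fetch page 3 with 10 hits per page, the call would be `pagingSearch(from = Paging.offset(3, 10), size = 10)`. Keep in mind that ES limits from + size by the index.max_result_window setting (10000 by default), so this style of paging is only suitable for the first pages of a result.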