Elasticsearch CRUD Operations in Scala (Complete)

Angeline Wong

Tags: elasticsearch, Scala 2.11 API.chm

Article link: https://blog.csdn.net/weixin_48180570/article/details/107339531

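As a new hire and a complete beginner, I was learning how to do Elasticsearch CRUD operations in Scala and found that the existing articles online all mix Java and Scala; none of them use Scala alone. They are also fairly hard for a beginner to follow, so I wrote this more detailed article for other beginners in the same position.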

1. Environment dependencies

    <dependencies><!-- selected excerpt -->
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.10</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>transport</artifactId>
            <version>6.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <!-- must match the _2.11 suffix of the Spark artifacts below -->
            <version>2.11.12</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch-hadoop</artifactId>
            <version>6.2.4</version>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.70</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>2.4.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.11</artifactId>
            <version>2.4.1</version>
        </dependency>
        <dependency>
            <groupId>org.json</groupId>
            <artifactId>json</artifactId>
            <version>20180813</version>
        </dependency>
    </dependencies>

2. Creating the Elasticsearch connection

I keep the connection settings in a JSON configuration file, which looks like this:

{
  "host": "192.168.0.184",
  "port": "9300",
  "clusterName": "elasticsearch",
  "index":"library",
  "t": "books"
}

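Note that Elasticsearch's transport port (the one the Java client uses) is 9300, while the HTTP port is 9200, so the code must use 9300. clusterName is the name of the cluster, not the host name; you can find it in the cluster_name field of the response when you open the HTTP port (9200) in a browser. index is the index to operate on, and t is the type (roughly, the table).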
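
To make the port distinction concrete, here is a minimal sketch of a sanity check for the settings above. `EsConfig` and `EsConfigCheck` are hypothetical names that do not appear in the original code, and the check assumes the cluster uses the default ports (9200 for HTTP, 9300 for transport); a cluster configured differently would need different rules.

```scala
// Hypothetical helper, not part of the original article: models the five
// settings from elasticSearch.json and rejects the common 9200/9300 mix-up.
final case class EsConfig(host: String, port: Int, clusterName: String, index: String, t: String)

object EsConfigCheck {
  val HttpPort = 9200      // default HTTP (REST) port
  val TransportPort = 9300 // default transport port used by TransportClient

  // Returns Right(cfg) when the settings look usable, Left(reason) otherwise.
  def validate(cfg: EsConfig): Either[String, EsConfig] =
    if (cfg.host.trim.isEmpty) Left("host must not be empty")
    else if (cfg.port == HttpPort)
      Left(s"port $HttpPort is the HTTP port; the transport client needs $TransportPort")
    else if (cfg.port < 1 || cfg.port > 65535) Left(s"port ${cfg.port} is out of range")
    else Right(cfg)

  def main(args: Array[String]): Unit = {
    val cfg = EsConfig("192.168.0.184", 9300, "elasticsearch", "library", "books")
    println(validate(cfg))                   // accepted
    println(validate(cfg.copy(port = 9200))) // rejected with an explanation
  }
}
```

Running a check like this before building the client turns a wrong port into a readable message instead of a connection failure later on.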

The connection code is as follows:

package elasticSearch

import java.net.InetAddress
import com.alibaba.fastjson.{JSON, JSONObject}
import org.elasticsearch.client.transport.TransportClient
import org.elasticsearch.common.settings.Settings
import org.elasticsearch.common.transport.TransportAddress
import org.elasticsearch.transport.client.PreBuiltTransportClient

class Constants {
  // Name of the configuration file on the classpath
  val configName = "elasticSearch.json"
  // Parse the JSON file into a JSONObject
  val elastic: JSONObject = JSON.parseObject(ClassLoader.getSystemResourceAsStream(configName), classOf[JSONObject])
  // Read the individual settings
  val host = elastic.getString("host")
  val port = elastic.getInteger("port")
  val clusterName = elastic.getString("clusterName")
  val index = elastic.getString("index")
  val t = elastic.getString("t")

  // Elasticsearch connection settings
  val settings = Settings.builder()
    .put("cluster.name", clusterName)
    .put("client.transport.sniff", true)
    .build()
  val client: TransportClient = new PreBuiltTransportClient(settings)
  val ta = new TransportAddress(InetAddress.getByName(host), port)
  client.addTransportAddresses(ta)
  // Returns the shared client used by the CRUD operations
  def getTransportClient(): TransportClient = client
  // Closes the client
  def close(client: TransportClient): Unit = {
    if (client != null) {
      client.close()
    }
  }
}
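
The examples in the following sections call close(cli) by hand at the end of main, which is skipped if an operation throws. One possible way to guard against that is a small "loan" helper, sketched below. `Loan` and `withResource` are hypothetical names, not part of the original code, and the demo uses a stand-in resource so that it runs without a cluster.

```scala
// Hypothetical helper: lends a resource to a block and always closes it,
// even when the block throws.
object Loan {
  def withResource[R <: AutoCloseable, A](resource: R)(body: R => A): A =
    try body(resource)
    finally if (resource != null) resource.close()

  def main(args: Array[String]): Unit = {
    // Stand-in resource so the sketch runs without an Elasticsearch cluster
    val fake = new AutoCloseable {
      def close(): Unit = println("closed")
    }
    val result = withResource(fake) { _ => "done" }
    println(result)
  }
}
```

Because TransportClient implements java.io.Closeable, the client returned by getTransportClient() could be passed to withResource in the same way, with the CRUD calls placed inside the block.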

3. Creating an index

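This part is much the same as what other bloggers have written. I have put a link to the original author below; their article is more complete, so I will give just two examples: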

package elasticSearch

import com.alibaba.fastjson.JSONObject
import org.elasticsearch.action.index.IndexResponse
import org.elasticsearch.common.xcontent.{XContentBuilder, XContentType}
import org.elasticsearch.common.xcontent.json.JsonXContent
// Extends Constants to reuse the connection settings
object CreateIndex extends Constants {
  // Get the client
  val cli = getTransportClient()

  def createIndexByJson(): Unit = {
    val json = new JSONObject()
    json.put("name", "我爱中国")
    json.put("author", "周迅")
    json.put("date", "2018-06-06")
    // Send the document to Elasticsearch
    val response: IndexResponse = cli.prepareIndex(index, t, "9")
      .setSource(json.toString, XContentType.JSON).get()
    // Print the document version
    println(response.getVersion)
  }
  def createIndexByXContentBuilder(): Unit = {
    val builder: XContentBuilder = JsonXContent.contentBuilder()
    builder.startObject()
      .field("name", "西游记")
      .field("author", "吴承恩")
      .field("version", "1.0")
      .endObject()
    // Send the document to Elasticsearch
    val response: IndexResponse = cli.prepareIndex(index, t, "4").setSource(builder)
      .get()
    // Print the document version
    println(response.getVersion)
  }
  def main(args: Array[String]): Unit = {
    // createIndexByJson()
    // Call the method
    createIndexByXContentBuilder()
    // Close the client
    close(cli)
  }
}

Original author's source

4. CRUD and bulk operations

package elasticSearch

import java.util

import com.alibaba.fastjson.JSONObject
import org.elasticsearch.action.bulk.BulkResponse
import org.elasticsearch.action.delete.DeleteResponse
import org.elasticsearch.action.update.UpdateResponse
import org.elasticsearch.common.xcontent.{XContentBuilder, XContentType}
import org.elasticsearch.common.xcontent.json.JsonXContent

object ElasticsearchCRUD extends Constants {
  // Get the client
  val cli = getTransportClient()

  def main(args: Array[String]): Unit = {
    // Call the method
    bulk()
    // Close the client
    close(cli)
  }
  // Delete the document with id 2
  def delete(): Unit = {
    val response: DeleteResponse = cli.prepareDelete(index, t, "2").get()
    println(response.getVersion)
  }
  // Partially update the document with id 2
  def update(): Unit = {
    val builder: XContentBuilder = JsonXContent.contentBuilder()
    builder.startObject()
      .field("version", "3.0")
      .endObject()
    val response: UpdateResponse = cli.prepareUpdate(index, t, "2").setDoc(builder).get()
    println(response.getVersion)
  }
  // Bulk operation: index two documents in a single request
  def bulk(): Unit = {
    // First document, supplied as a Map
    val map = new util.HashMap[String, String]()
    map.put("name", "无双")
    map.put("author", "周润发")
    map.put("version", "2")
    // Second document, supplied as a JSON string
    val json = new JSONObject
    json.put("name", "红楼梦")
    json.put("author", "曹雪芹")
    json.put("version", "1.0")
    val responses: BulkResponse = cli.prepareBulk()
      .add(cli.prepareIndex(index, t, "7").setSource(map))
      .add(cli.prepareIndex(index, t, "8").setSource(json.toString(), XContentType.JSON))
      .get()
    // A bulk request can succeed as a whole while individual items fail
    if (responses.hasFailures) {
      println(responses.buildFailureMessage())
    }
    for (response <- responses.getItems) {
      println(response.getVersion)
    }
  }
}

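Notice the pattern: every operation follows the same steps. Once the connection is set up and you have a client, you build a request with one of the client's prepareXxx methods, call get() to send it to Elasticsearch, and read the result from the response object that comes back. Seen this way, it all looks much simpler.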

5. Full-text search, paginated search, and highlighting

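This section again builds on the code of the author mentioned above; after all, writing code is often a matter of copying and then making small changes. I won't repeat the link here, so just use the one above. My own code is below (if you have followed my steps so far, you can use it directly):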

package elasticSearch

import java.util
import org.elasticsearch.action.search.{SearchResponse, SearchType}
import org.elasticsearch.index.query.QueryBuilders
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder
import org.elasticsearch.search.{SearchHit, SearchHits}
import org.json.JSONObject

import scala.collection.JavaConverters._


object Search extends Constants {
  val cli = getTransportClient()
  def main(args: Array[String]): Unit = {
    // Full-text search
    //fullTextSearch()
    // Paginated search
    //pagingSearch()
    // Highlighted search
    highlightSearch()
    // Close the client
    close(cli)
  }
  // Full-text search
  def fullTextSearch(): Unit = {
    val response = cli.prepareSearch(index) // index to search
      .setSearchType(SearchType.DEFAULT) // search type
      .setQuery(QueryBuilders.matchQuery("author", "天蚕土豆")) // query to run
      .get()
    val hits = response.getHits // search results
    println("totals:" + hits.getTotalHits) // number of matching documents
    println("maxScore:" + hits.getMaxScore) // highest score
    // The individual hits
    val myhits = hits.getHits
    for (hit <- myhits) {
      // Build a fresh JSON object for each hit
      val json = new JSONObject()
      json.put("_index", hit.getIndex)
      json.put("_id", hit.getId)
      json.put("_type", hit.getType)
      json.put("_score", hit.getScore)
      json.put("_source", new JSONObject(hit.getSourceAsString))
      println(json.toString())
    }
  }
  // Paginated search
  // To fetch page num with count records per page: from = count * (num - 1), size = count
  def pagingSearch(from: Int = 0, size: Int = 10): Unit = {
    val response: SearchResponse = cli.prepareSearch(index)
      .setSearchType(SearchType.QUERY_THEN_FETCH)
      .setQuery(QueryBuilders.matchQuery("name", "西游记"))
      .setFrom(from)
      .setSize(size)
      .get()
    val myhits: SearchHits = response.getHits
    val total = myhits.getTotalHits
    println("zzy found " + total + " record(s) for you:")
    val hits: Array[SearchHit] = myhits.getHits
    for (hit <- hits) {
      val map: util.Map[String, AnyRef] = hit.getSourceAsMap
      val author = map.get("author")
      val name = map.get("name")
      val version = map.get("version")
      print(
        s"""
           |author:${author}
           |name:${name}
           |version:${version}
         """.stripMargin)
    }
  }
  // Highlighted search
  def highlightSearch(): Unit = {
    val response = cli.prepareSearch(index)
      .setSearchType(SearchType.DEFAULT)
      .setQuery(QueryBuilders.matchQuery("author", "周润发"))
      .highlighter(new HighlightBuilder()
        .field("author") // field to highlight
        .preTags("<font color='red' size='20px'>") // tag inserted before each match
        .postTags("</font>")) // tag inserted after each match
      .get()
    val myHits = response.getHits
    val total = myHits.getTotalHits
    println("zzy found " + total + " record(s) for you:")
    val hits: Array[SearchHit] = myHits.getHits
    for (hit <- hits) {
      // Highlighted values must be read from the highlight fields, not from _source
      val hlFields = hit.getHighlightFields
      // field is the highlighted field name (author); highlight holds its fragments, tags included
      for ((field, highlight) <- hlFields.asScala) {
        // Join all fragments of this field into one string
        val text = highlight.getFragments.mkString
        println(text)
      }
    }
  }
}
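
The comment above pagingSearch describes the paging arithmetic only briefly, so here is a minimal sketch of it. `Paging` and `offset` are hypothetical names that are not in the original code; the sketch assumes pages are numbered from 1 and that every page has the same size.

```scala
// Hypothetical helper: converts a 1-based page number and a page size into
// the zero-based `from` offset expected by setFrom.
object Paging {
  def offset(page: Int, pageSize: Int): Int = {
    require(page >= 1, "page numbers start at 1")
    require(pageSize >= 1, "page size must be positive")
    (page - 1) * pageSize
  }

  def main(args: Array[String]): Unit = {
    println(offset(1, 10)) // 0: the first page starts at the first hit
    println(offset(3, 10)) // 20: the third page skips two full pages
  }
}
```

With this helper, the third page of ten results would be requested as pagingSearch(from = Paging.offset(3, 10), size = 10). Keep in mind that Elasticsearch rejects requests where from + size exceeds index.max_result_window, which is 10000 by default, so this style of paging suits shallow result sets.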