ElasticSearch读流程

最新推荐文章于 2024-07-05 16:51:40 发布

呵呵你个巴拉

最新推荐文章于 2024-07-05 16:51:40 发布

阅读量1.9k

点赞数

分类专栏： elasticsearch 文章标签： elasticsearch

elasticsearch 专栏收录该内容

9 篇文章 0 订阅

订阅专栏

目录 [隐藏]

基于版本：2.3.2
这次分析的读流程指 GET/MGET 过程，不包含搜索过程。

GET/MGET 必须指定三元组: index type id。 type 可以使用 _all 表示从所有 type 获取第一个匹配 id 的 doc。
mget 时， type 留空表示 _all，例如可以这样:

GET 则必须明确指定 _all ，例如必须这样:

而不能

GET 流程

整体分为五个阶段:准备集群信息,内容路由,协调请求,数据读取,回复客户端。

在处理入口，根据action字符串获取对应的TransportAction实现类，对于一个单个doc的get请求，获取到的是一个

 
         1 
       
         2 
       
         3 
       
        TransportSingleShardAction  
        TransportAction 
        < 
        Request 
        , 
          
        Response 
        > 
          
        transportAction 
          
        = 
          
        actions 
        . 
        get 
        ( 
        action 
        ) 
        ;

一个 TransportSingleShardAction 对象用来处理存在于一个单个主分片或者副本分片上的读请求。

准备集群信息

1.在 TransportSingleShardAction 构造函数中，已准备好 clusterState、nodes 列表等信息

2.resolveRequest函数从ClusterState中获取IndexMetaData，更新可能存在的自定义routing信息

内容路由
确定目标节点,获取shard迭代器，其中包含了目的node信息

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
         7 
       
         8 
       
         9 
       
         10 
       
         11 
       
         12 
       
         13 
       
        private 
          
        AsyncSingleAction 
        ( 
        Request  
        request 
        , 
          
        ActionListener 
        < 
        Response 
        > 
          
        listener 
        ) 
          
        { 
       
        ClusterState  
        clusterState 
          
        = 
          
        clusterService 
        . 
        state 
        ( 
        ) 
        ;​ 
           
        //集群nodes列表             
       
        nodes 
          
        = 
          
        clusterState 
        . 
        nodes 
        ( 
        ) 
        ;​ 
       
        resolveRequest 
        ( 
        clusterState 
        , 
          
        internalRequest 
        ) 
        ;​ 
          
        //根据hash和shard数量取余计算一个随机目的shard，或者走优先级规则 
       
        this 
        . 
        shardIt 
          
        = 
          
        shards 
        ( 
        clusterState 
        , 
          
        internalRequest 
        ) 
        ; 
          
        }

作协调请求,向目标节点发送请求,处理响应,回复客户端,主要代码如下：

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
         7 
       
         8 
       
         9 
       
         10 
       
         11 
       
         12 
       
         13 
       
         14 
       
         15 
       
         16 
       
         17 
       
         18 
       
         19 
       
         20 
       
         21 
       
         22 
       
         23 
       
         24 
       
        private 
          
        void 
          
        perform 
        ( 
        @ 
        Nullable  
        final 
          
        Throwable  
        currentFailure 
        ) 
          
        { 
       
        DiscoveryNode  
        node 
          
        = 
          
        nodes 
        . 
        get 
        ( 
        shardRouting 
        . 
        currentNodeId 
        ( 
        ) 
        ) 
        ; 
       
        if 
          
        ( 
        node 
          
        == 
          
        null 
        ) 
          
        { 
       
        onFailure 
        ( 
        shardRouting 
        , 
          
        new 
          
        NoShardAvailableActionException 
        ( 
        shardRouting 
        . 
        shardId 
        ( 
        ) 
        ) 
        ) 
        ; 
       
        } 
          
        else 
          
        { 
       
        internalRequest 
        . 
        request 
        ( 
        ) 
        . 
        internalShardId 
          
        = 
          
        shardRouting 
        . 
        shardId 
        ( 
        ) 
        ; 
       
        transportService 
        . 
        sendRequest 
        ( 
        node 
        , 
          
        transportShardAction 
        , 
          
        internalRequest 
        . 
        request 
        ( 
        ) 
        , 
          
        new 
          
        BaseTransportResponseHandler 
        < 
        Response 
        > 
        ( 
        ) 
          
        { 
       
        //上面的 sendRequest 不管是发送到网络还是由本地节点直接处理的,下面的函数用于处理后续的响应操作 
       
        @ 
        Override 
       
        public 
          
        void 
          
        handleResponse 
        ( 
        final 
          
        Response  
        response 
        ) 
          
        { 
       
        listener 
        . 
        onResponse 
        ( 
        response 
        ) 
        ; 
       
        } 
       
        @ 
        Override 
       
        public 
          
        void 
          
        handleException 
        ( 
        TransportException  
        exp 
        ) 
          
        { 
       
        onFailure 
        ( 
        shardRouting 
        , 
          
        exp 
        ) 
        ; 
       
        } 
       
        } 
        ) 
        ; 
       
        } 
       
        }

代码入口：
rest请求接受和处理的类位于：
HttpRequestHandler::messageReceived

接收到请求后根据action获取handle，调用不同handle进行处理的的实现位于：RequestHandlerRegistry::processMessageReceived

单个shard读请求处理实现位于：
TransportSingleShardAction：：messageReceived

index 读取的核心实现位于:
InternalEngine::get

内容路由,获取shardit过程

shardit 是一个List的迭代器,默认情况下是所有 activeShard 中随机选择的一个位置的迭代器,如果存在优先级参数会有些其他的过滤条件.ShardRouting类是对一个独一无二的 shard 相关的路由信息.因此这个环节就是要确定最终目的 shard 是哪个,他的相关节点是哪个。
调用OperationRouting::getShards实现

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
        public 
          
        ShardIterator  
        getShards 
        ( 
        ClusterState  
        clusterState 
        , 
          
        String 
          
        index 
        , 
          
        String 
          
        type 
        , 
          
        String 
          
        id 
        , 
          
        @ 
        Nullable  
        String 
          
        routing 
        , 
          
        @ 
        Nullable  
        String 
          
        preference 
        ) 
          
        { 
       
        return 
          
        preferenceActiveShardIterator 
        ( 
        shards 
        ( 
        clusterState 
        , 
          
        index 
        , 
          
        type 
        , 
          
        id 
        , 
          
        routing 
        ) 
        , 
          
        clusterState 
        . 
        nodes 
        ( 
        ) 
        . 
        localNodeId 
        ( 
        ) 
        , 
          
        clusterState 
        . 
        nodes 
        ( 
        ) 
        , 
          
        preference 
        ) 
        ; 
       
        }

1.计算 shardid,然后从路由表获取匹配index 和 shard 的activeShardsgenerateShardId()通过对id等进行hash，对主分片取余,获得目的shardid,然后获取 shardid 对应的内容路由表.期间,会检查索引是否存在,不存在则抛异常。

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
        protected 
          
        IndexShardRoutingTable  
        shards 
        ( 
        ClusterState  
        clusterState 
        , 
          
        String 
          
        index 
        , 
          
        String 
          
        type 
        , 
          
        String 
          
        id 
        , 
          
        String 
          
        routing 
        ) 
          
        { 
       
        int 
          
        shardId 
          
        = 
          
        generateShardId 
        ( 
        clusterState 
        , 
          
        index 
        , 
          
        type 
        , 
          
        id 
        , 
          
        routing 
        ) 
        ; 
       
        return 
          
        clusterState 
        . 
        getRoutingTable 
        ( 
        ) 
        . 
        shardRoutingTable 
        ( 
        index 
        , 
          
        shardId 
        ) 
        ; 
       
        }

其中内容路由规则：
generateShardId中实现

 
         1 
       
         2 
       
         3 
       
        shard 
          
        = 
          
        hash 
        ( 
        routing 
        ) 
          
        % 
          
        number_of_primary 
        _shards

routing 是一个可变值，默认是文档的 _id ,另外可以根据routing指定的值，或者同时参考id与type
读取的时候也是这样hash计算出的目的shard

2.从 activeShard s 中选择目标.调用OperationRouting::preferenceActiveShardIterator()实现后续流程。首先检查是否存在优先级:preference如果不存在,调用ctiveInitializingShardsRandomIt();从activeshards中返回一个随机的node，随机算法在CollectionUtils.rotate实现,只是用一个随机数对activeShards.size()取余如果请求中存在优先级设置,进入分片查询优先级判断逻辑,优先级算法只是将对 activeShard 的随机选择改成了按一定条件把某个shard 放到List 最前面,然后返回第一个.

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
         7 
       
         8 
       
         9 
       
         10 
       
         11 
       
         12 
       
         13 
       
         14 
       
         15 
       
        private 
          
        ShardIterator  
        preferenceActiveShardIterator 
        ( 
        IndexShardRoutingTable  
        indexShard 
        , 
          
        String 
          
        localNodeId 
        , 
          
        DiscoveryNodes  
        nodes 
        , 
          
        @ 
        Nullable  
        String 
          
        preference 
        ) 
          
        { 
       
        if 
          
        ( 
        preference 
          
        == 
          
        null 
          
        || 
          
        preference 
        . 
        isEmpty 
        ( 
        ) 
        ) 
          
        { 
       
        String 
        [ 
        ] 
          
        awarenessAttributes 
          
        = 
          
        awarenessAllocationDecider 
        . 
        awarenessAttributes 
        ( 
        ) 
        ; 
       
        if 
          
        ( 
        awarenessAttributes 
        . 
        length 
          
        == 
          
        0 
        ) 
          
        { 
       
        return 
          
        indexShard 
        . 
        activeInitializingShardsRandomIt 
        ( 
        ) 
        ; 
       
        } 
          
        else 
          
        { 
       
        return 
          
        indexShard 
        . 
        preferAttributesActiveInitializingShardsIt 
        ( 
        awarenessAttributes 
        , 
          
        nodes 
        ) 
        ; 
       
        } 
       
        } 
       
        if 
          
        ( 
        preference 
        . 
        charAt 
        ( 
        0 
        ) 
          
        == 
          
        '_' 
        ) 
          
        { 
       
        . 
        . 
       
        } 
       
        }

协调请求过程

本节点作为协调节点,向目标 node 转发请求,或者目标是本地节点,直接通过函数调用读取数据.发送流程封装了对请求的发送,并且声明了如何对 Response 进行处理:AsyncSingleAction 类中声明的对 Response 进行处理的函数,无论请求在本节点处理还是发送到其他节点,都会经过这个函数处理:

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
        public 
          
        void 
          
        handleResponse 
        ( 
        final 
          
        Response  
        response 
        ) 
          
        { 
       
        listener 
        . 
        onResponse 
        ( 
        response 
        ) 
        ; 
       
        }

最终调用到给客户端回复 Response ,在RestResponseListener类发送:

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
        protected 
          
        final 
          
        void 
          
        processResponse 
        ( 
        Response  
        response 
        ) 
          
        throws 
          
        Exception 
          
        { 
       
        channel 
        . 
        sendResponse 
        ( 
        buildResponse 
        ( 
        response 
        ) 
        ) 
        ; 
       
        }

下面看下具体过程：
1.TransportService::sendRequest中检查目标是否本地node

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
         7 
       
        if 
          
        ( 
        node 
        . 
        equals 
        ( 
        localNode 
        ) 
        ) 
          
        { 
       
        sendLocalRequest 
        ( 
        requestId 
        , 
          
        action 
        , 
          
        request 
        ) 
        ; 
       
        } 
          
        else 
          
        { 
       
        transport 
        . 
        sendRequest 
        ( 
        node 
        , 
          
        requestId 
        , 
          
        action 
        , 
          
        request 
        , 
          
        options 
        ) 
        ; 
       
        }

2.如果是本地node，进入TransportService::sendLocalRequest流程sendLocalRequest不发送到网络，直接根据action获取注册的reg,执行processMessageReceived

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
         7 
       
         8 
       
         9 
       
         10 
       
         11 
       
         12 
       
         13 
       
         14 
       
        //sendRequest发现目标node是本地时（if (node.equals(localNode))），调用到本函数 
       
        private 
          
        void 
          
        sendLocalRequest 
        ( 
        long 
          
        requestId 
        , 
          
        final 
          
        String 
          
        action 
        , 
          
        final 
          
        TransportRequest  
        request 
        ) 
          
        { 
       
        final 
          
        DirectResponseChannel  
        channel 
          
        = 
          
        new 
          
        DirectResponseChannel 
        ( 
        logger 
        , 
          
        localNode 
        , 
          
        action 
        , 
          
        requestId 
        , 
          
        adapter 
        , 
          
        threadPool 
        ) 
        ; 
       
        try 
          
        { 
       
        final 
          
        RequestHandlerRegistry  
        reg 
          
        = 
          
        adapter 
        . 
        getRequestHandler 
        ( 
        action 
        ) 
        ; 
       
        final 
          
        String 
          
        executor 
          
        = 
          
        reg 
        . 
        getExecutor 
        ( 
        ) 
        ; 
       
        if 
          
        ( 
        ThreadPool 
        . 
        Names 
        . 
        SAME 
        . 
        equals 
        ( 
        executor 
        ) 
        ) 
          
        { 
       
        //noinspection unchecked 
       
        reg 
        . 
        processMessageReceived 
        ( 
        request 
        , 
          
        channel 
        ) 
        ; 
       
        } 
          
        . 
        . 
        . 
        . 
        . 
        . 
        . 
       
        }

3.进入数据读取流程
4.如果是发送到网络,请求被异步发送,sendRequest的时候注册 handle:

 
         1 
       
         2 
       
         3 
       
        transportService 
        . 
        sendRequest 
        ( 
        node 
        , 
          
        transportShardAction 
        , 
          
        internalRequest 
        . 
        request 
        ( 
        ) 
        , 
          
        new 
          
        BaseTransportResponseHandler 
        < 
        Response 
        > 
        ( 
        )

在TransportService::sendRequest中,这个 handle 最终被添加到

 
         1 
       
         2 
       
         3 
       
        clientHandlers 
        :  
        clientHandlers 
        . 
        put 
        ( 
        requestId 
        , 
          
        new 
          
        RequestHolder 
        <> 
        ( 
        handler 
        , 
          
        node 
        , 
          
        action 
        , 
          
        timeoutHandler 
        ) 
        ) 
        ;

然后,设置超时,等待处理 Response:

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
         7 
       
         8 
       
         9 
       
         10 
       
        public 
          
        TransportResponseHandler  
        onResponseReceived 
        ( 
        final 
          
        long 
          
        requestId 
        ) 
          
        { 
       
        RequestHolder  
        holder 
          
        = 
          
        clientHandlers 
        . 
        remove 
        ( 
        requestId 
        ) 
        ; 
       
        holder 
        . 
        cancelTimeout 
        ( 
        ) 
        ; 
       
        if 
          
        ( 
        traceEnabled 
        ( 
        ) 
          
        && 
          
        shouldTraceAction 
        ( 
        holder 
        . 
        action 
        ( 
        ) 
        ) 
        ) 
          
        { 
       
        traceReceivedResponse 
        ( 
        requestId 
        , 
          
        holder 
        . 
        node 
        ( 
        ) 
        , 
          
        holder 
        . 
        action 
        ( 
        ) 
        ) 
        ; 
       
        } 
          
        return 
          
        holder 
        . 
        handler 
        ( 
        ) 
        ; 
       
        }

收到其他节点的 Response 后,通过之前声明的handleResponse,给客户端返回响应内容.

本地节点数据读取和发送流程

RequestHandlerRegistry::processMessageReceived作为Request消息处理的总入口，根据action获取handle，调用对应的handler.messageReceived
进行处理

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
         7 
       
         8 
       
         9 
       
         10 
       
         11 
       
         12 
       
         13 
       
         14 
       
         15 
       
         16 
       
         17 
       
         18 
       
         19 
       
        //对于所有Request消息处理的入口 
       
        public 
          
        void 
          
        processMessageReceived 
        ( 
        Request  
        request 
        , 
          
        TransportChannel  
        channel 
        ) 
          
        throws 
          
        Exception 
          
        { 
       
        final 
          
        Task  
        task 
          
        = 
          
        taskManager 
        . 
        register 
        ( 
        channel 
        . 
        getChannelType 
        ( 
        ) 
        , 
          
        action 
        , 
          
        request 
        ) 
        ; 
       
        if 
          
        ( 
        task 
          
        == 
          
        null 
        ) 
          
        { 
       
        handler 
        . 
        messageReceived 
        ( 
        request 
        , 
          
        channel 
        ) 
        ; 
       
        } 
          
        else 
          
        { 
       
        boolean 
          
        success 
          
        = 
          
        false 
        ; 
       
        try 
          
        { 
       
        handler 
        . 
        messageReceived 
        ( 
        request 
        , 
          
        new 
          
        TransportChannelWrapper 
        ( 
        taskManager 
        , 
          
        task 
        , 
          
        channel 
        ) 
        , 
          
        task 
        ) 
        ; 
       
        success 
          
        = 
          
        true 
        ; 
       
        } 
          
        finally 
          
        { 
       
        if 
          
        ( 
        success 
          
        == 
          
        false 
        ) 
          
        { 
       
        taskManager 
        . 
        unregister 
        ( 
        task 
        ) 
        ; 
       
        } 
       
        } 
       
        } 
       
        }

对于单个shard的读请求，进入

 
         1 
       
         2 
       
         3 
       
        TransportSingleShardAction 
        :: 
        ShardTransportHandler 
        :: 
        messageReceived 
        ( 
        )

读取数据组织成 Response, 给客户端 channel 返回。

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
        public 
          
        void 
          
        messageReceived 
        ( 
        final 
          
        Request  
        request 
        , 
          
        final 
          
        TransportChannel  
        channel 
        ) 
          
        throws 
          
        Exception 
          
        { 
       
        Response  
        response 
          
        = 
          
        shardOperation 
        ( 
        request 
        , 
          
        request 
        . 
        internalShardId 
        ) 
        ; 
       
        channel 
        . 
        sendResponse 
        ( 
        response 
        ) 
        ; 
       
        }

shardOperation主要处理请求中是否有 refresh 选项,然后调用indexShard.getService().get() 读取数据,存储到 GetResult.为什么需要在 realtime 未开启的状态下 refresh 选项才能生效呢?如果一个GET操作要求先刷新数据,以此实现实时读取,这意味着数据从 lucene 获取,不走 translog.那他确实没必要开启 realtime 选项

 
   
 
 
  
         1 
       

         2 
       

         3 
       

         4 
       

         5 
       

         6 
       

         7 
       

         8 
       

         9 
       

         10 
       

         11 
       

         12 
       

         13 
       

         14 
       

           
       
 
        protected 
          
        GetResponse  
        shardOperation 
        ( 
        GetRequest  
        request 
        , 
          
        ShardId  
        shardId 
        ) 
          
        { 
       
 
            
        IndexService  
        indexService 
          
        = 
          
        indicesService 
        . 
        indexServiceSafe 
        ( 
        shardId 
        . 
        getIndex 
        ( 
        ) 
        ) 
        ; 
       
 
            
        IndexShard  
        indexShard 
          
        = 
          
        indexService 
        . 
        shardSafe 
        ( 
        shardId 
        . 
        id 
        ( 
        ) 
        ) 
        ; 
       

           
       
 
            
        if 
          
        ( 
        request 
        . 
        refresh 
        ( 
        ) 
          
        && 
          
        ! 
        request 
        . 
        realtime 
        ( 
        ) 
        ) 
          
        { 
       
 
                
        indexShard 
        . 
        refresh 
        ( 
        "refresh_flag_get" 
        ) 
        ; 
       
 
            
        } 
       

           
       
 
            
        GetResult  
        result 
          
        = 
          
        indexShard 
        . 
        getService 
        ( 
        ) 
        . 
        get 
        ( 
        request 
        . 
        type 
        ( 
        ) 
        , 
          
        request 
        . 
        id 
        ( 
        ) 
        , 
          
        request 
        . 
        fields 
        ( 
        ) 
        , 
       
 
                    
        request 
        . 
        realtime 
        ( 
        ) 
        , 
          
        request 
        . 
        version 
        ( 
        ) 
        , 
          
        request 
        . 
        versionType 
        ( 
        ) 
        , 
          
        request 
        . 
        fetchSourceContext 
        ( 
        ) 
        , 
          
        request 
        . 
        ignoreErrorsOnGeneratedFields 
        ( 
        ) 
        ) 
        ; 
       
 
            
        return 
          
        new 
          
        GetResponse 
        ( 
        result 
        ) 
        ; 
       
 
        } 
       

           
       
 
 

ShardGetService::get()中,调用GetResult getResult = innerGet()获取到结果.
GetResult类用于存储读取到的真实数据内容.而 Engine::GetResult类封装的是响应的 lucene IndexSearch 和translog 等信息因此,核心的数据读取实现在ShardGetService::innerGet()函数中.

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
         7 
       
         8 
       
         9 
       
         10 
       
         11 
       
         12 
       
         13 
       
         14 
       
         15 
       
         16 
       
         17 
       
         18 
       
         19 
       
         20 
       
         21 
       
         22 
       
         23 
       
        private 
          
        GetResult  
        innerGet 
        ( 
        String 
          
        type 
        , 
          
        String 
          
        id 
        , 
          
        String 
        [ 
        ] 
          
        gFields 
        , 
          
        boolean 
          
        realtime 
        , 
          
        long 
          
        version 
        , 
          
        VersionType  
        versionType 
        , 
          
        FetchSourceContext  
        fetchSourceContext 
        , 
          
        boolean 
          
        ignoreErrorsOnGeneratedFields 
        ) 
          
        { 
       
        fetchSourceContext 
          
        = 
          
        normalizeFetchSourceContent 
        ( 
        fetchSourceContext 
        , 
          
        gFields 
        ) 
        ; 
       
        Engine 
        . 
        GetResult  
        get 
          
        = 
          
        null 
        ; 
       
        if 
          
        ( 
        type 
          
        == 
          
        null 
          
        || 
          
        type 
        . 
        equals 
        ( 
        "_all" 
        ) 
        ) 
          
        { 
       
        . 
        . 
        . 
       
        } 
          
        else 
          
        { 
       
        get 
          
        = 
          
        indexShard 
        . 
        get 
        ( 
        new 
          
        Engine 
        . 
        Get 
        ( 
        realtime 
        , 
          
        new 
          
        Term 
        ( 
        UidFieldMapper 
        . 
        NAME 
        , 
          
        Uid 
        . 
        createUidAsBytes 
        ( 
        type 
        , 
          
        id 
        ) 
        ) 
        ) 
       
        . 
        version 
        ( 
        version 
        ) 
        . 
        versionType 
        ( 
        versionType 
        ) 
        ) 
        ; 
       
        . 
        . 
        . 
       
        } 
       
        DocumentMapper  
        docMapper 
          
        = 
          
        mapperService 
        . 
        documentMapper 
        ( 
        type 
        ) 
        ; 
       
        try 
          
        { 
       
        // break between having loaded it from translog (so we only have _source), and having a document to load 
       
        if 
          
        ( 
        get 
        . 
        docIdAndVersion 
        ( 
        ) 
          
        != 
          
        null 
        ) 
          
        { 
       
        return 
          
        innerGetLoadFromStoredFields 
        ( 
        type 
        , 
          
        id 
        , 
          
        gFields 
        , 
          
        fetchSourceContext 
        , 
          
        get 
        , 
          
        docMapper 
        , 
          
        ignoreErrorsOnGeneratedFields 
        ) 
        ; 
       
        } 
          
        . 
        . 
        . 
       
        } 
          
        }

1.首先,通过indexShard.get()获取Engine.GetResult,里面有重要的 lucene indexsearch,或者Translog.Source 等信息。
get()函数最终实现在InternalEngine::get()
先获取读锁:

 
         1 
       
         2 
       
         3 
       
        try 
          
        ( 
        ReleasableLock  
        lock 
          
        = 
          
        readLock 
        . 
        acquire 
        ( 
        ) 
        )

如果是数据位于本机的最新数据(versionMap中存在),则从 translog 获取.非realtime通过 lucene 获取,如果指定了 version 且 version 不存在,读取失败

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
         7 
       
         8 
       
         9 
       
         10 
       
         11 
       
         12 
       
         13 
       
         14 
       
         15 
       
         16 
       
         17 
       
         18 
       
         19 
       
         20 
       
         21 
       
         22 
       
         23 
       
         24 
       
         25 
       
         26 
       
         27 
       
         28 
       
         29 
       
        public 
          
        GetResult  
        get 
        ( 
        Get  
        get 
        ) 
          
        throws 
          
        EngineException 
          
        { 
       
        try 
          
        ( 
        ReleasableLock  
        lock 
          
        = 
          
        readLock 
        . 
        acquire 
        ( 
        ) 
        ) 
          
        { 
       
        ensureOpen 
        ( 
        ) 
        ; 
       
        if 
          
        ( 
        get 
        . 
        realtime 
        ( 
        ) 
        ) 
          
        { 
       
        //versionMap 中的值是写入索引的时候添加的,并且不做持久化. 
       
        VersionValue  
        versionValue 
          
        = 
          
        versionMap 
        . 
        getUnderLock 
        ( 
        get 
        . 
        uid 
        ( 
        ) 
        . 
        bytes 
        ( 
        ) 
        ) 
        ; 
       
        //一般不进入下面的 if, 有两个条件:1.最近写入的数据(具体多新未知),2,读取的时候指定了 version 
       
        if 
          
        ( 
        versionValue 
          
        != 
          
        null 
        ) 
          
        { 
       
        if 
          
        ( 
        versionValue 
        . 
        delete 
        ( 
        ) 
        ) 
          
        { 
        //删除标识,数据已通过 delete 接口删除了 
       
        return 
          
        GetResult 
        . 
        NOT_EXISTS 
        ; 
       
        } 
       
        if 
          
        ( 
        get 
        . 
        versionType 
        ( 
        ) 
        . 
        isVersionConflictForReads 
        ( 
        versionValue 
        . 
        version 
        ( 
        ) 
        , 
          
        get 
        . 
        version 
        ( 
        ) 
        ) 
        ) 
          
        { 
       
        Uid  
        uid 
          
        = 
          
        Uid 
        . 
        createUid 
        ( 
        get 
        . 
        uid 
        ( 
        ) 
        . 
        text 
        ( 
        ) 
        ) 
        ; 
       
        throw 
          
        new 
          
        VersionConflictEngineException 
        ( 
        shardId 
        , 
          
        uid 
        . 
        type 
        ( 
        ) 
        , 
          
        uid 
        . 
        id 
        ( 
        ) 
        , 
          
        versionValue 
        . 
        version 
        ( 
        ) 
        , 
          
        get 
        . 
        version 
        ( 
        ) 
        ) 
        ; 
       
        } 
       
        Translog 
        . 
        Operation  
        op 
          
        = 
          
        translog 
        . 
        read 
        ( 
        versionValue 
        . 
        translogLocation 
        ( 
        ) 
        ) 
        ; 
       
        if 
          
        ( 
        op 
          
        != 
          
        null 
        ) 
          
        { 
       
        return 
          
        new 
          
        GetResult 
        ( 
        true 
        , 
          
        versionValue 
        . 
        version 
        ( 
        ) 
        , 
          
        op 
        . 
        getSource 
        ( 
        ) 
        ) 
        ; 
       
        } 
       
        } 
       
        } 
       
        // no version, get the version from the index, we know that we refresh on flush 
       
        return 
          
        getFromSearcher 
        ( 
        get 
        ) 
        ; 
       
        } 
       
        }

2.调用ShardGetService::innerGetLoadFromStoredFields(),根据 type,id,DocumentMapper 等信息从刚刚get 到的信息中获取数据,对指定的field,source,进行过滤(source 过滤只支持对字段),把结果存于GetResult对象

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
         7 
       
         8 
       
         9 
       
         10 
       
         11 
       
         12 
       
         13 
       
         14 
       
         15 
       
         16 
       
         17 
       
         18 
       
        private 
          
        GetResult  
        innerGetLoadFromStoredFields 
        ( 
        String 
          
        type 
        , 
          
        String 
          
        id 
        , 
          
        String 
        [ 
        ] 
          
        gFields 
        , 
          
        FetchSourceContext  
        fetchSourceContext 
        , 
          
        Engine 
        . 
        GetResult  
        get 
        , 
          
        DocumentMapper  
        docMapper 
        , 
          
        boolean 
          
        ignoreErrorsOnGeneratedFields 
        ) 
          
        { 
       
        Map 
        < 
        String 
        , 
          
        GetField 
        > 
          
        fields 
          
        = 
          
        null 
        ; 
       
        BytesReference  
        source 
          
        = 
          
        null 
        ; 
       
        Versions 
        . 
        DocIdAndVersion  
        docIdAndVersion 
          
        = 
          
        get 
        . 
        docIdAndVersion 
        ( 
        ) 
        ; 
       
        FieldsVisitor  
        fieldVisitor 
          
        = 
          
        buildFieldsVisitors 
        ( 
        gFields 
        , 
          
        fetchSourceContext 
        ) 
        ; 
       
        if 
          
        ( 
        fieldVisitor 
          
        != 
          
        null 
        ) 
          
        { 
       
        try 
          
        { 
       
        docIdAndVersion 
        . 
        context 
        . 
        reader 
        ( 
        ) 
        . 
        document 
        ( 
        docIdAndVersion 
        . 
        docId 
        , 
          
        fieldVisitor 
        ) 
        ; 
       
        } 
          
        catch 
          
        ( 
        IOException 
          
        e 
        ) 
          
        { 
       
        throw 
          
        new 
          
        ElasticsearchException 
        ( 
        "Failed to get type [" 
          
        + 
          
        type 
          
        + 
          
        "] and id [" 
          
        + 
          
        id 
          
        + 
          
        "]" 
        , 
          
        e 
        ) 
        ; 
       
        } 
       
        source 
          
        = 
          
        fieldVisitor 
        . 
        source 
        ( 
        ) 
        ; 
       
        . 
        . 
        . 
       
        } 
       
        return 
          
        new 
          
        GetResult 
        ( 
        shardId 
        . 
        index 
        ( 
        ) 
        . 
        name 
        ( 
        ) 
        , 
          
        type 
        , 
          
        id 
        , 
          
        get 
        . 
        version 
        ( 
        ) 
        , 
          
        get 
        . 
        exists 
        ( 
        ) 
        , 
          
        source 
        , 
          
        fields 
        ) 
        ; 
       
        }

MGET流程

mget主要处理类:TransportMultiGetAction,其集成关系如下,通过封装单个 GET 请求实现
主要流程如下:
1.request 的 doc 数目中遍历,计算出由 sharid 为 key 组成的 request map.这个过程不在 TransportSingleShardAction 中实现,
是因为如果在那边实现, shardid 会重复
2.循环处理组织好的每个请求,走TransportSingleShardAction中处理单个 doc 的流程.与处理单个 doc 时相比,只是在构建TransportSingleShardAction对象时,传入的泛型:Request,Response不同.这就是说大约只是 shardid 内部计算还是外部计算等3.收集Response,全部 Response返回后执行finishHim(),给客户端返回结果

总结:回复的消息中 doc 顺序与请求的顺序一致如果部分 doc检索失败,不影响其他结果,检索失败的 doc 会在回复信息中标出

通过分析读流程,我们思考以下问题：

读失败是怎么处理的？
没有重试处理.无论是否指定优先级,都不会尝试重读,优先级只是在处理 avtiveshard 的ArrayList时将匹配
的放到了ArrayList前面而已.

怎么选择从主分片还是副本分片读取的？
从 activeshard 中随机选择,通过指定优先级可以从主分片读

分配是 shardit迭代器, 而不是单个目标 node, 难道想挨个尝试?
没有实现挨个尝试,只是向一个发请求后不管

读请求命中 translog 的条件是什么?
写完数据之后,短时间内发起的读操作(无论读操作到达哪个节点,都可以命中,每个主\副分片所在节点都写了 translog)

为什么要加读锁?怕读的时候有人删了改?需要分布式锁吗?
用于多线程间的同步.不需要分布式锁,只锁本节点本进程即可.设想 A 节点在读, B 节点要删除,B 节点删除成功, B 作为协调节点向 A 发送删除请求,该请求会阻塞在读写锁.锁的范围:每个shard有一个读写锁.读写锁是Engine类的成员变量,集群启动的时候, 为存储于本地节点的 index::shard创建一个Engine对象.循环位于:IndicesClusterStateService::applyNewOrUpdatedShards()

读取指定 route 如何处理的?
对GET 请求中 routing 参数的的处理,就是把默认对 id 进行 hash, 改为对 routing 指定的值进行 hash

对 _source,_field 等过滤器如何处理的?在哪个环节处理的?是否 lucene 处理的?全部读取出来之后才做的 filter 吗?
_source,_field 是在读取了完整的 doc 之后在innerGetLoadFromStoredFields函数中做过滤的

refresh参数在哪实现的?
TransportGetAction::shardOperation()函数,设置 refesh 为 true,并且关闭 realtime 才会刷新 shard.

cache 机制是如何的?
早期版本缓存一切可以缓存的数据
使用频率较高,数据量较大的才进行缓存
缓存老化算法为 LRU: 最近最少使用
参考:
https://www.elastic.co/guide/en/elasticsearch/guide/current/filter-caching.htmlhttps://www.elastic.co/guide/cn/elasticsearch/guide/current/filter-caching.html

读取操作是实时还是准实时?
实时指写入完成后立刻读取,是否能读到。
读取是实时的。因为三元组明确,不需要走倒排索引
search 给的的是关键词,必须走倒排才能查到,不走 translog, 所以是近实时的.
参考：
https://www.elastic.co/guide/en/elasticsearch/guide/current/near-real-time.htmlhttps://www.elastic.co/guide/en/elasticsearch/reference/2.3/docs-get.html#realtimeGET/MGET

为什么不默认读本地节点?
？？

GET 相关参数

realtime
默认开启2.3.2版本:尝试从 translog 读取.命中条件:写完数据之后,短时间内发起的读操作会命中,无论这个读取操作被发到哪个节点.因为当一个写操作返回时,所有主,副分片所在节点都有 translog 可以命中5.5的版本中,不受索引刷新速率的影响，如果一个document没有被更新了，但是还没有刷新，那么get API获取此文档的时候会先刷新，然后再get

Optional Type
如果想要查询所有的类型，可以直接指定_type为_all，从而匹配所有的类型。返回匹配 id 的第一个 doc

Source filtering
默认情况下get操作会返回_source字段，除非你使用了fields字段或者禁用了_source字段。通过设置_source属性，可以禁止返回source内容:

 
         1 
       
         2 
       
         3 
       
        curl 
          
        - 
        XGET 
          
        'http://localhost:9200/twitter/tweet/1?_source=false'

如果想要返回特定的字段，可以使用_source_include或者_source_exclude进行过滤。可以使用逗号分隔来设置多种匹配模式，比如：

 
         1 
       
         2 
       
         3 
       
        curl 
          
        - 
        XGET 
          
        'http://localhost:9200/twitter/tweet/1?_source_include=*.id&_source_exclude=entities'

如果希望返回特定的字段，也可以直接写上字段的名称:

 
         1 
       
         2 
       
         3 
       
        curl 
          
        - 
        XGET 
          
        'http://localhost:9200/twitter/tweet/1?_source=*.id,retweeted'

Fields
get操作允许设置fields字段，返回特定的字段：
curl -XGET ‘http://localhost:9200/twitter/tweet/1?fields=title,content’
如果请求的字段没有被存储，那么他们会从source中分析出来，这个功能也可以用 source filter来替代。
元数据比如_routing和_parent是永远不会被返回的。
只有叶子字段才能通过field选项返回.所以对象字段这种是不能返回的，这种请求也会失败。

Routing
当索引的时候指定了路由，那么查询的时候就一定要指定路由。
curl -XGET ‘http://localhost:9200/twitter/tweet/1?routing=kimchy’
如果路由信息不正确，就会查找不到文档

Preference
控制为get请求维护一个分片的索引，这个索引可以设置为：
_primary 这个操作仅仅会在主分片上执行。
_local 这个操作会在本地的分片上执行。
Custom (string) value 用户可以自定义值，对于相同的分片可以设置相同的值。这样可以保证不同的刷新状态下，查询不同的分片。就像sessionid或者用户名一样。

Refresh
refresh参数可以让每次get之前都刷新分片，使这个值可以被搜索。设置true的时候，尽量要考虑下性能问题，因为每次刷新都会给系统带来一定的压力

Versioning support
使用 version 参数检索文档,只有当前版本号与指定版本号相同时才能成功,这种机制对所有版本类型都有效,除了FOUCE, 他总是返回 doc.

ref：
http://blog.csdn.net/u010994304/article/details/50441419
http://www.code123.cc/2582.html
http://www.jianshu.com/p/62febe581fcb
http://blog.csdn.net/july_2/article/details/24777931
http://blog.csdn.net/laigood/article/details/8450331
http://www.cnblogs.com/xing901022/p/5317698.html