SolrCloud Wiki Translation (2): Nodes, Cores, Clusters & Leaders

Nodes and Cores


In SolrCloud, a node is a Java Virtual Machine instance running Solr, commonly called a server. Each Solr core can also be considered a node. Any node can contain both an instance of Solr and various kinds of data.

A Solr core is basically an index of the text and fields found in documents. A single Solr instance can contain multiple "cores", which are kept separate from each other based on local criteria. They might provide different search interfaces to users (customers in the US and customers in Canada, for example), or they might have security concerns (some users cannot have access to some documents), or the documents might be really different and just won't mix well in the same index (a shoe database and a DVD database).

When you start a new core in SolrCloud mode, it registers itself with ZooKeeper. This involves creating an ephemeral node that will go away if the Solr instance goes down, as well as registering information about the core and how to contact it (such as the base Solr URL, core name, etc.). Smart clients and nodes in the cluster can use this information to determine who they need to talk to in order to fulfill a request.
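The registration behaviour described above can be illustrated with a toy model. This is only an in-memory sketch, not the ZooKeeper API: the class ToyRegistry, the path /toy/cores/mycore, and the session id are all hypothetical names chosen for illustration. It shows only the idea that an entry tied to a session disappears when that session ends.

```python
class ToyRegistry:
    """In-memory stand-in for ZooKeeper (hypothetical, for illustration only)."""

    def __init__(self):
        # path -> (session_id, data)
        self._entries = {}

    def register_ephemeral(self, session_id, path, data):
        # The entry is owned by the session that created it.
        self._entries[path] = (session_id, data)

    def close_session(self, session_id):
        # Ending a session drops every entry it created, mirroring how the
        # ephemeral node goes away when the Solr instance goes down.
        self._entries = {
            path: (owner, data)
            for path, (owner, data) in self._entries.items()
            if owner != session_id
        }

    def lookup(self, path):
        entry = self._entries.get(path)
        return entry[1] if entry else None


registry = ToyRegistry()
registry.register_ephemeral(
    "session-1",
    "/toy/cores/mycore",  # hypothetical path, not Solr's real layout
    {"base_url": "http://localhost:8983/solr", "core": "mycore"},
)
```

A client would call lookup to find the base URL of a live core; after close_session("session-1") the same lookup returns None.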

New Solr cores may also be created and associated with a collection via CoreAdmin. Additional cloud-related parameters are discussed in the Parameter Reference page. The cloud-related parameters for the CREATE action are:

  • collection: the name of the collection to which this core belongs. Default is the name of the core.
  • shard: the shard id this core represents. (Optional: normally you want to be auto-assigned a shard id.)
  • collection.<param>=<value>: causes a property of <param>=<value> to be set if a new collection is being created. For example, use collection.configName=<configname> to point to the config for a new collection.

For example:


curl  'http://localhost:8983/solr/admin/cores?action=CREATE&name=mycore&collection=collection1&shard=shard2'
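The same request URL can be assembled programmatically. The following is a minimal sketch that only builds the URL string and does not contact Solr; the helper name core_create_url is an assumption made for illustration, while the parameter names (action, name, collection, shard, collection.<param>) are the ones listed above.

```python
from urllib.parse import urlencode


def core_create_url(base_url, name, collection=None, shard=None, **collection_props):
    """Build a CoreAdmin CREATE URL (hypothetical helper, not part of Solr)."""
    params = {"action": "CREATE", "name": name}
    if collection is not None:
        params["collection"] = collection
    if shard is not None:
        params["shard"] = shard
    # Extra keyword arguments become collection.<param>=<value> pairs.
    for key, value in collection_props.items():
        params[f"collection.{key}"] = value
    return f"{base_url}/admin/cores?{urlencode(params)}"


url = core_create_url(
    "http://localhost:8983/solr",
    "mycore",
    collection="collection1",
    shard="shard2",
)
print(url)
```

Passing configName="myconf" as an extra keyword argument would append collection.configName=myconf to the query string.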


Clusters


A cluster is a set of Solr nodes managed by ZooKeeper as a single unit. When you have a cluster, you can always make requests to the cluster, and if a request is acknowledged you can be sure that it will be managed as a unit and be durable, i.e., you won't lose data. Updates can be seen right after they are made, and the cluster can be expanded or contracted.


Creating a Cluster


A cluster is created as soon as you have more than one Solr instance registered with ZooKeeper. The Getting Started with SolrCloud section reviews how to set up a simple cluster.

Resizing a Cluster


Clusters contain a settable number of shards. You set the number of shards for a new cluster by passing a system property, numShards, when you start up Solr. The numShards parameter must be passed on the first startup of any Solr node, and is used to auto-assign which shard each instance should be part of. Once you have started up more Solr nodes than numShards, the nodes will create replicas for each shard, distributing them evenly across the nodes, as long as they all belong to the same collection.

To add more cores to your collection, simply start the new core. You can do this at any time and the new core will sync its data with the current replicas in the shard before becoming active.


You can also skip numShards and manually assign a core to a shard by giving it a shard ID, if you choose.

The number of shards determines how the data in your index is broken up, so you cannot change the number of shards of the index after initially setting up the cluster.


However, you do have the option of breaking your index into multiple shards to start with, even if you are only using a single machine. You can then expand to multiple machines later. To do that, follow these steps:


  1. Set up your collection by hosting multiple cores on a single physical machine (or group of machines). Each of these cores will be the leader for its shard.
  2. When you're ready, you can migrate shards onto new machines by starting up a new replica for a given shard on each new machine.
  3. Remove the shard from the original machine. ZooKeeper will promote the replica to the leader for that shard.
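The three migration steps can be traced with a small simulation. This is a toy model under a stated assumption: the leader of a shard is taken to be the longest-registered core still present, which stands in for the real election that ZooKeeper coordinates. The machine names and the leader function are hypothetical.

```python
def leader(shard_cores):
    """Toy rule (an assumption): the earliest-registered remaining core leads."""
    return shard_cores[0] if shard_cores else None


# Step 1: both shards start out as cores on a single machine.
# Each list holds the machines hosting a core for that shard, in join order.
cluster = {"shard1": ["machine1"], "shard2": ["machine1"]}

# Step 2: start a replica for shard2 on a new machine.
cluster["shard2"].append("machine2")

# Step 3: remove shard2's core from the original machine.
# The remaining replica is now the only candidate and becomes the leader.
cluster["shard2"].remove("machine1")

print(leader(cluster["shard1"]), leader(cluster["shard2"]))
```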

Leaders and Replicas


The concept of a leader is similar to that of the master in traditional Solr replication. The leader is responsible for making sure the replicas are up to date with the same information stored in the leader.


However, with SolrCloud, you don't simply have one master and one or more "slaves". Instead, you have likely distributed your search and index traffic across multiple machines. If you have bootstrapped Solr with numShards=2, for example, your indexes are split across both shards. In this case, both shards are considered leaders. If you start more Solr nodes after the initial two, these will be automatically assigned as replicas for the leaders.


Replicas are assigned to shards in the order they are started the first time they join the cluster. This is done in a round-robin manner, unless the new node is manually assigned to a shard with the shardId parameter during startup. This parameter is used as a system property, as in -DshardId=1, the value of which is the ID number of the shard the new node should be attached to.
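The round-robin rule with a manual override can be sketched as follows. This is an illustrative model only, not Solr's implementation: the class ShardAssigner and its join method are hypothetical, and the sketch assumes that a manual assignment does not advance the round-robin position.

```python
class ShardAssigner:
    """Toy model of first-start shard assignment (hypothetical, not Solr code)."""

    def __init__(self, num_shards):
        self.num_shards = num_shards
        # shard id -> nodes in the order they joined
        self.members = {i: [] for i in range(1, num_shards + 1)}
        self._next = 0  # position in the round-robin cycle

    def join(self, node, shard_id=None):
        if shard_id is None:
            # Automatic case: cycle through shard ids 1..num_shards.
            shard_id = self._next % self.num_shards + 1
            self._next += 1
        # Manual case: shard_id plays the role of -DshardId=<id>.
        self.members[shard_id].append(node)
        return shard_id


assigner = ShardAssigner(num_shards=2)
auto = [assigner.join(n) for n in ["A", "B", "C", "D"]]
manual = assigner.join("E", shard_id=1)
print(auto, manual, assigner.members)
```

With two shards, nodes A to D alternate between shard 1 and shard 2, while node E lands on shard 1 because it was assigned explicitly.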

On subsequent restarts, each node joins the same shard that it was assigned to the first time the node was started (whether that assignment happened manually or automatically). A node that was previously a replica, however, may become the leader if the previously assigned leader is not available.


Consider this example:


  • Node A is started with the bootstrap parameters, pointing to a stand-alone ZooKeeper, with the numShards parameter set to 2.
  • Node B is started and pointed to the stand-alone ZooKeeper.

Nodes A and B each hold one shard, filling the two shard slots we defined when we started Node A. If we look in the Solr Admin UI, we'll see that both nodes are considered leaders (indicated with a solid black circle).

  • Node C is started and pointed to the stand-alone ZooKeeper.

Node C will automatically become a replica of Node A because we didn't specify any other shard for it to belong to, and it cannot become a new shard because we only defined two shards and those have both been taken.


  • Node D is started and pointed to the stand-alone ZooKeeper.

Node D will automatically become a replica of Node B, for the same reason that Node C is a replica of Node A.


Upon restart, suppose that Node C starts before Node A. What happens? Node C will become the leader, while Node A becomes a replica of Node C.
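The restart scenario can be traced with a short sketch. It assumes, as a simplification of the real election, that the first node of a shard to come up becomes that shard's leader; the function elect_leaders and the shard numbering are hypothetical.

```python
def elect_leaders(shard_of, startup_order):
    """Toy rule (an assumption): the first node of a shard to start leads it."""
    leaders = {}
    for node in startup_order:
        # setdefault keeps the first node seen for each shard.
        leaders.setdefault(shard_of[node], node)
    return leaders


# Shard membership is fixed by the first startup: A and C share one shard,
# B and D share the other.
shard_of = {"A": 1, "B": 2, "C": 1, "D": 2}

first_start = elect_leaders(shard_of, ["A", "B", "C", "D"])
after_restart = elect_leaders(shard_of, ["C", "B", "A", "D"])
print(first_start, after_restart)
```

On the first start, A and B lead their shards. When C comes up before A on restart, C leads the first shard and A rejoins it as a replica.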






