Understanding the Cassandra Architecture: Data distribution and replication

About data distribution and replication
In Cassandra, data distribution and replication go together. This is because Cassandra is designed as a
peer-to-peer system that makes copies of the data and distributes the copies among a group of nodes.
Data is organized by table and identified by a primary key. The primary key determines which node the
data is stored on. Copies of rows are called replicas. When data is first written, it is also referred to as a
replica.

When you create a cluster, you must specify the following:

• Virtual nodes: assign data ownership to physical machines.
• Partitioner: partitions the data across the cluster.
• Replication strategy: determines the replicas for each row of data.
• Snitch: defines the topology information that the replication strategy uses to place replicas.

Consistent hashing
Details about how the consistent hashing mechanism distributes data across a cluster in Cassandra.
Consistent hashing partitions data based on the primary key. For example, if you have the following data:

jim     age: 36   car: camaro   gender: M
carol   age: 37   car: bmw      gender: F
johnny  age: 12                 gender: M
suzy    age: 10                 gender: F

Cassandra assigns a hash value to each primary key:

Primary key   Murmur3 hash value
jim           -2245462676723223822
carol          7723358927203680754
johnny        -6723372854036780875
suzy           1168604627387940318

Each node in the cluster is responsible for a range of data based on the hash value:

Node   Murmur3 start range     Murmur3 end range
A      -9223372036854775808    -4611686018427387903
B      -4611686018427387904    -1
C       0                       4611686018427387903
D       4611686018427387904     9223372036854775807

Cassandra places the data on each node according to the value of the primary key and the range that
the node is responsible for. For example, in a four node cluster, the data in this example is distributed as
follows:

Node   Start range             End range               Primary key   Hash value
A      -9223372036854775808    -4611686018427387903    johnny        -6723372854036780875
B      -4611686018427387904    -1                      jim           -2245462676723223822
C       0                       4611686018427387903    suzy           1168604627387940318
D       4611686018427387904     9223372036854775807    carol          7723358927203680754
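The routing step itself is easy to sketch. The following Python snippet is an illustration only (the hash values are taken from the table above rather than recomputed, since Cassandra uses its own Murmur3 variant); it shows how a key's token selects the node whose range contains it:

    # Token ranges owned by each node in the four-node example above.
    RANGES = {
        "A": (-9223372036854775808, -4611686018427387903),
        "B": (-4611686018427387904, -1),
        "C": (0, 4611686018427387903),
        "D": (4611686018427387904, 9223372036854775807),
    }

    # Murmur3 tokens for the example primary keys (values from the table above).
    TOKENS = {
        "jim": -2245462676723223822,
        "carol": 7723358927203680754,
        "johnny": -6723372854036780875,
        "suzy": 1168604627387940318,
    }

    def owning_node(token):
        """Return the node whose start/end range contains the token."""
        for node, (start, end) in RANGES.items():
            if start <= token <= end:
                return node
        raise ValueError("token outside the Murmur3 range")

    for key, token in TOKENS.items():
        print(key, "->", owning_node(token))
    # jim -> B, carol -> D, johnny -> A, suzy -> C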


About virtual nodes
Overview of virtual nodes.
Virtual nodes simplify many tasks in Cassandra:
• You no longer have to calculate and assign tokens to each node.

• Rebalancing a cluster is no longer necessary when adding or removing nodes. When a node joins the
cluster, it assumes responsibility for an even portion of data from the other nodes in the cluster. If a
node fails, the load is spread evenly across other nodes in the cluster.
• Rebuilding a dead node is faster because it involves every other node in the cluster and because data
is sent to the replacement node incrementally instead of waiting until the end of the validation phase.
• Improves the use of heterogeneous machines in a cluster. You can assign a proportional number of
virtual nodes to smaller and larger machines.
For more information, see the article Virtual nodes in Cassandra 1.2.
How data is distributed across a cluster (using virtual nodes)

Prior to version 1.2, you had to calculate and assign a single token to each node in a cluster. Each token
determined the node's position in the ring and its portion of data according to its hash value. Although the consistent hashing design used prior to version 1.2 allowed moving only a single node's worth of data when adding or removing nodes from the cluster (an advantage over other distribution designs), doing so still required substantial effort.
Starting in version 1.2, Cassandra changes this paradigm from one token and range per node to many
tokens per node. The new paradigm is called virtual nodes. Virtual nodes allow each node to own a large
number of small partition ranges distributed throughout the cluster. Virtual nodes also use consistent
hashing to distribute data but using them doesn't require token generation and assignment.
[Figure: a cluster ring without virtual nodes (top) and the same ring with virtual nodes (bottom).]

The top portion of the graphic shows a cluster without virtual nodes. In this paradigm, each node is
assigned a single token that represents a location in the ring. Each node stores data determined by
mapping the row key to a token value within a range from the previous node to its assigned value. Each
node also contains copies of each row from other nodes in the cluster. For example, range E replicates to
nodes 5, 6, and 1. Notice that a node owns exactly one contiguous partition range in the ring space.
The bottom portion of the graphic shows a ring with virtual nodes. Within a cluster, virtual nodes are
randomly selected and non-contiguous. The placement of a row is determined by the hash of the row key
within many smaller partition ranges belonging to each node.
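To make the contrast concrete, here is a small Python sketch (an illustration only, not how Cassandra assigns tokens internally) in which each node picks several random tokens; sorting the tokens shows that any one node ends up owning many small, non-contiguous partition ranges:

    import random

    MIN_TOKEN, MAX_TOKEN = -2**63, 2**63 - 1
    NUM_TOKENS = 8                      # tokens per node; real clusters commonly use 256
    NODES = ["node1", "node2", "node3"]

    random.seed(42)                     # fixed seed so the example is repeatable

    # Each node claims several random positions on the ring instead of one.
    ring = []
    for node in NODES:
        for _ in range(NUM_TOKENS):
            ring.append((random.randint(MIN_TOKEN, MAX_TOKEN), node))
    ring.sort()

    # Walking the sorted ring shows interleaved ownership: consecutive ranges
    # usually belong to different nodes, so each node owns many small slices.
    prev = MIN_TOKEN
    for token, node in ring:
        print(f"{node} owns ({prev}, {token}]")
        prev = token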

Setting up virtual nodes
Set the number of tokens on each node in your cluster with the num_tokens parameter in the cassandra.yaml file.

Generally, when all nodes have equal hardware capability, they should have the same number of virtual nodes. If the hardware capabilities vary among the nodes in your cluster, assign a proportional number of virtual nodes to the larger machines. For example, you could designate your older machines to use 128 virtual nodes and your newer machines (that are twice as powerful) to use 256 virtual nodes.

The recommended value for num_tokens is 256. Do not set the initial_token parameter.
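A minimal cassandra.yaml excerpt for a node using the recommended setting might look like the following (initial_token stays commented out when virtual nodes are in use):

    # cassandra.yaml
    num_tokens: 256
    # initial_token: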

About data replication
Cassandra stores replicas on multiple nodes to ensure reliability and fault tolerance. A replication
strategy determines the nodes where replicas are placed.
The total number of replicas across the cluster is referred to as the replication factor. A replication factor
of 1 means that there is only one copy of each row on one node. A replication factor of 2 means two
copies of each row, where each copy is on a different node. All replicas are equally important; there is no
primary or master replica. As a general rule, the replication factor should not exceed the number of nodes
in the cluster. However, you can increase the replication factor and then add the desired number of nodes
later. When the replication factor exceeds the number of nodes, writes are rejected, but reads are served as
long as the desired consistency level can be met.
Two replication strategies are available:
• SimpleStrategy: Use for a single data center only. If you ever intend to use more than one data center, use NetworkTopologyStrategy instead.
• NetworkTopologyStrategy: Highly recommended for most deployments because it is much easier to expand to multiple data centers later if required.
SimpleStrategy
Use only for a single data center. SimpleStrategy places the first replica on a node determined by the partitioner. Additional replicas are placed on the next nodes clockwise in the ring without considering topology (rack or data center location).
NetworkTopologyStrategy
Use NetworkTopologyStrategy when you have (or plan to have) your cluster deployed across multiple data centers. This strategy specifies how many replicas you want in each data center.
NetworkTopologyStrategy places replicas in the same data center by walking the ring clockwise until reaching the first node in another rack. NetworkTopologyStrategy attempts to place replicas on distinct racks because nodes in the same rack (or similar physical grouping) often fail at the same time due to power, cooling, or network issues.
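Both placement rules amount to a walk around the ring. The Python sketch below is a simplified illustration (not Cassandra's implementation; the node, rack, and token values are made up for the example): simple_strategy takes the next nodes clockwise regardless of topology, while rack_aware walks clockwise but prefers nodes on racks it has not used yet, in the spirit of NetworkTopologyStrategy:

    # A single data center with six nodes spread across three racks.
    # Each entry is (token, node, rack); the list is sorted by token.
    RING = [
        (-80, "n1", "rack1"),
        (-50, "n2", "rack1"),
        (-20, "n3", "rack2"),
        ( 10, "n4", "rack2"),
        ( 40, "n5", "rack3"),
        ( 70, "n6", "rack3"),
    ]

    def primary_index(token):
        """Index of the first node whose token is >= the row token (wrapping)."""
        for i, (t, _, _) in enumerate(RING):
            if token <= t:
                return i
        return 0

    def simple_strategy(token, rf):
        """Take the next rf nodes clockwise, ignoring racks (SimpleStrategy)."""
        start = primary_index(token)
        return [RING[(start + i) % len(RING)][1] for i in range(rf)]

    def rack_aware(token, rf):
        """Walk clockwise, skipping already-used racks when possible
        (a simplified sketch of NetworkTopologyStrategy's behaviour)."""
        start = primary_index(token)
        replicas, racks_seen, skipped = [], set(), []
        i = start
        while len(replicas) < rf and len(replicas) + len(skipped) < len(RING):
            _, node, rack = RING[i % len(RING)]
            if rack in racks_seen:
                skipped.append(node)        # same rack as an earlier replica
            else:
                replicas.append(node)
                racks_seen.add(rack)
            i += 1
        # Fewer racks than rf: fall back to the nodes that were skipped.
        replicas += skipped[: rf - len(replicas)]
        return replicas

    print(simple_strategy(-30, 3))   # ['n3', 'n4', 'n5'] -- two replicas share rack2
    print(rack_aware(-30, 3))        # ['n3', 'n5', 'n1'] -- one replica per rack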
When deciding how many replicas to configure in each data center, the two primary considerations are (1)
being able to satisfy reads locally, without incurring cross data-center latency, and (2) failure scenarios.
The two most common ways to configure multiple data center clusters are:
• Two replicas in each data center: This configuration tolerates the failure of a single node per replication
group and still allows local reads at a consistency level of ONE.
• Three replicas in each data center: This configuration tolerates either the failure of one node per replication group at a strong consistency level of LOCAL_QUORUM, or multiple node failures per data center using consistency level ONE.

Asymmetrical replication groupings are also possible. For example, you can have three replicas per data
center to serve real-time application requests and use a single replica for running analytics.
Choosing keyspace replication options
To set the replication strategy for a keyspace, see CREATE KEYSPACE.
When you use NetworkTopologyStrategy, you set the keyspace replication options using the data center names defined for the snitch used by the cluster. To place replicas in the correct location, Cassandra requires a keyspace definition that uses the snitch-aware data center names. For example, if the cluster uses the PropertyFileSnitch, create the keyspace using the user-defined data center and rack names in the cassandra-topology.properties file. If the cluster uses the EC2Snitch, create the keyspace using EC2 data center and rack names.
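For example, assuming the snitch defines two data centers named DC1 and DC2 (hypothetical names; they must match the snitch's data center names exactly), the keyspaces could be defined as follows:

    -- Single data center: SimpleStrategy with a replication factor of 3.
    CREATE KEYSPACE single_dc
      WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };

    -- Multiple data centers: replica counts are given per snitch-defined data center.
    CREATE KEYSPACE multi_dc
      WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 3, 'DC2' : 3 };

The keyspace names single_dc and multi_dc are placeholders; only the replication map matters here.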
