Learning Redis: Redis Partitioning

Partitioning is the process of splitting your data across multiple Redis instances, so that each instance holds only a subset of the keys.

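Advantages of partitioning

  • It allows much larger databases, using the combined memory of many computers. Without partitioning, you are limited to the amount of memory a single computer can support.

  • It allows computational power to scale across multiple cores and multiple computers, and network bandwidth to scale across multiple computers and network adapters.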

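Disadvantages of partitioning

  • Operations involving multiple keys are usually not supported. For example, you cannot directly compute the intersection of two sets if they are stored in keys that are mapped to different Redis instances.

  • Redis transactions involving multiple keys cannot be used.

  • The partitioning granularity is the key, so a dataset held in a single huge key, such as a very large sorted set, cannot be sharded.

  • When partitioning is used, data handling is more complex. For example, you have to handle multiple RDB/AOF files, and backing up your data means collecting the persistence files from multiple instances and hosts.

  • Adding and removing capacity can be complex. For example, Redis Cluster supports mostly transparent rebalancing of data, with the ability to add and remove nodes at runtime, but other systems such as client-side partitioning and proxies do not support this feature. A technique called Presharding helps in this regard.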

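Types of partitioning

There are two common types of partitioning for Redis. Suppose we have four Redis instances R0, R1, R2, R3, and many keys representing users, like user:1, user:2, and so on.

Range partitioning

Range partitioning is accomplished by mapping ranges of objects to specific Redis instances. In this example, users with IDs 0 to 10000 go into instance R0, users with IDs 10001 to 20000 go into instance R1, and so on.

Hash partitioning

In this type of partitioning, a hash function (for example, one followed by a modulo operation) converts the key into a number, and that number determines which Redis instance stores the data.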

 

-------------------------------------------------------------------------------------------------------------------

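A note up front: I have recently been studying how to use Redis, including its application scenarios, performance optimization, and feasibility. I came across this page through a link on the official Redis site. It mainly explains Redis data partitioning, and since it is officially recommended, I translated it to share with everyone.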

Partitioning: how to split data among multiple Redis instances.

Partitioning is the process of splitting your data across multiple Redis instances, so that every instance contains only a subset of your keys. The first part of this document introduces the concept of partitioning; the second part shows the alternatives for Redis partitioning.

Why partitioning is useful

Partitioning in Redis serves two main goals:

  • It allows for much larger databases, using the sum of the memory of many computers. Without partitioning you are limited to the amount of memory a single computer can support.
  • It allows computational power to scale across multiple cores and multiple computers, and network bandwidth to scale across multiple computers and network adapters.

Partitioning basics

There are different partitioning criteria. Imagine we have four Redis instances R0, R1, R2, R3, and many keys representing users, like user:1, user:2, and so forth. We can find different ways to select the instance in which we store a given key. In other words, there are different systems to map a given key to a given Redis server.

One of the simplest ways to perform partitioning is called range partitioning, and is accomplished by mapping ranges of objects to specific Redis instances. For example, I could say that users from ID 0 to ID 10000 go into instance R0, while users from ID 10001 to ID 20000 go into instance R1, and so forth.

This system works and is actually used in practice. However, it has the disadvantage that it requires a table mapping ranges to instances. This table needs to be managed, and we need a table for every kind of object we have. With Redis this is usually not a good idea.
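
To make the range table concrete, here is a minimal sketch of range partitioning for user keys. The table contents beyond R0 and R1, and the names USER_RANGES, instance_for_user, and instance_for_key, are assumptions made up for illustration; they do not come from any Redis client.

```python
import bisect

# Hypothetical range table for "user" objects. Each entry is
# (highest ID covered by the range, instance name). The first two rows follow
# the text (0..10000 -> R0, 10001..20000 -> R1); the rest are assumed.
USER_RANGES = [
    (10000, "R0"),
    (20000, "R1"),
    (30000, "R2"),
    (40000, "R3"),
]


def instance_for_user(user_id: int) -> str:
    """Return the instance whose ID range contains user_id."""
    if user_id < 0:
        raise ValueError("user ID must be non-negative")
    upper_bounds = [upper for upper, _ in USER_RANGES]
    # bisect_left finds the first range whose upper bound is >= user_id.
    index = bisect.bisect_left(upper_bounds, user_id)
    if index == len(USER_RANGES):
        raise KeyError(f"no range configured for user ID {user_id}")
    return USER_RANGES[index][1]


def instance_for_key(key: str) -> str:
    """Route keys of the form user:<id>; other object types need their own table."""
    object_name, _, raw_id = key.partition(":")
    if object_name != "user":
        raise KeyError(f"no range table for object type {object_name!r}")
    return instance_for_user(int(raw_id))
```

The KeyError for any other object type is the drawback described above in code form: every kind of object needs its own table, and each table has to be kept up to date as IDs grow.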

An alternative to range partitioning is hash partitioning. This scheme works with any key, with no need for a key in the form object_name:<id>, and is as simple as this:

  • Take the key name and use a hash function to turn it into a number. For instance, I could use the crc32 hash function. So if the key is foobar, I compute crc32(foobar), which outputs something like 93024922.
  • I apply a modulo operation to this number to turn it into a number between 0 and 3, so that I can map it to one of my four Redis instances. 93024922 modulo 4 equals 2, so I know my key foobar should be stored in the R2 instance. Note: the modulo operation is just the remainder of the division; in many programming languages it is implemented by the % operator.
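
The two steps above can be sketched in a few lines of Python. This is only an illustration: INSTANCES and instance_for_key are names invented here, and the number 93024922 is the illustrative value from the text, not necessarily what crc32 really returns for foobar.

```python
import zlib

# The four instances from the example, indexed 0..3.
INSTANCES = ["R0", "R1", "R2", "R3"]


def instance_for_key(key: str) -> str:
    """Hash the key name with crc32, then take it modulo the instance count."""
    # zlib.crc32 works on bytes and returns an unsigned 32-bit integer.
    number = zlib.crc32(key.encode("utf-8"))
    return INSTANCES[number % len(INSTANCES)]


# The arithmetic from the text: 93024922 modulo 4 is 2, which selects R2.
example_instance = INSTANCES[93024922 % 4]
```

Because the mapping depends only on the key name and the number of instances, every client that uses the same function reaches the same instance without any shared table.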

There are many other ways to perform partitioning, but with these two examples you should get the idea. One advanced form of hash partitioning is called consistent hashing, and it is implemented by a few Redis clients and proxies.
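
As a rough illustration of consistent hashing, the sketch below places several points per node on a hash ring and assigns each key to the first point clockwise from the key's own position. The HashRing class, the use of MD5 for ring positions, and the replica count are all assumptions chosen for this example; real clients and proxies differ in these details.

```python
import bisect
import hashlib


def _point(label: str) -> int:
    """Map a string to a ring position (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring; not taken from any particular Redis client."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # points placed on the ring per node
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.replicas):
            position = _point(f"{node}#{i}")
            self._owner[position] = node
            bisect.insort(self._points, position)

    def node_for(self, key):
        # First ring point clockwise from the key's position, wrapping around.
        index = bisect.bisect_right(self._points, _point(key)) % len(self._points)
        return self._owner[self._points[index]]


ring = HashRing(["R0", "R1", "R2", "R3"])
keys = [f"user:{i}" for i in range(1000)]
before = {key: ring.node_for(key) for key in keys}
ring.add_node("R4")
after = {key: ring.node_for(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
```

In this sketch, every key in moved ends up on the new node R4, and all other keys stay where they were. That is the property that separates consistent hashing from a plain modulo.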

Different implementations of partitioning

Partitioning can be the responsibility of different parts of a software stack.

  • Client-side partitioning means that the clients directly select the right node on which to write or read a given key. Many Redis clients implement client-side partitioning.
  • Proxy-assisted partitioning means that our clients send requests to a proxy that speaks the Redis protocol, instead of sending requests directly to the right Redis instance. The proxy forwards each request to the right Redis instance according to the configured partitioning schema, and sends the replies back to the client. Twemproxy, a proxy for Redis and Memcached developed at Twitter, implements proxy-assisted partitioning.
  • Query routing means that you can send your query to a random instance, and the instance will make sure to forward your query to the right node. Redis Cluster implements a hybrid form of query routing, with the help of the client: the request is not forwarded directly from one Redis instance to another; instead, the client is redirected to the right node.
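
The hybrid query routing in the last item can be simulated in memory. In the sketch below, FakeNode, Moved, and RedirectingClient are hypothetical stand-ins rather than a real Redis client or server. The slot function follows the CRC16 modulo 16384 rule that Redis Cluster uses, as far as I know, and it ignores hash tags.

```python
import binascii

SLOTS = 16384  # number of hash slots in Redis Cluster


def slot_for(key: str) -> int:
    # CRC16 (XMODEM variant) modulo 16384; hash tags ({...}) are ignored here.
    return binascii.crc_hqx(key.encode("utf-8"), 0) % SLOTS


class Moved(Exception):
    """Stand-in for the redirection reply a node sends for a slot it does not own."""

    def __init__(self, slot, node_name):
        super().__init__(f"MOVED {slot} {node_name}")
        self.slot = slot
        self.node_name = node_name


class FakeNode:
    """In-memory stand-in for one cluster node; it is not a real Redis server."""

    def __init__(self, name, slot_range, cluster):
        self.name = name
        self.slot_range = slot_range  # range of slots this node serves
        self.cluster = cluster        # shared dict: node name -> FakeNode
        self.data = {}

    def _check(self, key):
        slot = slot_for(key)
        if slot not in self.slot_range:
            owner = next(n for n in self.cluster.values() if slot in n.slot_range)
            raise Moved(slot, owner.name)

    def set(self, key, value):
        self._check(key)
        self.data[key] = value

    def get(self, key):
        self._check(key)
        return self.data.get(key)


class RedirectingClient:
    """Sends every command to one entry node and follows a single redirect."""

    def __init__(self, cluster, entry_node):
        self.cluster = cluster
        self.entry_node = entry_node

    def _call(self, method, key, *args):
        try:
            return getattr(self.cluster[self.entry_node], method)(key, *args)
        except Moved as moved:
            # The node did not forward the request; the client retries elsewhere.
            return getattr(self.cluster[moved.node_name], method)(key, *args)

    def set(self, key, value):
        return self._call("set", key, value)

    def get(self, key):
        return self._call("get", key)


cluster = {}
cluster["A"] = FakeNode("A", range(0, 8192), cluster)
cluster["B"] = FakeNode("B", range(8192, SLOTS), cluster)
client = RedirectingClient(cluster, entry_node="A")
client.set("user:1", "alice")
```

The node never forwards the command itself. It only tells the client where the slot lives, and the client repeats the command against that node.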

Disadvantages of partitioning

Some features of Redis don't play very well with partitioning:

  • Operations involving multiple keys are usually not supported. For instance, you can't perform the intersection between two sets if they are stored in keys that are mapped to different Redis instances (actually there are ways to do this, but not directly).
  • Redis transactions involving multiple keys cannot be used.
  • The partitioning granularity is the key, so it is not possible to shard a dataset with a single huge key, such as a very big sorted set.
  • When partitioning is used, data handling is more complex. For instance, you have to handle multiple RDB / AOF files, and to make a backup of your data you need to aggregate the persistence files from multiple instances and hosts.
  • Adding and removing capacity can be complex. For instance, Redis Cluster plans to support mostly transparent rebalancing of data, with the ability to add and remove nodes at runtime, but other systems like client-side partitioning and proxies don't support this feature. However, a technique called Presharding helps in this regard.
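
One indirect way to handle the multi-key case in the first item is to fetch each set from its own instance and intersect them in the client. The sketch below uses plain dictionaries as stand-ins for the four instances, and sadd, smembers, and sinter_client_side are helper names invented for this example that only mimic the corresponding Redis commands.

```python
import zlib

# Plain dictionaries stand in for the four Redis instances.
instances = {name: {} for name in ("R0", "R1", "R2", "R3")}
NAMES = sorted(instances)


def instance_for_key(key):
    """Pick an instance with crc32 modulo the number of instances."""
    return NAMES[zlib.crc32(key.encode("utf-8")) % len(NAMES)]


def sadd(key, *members):
    """Add members to the set stored under key, on whichever instance owns it."""
    instances[instance_for_key(key)].setdefault(key, set()).update(members)


def smembers(key):
    """Return a copy of the whole set stored under key."""
    return set(instances[instance_for_key(key)].get(key, set()))


def sinter_client_side(*keys):
    """Fetch each set from its own instance and intersect them in the client."""
    sets = [smembers(key) for key in keys]
    return set.intersection(*sets) if sets else set()


sadd("friends:alice", "bob", "carol", "dave")
sadd("friends:bob", "carol", "dave", "erin")
common = sinter_client_side("friends:alice", "friends:bob")
```

The result is the same whichever instances the two keys land on. The price is that every set travels over the network in full, which is why this counts as an indirect workaround rather than real multi-key support.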

Data store or cache?

Partitioning when using Redis as a data store or as a cache is conceptually the same, but there is a huge difference in practice. When Redis is used as a data store, you need to be sure that a given key always maps to the same instance. When Redis is used as a cache, it is not a big problem if a given node is unavailable and we start using a different node, altering the key-instance map as we wish in order to improve the availability of the system (that is, the ability of the system to reply to our queries).

Consistent hashing implementations are often able to switch to other nodes if the preferred node for a given key is not available. Similarly, if you add a new node, part of the new keys will start to be stored on the new node.
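
The switch to another node can be sketched as a walk around the hash ring that skips unavailable nodes. The function names, the MD5-based ring positions, and the replica count below are assumptions for illustration only, and this fallback only makes sense when Redis is used as a cache.

```python
import bisect
import hashlib


def ring_position(label):
    """Map a string to a ring position (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest()[:8], "big")


def build_ring(nodes, replicas=50):
    """Return a sorted list of (position, node) pairs, several per node."""
    return sorted(
        (ring_position(f"{node}#{i}"), node)
        for node in nodes
        for i in range(replicas)
    )


def node_for(ring, key, unavailable=frozenset()):
    """Walk clockwise from the key's position to the first available node."""
    start = bisect.bisect_right(ring, (ring_position(key), ""))
    for offset in range(len(ring)):
        _, node = ring[(start + offset) % len(ring)]
        if node not in unavailable:
            return node
    raise RuntimeError("no node available")


ring = build_ring(["R0", "R1", "R2", "R3"])
preferred = node_for(ring, "user:42")
fallback = node_for(ring, "user:42", unavailable={preferred})
```

Only keys whose preferred node is down are sent elsewhere. Keys on healthy nodes keep their usual owner, so the rest of the cache stays warm.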

The main concept here is the following:

  • If Redis is used as a cache, scaling up and down using consistent hashing is easy.
  • If Redis is used as a store, we need to keep the map between keys and nodes fixed, and use a fixed number of nodes. Otherwise we need a system that is able to rebalance keys between nodes when we add or remove nodes. At the time of writing, only Redis Cluster is able to do this, and Redis Cluster is not yet production ready.
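
To see why a store needs a fixed number of nodes, the sketch below counts how many keys would change owner if a plain crc32-modulo scheme went from four nodes to five. The key names and the helper node_index are made up for the example.

```python
import zlib


def node_index(key, node_count):
    """Index of the node that owns key under simple hash-modulo partitioning."""
    return zlib.crc32(key.encode("utf-8")) % node_count


keys = [f"user:{i}" for i in range(10000)]
# A key is "moved" if its owner differs between a 4-node and a 5-node setup.
moved = sum(1 for key in keys if node_index(key, 4) != node_index(key, 5))
moved_fraction = moved / len(keys)
```

With a reasonably uniform hash, roughly four keys in five would be expected to move, because a key keeps its index only when its hash modulo 20 is 0, 1, 2, or 3. For a cache that means a burst of misses. For a store it means the data is no longer where the clients look for it, unless something rebalances the keys.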

The two pieces above were collected from around the web for my own study.
