Redis Translation: Memory Optimization

Special encoding of small aggregate data types

Since Redis 2.2 many data types are optimized to use less space up to a certain size. Hashes, Lists, Sets composed of just integers, and Sorted Sets, when smaller than a given number of elements, and up to a maximum element size, are encoded in a very memory-efficient way that uses up to 10 times less memory (with 5 times less memory used being the average saving).
This is completely transparent from the point of view of the user and the API. Since this is a CPU / memory trade-off, it is possible to tune the maximum number of elements and the maximum element size for specially encoded types using the following redis.conf directives:
hash-max-zipmap-entries 64 (hash-max-ziplist-entries for Redis >= 2.6)
hash-max-zipmap-value 512  (hash-max-ziplist-value for Redis >= 2.6)
list-max-ziplist-entries 512
list-max-ziplist-value 64
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
set-max-intset-entries 512
If a specially encoded value overflows the configured max size, Redis will automatically convert it into the normal encoding. This operation is very fast for small values, but if you change the setting in order to use specially encoded values for much larger aggregate types, the suggestion is to run some benchmarks and tests to check the conversion time.
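You can watch this conversion happen with the OBJECT ENCODING command. A minimal redis-cli sketch, assuming the default limits above (the key name smallset is just an illustration):

redis> SADD smallset 1 2 3
(integer) 3
redis> OBJECT ENCODING smallset
"intset"
redis> SADD smallset notanumber
(integer) 1
redis> OBJECT ENCODING smallset
"hashtable"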


Using 32 bit instances

Redis compiled with a 32 bit target uses a lot less memory per key, since pointers are small, but such an instance will be limited to 4 GB of maximum memory usage. To compile Redis as a 32 bit binary use make 32bit. RDB and AOF files are compatible between 32 bit and 64 bit instances (and between little and big endian, of course), so you can switch from 32 to 64 bit, or the contrary, without problems.

Bit and byte level operations

Redis 2.2 introduced new bit and byte level operations: GETRANGE, SETRANGE, GETBIT and SETBIT. Using these commands you can treat the Redis string type as a random access array. For instance, if you have an application where users are identified by a unique progressive integer number, you can use a bitmap in order to save information about the sex of users, setting the bit for females and clearing it for males, or the other way around. With 100 million users this data will take just 12 megabytes of RAM in a Redis instance. You can do the same using GETRANGE and SETRANGE in order to store one byte of information per user. This is just an example, but it is actually possible to model a number of problems in very little space with these new primitives.
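The arithmetic: 100,000,000 users × 1 bit each ÷ 8 bits per byte ≈ 12 MB. A minimal redis-cli sketch of the bitmap idea (the key name sex:bitmap and the user ids are hypothetical):

redis> SETBIT sex:bitmap 10203 1
(integer) 0
redis> GETBIT sex:bitmap 10203
(integer) 1
redis> GETBIT sex:bitmap 55
(integer) 0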


Use hashes when possible

Small hashes are encoded in a very small space, so you should try representing your data using hashes whenever possible. For instance, if you have objects representing users in a web application, instead of using different keys for name, surname, email, and password, use a single hash with all the required fields.
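Concretely, a minimal sketch of the difference (the user:1000 key and the field values are hypothetical):

redis> HMSET user:1000 name John surname Doe email john@example.com
OK
redis> HGET user:1000 email
"john@example.com"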

If you want to know more about this, read the next section.


Using hashes to abstract a very memory efficient plain key-value store on top of Redis

I understand the title of this section is a bit scary, but I'm going to explain in detail what this is about.

Basically it is possible to model a plain key-value store using Redis where values can just be strings, which is not only more memory efficient than Redis plain keys but also much more memory efficient than memcached.

Let's start with a fact: a few keys use a lot more memory than a single key containing a hash with a few fields. How is this possible? We use a trick. In theory, in order to guarantee that we perform lookups in constant time (also known as O(1) in big O notation), there is the need to use a data structure with constant time complexity in the average case, like a hash table.

But many times hashes contain just a few fields. When hashes are small we can instead just encode them in an O(N) data structure, like a linear array with length-prefixed key-value pairs. Since we do this only when N is small, the amortized time for HGET and HSET commands is still O(1): the hash will be converted into a real hash table as soon as the number of elements it contains grows too large (you can configure the limit in redis.conf).
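As a rough illustration, such a length-prefixed linear layout looks conceptually like this (this is just the idea, not the exact zipmap/ziplist byte format):

|4|name|4|John|7|surname|3|Doe|

A lookup simply scans left to right, hopping ahead by each length prefix; with a small N this scan is cheap.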

Not only does this work well from the point of view of time complexity, it also works well from the point of view of constant times, since a linear array of key-value pairs happens to play very well with the CPU cache (it has better cache locality than a hash table).

However, since hash fields and values are not (always) represented as full featured Redis objects, hash fields can't have an associated time to live (expire) like a real key, and can only contain a string. But we are okay with this; it was anyway the intention when the hash data type API was designed (we trust simplicity more than features, so nested data structures are not allowed, just as expires of single fields are not allowed).

So hashes are memory efficient. This is very useful when using hashes to represent objects or to model other problems where there are groups of related fields. But what about if we have a plain key-value business?

Imagine we want to use Redis as a cache for many small objects, which can be JSON encoded objects, small HTML fragments, simple key -> boolean values and so forth. Basically anything is a string -> string map with small keys and values.

Now let's assume the objects we want to cache are numbered, like:


  • object:102393
  • object:1234
  • object:5
This is what we can do: every time we need to perform a SET operation to set a new value, we actually split the key into two parts, one part used as the key, and the other part used as the field name for the hash. For instance, the object named "object:1234" is actually split into:
  • a Key named object:12
  • a Field named 34
So we use all the characters but the last two for the key, and the final two characters for the hash field name. To set our key we use the following command:
HSET object:12 34 somevalue
As you can see, every hash will end up containing 100 fields, which is an optimal compromise between CPU and memory saved.
There is another very important thing to note: with this schema every hash will have more or less 100 fields regardless of the number of objects we cache. This is because our objects always end with a number, not a random string. In some way, the final number can be considered as a form of implicit pre-sharding.
What about small numbers, like object:2? We handle this case using just "object:" as the key name, and the whole number as the hash field name. So object:2 and object:10 will both end up inside the key "object:", but one with field name "2" and the other with "10".
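A minimal sketch of this small-number case (somevalue is just a placeholder):

HSET object: 2 somevalue
HGET object: 2

Here the catch-all key "object:" holds every object whose id has fewer than three digits.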
How much memory do we save this way?
I used the following Ruby program to test how this works:
require 'rubygems'
require 'redis'

UseOptimization = true

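# Split a key like "object:1234" into {:key => "object:12", :field => "34"};
# ids with fewer than three digits all go under the catch-all key "object:".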
def hash_get_key_field(key)
    s = key.split(":")
    if s[1].length > 2
        {:key => s[0]+":"+s[1][0..-3], :field => s[1][-2..-1]}
    else
        {:key => s[0]+":", :field => s[1]}
    end
end

def hash_set(r,key,value)
    kf = hash_get_key_field(key)
    r.hset(kf[:key],kf[:field],value)
end

def hash_get(r,key)
    kf = hash_get_key_field(key)
    r.hget(kf[:key],kf[:field])
end

r = Redis.new
(0..100000).each{|id|
    key = "object:#{id}"
    if UseOptimization
        hash_set(r,key,"val")
    else
        r.set(key,"val")
    end
}
This is the result against a 64 bit instance of Redis 2.2:
  • UseOptimization set to true: 1.7 MB of used memory
  • UseOptimization set to false: 11 MB of used memory
This is an order of magnitude difference; I think this makes Redis more or less the most memory efficient plain key-value store out there.
WARNING: for this to work, make sure that in your redis.conf you have something like this:
hash-max-zipmap-entries 256
Also remember to set the following field according to the maximum size of your keys and values:
hash-max-zipmap-value 1024
Every time a hash exceeds the specified number of elements or element size, it will be converted into a real hash table, and the memory saving will be lost.
You may ask, why don't you do this implicitly in the normal key space so that I don't have to care? There are two reasons: one is that we tend to make trade-offs explicit, and this is a clear trade-off between many things: CPU, memory, max element size. The second is that the top level key space must support a lot of interesting things like expires, LRU data, and so forth, so it is not practical to do this in a general way.

But the Redis Way is that the user must understand how things work so that he is able to pick the best compromise, and to understand how the system will behave exactly.


Memory allocation

To store user keys, Redis allocates at most as much memory as the maxmemory setting enables (however there are small extra allocations possible).
The exact value can be set in the configuration file or set later via CONFIG SET (see Using memory as an LRU cache for more info). There are a few things that should be noted about how Redis manages memory:
  • Redis will not always free up (return) memory to the OS when keys are removed. This is not something special about Redis, but it is how most malloc() implementations work. For example if you fill an instance with 5GB worth of data, and then remove the equivalent of 2GB of data, the Resident Set Size (also known as the RSS, which is the number of memory pages consumed by the process) will probably still be around 5GB, even if Redis will claim that the user memory is around 3GB. This happens because the underlying allocator can't easily release the memory. For example often most of the removed keys were allocated in the same pages as the other keys that still exist.
  • The previous point means that you need to provision memory based on your peak memory usage. If your workload from time to time requires 10GB, even if most of the times 5GB could do, you need to provision for 10GB.
  • However allocators are smart and are able to reuse free chunks of memory, so after you free 2GB of your 5GB data set, when you start adding more keys again, you'll see the RSS (Resident Set Size) stay steady and not grow more as you add up to 2GB of additional keys. The allocator is basically trying to reuse the 2GB of memory previously (logically) freed.
  • Because of all this, the fragmentation ratio is not reliable when you had a memory usage that at peak is much larger than the currently used memory. The fragmentation is calculated as the physical memory actually used (the RSS value) divided by the amount of memory currently in use (as the sum of all the allocations performed by Redis). Because the RSS reflects the peak memory, when the (virtually) used memory is low since a lot of keys / values were freed, but the RSS is high, the ratio RSS / mem_used will be very high, as in the sketch below.
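For example, a hypothetical INFO memory excerpt after such a peak (the numbers are invented for illustration):

redis> INFO memory
used_memory:3221225472
used_memory_human:3.00G
used_memory_rss:5368709120
mem_fragmentation_ratio:1.67

Here 5368709120 / 3221225472 ≈ 1.67: the 5GB peak RSS measured against 3GB of live data.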
If maxmemory is not set, Redis will keep allocating memory as it finds fit and thus it can (gradually) eat up all your free memory. Therefore it is generally advisable to configure some limit. You may also want to set maxmemory-policy to noeviction (which is not the default value in some older versions of Redis).
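A minimal redis.conf sketch of such a setup (the 4gb value is just an illustration):

maxmemory 4gb
maxmemory-policy noeviction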
This makes Redis return an out of memory error for write commands if and when it reaches the limit, which in turn may result in errors in the application, but will not render the whole machine dead because of memory starvation.