[Redis source code study] Redis's very own "linked list": ziplist


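Have you used Python's list? The kind of data structure that can hold values of any type and supports random access.

If you haven't, there isn't much I can do for you.

In principle such a list can be backed by either an array or a linked list. I don't know which one Python's list uses underneath.

Redis's list, however, is backed by neither a plain linked list nor a plain array, and the reasons are easy to see. An array is awkward: would you make it two-dimensional? Flexible? How would you even write that? A linked list would work: a generic linked list whose data field is a void* gives you list behavior. But its drawback is just as obvious: it easily causes memory fragmentation.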
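To put a number on that drawback, here is a minimal sketch of the generic linked list described above. The names `GenericNode`, `node_overhead_bytes` and `list_bytes` are invented for illustration and do not come from the Redis source, and the cost model (one allocation per node plus one per payload) is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic doubly linked list node: the payload lives
 * elsewhere on the heap and the node only stores a pointer to it. */
typedef struct GenericNode {
    struct GenericNode *prev;
    struct GenericNode *next;
    void *value; /* points to a separately allocated element */
} GenericNode;

/* Bookkeeping bytes paid per element, before counting the payload
 * itself or any allocator metadata. */
static size_t node_overhead_bytes(void) {
    return sizeof(GenericNode);
}

/* Total bytes for n elements of payload_size bytes each, assuming
 * one allocation for the node and one for the payload. */
static size_t list_bytes(size_t n, size_t payload_size) {
    assert(payload_size > 0);
    return n * (node_overhead_bytes() + payload_size);
}
```

On a typical 64-bit build the node alone is three pointers wide, so even a one-byte element costs far more than its payload and is spread over two separate allocations. That is the waste and fragmentation a compact encoding tries to avoid.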

Against that backdrop, and guided by the principle of "save wherever you can", please design a data structure.


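Structure design


[Figure: ziplist structure diagram]

One thing to note in this figure: the right-hand side does not record "the size of the current element".

The figure is quite detailed, which saves me from explaining every field. Nice.

As for the rest, the comment at the top of the file (ziplist.c) explains it clearly.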

/* The ziplist is a specially encoded dually linked list that is designed
 * to be very memory efficient. It stores both strings and integer values,
 * where integers are encoded as actual integers instead of a series of
 * characters. It allows push and pop operations on either side of the list
 * in O(1) time. However, because every operation requires a reallocation of
 * the memory used by the ziplist, the actual complexity is related to the
 * amount of memory used by the ziplist.
 *
 * ZIPLIST OVERALL LAYOUT
 * ======================
 *
 * The general layout of the ziplist is as follows:
 *
 * <zlbytes> <zltail> <zllen> <entry> <entry> ... <entry> <zlend>
 *
 * NOTE: all fields are stored in little endian, if not specified otherwise.
 *
 * <uint32_t zlbytes> is an unsigned integer to hold the number of bytes that
 * the ziplist occupies, including the four bytes of the zlbytes field itself.
 * This value needs to be stored to be able to resize the entire structure
 * without the need to traverse it first.
 *
 * <uint32_t zltail> is the offset to the last entry in the list. This allows
 * a pop operation on the far side of the list without the need for full
 * traversal.
 *
 * <uint16_t zllen> is the number of entries. When there are more than
 * 2^16-2 entries, this value is set to 2^16-1 and we need to traverse the
 * entire list to know how many items it holds.
 *
 * <uint8_t zlend> is a special entry representing the end of the ziplist.
 * Is encoded as a single byte equal to 255. No other normal entry starts
 * with a byte set to the value of 255.
 *
 * ZIPLIST ENTRIES
 * ===============
 *
 * Every entry in the ziplist is prefixed by metadata that contains two pieces
 * of information. First, the length of the previous entry is stored to be
 * able to traverse the list from back to front. Second, the entry encoding is
 * provided. It represents the entry type, integer or string, and in the case
 * of strings it also represents the length of the string payload.
 * So a complete entry is stored like this:
 *
 * <prevlen> <encoding> <entry-data>
 *
 * Sometimes the encoding represents the entry itself, like for small integers
 * as we'll see later. In such a case the <entry-data> part is missing, and we
 * could have just:
 *
 * <prevlen> <encoding>
 *
 * The length of the previous entry, <prevlen>, is encoded in the following way:
 * If this length is smaller than 254 bytes, it will only consume a single
 * byte representing the length as an unsigned 8 bit integer. When the length
 * is greater than or equal to 254, it will consume 5 bytes. The first byte is
 * set to 254 (FE) to indicate a larger value is following. The remaining 4
 * bytes take the length of the previous entry as value.
 *
 * So practically an entry is encoded in the following way:
 *
 * <prevlen from 0 to 253> <encoding> <entry>
 *
 * Or alternatively if the previous entry length is greater than 253 bytes
 * the following encoding is used:
 *
 * 0xFE <4 bytes unsigned little endian prevlen> <encoding> <entry>
 *
 * The encoding field of the entry depends on the content of the
 * entry. When the entry is a string, the first 2 bits of the encoding first
 * byte will hold the type of encoding used to store the length of the string,
 * followed by the actual length of the string. When the entry is an integer
 * the first 2 bits are both set to 1. The following 2 bits are used to specify
 * what kind of integer will be stored after this header. An overview of the
 * different types and encodings is as follows. The first byte is always enough
 * to determine the kind of entry.
 *
 * |00pppppp| - 1 byte
 *      String value with length less than or equal to 63 bytes (6 bits).
 *      "pppppp" represents the unsigned 6 bit length.
 * |01pppppp|qqqqqqqq| - 2 bytes
 *      String value with length less than or equal to 16383 bytes (14 bits).
 *      IMPORTANT: The 14 bit number is stored in big endian.
 * |10000000|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 bytes
 *      String value with length greater than or equal to 16384 bytes.
 *      Only the 4 bytes following the first byte represents the length
 *      up to 2^32-1. The 6 lower bits of the first byte are not used and
 *      are set to zero.
 *      IMPORTANT: The 32 bit number is stored in big endian.
 * |11000000| - 3 bytes
 *      Integer encoded as int16_t (2 bytes).
 * |11010000| - 5 bytes
 *      Integer encoded as int32_t (4 bytes).
 * |11100000| - 9 bytes
 *      Integer encoded as int64_t (8 bytes).
 * |11110000| - 4 bytes
 *      Integer encoded as 24 bit signed (3 bytes).
 * |11111110| - 2 bytes
 *      Integer encoded as 8 bit signed (1 byte).
 * |1111xxxx| - (with xxxx between 0001 and 1101) immediate 4 bit integer.
 *      Unsigned integer from 0 to 12. The encoded value is actually from
 *      1 to 13 because 0000 and 1111 can not be used, so 1 should be
 *      subtracted from the encoded 4 bit value to obtain the right value.
 * |11111111| - End of ziplist special entry.
 *
 * Like for the ziplist header, all the integers are represented in little
 * endian byte order, even when this code is compiled in big endian systems.
 *
 * EXAMPLES OF ACTUAL ZIPLISTS
 * ===========================
 *
 * The following is a ziplist containing the two elements representing
 * the strings "2" and "5". It is composed of 15 bytes, that we visually
 * split into sections:
 *
 *  [0f 00 00 00] [0c 00 00 00] [02 00] [00 f3] [02 f6] [ff]
 *        |             |          |       |       |     |
 *     zlbytes        zltail    entries   "2"     "5"   end
 *
 * The first 4 bytes represent the number 15, that is the number of bytes
 * the whole ziplist is composed of. The second 4 bytes are the offset
 * at which the last ziplist entry is found, that is 12, in fact the
 * last entry, that is "5", is at offset 12 inside the ziplist.
 * The next 16 bit integer represents the number of elements inside the
 * ziplist, its value is 2 since there are just two elements inside.
 * Finally "00 f3" is the first entry representing the number 2. It is
 * composed of the previous entry length, which is zero because this is
 * our first entry, and the byte F3 which corresponds to the encoding
 * |1111xxxx| with xxxx between 0001 and 1101. We need to remove the "F"
 * higher order bits 1111, and subtract 1 from the "3", so the entry value
 * is "2". The next entry has a prevlen of 02, since the first entry is
 * composed of exactly two bytes. The entry itself, F6, is encoded exactly
 * like the first entry, and 6-1 = 5, so the value of the entry is 5.
 * Finally the special entry FF signals the end of the ziplist.
 *
 * Adding another element to the above string with the value "Hello World"
 * allows us to show how the ziplist encodes small strings. We'll just show
 * the hex dump of the entry itself. Imagine the bytes as following the
 * entry that stores "5" in the ziplist above:
 *
 * [02] [0b] [48 65 6c 6c 6f 20 57 6f 72 6c 64]
 *
 * The first byte, 02, is the length of the previous entry. The next
 * byte represents the encoding in the pattern |00pppppp| that means
 * that the entry is a string of length <xxxxxx>, so 0B means that
 * an 11 bytes string follows. From the third byte (48) to the last (64)
 * there are just the ASCII characters for "Hello World".
 *
 * Copyright (c) 2009-2012, Pieter Noordhuis
 * Copyright (c) 2009-2017, Salvatore Sanfilippo
 * All rights reserved.
 */
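To check the header description against the 15-byte example in the comment, here is a small sketch that reads the three header fields by hand. `ZlHeader`, `parse_header`, `read_le32`, `read_le16` and `EXAMPLE` are names made up for this sketch. Redis itself reaches these fields through its own macros, which are not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read little-endian integers byte by byte so the result does not
 * depend on the host's endianness. */
static uint32_t read_le32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint16_t read_le16(const unsigned char *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Hypothetical decoded view of the 10-byte ziplist header. */
typedef struct {
    uint32_t zlbytes; /* total size of the ziplist in bytes */
    uint32_t zltail;  /* offset of the last entry */
    uint16_t zllen;   /* entry count, or 65535 meaning "walk to count" */
} ZlHeader;

/* Parse the header. Returns 1 if the buffer looks well formed:
 * its size matches zlbytes and its last byte is the 0xFF end marker. */
static int parse_header(const unsigned char *zl, size_t buflen, ZlHeader *h) {
    assert(zl != NULL && h != NULL);
    if (buflen < 11) return 0; /* 10-byte header + 1-byte end marker */
    h->zlbytes = read_le32(zl);
    h->zltail  = read_le32(zl + 4);
    h->zllen   = read_le16(zl + 8);
    return h->zlbytes == buflen && zl[buflen - 1] == 0xFF;
}

/* The 15-byte example from the comment: the values 2 and 5. */
static const unsigned char EXAMPLE[15] = {
    0x0f, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x00, 0x00,
    0x02, 0x00, 0x00, 0xf3, 0x02, 0xf6, 0xff
};
```

Calling `parse_header(EXAMPLE, sizeof(EXAMPLE), &h)` fills in zlbytes = 15, zltail = 12 and zllen = 2, matching the walkthrough in the comment.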


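Done reading? Next come the basic operations. For any data structure, those boil down to insert, delete, lookup, and update.

The actual node

Note that zlentry is not how an entry is laid out in memory. It is a decoded view that Redis fills in so an entry is easier to work with.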

typedef struct zlentry {
    unsigned int prevrawlensize; /* Bytes used to encode the previous entry len*/
    unsigned int prevrawlen;     /* Previous entry len. */
    unsigned int lensize;        /* Bytes used to encode this entry type/len.
                                    For example strings have a 1, 2 or 5 bytes
                                    header. Integers always use a single byte.*/
    unsigned int len;            /* Bytes used to represent the actual entry.
                                    For strings this is just the string length
                                    while for integers it is 1, 2, 3, 4, 8 or
                                    0 (for 4 bit immediate) depending on the
                                    number range. */
    unsigned int headersize;     /* prevrawlensize + lensize. */
    unsigned char encoding;      /* Set to ZIP_STR_* or ZIP_INT_* depending on
                                    the entry encoding. However for 4 bits
                                    immediate integers this can assume a range
                                    of values and must be range-checked. */
    unsigned char *p;            /* Pointer to the very start of the entry, that
                                    is, this points to prev-entry-len field. */
} zlentry;

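Basic operations


I think this figure is worth showing again:

[Figure: ziplist structure diagram]

One thing to note in this figure: the right-hand side does not record "the size of the current element".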
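Since an entry carries the length of the previous entry rather than its own total size, walking from tail to head comes down to starting at zltail and repeatedly subtracting prevlen. The sketch below shows that idea on the 15-byte example from the ziplist.c comment. `walk_backward`, `entry_prevlen` and `EXAMPLE_ZL` are names invented here, and the bounds checks are my own simplification rather than what Redis does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ZL_HEADER_SIZE 10 /* zlbytes (4) + zltail (4) + zllen (2) */

static uint32_t read_le32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the prevlen field at the start of an entry:
 * one byte below 254, otherwise 0xFE followed by 4 bytes. */
static uint32_t entry_prevlen(const unsigned char *p) {
    if (p[0] < 254) return p[0];
    return read_le32(p + 1);
}

/* Walk from the tail entry to the head, storing each entry's offset
 * in out[] (tail first). Returns how many entries were visited. */
static size_t walk_backward(const unsigned char *zl, size_t *out, size_t cap) {
    assert(zl != NULL && out != NULL);
    size_t n = 0;
    uint32_t off = read_le32(zl + 4); /* zltail */
    if (zl[off] == 0xFF) return 0;    /* tail points at the end marker: empty */
    while (n < cap) {
        out[n++] = off;
        if (off <= ZL_HEADER_SIZE) break; /* reached the first entry */
        uint32_t prevlen = entry_prevlen(zl + off);
        /* stop on data that would step before the first entry */
        if (prevlen == 0 || prevlen > off - ZL_HEADER_SIZE) break;
        off -= prevlen;
    }
    return n;
}

/* The 15-byte example from the comment: the values 2 and 5. */
static const unsigned char EXAMPLE_ZL[15] = {
    0x0f, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x00, 0x00,
    0x02, 0x00, 0x00, 0xf3, 0x02, 0xf6, 0xff
};
```

On `EXAMPLE_ZL` the walk visits offset 12 (the entry holding 5) and then offset 10 (the entry holding 2), without ever scanning forward from the head.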

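The function that actually performs the insertion is this one:

Honestly, it makes my scalp tingle a little. So we'll stick with the usual approach and take it apart step by step.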

/* Insert item at "p". */
unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
    size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), reqlen;
    unsigned int prevlensize, prevlen = 0;
    size_t offset;
    int nextdiff = 0;
    unsigned char encoding = 0;
    long long value = 123456789; /* initialized to avoid warning. Using a value
                                    that is easy to see if for some reason
                                    we use it uninitialized. */
    zlentry tail;

    /* Find out prevlen for the entry that is inserted. */
    if (p[0] != ZIP_END) {
        ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
    } else {
        unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl);
        if (ptail[0] != ZIP_END) {
            prevlen = zipRawEntryLength(ptail);
        }
    }

    /* See if the entry can be encoded */
    if (zipTryEncoding(s,slen,&value,&encoding)) {
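The listing stops at the zipTryEncoding call. Going by the encoding table in the header comment, the question at this point is which integer encoding, if any, fits the value. The sketch below illustrates that choice and the number of bytes such an entry would need. `pick_int_encoding`, `int_payload_size` and `int_entry_size` are names invented for this sketch, not Redis functions. The real code works from the string s of length slen, so it must first decide whether that string is a number at all. That step is left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Pick the encoding byte for an integer, following the table in the
 * ziplist.c comment: smallest representation that fits wins. */
static unsigned char pick_int_encoding(int64_t v) {
    if (v >= 0 && v <= 12)                return (unsigned char)(0xF1 + v); /* |1111xxxx| */
    if (v >= INT8_MIN && v <= INT8_MAX)   return 0xFE; /* 8 bit signed  */
    if (v >= INT16_MIN && v <= INT16_MAX) return 0xC0; /* int16_t       */
    if (v >= -8388608 && v <= 8388607)    return 0xF0; /* 24 bit signed */
    if (v >= INT32_MIN && v <= INT32_MAX) return 0xD0; /* int32_t       */
    return 0xE0;                                       /* int64_t       */
}

/* Payload bytes that follow an integer encoding byte. */
static uint32_t int_payload_size(unsigned char enc) {
    assert(enc >= 0xC0); /* integer encodings have the top two bits set */
    switch (enc) {
    case 0xFE: return 1;
    case 0xC0: return 2;
    case 0xF0: return 3;
    case 0xD0: return 4;
    case 0xE0: return 8;
    default:   return 0; /* immediate value stored in the encoding byte */
    }
}

/* Bytes a new integer entry needs when the entry before it is
 * prevlen bytes long: prevlen field + encoding byte + payload. */
static uint32_t int_entry_size(uint32_t prevlen, int64_t v) {
    uint32_t prevlensize = prevlen < 254 ? 1 : 5;
    return prevlensize + 1 + int_payload_size(pick_int_encoding(v));
}
```

For the two values in the comment's example, `pick_int_encoding(2)` gives 0xF3 and `pick_int_encoding(5)` gives 0xF6, and each entry needs 2 bytes, which matches the `[00 f3] [02 f6]` bytes shown there.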
