Tables in Lua: Notes from Reading 《Lua设计与实现》 (Lua Design and Implementation)

zry963

Article link: https://blog.csdn.net/zry112233/article/details/80745468

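Overview

1. Lua uses tables to represent every data structure.

2. A Lua table has an array part and a hash part.

Indices in the array part start at 1.

The hash part can hold anything that cannot be stored in the array part; the only requirement is that the key is not nil (luaH_newkey, listed further down, also rejects NaN keys).

Data structures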

（lobject.h）
typedef struct Table {
  CommonHeader;
  lu_byte flags;  /* 1<<p means tagmethod(p) is not present */
  lu_byte lsizenode;  /* log2 of size of 'node' array */
  unsigned int sizearray;  /* size of 'array' array */
  TValue *array;  /* array part */
  Node *node;
  Node *lastfree;  /* any free position is before this position */
  struct Table *metatable;
  GCObject *gclist;
} Table;

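lu_byte flags; a one-byte bit field that caches which metamethods are absent. As the comment in the struct says, bit p set to 1 means tagmethod p is not present, so a repeated lookup for it can be skipped. It is a cache of missing metamethods, not a list of the ones provided.

lu_byte lsizenode; the base-2 logarithm of the hash part's size. The size of the hash part is therefore always a power of two, and when the node array has to grow, it grows to another power of two (typically by doubling).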
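To make the relationship between lsizenode and the size of the hash part concrete, here is a minimal C sketch. The helper names node_count and ceil_log2 are hypothetical and exist only for this illustration; the sketch also assumes a 32-bit unsigned int.

```c
#include <assert.h>

/* Hypothetical helper: number of slots in the hash part for a given lsizenode. */
static unsigned int node_count(unsigned char lsizenode) {
  assert(lsizenode < 32);  /* keep the shift well defined in this sketch */
  return 1u << lsizenode;
}

/* Hypothetical helper: smallest exponent e such that 2^e >= n. */
static unsigned char ceil_log2(unsigned int n) {
  unsigned char e = 0;
  assert(n >= 1 && n <= (1u << 31));
  while ((1u << e) < n) e++;
  return e;
}
```

For example, asking for room for 5 entries gives an exponent of 3 and therefore 8 slots, which is why only the exponent needs to be stored in a single byte.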

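unsigned int sizearray; the size of the array part.

TValue *array; pointer to the array part.

Node *node; pointer to the start of the table's node (hash bucket) array.

Node *lastfree; pointer used to find free nodes. It starts at the end of the node array, and every free position lies before it.

struct Table *metatable; the table's metatable.

GCObject *gclist; linked list used by the GC.

Node type definition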

typedef struct Node {
  TValue i_val;
  TKey i_key;
} Node;

TValue type definition

typedef struct lua_TValue {
  TValuefields;
} TValue;

TKey type definition

typedef union TKey {
  struct {
    TValuefields;
    int next;  /* for chaining (offset for next node) */
  } nk;
  TValue tvk;
} TKey;

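So a Lua table stores its data in two kinds of structures: an array and a hash table.

Algorithms

Because a table contains both a hash part and an array part, when a value with an integer key is written to a Lua table it is not known in advance whether it will land in the array or in the hash table.

1. Lookup

Pseudocode:

if the key is an integer and 0 < key <= array size:
    look it up in the array part
otherwise look it up in the hash part:
    compute the key's hash and use it to index the Node array, which gives the bucket position
    walk the chain of nodes in that bucket until the key is found (or the chain ends)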
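The lookup pseudocode above can be sketched in C as follows. This is a toy model under stated assumptions, not Lua's code: ToyTable, ToyNode, toy_get and NIL are hypothetical names, keys and values are plain ints, and the hash is just the key masked by the node count. Only the shape (array part first, then a chained walk that follows relative next offsets) mirrors the description.

```c
#include <assert.h>

#define NIL (-1)  /* toy stand-in for nil; real values here are ints >= 0 */

typedef struct ToyNode {
  int key;   /* toy: integer keys only */
  int val;   /* NIL marks an empty slot */
  int next;  /* offset to the next node in the chain, 0 = end of chain */
} ToyNode;

typedef struct ToyTable {
  int *array;             /* array part, holds keys 1..sizearray */
  unsigned int sizearray;
  ToyNode *node;          /* hash part */
  unsigned int sizenode;  /* 0 or a power of two */
} ToyTable;

/* Returns the value stored under key, or NIL if the key is absent. */
static int toy_get(const ToyTable *t, int key) {
  const ToyNode *n;
  if (key >= 1 && (unsigned int)key <= t->sizearray)
    return t->array[key - 1];        /* found in the array part */
  if (t->sizenode == 0) return NIL;  /* no hash part at all */
  assert((t->sizenode & (t->sizenode - 1)) == 0);
  n = &t->node[(unsigned int)key & (t->sizenode - 1)];  /* main position */
  for (;;) {                         /* walk the collision chain */
    if (n->val != NIL && n->key == key) return n->val;
    if (n->next == 0) return NIL;    /* chain ended without a match */
    n += n->next;
  }
}
```

With a 2-slot array part and a 4-slot hash part, keys 1 and 2 are answered from the array, while keys such as 100 and 4 (which collide on slot 0 in this toy hash) are found by following the next offset.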

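function print_ipairs(t)
  for k,v in ipairs(t) do
    print(k)
  end
end
local t = {}
t[1] = 0
t[100] = 0
print_ipairs(t) -- prints only 1: ipairs stops at the first nil, t[2]

Only key 1 is stored in the array part; key 100 is stored in the hash part.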

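2. Adding an element

Adding a new element is more involved, because it can trigger re-allocation of the table's array and hash parts.

The hash part is organized as follows: first compute the position in the node array that the key hashes to; this position is called the mainposition. Entries with the same mainposition are linked into a chain.

The API consists of three functions, luaH_set, luaH_setnum and luaH_setstr (the names used in the book; the listings in these notes appear to come from a later Lua version, where some names differ). They do not add or modify the value for the key themselves. Instead they return a pointer to the TValue found for that key, and the caller performs the actual assignment. When the key is not found, these functions end up calling the internal newkey function, which allocates a new key and returns its value slot.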

(ltable.c)
/*
** inserts a new key into a hash table; first, check whether key's main
** position is free. If not, check whether colliding node is in its main
** position or not: if it is not, move colliding node to an empty place and
** put new key in its main position; otherwise (colliding node is in its main
** position), new key goes to an empty position.
*/
TValue *luaH_newkey (lua_State *L, Table *t, const TValue *key) {
  Node *mp;
  TValue aux;
  if (ttisnil(key)) luaG_runerror(L, "table index is nil");
  else if (ttisfloat(key)) {
    lua_Integer k;
    if (luaV_tointeger(key, &k, 0)) {  /* does index fit in an integer? */
      setivalue(&aux, k);
      key = &aux;  /* insert it as an integer */
    }
    else if (luai_numisnan(fltvalue(key)))
      luaG_runerror(L, "table index is NaN");
  }
  mp = mainposition(t, key);
  if (!ttisnil(gval(mp)) || isdummy(t)) {  /* main position is taken? */
    Node *othern;
    Node *f = getfreepos(t);  /* get a free place */
    if (f == NULL) {  /* cannot find a free place? */
      rehash(L, t, key);  /* grow table */
      /* whatever called 'newkey' takes care of TM cache */
      return luaH_set(L, t, key);  /* insert key into grown table */
    }
    lua_assert(!isdummy(t));
    othern = mainposition(t, gkey(mp));
    if (othern != mp) {  /* is colliding node out of its main position? */
      /* yes; move colliding node into free position */
      while (othern + gnext(othern) != mp)  /* find previous */
        othern += gnext(othern);
      gnext(othern) = cast_int(f - othern);  /* rechain to point to 'f' */
      *f = *mp;  /* copy colliding node into free pos. (mp->next also goes) */
      if (gnext(mp) != 0) {
        gnext(f) += cast_int(mp - f);  /* correct 'next' */
        gnext(mp) = 0;  /* now 'mp' is free */
      }
      setnilvalue(gval(mp));
    }
    else {  /* colliding node is in its own main position */
      /* new node will go into free position */
      if (gnext(mp) != 0)
        gnext(f) = cast_int((mp + gnext(mp)) - f);  /* chain new position */
      else lua_assert(gnext(f) == 0);
      gnext(mp) = cast_int(f - mp);
      mp = f;
    }
  }
  setnodekey(L, &mp->i_key, key);
  luaC_barrierback(L, t, key);
  lua_assert(ttisnil(gval(mp)));
  return gval(mp);
}

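The flow is:

(1) Use the key to find its mainposition. If the Node at that position holds nil, store the key there and return a pointer to the Node's TValue.

(2) Otherwise the mainposition is already occupied, so a free Node is taken for the new key and linked into the corresponding chain. As the comment at the top of the function explains, if the occupying node is not in its own mainposition, it is moved to the free Node and the new key takes the mainposition; if it is, the new key goes into the free Node.

So the whole procedure takes place in the hash part. The reason is that even when the key is a number, a lookup was already performed before newkey was called and found nothing, so the key is placed in the hash part (unless a rehash is triggered and moves it into the array part).

When no free Node is available, the operation above has to re-allocate the table's storage: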

/*
** nums[i] = number of keys 'k' where 2^(i - 1) < k <= 2^i
*/
static void rehash (lua_State *L, Table *t, const TValue *ek) {
  unsigned int asize;  /* optimal size for array part */
  unsigned int na;  /* number of keys in the array part */
  unsigned int nums[MAXABITS + 1];
  int i;
  int totaluse;
  for (i = 0; i <= MAXABITS; i++) nums[i] = 0;  /* reset counts */
  na = numusearray(t, nums);  /* count keys in array part */
  totaluse = na;  /* all those keys are integer keys */
  totaluse += numusehash(t, nums, &na);  /* count keys in hash part */
  /* count extra key */
  na += countint(ek, nums);
  totaluse++;
  /* compute new size for array part */
  asize = computesizes(nums, &na);
  /* resize the table to new computed sizes */
  luaH_resize(L, t, asize, totaluse - na);
}

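(1) Allocate a counter array nums and set all of its entries to 0. Its meaning: nums[i] holds the number of keys k with 2^(i-1) < k <= 2^i.

(2) Traverse the array part of the table, count its elements, and update the matching entries of nums (numusearray).

(3) Traverse the hash part, because it may also hold positive integer keys, and update nums with those keys as well (numusehash).

(4) nums now holds the distribution of all positive integer keys in this Table (including the key being inserted, added by countint). Walk through nums and find the largest power of two n such that more than half of the slots 1..n would be in use; n becomes the array size after the rehash. Positive integer keys beyond this range go to the hash part (computesizes).

(5) Resize the table to the array and hash sizes computed above (luaH_resize).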
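The counting and the "more than half in use" rule from the steps above can be tried out with a small C sketch. It is a simplified model written for these notes, not the Lua source: toy_array_size, slice_of and TOY_MAXBITS are hypothetical names, the keys are passed in as a plain array of distinct integers, and the hash part is not modeled at all.

```c
#include <assert.h>

#define TOY_MAXBITS 26  /* hypothetical cap on the number of slices */

/* Slice index i such that 2^(i-1) < k <= 2^i (k >= 1). */
static int slice_of(unsigned int k) {
  int i = 0;
  assert(k >= 1 && k <= (1u << TOY_MAXBITS));
  while ((1u << i) < k) i++;
  return i;
}

/* Given distinct keys, returns the chosen array size (0 or a power of two).
** *na_out receives the number of keys that would live in the array part. */
static unsigned int toy_array_size(const unsigned int *keys, int nkeys,
                                   unsigned int *na_out) {
  unsigned int nums[TOY_MAXBITS + 1] = {0};
  unsigned int total = 0;   /* number of candidate integer keys */
  unsigned int a = 0;       /* keys seen so far that are <= twotoi */
  unsigned int na = 0;      /* keys that go to the array part */
  unsigned int optimal = 0; /* best array size found so far */
  unsigned int twotoi = 1;  /* 2^i, the candidate array size */
  int i, j;
  for (j = 0; j < nkeys; j++) {
    if (keys[j] >= 1 && keys[j] <= (1u << TOY_MAXBITS)) {
      nums[slice_of(keys[j])]++;
      total++;
    }
  }
  /* stop once even all keys together could not fill half of twotoi */
  for (i = 0; i <= TOY_MAXBITS && total > twotoi / 2; i++, twotoi *= 2) {
    a += nums[i];
    if (a > twotoi / 2) {   /* more than half of 1..twotoi is in use */
      optimal = twotoi;
      na = a;
    }
  }
  *na_out = na;
  return optimal;
}
```

For the keys 1 and 100 this yields an array size of 1 with one key in the array, so 100 is left for the hash part. For the keys 1 to 5 it yields an array size of 8, since 5 of 8 slots is still more than half.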

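As this shows, the same integer key may live in the array part or in the hash part at different stages of a table's life.

Lua's design philosophy is to be simple and efficient while using as little memory as possible.

Performance tip: when many small tables have to be created, pre-filling them avoids repeated rehash operations.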

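3. Iteration

Because a table has an array part and a hash part, iteration follows this pseudocode:

look up the key in the array part:
    if found, return the next entry after that key
otherwise look up the key in the hash part:
    if found, return the next entry after that key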
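One way to picture this traversal is to number all slots consecutively, array slots first and hash slots after them. The C sketch below does that for a toy table. It is an illustration under assumptions, not the Lua implementation: ToyTable, ToyNode, toy_next and NIL are hypothetical names, and the caller passes a position back in instead of the previous key, so the step that turns a key into its position is left out.

```c
#include <assert.h>

#define NIL (-1)  /* toy stand-in for nil */

typedef struct ToyNode { int key; int val; } ToyNode;

typedef struct ToyTable {
  int *array;             /* array part, keys 1..sizearray */
  unsigned int sizearray;
  ToyNode *node;          /* hash part (chaining omitted in this sketch) */
  unsigned int sizenode;
} ToyTable;

/* Advances a traversal. Pass pos = 0 to start, afterwards pass the value
** returned by the previous call. Returns 0 when the traversal is over;
** otherwise stores the entry in *key / *val and returns its position. */
static unsigned int toy_next(const ToyTable *t, unsigned int pos,
                             int *key, int *val) {
  unsigned int i, h;
  assert(pos <= t->sizearray + t->sizenode);
  for (i = pos; i < t->sizearray; i++) {  /* rest of the array part */
    if (t->array[i] != NIL) {
      *key = (int)i + 1;
      *val = t->array[i];
      return i + 1;
    }
  }
  h = (pos > t->sizearray) ? pos - t->sizearray : 0;
  for (; h < t->sizenode; h++) {          /* then the hash part */
    if (t->node[h].val != NIL) {
      *key = t->node[h].key;
      *val = t->node[h].val;
      return t->sizearray + h + 1;
    }
  }
  return 0;                               /* nothing left */
}
```

Empty slots are skipped in both parts, so a table whose array part holds 10, nil, 30 and whose hash part holds the key 100 is visited in the order 1, 3, 100.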

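4. The length operator

In Lua, # returns the length of a table.

print(#{10,20,nil,40}) -- prints 4 with the reference implementation (the last array slot is non-nil); 2 is also a valid border

When the table has no array part but has a hash part, the length is computed over the positive integer keys stored in the hash part.

print(#{[1]=1,[2]=2}) -- prints 2
print(#{[1]=1,[2]=2,[4]=4}) -- prints 4

When the table mixes both styles, the array part is consulted first.

print(#{[1]=1,[2]=2,1,2,3}) -- prints 3

Pseudocode:

if the table has an array part and its last slot is nil:
    binary search the array part for a position i such that t[i] != nil and t[i+1] == nil
otherwise no such position can be found inside the array part, so continue in the hash part:
    binary search the hash part for a position i such that t[i] != nil and t[i+1] == nil

The result is a border, that is, some i satisfying the condition, and not necessarily the largest one.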
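The array half of this search can be sketched in C. This is a minimal model with hypothetical names (toy_border, NIL) that works on a plain int array; the hash part is not modeled, so when the last array slot is non-nil the sketch simply returns the array size.

```c
#include <assert.h>

#define NIL (-1)  /* toy stand-in for nil */

/* Returns a border of the array part: an i with t[i] non-nil (or i == 0)
** and t[i+1] nil, using 1-based positions as Lua does. */
static unsigned int toy_border(const int *array, unsigned int size) {
  unsigned int i = 0;     /* position known to be non-nil (or 0) */
  unsigned int j = size;  /* position known to be nil once we search */
  assert(size == 0 || array != 0);
  if (size == 0 || array[size - 1] != NIL)
    return size;          /* no border inside the array part */
  while (j - i > 1) {     /* binary search between i and j */
    unsigned int m = (i + j) / 2;
    if (array[m - 1] == NIL) j = m;
    else i = m;
  }
  return i;
}
```

Because the search is binary rather than linear, an array with several holes can report any of its borders: for 10, nil, 30, nil this sketch returns 1, even though 3 would satisfy the condition as well.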
