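Common Lua Optimization Tips
Common Lua optimization points:
1. Prefer local variables
Make variables local wherever possible, especially frequently used ones. Locals live in VM registers, while every global access is a table lookup, so this is one of the most important Lua optimizations.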
-- Using a global variable
function sumGlobal(n)
    local sum = 0
    for i = 1, n do
        sum = sum + GLOBAL_VALUE
    end
    return sum
end
-- Using a local variable
function sumLocal(n)
    local sum = 0
    local local_value = GLOBAL_VALUE
    for i = 1, n do
        sum = sum + local_value
    end
    return sum
end
GLOBAL_VALUE = 1
local n = 1000000
local start_time = os.clock()
sumGlobal(n)
print("sumGlobal:", os.clock() - start_time)
start_time = os.clock()
sumLocal(n)
print("sumLocal:", os.clock() - start_time)
-- Sample output (seconds; varies by machine and Lua version):
-- sumGlobal: 0.020
-- sumLocal: 0.010
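The same idea applies to library functions reached through a global table, such as math.sin. The following is a minimal sketch; the helper name sumSines is hypothetical and not part of the original text.

```lua
-- Hypothetical example: cache a library function in a local so the loop
-- does not have to look up `math` and then `sin` on every iteration.
local sin = math.sin

local function sumSines(n)
    local sum = 0
    for i = 1, n do
        sum = sum + sin(i)
    end
    return sum
end

print(sumSines(1000))
```

Whether the difference is measurable depends on how hot the loop is, so it is worth timing with os.clock() as in the listing above before applying this everywhere.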
2. Tables
Reduce table accesses
Table accesses also have a cost, so store frequently used table fields in local variables.
local t = {x = 1, y = 2, z = 3}
-- Accessing the table directly
function accessTableDirectly(n)
    local sum = 0
    for i = 1, n do
        sum = sum + t.x + t.y + t.z
    end
    return sum
end
-- Caching the table fields in locals
function accessTableLocally(n)
    local sum = 0
    local x, y, z = t.x, t.y, t.z
    for i = 1, n do
        sum = sum + x + y + z
    end
    return sum
end
n = 1000000
start_time = os.clock()
accessTableDirectly(n)
print("accessTableDirectly:", os.clock() - start_time)
start_time = os.clock()
accessTableLocally(n)
print("accessTableLocally:", os.clock() - start_time)
-- Sample output (seconds; varies by machine and Lua version):
-- accessTableDirectly: 0.030
-- accessTableLocally: 0.015
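Caching matters more when the value sits behind a chain of nested tables, because every dot is a separate lookup. A minimal sketch, assuming a hypothetical config table whose field names are made up for illustration:

```lua
-- Hypothetical nested configuration table (names are illustrative only).
local config = { player = { stats = { hp = 100, mp = 50 } } }

local function totalDirect(n)
    local total = 0
    for i = 1, n do
        -- Each field costs three table lookups: .player, .stats, then .hp / .mp
        total = total + config.player.stats.hp + config.player.stats.mp
    end
    return total
end

local function totalCached(n)
    local total = 0
    local stats = config.player.stats  -- resolve the chain once
    for i = 1, n do
        total = total + stats.hp + stats.mp
    end
    return total
end

print(totalDirect(10), totalCached(10))
```

The cached reference is only safe while config.player.stats is not replaced by a different table during the loop.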
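for loops
In Lua, a numeric for loop is usually faster than a generic for loop, because the generic form calls an iterator function on every iteration.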
local t = {}
for i = 1, 1000000 do
    t[i] = i
end
-- Generic for loop
function genericForLoop(n)
    local sum = 0
    for _, v in ipairs(t) do
        sum = sum + v
    end
    return sum
end
-- Numeric for loop
function numericForLoop(n)
    local sum = 0
    for i = 1, n do
        sum = sum + t[i]
    end
    return sum
end
n = 1000000
start_time = os.clock()
genericForLoop(n)
print("genericForLoop:", os.clock() - start_time)
start_time = os.clock()
numericForLoop(n)
print("numericForLoop:", os.clock() - start_time)
-- Sample output (seconds; varies by machine and Lua version):
-- genericForLoop: 0.060
-- numericForLoop: 0.030
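The two loop forms are not always interchangeable: ipairs stops at the first nil, while a numeric for visits every index up to the bound it is given. A small sketch with hypothetical helper names shows the difference on an array with a hole:

```lua
local data = {10, 20, nil, 40}  -- hole at index 3

local function sumIpairs(t)
    local sum = 0
    for _, v in ipairs(t) do  -- iteration ends at the first nil value
        sum = sum + v
    end
    return sum
end

local function sumNumeric(t, n)
    local sum = 0
    for i = 1, n do  -- visits every index from 1 to n
        local v = t[i]
        if v ~= nil then
            sum = sum + v
        end
    end
    return sum
end

print(sumIpairs(data))      -- prints 30
print(sumNumeric(data, 4))  -- prints 70
```

When switching a loop from ipairs to a numeric for, the length has to be known and any nil entries handled explicitly, as sumNumeric does here.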
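Preallocating table space
When creating a large table, allocating its size up front can improve performance, because the table does not have to be resized repeatedly as it grows.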
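-- Growing the table dynamically
function dynamicTable(n)
    local t = {}
    for i = 1, n do
        t[i] = i
    end
    return t
end
-- Preallocating the table size
-- Assumption: preallocation uses LuaJIT's table.new extension. Stock Lua 5.1-5.4
-- has no Lua-level equivalent (the C API offers lua_createtable), so this
-- falls back to {} there and behaves like dynamicTable.
local has_table_new, table_new = pcall(require, "table.new")
function preallocatedTable(n)
    local t = has_table_new and table_new(n, 0) or {}
    for i = 1, n do
        t[i] = i
    end
    return t
end
n = 1000000
start_time = os.clock()
dynamicTable(n)
print("dynamicTable:", os.clock() - start_time)
start_time = os.clock()
preallocatedTable(n)
print("preallocatedTable:", os.clock() - start_time)
-- Sample output (seconds; varies by machine and Lua version):
-- dynamicTable: 0.050
-- preallocatedTable: 0.040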
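For small tables of known size, plain Lua can size the table up front through the table constructor, since the constructor tells the VM how many slots are needed. A minimal sketch with hypothetical helper names:

```lua
-- Grows from empty: the array part may be resized as elements are added.
local function makePointGrow(x, y, z)
    local p = {}
    p[1] = x
    p[2] = y
    p[3] = z
    return p
end

-- Constructor form: all three slots are requested when the table is created.
local function makePointCtor(x, y, z)
    return {x, y, z}
end

local a = makePointGrow(1, 2, 3)
local b = makePointCtor(1, 2, 3)
print(#a, #b)
```

Both functions return tables with the same contents; only the allocation pattern differs, which matters mainly when many such small tables are created in a hot path.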
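Metatables
- Setting the metatable repeatedly:
A new metatable is built every time an object is created, causing many memory allocations and metatable initializations.
Because the metatable is rebuilt every time, this has the highest overhead.
- Predefined metatable:
The metatable is defined once in advance and attached when each object is created, which avoids rebuilding it.
Compared with rebuilding the metatable each time, performance improves significantly.
- Caching metamethods:
Storing the metamethod in a local variable avoids the metatable lookup on every property access.
This further reduces runtime lookup and call overhead and gives the best performance.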
-- Building a new metatable for every object
local start_time = os.clock()
local function createObject()
    local obj = {}
    -- A fresh metatable table is allocated on every call
    setmetatable(obj, {
        __index = function(t, k)
            return "value"
        end,
        __newindex = function(t, k, v)
            rawset(t, k, v)
        end,
    })
    return obj
end
for i = 1, 1000000 do
    local obj = createObject()
    local value = obj.some_key
end
print("Time taken with frequent setmetatable:", os.clock() - start_time)
--------------------------------------------------------------------------
-- Predefined metatable
local start_time = os.clock()
-- Defined once and shared by every object
local mt = {
    __index = function(t, k)
        return "value"
    end,
    __newindex = function(t, k, v)
        rawset(t, k, v)
    end,
}
local function createObject()
    local obj = {}
    setmetatable(obj, mt)
    return obj
end
for i = 1, 1000000 do
    local obj = createObject()
    local value = obj.some_key
end
print("Time taken with predefined metatable:", os.clock() - start_time)
--------------------------------------------------------------------------
-- Caching the metamethod
local start_time = os.clock()
local mt = {
    __index = function(t, k)
        return "value"
    end,
    __newindex = function(t, k, v)
        rawset(t, k, v)
    end,
}
local obj = setmetatable({}, mt)
local __index = mt.__index
-- Calls the metamethod directly, bypassing the metatable lookup.
-- Note: this loop creates no objects, so its timing is not directly
-- comparable with the two loops above.
for i = 1, 1000000 do
    local value = __index(obj, "some_key")
end
print("Time taken with cached metatable method:", os.clock() - start_time)
-- Sample output (varies by machine and Lua version):
-- Time taken with frequent setmetatable: 1.5 seconds
-- Time taken with predefined metatable: 0.3 seconds
-- Time taken with cached metatable method: 0.1 seconds
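A common way to apply the predefined-metatable idea is the class pattern, where one table serves both as the shared metatable and as the method table. The sketch below uses a hypothetical Point class; setting __index to a table rather than a function also avoids a Lua function call on each missing-key lookup.

```lua
-- Hypothetical `Point` class: one metatable shared by all instances.
local Point = {}
Point.__index = Point  -- missing keys are looked up in Point itself

function Point.new(x, y)
    -- Only the instance table is allocated here; the metatable is reused.
    return setmetatable({x = x, y = y}, Point)
end

function Point:lengthSquared()
    return self.x * self.x + self.y * self.y
end

local a, b = Point.new(3, 4), Point.new(1, 2)
print(a:lengthSquared())                   -- prints 25
print(getmetatable(a) == getmetatable(b))  -- prints true
```

Methods live in Point rather than in each instance, so creating an object costs one table allocation regardless of how many methods the class has.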
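3. Strings
- For a small number of string concatenations, the .. operator is convenient. When many strings are concatenated in a loop, however, its performance degrades significantly, because every use of .. creates a new string, which involves memory allocation and copying of the existing contents.
- table.concat joins the strings stored in a table and outperforms the .. operator, especially when concatenating a large number of strings.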
-- Using the string concatenation operator
-- Every loop iteration creates a new string and assigns it to str.
-- Each step allocates new memory and copies the existing contents, so the overhead is high.
function concatOperator(n)
    local str = ""
    for i = 1, n do
        str = str .. i
    end
    return str
end
-- Using table.concat
-- All pieces are stored in a table and then joined in a single table.concat call.
-- The result is built in one pass, with far fewer allocations and copies.
function concatTable(n)
    local t = {}
    for i = 1, n do
        t[#t + 1] = i
    end
    return table.concat(t)
end
n = 10000
start_time = os.clock()
concatOperator(n)
print("concatOperator:", os.clock() - start_time)
start_time = os.clock()
concatTable(n)
print("concatTable:", os.clock() - start_time)
-- Sample output (seconds; varies by machine and Lua version):
-- concatOperator: 2.500
-- concatTable: 0.050
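table.concat also accepts a separator as its second argument, which removes the need to append delimiters by hand. A minimal sketch, assuming a hypothetical joinFields helper that builds one comma-separated line:

```lua
-- Hypothetical helper: join arbitrary values into a comma-separated line.
local function joinFields(fields)
    local parts = {}
    for i = 1, #fields do
        -- table.concat only accepts strings and numbers, so convert first
        parts[i] = tostring(fields[i])
    end
    return table.concat(parts, ",")
end

print(joinFields({"id", 42, true}))  -- prints id,42,true
```

The tostring call is there because table.concat raises an error for values such as booleans or tables.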
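4. Avoid loading and compiling code at runtime
Avoid dynamically loading and compiling code at runtime where possible. For example, avoid frequent calls to loadstring or load to create and run Lua code dynamically, since every call runs the compiler.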
-- Compiling a new chunk on every iteration
local start_time = os.clock()
for i = 1, 1000000 do
    local code = "return " .. i
    local func = load(code)
    func()
end
print("Runtime compilation:", os.clock() - start_time)
-- Using an ordinary function, compiled once with the enclosing chunk
local start_time = os.clock()
for i = 1, 1000000 do
    local func = function() return i end
    func()
end
print("Avoid runtime compilation:", os.clock() - start_time)
-- Sample output (seconds; varies by machine and Lua version):
-- Runtime compilation: 10.0
-- Avoid runtime compilation: 0.5
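When dynamic code cannot be avoided, one option is to compile each distinct source string only once and reuse the resulting function. The following is a sketch under that assumption; compileCached and the cache table are hypothetical names, and the first line picks loadstring where it exists (Lua 5.1, LuaJIT) and load otherwise.

```lua
-- loadstring compiles strings in Lua 5.1/LuaJIT; load does so in Lua 5.2+.
local compile = loadstring or load

-- Hypothetical cache: source string -> compiled function.
local cache = {}

local function compileCached(code)
    local fn = cache[code]
    if fn == nil then
        fn = assert(compile(code))  -- raises an error on a syntax error
        cache[code] = fn
    end
    return fn
end

local f1 = compileCached("return 1 + 2")
local f2 = compileCached("return 1 + 2")  -- served from the cache
print(f1(), f1 == f2)
```

The cache grows with the number of distinct source strings, so this only pays off when the same snippets recur.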
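5. Avoid creating temporary objects frequently
Closures
Creating closures frequently has a cost, because each closure creation allocates memory and captures outer variables. Avoiding unnecessary closure creation inside loops can improve performance.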
-- Creating a closure directly inside the loop
local start_time = os.clock()
local function createClosures1()
    local closures = {}
    for i = 1, 1000000 do
        closures[i] = function() return i end
    end
    return closures
end
local closures = createClosures1()
print("Frequency of closure creation:", os.clock() - start_time)
-- Validate closures
for i = 1, 10 do
    print(closures[i]()) -- Should print 1, 2, ..., 10
end
------------------------------------------------------------------------------------
-- Creating the closure through a helper function
-- Note: createClosure2 still returns a new closure on every call, so this
-- version allocates as many closures as the first one.
local start_time = os.clock()
local function createClosures2()
    local closures = {}
    local function createClosure2(i)
        return function() return i end
    end
    for i = 1, 1000000 do
        closures[i] = createClosure2(i)
    end
    return closures
end
local closures = createClosures2()
print("Avoid frequency of closure creation:", os.clock() - start_time)
-- Validate closures
for i = 1, 10 do
    print(closures[i]()) -- Should print 1, 2, ..., 10
end
-- Sample output (seconds; varies by machine and Lua version):
-- Frequency of closure creation: 1.5
-- Avoid frequency of closure creation: 0.3
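A case where closure creation really can be removed from a loop is a callback that does not depend on the loop variable, such as a sort comparator. The sketch below uses hypothetical helper names; it defines the comparator once and passes the same function object on every iteration.

```lua
-- Inline version: the function expression is evaluated on every iteration
-- and may allocate a new function object each time (some Lua versions
-- cache closures that capture no variables).
local function sortAllInline(lists)
    for i = 1, #lists do
        table.sort(lists[i], function(a, b) return a > b end)
    end
end

-- Hoisted version: the comparator is created once and reused.
local function descending(a, b)
    return a > b
end

local function sortAllHoisted(lists)
    for i = 1, #lists do
        table.sort(lists[i], descending)
    end
end

local lists = {{1, 3, 2}, {5, 4, 6}}
sortAllHoisted(lists)
print(table.concat(lists[1], " "), table.concat(lists[2], " "))
```

Hoisting is only possible when the callback does not need values that change per iteration; otherwise a closure per iteration is unavoidable.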
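Tables
Creating tables frequently degrades performance, because each creation allocates memory and initializes the table structure. Reusing tables or allocating them in advance can improve performance.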
-- Creating a new table on every iteration
local start_time = os.clock()
local function createTables1()
    local tables = {}
    for i = 1, 1000000 do
        tables[i] = {x = i, y = i * 2}
    end
    return tables
end
local tables = createTables1()
print("Frequency of table creation:", os.clock() - start_time)
---------------------------------------------------------------------
-- Staging values in a reusable temporary table
-- Note: a new table is still allocated on every iteration (see the copy
-- below), so the number of allocations matches createTables1.
local start_time = os.clock()
local function createTables2()
    local tables = {}
    local tempTable = {x = 0, y = 0} -- Reusable table
    for i = 1, 1000000 do
        tempTable.x = i
        tempTable.y = i * 2
        tables[i] = {x = tempTable.x, y = tempTable.y} -- Copy values to new table
    end
    return tables
end
local tables = createTables2()
print("Avoid frequency of table creation:", os.clock() - start_time)
---------------------------------------------------------------------
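-- Reusing an existing entry when one is present
-- Note: tables starts out empty here, so tables[i] is always nil and a new
-- table is created on every iteration; reuse only happens if the same
-- tables array is filled again later.
local start_time = os.clock()
local function createTables()
    local tables = {}
    for i = 1, 1000000 do
        tables[i] = tables[i] or {x = 0, y = 0} -- Reuse existing table or create a new one
        tables[i].x = i
        tables[i].y = i * 2
    end
    return tables
end
local tables = createTables()
print("Further optimized table creation:", os.clock() - start_time)
-- Sample output (seconds; varies by machine and Lua version):
-- Frequency of table creation: 2.0
-- Avoid frequency of table creation: 1.2
-- Further optimized table creation: 0.8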
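Table reuse pays off when intermediate results do not need to be kept, for example when accumulating into a single result. A minimal sketch, assuming hypothetical helpers addInto and accumulate that write into a caller-supplied table instead of returning a new one:

```lua
-- Hypothetical helper: writes a + b into `out` instead of allocating a result.
local function addInto(out, a, b)
    out.x = a.x + b.x
    out.y = a.y + b.y
    return out
end

local function accumulate(points)
    local acc = {x = 0, y = 0}  -- allocated once, reused for every step
    for i = 1, #points do
        addInto(acc, acc, points[i])
    end
    return acc
end

local total = accumulate({{x = 1, y = 2}, {x = 3, y = 4}})
print(total.x, total.y)
```

A version that returned a fresh {x = ..., y = ...} from each addition would allocate one table per point; here the loop allocates none.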