First, prepare a log file, app.log, containing roughly 60,000 lines of data.
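If no suitable log file is at hand, one can be generated. The following is only a minimal sketch: the helper name `write_sample_log`, the record layout, and the temporary output location are all assumptions made for illustration, not the author's actual data.

```python
import os
import tempfile
from datetime import datetime, timedelta


def write_sample_log(path, n_lines=60000):
    """Write n_lines of made-up log records to path and return the count."""
    start = datetime(2024, 5, 13, 1, 0, 0)
    with open(path, "w", encoding="utf-8") as f:
        for i in range(n_lines):
            ts = start + timedelta(milliseconds=i)
            # The record layout is invented for illustration only.
            f.write(f"{ts.isoformat()} | INFO | worker:{i % 8} - processed item {i}\n")
    return n_lines


if __name__ == "__main__":
    # Written to a temporary directory so the sketch does not touch a real project.
    demo_path = os.path.join(tempfile.mkdtemp(), "app.log")
    print(write_sample_log(demo_path), "lines written to", demo_path)
```

With a file like this in place under the project's logs directory, the author's script that follows reads it and logs its line count.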
from loguru import logger
import os


def sample1():
    log_path = get_log_path()
    # Read the whole file into a list, one element per line
    with open(log_path, "r") as f:
        list_logs = f.readlines()
    logger.info("length of app.logs: {}".format(len(list_logs)))


# get project path
def get_project_path():
    return os.path.dirname(os.path.dirname(os.path.dirname(__file__)))


# get log path
def get_log_path():
    return os.path.join(get_project_path(), "logs", "app.log")


if __name__ == "__main__":
    sample1()
Output:
(.venv) [gateman@manjaro-x13 python_common_import]$ /home/gateman/Projects/python/python_common_import/.venv/bin/python /home/gateman/Projects/python/python_common_import/src/generator/gen_sample6.py
2024-05-13 01:16:19.932 | INFO | __main__:sample1:9 - length of app.logs: 62285
Writing the contents of app.log to output.log the ordinary way
Let's modify the file and add a function, sample2(), to do this:
from loguru import logger
import os
from src.decorator.sum_info import sum_info


@sum_info
def sample2():
    log_path = get_log_path()
    # Load every line of app.log into memory at once
    with open(log_path, "r") as f:
        list_logs = f.readlines()
    output_path = get_output_path()
    with open(output_path, "w") as f:
        for i in list_logs:
            f.write(i)
    logger.info("moved logs to output.log")


# get project path
def get_project_path():
    return os.path.dirname(os.path.dirname(os.path.dirname(__file__)))


# get log path
def get_log_path():
    return os.path.join(get_project_path(), "logs", "app.log")


# get output path
def get_output_path():
    return os.path.join(get_project_path(), "logs", "output.log")


if __name__ == "__main__":
    sample2()
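This function uses f.readlines() to read the entire file into a list in one go,
then loops over that list and writes each line to another file.
Let's look at the memory usage: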
(.venv) [gateman@manjaro-x13 python_common_import]$ /home/gateman/Projects/python/python_common_import/.venv/bin/python /home/gateman/Projects/python/python_common_import/src/generator/gen_sample5.py
2024-05-13 01:43:55.288 | INFO | src.decorator.print_time:wrapper:10 - Start time of sample2 is 2024-05-13 01:43:55
2024-05-13 01:43:55.343 | INFO | __main__:sample2:16 - moved logs to output.log
2024-05-13 01:43:55.351 | INFO | src.decorator.print_mem:wrapper:14 - Current memory usage is 0.000866MB; Peak was 9.868371MB
2024-05-13 01:43:55.352 | INFO | src.decorator.print_time:wrapper:13 - End time of sample2 is 2024-05-13 01:43:55
2024-05-13 01:43:55.352 | INFO | src.decorator.print_time:wrapper:14 - Time used of sample2 is 0.06403422355651855 seconds
The peak memory usage is close to 9.9 MB, because the entire file is read into memory.
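The post does not show the source of the `sum_info` decorator. Judging only from the log lines ("Current memory usage is ...; Peak was ..."), figures like these can be obtained with the standard-library `tracemalloc` module. The sketch below is a hypothetical stand-in, not the author's implementation; the names `sum_info_sketch` and `build_list` are made up, and it uses `print` instead of loguru to stay self-contained.

```python
import functools
import time
import tracemalloc


def sum_info_sketch(func):
    """Hypothetical stand-in for sum_info: report elapsed time and peak memory."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        tracemalloc.start()
        started = time.time()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.time() - started
            # get_traced_memory() returns (current, peak), both in bytes
            current, peak = tracemalloc.get_traced_memory()
            tracemalloc.stop()
            print(f"Current memory usage is {current / 10**6}MB; Peak was {peak / 10**6}MB")
            print(f"Time used of {func.__name__} is {elapsed} seconds")
    return wrapper


@sum_info_sketch
def build_list(n):
    # Toy workload: allocate a list of n integers
    return list(range(n))


if __name__ == "__main__":
    build_list(100_000)
```

Note that `tracemalloc` only counts memory allocated by Python while tracing is active, so the numbers describe the decorated call rather than the whole process.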
Using an iterator
Let's modify the file again and add a function, sample3(), to do this:
from loguru import logger
import os
from src.decorator.sum_info import sum_info


@sum_info
def sample3():
    log_path = get_log_path()
    output_path = get_output_path()
    count = 0
    with open(log_path, "r") as f:
        # "a" appends to output.log, whereas sample2 overwrote it with "w"
        with open(output_path, "a") as f2:
            # Iterate over the file object itself: one line at a time
            for i in f:
                f2.write(i)
                count += 1
    logger.info("moved {} logs to output.log".format(count))


# get project path
def get_project_path():
    return os.path.dirname(os.path.dirname(os.path.dirname(__file__)))


# get log path
def get_log_path():
    return os.path.join(get_project_path(), "logs", "app.log")


# get output path
def get_output_path():
    return os.path.join(get_project_path(), "logs", "output.log")


if __name__ == "__main__":
    sample3()
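Since f is actually a TextIOWrapper, which is an iterable,
we can iterate over it with for ... in. Each iteration yields a single line, so only the current line (plus a small internal buffer) is held in memory at any time.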
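A quick way to see this behavior is to inspect a file object directly. The snippet below is a small illustration rather than part of the original scripts; the file name demo.log and the variable names are arbitrary, and the file is written to a temporary directory.

```python
import io
import os
import tempfile

# Create a tiny three-line file to experiment with.
path = os.path.join(tempfile.mkdtemp(), "demo.log")
with open(path, "w", encoding="utf-8") as f:
    f.write("line 1\nline 2\nline 3\n")

with open(path, "r", encoding="utf-8") as f:
    is_text_wrapper = isinstance(f, io.TextIOWrapper)  # text-mode open() returns a TextIOWrapper
    is_own_iterator = iter(f) is f                     # the file object is its own iterator
    first = next(f)                                    # pulls exactly one line
    rest = list(f)                                     # iteration resumes after that line

print(is_text_wrapper, is_own_iterator)
print(repr(first), rest)
```

Because the file object is its own iterator, a for loop over it keeps calling `next()` on the same object and never needs the full list of lines.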
Memory usage of this approach:
(.venv) [gateman@manjaro-x13 python_common_import]$ /home/gateman/Projects/python/python_common_import/.venv/bin/python /home/gateman/Projects/python/python_common_import/src/generator/gen_sample7.py
2024-05-13 01:50:33.133 | INFO | src.decorator.print_time:wrapper:10 - Start time of sample3 is 2024-05-13 01:50:33
2024-05-13 01:50:33.229 | INFO | __main__:sample3:16 - moved 62320 logs to output.log
2024-05-13 01:50:33.229 | INFO | src.decorator.print_mem:wrapper:14 - Current memory usage is 0.00086MB; Peak was 0.041176MB
2024-05-13 01:50:33.230 | INFO | src.decorator.print_time:wrapper:13 - End time of sample3 is 2024-05-13 01:50:33
2024-05-13 01:50:33.230 | INFO | src.decorator.print_time:wrapper:14 - Time used of sample3 is 0.09714841842651367 seconds
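The peak is only about 0.04 MB, compared with about 9.9 MB for the readlines() version.
That is a huge memory saving!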
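To reproduce the comparison without the project layout or the `sum_info` decorator, the two approaches can be measured side by side on a synthetic file. This is a sketch under stated assumptions: the function names, the 20,000-line synthetic file, and the use of `tracemalloc` are illustrative choices, and the absolute numbers will differ from the logs above.

```python
import os
import tempfile
import tracemalloc


def copy_with_readlines(src, dst):
    # Loads every line into a list before writing anything
    with open(src, "r", encoding="utf-8") as f:
        lines = f.readlines()
    with open(dst, "w", encoding="utf-8") as out:
        for line in lines:
            out.write(line)


def copy_by_iterating(src, dst):
    # Streams the file: only one line is held at a time
    with open(src, "r", encoding="utf-8") as f, open(dst, "w", encoding="utf-8") as out:
        for line in f:
            out.write(line)


def peak_bytes(func, *args):
    """Run func(*args) under tracemalloc and return the peak traced size in bytes."""
    tracemalloc.start()
    try:
        func(*args)
        return tracemalloc.get_traced_memory()[1]
    finally:
        tracemalloc.stop()


def compare(n_lines=20000):
    """Return (peak for readlines copy, peak for iterating copy) in bytes."""
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "app.log")
    dst = os.path.join(workdir, "output.log")
    with open(src, "w", encoding="utf-8") as f:
        for i in range(n_lines):
            f.write(f"synthetic log record number {i:08d} with some padding text\n")
    return peak_bytes(copy_with_readlines, src, dst), peak_bytes(copy_by_iterating, src, dst)


if __name__ == "__main__":
    readlines_peak, iterating_peak = compare()
    print(f"readlines peak: {readlines_peak / 10**6:.3f} MB")
    print(f"iterating peak: {iterating_peak / 10**6:.3f} MB")
```

The peak of the readlines version grows with the size of the file, while the iterating version stays roughly constant, which matches the pattern in the author's measurements.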