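Preface
This post presents an optimized tool for handling special characters in vulnerability-database XML documents.
1. Online regex testing site
When writing a regular expression, you can check that it works at https://regex101.com/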
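A pattern tried out on the site can also be checked locally with Python's `re` module. The following is only a minimal sketch: the pattern is the `<name>` pattern used later in this post, while the sample line is made up for illustration.

```python
import re

# Pattern from this post: opening tag, body, closing tag as three groups.
re_name = r'(<name>)(.*?)(<\/name>)'

# Hypothetical sample line, invented for this check.
sample = '<name>Foo <1.0 & Bar >2.0 overflow</name>'

match = re.search(re_name, sample)
body = match.group(2)  # the text between the tags, special characters included
print(body)
```

The non-greedy `.*?` stops at the first closing tag, so stray `<` and `>` characters inside the body stay in group 2.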
2. Detailed code
First, factor the file read and write helpers out into their own functions. The code is as follows (example):
import re


def read_file(Path):
    data = ''
    with open(Path, 'r', encoding='utf-8') as file:
        data = file.read()  # read the whole file as text
    return data


def write_file(Content):
    # Mode 'a' appends to test.xml, so repeated runs accumulate output.
    with open('test.xml', 'a', encoding='utf-8') as file:
        file.write(Content)
    return True
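Because `write_file` opens `test.xml` in append mode, running the tool twice leaves two copies of the output in the file. If that is not wanted, one possible variant (an assumption, not the author's code) takes the target path as a parameter and overwrites it. The name `write_file_to` is hypothetical.

```python
import os
import tempfile


def read_file(path):
    # Read the whole file as UTF-8 text.
    with open(path, 'r', encoding='utf-8') as file:
        return file.read()


def write_file_to(path, content):
    # Hypothetical variant: mode 'w' replaces any earlier output.
    with open(path, 'w', encoding='utf-8') as file:
        file.write(content)
    return True


# Write twice to a temporary file; only the second write should remain.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'out.xml')
    write_file_to(target, '<name>first</name>')
    write_file_to(target, '<name>second</name>')
    round_trip = read_file(target)
```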
2. Core regex replacement
Replace the <, > and & characters inside the XML element bodies. The code is as follows (example):
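def repl(matched):
    # Groups 1 and 3 are the opening and closing tags and stay unchanged;
    # only the body (group 2) has <, > and & replaced with text placeholders.
    body = matched.group(2)
    body = body.replace('<', '+小于+').replace('>', '+大于+').replace('&', '+AND+')
    return matched.group(1) + body + matched.group(3)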
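To see what the callback does, it can be run against a single element. This is a small sketch with a made-up sample string; the function and pattern are the ones from this post.

```python
import re


def repl(matched):
    # Replace the special characters in the body only, keeping both tags.
    body = matched.group(2)
    body = body.replace('<', '+小于+').replace('>', '+大于+').replace('&', '+AND+')
    return matched.group(1) + body + matched.group(3)


# Hypothetical input, invented for illustration.
sample = '<name>a<b & c>d</name>'
result = re.sub(r'(<name>)(.*?)(<\/name>)', repl, sample)
print(result)  # <name>a+小于+b +AND+ c+大于+d</name>
```

The tags themselves survive because `re.sub` passes them to `repl` as separate groups, and only group 2 goes through the `replace` chain.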
3. Processing the XML data
Match and replace within the XML data. The code is as follows (example):
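def deal_xml(data, Re_descript, Re_name):
    # Apply the placeholder replacement to each matched element body,
    # first with the description pattern, then with the name pattern.
    data = re.sub(Re_descript, repl, data)
    data = re.sub(Re_name, repl, data)
    return data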
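One thing to watch: by default `.` does not match a newline, so an element body that spans several lines is not matched by a `(.*?)` pattern and is left unchanged. Whether the input contains such bodies is an assumption here, since the post does not say. If it does, passing `flags=re.DOTALL` is one option. The name `deal_xml_multiline` below is hypothetical.

```python
import re


def repl(matched):
    body = matched.group(2)
    body = body.replace('<', '+小于+').replace('>', '+大于+').replace('&', '+AND+')
    return matched.group(1) + body + matched.group(3)


def deal_xml_multiline(data, re_descript, re_name):
    # Hypothetical variant of deal_xml: DOTALL lets '.' cross line breaks.
    data = re.sub(re_descript, repl, data, flags=re.DOTALL)
    data = re.sub(re_name, repl, data, flags=re.DOTALL)
    return data


re_descript = r'(<vuln-descript>)(.*?)(<\/vuln-descript>)'
re_name = r'(<name>)(.*?)(<\/name>)'

# Hypothetical two-line description, invented for illustration.
sample = '<vuln-descript>line one <x>\nline two & more</vuln-descript>'

without_flag = re.sub(re_descript, repl, sample)  # no match, text unchanged
with_flag = deal_xml_multiline(sample, re_descript, re_name)
```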
4. The complete code is as follows:
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
'''
@Project :PyCharm
@File :cnvd.py
@Author :trance
@Date :2023/8/24 10:20
'''
import re


def repl(matched):
    # Keep the tags (groups 1 and 3); replace <, > and & in the body (group 2).
    body = matched.group(2)
    body = body.replace('<', '+小于+').replace('>', '+大于+').replace('&', '+AND+')
    return matched.group(1) + body + matched.group(3)


def deal_xml(data, Re_descript, Re_name):
    data = re.sub(Re_descript, repl, data)
    data = re.sub(Re_name, repl, data)
    return data


def read_file(Path):
    data = ''
    with open(Path, 'r', encoding='utf-8') as file:
        data = file.read()  # read the whole file as text
    return data


def write_file(Content):
    # Mode 'a' appends to test.xml, so repeated runs accumulate output.
    with open('test.xml', 'a', encoding='utf-8') as file:
        file.write(Content)
    return True


def main():
    path = '8.24.xml'
    re_descript = r'(<vuln-descript>)(.*?)(<\/vuln-descript>)'
    re_name = r'(<name>)(.*?)(<\/name>)'
    data = read_file(path)
    content = deal_xml(data, re_descript, re_name)
    write_file(content)


if __name__ == '__main__':
    main()
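The text placeholders above are this tool's own convention. If the goal is instead well-formed XML that keeps the original characters, standard entity escaping is an alternative. The following is only a sketch of that alternative, not the author's method. It uses `escape` from the standard library's `xml.sax.saxutils`. The name `repl_entities` and the sample document are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def repl_entities(matched):
    # Hypothetical alternative to repl: turn &, < and > in the body
    # into &amp;, &lt; and &gt; instead of text placeholders.
    return matched.group(1) + escape(matched.group(2)) + matched.group(3)


# Hypothetical input, invented for illustration; not parseable as it stands.
sample = '<vuln><name>a<b & c>d</name></vuln>'
fixed = re.sub(r'(<name>)(.*?)(<\/name>)', repl_entities, sample)

# After escaping, a standard parser accepts the document and
# returns the original characters as the element text.
root = ET.fromstring(fixed)
name_text = root.find('name').text
```

Note that a body which already contains entities such as `&amp;` would be escaped a second time, so this variant suits input whose special characters are all raw.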
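Summary
Note: this post only covers handling special characters in vulnerability-database XML documents. The next post will present a small tool that converts the vulnerability-database XML documents to CSV format.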