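Analysis
The sample file is linked at the end of this post.
Looking at the nbmx file, I guessed it was a compressed archive and tried extracting it, which produced a content.json file. Part of content.json looks like this: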
{
    "data": {
        "expandState": "expand",
        "text": "先进制造系统(Advanced Manufacturing System)",
        "font-family": "黑体, SimHei",
        "font-weight": "bold"
    },
    "children": [
        {
            "data": {
                "text": "第 1 章",
                "connect-color": "#7FBADF",
                "expandState": "expand"
            },
            "children": [
                {
                    "data": {
                        "text": "AMS特点",
                        "connect-color": "#7FBADF",
                        "expandState": "expand"
                    },
                    "children": [
                        {
                            "data": {
                                ... ...
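Approach
- Extract the nbmx mind-map file
- Read content.json
- Parse the JSON and recursively read the value of every "text" key, recording its depth in the hierarchy
- Save the result to a file in Markdown format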
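The first two steps can be sketched with the standard library alone. This is a minimal sketch, assuming the nbmx file is an ordinary ZIP archive (the analysis above only guesses that it is a compressed archive) and that content.json sits at the top level of the archive. The function name `read_content_json` is hypothetical.

```python
import json
import zipfile


def read_content_json(nbmx_path: str, member: str = "content.json") -> dict:
    """Read and parse one JSON member of a ZIP-style archive.

    Assumption: the .nbmx file is a regular ZIP archive and the member is
    named "content.json" at the archive root. Neither is confirmed by a
    format specification.
    """
    if not zipfile.is_zipfile(nbmx_path):
        raise ValueError(f"{nbmx_path} does not look like a ZIP archive")
    with zipfile.ZipFile(nbmx_path) as archive:
        # Read the member straight from the archive; nothing is written to disk
        with archive.open(member) as f:
            return json.loads(f.read().decode("utf-8"))
```

Calling `read_content_json("example.nbmx")` (a placeholder file name) would return the parsed dictionary directly, so the manual extraction step could be skipped if the ZIP assumption holds.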
Implementation
# -*- coding: utf-8 -*-
# Author: Allen Lv
# Time: 2021-05-14
# Please credit the source when reposting
import json

def getText(path: str) -> str:
    '''Return the text of content.json.'''
    with open(path, "r", encoding="utf-8") as f:
        return f.read()
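
def pickLines(info: dict, lines: list, depth=1) -> None:
    '''Walk the JSON dict and append one marked line per node to lines.'''
    # Take this node's text and tag it with its depth, e.g. "$2$ some text"
    ln = "${}$ {}".format(depth, info["data"]["text"])
    lines.append(ln)
    # Recurse into the children, one level deeper (leaf nodes have none)
    for child_info in info.get("children") or []:
        pickLines(child_info, lines, depth + 1)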
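
def mdFormat(lines: list) -> str:
    '''Convert each marked line to Markdown.'''
    md_lines = []
    for line in lines:
        # Split "$<depth>$ <text>" into the depth and the remaining text;
        # partition() also copes with depths of 10 or more
        depth_str, _, rest = line[1:].partition("$")
        i = int(depth_str)
        # The first three levels are headings, everything deeper is an unordered list
        if i <= 3:
            new_symbol = "#" * i
        else:
            # Two spaces per nesting level so Markdown renderers nest the items
            new_symbol = "  " * (i - 4) + "-"
        # Escape newlines inside the text so each node stays on one line
        rest = rest.replace("\n", "\\n")
        # Only the leading depth marker is swapped; the text itself is untouched
        md_lines.append(new_symbol + rest)
    # Return the Markdown text
    return "\n\n".join(md_lines)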

def saveMD(path: str, md_text: str) -> None:
    '''Save the text to a Markdown file.'''
    with open(path, "w", encoding="utf-8") as f:
        f.write(md_text)

def main():
    json_path = r".\content.json"
    json_text = getText(json_path)
    info = json.loads(json_text)
    lines = []
    pickLines(info, lines)
    md_text = mdFormat(lines)
    # Output file name (means "mind-map text extraction")
    md_path = r".\思维导图文字提取.md"
    saveMD(md_path, md_text)
    # The program finished successfully
    print(200)


if __name__ == "__main__":
    main()
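To make the intended output concrete, here is a compact sketch that converts a small hand-written tree straight to Markdown lines, without the intermediate `$depth$` marker. It is not the script above: `to_markdown` and the sample tree are hypothetical, the two-space indent per list level is my own choice, and the only thing taken from the source is the `data`/`children` layout seen in the content.json excerpt.

```python
def to_markdown(node: dict, depth: int = 1) -> list:
    """Return the Markdown lines for node and all of its descendants."""
    # Keep each node on one line by escaping embedded newlines
    text = node["data"]["text"].replace("\n", "\\n")
    if depth <= 3:
        # Levels 1 to 3 become headings
        prefix = "#" * depth
    else:
        # Deeper levels become bullets, indented two spaces per extra level
        prefix = "  " * (depth - 4) + "-"
    lines = [f"{prefix} {text}"]
    # Leaf nodes may have no "children" key at all
    for child in node.get("children") or []:
        lines.extend(to_markdown(child, depth + 1))
    return lines


# Hypothetical sample tree, shaped like the content.json excerpt
sample = {
    "data": {"text": "Root"},
    "children": [
        {"data": {"text": "Chapter 1"},
         "children": [
             {"data": {"text": "Topic"},
              "children": [
                  {"data": {"text": "Point A"},
                   "children": [{"data": {"text": "Detail"}}]},
              ]},
         ]},
    ],
}

print("\n\n".join(to_markdown(sample)))
```

For this sample the first three levels come out as `# Root`, `## Chapter 1` and `### Topic`, followed by the bullet `- Point A` with `- Detail` nested beneath it.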
Materials
Link to the sample file
Extraction code: c6k7