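Overview:
To cut down on redundant stored data, I keep it in a tree-structured JSON file.
But I also want to look things up starting from the leaf nodes of the tree.
Is there a way to turn the tree into a flat, linear list?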
[{
"name":"heping",
"area":[{
"name":"一片",
"level":[{
"name":"重点",
"school":[{
"name":"鞍山道小学",
"district":null
},
{
"name":"万全小学",
"district":null
},
{
"name":"耀华小学",
"district":null
},
{
"name":"第二南开学校小学部",
"district":null
}]
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
},
{
"name":"二片",
"level":[{
"name":"重点",
"school":[{
"name":"天津实验小学",
"district":null
},
{
"name":"和平区中心小学",
"district":null
},
{
"name":"岳阳道小学",
"district":null
}]
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
},
{
"name":"三片",
"level":[{
"name":"重点",
"school":[{
"name":"昆明路小学",
"district":null
},
{
"name":"新华南路小学",
"district":null
},
{
"name":"二十中学附属小学",
"district":null
}]
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
}]
},
{
"name":"hexi",
"area":[{
"name":"一片",
"level":[{
"name":"重点",
"school":[{
"name":"台湾路小学",
"district":["解放南路","台湾路","台儿庄路","宁波道"]
},
{
"name":"上海道小学",
"district":["向荣里","文静里","四化里","湛江路新村","三合里","新闻西里","江门里","服务楼","新会里","生昌里","新闻里","综合楼","天达里","环友里","积庆里","静安里","轻纸楼","长安里","亚中花园","地毯楼"]
},
{
"name":"闽侯路小学",
"district":["西楼北里","敬重里","信昌大楼","福至里","宝德里","存诚里","无锡道大楼","浦口东里","积庆里","广田里","安德里","三义大厦","祺寿里","安辛庄","南浦大厦","久仰里","吉万里","东莱里","台北路","南通里","富邦花园","春梅楼","荣华小区","鸿华里","利合里","重建里","海运里","同善里","浦江大厦","海汇名邸","爱慕里","新立里","荣华大厦","白楼名邸","解放南路","尚品嘉园","汇通大厦","天庆里","宝和里","港建里","福建路","连云里","积余里","广发楼","西楼后街","积厚里","南园里","扬州里"]
}]
},
{
"name":"优秀",
"school":[{
"name":"湘江道小学",
"district":["东舍宅","南华里","晶采大厦","富裕广场","龙海公寓","海华里","云广新里","重华南里","福盛花园","香江花园","恒华公寓","重华西里","名仕达花园","福熙园","美宁公寓","盈海园","海景公寓","新海大厦","汇文名邸","侨馨园","建设楼","美满里","联合里","金福兆公寓","棉二大院","台北路","美化里","同善里","津沽名园","东莱里","重华里","美荷苑","鑫瑞名苑","美泉新苑","海润公寓","环海公寓","北洋新里","排水大楼","云景大厦","湘南里","糖业大楼","解放南路","宁波道","福建路","奉化道","闽侯路","湘江道"]
}]
},
{
"name":"普通",
"school":[{
"name":"恩得里小学",
"district":["珠海里","红波里","恩德西里","珠波里","连荣里","健美里","广顺园","安德公寓","长安里","黄埔里","恩德东里","盛瑞公寓","圣德园","亚中花园","红波公寓"]
}]
}]
},
{
"name":"二片",
"level":[{
"name":"重点",
"school":[{
"name":"河西中心小学",
"district":["寿园里","谊城公寓","育学里","西园西里","育文里","芳竹花园","可园东里","友谊花园","澳隆花园","顺园公寓","西园南里","鹤园里","可园里","西园北里"]
}]
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
},
{
"name":"三片",
"level":[{
"name":"重点",
"school":[{
"name":"师范大学第二附属小学",
"district":["景福里","爱国北里","前程里","爱国里","向东里","中钢大厦","新跃里","挺进里","鹤望里","祥和里","中裕园","大沽南路672-674","跃进楼","生辉里","天意里","中豪世纪花园","三商大楼","三商平房","新颖里","中豪国际汽车大厦","红星里","曙光里","福丰里","鹤望里"]
},
{
"name":"三水道小学",
"district":["三水南里","秀峰里","元山里","云山里","彭山里","双山里","同江里","汉江里","松江里","粤江里"]
},
{
"name":"华江里小学",
"district":["华江里","桂山里","君山里","嫩江里","川江里","兰江里","富江里","云江里","安江里","金江里"]
}]
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
}]
},
{
"name":"nankai",
"area":[{
"name":"北片",
"level":[{
"name":"重点",
"school":[{
"name":"五马路小学",
"district":null
},
{
"name":"南开中心小学",
"district":null
},
{
"name":"中营小学",
"district":null
}]
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
},
{
"name":"中片",
"level":[{
"name":"重点",
"school":[{
"name":"南开小学",
"district":null
},
{
"name":"天大附小",
"district":null
},
{
"name":"南大附小",
"district":null
}]
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
},
{
"name":"南片",
"level":[{
"name":"重点",
"school":[{
"name":"南开实验小学部",
"district":null
}]
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
}]
},
{
"name":"hebei",
"area":[{
"name":"一片",
"level":[{
"name":"重点",
"school":[{
"name":"昆纬路第一小学",
"district":null
},
{
"name":"河北区实验小学",
"district":null
}]
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
},
{
"name":"二片",
"level":[{
"name":"重点",
"school":[{
"name":"育婴里小学",
"district":null
}]
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
},
{
"name":"三片",
"level":[{
"name":"重点",
"school":null
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
}]
},
{
"name":"hedong",
"area":[{
"name":"一片",
"level":[{
"name":"重点",
"school":null
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
},
{
"name":"二片",
"level":[{
"name":"重点",
"school":null
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
},
{
"name":"三片",
"level":[{
"name":"重点",
"school":[{
"name":"河东实验小学",
"district":null
},
{
"name":"河东第二实验小学",
"district":null
}]
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
},
{
"name":"四片",
"level":[{
"name":"重点",
"school":null
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
}]
},
{
"name":"hongqiao",
"area":[{
"name":"一片",
"level":[{
"name":"重点",
"school":[{
"name":"红桥实验小学",
"district":null
}]
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
},
{
"name":"二片",
"level":[{
"name":"重点",
"school":[{
"name":"师范附属小学",
"district":null
}]
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
},
{
"name":"三片",
"level":[{
"name":"重点",
"school":null
},
{
"name":"优秀",
"school":null
},
{
"name":"普通",
"school":null
}]
}]
}]
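Save the JSON above as xuexiao2.json.
As you can see, this JSON file holds sample data for Tianjin, organized as:
city district (heping, hexi, ...), school zone (片), tier (重点 for key schools, then 优秀 and 普通), school, and residential community (the "district" arrays).
Note:
The data format and the algorithm go hand in hand, so a few basic conventions apply here:
1. Only [] (list), {} (dict) and str values are used (the sample also uses null for levels that have no data yet)
2. Each level's name is given by its own "name":"xxxxx" pair; the next level down is a list [], and the key of that list says what it holds
3. Leaf nodes are the final residential community names, written as plain strings (no need to make them key-value pairs)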
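Since the algorithm depends on these conventions, it can help to check a file against them before flattening it. The following is only a minimal sketch: the function name follows_conventions is my own, and it assumes null is tolerated the way the sample data uses it. It checks that every dict has a string "name" and that everything else is a list, dict, string or null.

```python
def follows_conventions(node):
    """Return True if node uses only list, dict, str (and None),
    and every dict carries a string "name"."""
    if node is None or isinstance(node, str):
        return True
    if isinstance(node, list):
        return all(follows_conventions(item) for item in node)
    if isinstance(node, dict):
        if not isinstance(node.get("name"), str):
            return False
        # every value other than "name" is checked recursively
        return all(follows_conventions(value)
                   for key, value in node.items() if key != "name")
    # numbers, booleans and anything else fall outside the conventions
    return False


good = [{"name": "hexi", "area": [{"name": "二片", "level": [
    {"name": "优秀", "school": None}]}]}]
bad = [{"name": "hexi", "area": [{"level": []}]}]  # inner dict has no "name"

print(follows_conventions(good), follows_conventions(bad))
```

With the two small trees above, this prints True False.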
The result we want looks like this (illustrative):
hexi|一片|重点|闽侯路小学|积厚里
hexi|一片|重点|闽侯路小学|南园里
hexi|一片|重点|闽侯路小学|扬州里
hexi|一片|优秀|湘江道小学|东舍宅
hexi|一片|优秀|湘江道小学|南华里
hexi|一片|优秀|湘江道小学|晶采大厦
hexi|一片|优秀|湘江道小学|富裕广场
hexi|一片|优秀|湘江道小学|龙海公寓
hexi|一片|普通|恩得里小学|亚中花园
hexi|一片|普通|恩得里小学|红波公寓
hexi|二片|重点|河西中心小学|寿园里
hexi|二片|重点|河西中心小学|谊城公寓
hexi|二片|重点|河西中心小学|育学里
Turning the tree into a list mainly comes down to walking the JSON tree recursively. Enough talk, here is the code:
def dump_json(input_json, pre):
    # Node types that can show up after json.load:
    #   dict  {"a": b}
    #   list  [...]
    #   str   "some like this"
    # (numbers, booleans and None can appear too; they are ignored here)
    if isinstance(input_json, dict):
        # Take the "name": "xxx" pair first and append it to the prefix,
        # so the result does not depend on the order of the keys
        name = input_json.get("name")
        if name is not None:
            pre = pre + str(name) + "|"
        for value in input_json.values():
            # value can be any of the JSON node types listed above
            if isinstance(value, (dict, list)):
                # recurse (pushes a new frame onto the call stack)
                dump_json(value, pre)
    elif isinstance(input_json, list):
        for i_json in input_json:
            # recurse into each element
            dump_json(i_json, pre)
    elif isinstance(input_json, str):
        # A leaf node is always a basic type; handle it here
        if len(input_json) > 0:
            # Change this line if you do not want the output on the screen
            print(pre + input_json)
The calling code looks like this:
import json

with open('xuexiao2.json', 'r', encoding='utf8') as fp:
    json_data = json.load(fp)

dump_json(json_data, "")
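If you would rather collect the rows than print them, one option is to write the same traversal as a generator. This is only a sketch of that idea, not a replacement for dump_json: the name iter_rows is my own, and the sample is a trimmed excerpt of the hexi data so that the block runs without the file.

```python
import json


def iter_rows(node, path=()):
    """Yield one tuple per leaf string: (name, name, ..., leaf)."""
    if isinstance(node, dict):
        name = node.get("name")
        if name is not None:
            path = path + (str(name),)
        for value in node.values():
            if isinstance(value, (dict, list)):
                yield from iter_rows(value, path)
    elif isinstance(node, list):
        for item in node:
            yield from iter_rows(item, path)
    elif isinstance(node, str) and node:
        # non-empty string: a leaf
        yield path + (node,)


# Trimmed excerpt of the sample data, inlined for illustration
sample = json.loads('''
[{"name": "hexi", "area": [{"name": "二片", "level": [
    {"name": "重点", "school": [
        {"name": "河西中心小学", "district": ["寿园里", "谊城公寓"]}]},
    {"name": "优秀", "school": null}]}]}]
''')

rows = list(iter_rows(sample))
for row in rows:
    print("|".join(row))
```

For this excerpt it prints hexi|二片|重点|河西中心小学|寿园里 and hexi|二片|重点|河西中心小学|谊城公寓. Keeping each row as a tuple leaves the choice of separator, file format or database to the caller.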
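This keeps the stored data intuitive and free of redundancy, while the flattened list makes it straightforward to look records up starting from a leaf.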
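To make the "look up from a leaf" part concrete, here is a rough sketch that builds a dictionary from each leaf name to every path that leads to it. The names iter_rows and build_leaf_index are my own, and the tree is a trimmed excerpt of the hexi data in which 解放南路 appears under two schools.

```python
from collections import defaultdict


def iter_rows(node, path=()):
    """Yield one tuple per leaf string: (name, name, ..., leaf)."""
    if isinstance(node, dict):
        name = node.get("name")
        if name is not None:
            path = path + (str(name),)
        for value in node.values():
            if isinstance(value, (dict, list)):
                yield from iter_rows(value, path)
    elif isinstance(node, list):
        for item in node:
            yield from iter_rows(item, path)
    elif isinstance(node, str) and node:
        yield path + (node,)


def build_leaf_index(tree):
    """Map each leaf to the list of parent paths that contain it."""
    index = defaultdict(list)
    for *parents, leaf in iter_rows(tree):
        index[leaf].append(tuple(parents))
    return dict(index)


# Trimmed excerpt of the sample data, inlined for illustration
tree = [{"name": "hexi", "area": [{"name": "一片", "level": [
    {"name": "重点", "school": [
        {"name": "台湾路小学", "district": ["解放南路", "台湾路"]}]},
    {"name": "优秀", "school": [
        {"name": "湘江道小学", "district": ["解放南路", "宁波道"]}]},
]}]}]

index = build_leaf_index(tree)
for path in index["解放南路"]:
    print("|".join(path))
```

For this excerpt it prints hexi|一片|重点|台湾路小学 and hexi|一片|优秀|湘江道小学. The index is built once in a single pass over the tree, after which each lookup is a plain dictionary access.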
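There are many situations where JSON is used to store a tree, for example China's administrative divisions, a disk's directory structure, or a company's org chart.
So besides name, each node could also carry other fields, such as a code (a postal code, or a multi-level administrative division code) or a unique identifier for the node.
But the essence of the data structure and the algorithm stays the same. The algorithm above is also easy to port to JS, C, C++ or Java; anyone interested is welcome to take it further.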
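It has been a long time since my last blog post, so I will stop here for today. Lately I have been scraping data on Tianjin school-district housing.
I am an older programmer these days; if you have a good position in Tianjin, Beijing or Shenzhen, feel free to get in touch.
And if you have a startup project you would like me to join, please contact me as well.
turui@163.net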