Characteristics of dictionary data in Python: working with Python dictionaries (Dictionary) in data analysis

Latest recommended article published on 2023-05-25 22:53:58

weixin_39943868


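Today let's talk about how Python dictionaries are used in data analysis. To stay close to real-world practice, simple flat dictionaries are skipped.

The dictionaries discussed here are complex, nested structures like the following: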

{
    "id": "2406124091",
    "type": "node",
    "visible": "true",
    "created": {
        "version": "2",
        "changeset": "17206049",
        "timestamp": "2013-08-03T16:43:42Z",
        "user": "linuxUser16",
        "uid": "1219059"
    },
    "pos": [41.9757030, -87.6921867],
    "address": {
        "housenumber": "5157",
        "postcode": "60625",
        "street": "North Lincoln Ave"
    },
    "amenity": "restaurant",
    "cuisine": "mexican",
    "name": "La Cabana De Don Luis",
    "phone": "1 (773)-271-5176"
}

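This kind of data structure exists so that it can be conveniently written out as JSON or stored in MongoDB. To make the operations on such complex dictionaries easier to understand and master, let's run a few fun experiments: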
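As a rough illustration of the JSON use case mentioned above, the sketch below writes the sample record to a JSON string with the standard `json` module and reads it back. The variable names (`record`, `text`, `restored`) are only chosen for this example, and the record is written with the nested `created` and `address` braces closed.

```python
import json

# Sample record from the article, with the nested dictionaries closed.
record = {
    "id": "2406124091",
    "type": "node",
    "visible": "true",
    "created": {
        "version": "2",
        "changeset": "17206049",
        "timestamp": "2013-08-03T16:43:42Z",
        "user": "linuxUser16",
        "uid": "1219059",
    },
    "pos": [41.9757030, -87.6921867],
    "address": {
        "housenumber": "5157",
        "postcode": "60625",
        "street": "North Lincoln Ave",
    },
    "amenity": "restaurant",
    "cuisine": "mexican",
    "name": "La Cabana De Don Luis",
    "phone": "1 (773)-271-5176",
}

# Serialize to a JSON string; indent is only there for readability.
text = json.dumps(record, indent=2)

# Parse the string back into Python objects and read nested values.
restored = json.loads(text)
street = restored["address"]["street"]
latitude, longitude = restored["pos"]
print(street, latitude, longitude)
```

Nested values are reached by chaining keys, for example `restored["created"]["user"]`.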

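1. Can a complex dictionary be split into simple dictionaries?

If this complex structure is split into a few small, simple dictionaries or lists, it becomes much easier to handle: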

## First small dictionary
{"id": "2406124091",
 "type": "node",
 "visible": "true"}

## Second small dictionary
{"version": "2",
 "changeset": "17206049",
 "timestamp": "2013-08-03T16:43:42Z",
 "user": "linuxUser16",
 "uid": "1219059"}

## A small list
[41.9757030, -87.6921867]

## Third small dictionary
{"housenumber": "5157",
 "postcode": "60625",
 "street": "North Lincoln Ave"}

## Fourth small dictionary
{"amenity": "restaurant",
 "cuisine": "mexican",
 "name": "La Cabana De Don Luis",
 "phone": "1 (773)-271-5176"}

Next, let's see which approach can merge them back together:

d1 = {"id": "2406124091",

"type": "node",

"visible":"true"}

d2 = {"version":"2",

"changeset":"17206049",

"timestamp":"2013-08-03T16:43:42Z",

"user":"linuxUser16",

"uid":"1219059"}

l1 = [41.9757030, -87.6921867]

d3 = {"housenumber": "5157",

"postcode": "60625",

"street": "North Lincoln Ave"}

d4 = {"amenity": "restaurant",

"cuisine": "mexican",

"name": "La Cabana De Don Luis",

"phone": "1 (773)-271-5176"}

d = {d1,d2,l1,d3,d4}

#Traceback (most recent call last):

# File "", line 1, in

# d = {d1,d2,l1,d3,d4}

#TypeError: unhashable type: 'dict'

### A simple, brute-force merge; unfortunately it does not work.
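A short aside on why that line fails: braces that contain no `key: value` pairs build a set rather than a dictionary, and a set has to hash every element. Dictionaries and lists are mutable and therefore unhashable. The minimal sketch below reproduces the error with two tiny dictionaries (the names `merged`, `error_message` and `pairs` are only illustrative).

```python
d1 = {"id": "2406124091"}
d2 = {"type": "node"}

# Braces without "key: value" pairs build a set, and a set hashes each element.
try:
    merged = {d1, d2}
except TypeError as exc:
    error_message = str(exc)
    print("set literal failed:", error_message)

# Immutable values such as tuples of strings are hashable, so this set is fine.
pairs = {tuple(d1.items()), tuple(d2.items())}
print(len(pairs))
```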

### Try merging again after giving the nested parts a key (label).

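import pprint

d = dict(d1)  # work on a copy so that d1 itself is not modified
d['created'] = d2
d['pos'] = l1
d['address'] = d3
d = dict(d, **d4)  # merge the flat keys of d4 into the result
pprint.pprint(d)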

#{'address': {'housenumber': '5157',

# 'postcode': '60625',

# 'street': 'North Lincoln Ave'},

# 'amenity': 'restaurant',

# 'created': {'changeset': '17206049',

# 'timestamp': '2013-08-03T16:43:42Z',

# 'uid': '1219059',

# 'user': 'linuxUser16',

# 'version': '2'},

# 'cuisine': 'mexican',

# 'id': '2406124091',

# 'name': 'La Cabana De Don Luis',

# 'phone': '1 (773)-271-5176',

# 'pos': [41.975703, -87.6921867],

# 'type': 'node',

# 'visible': 'true'}

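### The complex dictionary has been merged, but the printed order does not match the original.
### Note: pprint sorts keys alphabetically by default, and dicts only guarantee insertion order from Python 3.7 on.
### Some applications strictly require a particular structure order, so the next step is to control the order.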

d = {'created':d2,'pos':l1,'address':d3}

pprint.pprint(d)

#{'address': {'housenumber': '5157',

# 'postcode': '60625',

# 'street': 'North Lincoln Ave'},

# 'created': {'changeset': '17206049',

# 'timestamp': '2013-08-03T16:43:42Z',

# 'uid': '1219059',

# 'user': 'linuxUser16',

# 'version': '2'},

# 'pos': [41.975703, -87.6921867]}

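### The ordered merge works for these three keys, but the keys of d1 and d4 cannot be placed in a controlled way like this:
### merging them in with dict() appends them at the end, which brings us back to the original problem.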
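For readers on a recent interpreter, here is a hedged sketch of one way to keep the order under control. It assumes Python 3.8 or later: dictionaries keep insertion order from 3.7 on, and `pprint.pprint` accepts `sort_dicts=False` from 3.8 on. Dictionary unpacking (`**`) is my own choice here, not something the experiment above used.

```python
import pprint

d1 = {"id": "2406124091", "type": "node", "visible": "true"}
d2 = {"version": "2",
      "changeset": "17206049",
      "timestamp": "2013-08-03T16:43:42Z",
      "user": "linuxUser16",
      "uid": "1219059"}
l1 = [41.9757030, -87.6921867]
d3 = {"housenumber": "5157", "postcode": "60625", "street": "North Lincoln Ave"}
d4 = {"amenity": "restaurant",
      "cuisine": "mexican",
      "name": "La Cabana De Don Luis",
      "phone": "1 (773)-271-5176"}

# Unpacking copies the flat keys in place; nested parts get explicit keys.
d = {**d1, "created": d2, "pos": l1, "address": d3, **d4}

# sort_dicts=False prints keys in insertion order instead of sorted order.
pprint.pprint(d, sort_dicts=False)
key_order = list(d)
```

Because the literal is evaluated left to right, the keys of `d1` come first, then the nested parts, then the keys of `d4`, and `d1` itself is left unchanged.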

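2. The previous experiment shows that two dictionaries can be merged directly, but the order of the structure is hard to control, so some parts need to be split further.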

d1 = {"id": "2406124091"}

d2 = {"type": "node"}

d3 = {"visible":"true"}

d4 = {"version":"2",

"changeset":"17206049",

"timestamp":"2013-08-03T16:43:42Z",

"user":"linuxUser16",

"uid":"1219059"}

l1 = [41.9757030, -87.6921867]

d5 = {"housenumber": "5157",

"postcode": "60625",

"street": "North Lincoln Ave"}

d6 = {"amenity": "restaurant"}

d7 = {"cuisine": "mexican"}

d8 ={"name": "La Cabana De Don Luis"}

d9 = {"phone": "1 (773)-271-5176"}

After splitting, the pieces are assembled like this:

d = {'id':d1,'type':d2,'visible':d3,'created':d4,'pos':l1,'address':d5,

'amenity':d6,'cuisine':d7,'name':d8,'phone':d9}

import pprint

pprint.pprint(d)

#{'address': {'housenumber': '5157',

# 'postcode': '60625',

# 'street': 'North Lincoln Ave'},

# 'amenity': {'amenity': 'restaurant'},

# 'created': {'changeset': '17206049',

# 'timestamp': '2013-08-03T16:43:42Z',

# 'uid': '1219059',

# 'user': 'linuxUser16',

# 'version': '2'},

# 'cuisine': {'cuisine': 'mexican'},

# 'id': {'id': '2406124091'},

# 'name': {'name': 'La Cabana De Don Luis'},

# 'phone': {'phone': '1 (773)-271-5176'},

# 'pos': [41.975703, -87.6921867],

# 'type': {'type': 'node'},

# 'visible': {'visible': 'true'}}

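### This is not the result we wanted: besides the jumbled order, the rebuilt dictionary has a clearly wrong structure,
### because every flat value is now wrapped in its own one-key dictionary. Let's try another way of building it: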

d = {d1,d2,d3,'created':d4,'pos':l1,'address':d5,d6,d7,d8,d9}  # SyntaxError: bare items and 'key': value pairs cannot be mixed

d = {d1,d2,d3,d4,l1,d5,d6,d7,d8,d9}

#Traceback (most recent call last):

# File "", line 1, in

# d = {d1,d2,d3,d4,l1,d5,d6,d7,d8,d9}

#TypeError: unhashable type: 'dict'

d = {d1,d2,d3,d4,'pos':l1,d5,d6,d7,d8,d9}

# File "", line 1

# d = {d1,d2,d3,d4,'pos':l1,d5,d6,d7,d8,d9}

# ^

#SyntaxError: invalid syntax

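### This way of building the dictionary was wishful thinking; the syntax is simply invalid.
### I am not going to try merging with dict() here, because l1 is a list and cannot be merged with that function.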
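One possible way around the list problem, sketched under my own assumptions: treat each piece either as a flat dictionary to merge in, or as a `(key, value)` pair to store under that key. The helper name `merge_parts` and the pair convention are hypothetical and not part of the original experiment.

```python
def merge_parts(parts):
    """Merge parts in order.

    A dict part is merged flat into the result; any other part is expected
    to be a (key, value) pair and is stored under that key.
    """
    merged = {}
    for part in parts:
        if isinstance(part, dict):
            merged.update(part)
        else:
            key, value = part
            merged[key] = value
    return merged


d1 = {"id": "2406124091"}
d2 = {"type": "node"}
d3 = {"visible": "true"}
d4 = {"version": "2", "changeset": "17206049",
      "timestamp": "2013-08-03T16:43:42Z",
      "user": "linuxUser16", "uid": "1219059"}
l1 = [41.9757030, -87.6921867]
d5 = {"housenumber": "5157", "postcode": "60625", "street": "North Lincoln Ave"}
d6 = {"amenity": "restaurant"}
d7 = {"cuisine": "mexican"}
d8 = {"name": "La Cabana De Don Luis"}
d9 = {"phone": "1 (773)-271-5176"}

# Nested parts are labelled with a key; flat parts are passed as they are.
d = merge_parts([d1, d2, d3, ("created", d4), ("pos", l1),
                 ("address", d5), d6, d7, d8, d9])
print(list(d))
```

On Python 3.7 or later the resulting keys follow the order of the list passed in.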

3. Fix the dictionary's format first, then fill in the values or content

d = {
    "id": "",
    "type": "",
    "visible": "",
    "created": {
        "version": "",
        "changeset": "",
        "timestamp": "",
        "user": "",
        "uid": ""
    },
    "pos": [0, 0],
    "address": {
        "housenumber": "",
        "postcode": "",
        "street": ""
    },
    "amenity": "",
    "cuisine": "",
    "name": "",
    "phone": ""
}

d['id']='2406124091'

d['address']['housenumber']='123456'

pprint.pprint(d['id'])

#'2406124091'

pprint.pprint(d['address'])

#{'housenumber': '123456', 'postcode': '', 'street': ''}

### The write succeeds. This approach needs to be combined with loops and conditionals. Its advantage is that the data structure stays fixed.
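To make the "loops and conditionals" remark concrete, here is a minimal sketch of filling the fixed template from a flat record. The helper `fill_template`, the flat input format, and the `lat` / `lon` key names are assumptions made up for this example.

```python
import copy

TEMPLATE = {
    "id": "", "type": "", "visible": "",
    "created": {"version": "", "changeset": "", "timestamp": "",
                "user": "", "uid": ""},
    "pos": [0, 0],
    "address": {"housenumber": "", "postcode": "", "street": ""},
    "amenity": "", "cuisine": "", "name": "", "phone": "",
}


def fill_template(template, flat):
    """Return a filled copy of template; the template itself is not changed."""
    record = copy.deepcopy(template)
    for key, value in flat.items():
        if key in record["created"]:
            record["created"][key] = value
        elif key in record["address"]:
            record["address"][key] = value
        elif key in ("lat", "lon"):  # hypothetical names for the two coordinates
            record["pos"][0 if key == "lat" else 1] = float(value)
        elif key in record and isinstance(record[key], str):
            record[key] = value
        # Any other key is ignored, so the structure stays fixed.
    return record


flat = {"id": "2406124091", "user": "linuxUser16",
        "street": "North Lincoln Ave",
        "lat": "41.9757030", "lon": "-87.6921867",
        "unknown": "ignored"}
record = fill_template(TEMPLATE, flat)
print(record["id"], record["created"]["user"], record["pos"])
```

Copying with `copy.deepcopy` matters here: a shallow copy would share the nested `created`, `pos` and `address` objects between every record built from the same template.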

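Finally, there is no single right answer in programming; all roads lead to Rome. The best approach is the one you find most comfortable and natural.

There is actually one more approach, using the eval() function, but since many experienced developers strongly discourage its use, it is not discussed here.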
