01数据整理及邻接权重矩阵构建

最近看完了空间计量经济学的理论部分,因此打算开始学习一下实战,实战所使用的主要是GEODA家族的软件包们,首先还是打算先学习python的pysal包,毕竟还是更喜欢代码,而且相较于GEODA和GEODASPACE,写代码还是会更灵活一点。这一部分也打算写一个系列,这是第一块,数据读取及预处理,以及权重矩阵的一些知识和代码,这个系列主要侧重于代码,理论的话基本就不涉及啦,需要的可以学习下沈体雁,于瀚辰老师写的《空间计量经济学》。主要是借助《空间计量分析软件》一本书来学习,书中的pysal的版本应该是1.x的,而现在pysal更新到2.x了,很多书中的代码不一定可以,但是为了方便学习,我还是安装了pysal1.14.4,学会了之后改到2.x想必也是轻松的。

import numpy as np
import pysal as ps
import libpysal as lps
E:\Anaconda\lib\site-packages\pysal\__init__.py:65: VisibleDeprecationWarning: PySAL's API will be changed on 2018-12-31. The last release made with this API is version 1.14.4. A preview of the next API version is provided in the `pysalnext` package. The API changes and a guide on how to change imports is provided at https://migrating.pysal.org
  ), VisibleDeprecationWarning)
E:\Anaconda\lib\site-packages\libpysal\examples\remotes.py:26: UserWarning: Remote data sets not available. Check connection.
  warnings.warn("Remote data sets not available. Check connection.")

1.读取数据

读取数据并将数据转换成numpy格式。

db = ps.open(ps.examples.get_path('NAT.dbf'),'r')
db
DataTable: E:\Anaconda\lib\site-packages\pysal\examples\nat\NAT.dbf
print(len(db))
db.header
3085

['NAME',
 'STATE_NAME',
 'STATE_FIPS',
 'CNTY_FIPS',
 'FIPS',
 'STFIPS',...]
db.by_col('HR90')
[0.0,
 15.885623511,
 6.4624531472,
 6.9965017491,
 ...]
y = np.array([db.by_col('HR90')]).T
y
array([[ 0.        ],
       [15.88562351],
       [ 6.46245315],
       ...,
       [ 4.36732988],
       [ 3.72771194],
       [ 2.04885495]])
x_names = ['RD90','UE90']
x1=np.array([db.by_col(var) for var in x_names]).T
x1
array([[-0.80277374,  3.89479009],
       [-0.13548304, 16.8115942 ],
       [-0.27654392, 10.70079408],
       ...,
       [-1.33977047,  4.1028878 ],
       [-1.74736678,  3.35398107],
       [-0.36114466,  5.48855304]])
x1.shape
(3085, 2)

2.邻接权重矩阵

下面建立的邻接权重矩阵的拓扑图如下图所示:
在这里插入图片描述

# 对一个3*3的方针从0开始按顺序编号,此外,加一个独立的9方块
# 邻接权重矩阵的结构,字典的形式,键是观测值id,值是与观测值id相邻接的观测值id
neighbors = {0:[3,1],1:[0,4,2],2:[1,5],3:[0,6,4],4:[1,3,7,5],5:[2,4,8],6:[3,7],7:[4,6,8],8:[5,7],9:[]} 
# 与邻接权重矩阵结构相一致的权重,也是字典的形式,键是观测值id,值是和neighbors值对应的权重
weights = {0:[5,3],1:[3,2,4],2:[4,1],3:[5,7,2],4:[2,2,3,1],5:[1,1,8],6:[7,3],7:[3,3,8],8:[8,8],9:[]}
# 观测值序列的顺序
id_order = [9,0,3,6,1,4,7,2,5,8]
#创建pysal中的邻接权重矩阵
w = ps.W(neighbors=neighbors ,weights=weights,id_order=id_order)
w
E:\Anaconda\lib\site-packages\pysal\weights\weights.py:186: UserWarning: There is one disconnected observation (no neighbors)
  warnings.warn("There is one disconnected observation (no neighbors)")
E:\Anaconda\lib\site-packages\pysal\weights\weights.py:187: UserWarning: Island id: 9
  warnings.warn("Island id: %s" % str(self.islands[0]))

<pysal.weights.weights.W at 0x31a1e1b388>
print("邻接权重矩阵的权重:",w.weights)
print("邻接权重矩阵的观测序列:",w.id_order)
print("邻接权重矩阵的观测点个数:",w.n)
邻接权重矩阵的权重: {0: [5, 3], 1: [3, 2, 4], 2: [4, 1], 3: [5, 7, 2], 4: [2, 2, 3, 1], 5: [1, 1, 8], 6: [7, 3], 7: [3, 3, 8], 8: [8, 8], 9: []}
邻接权重矩阵的观测序列: [9, 0, 3, 6, 1, 4, 7, 2, 5, 8]
邻接权重矩阵的观测点个数: 10
# 输出字典,键值对的键按照观测序列顺序排列,键是观测序列id,值是该观测id的顺序是第几个
w.id2i
{9: 0, 0: 1, 3: 2, 6: 3, 1: 4, 4: 5, 7: 6, 2: 7, 5: 8, 8: 9}
# transform输出当前的权重值类型,默认为'O',即二进制
w.transform
'O'
w.transform = 'R' # R是行标准化
w.weights
('WARNING: ', 9, ' is an island (no neighbors)')

{0: [0.625, 0.375],
 1: [0.3333333333333333, 0.2222222222222222, 0.4444444444444444],
 2: [0.8, 0.2],
 3: [0.35714285714285715, 0.5, 0.14285714285714285],
 4: [0.25, 0.25, 0.375, 0.125],
 5: [0.1, 0.1, 0.8],
 6: [0.7, 0.3],
 7: [0.21428571428571427, 0.21428571428571427, 0.5714285714285714],
 8: [0.5, 0.5],
 9: []}
w.transformations  # 输出历史改动信息
{'O': {0: [5, 3],
  1: [3, 2, 4],
  2: [4, 1],
  3: [5, 7, 2],
  4: [2, 2, 3, 1],
  5: [1, 1, 8],
  6: [7, 3],
  7: [3, 3, 8],
  8: [8, 8],
  9: []},
 'R': {0: [0.625, 0.375],
  1: [0.3333333333333333, 0.2222222222222222, 0.4444444444444444],
  2: [0.8, 0.2],
  3: [0.35714285714285715, 0.5, 0.14285714285714285],
  4: [0.25, 0.25, 0.375, 0.125],
  5: [0.1, 0.1, 0.8],
  6: [0.7, 0.3],
  7: [0.21428571428571427, 0.21428571428571427, 0.5714285714285714],
  8: [0.5, 0.5],
  9: []}}
# 创建一个完整的numpy数组,包含两个元素,第一个是空间权重矩阵,第二个是id_order信息
# 第0行就是id为9的node的邻接权重信息,第0行的顺序也是按照id_order的
w.full()
(array([[0.        , 0.        , 0.        , 0.        , 0.        ,
         0.        , 0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.625     , 0.        , 0.375     ,
         0.        , 0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.35714286, 0.        , 0.5       , 0.        ,
         0.14285714, 0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.7       , 0.        , 0.        ,
         0.        , 0.3       , 0.        , 0.        , 0.        ],
        [0.        , 0.33333333, 0.        , 0.        , 0.        ,
         0.22222222, 0.        , 0.44444444, 0.        , 0.        ],
        [0.        , 0.        , 0.25      , 0.        , 0.25      ,
         0.        , 0.375     , 0.        , 0.125     , 0.        ],
        [0.        , 0.        , 0.        , 0.21428571, 0.        ,
         0.21428571, 0.        , 0.        , 0.        , 0.57142857],
        [0.        , 0.        , 0.        , 0.        , 0.8       ,
         0.        , 0.        , 0.        , 0.2       , 0.        ],
        [0.        , 0.        , 0.        , 0.        , 0.        ,
         0.1       , 0.        , 0.1       , 0.        , 0.8       ],
        [0.        , 0.        , 0.        , 0.        , 0.        ,
         0.        , 0.5       , 0.        , 0.5       , 0.        ]]),
 [9, 0, 3, 6, 1, 4, 7, 2, 5, 8])
# 将字典存储的空间权重对象转换成稀疏空间权重对象(WSP类)
w.towsp()
<pysal.weights.weights.WSP at 0x31a1e44dc8>
# 直接从shape文件读入邻接权重矩阵,且读入的空间邻接是queen邻接的
wq = ps.queen_from_shapefile(ps.examples.get_path('NAT.shp'),idVariable='FIPSNO')
wq
<pysal.weights.Contiguity.Queen at 0x319b3e0908>
wq.weights
{27077: [1.0, 1.0, 1.0],
 27007: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
 ...}
wq.transform
'O'
wq.transform='R'
wq.s0 #权重矩阵元素之和
3085.0
wq.towsp()
<pysal.weights.weights.WSP at 0x31a39d39c8>
# 生成空间滞后变量
print("使用之前生成的x1做实验:\n",x1)
vlag = ps.lag_spatial(wq , x1)
vlag
使用之前生成的x1做实验:
 [[-0.80277374  3.89479009]
 [-0.13548304 16.8115942 ]
 [-0.27654392 10.70079408]
 ...
 [-1.33977047  4.1028878 ]
 [-1.74736678  3.35398107]
 [-0.36114466  5.48855304]]


array([[-0.3140736 ,  6.35152828],
       [-0.18933189,  8.36436575],
       [-0.24911753, 10.81645517],
       ...,
       [-0.11334696,  5.52220645],
       [-1.8902103 ,  2.58195495],
       [-0.60620542,  4.46367014]])
print(x1.shape)
print(vlag.shape)
(3085, 2)
(3085, 2)
  • 4
    点赞
  • 10
    收藏
    觉得还不错? 一键收藏
  • 1
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值