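I submitted a paper, and a reviewer commented: "No estimate of assortativity has been made. What about a k-core analysis?"
So here is my attempt to get to grips with assortativity.
What is it called in Chinese?
I don't know which Chinese term fits best either, so I will just say "assortativity" throughout.
What exactly is assortativity?
First, an explanation of assortativity based on my own understanding.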
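① There is a set of observed objects, each of which has some attribute, such as height, weight, or how much money it has. The attribute must be quantified, and its values must be comparable (larger or smaller, more or less).
② Then, following some rule, pick 2 objects from the set. For example, 2 objects that share a link, which is how the complex-network field does it; or a pair that are close in space, which is how spatial autocorrelation analysis does it. Check whether the attribute values of the two are close, or how far apart they are.
③ Then go through the pairs one by one like this, and finally give a verdict:
You lot (people, nodes, XX) are assortative (like attracts like)! Or: disassortative (unlike attracts unlike)!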
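The three steps above can be sketched in a few lines of plain Python. This is only my own illustration: the function name pair_assortativity and the toy values are made up, and the "verdict" is taken to be an ordinary Pearson correlation over the two ends of every pair, with each pair counted in both directions.

```python
from math import sqrt


def pair_assortativity(pairs, value):
    """Pearson correlation of an attribute across the two ends of each pair."""
    xs, ys = [], []
    for u, v in pairs:
        # Count each pair in both directions so the result is symmetric.
        xs += [value[u], value[v]]
        ys += [value[v], value[u]]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / sqrt(var_x * var_y)


# Step 1: objects with a quantified, comparable attribute (toy values).
value = {"a": 1, "b": 1, "c": 5, "d": 5}

# Step 2: two different rules for choosing pairs.
like_pairs = [("a", "b"), ("c", "d")]    # similar values paired together
unlike_pairs = [("a", "c"), ("b", "d")]  # dissimilar values paired together

# Step 3: go through all pairs and give the verdict.
r_like = pair_assortativity(like_pairs, value)      # assortative
r_unlike = pair_assortativity(unlike_pairs, value)  # disassortative
print(r_like, r_unlike)
```

With these toy values the first pairing comes out at +1 and the second at -1, the two extremes of the scale.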
The explanation given in the literature is as follows.
Assortativity is expressed as a scalar value, ρ, in the range −1 ≤ ρ ≤ 1. Degree assortativity is identified as ρ_D.
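It is measured by ρ, which takes values between -1 and 1. Assortativity can be computed for any attribute of the network, including but not limited to degree.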
A network is said to be assortative when high-degree nodes are, on average, connected to other nodes with high-degree and low-degree nodes are, on average, connected to other nodes with low degree.
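If a network is assortative with respect to degree, high-degree nodes tend to be connected to other high-degree nodes, and low-degree nodes tend to be connected to other low-degree nodes.
This is the familiar "birds of a feather flock together": students with good grades hang out with each other, and students with poor grades do the same.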
A network is said to be disassortative when, on average, high-degree nodes are connected to nodes with low(er) degree and, on average, low-degree nodes are connected to nodes with high(er) degree.
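If a network is disassortative with respect to degree, high-degree nodes tend to be connected to low-degree nodes, and low-degree nodes tend to be connected to high-degree nodes.
This has a flavor of "competitive selection": like picking a seat on the light rail, where you look for the less crowded spot, or positioning a product, where you look for a market in which competitors' share is not too high.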
Assortativity measures the similarity of connections in the graph with respect to the node degree.
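Here is one more explanation, from the official networkx documentation: assortativity measures the similarity of connections with respect to node degree (this explains next to nothing, so the explanation from the paper above is still the one to read; it is clear and to the point).
How it is computed
General formula: assortativity for any 2 random variables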
The original definition of assortativity (Newman [1]), for non-weighted, non-directed networks, is
based on the correlation between random variables. We define the linear correlation coefficient between two random variables X and Y as follows:
From this one derives the linear degree correlation coefficient for network degree:
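Written out, the two definitions take the following standard forms. This is my own transcription of the textbook expressions, not a copy of the equations as typeset in the papers, and the symbols are my choice: M is the number of edges, and j_i, k_i are the degrees of the nodes at the two ends of edge i.

```latex
% Linear (Pearson) correlation coefficient of two random variables X and Y
\rho(X, Y) = \frac{E[XY] - E[X]\,E[Y]}{\sigma_X \, \sigma_Y}

% Degree correlation of an undirected network, written as a sum over edges
r = \frac{M^{-1} \sum_i j_i k_i - \left[ M^{-1} \sum_i \tfrac{1}{2} (j_i + k_i) \right]^2}
         {M^{-1} \sum_i \tfrac{1}{2} (j_i^2 + k_i^2) - \left[ M^{-1} \sum_i \tfrac{1}{2} (j_i + k_i) \right]^2}
```

The second expression is just the first one with X and Y taken to be the degrees found at the two ends of a randomly chosen edge; because an undirected edge has no preferred direction, the two ends share the same mean and variance.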
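Computing assortativity with networkx
networkx is a powerful package for complex network analysis, and unsurprisingly it includes assortativity calculations.
https://networkx.github.io/documentation/stable/reference/algorithms/assortativity.html # this is the official documentation
The main function to use is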
degree_assortativity_coefficient()
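PS: the nodes parameter must be a list or another iterable. If it is omitted, assortativity is computed over all nodes of the network; if a list is given, it is computed for that subset of nodes only.
(The reviewer asked "what about a k-core analysis?"), which means one can pull out the nodes of the k-core and compute an assortativity for them separately.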
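A minimal sketch of what such a k-core calculation might look like. The karate club graph and k = 4 are illustrative choices of mine, not the data from the paper under review. Two options are shown: computing on the extracted k-core subgraph, and passing the core nodes through the nodes parameter. My understanding is that the second option still uses degrees measured in the full graph, so the two numbers need not agree; this is worth checking against the networkx version in use.

```python
import networkx as nx

G = nx.karate_club_graph()  # small built-in example graph

# Degree assortativity over the whole network.
r_all = nx.degree_assortativity_coefficient(G)

# Option 1: extract the k-core as its own graph, so that degrees are
# counted inside the core only.
k = 4  # illustrative value
core = nx.k_core(G, k=k)
r_core = nx.degree_assortativity_coefficient(core)

# Option 2: keep the full graph and restrict the calculation to the
# core nodes through the `nodes` parameter.
r_nodes = nx.degree_assortativity_coefficient(G, nodes=list(core.nodes()))

print(f"whole graph:     {r_all:.4f}")
print(f"{k}-core subgraph: {r_core:.4f}")
print(f"nodes= {k}-core:   {r_nodes:.4f}")
```

Reporting the whole-graph value next to the k-core value would answer both halves of the reviewer's comment at once.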
The computation uses Eq. (21) of reference ③.
degree_assortativity_coefficient(G, x='out', y='in', weight=None, nodes=None)
Compute degree assortativity of graph.
Assortativity measures the similarity of connections in the graph with respect to the node degree.
Parameters:
- G (NetworkX graph)
- x (string ('in', 'out')) – The degree type for source node (directed graphs only).
- y (string ('in', 'out')) – The degree type for target node (directed graphs only).
- weight (string or None, optional (default=None)) – The edge attribute that holds the numerical value used as a weight. If None, then each edge has weight 1. The degree is the sum of the edge weights adjacent to the node.
- nodes (list or iterable (optional)) – Compute degree assortativity only for nodes in container. The default is all nodes.
Returns: r – Assortativity of graph by degree.
Return type: float
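The next function computes the same thing as the one above, but uses a potentially faster algorithm (worth testing whether it really is faster).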
degree_pearson_correlation_coefficient(G, x='out', y='in', weight=None, nodes=None)
Compute degree assortativity of graph.
Assortativity measures the similarity of connections in the graph with respect to the node degree.
This is the same as degree_assortativity_coefficient but uses the potentially faster scipy.stats.pearsonr function.
Parameters:
- G (NetworkX graph)
- x (string ('in', 'out')) – The degree type for source node (directed graphs only).
- y (string ('in', 'out')) – The degree type for target node (directed graphs only).
- weight (string or None, optional (default=None)) – The edge attribute that holds the numerical value used as a weight. If None, then each edge has weight 1. The degree is the sum of the edge weights adjacent to the node.
- nodes (list or iterable (optional)) – Compute pearson correlation of degrees only for specified nodes. The default is all nodes.
Returns: r – Assortativity of graph by degree.
Return type: float
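A quick way to run the test suggested above: check that the two functions agree and time them. This is only a rough sketch; the karate club graph and the repeat count are arbitrary choices of mine, the timings depend on the machine and on graph size, and degree_pearson_correlation_coefficient needs SciPy to be installed.

```python
import timeit

import networkx as nx

G = nx.karate_club_graph()  # small built-in example graph

# Both functions should return the same coefficient, up to rounding.
r_newman = float(nx.degree_assortativity_coefficient(G))
r_pearson = float(nx.degree_pearson_correlation_coefficient(G))  # needs SciPy

# Rough timing: total seconds for 20 calls of each function.
t_newman = timeit.timeit(lambda: nx.degree_assortativity_coefficient(G), number=20)
t_pearson = timeit.timeit(lambda: nx.degree_pearson_correlation_coefficient(G), number=20)

print(f"degree_assortativity_coefficient:       r = {r_newman:.6f}, {t_newman:.4f} s")
print(f"degree_pearson_correlation_coefficient: r = {r_pearson:.6f}, {t_pearson:.4f} s")
```

On a graph this small the difference is unlikely to matter; the comparison becomes more informative on a network of realistic size.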
As for attribute_assortativity_coefficient() and numeric_assortativity_coefficient():
as I understand it, they exist to compute assortativity for attributes other than degree. Both need an attribute argument, given as a str or a number.
Node attribute key
The corresponding attribute value must be an integer
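A small sketch of how these two functions might be called. The graph (two triangles joined by one edge) and the attribute names "group" and "age" are hypothetical, invented for this example; "age" is kept integer-valued to respect the note above.

```python
import networkx as nx

# Two triangles joined by a single bridge edge (2, 3).
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])

# Hypothetical node attributes: a categorical label and an integer value.
for n in (0, 1, 2):
    G.nodes[n]["group"] = "a"
    G.nodes[n]["age"] = 20
for n in (3, 4, 5):
    G.nodes[n]["group"] = "b"
    G.nodes[n]["age"] = 30

# Categorical attribute: based on the mixing matrix of labels.
r_group = float(nx.attribute_assortativity_coefficient(G, "group"))

# Numeric attribute: correlation of the values at the two ends of each edge.
r_age = float(nx.numeric_assortativity_coefficient(G, "age"))

print(r_group, r_age)
```

Only 1 of the 7 edges crosses between the two groups, so both coefficients come out clearly positive (5/7 when worked out by hand).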
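The generality of assortativity
Assortativity is a general-purpose measure, implying that any attribute can be used to compute it.
Within complex networks, degree, betweenness centrality, closeness centrality, xxx... any measure can be plugged in and computed.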
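As an illustration of plugging in another measure, here is a sketch that uses betweenness centrality in place of degree. The star graph is my own choice because the answer is easy to see, and the correlation is computed directly with numpy.corrcoef over the two ends of every edge rather than with a dedicated networkx function.

```python
import networkx as nx
import numpy as np

G = nx.star_graph(4)  # node 0 is the hub, nodes 1..4 are leaves
bc = nx.betweenness_centrality(G)  # hub has high betweenness, leaves have 0

# Collect the attribute at both ends of every edge, in both directions.
x, y = [], []
for u, v in G.edges():
    x += [bc[u], bc[v]]
    y += [bc[v], bc[u]]

# Pearson correlation across edge ends = assortativity of this attribute.
r_bc = float(np.corrcoef(x, y)[0, 1])
print(r_bc)
```

Every edge of a star joins the high-betweenness hub to a zero-betweenness leaf, so the graph is as disassortative as possible in this attribute.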
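Assortativity and spatial autocorrelation
From the network point of view, assortativity analysis takes nodes that share a link (edge) as the pairs to compare, and works out how they relate to each other (positive or negative correlation).
From the spatial point of view, assortativity analysis takes nodes that are close together in space as the pairs to compare, and works out how they relate to each other (positive or negative correlation).
The only difference is how the pairs are chosen. The network view picks objects that already share a link; spatial autocorrelation picks objects that are close in space. Both then ask whether some attribute (height, build, income, degree, and so on) is associated with that relationship.
The variable spatial autocorrelation keys on is distance: when two objects are close, are their attribute values closer too? Or, put the other way round, when two objects have similar attribute values, are they closer in space? The variable network assortativity keys on is the existence of a link: when two nodes share a link, are their attribute values closer? Or, when two nodes have similar attribute values, do they tend to share a link? It is the same idea, and the two are in fact very close in their formulas as well.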
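To make the similarity of the formulas concrete, here is a sketch of Moran's I, a standard spatial autocorrelation statistic (my choice of example; the sources quoted here do not single out a statistic). The function name, the points on a line, and the distance threshold are all made up. Like the network coefficient, it is a covariance over chosen pairs divided by a variance; only the rule for choosing pairs changes, from "linked" to "close in space".

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a dict {(i, j): weight} of pairs."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(weights.values())
    # Cross-products of deviations, over the chosen (neighboring) pairs only.
    num = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    den = sum(d * d for d in dev)
    return (n / w_total) * num / den


# Four points on a line; two points are neighbors if at most 1 unit apart.
positions = [0, 1, 2, 3]
weights = {
    (i, j): 1
    for i in range(len(positions))
    for j in range(len(positions))
    if i != j and abs(positions[i] - positions[j]) <= 1
}

i_clustered = morans_i([1, 1, 5, 5], weights)    # similar values sit together
i_alternating = morans_i([1, 5, 1, 5], weights)  # neighbors always differ
print(i_clustered, i_alternating)
```

The clustered arrangement gives a positive value (1/3) and the alternating one gives -1, mirroring the assortative and disassortative cases on a network.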
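There is a Chinese-language article that covers this quite well and is worth a read:
《两个空间变量空间相关性的分析》 (Analysis of the spatial correlation between two spatial variables)
https://baike.baidu.com/item/%E7%A9%BA%E9%97%B4%E8%87%AA%E7%9B%B8%E5%85%B3%E5%88%86%E6%9E%90/5579665 # this Baidu Baike entry is also good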
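If the values of a variable become more similar as the measurement distance shrinks, the variable is spatially positively correlated; if the measured values become more different as the distance shrinks, it is spatially negatively correlated; if the measured values show no spatial dependence at all, the variable exhibits spatial non-correlation, or spatial randomness.
...determining the size of patches and the scale at which a given pattern appears...
Spatial autocorrelation depends on the choice of grid. In the extreme case, the grid cells are small enough that each cell contains at most 1 observed object. Then there is no need for an average value; the value is simply that of the single object.
The following 4 papers explain things thoroughly (my quotations come mostly from them):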
1. R. Noldus, and P. Van Mieghem, "Assortativity in complex networks," Journal of Complex Networks, vol. 3, no. 4, pp. 507-542, 2015.
2. G. Thedchanamoorthy, M. Piraveenan, D. Kasthuriratna, and U. Senanayake, "Node assortativity in complex networks: An alternative approach," 2014 International Conference on Computational Science, vol. 29, pp. 2449-2461, 2014.
3. M. E. J. Newman, "Mixing patterns in networks," Physical Review E, vol. 67, no. 2, 2003.
4. M. E. J. Newman, "Assortative mixing in networks," Physical Review Letters, vol. 89, no. 20, 2002.
PS:
① is a review paper that traces the history and development of assortativity;
④ is the founding paper: "the original definition of assortativity……"