计算网络节点模块内连通度(within modular degree)和模块间连通度(between modular degree)

道阻且长1994

已于 2024-04-24 21:04:03 修改

阅读量8.6k

点赞数 10

分类专栏：微生物共现网络构建及分析文章标签：学习 r语言

于 2022-03-10 17:49:12 首次发布

本文链接：https://blog.csdn.net/weixin_43367441/article/details/123403796

版权

微生物共现网络构建及分析专栏收录该内容

9 篇文章 92 订阅

订阅专栏

计算网络节点模块内连通度和模块间连通度

1.为什么要计算网络节点模块内连通度和模块间连通度

我们可以通过了解节点的模块内连通度和模块间连通度来对节点的重要性进行度量，具有较低的模块内连通度和模块间连通度的节点的重要性则相对较低,具体划分如下表（其中 $Z_i$ 为节点i的模块内连通度的z-scores， $P_i$ 为根据节点i的模块间连通度计算的参与系数，participation coefficient）：

	$Z_i$ < 2.5	$Z_i$ > 2.5
$P_i$ < 0.62	peripherals	modular hubs
$P_i$ > 0.62	connectors	network hubs

	$Z_i$ < 2.5	$Z_i$ > 2.5
$P_i$ < 0.62	不重要的节点	key species
$P_i$ > 0.62	key species	key species

2.网络节点模块内连通度（within modular degree）和模块间连通度(among modular degree)计算公式

模块内连通度z-scores计算公式：
$Z_i = \frac{K_i - \overline{K_{s_i}}}{\sigma_{K_{s_i}}}$
其中,
$Z_i$ 为节点 i 模块内连通度的z-scores；
$K_i$ 为节点 i 模块内连通度；
$\overline{K_{s_i}}$ 为节点 i 所在模块 s 所有节点的 $K_i$ 的平均值；
$\sigma_{K_{s_i}}$ 为节点 i 所在模块 s 所有节点的模块内连通度的标准差。

模块间连通度参与系数的计算公式：
$P_i =1- \sum_{s=1}^{N_m}(\frac{K_{is}}{K_i})^2$
其中,
$P_i$ 为节点 i 的参与系数；
$N_m$ 为模块数，number of modular；
$s$ 表示模块 $s$ ；
$K_i$ 为整个网络内的连通度；
$K_{is}$ 为节点 i 在各个模块中的连通度；

3.计算代码及注释

测试数据及源代码可参考：
计算网络节点模块内连通度和模块间连通度R代码及测试数据
更多R语言分析微生物生态学的资源可参考如下链接：
https://mbd.pub/o/bread/mbd-YpmTlZpr

#该代码参考自：https://github.com/cwatson/brainGraph/blob/master/R/vertex_roles.R
#g表示graph文件，其中graph的节点含有modular属性；
#也就是说，使用下面的函数计算前请先计算各节点的模块属性，并将其赋值给节点
#wtc <- cluster_louvain(igraph,NA)
#modularity(wtc)
#V(igraph)$module<-membership(wtc)

#计算模块内连接度的z-scores
within_module_deg_z_score <- function(g, A=NULL, weighted=FALSE) {
  stopifnot(is_igraph(g))
  if (is.null(A)) {
    if (isTRUE(weighted)) {
      A <- as_adj(g, sparse=FALSE, names=TRUE, attr='weight')
    } else {
      A <- as_adj(g, sparse=FALSE, names=TRUE)
    }
  }
  memb <- vertex_attr(g, "module")
  N <- max(memb)
  nS <- tabulate(memb)
  z <- Ki <- rep.int(0, dim(A)[1L])
  Ksi <- sigKsi <- rep.int(0, N)
  names(z) <- names(Ki) <- rownames(A)
  for (S in seq_len(N)) {
    x <- rowSums(A[memb == S, memb == S])
    Ki[memb == S] <- x
    Ksi[S] <- sum(x) / nS[S]
    sigKsi[S] <- sqrt(sum((x - Ksi[S])^2) / (nS[S]-1))
  }
  z <- (Ki - Ksi[memb]) / sigKsi[memb]
  z[is.infinite(z)] <- 0
  df <- data.frame(Ki,z,row.names = names(Ki))
  return(df)
}

#计算节点的参与系数
part_coeff <- function(g, A=NULL, weighted=FALSE) {
  stopifnot(is_igraph(g))
  if (is.null(A)) {
    if (isTRUE(weighted)) {
      A <- as_adj(g, sparse=FALSE, attr='weight')
    } else {
      A <- as_adj(g, sparse=FALSE)
    }
  }
  memb <- vertex_attr(g, "module")
  Ki <- colSums(A)
  N <- max(memb)
  Kis <- t(rowsum(A, memb))
  pi <- 1 - ((1 / Ki^2) * rowSums(Kis^2))
  names(pi) <- rownames(A)
  return(pi)
}

还有其它博客也介绍了 $Z_i$ 和 $P_i$ 的计算，例如：
使用ggClusterNet一条代码计算网络模块内连通度（Zi）和模块间连通度（Pi）
加载了ggClusterNet包后，只需要一行函数即可计算 $Z_i$ 和 $P_i$ ，但我仔细阅读了其计算的函数之后，对一些地方存在一些疑问（在代码中注释了），此时该函数计算的结果也与上面函数within_module_deg_z_score()计算结果不同，经过修改后的函数within_module_degree2()计算的结果与上面函数within_module_deg_z_score()计算的结果计算结果相同。
因此，在使用R包时，特别是一些未经严格检验的R包时，需要了解更多的细节，以免计算错误，得到错误的结果。

原函数如下：

#参考自：https://github.com/taowenmicro/ggClusterNet/blob/master/R/module.roles.R
network_degree <- function(comm_graph){
  ki_total <-NULL
  net_degree <- degree(comm_graph)
  for(i in 1:length(V(comm_graph))){    
    ki <- net_degree[i]    
    tmp <- data.frame(taxa=names(ki), total_links=ki)   
    if(is.null(ki_total)){ki_total<-tmp} else{ki_total <- rbind(ki_total, tmp)}   
  } 
  return(ki_total) 
}

#compute within-module degree for each of the features

within_module_degree <- function(comm_graph){
  mods <- vertex_attr(comm_graph, "module")
  vs <- as.list(V(comm_graph))
  modvs <- data.frame("taxon"= names(vs), "mod"=mods)
  #不理解此处为什么要用decompose.graph(),难道不应该直接用模块内的节点去取子图吗？我在within_module_degree2()中对我的想法进行了实现
  sg1 <- decompose.graph(comm_graph,mode="strong")
  df <- data.frame()
  for(mod in unique(modvs$mod)){ 
    mod_nodes <- subset(modvs$taxon,modvs$mod==mod) 
    neighverts <- unique(unlist(sapply(sg1,FUN=function(s){if(any(V(s)$name %in% mod_nodes)) V(s)$name else NULL})))  
    g3 <- induced.subgraph(graph=comm_graph,vids=neighverts)   
    mod_degree <- degree(g3) 
    for(i in mod_nodes){      
      ki <- mod_degree[which(names(mod_degree)==i)]     
      tmp <- data.frame(module=mod, taxa=names(ki), mod_links=ki)      
      df <- rbind(df,tmp)      
    }   
  } 
  return(df) 
}

within_module_degree2 <- function(comm_graph){ 
  mods <- vertex_attr(comm_graph, "module")
  vs <- as.list(V(comm_graph)) 
  modvs <- data.frame("taxon"= names(vs), "mod"=mods)
  df <- data.frame()
  for(mod in unique(modvs$mod)){  
    mod_nodes <- subset(modvs$taxon,modvs$mod==mod) 
    g3 <- induced.subgraph(graph=comm_graph,vids=mod_nodes)  
    mod_degree <- degree(g3) 
    for(i in mod_nodes){      
      ki <- mod_degree[which(names(mod_degree)==i)]      
      tmp <- data.frame(module=mod, taxa=names(ki), mod_links=ki)      
      df <- rbind(df,tmp)    
    }   
  } 
  return(df)
}
#calculate the degree (links) of each node to nodes in other modules.
among_module_connectivity <- function(comm_graph){ 
  mods <- vertex_attr(comm_graph, "module")  
  vs <- as.list(V(comm_graph)) 
  modvs <- data.frame("taxa"= names(vs), "mod"=mods)
  df <- data.frame() 
  for(i in modvs$taxa){   
    for(j in modvs$taxa){     
      if(are_adjacent(graph=comm_graph, v1=i , v2=j)){      
        mod <- subset(modvs$mod, modvs$taxa==j)       
        tmp <- data.frame(taxa=i, taxa2=j, deg=1, mod_links=mod)       
        df <- rbind(df, tmp)       
      }     
    }    
  }  
  out <- aggregate(list(mod_links=df$deg), by=list(taxa=df$taxa, module=df$mod), FUN=sum)  
  return(out)  
}

#compute within-module degree z-score which
#measures how well-connected a node is to other nodes in the module.

zscore <- function(mod.degree){  
  ksi_bar <- aggregate(mod_links ~ module, data=mod.degree, FUN = mean)  
  ksi_sigma <- aggregate(mod_links ~ module, data=mod.degree, FUN = sd) 
  z <- NULL 
  for(i in 1:dim(mod.degree)[1]){   
    mod_mean <- ksi_bar$mod_links[which(ksi_bar$module == mod.degree$module[i])]    
    mod_sig <- ksi_sigma$mod_links[which(ksi_bar$module == mod.degree$module[i])]    
    z[i] <- (mod.degree$mod_links[i] - mod_mean)/mod_sig   
  }  
  z <- data.frame(row.names=rownames(mod.degree), z, module=mod.degree$module)  
  return(z) 
}

#The participation coefficient of a node measures how well a  node is distributed
# in the entire network. It is close to 1 if its links are uniformly
#distributed among all the modules and 0 if all its links are within its own module.

participation_coeffiecient <- function(mod.degree, total.degree){ 
  p <- NULL  
  for(i in total.degree$taxa){   
    ki <- subset(total.degree$total_links, total.degree$taxa==i)   
    taxa.mod.degree <- subset(mod.degree$mod_links, mod.degree$taxa==i)   
    p[i] <- 1 - (sum((taxa.mod.degree)**2)/ki**2)   
  } 
  p <- as.data.frame(p)
  return(p)
}

觉得博主写地不错的话，可以买箱苹果，支持博主。

道阻且长1994

关注

10
点赞
踩
32

收藏

觉得还不错? 一键收藏
打赏
21
评论
计算网络节点模块内连通度(within modular degree)和模块间连通度(between modular degree)

我们可以通过了解节点的模块内连通度和模块间连通度来对节点的重要性进行度量，具有较低的模块内连通度和模块间连通度的节点的重要性则相对较低。
复制链接

扫一扫