PageRank Case Study: Airports

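Contents

Routes database

Content

Dataset source

Code

1) Import packages

2) Load the data

3) Explore the data

4) Extract source and destination airports

5) Build the directed graph

6) Output the airport ranking in descending order of PR value

7) Define the graph-drawing function

8) Draw the link graph of airports with a PR value of at least 0.003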

Routes database

As of January 2012, the OpenFlights/Airline Route Mapper Route Database contains 59036 routes between 3209 airports on 531 airlines spanning the globe.

Content

The data is ISO 8859-1 (Latin-1) encoded.

Each entry contains the following information:

  • Airline 2-letter (IATA) or 3-letter (ICAO) code of the airline.
  • Airline ID Unique OpenFlights identifier for airline (see Airline).
  • Source airport 3-letter (IATA) or 4-letter (ICAO) code of the source airport.
  • Source airport ID Unique OpenFlights identifier for source airport (see Airport)
  • Destination airport 3-letter (IATA) or 4-letter (ICAO) code of the destination airport.
  • Destination airport ID Unique OpenFlights identifier for destination airport (see Airport)
  • Codeshare "Y" if this flight is a codeshare (that is, not operated by Airline, but another carrier), empty otherwise.
  • Stops Number of stops on this flight ("0" for direct)
  • Equipment 3-letter codes for plane type(s) generally used on this flight, separated by spaces

The special value \N is used for "NULL" to indicate that no value is available.

Notes:

  • Routes are directional: if an airline operates services from A to B and from B to A, both A-B and B-A are listed separately.
  • Routes where one carrier operates both its own and codeshare flights are listed only once.
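The format notes above map fairly directly onto `pandas.read_csv` arguments. The following is a minimal sketch on a small in-memory sample: the three rows, the column names, and the `load_routes` helper are made up for illustration, and the header of a real file may differ.

```python
import io

import pandas as pd

# Hypothetical sample in the documented field order; these are not real routes.
SAMPLE = (
    "airline,airline_id,source,source_id,dest,dest_id,codeshare,stops,equipment\n"
    "AA,1,AAA,10,BBB,20,,0,320\n"
    "AA,1,BBB,20,AAA,10,,0,320\n"
    "ZZ,\\N,AAA,\\N,CCC,30,Y,0,\\N\n"
)


def load_routes(source):
    """Read route data, treating only the special value \\N as missing."""
    return pd.read_csv(
        source,
        encoding="latin-1",     # the data is ISO 8859-1 encoded
        na_values=["\\N"],      # \N marks "no value available"
        keep_default_na=False,  # keep codes such as "NA" and empty strings as text
    )


routes = load_routes(io.BytesIO(SAMPLE.encode("latin-1")))

# Routes are directional, so A-B and B-A show up as two separate rows.
pairs = set(zip(routes["source"], routes["dest"]))
print(routes.isna().sum())
print(sorted(pairs))
```

With `keep_default_na=False`, an empty codeshare field stays an empty string rather than becoming `NaN`, which matches the "empty otherwise" wording of the field description.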

Dataset source

Flight Route Database | Kaggle

Code

1) Import packages

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import networkx as nx

2) Load the data

data=pd.read_csv("d:/datasets/Flight Route Database.csv")

3) Explore the data

data.head()
data.info()

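4) Extract source and destination airports

weight_ = 2  # constant weight given to every edge
# Column names are kept exactly as in the original post (leading space, "apirport" spelling);
# they must match the CSV header character for character.
edges = [(i, j, weight_) for i, j in data[[" source airport", " destination apirport"]].values]
# i is the source airport, j is the destination airport, weight_ is the edge weight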
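Every route gets the same weight in the list above, and a `DiGraph` keeps only one edge per (source, destination) pair. One possible variation, which is not what this post does, is to weight each edge by how many times the pair appears in the data. A minimal sketch, where `count_weighted_edges` and the sample pairs are hypothetical:

```python
from collections import Counter

import networkx as nx


def count_weighted_edges(pairs):
    """Turn (source, destination) pairs into (source, destination, count) triples."""
    counts = Counter((src, dst) for src, dst in pairs)
    return [(src, dst, n) for (src, dst), n in counts.items()]


# Made-up pairs: the AAA -> BBB route appears twice, e.g. flown by two airlines.
sample_pairs = [("AAA", "BBB"), ("AAA", "BBB"), ("BBB", "AAA"), ("AAA", "CCC")]
weighted = count_weighted_edges(sample_pairs)

H = nx.DiGraph()
H.add_weighted_edges_from(weighted)
print(H.edges(data="weight"))
```

Because `nx.pagerank` reads the `weight` edge attribute by default, non-uniform weights like these would change the ranking, whereas a constant weight leaves it the same as in the unweighted graph.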

5) Build the directed graph

G = nx.DiGraph()  # instantiate a directed graph
G.add_weighted_edges_from(edges)  # add each route as a weighted edge; repeated routes collapse into one edge
pagerank = nx.pagerank(G, alpha=0.85)  # compute the PR value of every airport
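`nx.pagerank` is used as a black box here. As a rough illustration of what it computes, the sketch below runs a plain power iteration on a toy graph and compares the result with `nx.pagerank`. The function `pagerank_power_iteration` and the four-node graph are hypothetical; this is not how networkx implements it internally, and edge weights are ignored for simplicity.

```python
import networkx as nx


def pagerank_power_iteration(graph, alpha=0.85, iterations=100):
    """Unweighted PageRank by repeated redistribution of rank along out-edges."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if graph.out_degree(v) == 0)
        new_rank = {v: (1.0 - alpha) / n + alpha * dangling / n for v in nodes}
        # Every other node splits its rank equally among its successors.
        for u in nodes:
            out_deg = graph.out_degree(u)
            for v in graph.successors(u):
                new_rank[v] += alpha * rank[u] / out_deg
        rank = new_rank
    return rank


# Toy graph: D has no outgoing routes, C has no incoming routes.
toy = nx.DiGraph([("A", "B"), ("B", "A"), ("C", "A"), ("A", "D")])
manual = pagerank_power_iteration(toy)
reference = nx.pagerank(toy, alpha=0.85)

for node in sorted(toy):
    print(node, round(manual[node], 4), round(reference[node], 4))
```

The damping factor `alpha=0.85` means that 85% of a node's rank follows its outgoing routes and the remaining 15% is redistributed uniformly, which keeps airports without incoming routes from dropping to zero.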

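6) Output the airport ranking in descending order of PR value

# pagerank is a dict that maps each airport code to its PR value
sorted(pagerank.items(), key=lambda x: x[1], reverse=True)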
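The full sorted list contains every airport in the graph. When only the first few entries matter, `heapq.nlargest` avoids sorting everything. A small sketch, where `top_airports` is a hypothetical helper and the scores are made-up placeholders rather than real PR values:

```python
import heapq


def top_airports(scores, n=10):
    """Return the n (airport, score) pairs with the highest scores, best first."""
    return heapq.nlargest(n, scores.items(), key=lambda item: item[1])


# Placeholder scores for illustration only.
example_scores = {"AAA": 0.40, "BBB": 0.25, "CCC": 0.15, "DDD": 0.20}
top_two = top_airports(example_scores, n=2)
print(top_two)
```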

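7) Define the graph-drawing function

# Draw the network graph
def show_graph(graph, layout='spring_layout'):
    # Choose the layout: circular_layout places the nodes on a circle,
    # anything else falls back to spring_layout (force-directed, radiating from the centre)
    if layout == 'circular_layout':
        positions = nx.circular_layout(graph)
    else:
        positions = nx.spring_layout(graph)
    # Node size follows the pagerank value; the values are tiny, so scale them by 200000
    nodesize = [x['pagerank'] * 200000 for v, x in graph.nodes(data=True)]
    # Edge width follows the 'weight' attribute of each edge
    edgesize = [e[2]['weight'] for e in graph.edges(data=True)]
    # Draw the nodes
    nx.draw_networkx_nodes(graph, positions, node_size=nodesize, alpha=0.4)
    # Draw the edges once, using the widths computed above
    nx.draw_networkx_edges(graph, positions, width=edgesize, alpha=0.2)
    # Draw the node labels
    nx.draw_networkx_labels(graph, positions, font_size=10)
    plt.axis('off')
    plt.show()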

8) Draw the link graph of airports with a PR value of at least 0.003

nx.set_node_attributes(G, name='pagerank', values=pagerank)
nx.set_edge_attributes(G, name='weight', values=2)
pagerank_threshold = 0.003
small_graph = G.copy()
# Remove every node whose PR value is below pagerank_threshold
for n, p_rank in G.nodes(data=True):
    if p_rank['pagerank'] < pagerank_threshold:
        small_graph.remove_node(n)
# Draw the graph with circular_layout so that the remaining nodes form a circle
show_graph(small_graph, 'circular_layout')
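The copy-then-remove loop can also be written as a node filter followed by `DiGraph.subgraph`. The sketch below shows that alternative on a toy graph; `filter_by_pagerank`, the toy edges, and the scores are hypothetical and only meant to illustrate the idea.

```python
import networkx as nx


def filter_by_pagerank(graph, scores, threshold):
    """Return an independent copy of graph restricted to nodes scoring at least threshold."""
    keep = [node for node in graph if scores[node] >= threshold]
    # subgraph() returns a read-only view; copy() makes it a standalone graph.
    return graph.subgraph(keep).copy()


toy = nx.DiGraph([("A", "B"), ("B", "A"), ("C", "A")])
toy_scores = {"A": 0.5, "B": 0.4, "C": 0.1}  # made-up scores, not real PR values

small = filter_by_pagerank(toy, toy_scores, threshold=0.3)
print(sorted(small.nodes), sorted(small.edges))
```

Edges that touch a removed node disappear with it, so the filtered graph only shows routes between the airports that pass the threshold.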
