Complex Networks: Standard Public Datasets

 

The SNAP (Stanford Large Network Dataset Collection) experimental datasets mainly include the following:

 

Social networks

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| ego-Facebook | Undirected | 4,039 | 88,234 | Social circles from Facebook (anonymized) |
| ego-Gplus | Directed | 107,614 | 13,673,453 | Social circles from Google+ |
| ego-Twitter | Directed | 81,306 | 1,768,149 | Social circles from Twitter |
| soc-Epinions1 | Directed | 75,879 | 508,837 | Who-trusts-whom network of Epinions.com |
| soc-LiveJournal1 | Directed | 4,847,571 | 68,993,773 | LiveJournal online social network |
| soc-Pokec | Directed | 1,632,803 | 30,622,564 | Pokec online social network |
| soc-Slashdot0811 | Directed | 77,360 | 905,468 | Slashdot social network from November 2008 |
| soc-Slashdot0922 | Directed | 82,168 | 948,464 | Slashdot social network from February 2009 |
| wiki-Vote | Directed | 7,115 | 103,689 | Wikipedia who-votes-on-whom network |

 

Networks with ground-truth communities

| Name | Type | Nodes | Edges | Communities | Description |
|---|---|---|---|---|---|
| com-LiveJournal | Undirected, Communities | 3,997,962 | 34,681,189 | 287,512 | LiveJournal online social network |
| com-Friendster | Undirected, Communities | 65,608,366 | 1,806,067,135 | 957,154 | Friendster online social network |
| com-Orkut | Undirected, Communities | 3,072,441 | 117,185,083 | 6,288,363 | Orkut online social network |
| com-Youtube | Undirected, Communities | 1,134,890 | 2,987,624 | 8,385 | Youtube online social network |
| com-DBLP | Undirected, Communities | 317,080 | 1,049,866 | 13,477 | DBLP collaboration network |
| com-Amazon | Undirected, Communities | 334,863 | 925,872 | 151,037 | Amazon product network |

Communication networks

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| email-EuAll | Directed | 265,214 | 420,045 | Email network from a EU research institution |
| email-Enron | Undirected | 36,692 | 367,662 | Email communication network from Enron |
| wiki-Talk | Directed | 2,394,385 | 5,021,410 | Wikipedia talk (communication) network |

Citation networks

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| cit-HepPh | Directed, Temporal, Labeled | 34,546 | 421,578 | Arxiv High Energy Physics paper citation network |
| cit-HepTh | Directed, Temporal, Labeled | 27,770 | 352,807 | Arxiv High Energy Physics paper citation network |
| cit-Patents | Directed, Temporal, Labeled | 3,774,768 | 16,518,948 | Citation network among US Patents |

Collaboration networks

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| ca-AstroPh | Undirected | 18,772 | 396,160 | Collaboration network of Arxiv Astro Physics |
| ca-CondMat | Undirected | 23,133 | 186,936 | Collaboration network of Arxiv Condensed Matter |
| ca-GrQc | Undirected | 5,242 | 28,980 | Collaboration network of Arxiv General Relativity |
| ca-HepPh | Undirected | 12,008 | 237,010 | Collaboration network of Arxiv High Energy Physics |
| ca-HepTh | Undirected | 9,877 | 51,971 | Collaboration network of Arxiv High Energy Physics Theory |

Web graphs

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| web-BerkStan | Directed | 685,230 | 7,600,595 | Web graph of Berkeley and Stanford |
| web-Google | Directed | 875,713 | 5,105,039 | Web graph from Google |
| web-NotreDame | Directed | 325,729 | 1,497,134 | Web graph of Notre Dame |
| web-Stanford | Directed | 281,903 | 2,312,497 | Web graph of Stanford.edu |

Product co-purchasing networks

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| amazon0302 | Directed | 262,111 | 1,234,877 | Amazon product co-purchasing network from March 2 2003 |
| amazon0312 | Directed | 400,727 | 3,200,440 | Amazon product co-purchasing network from March 12 2003 |
| amazon0505 | Directed | 410,236 | 3,356,824 | Amazon product co-purchasing network from May 5 2003 |
| amazon0601 | Directed | 403,394 | 3,387,388 | Amazon product co-purchasing network from June 1 2003 |
| amazon-meta | Metadata | 548,552 | 1,788,725 | Amazon product metadata: product info and all reviews on around 548,552 products |

Internet peer-to-peer networks

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| p2p-Gnutella04 | Directed | 10,876 | 39,994 | Gnutella peer to peer network from August 4 2002 |
| p2p-Gnutella05 | Directed | 8,846 | 31,839 | Gnutella peer to peer network from August 5 2002 |
| p2p-Gnutella06 | Directed | 8,717 | 31,525 | Gnutella peer to peer network from August 6 2002 |
| p2p-Gnutella08 | Directed | 6,301 | 20,777 | Gnutella peer to peer network from August 8 2002 |
| p2p-Gnutella09 | Directed | 8,114 | 26,013 | Gnutella peer to peer network from August 9 2002 |
| p2p-Gnutella24 | Directed | 26,518 | 65,369 | Gnutella peer to peer network from August 24 2002 |
| p2p-Gnutella25 | Directed | 22,687 | 54,705 | Gnutella peer to peer network from August 25 2002 |
| p2p-Gnutella30 | Directed | 36,682 | 88,328 | Gnutella peer to peer network from August 30 2002 |
| p2p-Gnutella31 | Directed | 62,586 | 147,892 | Gnutella peer to peer network from August 31 2002 |

Road networks

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| roadNet-CA | Undirected | 1,965,206 | 5,533,214 | Road network of California |
| roadNet-PA | Undirected | 1,088,092 | 3,083,796 | Road network of Pennsylvania |
| roadNet-TX | Undirected | 1,379,917 | 3,843,320 | Road network of Texas |

Autonomous systems graphs

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| as-733 (733 graphs) | Undirected | 103-6,474 | 243-13,233 | 733 daily instances (graphs) from November 8 1997 to January 2 2000 |
| as-Skitter | Undirected | 1,696,415 | 11,095,298 | Internet topology graph, from traceroutes run daily in 2005 |
| as-Caida (122 graphs) | Directed | 8,020-26,475 | 36,406-106,762 | The CAIDA AS Relationships Datasets, from January 2004 to November 2007 |
| Oregon-1 (9 graphs) | Undirected | 10,670-11,174 | 22,002-23,409 | AS peering information inferred from Oregon route-views between March 31 and May 26 2001 |
| Oregon-2 (9 graphs) | Undirected | 10,900-11,461 | 31,180-32,730 | AS peering information inferred from Oregon route-views between March 31 and May 26 2001 |

Signed networks

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| soc-sign-epinions | Directed | 131,828 | 841,372 | Epinions signed social network |
| wiki-Elec | Directed, Bipartite | ~7,000 | ~100,000 | Wikipedia adminship election data |
| soc-sign-Slashdot081106 | Directed | 77,357 | 516,575 | Slashdot Zoo signed social network from November 6 2008 |
| soc-sign-Slashdot090216 | Directed | 81,871 | 545,671 | Slashdot Zoo signed social network from February 16 2009 |
| soc-sign-Slashdot090221 | Directed | 82,144 | 549,202 | Slashdot Zoo signed social network from February 21 2009 |

Location-based online social networks

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| loc-Gowalla | Undirected, Geo-Location | 196,591 | 950,327 | Gowalla location based online social network |
| loc-Brightkite | Undirected, Geo-Location | 58,228 | 214,078 | Brightkite location based online social network |

Wikipedia networks and metadata

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| wiki-Vote | Directed | 7,115 | 103,689 | Wikipedia who-votes-on-whom network |
| wiki-Talk | Directed | 2,394,385 | 5,021,410 | Wikipedia talk (communication) network |
| wiki-Elec | Bipartite | ~7,000 | ~100,000 | Wikipedia adminship election data |
| wiki-meta | Edits | 2.3M users, 3.5M pages | 250M edits | Complete Wikipedia edit history (who edited what page) |

Memetracker and Twitter

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| twitter7 | Tweets | 17,069,982 users | 476,553,560 tweets | A collection of 476 million tweets collected between June-Dec 2009 |
| memetracker9 | Memes | 96 million | 418 million links | Memetracker phrases and hyperlinks between 96 million blog posts from Aug 2008 to Apr 2009 |
| ksc-time-series | Time Series | 2,000 | 418 million links | Time series of volume of 1,000 most popular Memetracker phrases and 1,000 most popular Twitter hashtags |

Online Communities

| Name | Type | Number of items | Description |
|---|---|---|---|
| Reddit | Reddit submissions | 132,308 submissions | Resubmitted content on reddit.com |
| flickr | Images | 2,316,948 related images | Images sharing common metadata on Flickr |

Online Reviews

| Name | Type | Number of items | Description |
|---|---|---|---|
| BeerAdvocate | Beer reviews | 1,586,259 beer reviews | Beer reviews from BeerAdvocate |
| RateBeer | Beer reviews | 2,924,127 beer reviews | Beer reviews from RateBeer |
| CellarTracker | Wine reviews | 2,025,995 wine reviews | Wine reviews from CellarTracker |
| Amazon reviews | Amazon reviews (all categories) | 34,686,770 product reviews | Reviews from Amazon |
| Fine Foods | Food reviews | 568,454 food reviews | Food reviews from Amazon |
| Movies | Movie reviews | 7,911,684 movie reviews | Movie reviews from Amazon |

 

Network types

  • Directed : directed network
  • Undirected : undirected network
  • Bipartite : bipartite network
  • Multigraph : network has multiple edges between a pair of nodes
  • Temporal : for each node/edge we know the time when it appeared in the network
  • Labeled : network contains labels (weights, attributes) on nodes and/or edges
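
To make the Directed / Undirected distinction concrete, here is a minimal sketch of reading an edge list under either interpretation. It assumes a plain-text layout with one whitespace-separated `src dst` pair per line and `#` comment lines; that layout, the `parse_edge_list` helper, and the `SAMPLE` toy graph are illustrative assumptions, not something stated in the listing above.

```python
from collections import defaultdict


def parse_edge_list(text, directed=True):
    """Build an adjacency map from 'src dst' lines (assumed file layout)."""
    adj = defaultdict(set)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        src, dst = line.split()[:2]
        adj[src].add(dst)
        adj[dst]  # touch dst so nodes without out-edges are still registered
        if not directed:
            adj[dst].add(src)  # undirected: store the edge in both directions
    return dict(adj)


# Hypothetical toy graph, not taken from any dataset in the tables.
SAMPLE = """# toy graph
1\t2
2\t3
3\t1
3\t4
"""

directed_graph = parse_edge_list(SAMPLE, directed=True)
undirected_graph = parse_edge_list(SAMPLE, directed=False)

directed_edges = sum(len(nbrs) for nbrs in directed_graph.values())
undirected_edges = sum(len(nbrs) for nbrs in undirected_graph.values()) // 2
```

Because neighbours are stored in sets, parallel edges collapse into one; a Multigraph would need a list or a counter per node pair instead.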

Network statistics

Dataset statistics

| Statistic | Description |
|---|---|
| Nodes | Number of nodes in the network |
| Edges | Number of edges in the network |
| Nodes in largest WCC | Number of nodes in the largest weakly connected component |
| Edges in largest WCC | Number of edges in the largest weakly connected component |
| Nodes in largest SCC | Number of nodes in the largest strongly connected component |
| Edges in largest SCC | Number of edges in the largest strongly connected component |
| Average clustering coefficient | Average clustering coefficient |
| Number of triangles | Number of triples of connected nodes (considering the network as undirected) |
| Fraction of closed triangles | Number of connected triples of nodes / number of (undirected) length 2 paths |
| Diameter (longest shortest path) | Maximum undirected shortest path length (sampled over 1,000 random nodes) |
| 90-percentile effective diameter | 90-th percentile of undirected shortest path length distribution (sampled over 1,000 random nodes) |
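
As a rough illustration of how a few of these statistics can be computed, the sketch below works on a small undirected toy graph using only the Python standard library. The graph `ADJ` and the helper names are hypothetical. The closed-triangle fraction is read here as 3 × triangles / number of length-2 paths, which is one common interpretation and an assumption on my part. The diameter is computed exactly over all nodes rather than sampled over 1,000 random nodes.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected toy graph: a triangle 1-2-3, a tail 3-4, and a separate edge 5-6.
ADJ = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}, 5: {6}, 6: {5}}


def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def connected_components(adj):
    """Connected components of an undirected graph, as a list of node sets."""
    seen, comps = set(), []
    for node in adj:
        if node not in seen:
            comp = set(bfs_distances(adj, node))
            seen |= comp
            comps.append(comp)
    return comps


def triangle_stats(adj):
    """Return (triangles, closed fraction, average clustering coefficient)."""
    closed = paths2 = 0
    cc_sum = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        pairs = k * (k - 1) // 2  # length-2 paths centred on this node
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        closed += links
        paths2 += pairs
        cc_sum += links / pairs if pairs else 0.0
    # Each triangle is counted once per corner, hence the division by 3.
    return closed // 3, (closed / paths2 if paths2 else 0.0), cc_sum / len(adj)


largest_cc = max(connected_components(ADJ), key=len)
triangles, closed_fraction, avg_clustering = triangle_stats(ADJ)
diameter = max(max(bfs_distances(ADJ, n).values()) for n in ADJ)
```

For directed datasets, the WCC figures correspond to running the same component search after ignoring edge direction; SCC needs a dedicated algorithm (for example Tarjan's) that this sketch does not cover.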