[Continuously Updated] Multi-Layer Graph (Multilayer Relational Network) Datasets

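This article gives details on several multilayer relational network datasets, including user rating data from Epinions, data from the Jester online joke recommender system, user rating data from Book-Crossing, and a number of social network datasets such as interaction data around special events on Twitter. The datasets cover user ratings, social media interactions, book ratings, genetic interactions, and more, and are suited to multilayer network analysis and research.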

Table

| Name | Nodes | Edges | Layers | Weights |
|---|---|---|---|---|
| ratings_data | 49290 + 139738 | 664823 | 5 | Unweighted |
| jester_dataset_1_1 | 24983 + 100 | 2498300 | -10.00 to +10.00 | Unweighted |
| jester_dataset_1_2 | 23500 + 100 | 2350000 | -10.00 to +10.00 | Unweighted |
| jester_dataset_1_3 | 24938 + 100 | 2493800 | -10.00 to +10.00 | Unweighted |
| jester_dataset_2 | 63978 + ~150 | 1761439 | -10.00 to +10.00 | Unweighted |
| jester_dataset_3 | 66336 + 151 | 10016736 | -10.00 to +10.00 | Unweighted |
| BX-CSV-Dump | 278858 (276271) + 271380 | >1048576 (Excel error) | 10 | |
| 201305.relationship | ~200000 | 310150 | 4 | Unweighted |
| 201306.relationship | ~200000 | 310150 | 4 | Unweighted |
| 201307.relationship | ~200000 | 368048 | 4 | Unweighted |
| 201308.relationship | ~200000 | 368048 | 4 | Unweighted |
| 201309.relationship | ~200000 | 368048 | 4 | Unweighted |
| 201310.relationship | ~200000 | 366566 | 4 | Unweighted |
| 201311.relationship | ~200000 | 348878 | 4 | Unweighted |
| 201312.relationship | ~200000 | 396030 | 4 | Unweighted |
| Vickers-Chan-7thGraders_Multiplex_Social | 29 | 29*3 | 3 | |
| Padgett-Florence-Families_Multiplex_Social | 16 | 35 | 2 | |
| Lazega-Law-Firm_Multiplex_Social | 71 | 2571 | 3 | Unweighted |
| Krackhardt-High-Tech_Multiplex_Social | 21 | 312 | 3 | |
| Kapferer-Tailor-Shop_Multiplex_Social | 39 | 1018 | 4 | Unweighted |
| CKM-Physicians-Innovation_Multiplex_Social | 246 | 1551 | 3 | Unweighted |
| CS-Aarhus_Multiplex_Social | 61 | 620 | 5 | |
| EUAir_Multiplex_Transport | 249 | 3588 | 37 | |
| London_Multiplex_Transport | 323 | 441 | 3 | |
| NYClimateMarch2014_Multiplex_Social | 102439 | 353495 | 3 | Weighted |
| Cannes2013_Multiplex_Social | 438537 | 991854 | 3 | Weighted |
| MoscowAthletics2013_Multiplex_Social | 88804 | 210250 | 3 | Weighted |
| MLKing2013_Multiplex_Social | 327707 | 396671 | 3 | Weighted |
| ObamaInIsrael2013_Multiplex_Social | 2281259 | 4061960 | 3 | Weighted |
| Arabidopsis_Multiplex_Genetic | 6980 | 18654 | 7 | Unweighted |
| Homo_Multiplex_Genetic | 18222 | 170899 | 7 | Unweighted |
| AMiner-Coauthor | ~1712433 | 4258615 | 1 | Weighted |
| wikitree | 1382751 | 9192212 | | Unweighted |
| coauthor | 1629217 | | | |

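All descriptions below are excerpted from the original web pages. At the end of each dataset description are its documentation page and download page (listed separately when they are not the same page).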

ratings_data

The dataset was collected by Paolo Massa in a 5-week crawl (November/December 2003) from the Epinions.com Web site.

The dataset contains

  • 49,290 users who rated a total of
  • 139,738 different items at least once, writing
  • 664,824 reviews and
  • 487,181 issued trust statements.
    Users and items are represented by anonymized numeric identifiers.

The dataset consists of 2 files.

The ratings_data file contains the ratings given by users to items.

Every line has the following format:

user_id item_id rating_value

For example,

23 387 5

represents the fact “user 23 has rated item 387 as 5”

Ranges:

user_id is in [1,49290]

item_id is in [1,139738]

rating_value is in [1,5]
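The line format and ranges above can be turned into a small parser. The sketch below is only an illustration: it assumes the fields are whitespace-separated, and it assumes that the 5 layers listed for ratings_data in the table correspond to the five rating values, so each rating value becomes one layer of user-item edges. The function names are hypothetical, and only the first sample line comes from the description; the other two are made up.

```python
from collections import defaultdict

# Upper bounds quoted in the dataset description above.
MAX_USER, MAX_ITEM = 49290, 139738


def parse_rating_line(line):
    """Parse one 'user_id item_id rating_value' line into three ints."""
    user_id, item_id, rating = (int(tok) for tok in line.split())
    if not 1 <= user_id <= MAX_USER:
        raise ValueError(f"user_id out of range: {user_id}")
    if not 1 <= item_id <= MAX_ITEM:
        raise ValueError(f"item_id out of range: {item_id}")
    if not 1 <= rating <= 5:
        raise ValueError(f"rating_value out of range: {rating}")
    return user_id, item_id, rating


def build_layers(lines):
    """Group (user, item) edges by rating value (assumed: one layer per rating)."""
    layers = defaultdict(list)
    for line in lines:
        if line.strip():  # skip blank lines
            user_id, item_id, rating = parse_rating_line(line)
            layers[rating].append((user_id, item_id))
    return dict(layers)


# "23 387 5" is the example from the description; the other lines are made up.
sample = ["23 387 5", "23 12 3", "7 387 5"]
layers = build_layers(sample)
print(layers[5])  # [(23, 387), (7, 387)]
```

To process the real file, the same `build_layers` call can be given an open file handle instead of the sample list.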

http://www.trustlet.org/downloaded_epinions.html

http://www.trustlet.org/datasets/downloaded_epinions/

jester_dataset

Anonymous Ratings from the Jester Online Joke Recommender System

Dataset 1: Over 4.1 million continuous ratings (-10.00 to +10.00) of 100 jokes from 73,421 users: collected between April 1999 - May 2003.

Dataset 2: Over 1.7 million continuous ratings (-10.00 to +10.00) of 150 jokes from 59,132 users: collected between November 2006 - May 2009.

Dataset 2+: An updated version of Dataset 2 with over 500,000 new ratings from 79,681 total users: data collected from November 2006 - Nov 2012

Freely available for research use when acknowledged with the following reference:

Eigentaste: A Constant Time Collaborative Filtering Algorithm. Ken Goldberg, Theresa Roeder, Dhruv Gupta, and Chris Perkins. Information Retrieval, 4(2), 133-151. July 2001.

As a courtesy, if you use the data, I would appreciate knowing your name, what research group you are in, and the publications that may result.

Dataset 1

Over 4.1 million continuous ratings (-10.00 to +10.00) of 100 jokes from 73,421 users: collected between April 1999 - May 2003

Save to disk, then unzip to obtain Excel files:

  • jester_dataset_1_1.zip: (3.9MB) Data from 24,983 users who have rated 36 or more jokes, a matrix with dimensions 24983 X 101.
  • jester_dataset_1_2.zip: (3.6MB) Data from 23,500 users who have rated 36 or more jokes, a matrix with dimensions 23500 X 101.
  • jester_dataset_1_3.zip: (2.1MB) Data from 24,938 users who have rated between 15 and 35 jokes, a matrix with dimensions 24,938 X 101.

Format:

  1. 3 Data files contain anonymous ratings data from 73,421 users.
  2. Data files are in .zip format; when unzipped, they are in Excel (.xls) format.
  3. Ratings are real values ranging from -10.00 to +10.00 (the value “99” corresponds to “null” = “not rated”).
  4. One row per user
  5. The first column gives the number of jokes rated by that user. The next 100 columns give the ratings for jokes 01 - 100.
  6. The sub-matrix including only columns {5, 7, 8, 13, 15, 16, 17, 18, 19, 20} is dense. Almost all users have rated those jokes (see discussion of “universal queries” in the above paper).
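As a rough illustration of the row layout described above (first column is the rated count, then one column per joke, with 99 meaning "not rated"), the following sketch converts one already-loaded row into a dictionary of actual ratings. Reading the .xls file itself is left out; the function name and the short sample row are made up for illustration.

```python
NOT_RATED = 99  # sentinel for "not rated", per the format notes above


def parse_jester_row(row):
    """Split one matrix row into (rated_count, {joke_id: rating}).

    row[0] is the number of jokes the user rated; row[1:] holds the
    ratings for jokes 1, 2, 3, ... in order.
    """
    rated_count = int(row[0])
    ratings = {
        joke_id: value
        for joke_id, value in enumerate(row[1:], start=1)
        if value != NOT_RATED
    }
    return rated_count, ratings


# Made-up row with only 4 joke columns (real rows have 100).
row = [2, -7.82, 99, 99, 4.37]
count, ratings = parse_jester_row(row)
print(count, ratings)  # 2 {1: -7.82, 4: 4.37}
```

Comparing `count` with `len(ratings)` is a cheap sanity check that a row was read with the expected column order.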

The text of the jokes can be downloaded here: jester_dataset_1_joke_texts.zip (92KB)

Format:

  1. 100 files
  2. Each file has title init_.html, where _ is 1 to 100
  3. The titles correspond to the ID’s of the jokes in the Excel files above

Dataset 2

Over 1.7 million continuous ratings (-10.00 to +10.00) of 150 jokes from 59,132 users: collected between November 2006 - May 2009
Save to disk, then unzip: jester_dataset_2.zip (7.7MB)

Format:

  • jester_ratings.dat: Each row is formatted as [User ID] [Item ID] [Rating]
  • jester_items.dat: Maps item ID’s to jokes

Note that the ratings are real values ranging from -10.00 to +10.00. As of May 2009, the jokes {7, 8, 13, 15, 16, 17, 18, 19} are the “gauge set” (as discussed in the Eigentaste paper) and the jokes {1, 2, 3, 4, 5, 6, 9, 10, 11, 12, 14, 20, 27, 31, 43, 51, 52, 61, 73, 80, 100, 116} have been removed (i.e. they are never displayed or rated).
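A minimal sketch of reading `jester_ratings.dat` rows and separating gauge-set ratings from the rest, using the joke ID sets quoted above. It assumes the three fields are whitespace-separated, which the description does not state explicitly; the function names and sample rows are hypothetical.

```python
# Joke ID sets quoted in the note above.
GAUGE_SET = {7, 8, 13, 15, 16, 17, 18, 19}
REMOVED = {1, 2, 3, 4, 5, 6, 9, 10, 11, 12, 14, 20, 27, 31,
           43, 51, 52, 61, 73, 80, 100, 116}


def parse_ratings(lines):
    """Yield (user_id, item_id, rating) from '[User ID] [Item ID] [Rating]' rows."""
    for line in lines:
        parts = line.split()  # assumption: whitespace-separated fields
        if len(parts) != 3:
            continue  # skip blank or malformed rows
        yield int(parts[0]), int(parts[1]), float(parts[2])


def split_by_gauge(triples):
    """Return (gauge, other) rating lists, dropping any removed jokes."""
    gauge, other = [], []
    for user_id, item_id, rating in triples:
        if item_id in REMOVED:
            continue
        target = gauge if item_id in GAUGE_SET else other
        target.append((user_id, item_id, rating))
    return gauge, other


# Made-up rows for illustration only.
sample = ["1 7 -3.25", "1 21 8.5", "2 8 0.0"]
gauge, other = split_by_gauge(parse_ratings(sample))
print(gauge)  # [(1, 7, -3.25), (2, 8, 0.0)]
print(other)  # [(1, 21, 8.5)]
```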

Dataset 2+

An updated version of Dataset 2 with over 500,000 new ratings from 79,681 total users: data collected from November 2006 - Nov 2012
Save to disk, then unzip: jester_dataset_2+.zip (5.1MB)

Format:

  • In this dataset we stripped out users that did not respond to the gauge set of questions. The data is formatted as an Excel file representing a 66336 x 151 matrix with rows as users and columns as jokes.
  • 10 of the jokes don’t have ratings; their IDs are: { 1, 2, 3, 4, 6, 9, 10, 11, 12, 14 }.
  • Each rating is from (-10.00 to +10.00) and 99 corresponds to a null rating (user did not rate that joke).

Note that the ratings are real values ranging from -10.00 to +10.00. As of May 2009, the jokes {7, 8, 13, 15, 16, 17, 18, 19} are the “gauge set” (as discussed in the Eigentaste paper) and the jokes {1, 2, 3, 4, 5, 6, 9, 10, 11, 12, 14, 20, 27, 31, 43, 51, 52, 61, 73, 80, 100, 116} have been removed (i.e. they are never displayed or rated).

http://eigentaste.berkeley.edu/dataset/

BX-CSV-Dump

Book-Crossing Dataset … mined by Cai-Nicolas Ziegler, DBIS Freiburg

Collected by Cai-Nicolas Ziegler in a 4-week crawl (August / September 2004) from the Book-Crossing community with kind permission from Ron Hornbaker, CTO of Humankind Systems. Contains 278,858 users (anonymized but with demographic information) providing 1,149,780 ratings (explicit / implicit) about 271,379 books.

[ ! ] Freely available for research use when acknowledged with the following reference (further details on the dataset are given in this publication):
Improving Recommendation Lists Through Topic Diversification,
Cai-Nicolas Ziegler, Sean M. McNee, Joseph A. Konstan, Georg Lausen; Proceedings of the 14th International World Wide Web Conference (WWW '05), May 10-14, 2005, Chiba, Japan. To appear.


As a courtesy, if you use the data, I would appreciate knowing your name, what research group you are in, and the publications that may result.

Format
The Book-Crossing dataset comprises 3 tables.

  • BX-Users
    Contains the users. Note that user IDs (User-ID) have been anonymized and map to integers. Demographic data is provided (Location, Age) if available. Otherwise, these fields contain NULL-values.

  • BX-Books
    Books are identified by their respective ISBN. Invalid ISBNs have already been removed from the dataset. Moreover, some content-based information is given (Book-Title, Book-Author, Year-Of-Publication, Publisher), obtained from Amazon Web Services. Note that in case of several authors, only the first is provided. URLs linking to cover images are also given, appearing in three different flavours (Image-URL-S, Image-URL-M, Image-URL-L), i.e., small, medium, large. These URLs point to the Amazon web site.

  • BX-Book-Ratings
    Contains the book rating information. Ratings (Book-Rating) are either explicit, expressed on a scale from 1-10 (higher values denoting higher appreciation), or implicit, expressed by 0.
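Because explicit (1-10) and implicit (0) ratings share one column, it is usually worth separating them before building a network. The sketch below assumes the ratings table is a semicolon-delimited CSV with quoted fields and a header row naming the columns `User-ID`, `ISBN`, and `Book-Rating`; that layout is an assumption here and should be checked against the downloaded file. The function name is hypothetical, and the sample rows and ISBNs are placeholders.

```python
import csv
import io


def split_ratings(csv_text):
    """Return (explicit, implicit) lists of (user_id, isbn, rating) tuples."""
    # Assumed layout: ';' delimiter, double-quoted fields, header row.
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    explicit, implicit = [], []
    for rec in reader:
        rating = int(rec["Book-Rating"])
        triple = (int(rec["User-ID"]), rec["ISBN"], rating)
        # A rating of 0 marks an implicit interaction; 1-10 are explicit.
        (implicit if rating == 0 else explicit).append(triple)
    return explicit, implicit


# Placeholder rows for illustration only.
sample = '''"User-ID";"ISBN";"Book-Rating"
"1";"0000000001";"0"
"1";"0000000002";"7"
"2";"0000000001";"10"
'''
explicit, implicit = split_ratings(sample)
print(len(explicit), len(implicit))  # 2 1
```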

http://www2.informatik.uni-freiburg.de/~cziegler/BX/

relationship (link no longer works)

Internet AS-level Topology Archive

Introduction

This site serves as an archive of the historical Internet AS-level topology data for academic research, providing t
