Table
| Name | Nodes [1] | Edges | Layers [2] | Weights [3] |
|---|---|---|---|---|
| ratings_data | 49290 + 139738 | 664823 | 5 | Unweighted |
| jester_dataset_1_1 | 24983 + 100 | 2498300 | -10.00 to +10.00 | Unweighted |
| jester_dataset_1_2 | 23500 + 100 | 2350000 | -10.00 to +10.00 | Unweighted |
| jester_dataset_1_3 | 24930 + 100 | 2493000 | -10.00 to +10.00 | Unweighted |
| jester_dataset_2 | 63978 + ~150 | 1761439 | -10.00 to +10.00 | Unweighted |
| jester_dataset_3 | 66336 + 151 | 10016736 | -10.00 to +10.00 | Unweighted |
| BX-CSV-Dump [4] | 278858(276271) + 271380 | >1048576 (Excel error) | 10 | |
| 201305.relationship [5] | ~200000 | 310150 | 4 | Unweighted |
| 201306.relationship | ~200000 | 310150 | 4 | Unweighted |
| 201307.relationship | ~200000 | 368048 | 4 | Unweighted |
| 201308.relationship | ~200000 | 368048 | 4 | Unweighted |
| 201309.relationship | ~200000 | 368048 | 4 | Unweighted |
| 201310.relationship | ~200000 | 366566 | 4 | Unweighted |
| 201311.relationship | ~200000 | 348878 | 4 | Unweighted |
| 201312.relationship | ~200000 | 396030 | 4 | Unweighted |
| Vickers-Chan-7thGraders_Multiplex_Social | 29 | 29*3 | 3 | |
| Padgett-Florence-Families_Multiplex_Social | 16 | 35 | 2 | |
| Lazega-Law-Firm_Multiplex_Social | 71 | 2571 | 3 | Unweighted |
| Krackhardt-High-Tech_Multiplex_Social | 21 | 312 | 3 | |
| Kapferer-Tailor-Shop_Multiplex_Social | 39 | 1018 | 4 | Unweighted |
| CKM-Physicians-Innovation_Multiplex_Social | 246 | 1551 | 3 | Unweighted |
| CS-Aarhus_Multiplex_Social | 61 | 620 | 5 | |
| EUAir_Multiplex_Transport | 249 | 3588 | 37 | |
| London_Multiplex_Transport | 323 | 441 | 3 | |
| NYClimateMarch2014_Multiplex_Social | 102439 | 353495 | 3 | Weighted |
| Cannes2013_Multiplex_Social | 438537 | 991854 | 3 | Weighted |
| MoscowAthletics2013_Multiplex_Social | 88804 | 210250 | 3 | Weighted |
| MLKing2013_Multiplex_Social | 327707 | 396671 | 3 | Weighted |
| ObamaInIsrael2013_Multiplex_Social | 2281259 | 4061960 | 3 | Weighted |
| Arabidopsis_Multiplex_Genetic | 6980 | 18654 | 7 | Unweighted |
| Homo_Multiplex_Genetic | 18222 | 170899 | 7 | Unweighted |
| AMiner-Coauthor | ~1712433 | 4258615 | 1 | Weighted |
| wikitree | 1382751 | 9192212 | | Unweighted |
| coauthor | 1629217 | | | |
Contents
- Table
- ratings_data
- jester_dataset
- BX-CSV-Dump
- relationship (link no longer works)
- Vickers-Chan-7thGraders_Multiplex_Social
- Padgett-Florence-Families_Multiplex_Social
- Lazega-Law-Firm_Multiplex_Social
- Krackhardt-High-Tech_Multiplex_Social
- Kapferer-Tailor-Shop_Multiplex_Social
- CKM-Physicians-Innovation_Multiplex_Social
- CS-Aarhus_Multiplex_Social
- EUAir_Multiplex_Transport
- London_Multiplex_Transport
- NYClimateMarch2014_Multiplex_Social
- Cannes2013_Multiplex_Social
- MoscowAthletics2013_Multiplex_Social
- MLKing2013_Multiplex_Social
- ObamaInIsrael2013_Multiplex_Social
- Arabidopsis_Multiplex_Genetic
- Homo_Multiplex_Genetic
- AMiner-Coauthor
- wikitree
- coauthor
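All descriptions below are excerpted from the original web pages. Each dataset's description ends with the URLs of its description page and its download page (listed separately when they are not the same page).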
ratings_data
The dataset was collected by Paolo Massa in a 5-week crawl (November/December 2003) from the Epinions.com Web site.
The dataset contains
- 49,290 users who rated a total of
- 139,738 different items at least once, writing
- 664,824 reviews and
- 487,181 issued trust statements.
Users and items are represented by anonymized numeric identifiers.
The dataset consists of 2 files.
The ratings_data file contains the ratings given by users to items.
Every line has the following format:
user_id item_id rating_value
For example,
23 387 5
represents the fact “user 23 has rated item 387 as 5”
Ranges:
user_id is in [1,49290]
item_id is in [1,139738]
rating_value is in [1,5]
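The line format above can be read with a few lines of Python. The sketch below is only an illustration: it assumes the fields are separated by whitespace, and it groups the (user, item) edges by rating value, which is one plausible reading of the "Layers" column in the table (5 layers for ratings 1 to 5). The function names and the two extra sample lines are hypothetical.

```python
from collections import defaultdict


def parse_rating_line(line):
    """Parse one 'user_id item_id rating_value' line into three integers."""
    user_id, item_id, rating_value = line.split()
    return int(user_id), int(item_id), int(rating_value)


def group_by_rating(lines):
    """Group (user_id, item_id) edges by rating value, one edge list per value."""
    layers = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        user_id, item_id, rating_value = parse_rating_line(line)
        layers[rating_value].append((user_id, item_id))
    return dict(layers)


# "23 387 5" is the example from the description; the other two lines are made up.
sample = ["23 387 5", "23 12 3", "7 387 5"]
layers = group_by_rating(sample)
```

For the real file, the same function could be fed the lines of the decompressed ratings file instead of `sample`.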
http://www.trustlet.org/downloaded_epinions.html
http://www.trustlet.org/datasets/downloaded_epinions/
jester_dataset
Anonymous Ratings from the Jester Online Joke Recommender System
Dataset 1: Over 4.1 million continuous ratings (-10.00 to +10.00) of 100 jokes from 73,421 users: collected between April 1999 - May 2003.
Dataset 2: Over 1.7 million continuous ratings (-10.00 to +10.00) of 150 jokes from 59,132 users: collected between November 2006 - May 2009.
Dataset 2+: An updated version of Dataset 2 with over 500,000 new ratings from 79,681 total users: data collected from November 2006 - Nov 2012
Freely available for research use when acknowledged with the following reference:
Eigentaste: A Constant Time Collaborative Filtering Algorithm. Ken Goldberg, Theresa Roeder, Dhruv Gupta, and Chris Perkins. Information Retrieval, 4(2), 133-151. July 2001.
As a courtesy, if you use the data, I would appreciate knowing your name, what research group you are in, and the publications that may result.
Dataset 1
Over 4.1 million continuous ratings (-10.00 to +10.00) of 100 jokes from 73,421 users: collected between April 1999 - May 2003
Save to disk, then unzip to obtain Excel files:
- jester_dataset_1_1.zip: (3.9MB) Data from 24,983 users who have rated 36 or more jokes, a matrix with dimensions 24983 X 101.
- jester_dataset_1_2.zip: (3.6MB) Data from 23,500 users who have rated 36 or more jokes, a matrix with dimensions 23500 X 101.
- jester_dataset_1_3.zip: (2.1MB) Data from 24,938 users who have rated between 15 and 35 jokes, a matrix with dimensions 24,938 X 101.
Format:
- 3 Data files contain anonymous ratings data from 73,421 users.
- Data files are in .zip format; when unzipped, they are in Excel (.xls) format
- Ratings are real values ranging from -10.00 to +10.00 (the value “99” corresponds to “null” = “not rated”).
- One row per user
- The first column gives the number of jokes rated by that user. The next 100 columns give the ratings for jokes 01 - 100.
- The sub-matrix including only columns {5, 7, 8, 13, 15, 16, 17, 18, 19, 20} is dense. Almost all users have rated those jokes (see discussion of “universal queries” in the above paper).
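As a minimal sketch of the row layout described above, the following Python snippet splits one user row into the rated-joke count and the 100 ratings, turning the sentinel 99 into `None`. It assumes the row has already been loaded from the Excel file as a list of 101 numbers (the loading step is not shown), and the function name and sample row are hypothetical.

```python
NOT_RATED = 99  # the value "99" means "null" = "not rated"


def parse_jester_row(row):
    """Split a 101-value row into (declared count, list of 100 ratings or None)."""
    count = int(row[0])
    ratings = [None if value == NOT_RATED else float(value) for value in row[1:]]
    return count, ratings


# Made-up row for a user who rated only jokes 5 and 7.
row = [2] + [99] * 100
row[5] = -3.25  # column 0 holds the count, so joke 5 sits at index 5
row[7] = 8.5

count, ratings = parse_jester_row(row)
# Map joke ID (1 to 100) to rating, keeping only the jokes that were rated.
rated = {joke_id: r for joke_id, r in enumerate(ratings, start=1) if r is not None}
```

Comparing `len(rated)` with `count` is a cheap consistency check on each row.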
The text of the jokes can be downloaded here: jester_dataset_1_joke_texts.zip (92KB)
Format:
- 100 files
- Each file has title init_.html, where _ is 1 to 100
- The titles correspond to the ID’s of the jokes in the Excel files above
Dataset 2
Over 1.7 million continuous ratings (-10.00 to +10.00) of 150 jokes from 59,132 users: collected between November 2006 - May 2009
Save to disk, then unzip: jester_dataset_2.zip (7.7MB)
Format:
- jester_ratings.dat: Each row is formatted as [User ID] [Item ID] [Rating]
- jester_items.dat: Maps item ID’s to jokes
Note that the ratings are real values ranging from -10.00 to +10.00. As of May 2009, the jokes {7, 8, 13, 15, 16, 17, 18, 19} are the “gauge set” (as discussed in the Eigentaste paper) and the jokes {1, 2, 3, 4, 5, 6, 9, 10, 11, 12, 14, 20, 27, 31, 43, 51, 52, 61, 73, 80, 100, 116} have been removed (i.e. they are never displayed or rated).
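The rating rows and the gauge set can be combined in a short Python sketch. It assumes the three fields in jester_ratings.dat are separated by whitespace, which the description does not state explicitly; the function names and sample lines are made up for illustration. The two joke-ID sets are copied from the note above.

```python
GAUGE_SET = {7, 8, 13, 15, 16, 17, 18, 19}
REMOVED = {1, 2, 3, 4, 5, 6, 9, 10, 11, 12, 14, 20, 27, 31,
           43, 51, 52, 61, 73, 80, 100, 116}


def parse_rating(line):
    """Parse one '[User ID] [Item ID] [Rating]' row."""
    user_id, item_id, rating = line.split()
    return int(user_id), int(item_id), float(rating)


def gauge_profile(lines):
    """Map each user to the ratings they gave to gauge-set jokes."""
    profile = {}
    for line in lines:
        user_id, item_id, rating = parse_rating(line)
        if item_id in GAUGE_SET:
            profile.setdefault(user_id, {})[item_id] = rating
    return profile


# Hypothetical rows; joke 21 is not in the gauge set and is ignored.
sample = ["1 7 -9.281", "1 8 0.5", "1 21 4.0", "2 13 3.25"]
profile = gauge_profile(sample)
```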
Dataset 2+
An updated version of Dataset 2 with over 500,000 new ratings from 79,681 total users: data collected from November 2006 - Nov 2012
Save to disk, then unzip: jester_dataset_2+.zip (5.1MB)
Format:
- In this dataset we stripped out users that did not respond to the gauge set of questions. The data is formatted as an Excel file representing a 66336 x 151 matrix with rows as users and columns as jokes.
- 10 of the jokes don’t have ratings, their ids are: { 1, 2, 3, 4, 6, 9, 10, 11, 12, 14 }.
- Each rating is from (-10.00 to +10.00) and 99 corresponds to a null rating (user did not rate that joke).
Note that the ratings are real values ranging from -10.00 to +10.00. As of May 2009, the jokes {7, 8, 13, 15, 16, 17, 18, 19} are the “gauge set” (as discussed in the Eigentaste paper) and the jokes {1, 2, 3, 4, 5, 6, 9, 10, 11, 12, 14, 20, 27, 31, 43, 51, 52, 61, 73, 80, 100, 116} have been removed (i.e. they are never displayed or rated).
http://eigentaste.berkeley.edu/dataset/
BX-CSV-Dump
Book-Crossing Dataset … mined by Cai-Nicolas Ziegler, DBIS Freiburg
Collected by Cai-Nicolas Ziegler in a 4-week crawl (August / September 2004) from the Book-Crossing community with kind permission from Ron Hornbaker, CTO of Humankind Systems. Contains 278,858 users (anonymized but with demographic information) providing 1,149,780 ratings (explicit / implicit) about 271,379 books.
Freely available for research use when acknowledged with the following reference (further details on the dataset are given in this publication):
Improving Recommendation Lists Through Topic Diversification,
Cai-Nicolas Ziegler, Sean M. McNee, Joseph A. Konstan, Georg Lausen; Proceedings of the 14th International World Wide Web Conference (WWW '05), May 10-14, 2005, Chiba, Japan. To appear.
As a courtesy, if you use the data, I would appreciate knowing your name, what research group you are in, and the publications that may result.
Format
The Book-Crossing dataset comprises 3 tables.
- BX-Users: Contains the users. Note that user IDs (User-ID) have been anonymized and map to integers. Demographic data is provided (Location, Age) if available. Otherwise, these fields contain NULL-values.
- BX-Books: Books are identified by their respective ISBN. Invalid ISBNs have already been removed from the dataset. Moreover, some content-based information is given (Book-Title, Book-Author, Year-Of-Publication, Publisher), obtained from Amazon Web Services. Note that in case of several authors, only the first is provided. URLs linking to cover images are also given, appearing in three different flavours (Image-URL-S, Image-URL-M, Image-URL-L), i.e., small, medium, large. These URLs point to the Amazon web site.
- BX-Book-Ratings: Contains the book rating information. Ratings (Book-Rating) are either explicit, expressed on a scale from 1-10 (higher values denoting higher appreciation), or implicit, expressed by 0.
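Because implicit ratings are encoded as 0, it is usually worth separating them from the explicit 1-10 ratings before building a network. The Python sketch below shows one way to do that. It assumes a semicolon-delimited CSV with quoted fields and a header row naming the columns User-ID, ISBN and Book-Rating; the delimiter and quoting are assumptions to check against the actual dump, and the sample rows and function name are made up.

```python
import csv
import io


def split_ratings(rows):
    """Separate explicit (1-10) ratings from implicit (0) ones."""
    explicit, implicit = [], []
    for row in rows:
        rating = int(row["Book-Rating"])
        # Keep the ISBN as a string: it can have leading zeros or a trailing X.
        record = (row["User-ID"], row["ISBN"], rating)
        if rating == 0:
            implicit.append(record)
        else:
            explicit.append(record)
    return explicit, implicit


# Hypothetical sample in the assumed layout; not taken from the real dump.
sample = io.StringIO(
    '"User-ID";"ISBN";"Book-Rating"\n'
    '"1";"0000000001";"0"\n'
    '"1";"0000000002";"8"\n'
    '"2";"0000000001";"10"\n'
)
reader = csv.DictReader(sample, delimiter=";", quotechar='"')
explicit, implicit = split_ratings(reader)
```

Streaming the rows through `csv.DictReader` also avoids opening the file in a spreadsheet program, which fits the "Excel error" noted for this dataset in the table.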
http://www2.informatik.uni-freiburg.de/~cziegler/BX/
relationship (link no longer works)
Internet AS-level Topology Archive
Introduction
This site serves as an archive of the historical Internet AS-level topology data for academic research, providing t

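This article gives details on several multilayer relational network datasets, including the Epinions user rating data, the Jester online joke recommender system data, the Book-Crossing user rating data, and several social network datasets such as Twitter interaction data around specific events. The datasets cover user ratings, social media interactions, book ratings, and genetic interactions, and are suited to multilayer network analysis and research.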