Big Data Pipeline
Data Sets
https://archive.ics.uci.edu/ml/datasets.html
https://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public/answer/Jeff-Hammerbacher#
https://github.com/holdenk/spark-testing-base/wiki/Dataset-Generator
https://grouplens.org/datasets/movielens/
References
Machine Learning with Spark (Chinese edition: Spark机器学习)
Nick Pentreath
Posts & Telecom Press (人民邮电出版社)
2015-09-01
Learning Jupyter
Dan Toomey
Packt Publishing Ltd.
November 2016
Learning Spark: Lightning-Fast Big Data Analysis (Chinese edition: Spark快速大数据分析)
Holden Karau
Posts & Telecom Press (人民邮电出版社)
2015-09-01