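A couple of days ago I was searching online for books on data warehousing and found that someone had put together a very detailed reading list. It may well be the most complete reading list for data warehousing, BI and data science as of 2019. I didn't want to keep it to myself, so I am reposting it here for everyone to learn from.
Because the original article is hosted on wordpress.com, not everyone may be able to access it, for reasons everyone understands. So I have reposted it word for word, including the author's own introductory book on data warehousing.
Author: Vincent
Original: https://dwbi1.wordpress.com/data-warehousing-books/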
Disappointed with the Google search results for “data warehousing books”, I have tried to put all the data warehousing books that I know of on this page. It is understandable that Google’s search results don’t include ETL or Dimensional Modeling titles, for example. The same goes for Amazon; see Note 1 below. Even a data warehouse book as important as Inmon’s DW 2.0 was missed because the title doesn’t contain the word “Warehouse”.
For data modelling, my all-time favourite is Kimball’s Toolkit (#1 in the list below). Devlin’s, Inmon’s and Imhoff’s classics (#3, #4 and #5 in the list) have broadened my horizons on the basic principles of DW design. For ODS design it’s #17, and the newest model is in #6. If you are building a DW on the SQL Server platform, Mundy’s Toolkit (#2) is a treasure. On Oracle, it’s Hobbs (#54), and on Teradata it’s Coffing’s series (#58 to #63). #7 to #11 explain Kimball’s theory in more detail. Some of them are about dimensional modelling (Adamson’s #8 is excellent), some are about ETL (Kimball’s #7 is a jewel). For methodology/project management, #11 is the classic, #27 is a proven treasure and #83 covers the iterative approach.
The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling by Ralph Kimball and Margy Ross
Microsoft Data Warehouse Toolkit: With SQL Server 2005 and the Microsoft Business Intelligence Toolset by Joy Mundy, Warren Thornthwaite, and Ralph Kimball
Building the Data Warehouse by W. H. Inmon
Mastering Data Warehouse Design: Relational and Dimensional Techniques by Claudia Imhoff, Nicholas Galemmo, and Jonathan G. Geiger
Data Warehouse: From Architecture to Implementation by Barry Devlin
DW 2.0: The Architecture for the Next Generation of Data Warehousing by William H. Inmon
The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data by Ralph Kimball and Joe Caserta
The Star Schema Handbook: The Complete Reference to Dimensional Data Warehouse Design by Christopher Adamson
The Data Webhouse Toolkit: Building the Web-enabled Data Warehouse by Ralph Kimball and Richard Merz
Data Warehouse Design Solutions by Christopher Adamson and Michael Venerable
The Data Warehouse Lifecycle Toolkit by Ralph Kimball, Margy Ross, Warren Thornthwaite, and Joy Mundy
Building a Data Warehouse: with Examples on SQL Server by Vincent Rainardi
Oracle Data Warehousing and Business Intelligence Solutions: With Business Intelligence Solutions by Robert Stackowiak, Joseph Rayman, and Rick Greenwald
Impossible Data Warehouse Situations: Solutions from the Experts (Information Technology) by Sid Adelman, Joyce Bischoff, Jill Dyché, and Douglas Hackney
Mastering Data Warehouse Aggregates: Solutions for Star Schema Performance by Christopher Adamson
Data Warehouse Performance by W. H. Inmon, Ken Rudin, Christopher K. Buss, and Ryan Sousa
Building the Operational Data Store by W. H. Inmon, Claudia Imhoff, and Greg Battas
Rapid Data Warehouse Design: User-Focused Techniques for Designing Dimensional Data Warehouses by Lawrence Corr
Data Warehouse Design: Modern Principles and Methodologies by Matteo Golfarelli and Stefano Rizzi
Advanced Data Warehouse Design: From Conventional to Spatial and Temporal Applications (Data-centric Systems and Applications) by Elzbieta Malinowski and Esteban Zimányi
Designing a Data Warehouse – Supporting Customer Relationship Management by Chris Todman
Data Warehouses and OLAP: Concepts, Architectures and Solutions by Robert Wrembel and Christian Koncilia
Implementing a Data Warehouse: A Methodology That Worked by Bruce Russel Ullrey
Data Warehousing for Dummies by Thomas C. Hammergren
Improving Data Warehouse and Business Information Quality : Methods for Reducing Costs and Increasing Profits by Larry P English
Data Warehouse 100 Success Secrets – 100 Most Asked Questions on Data Warehouse Design, Projects, Business Intelligence, Architecture, Software and Models by Richard Martin
Data Warehouse Project Management by Sid Adelman and Larissa T. Moss
Data Warehouse Management Handbook by Kachur
Data Warehouse: Extract, Transform, Load, Metadata, Data Integration, Data Mining, Data Warehouse Appliance, Database Management System, Decision Support System by Frederic P. Miller, Agnes F. Vandome, and John McBrewster
Oracle Data Warehouse Tuning for 10g by Gavin JT Powell
Using the Data Warehouse by W. H. Inmon and Richard D. Hackathorn
Entity-attribute-value model: Data model, Data warehouse, Denormalization, Attribute-value system, Linked Data, Resource Description Framework, Semantic Web, Inner-platform effect by Frederic P. Miller, Agnes F. Vandome, and John McBrewster
Index Structures for Data Warehouses: v. 1859 (Lecture Notes in Computer Science) by Marcus Jürgens
Tivoli Data Warehouse Version 1.3: Planning And Implementation by IBM Redbooks and Vasfi Gucer
Data Warehouse Implementations: Critical Implementation Factors Study by Joe Ganczarski
The Enterprise Data Warehouse: Planning, Building and Implementation v. 1 by Eric Sperley and Hewlett-Packard
Data Warehousing in the Real World: A Step-by-step Guide for Building Decision Support Data Warehouses by S. Anahory and D. Murray
Filtering the Web to Feed Data Warehouses by Witold Abramowicz, Pawel J. Kalczynski, and Krzysztof Wecel
Data Warehouse: Practical Advice from the Experts by Joyce Bischoff and Ted Alexander
Leveraging DB2 Data Warehouse Edition for Business Intelligence by IBM Redbooks
Fundamentals of Data Warehouses by Matthias Jarke, Maurizio Lenzerini, Yannis Vassiliou, and P. Vassiliadis
Web-enabled Data Warehouse by William A. Giovinazzo
Decision Support and Data Warehouse Systems by Efrem G Mallach
Planning and Designing the Data Warehouse (The Data Warehousing Institute series) by Ramon Barquin and Herb Edelstein
Data Warehouse Design by William A. Giovinazzo
Building, Using and Managing the Data Warehouse (Data Warehousing Institute) by Ramon Barquin and Herb Edelstein
Building a Data Warehouse for Decision Support by Vidette Poe and Laura L. Reeves
Parallel Systems in the Data Warehouse (Data Warehousing Institute) by Steve Morse and David Isaac
Decision Support in the Data Warehouse (The Data Warehousing Institute series) by Hugh J. Watson and Paul Gray
Building a Better Data Warehouse by Don Meyer and Casey E. Cannon
The Data Model Resource Book: A Library of Logical Data and Data Warehouse Models by Len Silverston, W. H. Inmon, and Kent Graziano
Managing the Data Warehouse: Practical Techniques for Monitoring Operations and Performances Administering Data and Tools by W. H. Inmon, J. D. Welch, and Katherine L. Glassey
The Intranet Data Warehouse: Tools and Techniques for Building Intranet-enabled Data Warehouse by Richard Tanler
Oracle 10g Data Warehousing by Lilian Hobbs PhD, Susan Hillson MS in CIS Boston University, Shilpa Lawande, and Pete Smith
Oracle9iR2 Data Warehousing by Lilian Hobbs, Susan Hillson MS in CIS Boston University, and Shilpa Lawande
Oracle8i Data Warehousing by Lilian Hobbs PhD and Susan Hillson MS in CIS Boston University
Oracle8i Data Warehousing by Michael J. Corey, Michael Abbey, Ben Taub, and Ian Abramson
Tera-Tom on Teradata Basics by Tom Coffing and Gareth Walter
Tera-Tom on Teradata Physical Implementation by W. Coffing and Mark Ferguson
Tera-Tom on Teradata SQL by Tom Coffing and Robert Hines
Tera-Tom on Teradata Database Administrator by Tom Coffing and Steve Wilmes
Tera-Tom on Teradata Designer by Tom Coffing and Todd Wilson
Tera-Tom on Teradata Application Development by Tom Coffing and Scott Smith
Tera-Tom on Teradata E-Business by Randy Volters and Tom Coffing
Teradata SQL Unleash the Power V2R6 by Thomas L. Coffing and Michael Larkins
Teradata Utilities – Breaking the Barriers by Tom Coffing, Morgan Jones, Mike Larkins, Steve Wilmes, Randy Volters
Netezza SQL – Harness the Power by Mike Larkins and Tom Coffing
Netezza Underground: The unauthorized tales of derring-do and adventures in resilient data warehousing solutions by David Birmingham
Teradata Users Guide: The Ultimate Companion by Tom Coffing, Leona Coffing, Chris Coffing, and Robert Hines
Teradata SQL Quick Reference Guide – Simplicity By Design by Tom Coffing, Todd Carroll, Robert Hines, and Mike Larkins
Secrets of Best Data Warehouses in the World by Rob Armstrong, Tom Coffing, and Rolf Hanusa
Common Warehouse Metamodel: An Introduction to the Standard for Data Warehouse Integration (Omg) by John Poole, Dan Chang, Douglas Tolbert, and David Mellor
50 Tb Data Warehouse Benchmark on IBM System Z by IBM Redbooks
E-Business Intelligence Front-End Tool Access to Os/390 Data Warehouse by IBM Redbooks
Rdb/vms: Developing a Data Warehouse by William H. Inmon and Chuck Kelley
Data Warehouses: More Than Just Mining by Barbara J. Bashein and M. Lynne Markus
Corporate Information with Sap(R)-Eis: Building a Data Warehouse and Mis-Application (Efficient business-computing) by Bernd-Ulrich Kaiser
Dimensional Data Warehousing with MySQL: A Tutorial by Djoni Darmawikarta
Data Warehousing Fundamentals: A Comprehensive Guide for IT Professionals by Paulraj Ponniah
Data Warehousing, Data Mining, and OLAP (Data Warehousing/Data Management) by Alex Berson and Stephen J. Smith
Data Warehousing: Architecture and Implementation by Mark W. Humphries, Michael W. Hawkins, and Michelle C. Dy
Data Warehousing 101: Concepts and Implementation by Arshad Khan
Agile Data Warehousing: Delivering World-Class Business Intelligence Systems Using Scrum and XP by Ralph Hughes
e-Data: Turning Data Into Information With Data Warehousing by Jill Dyché
Pentaho Solutions: Business Intelligence and Data Warehousing with Pentaho and MySQL by Roland Bouman and Jos van Dongen
A Manager’s Guide to Data Warehousing by Laura Reeves
Data Warehousing with SAP Bw7 Bi in SAP Netweaver 2004s: Architecture, Concepts, and Implementation by Christian Mehrwald and Sabine Morlock
Data Warehousing: Using the Wal-Mart Model (The Morgan Kaufmann Series in Data Management Systems) by Paul Westerman
Oracle DBA Guide to Data Warehousing and Star Schemas by Bert Scalzo
Building and Maintaining a Data Warehouse by Fon Silvers
Evolving Application Domains of Data Warehousing and Mining: Trends and Solutions by Pedro Nuno San-Banto Furtado
Data Warehousing And Business Intelligence For e-Commerce (The Morgan Kaufmann Series in Data Management Systems) by Alan R. Simon and Steven L. Shaffer
Data Warehousing with Informix: Best Practices by Angela Sanchez
Data Warehousing: Concepts, Technologies, Implementations, and Management by Harry Singh
Data Warehousing in Action by Sean Kelly
High Performance Oracle Data Warehousing: All You Need to Master Professional Database Development Using Oracle by Donald K. Burleson
Implementing Enterprise Data Warehousing: A Guide for Executives by Alan Schlukbier
Complex Data Warehousing and Knowledge Discovery for Advanced Retrieval Development: Innovative Methods and Applications (Advances in Data Warehousing and Mining (Adwm) Book Series) by Tho Manh Nguyen
New Trends in Data Warehousing and Data Analysis (Annals of Information Systems) by Stanislaw Kozielski and Robert Wrembel
Data Warehousing with Service-oriented Architecture: Designing and Implementing Prototype Models For an Integration of Near-Real-Time Data Warehousing Architecture with Service-oriented Architecture by Ronnie Abrahiem
Encyclopedia of Data Warehousing and Mining, Second Edition by John Wang
IBM Data Warehousing: With IBM Business Intelligence Tools by Michael L. Gonzales
Clickstream Data Warehousing by Mark Sweiger, Mark R. Madsen, Jimmy Langston, and Howard Lombard
Intelligent Data Warehousing: From Data Preparation to Data Mining by Zhengxin Chen
Data Stores, Data Warehousing, and the Zachman Framework: Managing Enterprise Knowledge (Mcgraw-Hill Series on Data Warehousing and Data Management) by William H. Inmon, John A. Zachman, and Jonathan G. Geiger
Progressive Methods in Data Warehousing and Business Intelligence: Concepts and Competitive Analytics (Advances in Data Warehousing and Mining) by David Taniar
Modern Data Warehousing, Mining, and Visualization: Core Concepts by George M. Marakas
AS/400 Data Warehousing: The Complete Guide to Implementation by Brian W. Kelly
Data Warehousing and Data Mining for Telecommunications (Artech House Computer Science Library) by Rob Mattison
Data Warehousing : Design, Development and Best Practices by Soumendra Mohanty
Exploration Warehousing: Turning Business Information into Business Opportunity by William H. Inmon, R. H. Terdeman, and Claudia Imhoff
The Data Model Resource Book: A Library of Logical Data and Data Warehouse Designs by Len Silverston, William H. Inmon, and Kent Graziano
Data Warehousing in the Real World (A Practical Guide for Building Decision Support Systems) by Dennis Murray and Sam Anahory
Parallel Processing Techniques for Data Warehousing and Mining: Application and Challenges by Satchidananda Dehuri
Essential Oracle8i Data Warehousing: Designing, Building, and Managing Oracle Data Warehouses by Gary Dodge and Tim Gorman
The Essential Guide to Data Warehousing by Lou Agosta
Data Warehousing OLAP and Data Mining by S. Nagabhushana
Building the Customer-Centric Enterprise: Data Warehousing Techniques for Supporting Customer Relationship Management by Claudia Imhoff, Lisa Loftis, and Jonathan G. Geiger
Data Warehousing: The Ultimate Guide to Building Corporate Business Intelligence (HOTT Guide) by SCN Education B.V.
Data Warehousing and Knowledge Discovery: 9th International Conference, DaWaK 2007, Regensburg, Germany, September 3-7, 2007, Proceedings (Lecture Notes … Applications, incl. Internet/Web, and HCI) by Il Yeol Song, Johann Eder, and Tho Manh Nguyen
Clinical Data Mining and Warehousing, An Issue of Clinics in Laboratory Medicine (The Clinics: Internal Medicine) by James Harrison Jr. MD PhD
Using data warehousing to deliver integrated management information: Case studies of customer data integration using sales and marketing data marts by Shana Ponelis
Data Warehousing and Knowledge Discovery: 6th International Conference, DaWaK 2004, Zaragoza, Spain, September 1-3, 2004, Proceedings (Lecture Notes in Computer Science) by Yahiko Kambayashi, Mukesh Mohania, and Wolfram Wöß
Strategic Data Warehousing: Achieving Alignment with Business by Neera Bhansali
Strategic Data Warehousing Principles Using SAS Software by Peter R. Welbrock
Data Warehousing: The Route to Mass Communication by Sean Kelly
Data Warehousing for E-Business by R. H. Terdeman, Joyce Norris-Montanari, Dan Meers, and William H. Inmon
Data Warehousing and Knowledge Discovery: 10th International Conference, DaWaK 2008, Turin, Italy, September 1-5, 2008, Proceedings (Lecture Notes in Computer … Applications, incl. Internet/Web, and HCI) by Il-Yeol Song, Johann Eder, and Tho Manh Nguyen
Data Warehousing and Data Mining Techniques for Cyber Security (Advances in Information Security) by Anoop Singhal
Data Warehousing and Decision Support : The State of the Art, Volume 1 by Pam Roth. Volume 2 is here.
Advances in Database Technologies: ER ’98 Workshops on Data Warehousing and Data Mining, Mobile Data Access, and Collaborative Work Support and Spatio-Temporal … (Lecture Notes in Computer Science) by Yahiko Kambayashi, Dik Lun Lee, Ee-Peng Lim, and Mukesh Kumar Mohania
Data Warehousing and Web Engineering by Shirley A. Becker
ERP and Data Warehousing in Organizations: Issues and Challenges by Gerald G. Grant
Data Warehousing and Knowledge Discovery: 8th International Conference, DaWaK 2006, Krakow, Poland, September 4-8, 2006, Proceedings (Lecture Notes in … Applications, incl. Internet/Web, and HCI) by A Min Tjoa and Juan Trujillo
Data Warehousing Advice for Managers by Patricia L. Ferdinandi
Data Warehousing and the Management Accountant (CIMA Research) by Ian Cobb
Data Warehousing and Knowledge Discovery: 4th International Conference, DaWaK 2002, Aix-en-Provence, France, September 4-6, 2002. Proceedings (Lecture Notes in Computer Science) by Yahiko Kambayashi, Werner Winiwarter, and Masatoshi Arikawa
Oracle Data Warehousing Unleashed by Michael Schrader, John Dakin, Kieron Hardy, and Matthew Townsend
Journal of Healthcare Information Management, E-Healthcare Data Warehousing Journal of Healthcare Information Management, No. 2: Journal of Healthcare … Health Care Information Mgmt) by Julie Foreman
Worldwide Data Warehousing Tools 2004 Vendor Shares by Dan Vesset
Constructing Data Warehouses with Metadata-driven Generic Operators by Dr Bin Jiang.
Testing the Data Warehouse Practicum by Doug Vucevic and Wayne Yaddow
Notes:
You may think that a “data warehouse” search on Amazon would also include “data warehousing”. That was what I thought. But sadly, no. I don’t expect Amazon search to be smart enough to work out that the terms “ETL” or “Dimensional Model” have a lot to do with data warehousing either, hence my motive for creating this list. The same goes for the terms “ODS” and “data mart”.
A data warehouse book as important as Inmon’s DW 2.0 was missed because the title doesn’t contain “Warehous*”. Sad. And Data Warehousing 101: Concepts and Implementation by Arshad Khan was missed when we search for “Data Warehouse” on Amazon.
I don’t limit myself to SQL Server. As you can see, I also include Oracle books. We can learn a lot about data warehousing from other platforms, particularly about ETL. In fact I learnt a lot from a book called “Oracle 8i Data Warehousing” (Corey et al, not Hobbs & Hillson). Informix, DB2, MySQL, AS/400 and SAS are all in there now.
I don’t include a data modelling book in the list if it’s a general one. I only include it if it’s about dimensional modelling.
I don’t include “bundles”, i.e. several books packaged and sold as one. An example of a bundle is Kimball’s Toolkit bundle. The reason is that I have included the components individually.
I don’t include a data mining book if it’s only about data mining. But if it contains data warehousing as well, then I include it. See Alex Berson’s, for example. Ditto for MDM, BI, OLAP, DQ and Text Analytics. I do include Decision Support though (well, of course).
Can you believe it’s 123 books on data warehousing! That’s a lot of books for one area of study/work. And that excludes the things I mentioned above.
If there are many editions of a book (like Inmon’s classic), I only include the latest one. The first edition is sometimes an absolute treasure, like Kimball’s 1996, but there you go. When it’s a rewrite for a different version of the software, I include them all. For example: Oracle 8i, 9i and 10g Data Warehousing.
I do include conference proceedings and lecture notes, even though some people say they are not ‘real books’. I don’t care about the physical form (thin, thick, non-paper, etc.), as long as the content is warehousing.
Apologies: there are many DW books in German which I don’t include here, primarily because this is an English blog and I can’t write in German. Perhaps somebody else could make a list of these German DW books (there are really a lot of them; check Amazon).
I know there is a Data Warehousing book in MySQL. I know it exists because I know the author, who is also from Indonesia like me but he lives in Canada now. Djoni Darmawikarta. So I’ll find it and put it here too.
I own Barry Devlin’s warehousing book. It is very old and the binding is almost off, but the content is illuminating, primarily because it was written free from Inmon and Kimball’s influence, hence it defined its own principles of design. I’ll add it here.
Intelligent Solution composed a comprehensive list of data warehousing articles, from 1993 to 2006.
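The search problem described in the first two notes can be illustrated with a small sketch. It compares a literal “data warehouse” phrase match with a broader match on the stem “warehous” plus the related terms the notes mention (ETL, dimensional model, ODS, data mart). The function names and the term list are assumptions made for illustration; this is not how Amazon or Google search actually works.

```python
import re

# Hypothetical term list: the stem "warehous" plus related terms from the notes.
BROAD_PATTERN = re.compile(
    r"warehous|\bETL\b|dimensional model|\bODS\b|operational data store|data mart",
    re.IGNORECASE,
)

def naive_match(title: str) -> bool:
    """Literal phrase match, which misses 'data warehousing'."""
    return "data warehouse" in title.lower()

def broad_match(title: str) -> bool:
    """Stem and related-term match (illustrative only)."""
    return BROAD_PATTERN.search(title) is not None

if __name__ == "__main__":
    titles = [
        "Data Warehousing 101: Concepts and Implementation",
        "Building the Operational Data Store",
        "The Data Warehouse ETL Toolkit",
        "Tera-Tom on Teradata SQL",
    ]
    for title in titles:
        print(f"{title!r}: naive={naive_match(title)}, broad={broad_match(title)}")
```

Even the broader match misses platform titles such as the Teradata series, which is one reason a hand-curated list is still useful.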
My Book
I am sometimes asked by people who want to learn data warehousing to recommend a book for them. Some of them are database administrators/data architects (on various platforms) and some are developers (application developers and database developers). They know how to write SQL. They know how to create tables. They know how to query data. They are looking for a basic data warehousing book which is practical and aimed at beginners. A book that can be used by new starters to build their first data warehouse, and the BI on top of it. A book that contains all the essential topics such as methodology, architecture, data modelling, ETL, data quality, reports, cubes and BI. A book that contains examples and illustrations from real projects which are easy to understand. For this reason I wrote a data warehousing book: Building a Data Warehouse: with Examples on SQL Server (#12).
It has 17 chapters:
Chapter 1 is about what a data warehouse is
Chapter 2 is about data warehouse architecture
Chapter 3 is about methodology / project management
Chapter 4 is about gathering requirements
Chapter 5 is about designing the data model, both dimensional and normalised
Chapter 6 is about the system architecture/servers and configuring the databases
Chapter 7 is about ETL (extracting data from source systems)
Chapter 8 is also about ETL (loading data into the warehouse)
Chapter 9 is about data quality
Chapter 10 is about metadata
Chapter 11 is about reports
Chapter 12 is about OLAP cubes
Chapter 13 is about BI (Business Intelligence)
Chapter 14 is about using a data warehouse for CRM
Chapter 15 is about unstructured data and data warehousing search
Chapter 16 is about testing
Chapter 17 is about operation and administration
It contains all the essential topics in data warehousing. For this book to be usable for building the reader’s first data warehouse, and the BI on top of it, I needed to give a case study: one that contains examples spanning all those chapters, from designing the architecture to building the cubes and reports. For this purpose I had to choose a platform. I chose SQL Server. Not only does it have an excellent database engine, it also comes with ETL, reports, OLAP cubes and a data mining tool built in. SQL Server 2005/2008 is a complete end-to-end data warehousing solution. So in chapter 6 I use the SQL Server database server to create the databases. In chapters 7 and 8 I use SSIS for data extraction and data loading (ETL). In chapter 10 I use a SQL Server database for metadata. In chapter 11 I use SSRS for reports. In chapter 12 I use SSAS for OLAP cubes. And in chapter 13 I use SSAS for data mining. I hope this book will serve its purpose of providing a basic data warehousing book which is practical and aimed at beginners.