Data Mining: Practical Machine Learning Tools and Techniques (English Edition, 3rd Edition)


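Media Reviews

  "This book combines theory with practical application, and its focus on practice is one of its defining strengths. I strongly recommend it to everyone working in data mining and machine learning!"
  -- Dorian Pyle
  Author of Data Preparation for Data Mining and Business Modeling for Data Mining
  "This book is highly regarded in the field of data mining technology and is required reading for data mining analysts!"
  -- Herb Edelstein
  Chief data mining consultant, Two Crows Consulting
  "This is one of my favorite data mining books. It not only introduces the various algorithms step by step, but also backs them with plentiful examples that explain in detail how to apply them to real data mining problems. The book is useful not only for learning the Weka software but also for understanding the various machine learning algorithms."
  -- Tom Breur
  Principal consultant, XLNT Consulting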
 

About the Book



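  Data Mining: Practical Machine Learning Tools and Techniques (English Edition, 3rd Edition) is a classic, best-selling textbook in machine learning and data mining, adopted as a course text by many well-known universities abroad. It not only covers the basic theory of machine learning in detail but also offers advice on the tools and techniques used in practical work. This edition has been thoroughly updated to reflect the technical changes and new methods that have appeared in data mining since the second edition, with new material on data transformations, ensemble learning, massive datasets, and multi-instance learning, as well as a new version of the Weka machine learning software.
  The book is rigorously organized, rich in content, and highly practical. It is suitable as a textbook for undergraduate or graduate students and as a reference for practitioners.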
 

Table of Contents

Data Mining: Practical Machine Learning Tools and Techniques (English Edition, 3rd Edition)
preface
updated and revised content
second edition
third edition
acknowledgments
about the authors
part i introduction to data mining
chapter 1 what's it all about?
1.1 data mining and machine learning
describing structural patterns
machine learning
data mining
1.2 simple examples: the weather problem and others
the weather problem
contact lenses: an idealized problem
irises: a classic numeric dataset
cpu performance: introducing numeric prediction
labor negotiations: a more realistic example
soybean classification: a classic machine learning success
1.3 fielded applications
web mining
decisions involving judgment
screening images
load forecasting
diagnosis
marketing and sales
other applications
1.4 machine learning and statistics
1.5 generalization as search
1.6 data mining and ethics
reidentification
using personal information
wider issues
1.7 further reading
chapter 2 input: concepts, instances, and attributes
2.1 what's a concept?
2.2 what's in an example?
relations
other example types
2.3 what's in an attribute?
2.4 preparing the input
gathering the data together
arff format
sparse data
attribute types
missing values
inaccurate values
getting to know your data
2.5 further reading
chapter 3 output: knowledge representation
3.1 tables
3.2 linear models
3.3 trees
3.4 rules
classification rules
association rules
rules with exceptions
more expressive rules
3.5 instance-based representation
3.6 clusters
3.7 further reading
chapter 4 algorithms: the basic methods
4.1 inferring rudimentary rules
missing values and numeric attributes
discussion
4.2 statistical modeling
missing values and numeric attributes
naive bayes for document classification
discussion
4.3 divide-and-conquer: constructing decision trees
calculating information
highly-branching attributes
4.4 covering algorithms: constructing rules
rules versus trees
a simple covering algorithm
rules versus decision lists
4.5 mining association rules
item sets
association rules
generating rules efficiently
discussion
4.6 linear models
numeric prediction: linear regression
linear classification: logistic regression
linear classification using the perceptron
linear classification using winnow
4.7 instance-based learning
distance function
finding nearest neighbors efficiently
discussion
4.8 clustering
iterative distance-based clustering
faster distance calculations
discussion
4.9 multi-instance learning
aggregating the input
aggregating the output
discussion
4.10 further reading
4.11 weka implementations
chapter 5 credibility: evaluating what's been learned
5.1 training and testing
5.2 predicting performance
5.3 cross-validation
5.4 other estimates
leave-one-out cross-validation
the bootstrap
5.5 comparing data mining schemes
5.6 predicting probabilities
quadratic loss function
informational loss function
discussion
5.7 counting the cost
cost-sensitive classification
cost-sensitive learning
lift charts
roc curves
recall-precision curves
discussion
cost curves
5.8 evaluating numeric prediction
5.9 minimum description length principle
5.10 applying the mdl principle to clustering
5.11 further reading
part ii advanced data mining
chapter 6 implementations: real machine learning schemes
6.1 decision trees
numeric attributes
missing values
pruning
estimating error rates
complexity of decision tree induction
from trees to rules
c4.5: choices and options
cost-complexity pruning
discussion
6.2 classification rules
criteria for choosing tests
missing values, numeric attributes
generating good rules
using global optimization
obtaining rules from partial decision trees
rules with exceptions
discussion
6.3 association rules
building a frequent-pattern tree
finding large item sets
discussion
6.4 extending linear models
maximum-margin hyperplane
nonlinear class boundaries
support vector regression
kernel ridge regression
kernel perceptron
multilayer perceptrons
radial basis function networks
stochastic gradient descent
discussion
6.5 instance-based learning
reducing the number of exemplars
pruning noisy exemplars
weighting attributes
generalizing exemplars
distance functions for generalized exemplars
generalized distance functions
discussion
6.6 numeric prediction with local linear models
model trees
building the tree
pruning the tree
nominal attributes
missing values
pseudocode for model tree induction
rules from model trees
locally weighted linear regression
discussion
6.7 bayesian networks
making predictions
learning bayesian networks
specific algorithms
data structures for fast learning
discussion
6.8 clustering
choosing the number of clusters
hierarchical clustering
example of hierarchical clustering
incremental clustering
category utility
probability-based clustering
the em algorithm
extending the mixture model
bayesian clustering
discussion
6.9 semisupervised learning
clustering for classification
co-training
em and co-training
discussion
6.10 multi-instance learning
converting to single-instance learning
upgrading learning algorithms
dedicated multi-instance methods
discussion
6.11 weka implementations
chapter 7 data transformations
7.1 attribute selection
scheme-independent selection
searching the attribute space
scheme-specific selection
7.2 discretizing numeric attributes
unsupervised discretization
entropy-based discretization
other discretization methods
entropy-based versus error-based discretization
converting discrete attributes to numeric attributes
7.3 projections
principal components analysis
random projections
partial least-squares regression
text to attribute vectors
time series
7.4 sampling
reservoir sampling
7.5 cleansing
improving decision trees
robust regression
detecting anomalies
one-class learning
7.6 transforming multiple classes to binary ones
simple methods
error-correcting output codes
ensembles of nested dichotomies
7.7 calibrating class probabilities
7.8 further reading
7.9 weka implementations
chapter 8 ensemble learning
8.1 combining multiple models
8.2 bagging
bias-variance decomposition
bagging with costs
8.3 randomization
randomization versus bagging
rotation forests
8.4 boosting
adaboost
the power of boosting
8.5 additive regression
numeric prediction
additive logistic regression
8.6 interpretable ensembles
option trees
logistic model trees
8.7 stacking
8.8 further reading
8.9 weka implementations
chapter 9 moving on: applications and beyond
9.1 applying data mining
9.2 learning from massive datasets
9.3 data stream learning
9.4 incorporating domain knowledge
9.5 text mining
9.6 web mining
9.7 adversarial situations
9.8 ubiquitous data mining
9.9 further reading
part iii the weka data mining workbench
chapter 10 introduction to weka
10.1 what's in weka?
10.2 how do you use it?
10.3 what else can you do?
10.4 how do you get it?
chapter 11 the explorer
11.1 getting started
preparing the data
loading the data into the explorer
building a decision tree
examining the output
doing it again
working with models
when things go wrong
11.2 exploring the explorer
loading and filtering files
training and testing learning schemes
do it yourself: the user classifier
using a metalearner
clustering and association rules
attribute selection
visualization
11.3 filtering algorithms
unsupervised attribute filters
unsupervised instance filters
supervised filters
11.4 learning algorithms
bayesian classifiers
trees
rules
functions
neural networks
lazy classifiers
multi-instance classifiers
miscellaneous classifiers
11.5 metalearning algorithms
bagging and randomization
boosting
combining classifiers
cost-sensitive learning
optimizing performance
retargeting classifiers for different tasks
11.6 clustering algorithms
11.7 association-rule learners
11.8 attribute selection
attribute subset evaluators
single-attribute evaluators
search methods
chapter 12 the knowledge flow interface
12.1 getting started
12.2 components
12.3 configuring and connecting the components
12.4 incremental learning
chapter 13 the experimenter
13.1 getting started
running an experiment
analyzing the results
13.2 simple setup
13.3 advanced setup
13.4 the analyze panel
13.5 distributing processing over several machines
chapter 14 the command-line interface
14.1 getting started
14.2 the structure of weka
classes, instances, and packages
the weka.core package
the weka.classifiers package
other packages
javadoc indexes
14.3 command-line options
generic options
scheme-specific options
chapter 15 embedded machine learning
15.1 a simple data mining application
messageclassifier()
updatedata()
classifymessage()
chapter 16 writing new learning schemes
16.1 an example classifier
buildclassifier()
maketree()
computeinfogain()
classifyinstance()
tosource()
main()
16.2 conventions for implementing classifiers
capabilities
chapter 17 tutorial exercises for the weka explorer
17.1 introduction to the explorer interface
loading a dataset
the dataset editor
applying a filter
the visualize panel
the classify panel
17.2 nearest-neighbor learning and decision trees
the glass dataset
attribute selection
class noise and nearest-neighbor learning
varying the amount of training data
interactive decision tree construction
17.3 classification boundaries
visualizing 1r
visualizing nearest-neighbor learning
visualizing naive bayes
visualizing decision trees and rule sets
messing with the data
17.4 preprocessing and parameter tuning
discretization
more on discretization
automatic attribute selection
more on automatic attribute selection
automatic parameter tuning
17.5 document classification
data with string attributes
classifying actual documents
exploring the stringtowordvector filter
17.6 mining association rules
association-rule mining
mining a real-world dataset
market basket analysis
references
index


Source: ITPUB Blog, http://blog.itpub.net/16502878/viewspace-740001/. If you repost this, please credit the source; otherwise legal liability will be pursued.

