Courses on Data Mining, Machine Learning and Pattern Recognition

Data Mining

The subject of Knowledge Discovery and Data Mining (KDD) concerns the extraction of useful information from data. Since this is also the essence of many sub-areas of computer science, as well as of the field of statistics, KDD can be said to lie at the intersection of statistics, machine learning, databases, pattern recognition, information retrieval and artificial intelligence.

The subject matter of data mining is vast, which makes learning about the subject itself something of a data mining task! The one-semester course that I teach emphasizes the theory and algorithms of data mining. Such algorithms are concerned with deriving global models and local patterns, with visualization, and with retrieval by content. Related courses that I teach, on pattern recognition and machine learning, cover many other topics related to data mining.

Textbook: Principles of Data Mining by D. Hand, H. Mannila and P. Smyth, MIT Press, 2001.


Lectures

The following are the presentation slides used in a classroom setting, given here as links to PDF files. Since these slides are continually updated, you may wish to revisit them. The course is being taught in Spring 2010.

  1. Introduction to Data Mining
    1. Introduction to Data Mining
  2. Measurement and Data
    1. Measurements and Distances
  3. Visualization of Data
    1. Summarizing Data, Histograms, Scatter Plots
    2. Principal Components Analysis and Multidimensional Scaling
  4. Data Analysis and Uncertainty
    1. Random Variables
    2. Estimation
    3. Hypothesis Testing/Sampling
  5. Systematic Overview of Data Mining Algorithms
    1. Decision Trees and MLP
    2. Association Rules and Text Retrieval
  6. Models and Patterns
    1. Prediction Models
    2. Probability Models and Graphical Models
    3. Structured Data: Markov Models
    4. Pattern Structures
  7. Content-Based Information Retrieval
    1. Precision and Recall
    2. Text Retrieval: Term Frequency and Inverse Document Frequency (see the sketch after this list)
    3. Text Retrieval: Latent Semantic Indexing and Probabilistic Retrieval
    4. Content-Based Image Retrieval
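As a small illustration of topic 7.2 above, here is a minimal sketch of term-frequency/inverse-document-frequency (TF-IDF) weighting in Python. The toy documents, the variable names and the particular idf variant used are my own choices for illustration and are not taken from the course slides.

```python
import math
from collections import Counter

# Toy document collection (hypothetical, for illustration only).
docs = [
    "data mining extracts patterns from data",
    "machine learning builds models from examples",
    "text retrieval ranks documents by relevance",
]
tokenized = [d.split() for d in docs]
N = len(tokenized)

# Document frequency: the number of documents containing each term.
df = Counter()
for terms in tokenized:
    df.update(set(terms))

def tf_idf(term, terms):
    """TF-IDF weight of a term within one tokenized document."""
    tf = terms.count(term) / len(terms)   # normalized term frequency
    idf = math.log(N / df[term])          # one common idf variant
    return tf * idf

# Example: the weight of "data" in the first toy document.
print(round(tf_idf("data", tokenized[0]), 3))
```

In a full retrieval system these weights would form document vectors that are compared against a query vector, for example with cosine similarity, to rank documents by relevance.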


============================================================================================

Machine Learning

Machine learning is an exciting subject about designing machines that can learn from examples. The course covers the necessary theory, principles and algorithms for machine learning. The methods are based on statistics and probability, which have now become essential to designing systems exhibiting artificial intelligence. The course emphasizes Bayesian techniques and probabilistic graphical models (PGMs). The material is complementary to a course on Data Mining, where statistical concepts are used to analyze data for human, rather than machine, use.
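As a concrete, deliberately tiny illustration of the Bayesian viewpoint that the course emphasizes, the sketch below performs a conjugate Beta-Bernoulli posterior update in Python. The prior parameters and the toy observations are hypothetical and are not drawn from the lecture material.

```python
# Minimal Beta-Bernoulli update (illustrative sketch only).
# Prior: theta ~ Beta(a, b); data: a sequence of 0/1 outcomes.
# Posterior: theta ~ Beta(a + number of ones, b + number of zeros).

def beta_bernoulli_update(a, b, outcomes):
    """Return the posterior Beta parameters after observing 0/1 outcomes."""
    ones = sum(outcomes)
    zeros = len(outcomes) - ones
    return a + ones, b + zeros

a, b = 2.0, 2.0                      # weakly informative prior (assumed)
data = [1, 0, 1, 1, 0, 1, 1, 1]      # toy observations (assumed)
a_post, b_post = beta_bernoulli_update(a, b, data)
print(f"posterior: Beta({a_post}, {b_post}), "
      f"mean = {a_post / (a_post + b_post):.3f}")
```

The same prior-times-likelihood reasoning, scaled up to many variables, is what the probabilistic graphical model sections of the course formalize.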

The textbooks for different parts of the course are "Pattern Recognition and Machine Learning" by Chris Bishop (Springer 2006) and "Probabilistic Graphical Models" by Daphne Koller and Nir Friedman (MIT Press 2009).

Lecture Slides for Machine Learning and Probabilistic Graphical Models

Following are course topics with pointers to lecture overhead slides and some lecture video files. Previously taught as a single-semester course, the material is now divided into two successive courses taught during the Fall and Spring semesters.

Note about slides and videos: The slides for Sections 1-7 are from Fall 2010, and the slides for Sections 8, 9 and 11 are from Spring 2011. The videos for Sections 1-7 are from Fall 2008, while the videos for Sections 8, 9 and 11 are from Spring 2011, so the correspondence between slides and videos may be somewhat off. Lecture slides are frequently updated as the course progresses. Sections 1-14 (whose topic titles appear in red on the original page) are the more recently taught versions.


  1. Introduction
    1. Machine Learning-Overview(3MB)
    2. Text Classification Example(225KB)
    3. Regression Example(562KB)
    4. Probability Theory(950KB)
    5. Decision-Theory(212KB)
    6. Information Theory(160KB)
    7. MATLAB Introduction(347KB)
  2. Probability Distributions
    1. Discrete Distributions(291KB)
    2. Gaussian Distribution(833KB)
  3. Linear Models for Regression
    1. Regression with Basis Functions(1.2MB)
    2. Bias-Variance(616KB)
    3. Bayesian Regression(1MB)
    4. Bayesian Model Comparison(300KB)
    5. Evidence Approximation(650KB)
  4. Linear Models for Classification
    1. Introduction(88KB)
    2. Discriminant Functions(2.4MB)
    3. Generative Models(1.4MB)
    4. Logistic Regression(1.4MB)
    5. Laplace Approximation (519KB)
    6. Bayesian Logistic Regression(845KB)
  5. Neural Networks
    1. Introduction(551KB)
    2. Training(848KB)
    3. Error Backpropagation(821KB)
    4. The Hessian Matrix(288KB)
    5. Regularization in Neural Networks(1.2MB)
    6. Mixture Density Networks (634KB)
    7. Bayesian Neural Networks(716KB)
  6. Kernel Methods
    1. Kernel Methods(1MB)
    2. Radial Basis Function Networks(549KB)
    3. Gaussian Processes(1.4 MB)
  7. Sparse Kernel Machines
    1. Support Vector Machines(958KB)
  8. Probabilistic Graphical Models (Directed)
    1. Bayesian Networks(1.4MB)
    2. Querying Probability Distributions(460KB)
    3. Genetic Inheritance Example(207KB)
    4. Graphs and Distributions(305KB)
    5. Reasoning Patterns & D-Separation(393KB)
    6. Conditional Independence(830KB)
    7. Semantics of Bayesian Networks(385KB)
  9. Probabilistic Graphical Models (Undirected)
    1. Undirected Graphical Models(690KB)
    2. Independencies in Markov Networks(526KB)
    3. Constructing Markov Networks(251KB)
    4. Alternate Parameterizations of MNs(2.1MB)
    5. MRFs in Computer Vision(1.2MB)
    6. From BNs to MNs(144KB)
    7. Partially Directed Models & CRFs(916KB)
  10. Inference in Graphical Models
    1. Introduction(1MB)
    2. Factor Graphs(1.1MB)
    3. Max Sum Algorithm(734KB)
    4. Loopy Belief Propagation(78KB)
  11. Learning Graphical Models
    1. Learning PGMs: Overview(695KB)
    2. Learning as Optimization(398KB)
    3. Parameter Estimation(1MB)
    4. Bayesian Estimation in Bay.Nets(980KB)
  12. Mixture Models and EM
    1. K-means Clustering(1.1MB) (see the sketch after this topic list)
    2. Mixtures of Gaussians(1MB)
    3. Latent Variable View of EM(516KB)
  13. Approximate Inference
    1. Approximate Inference(3.2MB)
  14. Sampling Methods
    1. Basic Sampling Methods(375KB)
    2. Monte Carlo Methods(426KB)
  15. Continuous Latent Variables
    1. Principal Components Analysis
    2. Nonlinear Latent Variable Models
  16. Sequential Data
    1. Markov Models(433KB)
    2. Hidden Markov Models(1.3MB)
    3. Extensions to HMMs(287KB)
    4. Linear Dynamical Systems(217KB)
    5. Conditional Random Fields(1.6MB)
  17. Combining Models
    1. Boosting(pdf, 156KB)
  18. Concept Learning
    1. Hypothesis Space (pdf, 111KB)
    2. Candidate Elimination (pdf,236KB)
  19. Decision Trees
    1. Information Gain and ID3(pdf, 286KB)
    2. Data Sets and Data Mining(pdf, 332KB)
    3. Overfitting and Pruning(pdf, 536KB)
  20. Computational Learning Theory
    1. PAC Learning(pdf, 98KB)
    2. VC Dimension(pdf, 321KB)
    3. Mistake Bound(pdf, 51KB)
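The sketch below, referenced from topic 12.1 above, is a minimal Python rendering of K-means (Lloyd's algorithm) on one-dimensional toy data. The data values, the fixed iteration count, the random initialization and the empty-cluster fallback are my own simplifications and are not taken from the slides.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal K-means: alternate assignment and centroid-update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # random initial centroids
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for x in points:
            j = min(range(k), key=lambda c: abs(x - centroids[c]))
            clusters[j].append(x)
        # Update step: move each centroid to the mean of its cluster,
        # keeping the old centroid if a cluster happens to be empty.
        centroids = [sum(c) / len(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return centroids, clusters

data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7, 9.9, 10.2, 10.1]   # toy 1-D data
centroids, clusters = kmeans(data, k=3)
print(sorted(centroids))
```

A production implementation would also monitor the within-cluster sum of squares and stop once the assignments no longer change.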

Lecture videos (zip files):

Lec 1.1 video (zip, 138MB)
Lec 1.3 video (zip, 134MB)
Lec 1.4 video (zip, 133MB)
Lec 1.5 video (zip, 135MB)
Lec 1.6 video (zip, 135MB)
Lec 1.7 video (zip, 277MB)
Lec 3 video (zip, 315MB)
Lec 4 video, parts 1 and 2 (zip, 268MB and 286MB)
Lec 5.1 video (zip, 140MB)
Lec 5.3 video (zip, 144MB)
Lec 5.5 video (zip, 138MB)
Lec 6.1 video (zip, 145MB)
Lec 7.1 video (zip, 311MB)
Lec 8.1 video (zip, 134MB)
Lec 8.2 video (zip, 127MB)
Lec 8.3/8.4 video (zip, 140MB)
Lec 8.5 video (zip, 135MB)
Lec 8.6 video (zip, 150MB)
Lec 10.1 video (zip, 132MB)
Lec 12.3 video (zip, 152MB)
Lec 13.1 video (zip, 108MB)
Lec 14.1 video (zip, 145MB)

See Data Mining Course Slides.

============================================================================================

Pattern Recognition

This is the website for a first-year graduate course on pattern recognition (CSE555). The material presented here is complete enough that it can also serve as a tutorial on the topic.

Pattern recognition techniques are concerned with the theory and algorithms of putting abstract objects, e.g., measurements made on physical objects, into categories. Typically the categories are assumed to be known in advance, although there are techniques to learn the categories (clustering). Methods of pattern recognition are useful in many applications such as information retrieval, data mining, document image analysis and recognition, computational linguistics, forensics, biometrics and bioinformatics. You may find the websites of related courses that I teach on Data Mining and Machine Learning useful as supplementary material.

Most of the topics concern statistical classification methods. They include generative methods such as those based on Bayes decision theory and the related techniques of parameter estimation and density estimation. Next come discriminative methods such as nearest-neighbor classification and support vector machines. Artificial neural networks, classifier combination and clustering are other major components of pattern recognition.
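As a minimal illustration of the discriminative, distance-based methods mentioned above, here is a sketch of a 1-nearest-neighbor classifier in Python. The toy training points and labels are hypothetical and are used only to show the mechanics.

```python
# 1-nearest-neighbor classification (illustrative sketch only).

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def nearest_neighbor(train, query):
    """Assign the query the label of its closest training example."""
    features, label = min(train, key=lambda fx: euclidean(fx[0], query))
    return label

train = [((1.0, 1.1), "class A"), ((0.9, 1.3), "class A"),
         ((4.0, 3.8), "class B"), ((4.2, 4.1), "class B")]
print(nearest_neighbor(train, (1.2, 1.0)))   # expected: "class A"
```

A generative classifier would instead fit a class-conditional density p(x | class) for each class and apply the Bayes decision rule, which is the route taken in the Bayes decision theory and parameter estimation lectures.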

A course in probability is helpful as a prerequisite.

Applications of pattern recognition techniques are demonstrated by projects in fingerprint recognition, handwriting recognition and handwriting verification.

Reference Textbooks:
(i) Pattern Classification (2nd Edition) by R. O. Duda, P. E. Hart and D. Stork, Wiley 2002, 
(ii) Pattern Recognition and Machine Learning by C. Bishop, Springer 2006, and 
(iii) Statistics and the Evaluation of Evidence for Forensic Scientists by C. Aitken and F. Taroni, Wiley, 2004.


Lectures

Following are the lecture overheads used in class, as PDF files.
The lecture slides are frequently updated. This course was last taught in Spring 2007.

  1. Introduction
  2. Bayes Decision Theory
    1. Bayes Decision Rule (stated briefly after this outline)
    2. Minimum Error Rate Classification
    3. Normal Density and Discriminant Functions
    4. Error Integrals and Bounds
    5. Bayesian Networks, Compound Decision Theory
  3. Generative Methods
    1. Maximum-Likelihood and Bayesian Parameter Estimation
      1. Maximum-Likelihood Estimation
      2. Bayesian Parameter Estimation
      3. Sufficient Statistics
      4. Some Common Statistical Distributions
      5. Dimensionality and Computational Complexity
      6. Principal Components Analysis
      7. Fisher Linear Discriminant
      8. Expectation Maximization
      9. Sequential Data and Hidden Markov Models
    2. Nonparametric Techniques
      1. Density Estimation
  4. Discriminative Methods
    1. Distance-based Methods
      1. Nearest-Neighbor Classification
      2. Metrics and Tangent Distance
      3. Fuzzy Classification
    2. Linear Discriminant Functions
      1. Hyperplane Geometry
      2. Gradient Descent and Perceptrons
      3. Minimum Squared Error Procedures
      4. Support Vector Machines
    3. Artificial Neural Networks
      1. Biological Motivation and Back-Propagation
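For reference (topic 2.1 above), the minimum-error-rate decision rule that underlies these lectures can be written, in standard textbook notation, as:

```latex
% Minimum-error-rate (Bayes) decision rule: assign x to the class
% whose posterior probability is largest.
\begin{align*}
\hat{\omega}(x) &= \arg\max_{i} P(\omega_i \mid x) \\
                &= \arg\max_{i} \, p(x \mid \omega_i)\, P(\omega_i)
\end{align*}
```

The second line follows from Bayes' theorem, P(ω_i | x) = p(x | ω_i) P(ω_i) / p(x), because the evidence p(x) is common to all classes and does not affect the comparison.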



http://www.cedar.buffalo.edu/~srihari/

The sea of learning is boundless; turning back is the shore. O(∩_∩)O~
