Data Mining
The subject of Knowledge Discovery and Data Mining (KDD) concerns the extraction of useful information from data. Since this is also the essence of many sub-areas of computer science, as well as of the field of statistics, KDD can be said to lie at the intersection of statistics, machine learning, databases, pattern recognition, information retrieval and artificial intelligence. The subject matter of data mining is vast, making the task of learning about the subject itself a task of data mining! The one-semester course that I teach emphasizes the theory and algorithms of data mining. Such algorithms are concerned with deriving global models and local patterns, visualization, and retrieval by content. Related courses that I teach are pattern recognition and machine learning, which cover many other topics related to data mining.
Textbook: Principles of Data Mining by D. Hand, H. Mannila and P. Smyth, MIT Press, 2001.
Lectures
The following are the presentation slides used in a classroom setting. They are given here as links to PDF files. Since these slides are continually updated, you may wish to revisit them. The course is being taught in Spring 2010.
============================================================================================
Machine Learning
============================================================================================
This is the website for a first-year graduate course on pattern recognition (CSE555). The material presented here is complete enough that it can also serve as a tutorial on the topic. Pattern recognition techniques are concerned with the theory and algorithms of putting abstract objects, e.g., measurements made on physical objects, into categories. Typically the categories are assumed to be known in advance, although there are techniques to learn the categories (clustering). Methods of pattern recognition are useful in many applications such as information retrieval, data mining, document image analysis and recognition, computational linguistics, forensics, biometrics and bioinformatics. You may find the websites of related courses that I teach on Data Mining and Machine Learning useful as supplementary material.
Many of the topics concern statistical classification methods. They include generative methods such as those based on Bayes decision theory and related techniques of parameter estimation and density estimation. Next come discriminative methods such as nearest-neighbor classification and support vector machines. Artificial neural networks, classifier combination and clustering are other major components of pattern recognition. A course in probability is helpful as a prerequisite. Applications of pattern recognition techniques are demonstrated by projects in fingerprint recognition, handwriting recognition and handwriting verification.
Reference Textbooks:
Lectures
The following are the lecture overheads used in class, given as PDF files.