1. Introduction (about machine learning)
2. Concept Learning and the General-to-Specific Ordering
3. Decision Tree Learning
4. Artificial Neural Networks
5. Evaluating Hypotheses
6. Bayesian Learning
7. Computational Learning Theory
8. Instance-Based Learning
9. Genetic Algorithms
10. Learning Sets of Rules
11. Analytical Learning
12. Combining Inductive and Analytical Learning
13. Reinforcement Learning
10. Learning Sets of Rules
One of the most expressive and human-readable representations for learned hypotheses is sets of if-then rules. This chapter explores several algorithms for learning such sets of rules.
10.1 INTRODUCTION
As shown in Chapter 3, one way to learn sets of rules is to first learn a decision tree, then translate the tree into an equivalent set of rules, with one rule for each leaf node in the tree. A second method, illustrated in Chapter 9, is to use a genetic algorithm that encodes each rule set as a bit string and uses genetic search operators to explore this hypothesis space. In this chapter we explore a variety of algorithms that directly learn rule sets and that differ from these algorithms in two key respects. First, they are designed to learn sets of first-order rules that contain variables. This is significant because first-order rules are much more expressive than propositional rules. Second, the algorithms discussed here use sequential covering algorithms that learn one rule at a time to incrementally grow the final set of rules.
In this chapter we begin by considering algorithms that learn sets of propositional rules; that is, rules without variables. Algorithms for searching the hypothesis space to learn disjunctive sets of rules are most easily understood in this setting. We then consider extensions of these algorithms to learn first-order rules. Two general approaches to inductive logic programming are then considered, and the fundamental relationship between inductive and deductive inference is explored.
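As a small illustration of the first method mentioned above (translating a decision tree into one rule per leaf), the following Python sketch may help. The tree encoding, the function name tree_to_rules, and the toy PlayTennis-style tree are all assumptions made for this example and are not taken from the text.

```python
def tree_to_rules(tree, path=()):
    """Return one (preconditions, conclusion) pair per leaf of the tree.

    Assumed encoding (hypothetical): a leaf is a label string; an internal
    node is a pair (attribute, {attribute_value: subtree}).
    """
    if isinstance(tree, str):
        # Leaf: the tests collected along the path become the preconditions.
        return [(list(path), tree)]
    attribute, branches = tree
    rules = []
    for value, subtree in branches.items():
        rules.extend(tree_to_rules(subtree, path + ((attribute, value),)))
    return rules


# Hypothetical toy tree, used only to exercise the function.
tree = ("Outlook", {
    "Sunny": ("Humidity", {"High": "No", "Normal": "Yes"}),
    "Overcast": "Yes",
    "Rain": ("Wind", {"Strong": "No", "Weak": "Yes"}),
})

rules = tree_to_rules(tree)
for preconditions, label in rules:
    tests = " AND ".join(f"{a} = {v}" for a, v in preconditions)
    print(f"IF {tests} THEN PlayTennis = {label}")
```

Each rule here is propositional: its preconditions test fixed attribute values and contain no variables.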
10.2 SEQUENTIAL COVERING ALGORITHMS
Here we consider a family of algorithms for learning rule sets based on the strategy of learning one rule, removing the data it covers, then iterating this process. Such algorithms are called sequential covering algorithms. A prototypical sequential covering algorithm is described in Table 10.1.
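The strategy just described (learn one rule, remove the data it covers, repeat) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reproduction of Table 10.1: the rule representation (a conjunction of attribute = value tests), the greedy learn_one_rule procedure, the accuracy measure, the threshold value, and the toy data are all hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple

Example = Tuple[Dict[str, str], bool]  # (attribute values, target label)
Rule = Dict[str, str]                  # conjunction of attribute == value tests


def covers(rule: Rule, attrs: Dict[str, str]) -> bool:
    """A rule covers an example when every one of its tests is satisfied."""
    return all(attrs.get(a) == v for a, v in rule.items())


def accuracy(rule: Rule, examples: List[Example]) -> float:
    """Fraction of covered examples that are positive (0.0 if none covered)."""
    covered = [label for attrs, label in examples if covers(rule, attrs)]
    return sum(covered) / len(covered) if covered else 0.0


def learn_one_rule(examples: List[Example]) -> Rule:
    """Assumed helper: greedily add the single test that most improves accuracy."""
    rule: Rule = {}
    best = accuracy(rule, examples)
    while True:
        candidates = [{**rule, a: v}
                      for attrs, _ in examples
                      for a, v in attrs.items() if a not in rule]
        if not candidates:
            return rule
        top = max(candidates, key=lambda r: accuracy(r, examples))
        top_acc = accuracy(top, examples)
        if top_acc <= best:  # no strict improvement: stop specializing
            return rule
        rule, best = top, top_acc


def sequential_covering(examples: List[Example], threshold: float = 0.9) -> List[Rule]:
    """Learn one rule, remove the examples it covers, and iterate."""
    rules: List[Rule] = []
    remaining = list(examples)
    while any(label for _, label in remaining):
        rule = learn_one_rule(remaining)
        if accuracy(rule, remaining) < threshold:
            break  # the best available rule is not good enough
        rules.append(rule)
        remaining = [(a, l) for a, l in remaining if not covers(rule, a)]
    return rules


# Hypothetical toy data, used only to exercise the sketch.
examples: List[Example] = [
    ({"outlook": "sunny", "wind": "weak"}, True),
    ({"outlook": "sunny", "wind": "strong"}, True),
    ({"outlook": "rain", "wind": "weak"}, True),
    ({"outlook": "rain", "wind": "strong"}, False),
    ({"outlook": "overcast", "wind": "strong"}, False),
]
rules = sequential_covering(examples)
```

The learned rule set is read as a disjunction: an instance is classified positive if any rule covers it. Note that each rule is chosen greedily and never revisited once it has been added.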
This sequential covering algorithm is one of the most widespread approaches to learning disjunctive sets of rules. It reduces the problem of learning a disjunctive set of rules to a sequence of simpler problems, each requiring that a single conjunctive rule be learned. Because it performs a greedy search, formulating a sequence of rules without backtracking, it is not guaranteed to find the smallest or best set of rules that c