This post also belongs to the introductory part of this course.
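Are all discovered patterns interesting?
A data mining system may generate thousands of patterns, but not all of them are interesting. The suggested criteria are that a pattern is interesting if it is: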
1 Easily understood by humans
2 Valid on new or test data with some degree of certainty
3 Potentially useful, novel, or validates some hypothesis that the user is trying to confirm
PS: objective vs. subjective interestingness measures:
Objective: based on the statistics and structures of patterns, e.g., support, confidence, etc.
Subjective: based on the user's beliefs about the data, e.g., unexpectedness, novelty, actionability, etc.
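
To make the objective measures more concrete, here is a minimal sketch of how support and confidence could be computed for a rule such as {bread} -> {milk}. The function names `support` and `confidence` and the toy transactions are assumptions made up for illustration; they are not from the course material.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Estimate of P(consequent | antecedent) as support(A and C) / support(A)."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    support_a = support(transactions, a)
    if support_a == 0:
        return 0.0
    return support(transactions, both) / support_a


# Toy data (hypothetical): four shopping baskets.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

print(support(transactions, {"bread", "milk"}))       # 2 of 4 baskets
print(confidence(transactions, {"bread"}, {"milk"}))  # 2 of the 3 bread baskets
```

Both numbers come purely from counting over the data, which is what makes them objective. Subjective measures such as unexpectedness depend on what a particular user already believes, so they cannot be computed from the transactions alone.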
In other words,