Development and Current State of the Science of Pattern Recognition (6. Concluding Remarks)

6 Discussion and Conclusions

Recognition of patterns and inference skills lie at the core of human learning. It is a human activity that we try to imitate by mechanical means. There are no physical laws that assign observations to classes. It is the human consciousness that groups observations together. Although their connections and interrelations are often hidden, some understanding may be gained in the attempt to imitate this process. The human process of learning patterns from examples may proceed along the lines of trial and error. By freeing our minds of fixed beliefs and petty details we may not only understand single observations but also induce principles and formulate concepts that lie behind the observed facts. New ideas can then be born. These processes of abstraction and concept formation are necessary for development and survival. In practice, (semi-)automatic learning systems are built by imitating such abilities in order to gain understanding of the problem, explain the underlying phenomena and develop good predictive models.


It has, however, to be strongly doubted whether statistics play an important role in the human learning process. The estimation of probabilities, especially in multivariate situations, is not very intuitive for the majority of people. Moreover, the number of examples needed to build a reliable classifier by statistical means is much larger than what is available to humans. In human recognition, proximity based on relations between objects seems to come before features are searched for, and may thereby be more fundamental. For this reason and the above observation, we think that the study of proximities, distances and domain-based classifiers is of great interest. This is further encouraged by the fact that such representations offer a bridge between the possibilities of learning in vector spaces and structural descriptions of objects that preserve the relations inherent in their structure. We think that the use of proximities for representation, generalization and evaluation constitutes the most intriguing set of issues in pattern recognition.

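To make the idea of proximity-based learning concrete, here is a minimal sketch of a dissimilarity representation in the spirit discussed above: each object is described not by measured features but by its distances to a small set of prototype objects, after which ordinary vector-space classifiers apply. The toy string data, the edit_distance helper and the choice of prototypes are illustrative assumptions of ours, not taken from the paper.

```python
import numpy as np

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming: a structural
    dissimilarity between sequences, requiring no feature vectors."""
    m, n = len(a), len(b)
    d = np.zeros((m + 1, n + 1), dtype=int)
    d[:, 0] = np.arange(m + 1)
    d[0, :] = np.arange(n + 1)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1,         # deletion
                          d[i, j - 1] + 1,         # insertion
                          d[i - 1, j - 1] + cost)  # substitution
    return int(d[m, n])

def dissimilarity_representation(objects, prototypes):
    """Embed each object as its vector of distances to the prototypes,
    turning structural objects into points in an ordinary vector space."""
    return np.array([[edit_distance(x, p) for p in prototypes]
                     for x in objects], dtype=float)

# Toy two-class problem on raw strings (no features measured).
train = ["aaab", "aab", "abab", "cdcd", "ccdd", "cddc"]
labels = np.array([0, 0, 0, 1, 1, 1])
prototypes = ["aab", "cdcd"]  # the representation set

X_train = dissimilarity_representation(train, prototypes)

def predict(x):
    """1-NN classifier operating in the dissimilarity space."""
    v = dissimilarity_representation([x], prototypes)[0]
    nearest = np.argmin(np.linalg.norm(X_train - v, axis=1))
    return labels[nearest]

print(predict("aabb"))  # 0: close to the 'a/b' prototype
print(predict("dcdc"))  # 1: close to the 'c/d' prototype
```

Because the construction needs only pairwise comparisons, the same scheme applies to graphs, shapes or sequences whenever a meaningful dissimilarity measure exists; this is exactly the bridge between structural object descriptions and vector-space learning that the paragraph above points to.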

The existing gap between structural and statistical pattern recognition partially coincides with the gap between knowledge and observations. Prior knowledge and observations are both needed, in a subtle interplay, to gain new knowledge. The existing knowledge is needed to guide the deduction process and to generate the models and possible hypotheses needed by induction, transduction and abduction. But, above all, it is needed to select relevant examples and a proper representation. If and only if the prior knowledge is made sufficiently explicit to set this environment can new observations be processed to gain new knowledge. If this is not properly done, some results may be obtained in purely statistical terms, but these cannot be integrated with what was already known and thereby have to stay in the domain of observations. The study of automatic pattern recognition systems makes it perfectly clear that learning is possible only if the Platonic and Aristotelian scientific approaches cooperate closely. This is what we aim for.

