A Classic Text Mining Book Recommendation: The Text Mining Handbook (Advanced Approaches in Analyzing Unstructured Data)

(2011-02-14 11:03:26)
Tags: it

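This is a textbook published in 2007 by Cambridge University Press in the UK. The lead author, Ronen Feldman, is Israeli; his co-author, James Sanger, lives in Massachusetts, USA. The table of contents is listed below. For experienced text mining practitioners, the last three chapters are worth a look: they cover visualization approaches, link analysis, and several popular applications (patent literature, biomedicine, and enterprise internal knowledge management and external business intelligence).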

I. Introduction to Text Mining
 I.1 Defining Text Mining
 I.2 General Architecture of Text Mining Systems
II. Core Text Mining Operations
 II.1 Core Text Mining Operations
 II.2 Using Background Knowledge for Text Mining
 II.3 Text Mining Query Languages
III. Text Mining Preprocessing Techniques
 III.1 Task-Oriented Approaches
 III.2 Further Reading
IV. Categorization
 IV.1 Applications of Text Categorization
 IV.2 Definition of the Problem
 IV.3 Document Representation
 IV.4 Knowledge Engineering Approach to TC
 IV.5 Machine Learning Approach to TC
 IV.6 Using Unlabeled Data to Improve Classification
 IV.7 Evaluation of Text Classifiers
 IV.8 Citations and Notes
V. Clustering
 V.1 Clustering Tasks in Text Analysis
 V.2 The General Clustering Problem
 V.3 Clustering Algorithms
 V.4 Clustering of Textual Data
 V.5 Citations and Notes
VI. Information Extraction
 VI.1 Introduction to Information Extraction
 VI.2 Historical Evolution of IE: The Message Understanding Conferences and Tipster
 VI.3 IE Examples
 VI.4 Architecture of IE Systems
 VI.5 Anaphora Resolution
 VI.6 Inductive Algorithms for IE
 VI.7 Structural IE
 VI.8 Further Reading
VII. Probabilistic Models for Information Extraction
 VII.1 Hidden Markov Models
 VII.2 Stochastic Context-Free Grammars
 VII.3 Maximal Entropy Modeling
 VII.4 Maximal Entropy Markov Models
 VII.5 Conditional Random Fields
 VII.6 Further Reading
VIII. Preprocessing Applications Using Probabilistic and Hybrid Approaches
 VIII.1 Applications of HMM to Textual Analysis
 VIII.2 Using MEMM for Information Extraction
 VIII.3 Applications of CRFs to Textual Analysis
 VIII.4 TEG: Using SCFG Rules for Hybrid Statistical–Knowledge-Based IE
 VIII.5 Bootstrapping
 VIII.6 Further Reading
IX. Presentation-Layer Considerations for Browsing and Query Refinement
 IX.1 Browsing
 IX.2 Accessing Constraints and Simple Specification Filters at the Presentation Layer
 IX.3 Accessing the Underlying Query Language
 IX.4 Citations and Notes
X. Visualization Approaches
 X.1 Introduction
 X.2 Architectural Considerations
 X.3 Common Visualization Approaches for Text Mining
 X.4 Visualization Techniques in Link Analysis
 X.5 Real-World Example: The Document Explorer System
XI. Link Analysis
 XI.1 Preliminaries
 XI.2 Automatic Layout of Networks
 XI.3 Paths and Cycles in Graphs
 XI.4 Centrality
 XI.5 Partitioning of Networks
 XI.6 Pattern Matching in Networks
 XI.7 Software Packages for Link Analysis
 XI.8 Citations and Notes
XII. Text Mining Applications
 XII.1 General Considerations
 XII.2 Corporate Finance: Mining Industry Literature for Business Intelligence
 XII.3 A “Horizontal” Text Mining Application: Patent Analysis Solution Leveraging a Commercial Text Analytics Platform
 XII.4 Life Sciences Research: Mining
