1. Basic model of information retrieval
For a given information problem, the purpose of the system is to capture the wanted items and to filter out the unwanted ones.
Building the index
The indexing language is either pre-specified (controlled, i.e. determined by experts) or taken freely from the text of the information items and information requests (uncontrolled).
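The difference between the two kinds of indexing language can be sketched in a few lines of Python. This is only a minimal illustration, not a real indexing pipeline: the function names, the sample vocabulary, and the naive tokenizer are all assumptions made for the example.

```python
import re

def tokenize(text):
    # Naive tokenizer (an assumption): lowercase, split on non-letters.
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]

def index_terms_uncontrolled(text):
    # Uncontrolled: any token taken from the text may serve as an index term.
    return sorted(set(tokenize(text)))

def index_terms_controlled(text, vocabulary):
    # Controlled: only terms from a pre-specified, expert-defined vocabulary are kept.
    return sorted(set(tokenize(text)) & set(vocabulary))

# Hypothetical vocabulary, invented for illustration only.
VOCABULARY = {"retrieval", "indexing", "query"}

doc = "Indexing supports fast retrieval of documents"
free_terms = index_terms_uncontrolled(doc)
controlled_terms = index_terms_controlled(doc, VOCABULARY)
```

Here `free_terms` contains every distinct word of the document, while `controlled_terms` keeps only the words that also appear in the vocabulary.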
What are the components of an IR system?
- Document processing (indexing)
- Query input
- Document-query “matching”
- Output module
- Feedback module
- User interface
2. Basic IR file structures
Inverted index
Operations on inverted files
Queries are written as Boolean expressions: AND, OR, NOT, XOR
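As a rough sketch of how these Boolean operators map onto an inverted index, the Python below stores each term's postings as a set of document IDs and evaluates the operators with set operations. The toy collection, the whitespace tokenizer, and the names `build_inverted_index` and `postings` are assumptions made for illustration.

```python
from collections import defaultdict

def build_inverted_index(docs):
    # Map each term to the set of document IDs that contain it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def postings(index, term):
    # Posting set for a term; empty if the term never occurs.
    return index.get(term.lower(), set())

# Toy collection, invented for illustration only.
DOCS = {
    1: "brutus killed caesar",
    2: "caesar praised brutus and calpurnia",
    3: "calpurnia warned caesar",
}

index = build_inverted_index(DOCS)
all_docs = set(DOCS)

brutus = postings(index, "brutus")        # documents 1 and 2
calpurnia = postings(index, "calpurnia")  # documents 2 and 3

and_result = brutus & calpurnia    # AND: intersection
or_result = brutus | calpurnia     # OR: union
not_result = all_docs - calpurnia  # NOT: complement within the collection
xor_result = brutus ^ calpurnia    # XOR: symmetric difference
```

Sets keep the sketch short. Larger systems commonly store postings as sorted lists and merge them instead, which this example does not attempt.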
Build an inverted index over Shakespeare's plays, with rows as terms and columns as plays: