UIUC Coursera Course Text Retrieval and Search Engines: Week 1 Practice Quiz

This is the Week 1 practice quiz from the UIUC Coursera course Text Retrieval and Search Engines. Although the deadline has passed, students can still take it as a learning exercise.

Week 1 Practice Quiz

Warning: The hard deadline has passed. You can attempt it, but you will not get credit for it. You are welcome to try it as a learning exercise.

Question 1

Consider the instantiation of the vector space model where documents and queries are represented as term frequency vectors. Assume we have the following query and two documents:

Q = “future of online education” 
D1 = “Coursera is shaping the future of online education; online education is affordable.”
D2 = “In the future, online education will dominate.” 

Let V(X) = [c1 c2 c3 c4] represent a part of the term frequency vector for document or query X, where c1, c2, c3, and c4 are the term weights corresponding to “future”, “of”, “online”, and “education”, respectively. Which of the following is true:
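
As a way to check the counts by hand, here is a minimal Python sketch that builds V(X) for the query and both documents. The helper name `tf_vector` and the tokenization rule (lowercase, letters only, punctuation dropped) are assumptions made for illustration; the quiz itself does not specify how text is tokenized.

```python
import re
from collections import Counter

# The four terms the quiz asks about, in the order c1..c4.
TERMS = ["future", "of", "online", "education"]

def tf_vector(text, terms=TERMS):
    """Return raw term counts for `terms`, in order.

    Assumption: tokens are maximal runs of ASCII letters, lowercased.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    return [counts[t] for t in terms]

Q = "future of online education"
D1 = ("Coursera is shaping the future of online education; "
      "online education is affordable.")
D2 = "In the future, online education will dominate."

for name, text in [("Q", Q), ("D1", D1), ("D2", D2)]:
    print(name, tf_vector(text))
```

Under this tokenization, "online" and "education" each occur twice in D1, and "of" does not occur in D2.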

Question 2

Consider the same scenario as in question (1) with the dot product as the similarity measure. Which of the following is true:
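
The dot-product similarity can be sketched in a few lines of Python. The vectors below are hand-counted from the texts in question (1), assuming punctuation is ignored and matching is case-insensitive. Only the four query terms are included: every other vocabulary term has weight 0 in the query vector, so it contributes nothing to the dot product.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Term order: "future", "of", "online", "education".
# Hand-counted raw term frequencies (an assumption about tokenization).
v_q = [1, 1, 1, 1]
v_d1 = [1, 1, 2, 2]
v_d2 = [1, 0, 1, 1]

sim_d1 = dot(v_q, v_d1)
sim_d2 = dot(v_q, v_d2)
print("sim(Q, D1) =", sim_d1)
print("sim(Q, D2) =", sim_d2)
```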

Question 3

Assume we have two documents with the same raw TF for all the query words (i.e. the query words appear with the same frequency in both documents). Then, using the Okapi/BM25 retrieval function, the longer document will have a lower score.
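
To see how document length enters the score, the following sketch computes only the TF component of a common BM25 formulation with pivoted length normalization. The function name `bm25_tf`, the parameter values k = 1.2 and b = 0.75, and the example lengths are assumptions chosen for illustration. The IDF factor is left out because it is identical for both documents when the query terms are the same.

```python
def bm25_tf(tf, doc_len, avg_doc_len, k=1.2, b=0.75):
    """TF component of BM25 for a single query term.

    norm is the pivoted length normalizer: it is greater than 1 for
    documents longer than average and less than 1 for shorter ones
    (when b > 0).
    """
    norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return ((k + 1.0) * tf) / (tf + k * norm)

# Same raw TF in both documents; only the length differs.
short_score = bm25_tf(tf=2, doc_len=50, avg_doc_len=100)
long_score = bm25_tf(tf=2, doc_len=200, avg_doc_len=100)

print("short document:", short_score)
print("long document: ", long_score)
```

With b > 0, a larger document length increases the denominator, so for equal raw TF the longer document's contribution is smaller.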

Question 4

If we remove the document length normalization term from the Okapi/BM25 retrieval function, and have two documents with the same raw TF for all the query words, then the longer document will have a higher score.
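
One way to reason about this statement is to write out the TF component and drop the length term. The notation below is an assumption, following a common presentation of BM25: c(w,d) is the count of word w in document d, |d| is the document length, avdl is the average document length, and k and b are the usual parameters. Removing length normalization is interpreted here as setting b = 0.

```latex
\[
\mathrm{TF}(w,d) \;=\;
\frac{(k+1)\,c(w,d)}
     {c(w,d) + k\left(1 - b + b\,\dfrac{|d|}{\mathrm{avdl}}\right)}
\quad\xrightarrow{\;b\,=\,0\;}\quad
\frac{(k+1)\,c(w,d)}{c(w,d) + k}
\]
```

The right-hand side no longer contains |d|, so under this interpretation two documents with the same c(w,d) for every query word receive the same score regardless of their lengths.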