Week1:
Private search vs. enterprise search: both differ from web search; with no hyperlinks, pages cannot be ranked the way web search ranks them.
How to classify: SQL ("sequel") is not suited to searching structured data.
How to search structured data: searching structured data well remains a research problem
Locality-Sensitive Hashing (LSH) reduces the time to compare d-dimensional objects from 2^d to d.
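A minimal sketch of one common LSH family, random-hyperplane hashing for cosine similarity. The notes do not say which LSH family was meant, so this choice is an assumption, and the names make_hyperplanes, signature and hamming are hypothetical. Each vector is reduced to a short bit signature; vectors pointing in similar directions tend to agree on most bits, so comparing signatures stands in for comparing the full vectors.

```python
import random

def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (as normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def signature(vec, hyperplanes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side, else 0."""
    bits = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vec))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")

planes = make_hyperplanes(num_bits=16, dim=3)
a = signature([1.0, 2.0, 3.0], planes)
b = signature([1.1, 2.0, 2.9], planes)    # nearly the same direction as a
c = signature([-1.0, -2.0, -3.0], planes) # opposite direction to a
print(hamming(a, b), hamming(a, c))
```

A small Hamming distance between signatures suggests a small angle between the original vectors; the number of bits trades accuracy against signature size.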
Sparse Distributed Memory (SDM): the normal distribution shows that the states are not all equally probable.
SDM differs from LSH: SDM is more like human memory.
Week2:
What is information: Claude Shannon (1948): "information is related to surprise". A message informing us of an event that has probability p conveys
bits of information = -log2(p)
So if p = 1, -log2(p) is zero, which means no news at all.
As p tends to zero, -log2(p) tends to +infinity, which means breaking news.
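The two limiting cases above can be checked numerically. This is a minimal sketch; the function name information_bits is hypothetical.

```python
import math

def information_bits(p):
    """Bits of information, -log2(p), conveyed by an event of probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    # Adding 0.0 turns the -0.0 produced at p = 1 into a plain 0.0.
    return -math.log2(p) + 0.0

for p in (1.0, 0.5, 0.125, 1e-9):
    print(p, information_bits(p))
```

A certain event (p = 1) carries 0 bits, a fair coin flip (p = 0.5) carries 1 bit, and the value keeps growing as p shrinks.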
Mutual information (Chinese: 交互信息)
TF-IDF: see the linked reference by 阮一峰 (Ruan Yifeng).
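For orientation, a minimal sketch of one common TF-IDF variant: term frequency is the term's share of the document's tokens, and inverse document frequency is log(N / df), where N is the number of documents and df the number of documents containing the term. Other variants add smoothing or use different logarithms; the exact formula in the referenced article is not assumed here, and the function name tf_idf is hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

docs = [["the", "cat", "sat"], ["the", "dog", "ran"]]
print(tf_idf(docs))
```

A term that occurs in every document (here "the") gets weight 0, while terms confined to one document get a positive weight.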
Bayes
Conditional probability: P(a,b) = P(a|b)*P(b) = P(b|a)*P(a), which gives Bayes' rule P(a|b) = P(b|a)*P(a)/P(b)
Naive Bayes classifier (朴素贝叶斯分类); for how it differs from the Decision Tree Model, see this reference:
sentiment analysis via machine learning
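As one way to tie the naive Bayes classifier to sentiment analysis, here is a minimal sketch of a word-count (multinomial) naive Bayes classifier with add-one smoothing. The smoothing choice, the function names train and classify, and the four toy reviews are all assumptions made for illustration, not taken from the course. It picks the label that maximises P(label) times the product of P(word | label), treating words as independent given the label.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (tokens, label). Returns the counts needed to classify."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab

def classify(tokens, model):
    """Return the label with the highest log posterior for the given tokens."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        # log P(label) + sum of log P(word | label), with add-one smoothing
        score = math.log(n_docs / total_docs)
        n_words = sum(word_counts[label].values())
        for tok in tokens:
            score += math.log((word_counts[label][tok] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy training data.
model = train([
    ("great fun film".split(), "pos"),
    ("loved this great movie".split(), "pos"),
    ("boring dull film".split(), "neg"),
    ("hated this dull movie".split(), "neg"),
])
print(classify("great fun".split(), model))
print(classify("dull boring".split(), model))
```

Working in log space avoids underflow when many small probabilities are multiplied, and the add-one smoothing keeps an unseen word from forcing a probability of zero.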
The definition of mutual information:
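For two discrete variables X and Y, the standard definition is I(X;Y) = sum over all x, y of p(x,y) * log2( p(x,y) / (p(x)*p(y)) ), measured in bits. A minimal sketch that estimates it from observed (x, y) pairs follows; the function name mutual_information is hypothetical, and estimating the probabilities by simple counting is an assumption.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Estimate I(X;Y) in bits from a list of observed (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    x_counts = Counter(x for x, _ in pairs)
    y_counts = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) / (p(x) * p(y)) simplifies to c * n / (count(x) * count(y))
        mi += (c / n) * math.log2(c * n / (x_counts[x] * y_counts[y]))
    return mi

independent = [(0, 0), (0, 1), (1, 0), (1, 1)]
dependent = [(0, 0), (1, 1)]
print(mutual_information(independent), mutual_information(dependent))
```

When X and Y are independent the ratio inside the logarithm is 1 and the mutual information is 0; when one variable fully determines the other, it equals the information in that variable.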
For a detailed mathematical introduction to naive Bayes classification, see this reference