Machine Learning and Data Mining (CS 5751): Final Review Notes (2)

I put these notes together for my own review, so everything here is in outline form.

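Association Rule Mining

Given a set of transactions, find rules that predict the occurrence of an item based on the occurrences of other items in the transaction.
Support and confidence of a rule X → Y:
support: the fraction of transactions that contain both X and Y
confidence: among the transactions that contain X, the fraction that also contain Y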
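To make the two measures concrete, here is a minimal sketch in Python. The five-transaction basket data and the function names (support_count, support, confidence) are my own illustrative assumptions, not something taken from the course material.

```python
def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain `itemset`."""
    return support_count(itemset, transactions) / len(transactions)


def confidence(lhs, rhs, transactions):
    """Confidence of the rule lhs -> rhs: count(lhs and rhs) / count(lhs)."""
    lhs = frozenset(lhs)
    both = lhs | frozenset(rhs)
    lhs_count = support_count(lhs, transactions)
    if lhs_count == 0:
        return 0.0  # rule never applies; treated as 0 here (an assumption)
    return support_count(both, transactions) / lhs_count


# Hypothetical toy data: each transaction is a set of items.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "diaper", "beer", "eggs"}),
    frozenset({"milk", "diaper", "beer", "coke"}),
    frozenset({"bread", "milk", "diaper", "beer"}),
    frozenset({"bread", "milk", "diaper", "coke"}),
]

# Rule {milk, diaper} -> {beer}
rule_support = support({"milk", "diaper", "beer"}, transactions)   # 2 of 5
rule_confidence = confidence({"milk", "diaper"}, {"beer"}, transactions)  # 2 of 3
print(rule_support, rule_confidence)
```

On this toy data the rule {milk, diaper} → {beer} is supported by 2 of the 5 transactions, and 2 of the 3 transactions containing milk and diaper also contain beer.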
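Association rule mining aims to find all rules that satisfy both conditions:
(1) support ≥ minsup threshold
(2) confidence ≥ minconf threshold
Steps:
(1) Frequent itemset generation
[find all itemsets whose support ≥ minsup]
(2) Rule generation
[from each frequent itemset, generate the high-confidence rules; each rule is a binary partitioning of the itemset]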
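The two steps above can be sketched as follows. This is a deliberately naive version that enumerates every possible itemset, assuming a small hypothetical dataset; the names frequent_itemsets and generate_rules are mine, and real implementations avoid the exhaustive enumeration.

```python
from itertools import combinations


def frequent_itemsets(transactions, minsup):
    """Step 1 (brute force): keep every itemset with support >= minsup."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    frequent = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            sup = sum(1 for t in transactions if candidate <= t) / n
            if sup >= minsup:
                frequent[candidate] = sup
    return frequent


def generate_rules(frequent, minconf):
    """Step 2: split each frequent itemset into lhs -> rhs (binary partition)."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs_items in combinations(sorted(itemset), r):
                lhs = frozenset(lhs_items)
                rhs = itemset - lhs
                # lhs is a subset of a frequent itemset, so it is in `frequent`.
                conf = sup / frequent[lhs]
                if conf >= minconf:
                    rules.append((lhs, rhs, sup, conf))
    return rules


# Hypothetical toy data.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "diaper", "beer", "eggs"}),
    frozenset({"milk", "diaper", "beer", "coke"}),
    frozenset({"bread", "milk", "diaper", "beer"}),
    frozenset({"bread", "milk", "diaper", "coke"}),
]

frequent = frequent_itemsets(transactions, minsup=0.6)
rules = generate_rules(frequent, minconf=0.8)
for lhs, rhs, sup, conf in rules:
    print(sorted(lhs), "->", sorted(rhs), sup, conf)
```

With minsup = 0.6 and minconf = 0.8 on this data, the only surviving rule is {beer} → {diaper}. Splitting the work into two steps pays off because confidence can be computed from supports that step 1 has already counted.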

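Problems in Association Mining

Complexity of the brute-force approach: O(NMw)
N: number of transactions [can be reduced with DHP and vertical-based mining algorithms]
M: number of candidates, 2^d for d items [can be reduced by pruning]
w: maximum transaction width
Reducing the number of comparisons (NM): use efficient data structures to store the candidates or transactions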
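A rough way to see where the N × M factor comes from is to count the work a brute-force scan does. The sketch below assumes a hypothetical toy dataset and a helper name (brute_force_cost) of my own. It counts only non-empty candidates, so M comes out as 2^d - 1 rather than 2^d.

```python
from itertools import combinations


def brute_force_cost(transactions):
    """Return (d, M, comparisons) for an exhaustive support count."""
    items = sorted(set().union(*transactions))
    d = len(items)
    # Every non-empty subset of the d items is a candidate itemset.
    candidates = [
        frozenset(combo)
        for k in range(1, d + 1)
        for combo in combinations(items, k)
    ]
    comparisons = 0
    for candidate in candidates:
        for t in transactions:
            comparisons += 1      # one candidate-vs-transaction check
            _ = candidate <= t    # the subset test; its cost grows with width w
    return d, len(candidates), comparisons


# Hypothetical toy data: N = 5 transactions over d = 6 distinct items.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "diaper", "beer", "eggs"}),
    frozenset({"milk", "diaper", "beer", "coke"}),
    frozenset({"bread", "milk", "diaper", "beer"}),
    frozenset({"bread", "milk", "diaper", "coke"}),
]

d, m, comparisons = brute_force_cost(transactions)
print(d, m, comparisons)
```

Even with only 6 items there are 63 candidates and 315 comparisons; each extra item doubles M, which is why reducing M, N, or the comparison count matters.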

Solution: the Apriori principle

Apriori principle: if an itemset is frequent, then all of its subsets must also be frequent.
It is also…
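The principle is usually applied in its contrapositive form: if any subset of a candidate is infrequent, the candidate cannot be frequent and never needs to be counted. Below is a minimal level-wise sketch of that idea, assuming hypothetical toy data. The function name apriori and the extra "counted" return value are my own choices for illustration, not a reference implementation.

```python
from itertools import combinations


def apriori(transactions, minsup):
    """Level-wise search; returns (frequent itemsets, candidates counted)."""
    n = len(transactions)

    def sup(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    frequent = {}
    counted = 0
    k = 1
    candidates = {frozenset([item]) for item in set().union(*transactions)}
    while candidates:
        # Count support only for the candidates that survived pruning.
        level = {}
        for c in candidates:
            counted += 1
            s = sup(c)
            if s >= minsup:
                level[c] = s
        frequent.update(level)

        # Build (k+1)-candidates by joining frequent k-itemsets.
        k += 1
        prev = set(level)
        joined = {a | b for a in prev for b in prev if len(a | b) == k}
        # Apriori pruning: every (k-1)-subset must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
    return frequent, counted


# Hypothetical toy data.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "diaper", "beer", "eggs"}),
    frozenset({"milk", "diaper", "beer", "coke"}),
    frozenset({"bread", "milk", "diaper", "beer"}),
    frozenset({"bread", "milk", "diaper", "coke"}),
]

frequent, counted = apriori(transactions, minsup=0.6)
print(len(frequent), counted)
```

On this data the search counts support for 13 candidates instead of all 63 non-empty itemsets. For example, {bread, diaper, beer} is pruned without a scan because {bread, beer} is already known to be infrequent.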
