1. Problem the paper aims to solve:
The above algorithms still suffer from high memory usage; we thus design an efficient sequence-utility (SU)-Chain structure to keep more information for the later mining process.
2. The authors' goal:
We present an efficient sequence-utility (SU)-Chain structure, which can be used to store more relevant information to improve mining performance.
3. Problem definition:
High-utility sequential pattern mining (HUSPM):
High-utility itemset mining (HUIM) [6], [12], [13], [18], [28], [31] considers the utility factor of itemsets to reveal highly profitable itemsets in databases.
High-utility sequential pattern (HUSP):
A sequence is a HUSP if its sequence utility is no less than the predefined minimum utility value.
Problem Statement:
Given a quantitative sequence database D and a user-defined minimum utility threshold δ, the task of high-utility
sequential pattern mining (HUSPM) is to find the complete set of
high-utility sequential patterns (HUSPs) in the quantitative database, that is, the patterns whose utility value
is no less than δ × u(D).
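To make the threshold δ × u(D) concrete, here is a minimal brute-force sketch. It is not the paper's algorithm. It assumes the definition commonly used in the HUSPM literature: the utility of a pattern in one quantitative sequence is the maximum over all of its occurrences, and its utility in D is the sum over all sequences. The profit table `PROFIT`, the toy database `DB`, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical unit profits (external utilities) per item.
PROFIT = {"a": 2, "b": 5, "c": 1}

# Hypothetical quantitative sequence database D: each sequence is a list of
# itemsets, and each itemset maps an item to its quantity.
DB = [
    [{"a": 1, "b": 2}, {"c": 3}],
    [{"a": 2}, {"b": 1, "c": 1}],
]


def database_utility(db):
    """u(D): total utility of every item occurrence in the database."""
    return sum(PROFIT[item] * qty
               for qseq in db
               for itemset in qseq
               for item, qty in itemset.items())


def pattern_utility_in_sequence(pattern, qseq):
    """Maximum utility over all occurrences of `pattern` in `qseq` (0 if none)."""
    best = 0
    # Try every strictly increasing choice of itemset positions.
    for positions in combinations(range(len(qseq)), len(pattern)):
        pairs = list(zip(pattern, positions))
        if all(item in qseq[k] for items, k in pairs for item in items):
            utility = sum(PROFIT[item] * qseq[k][item]
                          for items, k in pairs for item in items)
            best = max(best, utility)
    return best


def pattern_utility(pattern, db):
    """Utility of `pattern` in the whole database."""
    return sum(pattern_utility_in_sequence(pattern, qseq) for qseq in db)


def is_husp(pattern, db, delta):
    """True if the pattern's utility is no less than delta * u(D)."""
    return pattern_utility(pattern, db) >= delta * database_utility(db)
```

For example, `is_husp([("a",), ("c",)], DB, 0.3)` compares the utility of the pattern <(a), (c)> against 0.3 × u(D). Enumerating every occurrence like this is exponential in the worst case, which is exactly what structures such as the SU-Chain and the upper-bound pruning strategies are meant to avoid.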
4. Method:
Based on the SU-Chain structure, the existing pruning strategies can also be applied to prune unpromising candidates early and obtain the HUSPs that satisfy the threshold.
5. Key techniques
(1) I-Concatenation and S-Concatenation are used to generate all the promising candidates.
(2) The SU-Chain is a set of projection sequences and utility-lists.
(3) Four pruning strategies.
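The two concatenation operations in (1) can be sketched as follows. This assumes the definitions usual in the sequential pattern mining literature: I-Concatenation adds an item to the last itemset of a pattern, and S-Concatenation appends the item as a new itemset. The tuple-of-tuples representation, the lexicographic ordering check, and the function names are assumptions made for illustration, not the paper's implementation.

```python
def i_concatenation(pattern, item):
    """Add `item` to the last itemset of `pattern`.

    Assumes items inside an itemset are kept in lexicographic order, so the
    new item must be greater than the current last item.
    """
    *head, last = pattern
    if item <= last[-1]:
        raise ValueError("item must follow the last item of the last itemset")
    return tuple(head) + (last + (item,),)


def s_concatenation(pattern, item):
    """Append `item` to `pattern` as a new single-item itemset."""
    return tuple(pattern) + ((item,),)


def extensions(pattern, items):
    """Yield every candidate obtained from `pattern` by one concatenation."""
    last_item = pattern[-1][-1]
    for item in sorted(items):
        if item > last_item:
            yield i_concatenation(pattern, item)
        yield s_concatenation(pattern, item)
```

Starting from the pattern `(("a",),)` with items `{"a", "b"}`, `extensions` yields <(a), (a)>, <(a b)>, and <(a), (b)>. A miner would then keep only the candidates that survive the pruning strategies.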
6. Related work
**USpan [32]:** uses a lexicographic quantitative sequence tree; cannot estimate upper-bound values;
**HUS-Span [29]:** uses two tighter upper bounds, prefix extension utility (PEU) and reduced sequence utility (RSU), to estimate upper-bound values;
**ProUM [7], HUSP-ULL [8]:** rely on the latest mapping mechanisms and pruning strategies;
(Performance comparison experiments: USpan [32], HUS-Span [29], and HUSP-ULL [8]. Compared with HUSP-ULL, the memory leakage problem is reduced.)