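Solutions to the Chapter 6 Exercises of "Mining of Massive Datasets" (2nd Edition)

Reference: the Chinese edition of the book, 《大数据:互联网大规模数据挖掘与分布式处理》 (2nd edition).

Original English edition: Mining of Massive Datasets

Note: these answers are my own work, not official solutions, and are for reference only.
If you find a mistake, please message me and I will fix it promptly.

Note: the book contains many exercises. Harder exercises, or harder parts of an exercise, are marked with !, and the hardest are marked with !!.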

Exercise 6.1.1

Program:

# Baskets T: basket b (1 to 100) holds every item i that divides b
T = []
for value in range(1, 101):
    t = []
    for i in range(1, value + 1):
        if value % i == 0:
            t.append(i)
    T.append(t)
print("\nBaskets T: " + str(T))

# Candidate 1-itemsets C1: key is the item, value is its support
C1 = {}
for i in range(1, 101):
    count = 0
    for t in T:
        if i in t:
            count += 1
    C1[str(i)] = count
print("\nCandidate 1-itemsets C1: " + str(C1))

# Support threshold s
s = 5
# Frequent 1-itemsets L1: key is the item, value is its support
L1 = {}
for key, value in C1.items():
    if value >= s:
        L1[key] = value
print("\nFrequent 1-itemsets L1: " + str(L1))

# Candidate 2-itemsets C2: each pair i < j of frequent items, counted once
frequent_items = sorted(int(key) for key in L1)
C2 = {}
for i in frequent_items:
    for j in frequent_items:
        if i >= j:
            continue
        count = 0
        pair = f"[{i}, {j}]"
        for t in T:
            if i in t and j in t:
                count += 1
        C2[pair] = count
print("\nCandidate 2-itemsets C2: " + str(C2))

# Frequent 2-itemsets L2: key is the pair, value is its support
L2 = {}
for key, value in C2.items():
    if value >= s:
        L2[key] = value
print("\nFrequent 2-itemsets L2: " + str(L2))

# Sum of the sizes of all baskets
total_items = 0
for t in T:
    total_items += len(t)
print(f"\nThe total number of items over all baskets is {total_items}.")

# Largest baskets, i.e. the baskets holding the most items
# t_len stores the number of items in each basket t of T
t_len = []
for t in T:
    t_len.append(len(t))
max_t_len = max(t_len)
print(f"\nThe largest baskets hold {max_t_len} items.\nThey are:")
for i in range(100):
    if t_len[i] == max_t_len:
        # T is 0-indexed, so T[i] is basket number i + 1
        print(f"Basket {i + 1}: {T[i]}")


( a ) With a support threshold of 5, find the frequent 1-itemsets L1

L1 = {‘1’: 100, ‘2’: 50, ‘3’: 33, ‘4’: 25, ‘5’: 20, ‘6’: 16, ‘7’: 14, ‘8’: 12, ‘9’: 11, ‘10’: 10, ‘11’: 9, ‘12’: 8, ‘13’: 7, ‘14’: 7, ‘15’: 6, ‘16’: 6, ‘17’: 5, ‘18’: 5, ‘19’: 5, ‘20’: 5}

This is a dictionary whose keys are itemsets and whose values are their supports.

( b ) With a support threshold of 5, find the frequent 2-itemsets L2

L2 = {‘[1, 2]’: 50, ‘[1, 3]’: 33, ‘[1, 4]’: 25, ‘[1, 5]’: 20, ‘[1, 6]’: 16, ‘[1, 7]’: 14, ‘[1, 8]’: 12, ‘[1, 9]’: 11, ‘[1, 10]’: 10, ‘[1, 11]’: 9, ‘[1, 12]’: 8, ‘[1, 13]’: 7, ‘[1, 14]’: 7, ‘[1, 15]’: 6, ‘[1, 16]’: 6, ‘[1, 17]’: 5, ‘[1, 18]’: 5, ‘[1, 19]’: 5, ‘[1, 20]’: 5, ‘[2, 3]’: 16, ‘[2, 4]’: 25, ‘[2, 5]’: 10, ‘[2, 6]’: 16, ‘[2, 7]’: 7, ‘[2, 8]’: 12, ‘[2, 9]’: 5, ‘[2, 10]’: 10, ‘[2, 12]’: 8, ‘[2, 14]’: 7, ‘[2, 16]’: 6, ‘[2, 18]’: 5, ‘[2, 20]’: 5, ‘[3, 4]’: 8, ‘[3, 5]’: 6, ‘[3, 6]’: 16, ‘[3, 9]’: 11, ‘[3, 12]’: 8, ‘[3, 15]’: 6, ‘[3, 18]’: 5, ‘[4, 5]’: 5, ‘[4, 6]’: 8, ‘[4, 8]’: 12, ‘[4, 10]’: 5, ‘[4, 12]’: 8, ‘[4, 16]’: 6, ‘[4, 20]’: 5, ‘[5, 10]’: 10, ‘[5, 15]’: 6, ‘[5, 20]’: 5, ‘[6, 9]’: 5, ‘[6, 12]’: 8, ‘[6, 18]’: 5, ‘[7, 14]’: 7, ‘[8, 16]’: 6, ‘[9, 18]’: 5, ‘[10, 20]’: 5}

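This is a dictionary whose keys are itemsets and whose values are their supports.

( c ) What is the sum of the sizes of all the baskets?

The total number of items over all baskets is 482.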
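
As a cross-check on this total, item i lies in exactly floor(100 / i) baskets, so the sum of the basket sizes equals the sum of those counts. A minimal sketch (the names `n` and `total_items` are illustrative and not part of the program above):

```python
# Item i divides exactly n // i of the basket numbers 1..n, so summing
# n // i over all items counts every (item, basket) incidence once.
n = 100
total_items = sum(n // i for i in range(1, n + 1))
print(total_items)
```

This avoids building the baskets at all, which is handy when only the total is needed.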

!Exercise 6.1.2

Question: for the item-basket data of Exercise 6.1.1, which basket is the largest?

The largest baskets hold 12 items.
They are:
Basket 60: [1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60]
Basket 72: [1, 2, 3, 4, 6, 8, 9, 12, 18, 24, 36, 72]
Basket 84: [1, 2, 3, 4, 6, 7, 12, 14, 21, 28, 42, 84]
Basket 90: [1, 2, 3, 5, 6, 9, 10, 15, 18, 30, 45, 90]
Basket 96: [1, 2, 3, 4, 6, 8, 12, 16, 24, 32, 48, 96]

Exercise 6.1.3

Only the code that builds the baskets T needs to change.

Program:

# Baskets T: basket i (1 to 100) holds every item that is a multiple of i
T = []
for i in range(1, 101):
    t = []
    for value in range(i, 101):
        if value % i == 0:
            t.append(value)
    T.append(t)
print("\nBaskets T: " + str(T))

# Candidate 1-itemsets C1: key is the item, value is its support
C1 = {}
for i in range(1, 101):
    count = 0
    for t in T:
        if i in t:
            count += 1
    C1[str(i)] = count
print("\nCandidate 1-itemsets C1: " + str(C1))

# Support threshold s
s = 5
# Frequent 1-itemsets L1: key is the item, value is its support
L1 = {}
for key, value in C1.items():
    if value >= s:
        L1[key] = value
print("\nFrequent 1-itemsets L1: " + str(L1))

# Candidate 2-itemsets C2: each pair i < j of frequent items, counted once.
# The frequent items here are 12, 16, 18, ..., not 1 to 20.
frequent_items = sorted(int(key) for key in L1)
C2 = {}
for i in frequent_items:
    for j in frequent_items:
        if i >= j:
            continue
        count = 0
        pair = f"[{i}, {j}]"
        for t in T:
            if i in t and j in t:
                count += 1
        C2[pair] = count
print("\nCandidate 2-itemsets C2: " + str(C2))

# Frequent 2-itemsets L2: key is the pair, value is its support
L2 = {}
for key, value in C2.items():
    if value >= s:
        L2[key] = value
print("\nFrequent 2-itemsets L2: " + str(L2))

# Sum of the sizes of all baskets
total_items = 0
for t in T:
    total_items += len(t)
print(f"\nThe total number of items over all baskets is {total_items}.")

# Largest basket, i.e. the basket holding the most items
# t_len stores the number of items in each basket t of T
t_len = []
for t in T:
    t_len.append(len(t))
max_t_len = max(t_len)
print(f"\nThe largest basket holds {max_t_len} items.\nIt is:")
for i in range(100):
    if t_len[i] == max_t_len:
        # T is 0-indexed, so T[i] is basket number i + 1
        print(f"Basket {i + 1}: {T[i]}")

( a ) With a support threshold of 5, find the frequent 1-itemsets L1

L1 = {‘12’: 6, ‘16’: 5, ‘18’: 6, ‘20’: 6, ‘24’: 8, ‘28’: 6, ‘30’: 8, ‘32’: 6, ‘36’: 9, ‘40’: 8, ‘42’: 8, ‘44’: 6, ‘45’: 6, ‘48’: 10, ‘50’: 6, ‘52’: 6, ‘54’: 8, ‘56’: 8, ‘60’: 12, ‘63’: 6, ‘64’: 7, ‘66’: 8, ‘68’: 6, ‘70’: 8, ‘72’: 12, ‘75’: 6, ‘76’: 6, ‘78’: 8, ‘80’: 10, ‘81’: 5, ‘84’: 12, ‘88’: 8, ‘90’: 12, ‘92’: 6, ‘96’: 12, ‘98’: 6, ‘99’: 6, ‘100’: 9}

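This is a dictionary whose keys are itemsets and whose values are their supports.

( b ) With a support threshold of 5, find the frequent 2-itemsets L2

Candidate pairs must be drawn from the frequent items in L1 (12, 16, 18, and so on up to 100), not from the items 1 to 20. A pair {i, j} lies in basket b exactly when b divides both i and j, so its support is the number of divisors of gcd(i, j), and the pair is frequent when gcd(i, j) has at least 5 divisors.

L2 therefore holds 71 pairs, for example {12, 24} (support 6), {16, 32} (support 5), {24, 48} (support 8) and {48, 96} (support 10). L2 is not empty, so the algorithm goes on to larger itemsets.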
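
In this data a pair {i, j} lies in basket b exactly when b divides both i and j, so its support is the number of divisors of gcd(i, j). The sketch below counts the frequent pairs that way instead of scanning baskets; the helper `num_divisors` and the name `frequent_pairs` are introduced here for illustration only.

```python
from math import gcd

def num_divisors(m):
    """Number of positive divisors of m."""
    return sum(1 for d in range(1, m + 1) if m % d == 0)

s = 5  # support threshold

# Basket b holds the multiples of b, so {i, j} is in basket b iff b | gcd(i, j).
frequent_pairs = {
    (i, j): num_divisors(gcd(i, j))
    for i in range(1, 101)
    for j in range(i + 1, 101)
    if num_divisors(gcd(i, j)) >= s
}

print(len(frequent_pairs))
print(frequent_pairs[(12, 24)], frequent_pairs[(48, 96)])
```

Every pair found this way consists of two frequent items, since an item that shares a gcd with at least 5 divisors has at least 5 divisors itself.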

( c ) What is the sum of the sizes of all the baskets?

The total number of items over all baskets is 482.

For the item-basket data of Exercise 6.1.3, which basket is the largest?

The largest basket holds 100 items.
It is:
Basket 1, which holds every item from 1 to 100.

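!Exercise 6.1.4

Item i appears in a fraction 1/i of the baskets, which is at least 1/10 for every item. The support threshold is 1% of the baskets, so every item is frequent and the frequent 1-itemsets are {1}, {2}, …, {10}.

Because the items are placed in baskets independently, an itemset I appears in a fraction equal to the product of 1/i over the items i in I. An itemset is therefore frequent exactly when the product of its items is at most 100. For example, {2, 5, 10} is frequent (product 100) and {3, 4, 9} is not (product 108).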
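
To list the frequent itemsets of every size, one can enumerate the subsets of {1, …, 10} and keep those whose expected fraction of baskets is at least 1%. This is a rough sketch that assumes items are included independently with probability 1/i; the names `max_product` and `frequent` are my own.

```python
from itertools import combinations
from math import prod

items = range(1, 11)
max_product = 100  # 1 / prod(I) >= 1%  is the same as  prod(I) <= 100

# With independent items, itemset I appears in a fraction 1 / prod(I)
# of the baskets; comparing integer products avoids float rounding.
frequent = [
    set(c)
    for k in range(1, 11)
    for c in combinations(items, k)
    if prod(c) <= max_product
]

print(len(frequent))
print({2, 5, 10} in frequent, {3, 4, 9} in frequent)
```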

Exercise 6.1.5

The baskets T of Exercise 6.1.1 are the ones built by the program above: basket b holds every divisor of b, for b = 1 to 100.

(a) {5, 7} → 2

The baskets containing both 5 and 7 are the multiples of 35, namely 35 and 70, and only basket 70 also contains 2. The confidence is 1/2.

(b) {2, 3, 4} → 5

The baskets containing 2, 3 and 4 are the 8 multiples of 12, and only basket 60 also contains 5. The confidence is 1/8.

Exercise 6.1.6

The baskets T of Exercise 6.1.3: basket b holds every multiple of b up to 100, for b = 1 to 100.

(a) {24, 60} → 8

The baskets containing both 24 and 60 are the common divisors of 24 and 60, i.e. 1, 2, 3, 4, 6 and 12. Of these, baskets 1, 2 and 4 also contain 8. The confidence is 3/6 = 1/2.

(b) {2, 3, 4} → 5

Only basket 1 contains 2, 3 and 4, and it also contains 5. The confidence is 1.
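
The confidences in Exercises 6.1.5 and 6.1.6 can be checked mechanically. The sketch below rebuilds both basket collections and applies the definition of confidence directly; `confidence`, `divisor_baskets` and `multiple_baskets` are names made up for this example.

```python
def confidence(baskets, lhs, rhs):
    """Fraction of the baskets containing every item of lhs that also contain rhs."""
    lhs = set(lhs)
    with_lhs = [b for b in baskets if lhs <= b]
    with_both = [b for b in with_lhs if rhs in b]
    return len(with_both) / len(with_lhs)

# Exercise 6.1.1 data: basket b holds the divisors of b.
divisor_baskets = [{i for i in range(1, b + 1) if b % i == 0} for b in range(1, 101)]
# Exercise 6.1.3 data: basket b holds the multiples of b up to 100.
multiple_baskets = [set(range(b, 101, b)) for b in range(1, 101)]

print(confidence(divisor_baskets, {5, 7}, 2))
print(confidence(divisor_baskets, {2, 3, 4}, 5))
print(confidence(multiple_baskets, {24, 60}, 8))
print(confidence(multiple_baskets, {2, 3, 4}, 5))
```

The function divides by the number of baskets containing the left-hand side, so it should only be called for itemsets that occur in at least one basket.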

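!!Exercise 6.1.7

(a) The data of Exercise 6.1.1

The baskets containing an itemset I are the multiples of lcm(I) up to 100. If lcm(I) ≤ 100, basket lcm(I) is one of them, so all of these baskets contain j exactly when j divides lcm(I).

So the rules with 100% confidence are I → j where lcm(I) ≤ 100 and j divides lcm(I). In particular, {a} → 1 holds for every item a, and {4, 6} → 12 holds because lcm(4, 6) = 12.

(b) The data of Exercise 6.1.3

The baskets containing an itemset I are the divisors of gcd(I), and basket gcd(I) is one of them, so all of these baskets contain j exactly when j is a multiple of gcd(I).

So the rules with 100% confidence are I → j where j is a multiple of gcd(I). For example, {2, 3, 4} → 5 holds because gcd(2, 3, 4) = 1: only the first basket contains 2, 3 and 4, and it contains every item from 1 to 100.

Likewise, let U be the set of items 1 to 100 and i any item in it.

Then U − {i} → i is an association rule with 100% confidence.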

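!Exercise 6.1.8

In the data of Exercise 6.1.4 the items are placed in baskets independently. For any association rule I → j, the confidence (the fraction of baskets containing I that also contain j) is therefore equal to the fraction of all baskets that contain j. The interest of I → j is the difference between these two fractions, so it is 0.

Note: this is only an informal argument.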

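Exercise 6.2.1

With n = 20 and k = (i − 1)(n − i/2) + j − i, the rows i = 1 to 6 fill positions 1 to 99, so position 100 is the first pair of row 7.

a[100] holds the count of the pair {7, 8}.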
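
The position can also be found by brute force. The sketch below evaluates the triangular-matrix formula k = (i − 1)(n − i/2) + j − i for every pair and looks for k = 100; `tri_index` is a name chosen for this example.

```python
def tri_index(i, j, n):
    """1-based array position of the pair {i, j}, assuming 1 <= i < j <= n."""
    # (i - 1) * (2n - i) is always even, so the integer division is exact.
    return (i - 1) * (2 * n - i) // 2 + j - i

n = 20
pair_at_100 = next(
    (i, j)
    for i in range(1, n + 1)
    for j in range(i + 1, n + 1)
    if tri_index(i, j, n) == 100
)
print(pair_at_100)
```

Writing the formula as (i − 1)(2n − i)/2 keeps the arithmetic in integers.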

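!Exercise 6.2.2

In k = (i − 1)(n − i/2) + j − i, the term j − i is always an integer, so whether k is an integer does not depend on j. Consider two cases for i:

  1. i is odd: then i − 1 is even, so (i − 1)(n − i/2) = (i − 1)n − ((i − 1)/2)·i is an integer, and so is k;
  2. i is even: then i − 1 and n − i/2 are both integers, so k is an integer.

In both cases k is an integer.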

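!Exercise 6.2.3

( a ) Space = 4 bytes × I(I − 1)/2 = 2I(I − 1) bytes.

( b ) Each basket contributes $\binom{K}{2}$ = K(K − 1)/2 pairs, and there are only $\binom{I}{2}$ = I(I − 1)/2 pairs in all, so the number of pairs with a nonzero count is at most min(B·K(K − 1)/2, I(I − 1)/2).

( c ) The triples method spends 12 bytes on each pair that actually occurs, against 4 bytes per possible pair for the triangular matrix. It uses less space when fewer than 1/3 of the $\binom{I}{2}$ possible pairs occur in some basket, and we can be certain of that when B·K(K − 1)/2 < I(I − 1)/6.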

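!!Exercise 6.2.4

Generalise the triangular matrix to three dimensions: keep only the triples {i, j, k} with i < j < k, so that each set of three distinct items is represented exactly once, fix an order on these triples, and store their counts in that order in a one-dimensional array of $\binom{n}{3}$ elements.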
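
One concrete way to lay the triples out is the combinatorial number system, which gives each triple with i < j < k a distinct position from 0 to C(n, 3) − 1. Note that this is an ordering I am swapping in for illustration (colexicographic order), not the lexicographic order of the book's triangular-matrix formula; `triple_index` is a hypothetical helper name.

```python
from itertools import combinations
from math import comb

def triple_index(i, j, k):
    """0-based array position of the triple {i, j, k}, assuming 1 <= i < j < k."""
    # Combinatorial number system: C(i-1, 1) + C(j-1, 2) + C(k-1, 3).
    return comb(i - 1, 1) + comb(j - 1, 2) + comb(k - 1, 3)

n = 6
positions = [triple_index(*t) for t in combinations(range(1, n + 1), 3)]
print(sorted(positions))
```

A convenient property of this layout is that the position of a triple does not depend on n, so the array can grow when new items appear.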

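!Exercise 6.2.5

(a)

An itemset is frequent exactly when its lcm is at most 20, because the lcm then has at least 5 multiples up to 100. Equivalently, it is a subset of the divisors of some m ≤ 20. The maximal frequent itemsets are the full divisor sets of m = 11 to 20, the values with no proper multiple up to 20:

{1, 11}, {1, 2, 3, 4, 6, 12}, {1, 13}, {1, 2, 7, 14}, {1, 3, 5, 15}, {1, 2, 4, 8, 16}, {1, 17}, {1, 2, 3, 6, 9, 18}, {1, 19}, {1, 2, 4, 5, 10, 20}

(b)

An itemset is frequent exactly when its gcd has at least 5 divisors. The maximal frequent itemsets are the sets of all multiples (up to 100) of g, for each g that has at least 5 divisors while none of its proper divisors does:

g = 12, 16, 18, 20, 28, 30, 42, 44, 45, 50, 52, 63, 66, 68, 70, 75, 76, 78, 81, 92, 98, 99

For example, g = 12 gives {12, 24, 36, 48, 60, 72, 84, 96} and g = 81 gives {81}.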

Exercise 6.2.6

(a)

Baskets T:
[[1], [1, 2], [1, 3], [1, 2, 4], [1, 5], [1, 2, 3, 6], [1, 7], [1, 2, 4, 8], [1, 3, 9], [1, 2, 5, 10], [1, 11], [1, 2, 3, 4, 6, 12], [1, 13], [1, 2, 7, 14], [1, 3, 5, 15], [1, 2, 4, 8, 16], [1, 17], [1, 2, 3, 6, 9, 18], [1, 19], [1, 2, 4, 5, 10, 20], [1, 3, 7, 21], [1, 2, 11, 22], [1, 23], [1, 2, 3, 4, 6, 8, 12, 24], [1, 5, 25], [1, 2, 13, 26], [1, 3, 9, 27], [1, 2, 4, 7, 14, 28], [1, 29], [1, 2, 3, 5, 6, 10, 15, 30], [1, 31], [1, 2, 4, 8, 16, 32], [1, 3, 11, 33], [1, 2, 17, 34], [1, 5, 7, 35], [1, 2, 3, 4, 6, 9, 12, 18, 36], [1, 37], [1, 2, 19, 38], [1, 3, 13, 39], [1, 2, 4, 5, 8, 10, 20, 40], [1, 41], [1, 2, 3, 6, 7, 14, 21, 42], [1, 43], [1, 2, 4, 11, 22, 44], [1, 3, 5, 9, 15, 45], [1, 2, 23, 46], [1, 47], [1, 2, 3, 4, 6, 8, 12, 16, 24, 48], [1, 7, 49], [1, 2, 5, 10, 25, 50], [1, 3, 17, 51], [1, 2, 4, 13, 26, 52], [1, 53], [1, 2, 3, 6, 9, 18, 27, 54], [1, 5, 11, 55], [1, 2, 4, 7, 8, 14, 28, 56], [1, 3, 19, 57], [1, 2, 29, 58], [1, 59], [1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60], [1, 61], [1, 2, 31, 62], [1, 3, 7, 9, 21, 63], [1, 2, 4, 8, 16, 32, 64], [1, 5, 13, 65], [1, 2, 3, 6, 11, 22, 33, 66], [1, 67], [1, 2, 4, 17, 34, 68], [1, 3, 23, 69], [1, 2, 5, 7, 10, 14, 35, 70], [1, 71], [1, 2, 3, 4, 6, 8, 9, 12, 18, 24, 36, 72], [1, 73], [1, 2, 37, 74], [1, 3, 5, 15, 25, 75], [1, 2, 4, 19, 38, 76], [1, 7, 11, 77], [1, 2, 3, 6, 13, 26, 39, 78], [1, 79], [1, 2, 4, 5, 8, 10, 16, 20, 40, 80], [1, 3, 9, 27, 81], [1, 2, 41, 82], [1, 83], [1, 2, 3, 4, 6, 7, 12, 14, 21, 28, 42, 84], [1, 5, 17, 85], [1, 2, 43, 86], [1, 3, 29, 87], [1, 2, 4, 8, 11, 22, 44, 88], [1, 89], [1, 2, 3, 5, 6, 9, 10, 15, 18, 30, 45, 90], [1, 7, 13, 91], [1, 2, 4, 23, 46, 92], [1, 3, 31, 93], [1, 2, 47, 94], [1, 5, 19, 95], [1, 2, 3, 4, 6, 8, 12, 16, 24, 32, 48, 96], [1, 97], [1, 2, 7, 14, 49, 98], [1, 3, 9, 11, 33, 99], [1, 2, 4, 5, 10, 20, 25, 50, 100]]

Candidate 1-itemsets C1:
{‘1’: 100, ‘2’: 50, ‘3’: 33, ‘4’: 25, ‘5’: 20, ‘6’: 16, ‘7’: 14, ‘8’: 12, ‘9’: 11, ‘10’: 10, ‘11’: 9, ‘12’: 8, ‘13’: 7, ‘14’: 7, ‘15’: 6, ‘16’: 6, ‘17’: 5, ‘18’: 5, ‘19’: 5, ‘20’: 5, ‘21’: 4, ‘22’: 4, ‘23’: 4, ‘24’: 4, ‘25’: 4, ‘26’: 3, ‘27’: 3, ‘28’: 3, ‘29’: 3, ‘30’: 3, ‘31’: 3, ‘32’: 3, ‘33’: 3, ‘34’: 2, ‘35’: 2, ‘36’: 2, ‘37’: 2, ‘38’: 2, ‘39’: 2, ‘40’: 2, ‘41’: 2, ‘42’: 2, ‘43’: 2, ‘44’: 2, ‘45’: 2, ‘46’: 2, ‘47’: 2, ‘48’: 2, ‘49’: 2, ‘50’: 2, ‘51’: 1, ‘52’: 1, ‘53’: 1, ‘54’: 1, ‘55’: 1, ‘56’: 1, ‘57’: 1, ‘58’: 1, ‘59’: 1, ‘60’: 1, ‘61’: 1, ‘62’: 1, ‘63’: 1, ‘64’: 1, ‘65’: 1, ‘66’: 1, ‘67’: 1, ‘68’: 1, ‘69’: 1, ‘70’: 1, ‘71’: 1, ‘72’: 1, ‘73’: 1, ‘74’: 1, ‘75’: 1, ‘76’: 1, ‘77’: 1, ‘78’: 1, ‘79’: 1, ‘80’: 1, ‘81’: 1, ‘82’: 1, ‘83’: 1, ‘84’: 1, ‘85’: 1, ‘86’: 1, ‘87’: 1, ‘88’: 1, ‘89’: 1, ‘90’: 1, ‘91’: 1, ‘92’: 1, ‘93’: 1, ‘94’: 1, ‘95’: 1, ‘96’: 1, ‘97’: 1, ‘98’: 1, ‘99’: 1, ‘100’: 1}

Frequent 1-itemsets L1:
{‘1’: 100, ‘2’: 50, ‘3’: 33, ‘4’: 25, ‘5’: 20, ‘6’: 16, ‘7’: 14, ‘8’: 12, ‘9’: 11, ‘10’: 10, ‘11’: 9, ‘12’: 8, ‘13’: 7, ‘14’: 7, ‘15’: 6, ‘16’: 6, ‘17’: 5, ‘18’: 5, ‘19’: 5, ‘20’: 5}

Candidate 2-itemsets C2:
{‘[1, 2]’: 50, ‘[1, 3]’: 33, ‘[1, 4]’: 25, ‘[1, 5]’: 20, ‘[1, 6]’: 16, ‘[1, 7]’: 14, ‘[1, 8]’: 12, ‘[1, 9]’: 11, ‘[1, 10]’: 10, ‘[1, 11]’: 9, ‘[1, 12]’: 8, ‘[1, 13]’: 7, ‘[1, 14]’: 7, ‘[1, 15]’: 6, ‘[1, 16]’: 6, ‘[1, 17]’: 5, ‘[1, 18]’: 5, ‘[1, 19]’: 5, ‘[1, 20]’: 5, ‘[2, 3]’: 16, ‘[2, 4]’: 25, ‘[2, 5]’: 10, ‘[2, 6]’: 16, ‘[2, 7]’: 7, ‘[2, 8]’: 12, ‘[2, 9]’: 5, ‘[2, 10]’: 10, ‘[2, 11]’: 4, ‘[2, 12]’: 8, ‘[2, 13]’: 3, ‘[2, 14]’: 7, ‘[2, 15]’: 3, ‘[2, 16]’: 6, ‘[2, 17]’: 2, ‘[2, 18]’: 5, ‘[2, 19]’: 2, ‘[2, 20]’: 5, ‘[3, 4]’: 8, ‘[3, 5]’: 6, ‘[3, 6]’: 16, ‘[3, 7]’: 4, ‘[3, 8]’: 4, ‘[3, 9]’: 11, ‘[3, 10]’: 3, ‘[3, 11]’: 3, ‘[3, 12]’: 8, ‘[3, 13]’: 2, ‘[3, 14]’: 2, ‘[3, 15]’: 6, ‘[3, 16]’: 2, ‘[3, 17]’: 1, ‘[3, 18]’: 5, ‘[3, 19]’: 1, ‘[3, 20]’: 1, ‘[4, 5]’: 5, ‘[4, 6]’: 8, ‘[4, 7]’: 3, ‘[4, 8]’: 12, ‘[4, 9]’: 2, ‘[4, 10]’: 5, ‘[4, 11]’: 2, ‘[4, 12]’: 8, ‘[4, 13]’: 1, ‘[4, 14]’: 3, ‘[4, 15]’: 1, ‘[4, 16]’: 6, ‘[4, 17]’: 1, ‘[4, 18]’: 2, ‘[4, 19]’: 1, ‘[4, 20]’: 5, ‘[5, 6]’: 3, ‘[5, 7]’: 2, ‘[5, 8]’: 2, ‘[5, 9]’: 2, ‘[5, 10]’: 10, ‘[5, 11]’: 1, ‘[5, 12]’: 1, ‘[5, 13]’: 1, ‘[5, 14]’: 1, ‘[5, 15]’: 6, ‘[5, 16]’: 1, ‘[5, 17]’: 1, ‘[5, 18]’: 1, ‘[5, 19]’: 1, ‘[5, 20]’: 5, ‘[6, 7]’: 2, ‘[6, 8]’: 4, ‘[6, 9]’: 5, ‘[6, 10]’: 3, ‘[6, 11]’: 1, ‘[6, 12]’: 8, ‘[6, 13]’: 1, ‘[6, 14]’: 2, ‘[6, 15]’: 3, ‘[6, 16]’: 2, ‘[6, 17]’: 0, ‘[6, 18]’: 5, ‘[6, 19]’: 0, ‘[6, 20]’: 1, ‘[7, 8]’: 1, ‘[7, 9]’: 1, ‘[7, 10]’: 1, ‘[7, 11]’: 1, ‘[7, 12]’: 1, ‘[7, 13]’: 1, ‘[7, 14]’: 7, ‘[7, 15]’: 0, ‘[7, 16]’: 0, ‘[7, 17]’: 0, ‘[7, 18]’: 0, ‘[7, 19]’: 0, ‘[7, 20]’: 0, ‘[8, 9]’: 1, ‘[8, 10]’: 2, ‘[8, 11]’: 1, ‘[8, 12]’: 4, ‘[8, 13]’: 0, ‘[8, 14]’: 1, ‘[8, 15]’: 0, ‘[8, 16]’: 6, ‘[8, 17]’: 0, ‘[8, 18]’: 1, ‘[8, 19]’: 0, ‘[8, 20]’: 2, ‘[9, 10]’: 1, ‘[9, 11]’: 1, ‘[9, 12]’: 2, ‘[9, 13]’: 0, ‘[9, 14]’: 0, ‘[9, 15]’: 2, ‘[9, 16]’: 0, ‘[9, 17]’: 0, ‘[9, 18]’: 5, ‘[9, 19]’: 0, ‘[9, 20]’: 0, ‘[10, 11]’: 0, ‘[10, 12]’: 1, ‘[10, 13]’: 0, ‘[10, 14]’: 1, ‘[10, 15]’: 3, ‘[10, 16]’: 1, ‘[10, 17]’: 0, ‘[10, 18]’: 1, ‘[10, 
19]’: 0, ‘[10, 20]’: 5, ‘[11, 12]’: 0, ‘[11, 13]’: 0, ‘[11, 14]’: 0, ‘[11, 15]’: 0, ‘[11, 16]’: 0, ‘[11, 17]’: 0, ‘[11, 18]’: 0, ‘[11, 19]’: 0, ‘[11, 20]’: 0, ‘[12, 13]’: 0, ‘[12, 14]’: 1, ‘[12, 15]’: 1, ‘[12, 16]’: 2, ‘[12, 17]’: 0, ‘[12, 18]’: 2, ‘[12, 19]’: 0, ‘[12, 20]’: 1, ‘[13, 14]’: 0, ‘[13, 15]’: 0, ‘[13, 16]’: 0, ‘[13, 17]’: 0, ‘[13, 18]’: 0, ‘[13, 19]’: 0, ‘[13, 20]’: 0, ‘[14, 15]’: 0, ‘[14, 16]’: 0, ‘[14, 17]’: 0, ‘[14, 18]’: 0, ‘[14, 19]’: 0, ‘[14, 20]’: 0, ‘[15, 16]’: 0, ‘[15, 17]’: 0, ‘[15, 18]’: 1, ‘[15, 19]’: 0, ‘[15, 20]’: 1, ‘[16, 17]’: 0, ‘[16, 18]’: 0, ‘[16, 19]’: 0, ‘[16, 20]’: 1, ‘[17, 18]’: 0, ‘[17, 19]’: 0, ‘[17, 20]’: 0, ‘[18, 19]’: 0, ‘[18, 20]’: 0, ‘[19, 20]’: 0}

Frequent 2-itemsets L2:
{‘[1, 2]’: 50, ‘[1, 3]’: 33, ‘[1, 4]’: 25, ‘[1, 5]’: 20, ‘[1, 6]’: 16, ‘[1, 7]’: 14, ‘[1, 8]’: 12, ‘[1, 9]’: 11, ‘[1, 10]’: 10, ‘[1, 11]’: 9, ‘[1, 12]’: 8, ‘[1, 13]’: 7, ‘[1, 14]’: 7, ‘[1, 15]’: 6, ‘[1, 16]’: 6, ‘[1, 17]’: 5, ‘[1, 18]’: 5, ‘[1, 19]’: 5, ‘[1, 20]’: 5, ‘[2, 3]’: 16, ‘[2, 4]’: 25, ‘[2, 5]’: 10, ‘[2, 6]’: 16, ‘[2, 7]’: 7, ‘[2, 8]’: 12, ‘[2, 9]’: 5, ‘[2, 10]’: 10, ‘[2, 12]’: 8, ‘[2, 14]’: 7, ‘[2, 16]’: 6, ‘[2, 18]’: 5, ‘[2, 20]’: 5, ‘[3, 4]’: 8, ‘[3, 5]’: 6, ‘[3, 6]’: 16, ‘[3, 9]’: 11, ‘[3, 12]’: 8, ‘[3, 15]’: 6, ‘[3, 18]’: 5, ‘[4, 5]’: 5, ‘[4, 6]’: 8, ‘[4, 8]’: 12, ‘[4, 10]’: 5, ‘[4, 12]’: 8, ‘[4, 16]’: 6, ‘[4, 20]’: 5, ‘[5, 10]’: 10, ‘[5, 15]’: 6, ‘[5, 20]’: 5, ‘[6, 9]’: 5, ‘[6, 12]’: 8, ‘[6, 18]’: 5, ‘[7, 14]’: 7, ‘[8, 16]’: 6, ‘[9, 18]’: 5, ‘[10, 20]’: 5}

Frequent 3-itemsets L3:
{1, 2, 3}, {1, 2, 4}, {1, 2, 5}, {1, 2, 6}, {1, 2, 7}, {1, 2, 8}, {1, 2, 9}, {1, 2, 10}, {1, 2, 12}, {1, 2, 14}, {1, 2, 16}, {1, 2, 18}, {1, 2, 20},
{1, 3, 4}, {1, 3, 5}, {1, 3, 6}, {1, 3, 9}, {1, 3, 12}, {1, 3, 15}, {1, 3, 18},
{1, 4, 5}, {1, 4, 6}, {1, 4, 8}, {1, 4, 10}, {1, 4, 12}, {1, 4, 16}, {1, 4, 20},
{1, 5, 10}, {1, 5, 15}, {1, 5, 20},
{1, 6, 9}, {1, 6, 12}, {1, 6, 18},
{1, 7, 14}, {1, 8, 16}, {1, 9, 18}, {1, 10, 20},
{2, 3, 4}, {2, 3, 6}, {2, 3, 9}, {2, 3, 12}, {2, 3, 18},
{2, 4, 5}, {2, 4, 6}, {2, 4, 8}, {2, 4, 10}, {2, 4, 12}, {2, 4, 16}, {2, 4, 20},
{2, 5, 10}, {2, 5, 20},
{2, 6, 9}, {2, 6, 12}, {2, 6, 18},
{2, 7, 14}, {2, 8, 16}, {2, 9, 18}, {2, 10, 20},
{3, 4, 6}, {3, 4, 12}, {3, 5, 15}, {3, 6, 9}, {3, 6, 12}, {3, 6, 18}, {3, 9, 18},
{4, 5, 10}, {4, 5, 20}, {4, 6, 12}, {4, 8, 16}, {4, 10, 20},
{5, 10, 20},
{6, 9, 18}

Frequent 4-itemsets L4:
{1, 2, 3, 4}, {1, 2, 3, 6}, {1, 2, 3, 9}, {1, 2, 3, 12}, {1, 2, 3, 18},
{1, 2, 4, 5}, {1, 2, 4, 6}, {1, 2, 4, 8}, {1, 2, 4, 10}, {1, 2, 4, 12}, {1, 2, 4, 16}, {1, 2, 4, 20},
{1, 2, 5, 10}, {1, 2, 5, 20}, {1, 2, 6, 9}, {1, 2, 6, 12}, {1, 2, 6, 18},
{1, 2, 7, 14}, {1, 2, 8, 16}, {1, 2, 9, 18}, {1, 2, 10, 20},
{1, 3, 4, 6}, {1, 3, 4, 12}, {1, 3, 5, 15}, {1, 3, 6, 9}, {1, 3, 6, 12}, {1, 3, 6, 18}, {1, 3, 9, 18},
{1, 4, 5, 10}, {1, 4, 5, 20}, {1, 4, 6, 12}, {1, 4, 8, 16}, {1, 4, 10, 20},
{1, 5, 10, 20}, {1, 6, 9, 18},
{2, 3, 4, 6}, {2, 3, 4, 12}, {2, 3, 6, 9}, {2, 3, 6, 12}, {2, 3, 6, 18}, {2, 3, 9, 18},
{2, 4, 5, 10}, {2, 4, 5, 20}, {2, 4, 6, 12}, {2, 4, 8, 16}, {2, 4, 10, 20},
{2, 5, 10, 20}, {2, 6, 9, 18},
{3, 4, 6, 12}, {3, 6, 9, 18},
{4, 5, 10, 20}

Frequent 5-itemsets L5:
{1, 2, 3, 4, 6}, {1, 2, 3, 4, 12}, {1, 2, 3, 6, 9}, {1, 2, 3, 6, 12}, {1, 2, 3, 6, 18}, {1, 2, 3, 9, 18},
{1, 2, 4, 5, 10}, {1, 2, 4, 5, 20}, {1, 2, 4, 6, 12}, {1, 2, 4, 8, 16}, {1, 2, 4, 10, 20},
{1, 2, 5, 10, 20}, {1, 2, 6, 9, 18},
{1, 3, 4, 6, 12}, {1, 3, 6, 9, 18},
{1, 4, 5, 10, 20},
{2, 3, 4, 6, 12}, {2, 3, 6, 9, 18}, {2, 4, 5, 10, 20}

Frequent 6-itemsets L6:
{1, 2, 3, 4, 6, 12}, {1, 2, 3, 6, 9, 18}, {1, 2, 4, 5, 10, 20}
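
The levels beyond L2 are easy to get wrong by hand, so here is a compact level-wise A-Priori sketch that can be used to check them on the Exercise 6.1.1 data. It is a plain illustration of the candidate-generation and counting loop, not the program used earlier in this post; `apriori`, `levels` and `sizes` are names I introduce here.

```python
from itertools import combinations

def apriori(baskets, s):
    """Return a list whose k-th entry maps each frequent (k+1)-itemset to its support."""
    items = sorted({i for b in baskets for i in b})
    candidates = [frozenset([i]) for i in items]
    levels = []
    while candidates:
        # Counting pass: support of every candidate.
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        frequent = {c: n for c, n in counts.items() if n >= s}
        if not frequent:
            break
        levels.append(frequent)
        k = len(levels)
        # Candidate generation: unions of two frequent k-sets with k + 1 items,
        # kept only if every k-subset is frequent (monotonicity).
        unions = {a | b for a in frequent for b in frequent if len(a | b) == k + 1}
        candidates = [
            u for u in unions
            if all(frozenset(sub) in frequent for sub in combinations(u, k))
        ]
    return levels

baskets = [frozenset(i for i in range(1, b + 1) if b % i == 0) for b in range(1, 101)]
levels = apriori(baskets, 5)
sizes = [len(level) for level in levels]
print(sizes)
```

The list `sizes` gives the number of frequent itemsets at each level, which makes it quick to compare against a hand-written list.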

(b)

Baskets T:
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100], [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100], [3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48, 51, 54, 57, 60, 63, 66, 69, 72, 75, 78, 81, 84, 87, 90, 93, 96, 99], [4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64, 68, 72, 76, 80, 84, 88, 92, 96, 100], [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100], [6, 12, 18, 24, 30, 36, 42, 48, 54, 60, 66, 72, 78, 84, 90, 96], [7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77, 84, 91, 98], [8, 16, 24, 32, 40, 48, 56, 64, 72, 80, 88, 96], [9, 18, 27, 36, 45, 54, 63, 72, 81, 90, 99], [10, 20, 30, 40, 50, 60, 70, 80, 90, 100], [11, 22, 33, 44, 55, 66, 77, 88, 99], [12, 24, 36, 48, 60, 72, 84, 96], [13, 26, 39, 52, 65, 78, 91], [14, 28, 42, 56, 70, 84, 98], [15, 30, 45, 60, 75, 90], [16, 32, 48, 64, 80, 96], [17, 34, 51, 68, 85], [18, 36, 54, 72, 90], [19, 38, 57, 76, 95], [20, 40, 60, 80, 100], [21, 42, 63, 84], [22, 44, 66, 88], [23, 46, 69, 92], [24, 48, 72, 96], [25, 50, 75, 100], [26, 52, 78], [27, 54, 81], [28, 56, 84], [29, 58, 87], [30, 60, 90], [31, 62, 93], [32, 64, 96], [33, 66, 99], [34, 68], [35, 70], [36, 72], [37, 74], [38, 76], [39, 78], [40, 80], [41, 82], [42, 84], [43, 86], [44, 88], [45, 90], [46, 92], [47, 94], [48, 96], [49, 98], [50, 100], [51], [52], [53], [54], [55], [56], [57], [58], [59], [60], [61], [62], [63], [64], [65], [66], [67], [68], [69], [70], [71], [72], [73], [74], [75], [76], [77], [78], [79], [80], [81], 
[82], [83], [84], [85], [86], [87], [88], [89], [90], [91], [92], [93], [94], [95], [96], [97], [98], [99], [100]]

Candidate 1-itemsets C1:
{‘1’: 1, ‘2’: 2, ‘3’: 2, ‘4’: 3, ‘5’: 2, ‘6’: 4, ‘7’: 2, ‘8’: 4, ‘9’: 3, ‘10’: 4, ‘11’: 2, ‘12’: 6, ‘13’: 2, ‘14’: 4, ‘15’: 4, ‘16’: 5, ‘17’: 2, ‘18’: 6, ‘19’: 2, ‘20’: 6, ‘21’: 4, ‘22’: 4, ‘23’: 2, ‘24’: 8, ‘25’: 3, ‘26’: 4, ‘27’: 4, ‘28’: 6, ‘29’: 2, ‘30’: 8, ‘31’: 2, ‘32’: 6, ‘33’: 4, ‘34’: 4, ‘35’: 4, ‘36’: 9, ‘37’: 2, ‘38’: 4, ‘39’: 4, ‘40’: 8, ‘41’: 2, ‘42’: 8, ‘43’: 2, ‘44’: 6, ‘45’: 6, ‘46’: 4, ‘47’: 2, ‘48’: 10, ‘49’: 3, ‘50’: 6, ‘51’: 4, ‘52’: 6, ‘53’: 2, ‘54’: 8, ‘55’: 4, ‘56’: 8, ‘57’: 4, ‘58’: 4, ‘59’: 2, ‘60’: 12, ‘61’: 2, ‘62’: 4, ‘63’: 6, ‘64’: 7, ‘65’: 4, ‘66’: 8, ‘67’: 2, ‘68’: 6, ‘69’: 4, ‘70’: 8, ‘71’: 2, ‘72’: 12, ‘73’: 2, ‘74’: 4, ‘75’: 6, ‘76’: 6, ‘77’: 4, ‘78’: 8, ‘79’: 2, ‘80’: 10, ‘81’: 5, ‘82’: 4, ‘83’: 2, ‘84’: 12, ‘85’: 4, ‘86’: 4, ‘87’: 4, ‘88’: 8, ‘89’: 2, ‘90’: 12, ‘91’: 4, ‘92’: 6, ‘93’: 4, ‘94’: 4, ‘95’: 4, ‘96’: 12, ‘97’: 2, ‘98’: 6, ‘99’: 6, ‘100’: 9}

Frequent 1-itemsets L1:
{‘12’: 6, ‘16’: 5, ‘18’: 6, ‘20’: 6, ‘24’: 8, ‘28’: 6, ‘30’: 8, ‘32’: 6, ‘36’: 9, ‘40’: 8, ‘42’: 8, ‘44’: 6, ‘45’: 6, ‘48’: 10, ‘50’: 6, ‘52’: 6, ‘54’: 8, ‘56’: 8, ‘60’: 12, ‘63’: 6, ‘64’: 7, ‘66’: 8, ‘68’: 6, ‘70’: 8, ‘72’: 12, ‘75’: 6, ‘76’: 6, ‘78’: 8, ‘80’: 10, ‘81’: 5, ‘84’: 12, ‘88’: 8, ‘90’: 12, ‘92’: 6, ‘96’: 12, ‘98’: 6, ‘99’: 6, ‘100’: 9}

Candidate 2-itemsets C2:
All $\binom{38}{2}$ = 703 pairs of the 38 frequent items in L1. Pairs that contain an infrequent item, such as any of the items 1 to 11, are never candidates.

Frequent 2-itemsets L2:
The 71 pairs {i, j} for which gcd(i, j) has at least 5 divisors, since the support of {i, j} is the number of common divisors of i and j. Examples: {12, 24} (support 6), {24, 48} (support 8), {48, 96} (support 10).

The algorithm continues in the same way: an itemset is frequent exactly when its gcd has at least 5 divisors. The largest frequent itemset is {12, 24, 36, 48, 60, 72, 84, 96}, with 8 items and support 6.

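!Exercise 6.2.7

Triangular-matrix method: one 4-byte count for each pair of the N frequent items, i.e. $\binom{N}{2}$ × 4 bytes = 2N(N − 1) bytes.

Triples method: 12 bytes (three 4-byte integers) for each candidate pair that actually occurs. These are the 10^6 frequent pairs plus the M pairs of frequent items that occur once, i.e. 12(10^6 + M) bytes.

Minimum space for the second pass = min(2N(N − 1), 12(10^6 + M)) bytes.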

Exercise 6.3.1

( a )

Item   Support
1      4
2      6
3      8
4      8
5      6
6      4

Pair     Support
{1, 2}   2
{1, 3}   3
{1, 4}   2
{1, 5}   1
{1, 6}   0
{2, 3}   3
{2, 4}   4
{2, 5}   2
{2, 6}   1
{3, 4}   4
{3, 5}   4
{3, 6}   2
{4, 5}   3
{4, 6}   3
{5, 6}   2

( b )

Pair     Bucket
{1, 2}   2
{1, 3}   3
{1, 4}   4
{1, 5}   5
{1, 6}   6
{2, 3}   6
{2, 4}   8
{2, 5}   10
{2, 6}   1
{3, 4}   1
{3, 5}   4
{3, 6}   7
{4, 5}   9
{4, 6}   2
{5, 6}   8

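( c )

From the results above:

Pair     Support   Bucket
{1, 2}   2         2
{1, 3}   3         3
{1, 4}   2         4
{1, 5}   1         5
{1, 6}   0         6
{2, 3}   3         6
{2, 4}   4         8
{2, 5}   2         10
{2, 6}   1         1
{3, 4}   4         1
{3, 5}   4         4
{3, 6}   2         7
{4, 5}   3         9
{4, 6}   3         2
{5, 6}   2         8

Hence, summing the supports of the pairs in each bucket:

Bucket   Count
0        0
1        5
2        5
3        3
4        6
5        1
6        3
7        2
8        6
9        3
10       2

Since the support threshold is 4, buckets 1, 2, 4 and 8 are frequent.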

( d )

Pair     Support   Bucket
{1, 2}   2         2
{1, 4}   2         4
{2, 4}   4         8
{2, 6}   1         1
{3, 4}   4         1
{3, 5}   4         4
{4, 6}   3         2
{5, 6}   2         8

On the second pass of the PCY Algorithm, the pairs {1, 2}, {1, 4}, {2, 4}, {2, 6}, {3, 4}, {3, 5}, {4, 6} and {5, 6} are counted.
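
The whole of Exercise 6.3.1 can be reproduced in a few lines. The sketch below assumes the twelve baskets as given in the textbook's exercise (they are not listed in this post, but they reproduce the supports tabulated above) and the hash function i × j mod 11; `h`, `bucket_counts` and `candidates` are illustrative names.

```python
from collections import Counter
from itertools import combinations

# Assumed input: the twelve baskets of the textbook's Exercise 6.3.1.
baskets = [
    {1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {4, 5, 6},
    {1, 3, 5}, {2, 4, 6}, {1, 3, 4}, {2, 4, 5},
    {3, 5, 6}, {1, 2, 4}, {2, 3, 5}, {3, 4, 6},
]
s = 4           # support threshold
n_buckets = 11

def h(i, j):
    """PCY hash of the pair {i, j}."""
    return (i * j) % n_buckets

# Pass 1: count items, and hash every pair of every basket to a bucket.
item_counts = Counter(i for b in baskets for i in b)
bucket_counts = Counter(h(i, j) for b in baskets for i, j in combinations(sorted(b), 2))
frequent_items = {i for i, c in item_counts.items() if c >= s}
frequent_buckets = {k for k, c in bucket_counts.items() if c >= s}

# Pass 2 candidates: both items frequent and the pair hashes to a frequent bucket.
candidates = [
    (i, j)
    for i, j in combinations(sorted(frequent_items), 2)
    if h(i, j) in frequent_buckets
]
print(sorted(frequent_buckets))
print(candidates)
```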

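Exercise 6.3.2

Second pass: all items are frequent, so the pairs hashed again are exactly the 8 pairs that fell into a frequent bucket on the first pass. They are hashed with the second hash function, (i + j) mod 9:

Pair     Support   Bucket
{1, 2}   2         3
{1, 4}   2         5
{2, 4}   4         6
{2, 6}   1         8
{3, 4}   4         7
{3, 5}   4         8
{4, 6}   3         1
{5, 6}   2         2

Hence:

Bucket   Count
0        0
1        3
2        2
3        2
4        0
5        2
6        4
7        4
8        5

Since the support threshold is 4, buckets 6, 7 and 8 are frequent.

On the third pass of the Multistage Algorithm, the pairs {2, 4}, {2, 6}, {3, 4} and {3, 5} are counted.

The second pass reduces the set of candidate pairs from 8 to 4.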
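
The two hashing passes of the Multistage Algorithm can be sketched as follows. The baskets are assumed to be those of the textbook's Exercise 6.3.1, and the second hash is assumed to be (i + j) mod 9; `h2` is kept as a separate function so that a different second hash can be swapped in.

```python
from collections import Counter
from itertools import combinations

# Assumed input: the twelve baskets of the textbook's Exercise 6.3.1.
baskets = [
    {1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {4, 5, 6},
    {1, 3, 5}, {2, 4, 6}, {1, 3, 4}, {2, 4, 5},
    {3, 5, 6}, {1, 2, 4}, {2, 3, 5}, {3, 4, 6},
]
s = 4  # support threshold; every item is frequent at this threshold

def h1(i, j):
    return (i * j) % 11   # first-pass hash of Exercise 6.3.1

def h2(i, j):
    return (i + j) % 9    # ASSUMED second-pass hash

def pairs(basket):
    return combinations(sorted(basket), 2)

# Pass 1: bucket counts under h1.
counts1 = Counter(h1(i, j) for b in baskets for i, j in pairs(b))
frequent1 = {k for k, c in counts1.items() if c >= s}

# Pass 2: rehash only the pairs whose h1 bucket was frequent.
counts2 = Counter(
    h2(i, j) for b in baskets for i, j in pairs(b) if h1(i, j) in frequent1
)
frequent2 = {k for k, c in counts2.items() if c >= s}

# Pass 3 candidates: pairs that land in a frequent bucket of both tables.
candidates = [
    p for p in combinations(range(1, 7), 2)
    if h1(*p) in frequent1 and h2(*p) in frequent2
]
print(dict(counts2))
print(candidates)
```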

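Exercise 6.3.3

Item   Support
1      4
2      6
3      8
4      8
5      6
6      4

Table 1 uses the hash (2i + 3j + 4) mod 5 and table 2 uses (i + 4j) mod 5, both evaluated with i < j:

Pair     Support   Bucket 1   Bucket 2
{1, 2}   2         2          4
{1, 3}   3         0          3
{1, 4}   2         3          2
{1, 5}   1         1          1
{1, 6}   0         4          0
{2, 3}   3         2          4
{2, 4}   4         0          3
{2, 5}   2         3          2
{2, 6}   1         1          1
{3, 4}   4         2          4
{3, 5}   4         0          3
{3, 6}   2         3          2
{4, 5}   3         2          4
{4, 6}   3         0          3
{5, 6}   2         2          4

Hence, summing the supports of the pairs in each bucket:

Bucket   Table 1 count   Table 2 count
0        14              0
1        2               2
2        14              6
3        6               14
4        0               14

Both hash functions depend only on (i − j) mod 5, so the two tables split the pairs into the same five groups and the second table removes nothing that the first does not.

With a support threshold of 4, 12 pairs survive both tables, whereas PCY with the 11-bucket table of Exercise 6.3.1 leaves only 8.

So the two tables eliminate more pairs than PCY only at a support threshold of 1, where they discard {1, 6} (its bucket count is 0) and PCY discards nothing.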
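
The bucket counts of the two Multihash tables can be checked with the sketch below. It assumes the textbook's twelve baskets from Exercise 6.3.1 and the hash functions (2i + 3j + 4) mod 5 and (i + 4j) mod 5, evaluated with i < j; `h_a`, `h_b` and `survivors` are names made up for this example.

```python
from itertools import combinations

# Assumed input: the twelve baskets of the textbook's Exercise 6.3.1.
baskets = [
    {1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {4, 5, 6},
    {1, 3, 5}, {2, 4, 6}, {1, 3, 4}, {2, 4, 5},
    {3, 5, 6}, {1, 2, 4}, {2, 3, 5}, {3, 4, 6},
]
s = 4  # support threshold

def h_a(i, j):
    return (2 * i + 3 * j + 4) % 5   # ASSUMED hash of table 1, with i < j

def h_b(i, j):
    return (i + 4 * j) % 5           # ASSUMED hash of table 2, with i < j

# One pass over the data fills both tables (five buckets each).
counts_a = [0] * 5
counts_b = [0] * 5
for b in baskets:
    for i, j in combinations(sorted(b), 2):
        counts_a[h_a(i, j)] += 1
        counts_b[h_b(i, j)] += 1

# A pair stays a candidate only if its bucket is frequent in both tables.
survivors = [
    p for p in combinations(range(1, 7), 2)
    if counts_a[h_a(*p)] >= s and counts_b[h_b(*p)] >= s
]
print(counts_a, counts_b, len(survivors))
```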

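!Exercise 6.3.4

Space for the table translating item names to integers: 10^6 × 4 bytes = 4×10^6 bytes.

Space for the item counts: 8×10^6 bytes.

Space for the hash table of bucket counts: P × 4 bytes = 4P bytes.

These three structures are held in main memory together, so their total cannot exceed the memory size S, i.e. 4×10^6 + 8×10^6 + 4P ≤ S.

Hence P ≤ (S − 1.2×10^7)/4.

!Exercise 6.3.5

The Multihash Algorithm can reduce the memory requirement.

The optimal number of hash tables to use on the first pass is (S − 1.2×10^7)/8P.

!Exercise 6.3.6

The expected number of candidate pairs on the third pass of the Multistage Algorithm is (S − 1.2×10^7)/128P.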

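Exercise 6.4.1

Negative border: {G}, {H}, {A, B, C}, and every pair of frequent items that is not itself frequent: {A, E}, {A, F}, {B, D}, {B, E}, {B, F}, {C, D}, {C, E}, {C, F}, {D, E}, {D, F}, {E, F}.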
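
The negative border can be enumerated directly from its definition: an itemset belongs to it when it is not frequent but every immediate subset is. The sketch assumes the maximal frequent itemsets stated in the textbook's exercise, {A, B}, {B, C}, {A, C}, {A, D}, {E} and {F}, which are not repeated in this post; `is_frequent` and `negative_border` are illustrative names.

```python
from itertools import combinations

items = "ABCDEFGH"
# Assumed input: the maximal frequent itemsets given in the exercise.
maximal = [set("AB"), set("BC"), set("AC"), set("AD"), {"E"}, {"F"}]

def is_frequent(itemset):
    """An itemset is frequent iff it is contained in a maximal frequent itemset."""
    # The empty set is contained in every set, so it counts as frequent.
    return any(itemset <= m for m in maximal)

negative_border = [
    set(c)
    for k in range(1, len(items) + 1)
    for c in combinations(items, k)
    if not is_frequent(set(c)) and all(is_frequent(set(c) - {x}) for x in c)
]
print(sorted("".join(sorted(x)) for x in negative_border))
```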

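Exercise 6.4.2

( a )

The itemsets frequent in the sample are {1}, {2}, {3}, {4}, {5}, {6}, {1, 2}, {1, 3}, {2, 3}, {2, 4}, {3, 4}, {3, 5}, {4, 5}, {4, 6}, {5, 6}, {1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {4, 5, 6}.

( b )

The negative border is {1, 4}, {1, 5}, {1, 6}, {2, 5}, {2, 6}, {3, 6}.

( c )

The pass through the full dataset yields:

  1. The counts of the itemsets that are frequent in the sample. With the threshold of 4, the ones that are frequent in the whole are {1}, {2}, {3}, {4}, {5}, {6}, {2, 4}, {3, 4} and {3, 5}.
  2. The counts of the negative border: {1, 4}: 2, {1, 5}: 1, {1, 6}: 0, {2, 5}: 2, {2, 6}: 1, {3, 6}: 2.

Every count in the negative border is below 4, so no itemset in the negative border is frequent in the whole dataset.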
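
The three steps of Toivonen's Algorithm on this data can be traced with the sketch below. It assumes the twelve baskets of the textbook's Exercise 6.3.1, with the first four as the sample, a sample threshold of 1 and a full threshold of 4; all function and variable names are my own.

```python
from itertools import combinations

# Assumed input: the twelve baskets of the textbook's Exercise 6.3.1.
baskets = [frozenset(b) for b in (
    {1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {4, 5, 6},
    {1, 3, 5}, {2, 4, 6}, {1, 3, 4}, {2, 4, 5},
    {3, 5, 6}, {1, 2, 4}, {2, 3, 5}, {3, 4, 6},
)]
sample = baskets[:4]
sample_threshold, full_threshold = 1, 4

def support(itemset, data):
    return sum(1 for b in data if itemset <= b)

all_itemsets = [frozenset(c) for k in range(1, 7) for c in combinations(range(1, 7), k)]

# Step 1: itemsets frequent in the sample.
sample_frequent = {x for x in all_itemsets if support(x, sample) >= sample_threshold}

# Step 2: negative border = not frequent in the sample, but every immediate
# subset is (the empty set counts as frequent, so singletons pass).
def immediate_subsets_frequent(x):
    return all(len(x) == 1 or (x - {i}) in sample_frequent for i in x)

negative_border = [
    x for x in all_itemsets
    if x not in sample_frequent and immediate_subsets_frequent(x)
]

# Step 3: count both collections over the full dataset.
border_counts = {tuple(sorted(x)): support(x, baskets) for x in negative_border}
frequent_in_full = sorted(
    tuple(sorted(x)) for x in sample_frequent if support(x, baskets) >= full_threshold
)
print(border_counts)
print(frequent_in_full)
```

If any count in `border_counts` reached the full threshold, the algorithm would have to be repeated with a new sample.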

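!!Exercise 6.4.3

The probability that a single basket contains item i is p = s/n.

The probability that item i is found to be frequent is:

[Formula image missing from the source]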
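
One way to write this probability, offered only as a sketch consistent with p = s/n and not as a recovery of the original formula: treat the m = n/100 sampled baskets as independent draws that each contain i with probability p (a binomial approximation; the symbols m and k are introduced here). Item i is found frequent when it appears in at least s/100 of the sampled baskets.

```latex
\[
P(i \text{ frequent in the sample})
  \approx \sum_{k = s/100}^{m} \binom{m}{k}\, p^{k} (1 - p)^{m - k},
  \qquad m = \frac{n}{100}, \quad p = \frac{s}{n}.
\]
% If the sample is drawn without replacement, the exact term is hypergeometric:
\[
P(i \text{ frequent in the sample})
  = \sum_{k = s/100}^{\min(s,\, m)}
    \frac{\binom{s}{k} \binom{n - s}{m - k}}{\binom{n}{m}}.
\]
```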

!!Exercise 6.5.1

The fraction of time during which we are scoring the pair {i, j} is (1 − c)p.
