Analysis of the Keras example file babi_memnn.py

This example implements a reading-comprehension neural network: you give it a short passage of text and a question about it, and check whether it can produce the answer.

First of all, the code has a bug: it should run under Python 2, but it raises an error under Python 3.

Someone has submitted a pull request for it: https://github.com/keras-team/keras/pull/13519/commits/3fc48bcd9a9cc931c43cf4e9e63ae35b61af8910

but the maintainers no longer put much effort into this repository, so it still has not been merged.

As a workaround, take line 37,

return [x.strip() for x in re.split(r'(\W+)?', sent) if x.strip()]

and remove the question mark from the regular expression.
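The reason is that the optional group in r'(\W+)?' lets the pattern match an empty string, which Python 3's re.split does not handle the way Python 2 did. The patched tokenize function would look roughly like this (a sketch, not necessarily character-for-character identical to the proposed fix):

import re

def tokenize(sent):
    # Return the tokens of a sentence, punctuation included,
    # e.g. 'Where is Mary?' -> ['Where', 'is', 'Mary', '?'].
    # r'(\W+)' instead of r'(\W+)?': without the optional '?', the pattern can
    # no longer match an empty string, so re.split works under Python 3 while
    # still keeping runs of non-word characters as separate tokens.
    return [x.strip() for x in re.split(r'(\W+)', sent) if x.strip()]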

 

The dataset is Facebook's bAbI dataset; the official page is https://research.fb.com/downloads/babi/

I have a habit: whenever I see a name, I want to understand what the name itself means. A search turned up nothing official, only an unofficial guess at https://www.quora.com/What-does-bAbI-stand-for about what the name means:

The official spelling is actually bAbI. Both the pronunciation and the meaning are "baby", roughly in the sense of learning like an infant; turning "baby" into "bAbI" is the geeks' way of embedding "AI" into the word "baby", with the letters A and I deliberately uppercase and the remaining letters deliberately lowercase.

Download the dataset from https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz, unpack it, and open qa1_single-supporting-fact_train.txt; then you can roughly see what is going on.

For example, the first sample:

1 Mary moved to the bathroom.
2 John went to the hallway.
3 Where is Mary? 	bathroom	1

Lines 1 and 2 are the story material, and line 3 is the question together with its answer (the trailing number is the id of the supporting fact).
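To make the format concrete, here is a minimal sketch (not the exact code from the example) of how such a file could be parsed into (story, question, answer) triples, reusing the tokenize function above:

def parse_stories(lines):
    # Minimal parsing sketch: the line id restarts at 1 whenever a new story
    # begins, and question lines carry "question \t answer \t supporting fact id".
    data, story = [], []
    for line in lines:
        nid, text = line.strip().split(' ', 1)
        if int(nid) == 1:
            story = []                      # a new story begins
        if '\t' in text:                    # this is a question line
            question, answer, _ = text.split('\t')
            flat_story = [tok for sent in story for tok in sent]
            data.append((flat_story, tokenize(question), answer))
        else:                               # an ordinary story sentence
            story.append(tokenize(text))
    return data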

From this dataset, the script builds its own index for every word and punctuation mark:

{'.': 1, '?': 2, 'Daniel': 3, 'John': 4, 'Mary': 5, 'Sandra': 6, 'Where': 7, 'back': 8, 'bathroom': 9, 'bedroom': 10, 'garden': 11, 'hallway': 12, 'is': 13, 'journeyed': 14, 'kitchen': 15, 'moved': 16, 'office': 17, 'the': 18, 'to': 19, 'travelled': 20, 'went': 21}

You can see there are 21 indices in total; adding the padding token, whose value is 0, gives 22 indices altogether.
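This mapping is simply built from the sorted set of all tokens that occur in the stories, questions, and answers, numbered from 1 so that 0 stays free for padding; roughly:

# Sketch of the vocabulary construction (train_stories / test_stories are the
# parsed (story, question, answer) triples).
vocab = set()
for story, q, answer in train_stories + test_stories:
    vocab |= set(story + q + [answer])
vocab = sorted(vocab)
vocab_size = len(vocab) + 1                 # +1 reserves index 0 for padding -> 22
word_idx = dict((c, i + 1) for i, c in enumerate(vocab))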

 

Each story, question, and answer is then encoded:

(['Mary', 'moved', 'to', 'the', 'bathroom', '.', 'John', 'went', 'to', 'the', 'hallway', '.'], ['Where', 'is', 'Mary', '?'], 'bathroom')
[ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  5 16 19 18  9  1  4 21 19 18 12  1]
[ 7 13  5  2]
9

A 0 means nothing is there; it is the padding character.

The shapes of the inputs:

inputs_train shape: (10000, 68)
queries_train shape: (10000, 4)
answers_train shape: (10000,)
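These vectors come from looking each token up in word_idx and left-padding to the longest story / question with Keras' pad_sequences (whose default padding mode is 'pre', hence the leading zeros). A simplified sketch, assuming word_idx, story_maxlen (68) and query_maxlen (4) are already computed:

import numpy as np
from keras.preprocessing.sequence import pad_sequences

def vectorize_stories(data, word_idx, story_maxlen, query_maxlen):
    # Map every token to its index and pad with 0 on the left.
    inputs, queries, answers = [], [], []
    for story, query, answer in data:
        inputs.append([word_idx[w] for w in story])
        queries.append([word_idx[w] for w in query])
        answers.append(word_idx[answer])    # a single index, e.g. 9 for 'bathroom'
    return (pad_sequences(inputs, maxlen=story_maxlen),
            pad_sequences(queries, maxlen=query_maxlen),
            np.array(answers))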

The network architecture:

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            (None, 68)           0
__________________________________________________________________________________________________
input_2 (InputLayer)            (None, 4)            0
__________________________________________________________________________________________________
sequential_1 (Sequential)       multiple             1408        input_1[0][0]
__________________________________________________________________________________________________
sequential_3 (Sequential)       (None, 4, 64)        1408        input_2[0][0]
__________________________________________________________________________________________________
dot_1 (Dot)                     (None, 68, 4)        0           sequential_1[1][0]
                                                                 sequential_3[1][0]
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 68, 4)        0           dot_1[0][0]
__________________________________________________________________________________________________
sequential_2 (Sequential)       multiple             88          input_1[0][0]
__________________________________________________________________________________________________
add_1 (Add)                     (None, 68, 4)        0           activation_1[0][0]
                                                                 sequential_2[1][0]
__________________________________________________________________________________________________
permute_1 (Permute)             (None, 4, 68)        0           add_1[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 4, 132)       0           permute_1[0][0]
                                                                 sequential_3[1][0]
__________________________________________________________________________________________________
lstm_1 (LSTM)                   (None, 32)           21120       concatenate_1[0][0]
__________________________________________________________________________________________________
dropout_4 (Dropout)             (None, 32)           0           lstm_1[0][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 22)           726         dropout_4[0][0]
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 22)           0           dense_1[0][0]
==================================================================================================
Total params: 24,750
Trainable params: 24,750
Non-trainable params: 0
__________________________________________________________________________________________________
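This summary corresponds to an end-to-end memory network: the story is embedded twice (a "memory" encoding m with dimension 64 and a second encoding c with dimension 4), the question is embedded once, a dot product plus softmax between the story and question embeddings yields attention weights over the 68 story positions, the weighted memory is added to c, permuted, concatenated with the question encoding, and pushed through an LSTM and a softmax over the 22-word vocabulary. A sketch of the model definition that reproduces the shapes and parameter counts above (a reconstruction, not necessarily line-for-line identical to babi_memnn.py):

from keras.models import Sequential, Model
from keras.layers import Input, Embedding, Dropout, Dot, Add, Activation
from keras.layers import Permute, Concatenate, LSTM, Dense

story_maxlen, query_maxlen, vocab_size = 68, 4, 22

# two inputs: the story (68 tokens) and the question (4 tokens)
input_sequence = Input((story_maxlen,))
question = Input((query_maxlen,))

# memory encoder m: (samples, 68) -> (samples, 68, 64), 22 * 64 = 1408 params
input_encoder_m = Sequential([Embedding(input_dim=vocab_size, output_dim=64),
                              Dropout(0.3)])
# second story encoder c: (samples, 68) -> (samples, 68, 4), 22 * 4 = 88 params
input_encoder_c = Sequential([Embedding(input_dim=vocab_size, output_dim=query_maxlen),
                              Dropout(0.3)])
# question encoder: (samples, 4) -> (samples, 4, 64), 22 * 64 = 1408 params
question_encoder = Sequential([Embedding(input_dim=vocab_size, output_dim=64,
                                         input_length=query_maxlen),
                               Dropout(0.3)])

input_encoded_m = input_encoder_m(input_sequence)
input_encoded_c = input_encoder_c(input_sequence)
question_encoded = question_encoder(question)

# attention: dot product over the embedding axis, softmax over story positions
match = Dot(axes=(2, 2))([input_encoded_m, question_encoded])   # (samples, 68, 4)
match = Activation('softmax')(match)

# add the second story encoding, then align the axes with the question
response = Add()([match, input_encoded_c])                      # (samples, 68, 4)
response = Permute((2, 1))(response)                            # (samples, 4, 68)

# combine with the question encoding and reduce with an LSTM
answer = Concatenate()([response, question_encoded])            # (samples, 4, 132)
answer = LSTM(32)(answer)
answer = Dropout(0.3)(answer)
answer = Dense(vocab_size)(answer)                              # (samples, 22)
answer = Activation('softmax')(answer)

model = Model([input_sequence, question], answer)
model.compile(optimizer='rmsprop', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

The sparse_categorical_crossentropy loss matches the answers_train shape of (10000,): each answer is stored as a single word index rather than a one-hot vector.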

——————————————————————

Table of contents

Analysis of the Keras example files
