Unstructured data in NLP can be a real headache: after you've collected a pile of text, how exactly does it turn into the vectors that finally get fed to the network? And what do time_step and batch_size actually look like?
Let's set aside the vectorized batch_size and time_step for a moment. Suppose we have a handful of sentences like these:
sentences = ["i love you", "he loves me", "she likes baseball", "i hate you", "sorry for that"]
There are five sentences in total. Say I want to do sentiment classification. Since the dataset is tiny, I'll treat all five sentences as a single batch, and since every sentence has exactly three words, time_step is 3.
The input then looks like this:
batch_size = 5
time_step1: i he she i sorry
time_step2: love love likes hate for
time_step3: you me baseball you that
Then, when the RNN computes the loss, at each time step it averages the loss over the batch_size samples (though sometimes the loss is computed only at the last time step).
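To make the two reduction schemes concrete, here is a small sketch with NumPy. The per-sample loss values are made up purely for illustration:

```python
import numpy as np

# Hypothetical per-sample losses for batch_size=5 over 3 time steps.
# Row t holds the loss of each of the 5 samples at time step t.
losses = np.array([[0.9, 1.2, 0.8, 1.1, 1.0],   # time_step1
                   [0.7, 0.9, 0.6, 0.8, 0.7],   # time_step2
                   [0.5, 0.6, 0.4, 0.5, 0.6]])  # time_step3

# Scheme 1: average over the batch at every time step
per_step_mean = losses.mean(axis=1)

# Scheme 2: only look at the final time step
final_step_only = losses[-1].mean()
```

In practice a sequence-labeling task would use something like `per_step_mean` (often further averaged over time), while a whole-sequence classifier typically uses only the final step.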
Now let's turn this into something the computer can understand. First we need a vocabulary:
word_list = " ".join(sentences).split()
word_list = sorted(set(word_list))   # sorted, so the indices are reproducible
word_dict = {w: i for i, w in enumerate(word_list)}
vocab_size = len(word_dict)
>>> word_list
['baseball', 'for', 'hate', 'he', 'i', 'likes', 'love', 'loves', 'me', 'she', 'sorry', 'that', 'you']
>>> word_dict
{'baseball': 0, 'for': 1, 'hate': 2, 'he': 3, 'i': 4, 'likes': 5, 'love': 6, 'loves': 7, 'me': 8, 'she': 9, 'sorry': 10, 'that': 11, 'you': 12}
>>> vocab_size
13
Then we turn every word into a one-hot vector of dimension 13.
For example, time_step1 looks like this:
array([[0,0,0,0,1,0,0,0,0,0,0,0,0],  # i
       [0,0,0,1,0,0,0,0,0,0,0,0,0],  # he
       [0,0,0,0,0,0,0,0,0,1,0,0,0],  # she
       [0,0,0,0,1,0,0,0,0,0,0,0,0],  # i
       [0,0,0,0,0,0,0,0,0,0,1,0,0]]) # sorry
This is exactly what gets fed to the network at time_step1 with batch_size = 5.
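Rather than writing that matrix out by hand, the whole input tensor can be built programmatically. A self-contained sketch (using `sorted` on the vocabulary so the word indices are stable across runs):

```python
import numpy as np

sentences = ["i love you", "he loves me", "she likes baseball",
             "i hate you", "sorry for that"]

word_list = sorted(set(" ".join(sentences).split()))
word_dict = {w: i for i, w in enumerate(word_list)}
vocab_size = len(word_dict)

tokens = [s.split() for s in sentences]
batch_size, time_step = len(tokens), len(tokens[0])

# inputs[t] is the (batch_size, vocab_size) one-hot matrix
# that the RNN sees at time step t
inputs = np.zeros((time_step, batch_size, vocab_size))
for b, words in enumerate(tokens):
    for t, w in enumerate(words):
        inputs[t, b, word_dict[w]] = 1

print(inputs.shape)  # (3, 5, 13): (time_step, batch_size, vocab_size)
print(inputs[0])     # the one-hot matrix for time_step1
```

The `(time_step, batch_size, vocab_size)` layout matches the time-major view used above; many frameworks also accept the batch-major layout `(batch_size, time_step, vocab_size)`, which is just a transpose of the first two axes.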