Transformer Quiz

穗岁陶涛

Last modified 2024-06-22 15:15:21


Tags: transformer, natural language processing

First published 2024-06-22 15:11:02

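Copyright notice: This is an original article by the author, released under the CC 4.0 BY-SA license. Reposts must include a link to the original source and this notice.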

Article link: https://blog.csdn.net/m0_62603888/article/details/139882910


1.	When implementing positional encoding, which line of the following code computes the encoding with the sine function? Answer: C

    import math
    import torch
    import torch.nn as nn

    class PositionalEncoding(nn.Module):
        def __init__(self, d_model, max_len=5000):
            super(PositionalEncoding, self).__init__()
            self.dropout = nn.Dropout(p=0.1)
            pe = torch.zeros(max_len, d_model)
            position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
            # One frequency per pair of columns: 10000^(-2i / d_model)
            div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
            pe[:, 0::2] = torch.sin(position * div_term)  # (1)
            pe[:, 1::2] = torch.cos(position * div_term)  # (2)
            # Shape becomes (max_len, 1, d_model) so it broadcasts over the batch
            pe = pe.unsqueeze(0).transpose(0, 1)
            self.register_buffer('pe', pe)

        def forward(self, x):
            # x is expected as (seq_len, batch, d_model)
            x = x + self.pe[:x.size(0), :]
            return self.dropout(x)

    A. `position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)`
    B. `div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))`
    C. `pe[:, 0::2] = torch.sin(position * div_term)`
    D. `self.register_buffer('pe', pe)`
2.	In the Transformer's multi-head attention, which line of the following code concatenates the attention outputs of the different heads and applies a linear transformation? Answer: D

    import torch
    import torch.nn as nn

    class MultiHeadAttention(nn.Module):
        def __init__(self, d_model, num_heads):
            super(MultiHeadAttention, self).__init__()
            self.num_heads = num_heads
            self.d_model = d_model
            self.depth = d_model // num_heads
            self.wq = nn.Linear(d_model, d_model)
            self.wk = nn.Linear(d_model, d_model)
            self.wv = nn.Linear(d_model, d_model)
            self.dense = nn.Linear(d_model, d_model)

        def split_heads(self, x, batch_size):
            # (batch, seq_len, d_model) -> (batch, num_heads, seq_len, depth)
            x = x.view(batch_size, -1, self.num_heads, self.depth)
            return x.transpose(1, 2)

        def forward(self, query, key, value, mask=None):
            batch_size = query.size(0)
            query = self.wq(query)  # (1)
            key = self.wk(key)  # (2)
            value = self.wv(value)  # (3)
            query = self.split_heads(query, batch_size)  # (4)
            key = self.split_heads(key, batch_size)  # (5)
            value = self.split_heads(value, batch_size)  # (6)
            scores = torch.matmul(query, key.transpose(-2, -1)) / torch.sqrt(torch.tensor(self.depth, dtype=torch.float32))
            if mask is not None:
                scores = scores.masked_fill(mask == 0, -1e9)
            attention = torch.nn.functional.softmax(scores, dim=-1)
            x = torch.matmul(attention, value)  # (7)
            # Merge the heads back into (batch, seq_len, d_model)
            x = x.transpose(1, 2).contiguous().view(batch_size, -1, self.d_model)  # (8)
            output = self.dense(x)  # (9)
            return output

    A. `query = self.wq(query)`
    B. `x = x.transpose(1, 2).contiguous().view(batch_size, -1, self.d_model)`
    C. `scores = torch.matmul(query, key.transpose(-2, -1)) / torch.sqrt(torch.tensor(self.depth, dtype=torch.float32))`
    D. `output = self.dense(x)`
3.	In the following code snippet, which line implements the scaled dot-product attention computation used in self-attention? Answer: A

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ScaledDotProductAttention(nn.Module):
        def __init__(self, d_model):
            super(ScaledDotProductAttention, self).__init__()
            self.d_model = d_model

        def forward(self, query, key, value, mask=None):
            scores = torch.matmul(query, key.transpose(-2, -1)) / torch.sqrt(torch.tensor(self.d_model, dtype=torch.float32))  # (1)
            if mask is not None:
                scores = scores.masked_fill(mask == 0, -1e9)  # (2)
            attention = F.softmax(scores, dim=-1)  # (3)
            output = torch.matmul(attention, value)  # (4)
            return output, attention

    A. `scores = torch.matmul(query, key.transpose(-2, -1)) / torch.sqrt(torch.tensor(self.d_model, dtype=torch.float32))`
    B. `scores = scores.masked_fill(mask == 0, -1e9)`
    C. `attention = F.softmax(scores, dim=-1)`
    D. `output = torch.matmul(attention, value)`
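4.	The multi-head attention mechanism of the Transformer model is mainly used to: Answer: C
    A. Improve the model's parallel computing ability
    B. Strengthen the model's nonlinearity
    C. Capture feature representations in different subspaces
    D. Reduce the number of model parameters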
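5.	Which of the following layer types appear in both the encoder and the decoder of the Transformer model? Answer: C, D
    A. Convolutional layer
    B. Recurrent layer
    C. Fully connected layer
    D. Self-attention layer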
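6.	Layer Normalization in the Transformer model is usually applied: Answer: C, D
    A. After the self-attention layer
    B. Before the multi-head attention layer
    C. Before the residual connection
    D. In the middle of each sublayer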
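7.	Which of the following methods is commonly used to train Transformer models? Answer: C
    A. Gradient descent
    B. Momentum optimization
    C. Adam optimizer
    D. Stochastic gradient descent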
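8.	Which task were sequence-to-sequence models originally designed to solve? Answer: B
    A. Image classification
    B. Machine translation
    C. Speech recognition
    D. Text classification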
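9.	The encoder and decoder structure commonly used in sequence-to-sequence models is: Answer: B
    A. Convolutional neural network
    B. Recurrent neural network
    C. Generative adversarial network
    D. Autoencoder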
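10.	In machine translation, the encoder and decoder most commonly used in sequence-to-sequence models are: Answer: B
    A. Convolutional neural network (CNN)
    B. Recurrent neural network (RNN)
    C. Generative adversarial network (GAN)
    D. Graph convolutional network (GCN)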
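
To make question 1 easier to check by hand, here is a minimal dependency-free sketch of the same sinusoidal table. The function name `sinusoidal_positional_encoding` and the tiny sizes are assumptions chosen for illustration. It uses plain lists instead of tensors, so it is not a drop-in replacement for the `PositionalEncoding` module in the question.

```python
import math

def sinusoidal_positional_encoding(max_len, d_model):
    """Build a max_len x d_model table: sine on even columns, cosine on odd columns."""
    pe = [[0.0] * d_model for _ in range(max_len)]
    for pos in range(max_len):
        for i in range(0, d_model, 2):
            # Same frequency term as div_term in the quiz code: exp(-ln(10000) * i / d_model)
            angle = pos * math.exp(-math.log(10000.0) * i / d_model)
            pe[pos][i] = math.sin(angle)          # even column, line (1) in the quiz
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)  # odd column, line (2) in the quiz
    return pe

# Hypothetical small sizes so the table can be inspected by eye
table = sinusoidal_positional_encoding(max_len=4, d_model=6)
```

At position 0 every angle is zero, so the first row alternates between 0 (sine) and 1 (cosine), which is a quick sanity check for any implementation.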
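
In question 2, line (8) merges the heads back into one `d_model`-wide vector per position, and line (9) applies the final linear layer. The split and merge steps can be sketched without tensors as below. The helper names `split_heads` and `merge_heads` mirror the quiz code but operate on plain lists for a single sequence, which is an assumption made to keep the example self-contained. The linear layer is left out.

```python
def split_heads(x, num_heads):
    """seq_len x d_model -> num_heads x seq_len x depth, by slicing each row."""
    d_model = len(x[0])
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    depth = d_model // num_heads
    return [[row[h * depth:(h + 1) * depth] for row in x] for h in range(num_heads)]

def merge_heads(heads):
    """num_heads x seq_len x depth -> seq_len x d_model, by concatenating per position."""
    seq_len = len(heads[0])
    return [[v for head in heads for v in head[t]] for t in range(seq_len)]

# Hypothetical input: 2 positions, d_model = 4, split into 2 heads of depth 2
x = [[1, 2, 3, 4], [5, 6, 7, 8]]
heads = split_heads(x, num_heads=2)
restored = merge_heads(heads)
```

Merging is the exact inverse of splitting, so `restored` equals `x`. In the quiz code the attention step runs between the two, independently for each head.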
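
The four numbered lines in question 3 correspond to scaling the dot products, masking, softmax, and the weighted sum of values. A minimal pure-Python sketch of those steps for a single sequence follows. The list-based signature is an assumption for illustration, and the scale uses the key dimension `d_k`, which equals `d_model` in the single-head module from the question.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(query, key, value, mask=None):
    """query: n_q x d_k, key: n_k x d_k, value: n_k x d_v, mask: n_q x n_k (0 = hidden)."""
    d_k = len(key[0])
    weights = []
    for qi, q in enumerate(query):
        # (1) dot product of the query with every key, scaled by sqrt(d_k)
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in key]
        # (2) masked positions get a large negative score, as in the quiz code
        if mask is not None:
            scores = [s if keep else -1e9 for s, keep in zip(scores, mask[qi])]
        # (3) softmax turns scores into attention weights
        weights.append(softmax(scores))
    # (4) each output row is the attention-weighted sum of the value rows
    d_v = len(value[0])
    output = [[sum(w * v[j] for w, v in zip(row, value)) for j in range(d_v)]
              for row in weights]
    return output, weights

# Hypothetical toy input: one query, two keys and values
q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out, attn = scaled_dot_product_attention(q, k, v)
masked_out, masked_attn = scaled_dot_product_attention(q, k, v, mask=[[1, 0]])
```

With the second key masked out, all attention weight falls on the first key and the output is simply the first value row.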
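
For question 6, note that the placement of layer normalization depends on the variant. The original Transformer paper computes `LayerNorm(x + Sublayer(x))`, that is, normalization after the residual addition (post-LN), while many later implementations normalize the sublayer input instead (pre-LN). Which options count as correct therefore depends on the variant assumed. The sketch below contrasts the two wirings. The function names are hypothetical, and the learnable gain and bias of a real layer norm are omitted for brevity.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize one feature vector to zero mean and unit variance (no gain or bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def post_ln(x, sublayer):
    """Post-LN wiring: LayerNorm(x + Sublayer(x)), as in the original paper."""
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])

def pre_ln(x, sublayer):
    """Pre-LN wiring: x + Sublayer(LayerNorm(x)), common in later models."""
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]

# Hypothetical stand-in for an attention or feed-forward sublayer
def double(v):
    return [2.0 * a for a in v]

x = [1.0, 2.0, 3.0, 4.0]
post = post_ln(x, double)
pre = pre_ln(x, double)
```

In the post-LN wiring the block output is always normalized, whereas in the pre-LN wiring the residual path carries `x` through unnormalized.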
