Two hand-written coding problems from LLM algorithm interviews: recollection and summary.
This post recalls the hand-written coding problems I was given when interviewing for senior LLM algorithm engineer roles at Ant Group and Alibaba, together with working solutions.
1. Ant Group P7 LLM algorithm interview: hand-written code
Problem: write a text classification function that labels a piece of text as one of three classes: positive, negative, or neutral.
Note: the catch in this problem is handling negation adverbs, e.g. "不开心" ("not happy") must be scored as negative even though "开心" ("happy") is a positive word.
Solution:
This solution follows Su Jianlin's (苏神) dictionary-based approach. The framework: build small lexicons (positive words, negative words, negation words, degree adverbs), segment the text, then walk the tokens and adjust a running score, flipping or amplifying it whenever a negation word or degree adverb precedes a sentiment word.
```python
import jieba  # Chinese word segmentation

# Step 1: build small sentiment lexicons by hand
negdict = ["伤心", "难过"]   # negative sentiment words (..., extend as needed)
posdict = ["开心", "高兴"]   # positive sentiment words (...)
nodict = ["不"]              # negation words (...)
plusdict = ["很", "非常"]    # degree adverbs (...)

# Step 2: score the segmented text token by token
def predict(s, negdict, posdict, nodict, plusdict):
    p = 0
    sd = list(jieba.cut(s))
    for i in range(len(sd)):
        if sd[i] in negdict:
            if i > 0 and sd[i-1] in nodict:      # negated negative, e.g. "不难过"
                p = p + 1
            elif i > 0 and sd[i-1] in plusdict:  # intensified negative, e.g. "很难过"
                p = p - 2
            else:
                p = p - 1
        elif sd[i] in posdict:
            if i > 0 and sd[i-1] in nodict:      # negated positive, e.g. "不开心"
                p = p - 1
            elif i > 0 and sd[i-1] in plusdict:  # intensified positive, e.g. "很开心"
                p = p + 2
            elif i > 0 and sd[i-1] in negdict:
                p = p - 1
            elif i < len(sd) - 1 and sd[i+1] in negdict:
                p = p - 1
            else:
                p = p + 1
        elif sd[i] in nodict:                    # bare negation slightly lowers the score
            p = p - 0.5
    return p  # p > 0: positive, p < 0: negative, p == 0: neutral
```
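To make the three-way classification explicit, the raw score can be mapped to a label. The sketch below reuses the same scoring scheme but over pre-tokenized English words (whitespace split instead of jieba, with tiny illustrative lexicons), so it runs standalone; the threshold logic and word lists are my own assumptions, not part of the original answer.

```python
# Illustrative English lexicons (assumptions for the demo)
NEG = {"sad", "upset"}
POS = {"happy", "glad"}
NOT = {"not"}
PLUS = {"very"}

def score(tokens):
    """Same negation/degree-aware scoring as predict(), on a token list."""
    p = 0.0
    for i, w in enumerate(tokens):
        if w in NEG:
            if i > 0 and tokens[i-1] in NOT:
                p += 1            # negated negative reads as mildly positive
            elif i > 0 and tokens[i-1] in PLUS:
                p -= 2            # intensified negative
            else:
                p -= 1
        elif w in POS:
            if i > 0 and tokens[i-1] in NOT:
                p -= 1            # "not happy" flips polarity
            elif i > 0 and tokens[i-1] in PLUS:
                p += 2            # "very happy" amplifies polarity
            else:
                p += 1
        elif w in NOT:
            p -= 0.5              # bare negation slightly lowers the score
    return p

def classify(text):
    """Map the raw score onto the three required classes."""
    p = score(text.lower().split())
    return "positive" if p > 0 else "negative" if p < 0 else "neutral"

print(classify("I am very happy"))  # positive
print(classify("I am not happy"))   # negative
```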
2. Hand-written code from an interview with another Alibaba team
Problem: write a simplified implementation of multi-head attention (MHA).
Note: know this by heart, and actually try writing it out yourself. I thought I knew MHA inside and out, but when it came to writing it on the spot I almost couldn't get it down.
Solution:
```python
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    def __init__(self, embedding_size, d_k, n_heads):
        super().__init__()
        self.n_heads = n_heads
        self.d_model = embedding_size
        self.d_k = d_k
        # each projection maps embedding_size -> n_heads * d_k
        self.W_Q = nn.Linear(embedding_size, n_heads * d_k)
        self.W_K = nn.Linear(embedding_size, n_heads * d_k)
        self.W_V = nn.Linear(embedding_size, n_heads * d_k)
        # output projection back to the model dimension; it must be created
        # here, not in forward(), so its weights are registered and trained
        self.W_O = nn.Linear(n_heads * d_k, embedding_size)

    def forward(self, Q, attn_mask):
        # Q: [batch_size, seq_length, embedding_size] (self-attention: Q = K = V input)
        # attn_mask: [batch_size, seq_length], 1 for real tokens, 0 for padding
        batch_size = Q.size(0)

        # (B, S, D) -proj-> (B, S, H*d_k) -split-> (B, S, H, d_k) -trans-> (B, H, S, d_k)
        q_s = self.W_Q(Q).view(batch_size, -1, self.n_heads, self.d_k).transpose(1, 2)
        k_s = self.W_K(Q).view(batch_size, -1, self.n_heads, self.d_k).transpose(1, 2)
        v_s = self.W_V(Q).view(batch_size, -1, self.n_heads, self.d_k).transpose(1, 2)

        # scaled dot-product attention
        # scores: [batch_size, n_heads, seq_length, seq_length]
        scores = torch.matmul(q_s, k_s.transpose(-1, -2)) / (self.d_k ** 0.5)
        # broadcast the padding mask over heads and query positions: [B, 1, 1, S]
        mask = attn_mask.eq(0).unsqueeze(1).unsqueeze(2)
        scores = scores.masked_fill(mask, float('-inf'))
        attn = torch.softmax(scores, dim=-1)

        # Z: [batch_size, n_heads, seq_length, d_k]
        Z = torch.matmul(attn, v_s)
        # concatenate heads: [batch_size, seq_length, n_heads * d_k]
        Z = Z.transpose(1, 2).contiguous().view(batch_size, -1, self.n_heads * self.d_k)
        # output: [batch_size, seq_length, embedding_size]
        return self.W_O(Z)
```
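When practicing, it helps to sanity-check a hand-rolled version against PyTorch's built-in `nn.MultiheadAttention`. A minimal usage sketch (the sizes here are arbitrary assumptions for the demo):

```python
import torch
import torch.nn as nn

embed_dim, num_heads, batch_size, seq_len = 16, 4, 2, 5

# batch_first=True makes inputs/outputs [batch, seq, embed]
mha = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
x = torch.randn(batch_size, seq_len, embed_dim)

# key_padding_mask: True marks padding positions to be ignored as keys
key_padding_mask = torch.zeros(batch_size, seq_len, dtype=torch.bool)
key_padding_mask[1, 3:] = True  # last two positions of the second sequence are padding

# self-attention: query = key = value = x
out, attn_weights = mha(x, x, x, key_padding_mask=key_padding_mask)
print(out.shape)           # torch.Size([2, 5, 16])
print(attn_weights.shape)  # torch.Size([2, 5, 5]), averaged over heads by default
```

Note the mask convention differs from the hand-written version above: the built-in expects `True` at padding positions, whereas the interview code used `0` for padding.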