- model input_ids
  `input_ids` is a tensor of shape `batch_size * seq_len`.
- obtaining the hidden states
```python
input_ids = torch.tensor([tokenizer.encode(test_text, add_special_tokens=True)])
outputs = model(input_ids, output_hidden_states=True)
hidden_states = outputs.hidden_states
```
Here `hidden_states` is a tuple of length (number of model layers + 1); the first element is the output of the embedding layer.
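To see this tuple structure concretely without downloading weights, the sketch below uses a randomly initialized toy BERT (the config sizes are arbitrary choices for illustration, not from the original notes):

```python
import torch
from transformers import BertConfig, BertModel

# Toy config: sizes are hypothetical, chosen only to keep the model tiny.
config = BertConfig(vocab_size=50, hidden_size=32, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=64)
model = BertModel(config).eval()  # random weights, no pretrained download

input_ids = torch.tensor([[1, 2, 3, 4]])  # batch_size=1, seq_len=4
with torch.no_grad():
    outputs = model(input_ids, output_hidden_states=True)

hidden_states = outputs.hidden_states
assert len(hidden_states) == config.num_hidden_layers + 1    # 2 layers + embedding
assert hidden_states[0].shape == (1, 4, config.hidden_size)  # embedding output
```

Each element of the tuple has shape `batch_size * seq_len * hidden_size`; index `0` is the embedding output and index `-1` is the last encoder layer.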
```python
embedding_output = model.embeddings(input_ids=input_ids)
encoder_outputs = model.encoder(
    embedding_output,
    output_hidden_states=True,
)
```
The first hidden state returned by the encoder is exactly the embedding output (`==` on tensors is elementwise, so `torch.equal` is used for the comparison):

```python
>>> torch.equal(embedding_output, encoder_outputs.hidden_states[0])
True
```
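The equivalence above can be checked end to end with a randomly initialized toy BERT (config sizes are hypothetical; `model.eval()` disables dropout so both paths are deterministic):

```python
import torch
from transformers import BertConfig, BertModel

# Toy config: sizes are hypothetical, chosen only to keep the model tiny.
config = BertConfig(vocab_size=50, hidden_size=32, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=64)
model = BertModel(config)  # random weights, no pretrained download
model.eval()               # dropout off, so both paths give identical tensors

input_ids = torch.tensor([[1, 2, 3, 4]])
with torch.no_grad():
    # Path 1: full forward pass, collecting all hidden states.
    full = model(input_ids, output_hidden_states=True)
    # Path 2: run the embedding layer and the encoder manually.
    embedding_output = model.embeddings(input_ids=input_ids)
    encoder_outputs = model.encoder(embedding_output, output_hidden_states=True)

# hidden_states[0] is the embedding output in both cases.
assert torch.equal(embedding_output, full.hidden_states[0])
assert torch.equal(encoder_outputs.hidden_states[0], embedding_output)
```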