llama-2-7b-chat-hf 参数及size

「学习记录」

模型结构:

LlamaConfig {
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 2048,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 32,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.38.2",
  "use_cache": true,
  "vocab_size": 32000
}

重要的:32层,32个attention heads,词表大小为 32000

各参数及size:

model.embed_tokens.weight torch.Size([32000, 4096])

model.layers.0.self_attn.q_proj.weight torch.Size([4096, 4096])
model.layers.0.self_attn.k_proj.weight torch.Size([4096, 4096])
model.layers.0.self_attn.v_proj.weight torch.Size([4096, 4096])
model.layers.0.self_attn.o_proj.weight torch.Size([4096, 4096])
model.layers.0.mlp.gate_proj.weight torch.Size([11008, 4096])
model.layers.0.mlp.up_proj.weight torch.Size([11008, 4096])
model.layers.0.mlp.down_proj.weight torch.Size([4096, 11008])
model.layers.0.input_layernorm.weight torch.Size([4096])
model.layers.0.post_attention_layernorm.weight torch.Size([4096])
…
model.layers.31.self_attn.q_proj.weight torch.Size([4096, 4096])
model.layers.31.self_attn.k_proj.weight torch.Size([4096, 4096])
model.layers.31.self_attn.v_proj.weight torch.Size([4096, 4096])
model.layers.31.self_attn.o_proj.weight torch.Size([4096, 4096])
model.layers.31.mlp.gate_proj.weight torch.Size([11008, 4096])
model.layers.31.mlp.up_proj.weight torch.Size([11008, 4096])
model.layers.31.mlp.down_proj.weight torch.Size([4096, 11008])
model.layers.31.input_layernorm.weight torch.Size([4096])
model.layers.31.post_attention_layernorm.weight torch.Size([4096])

model.norm.weight torch.Size([4096])
lm_head.weight torch.Size([32000, 4096])

  • 2
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 1
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值