StarCoder 2 Tutorial

潘俭渝Erik

Project repository: https://gitcode.com/gh_mirrors/st/starcoder2

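Project Overview

StarCoder 2 is a family of code generation models available in 3B, 7B, and 15B parameter sizes. The training data covers more than 600 programming languages as well as natural-language text such as Wikipedia, Arxiv, and GitHub issues. The models use Grouped Query Attention, have a context window of 16,384 tokens, and apply sliding window attention with a window of 4,096 tokens. The 3B and 7B models were trained on more than 3 trillion tokens, while the 15B model was trained on more than 4 trillion tokens.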
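To make the relationship between the 16,384-token context window and the 4,096-token sliding window more concrete, here is a minimal pure-Python sketch of a causal sliding-window mask. It is an illustration only, not the model's actual implementation: the function names are hypothetical, and the boundary convention (each position sees itself plus at most `window - 1` earlier positions) is an assumption.

```python
def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Return mask[i][j] == True when query position i may attend to key position j.

    Assumed convention: position i sees itself and at most (window - 1)
    earlier positions, and never sees later positions (causal).
    """
    if seq_len < 0 or window < 1:
        raise ValueError("seq_len must be >= 0 and window must be >= 1")
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


def visible_keys(position: int, window: int) -> int:
    """Number of key positions visible to a query at a 0-based position."""
    return min(position + 1, window)


if __name__ == "__main__":
    # Tiny illustration: 6 tokens with a window of 3 ("x" = may attend).
    for row in sliding_window_causal_mask(6, 3):
        print("".join("x" if allowed else "." for allowed in row))

    # Figures from the text: last token of a full 16,384-token context,
    # with a 4,096-token window.
    print(visible_keys(16_384 - 1, 4_096))
```

Under this convention, a single attention layer lets the last token of a full context look directly at only the most recent 4,096 tokens. In stacked layers, information from earlier tokens can generally still reach later positions indirectly.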

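Quick Start

Install Dependencies

First, make sure the required dependencies are installed. Run the following command from the directory that contains requirements.txt: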

pip install -r requirements.txt

Start Training

Launch the training run, which fine-tunes the model with finetune.py, using the following command:

accelerate launch finetune.py \
  --model_id "bigcode/starcoder2-3b" \
  --dataset_name "bigcode/the-stack-smol" \
  --subset "data/rust" \
  --dataset_text_field "content" \
  --split "train" \
  --max_seq_length 1024 \
  --max_steps 10000 \
  --micro_batch_size 1 \
  --gradient_accumulation_steps 8 \
  --learning_rate 2e-5 \
  --warmup_steps 20 \
  --num_proc "$(nproc)"
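The batch-related flags in the command above interact. The following sketch works through the arithmetic. It rests on several assumptions: that `--max_steps` counts optimizer updates, that every sequence is filled up to `--max_seq_length` (so the token figures are upper bounds), and that a single process is used. The `num_processes` parameter and the function names are hypothetical helpers for illustration and are not part of finetune.py.

```python
def effective_batch_size(micro_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_processes: int = 1) -> int:
    """Sequences contributing to one optimizer update across all processes."""
    return micro_batch_size * gradient_accumulation_steps * num_processes


def max_tokens_per_update(micro_batch_size: int,
                          gradient_accumulation_steps: int,
                          max_seq_length: int,
                          num_processes: int = 1) -> int:
    """Upper bound on tokens per optimizer update (all sequences at full length)."""
    sequences = effective_batch_size(
        micro_batch_size, gradient_accumulation_steps, num_processes
    )
    return sequences * max_seq_length


if __name__ == "__main__":
    # Values taken from the command above; one process is an assumption.
    sequences = effective_batch_size(1, 8)
    tokens = max_tokens_per_update(1, 8, 1024)
    print(f"sequences per update: {sequences}")
    print(f"max tokens per update: {tokens}")
    print(f"max tokens over 10000 updates: {tokens * 10_000}")
```

With one process, this gives 8 sequences and at most 8,192 tokens per update. Launching on several GPUs through accelerate would scale both numbers by the process count, which may call for adjusting `--gradient_accumulation_steps` or `--learning_rate`.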