About Grok
Grok-1 is a 314 billion parameter Mixture-of-Experts model trained from scratch by xAI.
- GitHub: https://github.com/xai-org/grok-1
- Model introduction: https://x.ai/blog/grok-os
- xAI blog: https://x.ai/blog
- HuggingFace download link: https://huggingface.co/hpcai-tech/grok-1
- ModelScope download link: https://www.modelscope.cn/models/colossalai/grok-1-pytorch/summary
Related articles
- Colossal-AI grok-1 usage example: https://github.com/hpcaitech/ColossalAI/tree/main/examples/language/grok-1
- "Musk open-sources Grok as promised, racking up 35k stars in 2 days": https://mp.weixin.qq.com/s/A0f7jkO7B7xAPxhfhDgWzg
- "Musk's Grok large model is finally open-sourced": https://mp.weixin.qq.com/s/p8IdfHvsi-bJ7ZaSxOH4aA
Running
The repository at https://github.com/xai-org/grok-1 contains JAX example code for loading and running the Grok-1 open-weights model.
Download the checkpoint and use it to replace the ckpt-0 folder under checkpoints; see "Downloading the weights" below.
Then run the following to test:
pip install -r requirements.txt
python run.py
The script loads the checkpoint and samples from the model on a test input.
Due to the large size of the model (314B parameters), a machine with sufficient GPU memory is required to test it with the example code.
The MoE layer implementation in this repository is not efficient; it was chosen to avoid the need for custom kernels while validating the model's correctness.
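As a rough, hedged sanity check on the GPU-memory requirement, weight storage alone scales as parameter count times bytes per parameter. The figures below are back-of-the-envelope assumptions only: they ignore activations, the KV cache, and framework overhead.

# Rough, illustrative estimate of Grok-1 weight memory (not an official figure).
params = 314e9                                   # 314B parameters
bytes_per_param = {"int8": 1, "bf16": 2, "fp32": 4}
for dtype, nbytes in bytes_per_param.items():
    gib = params * nbytes / 1024**3
    print(f"{dtype}: ~{gib:,.0f} GiB for weights alone (excludes activations and KV cache)")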
Model Specifications
Grok-1 is currently designed with the following specifications:
- Parameters: 314B
- Architecture: Mixture of 8 Experts (MoE)
- Experts Utilization: 2 experts used per token (see the routing sketch after this list)
- Layers: 64
- Attention Heads: 48 for queries, 8 for keys/values
- Embedding Size: 6,144
- Tokenization: SentencePiece tokenizer with 131,072 tokens
- Additional Features:
  - Rotary embeddings (RoPE)
  - Supports activation sharding and 8-bit quantization
- Maximum Sequence Length (context): 8,192 tokens
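To make the "Mixture of 8 Experts, 2 experts used per token" line concrete, below is a minimal NumPy sketch of top-2 routing for a single token. The random router weights, the tiny expert MLPs, and the softmax-over-selected-experts gating are illustrative assumptions for this demo, not Grok-1's actual routing code or weights.

import numpy as np

# Minimal top-2 MoE routing for a single token (dummy random weights, illustrative only).
num_experts = 8          # Grok-1: Mixture of 8 Experts
d_model = 6144           # Grok-1: embedding size 6,144
d_hidden = 64            # assumption: tiny expert MLP just to keep the demo light
rng = np.random.default_rng(0)

x = rng.standard_normal(d_model)                           # one token's hidden state
router_w = 0.02 * rng.standard_normal((d_model, num_experts))
expert_w1 = 0.02 * rng.standard_normal((num_experts, d_model, d_hidden))
expert_w2 = 0.02 * rng.standard_normal((num_experts, d_hidden, d_model))

logits = x @ router_w                                      # router score for each expert
top2 = np.argsort(logits)[-2:]                             # pick the 2 highest-scoring experts
gates = np.exp(logits[top2]) / np.exp(logits[top2]).sum()  # renormalize over the selected experts

# Only the 2 selected experts are evaluated for this token; the other 6 are skipped.
y = np.zeros(d_model)
for gate, e in zip(gates, top2):
    h = np.maximum(x @ expert_w1[e], 0.0)                  # ReLU stand-in for the expert FFN
    y += gate * (h @ expert_w2[e])

print("selected experts:", top2, "gate weights:", gates)

The point of the spec is that while all 8 experts' parameters exist (contributing to the 314B total), only 2 experts' feed-forward networks run per token, so per-token compute is far lower than in a dense model of the same parameter count.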
Downloading the weights
Option 1: Download via magnet link
You can download the weights using a torrent client and this magnet link:
magnet:?xt=urn:btih:5f96d43576e3d386c9ba65b883210a393b68210e&tr=https%3A%2F%2Facademictorrents.com%2Fannounce.php&tr=udp%3A%2F%2Ftracker.coppersurfer.tk%3A6969&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce
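As one example (assuming the aria2 command-line downloader is installed; any torrent client works), the magnet link can be fetched non-interactively as shown below, after which the downloaded ckpt-0 folder goes under checkpoints as described in the Running section:

aria2c --dir=checkpoints --seed-time=0 "magnet:?xt=urn:btih:5f96d43576e3d386c9ba65b883210a393b68210e&tr=https%3A%2F%2Facademictorrents.com%2Fannounce.php&tr=udp%3A%2F%2Ftracker.coppersurfer.tk%3A6969&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce"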
Option 2: Use the 🤗 HuggingFace Hub directly
https://huggingface.co/xai-org/grok-1
git clone https://github.com/xai-org/grok-1.git && cd grok-1
pip install huggingface_hub[hf_transfer]
huggingface-cli download xai-org/grok-1 --repo-type model --include ckpt-0/* --local-dir checkpoints --local-dir-use-symlinks False
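Alternatively, the same download can be done from Python via huggingface_hub's snapshot_download; the repo id, file pattern, and target directory below simply mirror the CLI call above.

from huggingface_hub import snapshot_download

# Download only the ckpt-0/* weight files into ./checkpoints (mirrors the CLI call above).
snapshot_download(
    repo_id="xai-org/grok-1",
    repo_type="model",
    allow_patterns="ckpt-0/*",
    local_dir="checkpoints",
)

If hf_transfer is installed (as in the pip command above), setting the environment variable HF_HUB_ENABLE_HF_TRANSFER=1 enables the faster download backend.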
2024-03-29 (Fri.)