
/lmg/ Model Links and Torrents

  1. Changelog (MDY)
  2. 4-bit GPU Model Requirements
  3. 4-bit CPU/llama.cpp RAM Requirements
  4. LLaMA 16-bit Weights
  5. LLaMA 4-bit Weights
  6. BluemoonRP 13B (05/07/2023)
  7. Vicuna 13B Cocktail (05/07/2023)
  8. GPT4-x-AlpacaDente2-30B (05/05/2023)
  9. Vicuna 13B Free v1.1 (05/01/2023)
  10. Pygmalion/Metharme 7B (04/30/2023)
  11. GPT4-X-Alpasta 30B (04/29/2023)
  12. OpenAssistant LLaMa 30B SFT 6 (04/23/2023)
  13. SuperCOT (04/22/2023)
  14. Previous Model List

Changelog (MDY)

[05-07-2023] - Added Vicuna 13B Cocktail, bluemoonrp-13b & AlpacaDente2
[05-05-2023] - Added CPU quantization variation links
[05-02-2023] - Initial Rentry

4-bit GPU Model Requirements

VRAM Required takes full context (2048 tokens) into account. You may be able to load the model on a GPU with slightly less VRAM, but you will not be able to run at full context. If you do not have enough RAM to load the model, it will spill into swap. Groupsize models increase VRAM usage, as does running a LoRA alongside the model. A rough way to estimate these figures yourself is sketched after the table.

| Model Parameters | VRAM Required | GPU Examples | RAM to Load |
| --- | --- | --- | --- |
| 7B | 8 GB | GTX 1660, RTX 2060, AMD 5700 XT, RTX 3050, RTX 3060, RTX 3070 | 6 GB |
| 13B | 12 GB | AMD 6900 XT, RTX 2060 12GB, RTX 3060 12GB, RTX 3080 12GB, A2000 | 12 GB |
| 30B | 24 GB | RTX 3090, RTX 4090, A4500, A5000, 6000, Tesla V100 | 32 GB |
| 65B | 42 GB | A100 80GB, NVIDIA Quadro RTX 8000, Quadro RTX A6000 | 64 GB |
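
As a sanity check on the table above, a 4-bit model's footprint is roughly half a byte per parameter for the weights, plus an fp16 KV cache that grows with context. A minimal sketch, using LLaMA-13B's published shape (40 layers, hidden size 5120); treat the output as an estimate, not a guarantee:

```python
# Rough 4-bit VRAM estimate: quantized weights + fp16 KV cache at full context.
# Shape figures are LLaMA-13B's (40 layers, hidden size 5120); other sizes differ.
params = 13e9            # parameter count
n_layers, d_model = 40, 5120
ctx = 2048               # full context length

weights_gb = params * 0.5 / 1e9                        # 4 bits = 0.5 bytes/param
kv_cache_gb = 2 * n_layers * ctx * d_model * 2 / 1e9   # K and V tensors, 2 bytes each
print(f"~{weights_gb + kv_cache_gb:.1f} GB before activations and runtime overhead")
```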

4-bit CPU/llama.cpp RAM Requirements

5-bit to 8-bit quantized models are becoming more common and will naturally require more RAM. I will update these columns when I have the numbers; in the meantime, a rough approximation is sketched after the table.

| Model | 4-bit | 5-bit | 8-bit |
| --- | --- | --- | --- |
| 7B | 3.9 GB | TBD | TBD |
| 13B | 7.8 GB | TBD | TBD |
| 30B | 19.5 GB | TBD | TBD |
| 65B | 38.5 GB | TBD | TBD |
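
The 4-bit column tracks simple arithmetic: bits per weight divided by 8, plus a small allowance for the per-block scales that quantization formats store. Assuming file size is dominated by weight storage, the missing 5-bit and 8-bit columns can be approximated the same way (the effective bits-per-weight values below are assumptions, not measurements):

```python
# Approximate GGML RAM/file size from effective bits per weight.
# Effective bpw includes per-block quantization scales, so it runs slightly
# above the nominal bit width; the values here are rough assumptions.
def approx_gb(params_billions: float, bits_per_weight: float) -> float:
    return params_billions * bits_per_weight / 8  # params in billions -> GB

for nominal, effective in ((4, 4.5), (5, 5.5), (8, 8.5)):
    print(f"7B at {nominal}-bit: ~{approx_gb(7, effective):.1f} GB")
```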

LLaMA 16-bit Weights

The original LLaMA weights converted to Transformers @ 16-bit. A torrent is available as well, but it uses outdated configuration files that will need to be updated. Note that these aren’t for general use, as the VRAM requirements are beyond consumer scope. A loading sketch follows the table for reference.

Filtering Status : None

| Model | Type | Download |
| --- | --- | --- |
| 7B 16-bit | HF Format | HuggingFace |
| 13B 16-bit | HF Format | HuggingFace |
| 30B 16-bit | HF Format | HuggingFace |
| 65B 16-bit | HF Format | HuggingFace |
| All the above | HF Format | [Torrent Magnet](magnet:?xt=urn:btih:8d634925911a03f787d9f68ac075a9b24281573a&dn=Safe-LLaMA-HF-v2 (4-04-23)&tr=http%3a%2f%2fbt2.archive.org%3a6969%2fannounce&tr=http%3a%2f%2fbt1.archive.org%3a6969%2fannounce) |
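
At 16 bits, every parameter costs two bytes, so even 7B wants roughly 14 GB for the weights alone, which is why these checkpoints are mainly a base for conversion and finetuning. A minimal loading sketch with transformers (paths are placeholders, and `device_map="auto"` needs the accelerate package):

```python
# Load HF-format 16-bit LLaMA weights (transformers >= 4.28 has the Llama classes).
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

path = "path/to/llama-7b-hf"  # placeholder: local dir or HF repo id
model = LlamaForCausalLM.from_pretrained(
    path,
    torch_dtype=torch.float16,  # 2 bytes/param: ~14 GB of weights for 7B
    device_map="auto",          # requires accelerate; offloads if VRAM runs out
)
tokenizer = LlamaTokenizer.from_pretrained(path)
```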

LLaMA 4-bit Weights

The original LLaMA weights quantized to 4-bit. The GPU CUDA versions have outdated tokenizer and configuration files. It is recommended to either update them with this or use the universal LLaMA tokenizer; a fix-up sketch follows the table.

Filtering Status : None

| Model | Type | Download |
| --- | --- | --- |
| 7B, 13B, 30B, 65B | CPU | Torrent Magnet |
| 7B, 13B, 30B, 65B | GPU CUDA (no groupsize) | Torrent Magnet |
| 7B, 13B, 30B, 65B | GPU CUDA (128gs) | Torrent Magnet |
| 7B, 13B, 30B, 65B | GPU Triton | Neko Institute of Science HF page |
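
One way to do the tokenizer/config fix-up mentioned above is to load a known-good HF-format tokenizer and save it over the quantized checkpoint's directory. A sketch with placeholder paths (the universal LLaMA tokenizer linked above serves the same purpose):

```python
# Overwrite a 4-bit checkpoint's stale tokenizer files with known-good ones.
from transformers import LlamaTokenizer

good = LlamaTokenizer.from_pretrained("path/to/known-good-llama-hf")  # placeholder
good.save_pretrained("path/to/llama-30b-4bit")  # writes tokenizer.model + configs
```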

BluemoonRP 13B (05/07/2023)

An RP/ERP-focused finetune of LLaMA 13B trained on BluemoonRP logs. It is designed to simulate a 2-person RP session. Two versions are provided: a standard 13B with 2K context and an experimental 13B with 4K context. It has a non-standard format (LEAD/ASSOCIATE), so ensure that you read the model card and use the correct syntax (an example follows the table).

Filtering Status : Very light

| Model | Type | Download |
| --- | --- | --- |
| 13B | GPU & CPU | https://huggingface.co/reeducator/bluemoonrp-13b |
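
A sketch of what a LEAD/ASSOCIATE prompt might look like; the role names come from the description above, but exact spacing and turn separators should be checked against the model card:

```python
# Illustrative two-person RP prompt in the LEAD/ASSOCIATE style (verify on the card).
prompt = (
    "LEAD: The rain hasn't stopped since we left the harbor.\n"
    "ASSOCIATE:"  # the model continues from here as the second participant
)
```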

Vicuna 13B Cocktail (05/07/2023)

Vicuna 1.1 13B finetune incorporating various datasets in addition to the unfiltered ShareGPT. This is an experiment attempting to enhance the creativity of Vicuna 1.1 while reducing censorship as much as possible. All datasets have been cleaned, and only the “instruct” portion of GPTeacher has been used. It has a non-standard format (USER/ASSOCIATE), so ensure that you read the model card and use the correct syntax (an example follows the table).

Filtering Status : Very light

| Model | Type | Download |
| --- | --- | --- |
| 13B | GPU & CPU | https://huggingface.co/reeducator/vicuna-13b-cocktail |
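
As with BluemoonRP, the exact syntax matters. A hedged sketch of the USER/ASSOCIATE turn format (the preamble line is an assumption modeled on Vicuna-style prompts; confirm against the model card):

```python
# Illustrative USER/ASSOCIATE prompt; the preamble wording is assumed, not official.
prompt = (
    "A chat between a user and an associate. "
    "The associate gives helpful, detailed answers.\n"
    "USER: Write a limerick about quantized llamas.\n"
    "ASSOCIATE:"
)
```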

GPT4-x-AlpacaDente2-30B (05/05/2023)

ChanSung’s Alpaca-LoRA-30B-elina merged with Open Assistant’s second finetune. Testing in progress.

Filtering Status : Medium

| Model | Type | Download |
| --- | --- | --- |
| 30B GGML | CPU | Q5 |
| 30B | GPU | [Q4 CUDA](https://huggingface.co/askmyteapot/GPT4-x-AlpacaDente2-30b-4bit) |

Vicuna 13B Free v1.1 (05/01/2023)

A work-in-progress, community-driven attempt to make an unfiltered version of Vicuna. It currently has an early-stopping bug, and a partial workaround has been posted on the repo’s model card.

Filtering Status : Very light

| Model | Type | Download |
| --- | --- | --- |
| 13B | GPU & CPU | https://huggingface.co/reeducator/vicuna-13b-free |

Pygmalion/Metharme 7B (04/30/2023)

Pygmalion 7B is a dialogue model that uses LLaMA-7B as a base. The dataset includes RP/ERP content. Metharme 7B is an experimental instruct-tuned variation, which can be guided using natural language like other instruct models (an example prompt follows below).

PygmalionAI intends to use the same dataset on the higher-parameter LLaMA models. No ETA as of yet.
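
A sketch of a Metharme-style instruct prompt, using the special role tokens from its model card at the time; verify before relying on them:

```python
# Illustrative Metharme prompt; role tokens per its model card, verify before use.
prompt = (
    "<|system|>Enter instruction mode. Write engaging, coherent prose."
    "<|user|>Describe a lighthouse keeper's last night on duty."
    "<|model|>"
)
```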

Filtering Status : None

| Model | Type | Download |
| --- | --- | --- |
| 7B Pygmalion/Metharme | XOR | https://huggingface.co/PygmalionAI/ |
| 7B Pygmalion GGML | CPU | Q4, Q5, Q8 |
| 7B Metharme GGML | CPU | Q4, Q5 |
| 7B Pygmalion | GPU | Q4 Triton, Q4 CUDA 128gs |
| 7B Metharme | GPU | Q4 Triton, Q4 CUDA |
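
The XOR rows here (and in the OpenAssistant entry below) are weight releases XORed against the original LLaMA weights, so they are useless without them. Each repo ships its own conversion script, which is what you should actually run; conceptually, though, the reconstruction is just a byte-wise XOR:

```python
# Conceptual only: finetuned weights = released XOR file ^ original LLaMA file.
# Use the conversion script shipped in the model repo for the real procedure.
def xor_decode(release: bytes, original: bytes) -> bytes:
    assert len(release) == len(original), "files must pair up byte-for-byte"
    return bytes(a ^ b for a, b in zip(release, original))
```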

GPT4-X-Alpasta 30B (04/29/2023)

An attempt at improving Open Assistant’s performance as an instruct model while retaining its excellent prose. The merge consists of ChanSung’s GPT4-Alpaca LoRA and Open Assistant’s native finetune.

It is an extremely coherent model for logic-based instruct outputs. While the prose is generally very good, it does suffer from the “Assistant” personality bleed-through that plagues the OpenAssistant dataset, which can give you dry dialogue for creative writing/chatbot purposes. However, several accounts claim it is nowhere near as bad as OA’s own finetunes, and that the gains in prose and coherence make up for it.

Filtering Status : Medium

| Model | Type | Download |
| --- | --- | --- |
| 30B 4-bit | CPU & GPU CUDA | https://huggingface.co/MetaIX/GPT4-X-Alpasta-30b-4bit |

OpenAssistant LLaMa 30B SFT 6 (04/23/2023)

An open-source alternative to OpenAI’s ChatGPT/GPT-3.5 Turbo. However, it seems to suffer from overfitting and is heavily filtered. Not recommended for creative writing or chatbots, as the “assistant” personality constantly bleeds through, giving you dry dialogue.

Filtering Status : Heavy

| Model | Type | Download |
| --- | --- | --- |
| 30B | XOR | https://huggingface.co/OpenAssistant/oasst-sft-6-llama-30b-xor |
| 30B GGML | CPU | Q4 |
| 30B | GPU | Q4 CUDA, Q4 CUDA 128gs |

SuperCOT (04/22/2023)

SuperCOT is a LoRA trained with the aim of making LLaMA follow prompts for LangChain better, by infusing chain-of-thought datasets, code explanations and instructions, snippets, logical deductions and Alpaca GPT-4 prompts.

Though designed to improve LangChain use, it is quite versatile and works very well for other tasks like creative writing and chatbots. The author also pruned a number of filters from the datasets. As of early May 2023, it is the most recommended model on /lmg/. A loading sketch follows the table.

Filtering Status : Very Light

| Model | Type | Download |
| --- | --- | --- |
| Original LoRA | LoRA | https://huggingface.co/kaiokendev/SuperCOT-LoRA |
| 13B GGML | CPU | Q4, Q8 |
| 30B GGML | CPU | Q4, Q5, Q8 |
| 13B | GPU | Q4 CUDA 128gs |
| 30B | GPU | Q4 CUDA, Q4 CUDA 128gs |
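
Since the original release is a LoRA, it can be applied on top of 16-bit base weights at load time. A minimal sketch using the peft library; the base path is a placeholder, the 16-bit VRAM caveats above apply, and if the repo keeps the adapter in a subfolder you will need to point at it (the quantized merges in the table avoid this step entirely):

```python
# Apply the SuperCOT LoRA to a 16-bit LLaMA base with peft.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

base_path = "path/to/llama-13b-hf"  # placeholder: HF-format base weights
base = LlamaForCausalLM.from_pretrained(
    base_path, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "kaiokendev/SuperCOT-LoRA")  # adapter on top
tokenizer = LlamaTokenizer.from_pretrained(base_path)
```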