Uncensored Mistral v0.2 Dolphin LLM - It Won't Refuse Anything!

lcwmgecom

Last modified 2024-04-09 04:41:54

First published 2024-04-09 04:30:31

Article link: https://blog.csdn.net/klam2020/article/details/137531891

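This article introduces the Dolphin-2.8 model, a fine-tuned version of Mistral-7b-v0.2, and reports its performance on several evaluation sets. The model details cover training parameters, evaluation metrics, and licensing, and stress the need for responsible use.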


Dolphin 2.8 Mistral 7b v0.2 🐬

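Crusoe Cloud - provided an excellent on-demand 10xL40S node
Winston Sou - along with a generous anonymous sponsor, donated a large amount of personally owned compute!
Abacus AI - my employer and partner in many things.

This model is based on Mistral-7b-v0.2, a new base model that MistralAI released on March 23, 2024 but has not yet published on HuggingFace. Thanks to @alpindale for converting and publishing it.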

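The base model has a 32k context window; the full-weight fine-tune used a 16k sequence length.

Training took 3 days on the 10x L40S node provided by Crusoe Cloud.

Dolphin-2.8 has a variety of instruction-following, conversational, and coding skills.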

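Dolphin is uncensored. I have filtered the dataset to remove alignment and bias, which makes the model more compliant. You are advised to implement your own alignment layer before exposing the model as a service. It will be highly compliant with any request, even unethical ones. Please read my blog post about uncensored models: https://erichartford.com/uncensored-models. You are responsible for any content you create using this model. Enjoy responsibly.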
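The advice to put an alignment layer in front of the model can be sketched as a thin wrapper around whatever function calls the model. This is only an illustration under stated assumptions: `BLOCKED_TERMS`, `REFUSAL`, `is_allowed`, and `guarded_generate` are hypothetical names, the keyword check is a placeholder for a real moderation policy, and `generate` stands in for any callable that maps a prompt string to a completion string.

```python
from typing import Callable

# Hypothetical placeholder policy: a real service would use a proper
# moderation model or rule engine instead of a keyword list.
BLOCKED_TERMS = ("forbidden topic a", "forbidden topic b")

REFUSAL = "Sorry, this request is not allowed by the service policy."


def is_allowed(prompt: str) -> bool:
    """Return False if the prompt contains any blocked term (case-insensitive)."""
    lowered = prompt.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)


def guarded_generate(prompt: str, generate: Callable[[str], str]) -> str:
    """Call the model only when the prompt passes the policy check."""
    if not is_allowed(prompt):
        return REFUSAL
    return generate(prompt)


if __name__ == "__main__":
    # Stand-in for a real model call (assumption: any str -> str callable works).
    def echo_model(prompt: str) -> str:
        return f"[model output for: {prompt}]"

    print(guarded_generate("Write a haiku about dolphins", echo_model))
    print(guarded_generate("Tell me about Forbidden Topic A", echo_model))
```

Keeping the check outside the model means the policy can be changed without retraining, which matters for a model that was deliberately trained without refusals.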

Dolphin is licensed under Apache 2.0. I grant permission for any use, including commercial use. Dolphin was trained on data generated by GPT4, among other models.

Evals

{
  "arc_challenge": {
    "acc,none": 0.5921501706484642,
    "acc_stderr,none": 0.014361097288449701,
    "acc_norm,none": 0.6339590443686007,
    "acc_norm_stderr,none": 0.014077223108470139
  },
  "gsm8k": {
    "exact_match,strict-match": 0.4783927217589083,
    "exact_match_stderr,strict-match": 0.013759618667051773,
    "exact_match,flexible-extract": 0.5367702805155421,
    "exact_match_stderr,flexible-extract": 0.013735191956468648
  },
  "hellaswag": {
    "acc,none": 0.6389165504879506,
    "acc_stderr,none": 0.004793330525656218,
    "acc_norm,none": 0.8338976299541924,
    "acc_norm_stderr,none": 0.00371411888431746
  },
  "mmlu": {
    "acc,none": 0.6122347243982339,
    "acc_stderr,none": 0.003893774654142997
  },
  "truthfulqa_mc2": {
    "acc,none": 0.5189872652778472,
    "acc_stderr,none": 0.014901128316426086
  },
  "winogrande": {
    "acc,none": 0.7971586424625099,
    "acc_stderr,none": 0.011301439925936643
  }
}
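To make the standard errors easier to read, the sketch below turns each score into an approximate 95% interval. It assumes a normal approximation (score ± 1.96 × stderr), and the choice of one headline metric per task is my own selection for illustration, not something the source specifies.

```python
# Scores and standard errors copied from the eval JSON above.
# Which metric to show per task is an assumption made for this example.
results = {
    "arc_challenge (acc_norm)": (0.6339590443686007, 0.014077223108470139),
    "gsm8k (strict-match)": (0.4783927217589083, 0.013759618667051773),
    "hellaswag (acc_norm)": (0.8338976299541924, 0.00371411888431746),
    "mmlu (acc)": (0.6122347243982339, 0.003893774654142997),
    "truthfulqa_mc2 (acc)": (0.5189872652778472, 0.014901128316426086),
    "winogrande (acc)": (0.7971586424625099, 0.011301439925936643),
}

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def confidence_interval(score, stderr, z=Z_95):
    """Return (low, high) for score +/- z * stderr."""
    half_width = z * stderr
    return score - half_width, score + half_width


if __name__ == "__main__":
    for name, (score, stderr) in results.items():
        low, high = confidence_interval(score, stderr)
        print(f"{name:28s} {score:.4f}  [{low:.4f}, {high:.4f}]")
```

Read this way, the smaller benchmarks (ARC, GSM8K, TruthfulQA) carry an uncertainty of a few points, so small differences between models on them should not be over-interpreted.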


See axolotl config (axolotl version: 0.4.0)


base_model: alpindale/Mistral-7B-v0.2-hf
model_type: MistralForCausalLM
tokenizer_type: LlamaTokenizer
is_mistral_derived_model: true

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: /workspace/datasets/dolphin201-sharegpt2.jsonl
    type: sharegpt
  - path: /workspace/datasets/dolphin-coder-translate-sharegpt2.jsonl
    type: sharegpt
  - path: /workspace/datasets/dolphin-coder-codegen-sharegpt2.jsonl
    type: sharegpt
  - path: /workspace/datasets/m-a-p_Code-Feedback-sharegpt.jsonl
    type: sharegpt
  - path: /workspace/datasets/m-a-p_CodeFeedback-Filtered-Instruction-sharegpt.jsonl
    type: sharegpt
  - path: /workspace/datasets/not_samantha_norefusals.jsonl
    type: sharegpt
  - path: /workspace/datasets/openhermes2_5-sharegpt.jsonl
    type: sharegpt

chat_template: chatml

dataset_prepared_path: last_run_prepared
val_set_size: 0.001
output_dir: /workspace/dolphin-2.8-mistral-7b

sequence_len: 16384
sample_packing: true
pad_to_sequence_len: true

wandb_project: dolphin
wandb_entity:
wandb_watch:
wandb_run_id:
wandb_log_model:

gradient_accumulation_steps: 8
micro_batch_size: 3
num_epochs: 4
adam_beta2: 0.95
adam_epsilon: 0.00001
max_grad_norm: 1.0
lr_scheduler: cosine
learning_rate: 0.000005
optimizer: adamw_bnb_8bit

train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: false

gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10

eval_steps: 73
eval_table_size:
eval_table_max_new_tokens:
eval_sample_packing: false
saves_per_epoch: 
save_steps: 73
save_total_limit: 2
debug:
deepspeed: deepspeed_configs/zero3_bf16.json
weight_decay: 0.1
fsdp:
fsdp_config:
special_tokens:
  eos_token: "<|im_end|>"
tokens:
  - "<|im_start|>"
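Because the config sets `chat_template: chatml` and registers `<|im_start|>` and `<|im_end|>` as tokens, prompts at inference time should follow the ChatML layout. The helper below is a minimal sketch of that layout; `format_chatml` is a hypothetical name, and the system prompt text is only illustrative.

```python
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = [
        f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n"
        for message in messages
    ]
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    prompt = format_chatml(
        [
            # Illustrative system prompt, not taken from the source.
            {"role": "system", "content": "You are a helpful AI assistant."},
            {"role": "user", "content": "Write a function that reverses a string."},
        ]
    )
    print(prompt)
```

Generation should stop at `<|im_end|>`, which the config above sets as the EOS token.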

workspace/dolphin-2.8-mistral-7b

This model is a fine-tuned version of alpindale/Mistral-7B-v0.2-hf on the None dataset. It achieves the following results on the evaluation set:

Loss: 0.4828

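Model description
More information needed

Intended uses & limitations
More information needed

Training and evaluation data
More information needed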

Training procedure
Training hyperparameters
The following hyperparameters were used during training:

learning_rate: 5e-06
train_batch_size: 3
eval_batch_size: 3
seed: 42
distributed_type: multi-GPU
num_devices: 10
gradient_accumulation_steps: 8
total_train_batch_size: 240
total_eval_batch_size: 30
optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 10
num_epochs: 4
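The total batch sizes listed above follow directly from the per-device values. The short calculation below reproduces them; the tokens-per-step figure is an upper bound I derive from `sequence_len: 16384` in the axolotl config, assuming every packed sequence is padded to full length.

```python
# Values taken from the hyperparameter list and the axolotl config.
micro_batch_size = 3              # train_batch_size per device
eval_batch_size = 3               # per device
num_devices = 10
gradient_accumulation_steps = 8
sequence_len = 16384

# Sequences consumed per optimizer step across all devices.
total_train_batch_size = micro_batch_size * num_devices * gradient_accumulation_steps

# Evaluation does not accumulate gradients, so only devices multiply in.
total_eval_batch_size = eval_batch_size * num_devices

# Upper bound on tokens per optimizer step (assumes full-length sequences).
max_tokens_per_step = total_train_batch_size * sequence_len

print(total_train_batch_size)  # 240
print(total_eval_batch_size)   # 30
print(max_tokens_per_step)     # 3932160
```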

Training results

Training Loss	Epoch	Step	Validation Loss
1.1736	0.0	1	1.0338
0.6106	0.36	73	0.5439
0.5766	0.72	146	0.5171
0.5395	1.06	219	0.5045
0.5218	1.42	292	0.4976
0.5336	1.78	365	0.4915
0.5018	2.13	438	0.4885
0.5113	2.48	511	0.4856
0.5066	2.84	584	0.4838
0.4967	3.19	657	0.4834
0.4956	3.55	730	0.4830
0.5026	3.9	803	0.4828
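One way to read the table is to look at how much the validation loss drops between consecutive checkpoints. The sketch below does that with the step and validation-loss columns copied from the table; `improvements` is a hypothetical helper name.

```python
# (step, validation loss) pairs copied from the table above.
rows = [
    (1, 1.0338), (73, 0.5439), (146, 0.5171), (219, 0.5045),
    (292, 0.4976), (365, 0.4915), (438, 0.4885), (511, 0.4856),
    (584, 0.4838), (657, 0.4834), (730, 0.4830), (803, 0.4828),
]


def improvements(rows):
    """Return (step, drop in validation loss since the previous checkpoint)."""
    return [
        (step, round(prev_loss - loss, 4))
        for (_, prev_loss), (step, loss) in zip(rows, rows[1:])
    ]


if __name__ == "__main__":
    for step, drop in improvements(rows):
        print(f"step {step:4d}: validation loss improved by {drop:.4f}")
```

Most of the gain arrives in the first 73 steps, and the last three checkpoints each improve by less than 0.001, which suggests the run had largely plateaued by the fourth epoch.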