0 Preface
Index of related posts on LLM security research: https://blog.csdn.net/WhiffeYF/article/details/142132328
https://github.com/llm-attacks/llm-attacks
https://github.com/lm-sys/FastChat
Paper translation: Universal and Transferable Adversarial Attacks on Aligned Language Models
Bilibili video: https://www.bilibili.com/video/BV1R553zkE92/
Platform used: AutoDL, https://www.autodl.com/home
Image: PyTorch 2.3.0 / Python 3.12 (ubuntu22.04) / CUDA 12.1
GPU: vGPU-48GB
1 Model download
Downloading the Llama-2-7b-chat-hf model
Download from the ModelScope community: https://www.modelscope.cn/models/shakechen/Llama-2-7b-chat-hf/files
Download with the SDK:
Install before starting:
source /etc/network_turbo
pip install modelscope
Download script:
# Run `source /etc/network_turbo` in the shell first
from modelscope import snapshot_download
# Directory the model will be downloaded into
cache_dir = '/root/autodl-tmp'
# Call snapshot_download to fetch the model
model_dir = snapshot_download('shakechen/Llama-2-7b-chat-hf', cache_dir=cache_dir)
print(f"Model downloaded to: {model_dir}")
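Before moving on, it can help to confirm that the download actually finished. The sketch below is not part of the original steps; `check_model_dir` is a hypothetical helper, and it assumes the checkpoint is in Hugging Face format (a `config.json` plus weight shards ending in `.safetensors` or `.bin`).

```python
from pathlib import Path


def check_model_dir(model_dir: str) -> list[str]:
    """Return a list of problems found in a downloaded model directory."""
    root = Path(model_dir)
    if not root.is_dir():
        return [f"not a directory: {root}"]
    problems = []
    # Assumption: a Hugging Face-format checkpoint ships a config.json.
    if not (root / "config.json").is_file():
        problems.append("missing config.json")
    # Assumption: weights are stored as *.safetensors or *.bin shards.
    has_weights = any(root.glob("*.safetensors")) or any(root.glob("*.bin"))
    if not has_weights:
        problems.append("no *.safetensors or *.bin weight files")
    return problems


if __name__ == "__main__":
    # Path taken from the download step above.
    issues = check_model_dir("/root/autodl-tmp/shakechen/Llama-2-7b-chat-hf")
    print(issues or "model directory looks complete")
```

An empty list means both checks passed; it does not prove that every shard is intact.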
2 Installing llm-attacks
source /etc/network_turbo
git clone https://github.com/llm-attacks/llm-attacks.git
requirements.txt needs to be modified:
https://github.com/llm-attacks/llm-attacks/blob/main/requirements.txt
transformers==4.28.1
ml_collections
fschat==0.2.20
Change it to the following (ml_collections and FastChat are installed separately in later steps):
transformers
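The same edit can be scripted instead of done by hand. This is a minimal sketch, assuming the file contains exactly the three lines shown above; `unpin_requirements` is a hypothetical helper, not something shipped with llm-attacks.

```python
def unpin_requirements(text: str, keep: tuple[str, ...] = ("transformers",)) -> str:
    """Keep only the named packages and drop their version pins."""
    kept = []
    for line in text.splitlines():
        # "transformers==4.28.1" -> "transformers"; unpinned lines pass through unchanged.
        name = line.split("==")[0].strip()
        if name in keep:
            kept.append(name)
    return "\n".join(kept) + "\n"


# The original requirements.txt contents quoted above.
original = "transformers==4.28.1\nml_collections\nfschat==0.2.20\n"
print(unpin_requirements(original), end="")
```

To apply it to the real file, read `requirements.txt`, pass its text through the function, and write the result back.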
Then continue the installation:
cd llm-attacks
pip install -e .
source /etc/network_turbo
pip install ml_collections
pip install pandas
Change the path the model is loaded from:
https://github.com/llm-attacks/llm-attacks/blob/main/experiments/configs/individual_llama2.py
Change the path to: /root/autodl-tmp/shakechen/Llama-2-7b-chat-hf
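For repeated setups, the path change can be applied with a small script. The sketch below assumes the config file assigns list literals to `config.tokenizer_paths` and `config.model_paths`; check the actual file before relying on it. `patch_model_paths` and the placeholder path in `sample` are hypothetical and only illustrate the substitution.

```python
import re

# Local model path from the download step above.
MODEL_PATH = "/root/autodl-tmp/shakechen/Llama-2-7b-chat-hf"

# Assumption: both fields are assigned as list literals on a single line.
_PATTERN = re.compile(r"(config\.(?:tokenizer_paths|model_paths)\s*=\s*\[)[^\]]*(\])")


def patch_model_paths(source: str, new_path: str = MODEL_PATH) -> str:
    """Replace the list contents of tokenizer_paths and model_paths with new_path."""
    return _PATTERN.sub(lambda m: f'{m.group(1)}"{new_path}"{m.group(2)}', source)


# Hypothetical stand-in for the two relevant lines of the config file.
sample = (
    'config.tokenizer_paths=["/PLACEHOLDER/llama-2-7b-chat-hf"]\n'
    'config.model_paths=["/PLACEHOLDER/llama-2-7b-chat-hf"]\n'
)
print(patch_model_paths(sample), end="")
```

Lines that do not match the pattern are left untouched, so running it twice is harmless.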
3 Installing FastChat
source /etc/network_turbo
git clone https://github.com/lm-sys/FastChat.git
cd FastChat
pip install -e ".[model_worker,webui]"
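With both repositories installed, a quick import check can catch a missing package before the experiment is launched. This is a minimal sketch: `missing_packages` is a hypothetical helper, and the import names in `REQUIRED` are assumptions (the fschat distribution is imported as `fastchat`, and the editable llm-attacks install is assumed to expose `llm_attacks`).

```python
import importlib.util

# Assumption: these are the import names of the packages installed above.
REQUIRED = ["transformers", "ml_collections", "pandas", "fastchat", "llm_attacks"]


def missing_packages(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("missing:", missing if missing else "none")
```

`find_spec` only locates the package and does not import it, so the check stays fast even for heavy libraries.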
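4 Experiments
Run the following from the root of the llm-attacks repository (not from the FastChat directory entered in the previous step):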
cd experiments/launch_scripts
bash run_gcg_individual.sh llama2 behaviors
5 Results