Innovation Practicum: Model Pretraining and Fine-Tuning -- Using the GUI for Pretraining, Fine-Tuning, and Prompt Engineering

With the model deployed in the previous post, this time we use the GUI to pretrain and fine-tune it:

You can of course also use the traditional command-line form, for example:

 llamafactory-cli train \
    --stage orpo \
    --do_train True \
    --model_name_or_path "Qwen/Qwen2-7B-Instruct" \
    --finetuning_type lora \
    --template default \
    --flash_attn auto \
    --dataset_dir "LLaMA-Factory/data" \
    --dataset test \
    --cutoff_len 1024 \
    --learning_rate 1e-05 \
    --num_train_epochs 5.0 \
    --max_samples 1 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --optim adamw_torch \
    --report_to none \
    --output_dir "saves/Qwen/lora/train_2024-04-25-07-48-56" \
    --fp16 True \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0 \
    --lora_target q_proj,v_proj \
    --orpo_beta 0.1 \
    --plot_loss True 
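
Two of the flag pairs in this command combine into derived quantities that are useful when reading the training log. The sketch below computes the effective batch size and the LoRA scaling factor from the values above. The single-GPU device count is an assumption (the post does not say how many devices were used), and the helper names are hypothetical.

```python
# Values copied from the command above.
per_device_train_batch_size = 1
gradient_accumulation_steps = 8
lora_rank = 8
lora_alpha = 16

# Assumption: a single GPU; the post does not state the device count.
num_devices = 1


def effective_batch_size(per_device: int, accumulation: int, devices: int) -> int:
    """Number of samples that contribute to each optimizer step."""
    return per_device * accumulation * devices


def lora_scaling(alpha: int, rank: int) -> float:
    """Standard LoRA scaling applied to the low-rank update (alpha / rank)."""
    return alpha / rank


print(effective_batch_size(per_device_train_batch_size, gradient_accumulation_steps, num_devices))
print(lora_scaling(lora_alpha, lora_rank))
```

Under the single-device assumption, each optimizer step accumulates gradients over 8 samples, and the LoRA update is scaled by 2.0.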

The rest of this post focuses on fine-tuning the model with the GUI:

(1) First, set up the dataset:

In the data directory of LLaMA-Factory, open the dataset_info.json file and append the following entry at the end:

  },
  "test": {
    "file_name": "training_data.json",
    "columns": {
      "prompt": "instruction",
      "query": "input",
      "response": "output"
    }
  }
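
Editing a large registry file by hand makes it easy to lose a comma or a brace, so the same entry can also be added programmatically. The following is a minimal sketch, not part of LLaMA-Factory itself: `register_dataset` is a hypothetical helper, and the demo runs against a temporary file named dataset_info.json (the registry file name used in current LLaMA-Factory checkouts) rather than a real checkout.

```python
import json
import tempfile
from pathlib import Path


def register_dataset(info_path: Path, name: str, file_name: str) -> dict:
    """Add (or overwrite) one dataset entry in a dataset registry JSON file."""
    if info_path.exists():
        info = json.loads(info_path.read_text(encoding="utf-8"))
    else:
        info = {}
    info[name] = {
        "file_name": file_name,
        # Map the roles the trainer expects to the keys used in the data file.
        "columns": {
            "prompt": "instruction",
            "query": "input",
            "response": "output",
        },
    }
    info_path.write_text(json.dumps(info, indent=2, ensure_ascii=False), encoding="utf-8")
    return info


# Demo on a throwaway file; point info_path at the real registry file to use it.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "dataset_info.json"
    path.write_text('{"existing": {"file_name": "existing.json"}}', encoding="utf-8")
    updated = register_dataset(path, "test", "training_data.json")
    print(sorted(updated))
```

Loading and re-serializing the whole file keeps existing entries intact and guarantees that the result is still valid JSON.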

Next, import the data file in the matching format. One data sample looks like this:

instruction"Extract the following data from the given medical abstract and output in the specified JSON format: Fixed Data: - Total participants: The total number of participants in the study. - Intervention participants: The number of participants in the intervention group. - Control participants: The number of participants in the control group. - Age: The age range or average age of participants. - Intervention age: The age range or average age of participants in the intervention group. - Control age: The age range or average age of participants in the control group. - Eligibility: The eligibility criteria for participants. - Condition: The medical condition or conditions being studied. - Location: The location(s) where the study was conducted. - Ethnicity: The ethnicity of participants. - Intervention: The type of intervention used. - Control: The type of control used. - Outcome measure: The primary outcome measure(s) of the study. - Conclusion: The conclusion of the study. Variable Data (for each outcome event): - Outcome: The outcome event being described. - IV Bin Abs: The absolute number of intervention group participants with the outcome. - CV Bin Abs: The absolute number of control group participants with the outcome. - IV Bin Percent: The percentage of intervention group participants with the outcome. - CV Bin Percent: The percentage of control group participants with the outcome. - IV Cont Mean: The mean value of the outcome measure for the intervention group. - CV Cont Mean: The mean value of the outcome measure for the control group. - IV Cont Median: The median value of the outcome measure for the intervention group. - CV Cont Median: The median value of the outcome measure for the control group. - IV Cont SD: The standard deviation of the outcome measure for the intervention group. - CV Cont SD: The standard deviation of the outcome measure for the control group. 
Output in the following JSON format: { "fixed_data": { "total-participants": "", "intervention-participants": "", "control-participants": "", "age": [], "intervention-age": "", "control-age": "", "eligibility": "", "condition": [], "location": "", "ethnicity": "", "intervention": "", "control": "", "outcome-measure": "", "conclusion": "" }, "variable_data": [ { "outcome": "", "iv-bin-abs": "", "cv-bin-abs": "", "iv-bin-percent": "", "cv-bin-percent": "", "iv-cont-mean": "", "cv-cont-mean": "", "iv-cont-median": "", "cv-cont-median": "", "iv-cont-sd": "", "cv-cont-sd": "" } ] }"
input"Objective:To determine whether provision of web-based lifestyle advice and coronary heart disease risk information either based on phenotypic characteristics or phenotypic plus genetic characteristics affects changes in objectively measured health behaviours.Methods:A parallel-group, open randomised trial including 956 male and female blood donors with no history of cardiovascular disease (mean [SD] age=56.7 [8.8] years) randomised to four study groups: control group (no information provided); web-based lifestyle advice only (lifestyle group); lifestyle advice plus information on estimated 10-year coronary heart disease risk based on phenotypic characteristics (phenotypic risk estimate) (phenotypic group) and lifestyle advice plus information on estimated 10-year coronary heart disease risk based on phenotypic (phenotypic risk estimate) and genetic characteristics (genetic risk estimate) (genetic group). The primary outcome was change in physical activity from baseline to 12 weeks assessed by wrist-worn accelerometer.Results:928 (97.1%) participants completed the trial. There was no evidence of intervention effects on physical activity (difference in adjusted mean change from baseline): lifestyle group vs control group 0.09 milligravity (mg) (95% CI -1.15 to 1.33); genetic group vs phenotypic group -0.33 mg (95% CI -1.55 to 0.90); phenotypic group and genetic group vs control group -0.52 mg (95% CI -1.59 to 0.55) and vs lifestyle group -0.61 mg (95% CI -1.67 to 0.46). 
There was no evidence of intervention effects on secondary biological, emotional and health-related behavioural outcomes except self-reported fruit and vegetable intake.Conclusions:Provision of risk information, whether based on phenotypic or genotypic characteristics, alongside web-based lifestyle advice did not importantly affect objectively measured levels of physical activity, other health-related behaviours, biological risk factors or emotional well-being.Trial registration number:ISRCTN17721237; Pre-results."
output"{ "fixed_data": { "total-participants": "956", "intervention-participants": "928 (97.1%)", "control-participants": "", "age": "56.7 [8.8] years", "intervention-age": "", "control-age": "", "eligibility": "", "condition": "phenotypic characteristics, phenotypic plus genetic characteristics", "location": "", "ethnicity": "", "intervention": "web-based lifestyle advice only (lifestyle group)", "control": "no information provided", "outcome-measure": "change in physical activity from baseline to 12 weeks assessed by wrist-worn accelerometer", "conclusion": "Provision of risk information, whether based on phenotypic or genotypic characteristics, alongside web-based lifestyle advice did not importantly affect objectively measured levels of physical activity, other health-related behaviours, biological risk factors or emotional well-being" }, "variable_data": [ { "outcome": "lifestyle group vs control group", "iv-bin-abs": "0.09 milligravity (mg)", "cv-bin-abs": "", "iv-bin-percent": "", "cv-bin-percent": "", "iv-cont-mean": "", "cv-cont-mean": "", "iv-cont-median": "", "cv-cont-median": "", "iv-cont-sd": "", "cv-cont-sd": "" }, { "outcome": "genetic group vs phenotypic group", "iv-bin-abs": "", "cv-bin-abs": "-0.33 mg", "iv-bin-percent": "", "cv-bin-percent": "", "iv-cont-mean": "", "cv-cont-mean": "", "iv-cont-median": "", "cv-cont-median": "", "iv-cont-sd": "", "cv-cont-sd": "" }, { "outcome": "phenotypic group and genetic group vs control group", "iv-bin-abs": "", "cv-bin-abs": "-0.52 mg", "iv-bin-percent": "", "cv-bin-percent": "", "iv-cont-mean": "", "cv-cont-mean": "", "iv-cont-median": "", "cv-cont-median": "", "iv-cont-sd": "", "cv-cont-sd": "" }, { "outcome": "vs lifestyle group", "iv-bin-abs": "", "cv-bin-abs": "-0.61 mg", "iv-bin-percent": "", "cv-bin-percent": "", "iv-cont-mean": "", "cv-cont-mean": "", "iv-cont-median": "", "cv-cont-median": "", "iv-cont-sd": "", "cv-cont-sd": "" } ] }"
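
Because the `output` field is itself a JSON document stored as a string, its inner quotes must be escaped when the sample is written to the data file. The sketch below shows one way to build and check such records, assuming the data file is a JSON array of objects with `instruction`, `input`, and `output` keys, as the column mapping suggests. The helper names are hypothetical and the field values are shortened for illustration.

```python
import json

REQUIRED_KEYS = ("instruction", "input", "output")


def make_record(instruction: str, abstract: str, extraction: dict) -> dict:
    """Build one training sample; the target JSON is stored as a string."""
    return {
        "instruction": instruction,
        "input": abstract,
        # json.dumps escapes the inner quotes so the outer file stays valid JSON.
        "output": json.dumps(extraction, ensure_ascii=False),
    }


def validate(records: list) -> None:
    """Raise ValueError if any sample lacks a key or holds a non-string value."""
    for i, rec in enumerate(records):
        for key in REQUIRED_KEYS:
            if not isinstance(rec.get(key), str):
                raise ValueError(f"record {i}: '{key}' must be a string")


# Shortened, illustrative values; the real sample above is much longer.
record = make_record(
    "Extract the following data from the given medical abstract ...",
    "Objective: To determine whether provision of web-based lifestyle advice ...",
    {"fixed_data": {"total-participants": "956"}, "variable_data": []},
)
validate([record])

# This string is what would be written to the data file.
serialized = json.dumps([record], indent=2, ensure_ascii=False)
print(json.loads(serialized)[0]["output"])
```

Running `validate` before training catches missing or non-string fields early, which is cheaper than discovering them after the model has started loading.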

This completes the dataset configuration.

(2) Fine-tune with the GUI

First, launch LLaMA Board, the visual fine-tuning interface (powered by Gradio), with the following command:

llamafactory-cli webui

Then use AutoDL's SSH tunnel tool to reach the web service running on the server from the local machine.
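
For readers without the graphical tunnel tool, plain `ssh` port forwarding gives the same result. The sketch below only assembles the command string. It assumes the web UI listens on Gradio's default port 7860 and that the login user is `root`. The host name and SSH port are placeholders, and `build_tunnel_command` is a hypothetical helper.

```python
import shlex


def build_tunnel_command(host: str, ssh_port: int, user: str = "root",
                         local_port: int = 7860, remote_port: int = 7860) -> str:
    """Return an ssh command that forwards local_port to remote_port on the server."""
    args = [
        "ssh",
        "-N",  # no remote command, forwarding only
        "-L", f"{local_port}:127.0.0.1:{remote_port}",
        "-p", str(ssh_port),
        f"{user}@{host}",
    ]
    return shlex.join(args)


# Placeholder host and port; copy the real ones from the instance's SSH login details.
print(build_tunnel_command("example-host", 22))
```

While the resulting command is running, the web UI should be reachable at http://127.0.0.1:7860 in a local browser, provided the port assumption holds.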

With the tunnel in place, the web page can be opened and used from a local browser.

The detailed tuning work is covered in the next post.
