Multi-Machine, Multi-GPU Training of SD-Series Models (Part 3)
Preface
The difference between DreamBooth and direct fine-tuning (as 秋叶 puts it): DreamBooth adds a regularization (class) image set on top of ordinary fine-tuning so the model keeps its prior.
Training LoRA with DreamBooth
It works the same as training an ordinary LoRA, except that a regularization dataset is added.
Reading through the kohya-trainer configs
dataset_config
config = {
    "general": {
        "enable_bucket": True,
        "caption_extension": caption_extension,
        "shuffle_caption": True,
        "keep_tokens": keep_tokens,
        "bucket_reso_steps": 64,
        "bucket_no_upscale": False,
    },
    "datasets": [
        {
            "resolution": resolution,
            "min_bucket_reso": 320 if resolution > 640 else 256,  # minimum bucket resolution
            "max_bucket_reso": 1280 if resolution > 640 else 1024,  # maximum bucket resolution
            "caption_dropout_rate": 0,  # caption dropout rate
            "caption_tag_dropout_rate": 0,
            "caption_dropout_every_n_epochs": 0,
            "flip_aug": flip_aug,  # horizontal flip augmentation
            "color_aug": False,  # color augmentation
            "face_crop_aug_range": None,  # face crop augmentation range
            "subsets": subsets,
        }
    ],
}
enable_bucket
Enables bucketing, which is one of the more convenient features of this script.
bucket_reso_steps
The step size between bucket resolutions, in pixels.
The newer training scripts no longer require you to crop your data. Batch cropping usually degrades image quality and hurts the training result, while manual cropping is tedious; with bucketing enabled, images of roughly the same size are grouped into the same bucket and fed to training together. The bucket assignments are printed when training starts (with four-GPU training the report appears four times, once per process). This makes training behave better: LoRA training goes through the UNet, and bucketing helps that first step work with cleaner inputs. For the underlying details, see 青龙's Bilibili video on how LoRA works under the hood.
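To make the idea concrete, here is a toy sketch of aspect-ratio bucketing (my own illustration, not kohya's actual implementation; all names are made up):

# Toy illustration of aspect-ratio bucketing: bucket side lengths move in
# steps of bucket_reso_steps, the total area stays near resolution**2, and
# each image joins the bucket whose aspect ratio is closest to its own.
def make_buckets(resolution=512, step=64, min_reso=256, max_reso=1024):
    buckets = set()
    width = min_reso
    while width <= max_reso:
        # largest height (a multiple of `step`) keeping area <= resolution^2
        height = min(max_reso, (resolution * resolution // width) // step * step)
        if height >= min_reso:
            buckets.add((width, height))
            buckets.add((height, width))  # portrait counterpart
        width += step
    return sorted(buckets)

def assign_bucket(img_w, img_h, buckets):
    ratio = img_w / img_h
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ratio))

buckets = make_buckets()
print(assign_bucket(1920, 1080, buckets))  # a landscape image lands in a wide bucket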
One possible pitfall: when preprocessing and tagging images in the WebUI, the first option, keep original size, should be enabled. If it isn't, images are cropped to the width and height set above, the results look strange, and bucketing just dumps everything into a single bucket.
flip_aug
Enables horizontal flipping, which artificially doubles your data. For subjects with clearly asymmetric left/right features, it's best left off, otherwise things get muddled: if the character has a mole on the left cheek, generated images will place it sometimes on the left and sometimes on the right, and the same goes for bangs and differently sized eyes.
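In effect, flip_aug mirrors each image at random, roughly like this PIL sketch (illustrative only, not the trainer's code):

import random
from PIL import Image

def maybe_flip(img: Image.Image, p: float = 0.5) -> Image.Image:
    # Roughly what flip_aug does per sample: a mole on the left cheek
    # will show up on the right cheek in the flipped copies.
    return img.transpose(Image.FLIP_LEFT_RIGHT) if random.random() < p else img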
config = {
    "model_arguments": {
        "v2": v2,
        "v_parameterization": v_parameterization if v2 and v_parameterization else False,
        "pretrained_model_name_or_path": pretrained_model_name_or_path,
        "vae": vae,
    },
    "additional_network_arguments": {
        "no_metadata": False,
        "unet_lr": float(unet_lr) if train_unet else None,
        "text_encoder_lr": float(text_encoder_lr) if train_text_encoder else None,
        "network_weights": network_weight,
        "network_module": network_module,
        "network_dim": network_dim,
        "network_alpha": network_alpha,
        "network_args": network_args,
        "network_train_unet_only": True if train_unet and not train_text_encoder else False,
        "network_train_text_encoder_only": True if train_text_encoder and not train_unet else False,
        "training_comment": None,
    },
    "optimizer_arguments": {
        "min_snr_gamma": min_snr_gamma if min_snr_gamma != -1 else None,
        "optimizer_type": optimizer_type,
        "learning_rate": unet_lr,
        "max_grad_norm": 1.0,
        "optimizer_args": eval(optimizer_args) if optimizer_args else None,
        "lr_scheduler": lr_scheduler,
        "lr_warmup_steps": lr_warmup_steps,
        "lr_scheduler_num_cycles": lr_scheduler_num_cycles if lr_scheduler == "cosine_with_restarts" else None,
        "lr_scheduler_power": lr_scheduler_power if lr_scheduler == "polynomial" else None,
    },
    "dataset_arguments": {
        "cache_latents": True,
        "debug_dataset": False,
        "vae_batch_size": vae_batch_size,
    },
    "training_arguments": {
        "output_dir": output_dir,
        "output_name": project_name,
        "save_precision": save_precision,
        "save_every_n_epochs": save_n_epochs_type_value if save_n_epochs_type == "save_every_n_epochs" else None,
        "save_n_epoch_ratio": save_n_epochs_type_value if save_n_epochs_type == "save_n_epoch_ratio" else None,
        "save_last_n_epochs": None,
        "save_state": None,
        "save_last_n_epochs_state": None,
        "resume": None,
        "train_batch_size": train_batch_size,
        "max_token_length": 225,
        "mem_eff_attn": False,
        "xformers": True,
        "max_train_epochs": num_epochs,
        "max_data_loader_n_workers": 8,
        "persistent_data_loader_workers": True,
        "seed": seed if seed > 0 else None,
        "gradient_checkpointing": gradient_checkpointing,
        "gradient_accumulation_steps": gradient_accumulation_steps,
        "mixed_precision": mixed_precision,
        "clip_skip": clip_skip if not v2 else None,
        "logging_dir": logging_dir,
        "log_prefix": project_name,
        "noise_offset": noise_offset if noise_offset > 0 else None,
        "lowram": lowram,
    },
    "sample_prompt_arguments": {
        "sample_every_n_steps": None,
        "sample_every_n_epochs": 1 if enable_sample_prompt else 999999,
        "sample_sampler": sampler,
    },
    "dreambooth_arguments": {
        "prior_loss_weight": 1.0,
    },
    "saving_arguments": {
        "save_model_as": save_model_as,
    },
}
The parameters are largely the same as before. The main difference is prior_loss_weight under dreambooth_arguments: the weight applied to the loss computed on the regularization images. The lower it is, the less the regularization set influences training; values around 0.8-1.5 are typical.
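Conceptually, prior_loss_weight scales how much the regularization samples contribute to the loss relative to the training samples. A simplified sketch (my own illustration, not the trainer's actual code):

import torch
import torch.nn.functional as F

def dreambooth_loss(noise_pred, noise_target, is_reg, prior_loss_weight=1.0):
    # Per-sample MSE between predicted and target noise, reduced to shape (B,).
    per_sample = F.mse_loss(noise_pred, noise_target, reduction="none").mean(dim=(1, 2, 3))
    # is_reg is a bool tensor of shape (B,) marking regularization samples;
    # their loss is scaled by prior_loss_weight before averaging.
    weights = torch.where(is_reg,
                          torch.full_like(per_sample, prior_loss_weight),
                          torch.ones_like(per_sample))
    return (per_sample * weights).mean()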
sample_prompt_arguments is for validating the model: during training, the model periodically generates images from a preset prompt so you can eyeball its progress. In practice the samples are rather hit-or-miss, so it's fine to leave this disabled.
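If you do enable it, the prompts come from a plain-text file (the sample_prompts path that appears below), one prompt per line; sd-scripts also accepts per-line flags such as --n (negative prompt), --w/--h (width/height), --s (steps), --l (CFG scale) and --d (seed). A placeholder example of the file's contents:

masterpiece, best quality, 1girl, solo --n lowres, bad anatomy --w 512 --h 768 --s 28 --l 7.0 --d 42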
accelerate_conf = {
    "config_file": accelerate_config,
    "num_cpu_threads_per_process": 1,
}

train_conf = {
    "sample_prompts": sample_prompt,
    "dataset_config": dataset_config,
    "config_file": config_file,
}
accelerate_config is the configuration for the accelerate library. You can prepare it in advance in a fixed format, or run accelerate config to set it up interactively (appending --config_file lets you choose where the generated file is written). The entries in train_conf are the paths to the config files described above. All of these config files can be written uniformly in TOML, for example (empty strings are blanks to fill in yourself, and the numeric values below are only placeholder examples):
[model_arguments]
pretrained_model_name_or_path = ""

[additional_network_arguments]
network_module = "networks.lora"
network_dim = 128
network_alpha = 64

[optimizer_arguments]
optimizer_type = ""
unet_lr = 1e-4
text_encoder_lr = 5e-5
lr_warmup_steps = 0
max_grad_norm = 1.0
lr_scheduler_num_cycles = 1
lr_scheduler = ""

[dataset_arguments]
cache_latents = true

[training_arguments]
output_dir = ""
output_name = ""
save_precision = "fp16"
save_every_n_epochs = 1
train_batch_size = 6
max_token_length = 225
xformers = true
max_train_epochs = 10
max_data_loader_n_workers = 8
persistent_data_loader_workers = true
clip_skip = 2
logging_dir = ""
lowram = false
noise_offset = 0.1

[dreambooth_arguments]
prior_loss_weight = 1.0

[saving_arguments]
save_model_as = "safetensors"
and the dataset config (the second subset here is the regularization set):
[general]
enable_bucket = true
shuffle_caption = false
keep_tokens = 1

[[datasets]]
flip_aug = false
batch_size = 6
resolution = 512

# training set
[[datasets.subsets]]
image_dir = ""
caption_extension = ".txt"
num_repeats = 45
class_tokens = ""

# regularization set (num_repeats is a placeholder; is_reg marks it as the reg set)
[[datasets.subsets]]
is_reg = true
image_dir = ""
caption_extension = ""
num_repeats = 1
class_tokens = ""
The accelerate_config itself will be covered later.
Launching training
You can put the launch command in a train.sh script and simply run that. You can launch with python your_train.py, or with accelerate; example:
accelerate launch --config_file="./config/accelerate_config.yaml" train_network.py \
    --dataset_config="./config/dataset_config.toml" \
    --config_file="./config/config_file.toml" \
    --sample_prompts="./config/sample_prompt.txt"
Then run ./train.sh and training begins.
Summary
This post mainly covered how to use the kohya-trainer training scripts. The individual parameters take gradual experimentation, and different machines need different accelerate configurations to run.