Datawhale X ModelScope AI Summer Camp - AIGC Text-to-Image Track: Task 02 Notes

Article link: https://blog.csdn.net/yanghb17/article/details/141188576

Series index

Datawhale AI Summer Camp (4th session) ModelScope - AIGC Text-to-Image Track: Task 01 Notes


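I. Study Helpers

As the saying goes, to do a good job, one must first sharpen one's tools. Before starting, I recommend two study tools: Tongyi Qianwen (通义千问) and Doubao (豆包). Both are AI assistants that can look up information, understand language, and draft text, and they can help fellow learners study more efficiently.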


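II. AI Image Generation from Scratch

1. Studying the baseline with an assistant's help

First, read through the code and mark off its distinct parts. Next, use the assistant (Tongyi Qianwen) to work through any code you do not understand. Finally, tune the parameters one at a time, holding everything else fixed.

For example, start with the simplest case, cfg_scale=1: what happens as it is gradually increased from 1? Then num_inference_steps=1100: what changes as it goes from 1 to 1100? And what role does the random seed, torch.manual_seed(0), play?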

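import torch

# `pipe` is the image pipeline already built earlier in the baseline code.
torch.manual_seed(0)  # fix the RNG seed so repeated runs are reproducible
image = pipe(
    # Alternative prompt: "anime style, a little girl with short purple hair, sitting on the sofa at home, chin resting on both hands, bored, full body, pink dress"
    # prompt="二次元，一个紫色短发小女孩，在家中沙发上坐着，双手托着腮，很无聊，全身，粉色连衣裙",
    # Prompt: "anime style, a little boy with short blond hair, sitting cross-legged on the sofa, a spoon in his right hand, half a watermelon on his lap, watching TV"
    prompt="二次元，一个金色短发小男孩，在沙发上双腿盘坐，右手上拿着勺子，腿上放有半个西瓜，看电视",
    # Negative prompt: "ugly, deformed, noisy, blurry, low contrast"
    negative_prompt="丑陋、变形、嘈杂、模糊、低对比度",
    cfg_scale=1,  # guidance scale, one of the values being varied
    num_inference_steps=1100, height=640, width=640,
)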
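
One way to organize the one-parameter-at-a-time tuning described above is to generate the list of runs up front, so that each run differs from a fixed baseline in exactly one setting. The sketch below is only an illustration: the helper name `one_at_a_time`, the baseline values, and the sweep ranges are assumptions made here, not part of the baseline code.

```python
from typing import Any, Dict, Iterator, List, Tuple


def one_at_a_time(
    base: Dict[str, Any], sweeps: Dict[str, List[Any]]
) -> Iterator[Tuple[str, Any, Dict[str, Any]]]:
    """Yield (name, value, kwargs) where kwargs is `base` with one parameter replaced."""
    for name, values in sweeps.items():
        for value in values:
            kwargs = dict(base)  # copy so the baseline dict is never mutated
            kwargs[name] = value
            yield name, value, kwargs


# Hypothetical baseline settings and sweep ranges, chosen only for illustration.
base = {"cfg_scale": 4, "num_inference_steps": 50, "height": 640, "width": 640}
sweeps = {"cfg_scale": [1, 4, 8], "num_inference_steps": [1, 20, 50]}

runs = list(one_at_a_time(base, sweeps))
for name, value, kwargs in runs:
    print(f"{name}={value}: {kwargs}")
    # With the baseline pipeline available, each run could look like this
    # (assumed usage, mirroring the call shown in this post):
    #   torch.manual_seed(0)  # same seed for every run
    #   image = pipe(prompt=..., negative_prompt=..., **kwargs)
```

Resetting the seed before every run matters here: if the seed changed between runs, differences between images could come from the random starting point instead of the parameter being tested.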

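A slightly harder exercise: work out how each parameter affects the training result. (I am still tuning parameters myself, haha; it feels like with free resources you can only ...)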

options:
  -h, --help            show this help message and exit
  --pretrained_unet_path PRETRAINED_UNET_PATH
                        Path to pretrained model (UNet). For example, `models/
                        kolors/Kolors/unet/diffusion_pytorch_model.safetensors
                        `.
  --pretrained_text_encoder_path PRETRAINED_TEXT_ENCODER_PATH
                        Path to pretrained model (Text Encoder). For example,
                        `models/kolors/Kolors/text_encoder`.
  --pretrained_fp16_vae_path PRETRAINED_FP16_VAE_PATH
                        Path to pretrained model (VAE). For example,
                        `models/kolors/Kolors/sdxl-vae-
                        fp16-fix/diffusion_pytorch_model.safetensors`.
  --lora_target_modules LORA_TARGET_MODULES
                        Layers with LoRA modules.
  --dataset_path DATASET_PATH
                        The path of the Dataset.
  --output_path OUTPUT_PATH
                        Path to save the model.
  --steps_per_epoch STEPS_PER_EPOCH
                        Number of steps per epoch.
  --height HEIGHT       Image height.
  --width WIDTH         Image width.
  --center_crop         Whether to center crop the input images to the
                        resolution. If not set, the images will be randomly
                        cropped. The images will be resized to the resolution
                        first before cropping.
  --random_flip         Whether to randomly flip images horizontally
  --batch_size BATCH_SIZE
                        Batch size (per device) for the training dataloader.
  --dataloader_num_workers DATALOADER_NUM_WORKERS
                        Number of subprocesses to use for data loading. 0
                        means that the data will be loaded in the main
                        process.
  --precision {32,16,16-mixed}
                        Training precision
  --learning_rate LEARNING_RATE
                        Learning rate.
  --lora_rank LORA_RANK
                        The dimension of the LoRA update matrices.
  --lora_alpha LORA_ALPHA
                        The weight of the LoRA update matrices.
  --use_gradient_checkpointing
                        Whether to use gradient checkpointing.
  --accumulate_grad_batches ACCUMULATE_GRAD_BATCHES
                        The number of batches in gradient accumulation.
  --training_strategy {auto,deepspeed_stage_1,deepspeed_stage_2,deepspeed_stage_3}
                        Training strategy
  --max_epochs MAX_EPOCHS
                        Number of epochs.
  --modelscope_model_id MODELSCOPE_MODEL_ID
                        Model ID on ModelScope (https://www.modelscope.cn/).
                        The model will be uploaded to ModelScope automatically
                        if you provide a Model ID.
  --modelscope_access_token MODELSCOPE_ACCESS_TOKEN
                        Access key on ModelScope (https://www.modelscope.cn/).
                        Required if you want to upload the model to
                        ModelScope.
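
To get a feel for how a few of these flags relate to each other, here is a minimal sketch using Python's `argparse`. It mirrors only some flag names from the help text above; the defaults and the sample values are placeholders chosen for illustration, not the training script's real defaults. The `alpha / rank` scaling is the common LoRA convention and is assumed here, since the help text only says that `--lora_alpha` is "the weight of the LoRA update matrices".

```python
import argparse

# Cut-down parser mirroring a few flag names from the help text.
# Defaults are illustrative placeholders, not the real script's defaults.
parser = argparse.ArgumentParser(description="Sketch of a few LoRA training flags")
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--accumulate_grad_batches", type=int, default=1)
parser.add_argument("--lora_rank", type=int, default=4)
parser.add_argument("--lora_alpha", type=float, default=4.0)
parser.add_argument("--precision", choices=["32", "16", "16-mixed"], default="16-mixed")
parser.add_argument("--use_gradient_checkpointing", action="store_true")

# Parse an example command line (passed as a list so the sketch runs anywhere).
args = parser.parse_args(
    ["--batch_size", "2", "--accumulate_grad_batches", "4",
     "--lora_rank", "16", "--lora_alpha", "4.0", "--use_gradient_checkpointing"]
)

# Samples that contribute to one optimizer update on a single device.
effective_batch = args.batch_size * args.accumulate_grad_batches

# Assumed convention: the LoRA update is scaled by alpha / rank.
lora_scale = args.lora_alpha / args.lora_rank

print(f"effective batch per device: {effective_batch}")
print(f"LoRA scale (alpha / rank): {lora_scale}")
```

Under that assumption, raising `--lora_rank` while leaving `--lora_alpha` unchanged shrinks the scale of the LoRA update, so the two flags are worth varying together when comparing runs.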