https://github.com/huggingface/diffusers/blob/main/examples/research_projects/diffusion_dpo/README.md

1. Introduction

LLM training has two stages: (1) pretraining and (2) alignment, where the model is fine-tuned to match human preferences