Kimi模型辅助SD创作，ComfyUI集成LLM获取创意灵感和提示语优化

最新推荐文章于 2025-03-03 14:55:32 发布

AI画师安琪

最新推荐文章于 2025-03-03 14:55:32 发布

阅读量1.3k

点赞数 17

文章标签： midjourney 人工智能 AI作画 stable diffusion AIGC 人工智能作画 ai

本文链接：https://blog.csdn.net/2401_85725028/article/details/140607291

版权

大家好我是安琪！！！

在数字图像生成领域，Kimi模型以其强大的图像生成能力而备受关注。而ComfyUI作为一款优秀的UI设计工具，通过集成LLM（Large Language Model），为用户提供了获取创意灵感和提示语优化的功能。现在，让我们一起探索Kimi模型与ComfyUI的结合，开启创意之旅。

今天介绍一款实用的LLM集成的ComfyUI插件：ComfyUI-Tara-LLM-Integration，这是一个集成了LLM大型语言模型的ComfyUI插件工具，使得我们能够在ComfyUI中集成LLM大语言模型的优势，完成更多创意创作，帮助我们获取创作灵感。在插件中已经集成了众多知名的LLM模型，如：OpenAI 的 GPT ， Groq托管的开源模型，Ollama本地部署LLM等。

本文将利用ComfyUI-Tara-LLM-Integration插件集成Kimi API辅助AIGC绘图创作提示词优化，获取创作灵感。在插件原生中是没有Kimi API的选项的，但由于Ollama和Kimi都是使用的标准OpenAI协议，因此可以用Ollama配置来实现Kimi API的集成。

本文作为LLM与SD绘图模型结合启发示例，不一定需要集成在ComfyUI中，使用在线LLM服务也会得到同样的效果。核心目的为使用LLM大语言模型能力，实现SD提示语优化和创意灵感。

Tara-LLM插件安装

首先请保持更新本地ComfyUI到最新版本，然后加载文末工作流文件，并通过ComfyUI插件管理器安装缺失节点，其次需要重启ComfyUI，即可开始体验。涉及ComfyUI插件下载如下所示：

该插件无需模型下载和安装，仅需将Kimi申请的API key和地址设置到插件配置即可体验。
在这里插入图片描述

• Kimi API申请地址

• ComfyUI-Tara-LLM-Integration地址

工作流界面

Kimi优化提示语

插件默认增加了SDXL绘图提示语，当然也可以自行修改定制Prompt获取更好的质量。如：增加常用提示词、风格引导、中文翻译英文提示词等。同样也可以发挥更多创意，实现非SD绘图提示外的功能。

SDXL绘图提示语

DD
You are a SDXL PromptGenerator

A good prompt for the SDXL stable diffusion model is a meticulously crafted narrative that starts with specifying the image type to set the artistic tone.Itthen lays out a clear and engaging scene, methodically adding layers of detail about the main subject, the surrounding environment,and their spatial relationships.It should use precise, descriptive language to define the mood, style,and execution, incorporating technical aspects like weight syntax (e.g.,“(smiling:1.1)”) to prioritize certain features.This approach ensures a rich, coherent,and visually compelling image that aligns with the intended vision.It should be only descriptive, avoid any guidance, avoid imperative sentences and remove any such guidance from prompts beforehand.

Conversely, a bad prompt lacks clarity and coherence, providing insufficient context or detail.It might jump into specifics without setting the scene, mix incompatible styles or themes,or neglect to specify spatial relationships and relative importance of elements, leading to a fragmented and confusing image.It may also misuse technical tools like weight syntax, resulting in an imbalanced and ineffective visual representation.

The words to the beginning has a higher weight the words in the end.Stable diffusion models works best when we give them a clear and concise prompt.Only relevant content should be to the beginning ( AVOID IGNORE tags like "Create a xyz"or"Scene:“or"image_type:” etc)

Furthermore, we can use various artist, studio, movie names etc to get a likeness to that style.For example,“painting by Van Gogh"or"scene from the movie Up”.
Some keywords that can be useful are:4k, ultra hd, masterpiece, cinematic, painting, drawing, sketch, scene, movie, artist, studio, style, trending on ArtStation,DeviantArt,Behance etc.(Use extreme caution whenusing these, also it’s better to use them in the end unless mentioned elsewhere)

The prompt is specific and detailed, guiding the AI to produce a targeted and expressive image.

Use parenthesis to highlight specific words or spaces and use a colon followed by an weight to the end of a word or group to increase or decrease weight, for example (red panda:1.2) banana:1.5 or (yellow rose:0.9)

The negative prompt should be comma-seperated keywords, and not a sentence. And it should generally describe things we absolutely don’t want in an image,if a image is about cartoon, we may put photorealistic in the negative.Butif it’s about a person, we should not put animal in negative (unless specified) because there are some similarities, in general it should be more qualitative than tangible such as JPEG artifacts, blurry, watermark etc.

If a specific aesthetic, style, studio etc is mentioned, it should be accentuated as much as possible, we should increase its weight (surrealistic:1.4) or (watercolor:2.0) etc., and also add other keywords, artists, references to increase its weight.

Follow the prompt as closely as possible without violating the guidelines. Generate both an amazing positive and negative prompt

Tara-LLM插件体验

在本文中涉及的基础模型和模型配置如下：

• 绘图模型：iNiverse Mix XL

• 采样方法：DPM++ 2M Karras

• 迭代步数：30步

• 分辨率：1024* 1024

• 提示词引导系数 (CFG Scale)：7

本文涉及模型下载地址：

• iNiverse Mix XL（哩布）
• iNiverse Mix XL（C站）

通用负向提示语：

NSFW,(nsfw:1.3),(watermark:1.3),paintings,sketches,(worst quality:2),(low quality:2),(normal quality:2),lowres,normal quality,(monochrome:2),(grayscale:2),(skin spots:2),skin blemishes,age spot,backlight,(ugly:1.2),(duplicate:1.2),(morbid:1.2),(mutilated:1.2),mutated hands,(poorly drawn hands:1.3),blurry,(bad anatomy:1.2),(bad proportions:1.3),extra limbs,(disfigured:1.3),(more than 2 nipples:1.2),(missing arms:1.2),(extra legs:1.3),(fused fingers:1.5),(too many fingers:1.5),(unclear eyes:1.3),bad hands,missing fingers,extra digit,bad body,(Multiple people:1.5),

在本文中，为了演示Kimi LLM提示语优化，选择了使用最简提示语演示。在实际中，应尽量更详尽提示语让LLM更了解你的意图。

01：人物提示优化

原提示语

1girl,solo  
  
1个女孩,独自

输出效果

LLM优化提示语

增加了详尽的描述和梵高风格画背景。

A detailed and colorful portrait of a young woman with long flowing hair and deep expressive eyes, captured in a serene and contemplative pose, shot with a high-resolution lens in a soft natural light, set in an airy, minimalist room with clean lines and modern furniture. (solitude:1.3), (masterpiece:1.5), (4k ultra hd:1.2), (trending on ArtStation:1.1), (Van Gogh inspired:1.1), (surrealistic:1.4),  
  
这是一幅细致而丰富多彩的年轻女子的肖像，她有着飘逸的长发和深邃的眼神，在柔和的自然光下，用高分辨率镜头拍摄了一个宁静而沉思的姿势，背景是一个通风、极简主义的房间，线条简洁，家具现代。(孤独:1.3)，(杰作:1.5)，(4k超高清:1.2)，(ArtStation趋势:1.1)，(梵高灵感:1.1)，(超现实主义:1.4)，

输出效果

02：中国人物提示优化

原提示语

1girl,solo, chinese girl

1个女孩,独自,中国女孩

输出效果

LLM优化提示语

增加了表情描述和汉服提示。

A captivating and vibrant 4k ultra HD digital painting of a young Chinese girl, standing in a serene and traditional Chinese garden, wearing a flowing silk cheongsam in shades of jade green with intricate gold embroidery, her long, flowing black hair cascading down, adorned with delicate orchids. She gazes thoughtfully into the distance with large, expressive eyes and full lips set in a soft, knowing smile, exuding a tranquil and graceful aura. The scene is illuminated by the gentle glow of the setting sun, casting warm tones across the garden and highlighting the girls silhouette. The artist captures the timeless beauty and elegance of Chinese culture, blending traditional elements with a modern aesthetic to create a visually stunning masterpiece. Inspired by the works of Zeng Xiaojun, the art is executed with delicate brushstrokes and a keen eye for detail, evoking a sense of harmony and balance.,

这是一幅迷人的、充满活力的4k超高清数字画，画中的是一个年轻的中国女孩，站在一个宁静而传统的中国花园里，穿着一件飘逸的丝绸旗袍，旗袍上绣着复杂的金色刺绣，她长长的、飘逸的黑发垂下来，装饰着精致的兰花。她若有所思地凝视着远方，大大的、富有表情的眼睛，丰满的嘴唇上挂着柔和而会心的微笑，散发出一种宁静而优雅的气质。夕阳的柔和光芒照亮了整个场景，在花园中投射出温暖的色调，突出了女孩的轮廓。这位艺术家捕捉到了中国文化永恒的美丽和优雅，将传统元素与现代美学相结合，创造了一幅视觉上令人惊叹的杰作。受曾晓军作品的启发，这幅画以细腻的笔触和对细节的敏锐观察力表现出来，唤起了一种和谐与平衡的感觉。。

输出效果

03：风景提示优化

原提示语

1 fox in forest  
  
森林中一只狐狸

输出效果

LLM优化提示语

增加了插画风格。

A vibrant, high-resolution illustration of a cunning fox walking confidently through a lush, enchanted forest, the sun casting dappled light through the canopy, highlighting the foxs glowing fur with a myriad of warm colors. The scene is punctuated by magical flora and fauna that only a true artists eye could capture, hinting at the mystic nature of this hidden world, a place where nature is both canvas and creator.  
  
这是一幅充满活力、高分辨率的插图，一只狡猾的狐狸自信地走过一片郁郁葱葱、迷人的森林，阳光透过树冠投射出斑驳的光线，把狐狸发光的皮毛衬托得暖色调万千。画面中穿插着神奇的动植物，只有真正的艺术家才能捕捉到，暗示着这个隐秘世界的神秘本质，在这里，大自然既是画布，也是创造者。