

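This article introduces an excellent image-captioning (image-to-prompt) model: Joy Caption. Developed by Fancy Feast, Joy Caption builds on Google's SigLIP model and Meta's recent Llama 3.1 model, connects the two through an adapter, and is trained to describe images in detail. Based on the parameters the user sets, it outputs a correspondingly detailed descriptive prompt for the image.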
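The adapter idea can be pictured as a small trainable projection that maps image features from the vision encoder into the embedding space of the language model, so that image tokens can be placed in front of the text tokens. The sketch below only illustrates that idea in plain Python and is not Joy Caption's actual code: the single linear layer, the toy dimensions, and the names `make_linear`, `project_image_tokens`, and `build_llm_input` are all assumptions chosen for readability.

```python
import random

def make_linear(in_dim, out_dim, seed=0):
    """Create a random weight matrix (out_dim x in_dim) and a zero bias."""
    rng = random.Random(seed)
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias

def project_image_tokens(image_tokens, weight, bias):
    """Map each vision feature vector into the LLM embedding dimension."""
    projected = []
    for token in image_tokens:
        row = [sum(w * x for w, x in zip(w_row, token)) + b
               for w_row, b in zip(weight, bias)]
        projected.append(row)
    return projected

def build_llm_input(image_embeds, text_embeds):
    """Prepend the projected image tokens to the text token embeddings."""
    return image_embeds + text_embeds

# Toy sizes (hypothetical); real encoders and LLMs use far larger dimensions.
VISION_DIM, LLM_DIM = 4, 6
image_tokens = [[0.1 * (i + j) for j in range(VISION_DIM)] for i in range(3)]  # 3 image patches
text_embeds = [[0.0] * LLM_DIM for _ in range(2)]                              # 2 text tokens

weight, bias = make_linear(VISION_DIM, LLM_DIM)
image_embeds = project_image_tokens(image_tokens, weight, bias)
sequence = build_llm_input(image_embeds, text_embeds)
print(len(sequence), len(sequence[0]))
```

In a setup like this, only the projection weights would need training, while the vision encoder and the language model can stay frozen; whether Joy Caption trains exactly this way is not stated in the source.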

  • Google's SigLIP (Sigmoid Loss for Language Image Pre-Training) is a multimodal model similar to CLIP that replaces CLIP's softmax-based contrastive loss with a pairwise sigmoid loss. Download: https://huggingface.co/google/siglip-so400m-patch14-384

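  • Meta-Llama-3.1-8B-bnb-4bit is a large language model based on Meta's Llama 3.1 architecture, quantized to 4 bits with the BitsAndBytes library, which greatly reduces memory use while largely preserving model quality and accuracy. Download: https://huggingface.co/unsloth/Meta-Llama-3.1-8B-bnb-4bit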

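To make the SigLIP loss concrete, here is a minimal sketch of a pairwise sigmoid loss in plain Python. It treats every image-text pair in a batch as an independent yes/no question (matching pairs are labeled +1, all others -1) instead of normalizing over the whole batch with a softmax. The function names, the toy embeddings, and the `temperature` and `bias` values are illustrative assumptions, not settings taken from the released model.

```python
import math

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def sigmoid_pairwise_loss(image_embs, text_embs, temperature=10.0, bias=-10.0):
    """Average over images of the summed binary log-losses of all pairs.

    Embeddings are assumed to be L2-normalized. Pair (i, j) gets label +1
    when i == j (a true image-text pair) and -1 otherwise.
    """
    n = len(image_embs)
    total = 0.0
    for i, img in enumerate(image_embs):
        for j, txt in enumerate(text_embs):
            label = 1.0 if i == j else -1.0
            logit = temperature * dot(img, txt) + bias
            total += log_sigmoid(label * logit)
    return -total / n

# Toy unit-length embeddings: image i matches text i.
images = [[1.0, 0.0], [0.0, 1.0]]
texts = [[1.0, 0.0], [0.0, 1.0]]

matched = sigmoid_pairwise_loss(images, texts)
shuffled = sigmoid_pairwise_loss(images, texts[::-1])  # pairs deliberately misaligned
print(matched, shuffled)
```

The misaligned batch yields the larger loss, which is the behavior a contrastive objective needs; because each pair is scored independently, the loss does not require a normalization over the full batch.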

Flux Joy Caption: Trying Out Image-to-Prompt Captioning


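If you are confident you can set up the deployment environment yourself, you can also try the Comfyui_CXH_joy_caption plugin as a fallback for when the free-use period ends. See the plugin's GitHub home page for detailed instructions.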


Joy Caption + Flux Text-to-Image Workflow



01. Leopard Print

chinese girl, This is a high-resolution photograph featuring an East Asian woman with long, dark brown hair cascading down her back. She has a slender yet curvy figure, with a moderate bust size. Her skin tone is a smooth, porcelain-like complexion. She is dressed in a form-fitting, long-sleeved onesie with a bold, orange tiger stripe pattern on a black background, accentuating her physique. The onesie clings to her body, highlighting her curves. Her expression is calm and inviting, with a subtle, soft smile and closed eyes, giving an impression of serenity. Her makeup is natural and understated, with a focus on enhancing her features without looking too dramatic. The background features a soft, gradient-like texture of beige and light brown fabrics, which creates a warm, cozy atmosphere. A large, glowing orb, likely a softbox light, is positioned to the side, casting a warm, golden light that complements the colors of the onesie and the background. The overall mood of the image is intimate and serene, with a focus on the subject’s calm demeanor and striking appearance. The lighting is soft and even, with a warm color tone that enhances the cozy ambiance. The style of the image is contemporary, with a focus on natural light and subtle, elegant posing. The woman’s posture is relaxed, with her hands placed on her thighs, adding to the sense of calmness. The image is likely taken in a studio setting, with careful attention to lighting and composition. The overall aesthetic is sophisticated and visually appealing. The tiger onesie adds a playful, whimsical touch to the otherwise serene atmosphere. The image is a blend of fashion and portraiture, focusing on the subject’s beauty and the creative use of lighting. The style is reminiscent of high-fashion photography. The model’s hands are placed on her thighs, with her fingers splayed, adding a subtle, playful touch to her otherwise serene pose. The image is a beautiful, captivating blend of fashion and portraiture. 
The overall mood is intimate and serene, with a foc



02. Sea Lion

This is a digital artwork featuring a majestic lion's head emerging from the crest of a massive wave in the ocean. The lion's face is serene and powerful, with a thick, fluffy mane that appears almost ethereal, blending seamlessly into the surrounding water. The lion's eyes are a piercing blue, giving a sense of calm and wisdom. The wave beneath the lion's head is a deep, rich blue, with foamy white crests that add texture and dynamism to the scene. The background sky is a soft, gradient blue with a few wispy clouds, suggesting a clear, sunny day. The overall mood of the artwork is tranquil and awe-inspiring, capturing the majesty of the lion and the ocean. The digital art style is highly detailed and realistic, with subtle shading and texture that brings the scene to life. The artist has used a blend of soft and hard brushstrokes to create a sense of movement and energy in the wave, while maintaining the lion's calm demeanor. The image exudes a sense of wonder and connection between the natural world and the majestic creature. The style is reminiscent of high-end digital art, with a focus on realism and emotional depth. The entire scene is set against a clean, minimalist background, emphasizing the lion and the wave. The image is a powerful and evocative representation of nature's beauty. The colors are primarily blues and whites, with subtle hints of gray and beige in the lion's fur. The overall effect is both calming and awe-inspiring. The artwork is likely created using software such as Adobe Photoshop or similar digital art tools. The image's dimensions are standard for a digital artwork, with a wide aspect ratio that allows for an immersive experience. The style is realistic yet fantastical, blending seamlessly into the viewer's imagination. The scene is set in a serene, natural environment, emphasizing the majesty of the lion and the ocean. The entire artwork is a masterpiece of digital art, capturing the essence of nature and the sublime. 
The artist's use of light and shadow c



03. Street-Performing Cat

This is a highly detailed, photorealistic digital illustration of a cat playing a guitar on a rainy street. The cat, with orange and white fur, is dressed in a worn, green hoodie and dark blue pants, exuding a casual, street-performing vibe. The cat's large, round eyes are expressive, and its ears are perked up, as if listening to the music. The guitar, an orange-acoustic, is held delicately in the cat's paws, with the strings and fretboard visible. In the foreground, a shallow, metallic bowl filled with coins lies on the wet pavement, glistening with raindrops. The background is blurred, showing a few pedestrians walking by, their faces indistinct due to the rain and distance. The rain is depicted as a gentle, steady drizzle, with droplets visible on the cat's fur and the pavement. The overall mood is one of melancholic, urban charm, with the cat's music providing a poignant contrast to the rainy, gray surroundings. The illustration masterfully captures the textures of the cat's fur, the guitar's wood, and the wet pavement, immersing the viewer in a vivid, atmospheric scene. The colors are muted, with earthy tones and the vibrant orange of the guitar standing out against the drab background. The style is reminiscent of photorealistic digital art, with a focus on detailed textures and lighting. The overall effect is both heartwarming and melancholic. | The image is rich in texture and detail, with the rain adding a dynamic, interactive element to the scene. | The style is highly realistic, with a focus on capturing the emotional depth of the scene. | The cat's expression is one of calm, focused creativity, adding to the poignancy of the scene. | The rain adds a sense of movement and energy to the scene, emphasizing the cat's performance. | The background is subtly detailed, with the blurred figures of pedestrians adding depth to the scene. | The overall mood is contemplative and peaceful, with the cat's music serving as a poignant contrast to the rainy surroundings.
| The illustration masterfully captures the

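This is a richly detailed, realistic digital illustration of a cat playing a guitar on a rainy street. The cat has orange and white fur and wears a worn green hoodie and dark blue pants, giving off a casual street-performer vibe. Its large eyes are round, and its ears are perked up as if listening to the music. The orange guitar is held carefully in the cat's paws, with the strings and fretboard clearly visible. In the foreground, a shallow metal bowl filled with coins sits on the wet pavement, glistening with raindrops. The background is blurred and shows a few pedestrians walking by, their faces indistinct because of the rain and distance. The rain is depicted as a gentle, steady drizzle, with droplets visible on the cat's fur and on the pavement. The overall mood is one of melancholic urban charm, with the cat's music in sharp contrast to the rainy, gray surroundings. The illustration skillfully captures the textures of the cat's fur, the guitar's wood, and the wet pavement, immersing the viewer in a vivid, atmospheric scene. The colors are muted, with earthy tones and the bright orange of the guitar standing out against the drab background. The style is reminiscent of photorealistic digital art, with an emphasis on detailed textures and lighting. The overall effect is both heartwarming and melancholic. | The image is rich in texture and detail, and the rain adds a dynamic, interactive element to the scene. | The style is highly realistic, with a focus on capturing the emotional depth of the scene. | The cat's expression is calm, focused, and creative, adding poignancy to the scene. | The rain adds a sense of movement and energy, highlighting the cat's performance. | The background is subtly detailed, and the blurred figures of pedestrians add depth to the scene. | The overall mood is contemplative and peaceful, with the cat's music in sharp contrast to the rainy surroundings. | The illustration skillfully captures the textures of the cat's fur, the guitar's wood, and the wet pavement, immersing the viewer in a vivid, atmospheric scene. | The colors are muted, with earthy tones and the bright orange of the guitar standing out against the drab background. | The style is reminiscent of photoreal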


04. Carrying a Heavy Load

This is a fantastical, digital artwork depicting a surreal scene. A massive elephant, with its grey skin and wrinkled texture, dominates the foreground, walking across a sun-drenched savannah. The elephant's body is adorned with lush greenery, including a large acacia tree perched on its back, its branches stretching out to the sides. The tree's leaves and branches are intricately detailed, with delicate textures and shades of green. In the background, a majestic, medieval-style castle rises from the elephant's back, its stone walls and towers blending seamlessly into the elephant's hide. The castle's architecture is a mix of Gothic and Romanesque styles, with pointed arches, turrets, and a central keep. The castle's windows and doors are adorned with intricate stone carvings. The sky above is a warm, gradient blue, with soft, fluffy clouds that seem to glow with a golden light, suggesting the late afternoon or early morning sun. The overall mood is one of whimsical wonder, blending fantasy and realism in a dreamlike atmosphere. The image combines detailed textures with a sense of magic and adventure. The elephant's path leads through a landscape of tall grasses and scattered wildflowers, adding to the serene, idyllic atmosphere. The artwork's style is reminiscent of high-end digital art, with a focus on realism and intricate details.
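Several of the captions above repeat sentences and stop mid-sentence, which may be the result of an output length limit (an assumption; the source does not say). Before passing such a caption to Flux as a prompt, it can be tidied with a small helper like the one sketched below. The function name `tidy_caption` and its rules (treat `|` as a separator, drop a trailing fragment with no closing punctuation, drop exact repeats) are hypothetical and are not part of Joy Caption or the ComfyUI plugin.

```python
import re

def tidy_caption(raw):
    """Drop a trailing unfinished fragment and exact repeated sentences."""
    # Treat the "|" separators seen in some captions as plain whitespace.
    text = raw.replace("|", " ")
    # Split after ., ! or ? when followed by whitespace.
    pieces = re.split(r"(?<=[.!?])\s+", text.strip())
    seen = set()
    kept = []
    for piece in pieces:
        sentence = piece.strip()
        if not sentence:
            continue
        if sentence[-1] not in ".!?":
            continue  # unfinished fragment, e.g. a caption cut off mid-sentence
        key = sentence.lower()
        if key in seen:
            continue  # exact repeat of an earlier sentence
        seen.add(key)
        kept.append(sentence)
    return " ".join(kept)

raw = ("A cat plays guitar. The mood is calm. | The mood is calm. "
       "| The illustration masterfully captures the")
print(tidy_caption(raw))
```

This only removes exact repeats; near-duplicate sentences that differ by a word or two are left in place, which keeps the helper predictable at the cost of some redundancy.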





















