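Author: 老余捞鱼
This is original work. If you repost it, please credit the source and the original author.
A few words up front:
This article pits Flux, the newcomer in text-to-image generation, against Midjourney, the long-time champion, in three rounds of realistic human image generation. All three tests were run to the same standard, and I then assess the performance and output of both tools, in the hope that this helps with your own future use.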
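Less than a month after its release, Flux.1 came to be regarded as the best open AI image generator. It has 12 billion parameters, and quite a few industry reviews suggest that it outperforms commercial models such as Midjourney V6 and OpenAI's DALL-E 3. In my view, however, the best way to judge a text-to-image tool is to have it imitate the real world. After all, most people believe that with a little effort they can still tell a photo of a real person from an AI-generated portrait.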
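In an earlier article I explained how to get access to Flux; see 《免费使用最近大热 Flux.1 AI 文生图工具的 5 种方法》 ("5 Ways to Use the Recently Popular Flux.1 AI Text-to-Image Tool for Free"). Guides to using Midjourney are everywhere online, so I won't repeat them here. With that out of the way, let's move on to the head-to-head.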
Round One
To test both platforms against the same standard, every prompt used in this article is English text randomly generated by Perplexity. Here is the first prompt:
A young woman in her thirties, with medium-length, wavy brown hair, sits at a modern, light-wood desk. She’s wearing a casual white shirt and a gray cardigan. Her face is clearly visible, turned slightly towards the camera, with a concentrated yet relaxed expression. Her green eyes are fixed on the screen of her silver laptop, set before her on the desk. Natural light streams in through a large window on her left, creating a soft glow that highlights her features and adds depth to the image. Her hands are resting on the keyboard, fingers moving, suggesting that she is typing. On the desk, a steaming cup of coffee sits beside the computer, along with a notebook and pen. In the background, a bookcase filled with books and a few green plants add warmth to the atmosphere. The camera angle is slightly elevated, offering a clear view of the woman’s face and her workspace. The image exudes an atmosphere of serene productivity and concentration, with a soft, natural color palette.
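Image generated by Flux:
And here is the photo from Midjourney:
I have to say Midjourney wins on the details, which is where the devil lives. Not only can I not tell at all that this woman was generated by AI, but the laptop she is using even turns out to be an Apple. Flux, by comparison, still has work to do here. Notice also the depth of field applied to the objects and plants on the bookshelf behind her: Midjourney's handling is nearly perfect. In my view, Midjourney clearly takes the first round.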
Round Two
Next, I tried a scene with more action. The prompt is as follows:
A muscular boxer in his mid-twenties is captured in the middle of a fight, just after landing a powerful blow. His face is clearly visible and expressive. His eyes are slightly wide, his mouth ajar in a grimace of pain and surprise. Droplets of sweat and blood fly around his head, frozen in the moment. The boxer has short black hair and a three-day beard. He wears red boxing shorts and his hands are wrapped in white bandages. His naked, muscular torso glistens with sweat under the bright lights of the ring. His body is slightly twisted, starting to tilt backwards under the impact of the blow. His arms are in motion, one descending from his guard position, the other beginning to rise to counter. The boxing ring is visible in the background, with stretched ropes and a canvas floor. The bright lights of the ceiling create a strong contrast, accentuating the shadows and contours of the boxer’s body. The blurred crowd is visible behind the ropes, their dark silhouettes contrasting with the bright lights of the ring. The atmosphere is charged with energy and tension. The image is captured with a fast shutter speed, freezing the precise moment of impact and details like the drops of sweat suspended in the air.
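Image generated by Flux:
Photo generated by Midjourney:
Look at the boxer's expression and the muscle definition on his body, then at the sweat frozen in midair, and at the stubble that is exactly three days old, no more and no less. 2:0, Midjourney extends its lead.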
Round Three
Let's give Flux one last chance and see whether it can turn things around. The prompt for this round:
A man aged around 75 sits at the wheel of his car, seen from a three-quarter view through the windscreen. His wrinkled but benevolent face is clearly visible, illuminated by the daylight. He has thinning gray hair, bushy eyebrows and wears thin gold-rimmed glasses. The man is dressed in a blue and white checked shirt, with a beige woollen vest over it. His hands, marked by age with visible veins, firmly hold the worn leather steering wheel at the 10:10 position. His expression is focused but serene, with a slight smile at the corners of his lips. His pale blue eyes, behind his glasses, are fixed on the road ahead. The car is a classic ’80s model, with a dark brown leather interior. The varnished wood dashboard is visible, with analog dials and a vintage radio. A small plastic guardian angel hangs from the rearview mirror, swaying gently. Through the windscreen, we see a country road lined with autumn-leafed trees. The sky is light blue with a few fluffy clouds. Sunlight streams in through the driver’s window, creating soft reflections on the man’s face and highlighting the textures of his skin and clothes. A small bouquet of dried lavender is attached to the air vent, adding a touch of color and suggesting a soothing ambience in the cabin.
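Image generated by Flux:
Photo generated by Midjourney:
Well, this time I don't even need to launch into a long analysis. The old man's wrinkles in the Flux image look as if they were built up from modeling clay. Compare them with the age spots on the old man's head in the Midjourney image, and the better result is obvious at a glance.
3:0. Midjourney clearly still stands apart in text-to-image generation, and it wins by an overwhelming margin.
So, do you agree with my verdict on the best tool? I'd love to hear what you think!
This article is intended solely for technical discussion and learning. If you share it, please credit the original author and the source.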