[AI News] Challenging Midjourney: A Very Detailed Hands-On Test of Google Imagen 3

Author: 爱绘画的彤姐

Published 2025-01-15 10:42:33

Tags: artificial intelligence, Midjourney, Imagen, Google Imagen 3, webui, AIGC

Article link: https://blog.csdn.net/a2421417624/article/details/145155579

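On December 18, 2024, Google officially released its latest text-to-image model, Imagen 3, and its latest video model, Veo 2. The launch is widely seen as Google's direct challenge to Midjourney and Sora in those two fields.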

Unlike Google's earlier AI product launches, this one drew broad attention right away, and both products received consistently high praise.

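While Imagen 3 arrived with a major update, Midjourney, long the leader in AI image generation, seems to have stalled over the past year.

Over the whole year Midjourney shipped only one major model update, Version 6.1 on July 31, 2024, and that was already five months ago.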

Veo 2 still requires applying and waiting in a queue, but Imagen 3 is already open to use, so today I want to talk about Imagen 3 and share my own evaluation.

AI image generation covers a very wide range of use cases, so I picked the dimensions I personally consider most common to test the output:

People
Animals
Still life
Landscapes
3D effects
Stylization
Text
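The test setup above can be written down as a small plan: each prompt belongs to one dimension and is generated several times on both models. This is only a sketch of how such a comparison could be organized, not the tooling used for this article. The names `PromptCase` and `build_plan`, the dimension identifiers, and the default of three rounds per prompt are all assumptions made for illustration.

```python
from dataclasses import dataclass

# The seven dimensions listed above; the identifiers are illustrative.
DIMENSIONS = ("people", "animals", "still life", "landscape",
              "3d", "stylized", "text")


@dataclass(frozen=True)
class PromptCase:
    dimension: str
    prompt: str
    rounds: int = 3  # assumption: how often each prompt is regenerated


def build_plan(cases, models=("Imagen 3", "Midjourney V6.1")):
    """Expand cases into (model, dimension, round, prompt) jobs."""
    plan = []
    for case in cases:
        if case.dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {case.dimension}")
        for model in models:
            for round_no in range(1, case.rounds + 1):
                plan.append((model, case.dimension, round_no, case.prompt))
    return plan


# Two of the short prompts used later in this article.
cases = [
    PromptCase("still life", "red car, high resolution, high quality"),
    PromptCase("stylized", "Iron Man, red background, comic book art"),
]
plan = build_plan(cases)
print(len(plan))  # 2 prompts x 2 models x 3 rounds = 12
```

Keeping the prompt text identical across both models and across rounds is what makes the side-by-side images comparable.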

People

People are a must-pass subject for any AI image tool. I set up three different scenes to test them.

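Scene 1: a high-contrast black-and-white portrait. A scholar sits at a wooden desk in an old library, surrounded by piles of books. Soft light falls on his face, and sharp shadows and highlights bring out his deep gaze and wise bearing. Shot with a Leica M10 and a 50mm lens, with classic film grain adding a period feel.

Prompt: High-contrast black and white portrait of a scholar sitting at an ancient wooden desk in a library surrounded by stacks of books, soft light falling on his face, sharp shadows and highlights accentuating his deep gaze and wise demeanor, captured with a Leica M10 and a 50mm lens, classic film grain effect for a timeless feel.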

Imagen 3

Midjourney V6.1

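In this first portrait test the figures themselves come out very similar, but the backgrounds differ: Imagen 3's image carries more complete detail. The wooden desk, for example, is clearly rendered by Imagen 3, while in Midjourney's image it is buried under the piles of books.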

Below are images from multiple rounds of generation, with Imagen 3 on top and Midjourney below.

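Looking at the batches as a whole, Imagen 3 is more consistent: however many times I regenerate, the figure, the pose and the scene layout stay very similar. Midjourney's batches, by contrast, vary much more in the figure and his physical features.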

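Scene 2: an elegant ballet dancer performing on stage, caught in a perfect mid-air turn, with the stage lights casting dramatic shadows. Shot with a 50mm f/1.8 lens, using a fast shutter to freeze the pose while keeping the background blurred for a dreamlike effect.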

Prompt: A graceful ballet dancer mid-performance, poised in the air with a perfect pirouette, the stage lights casting dramatic shadows on the floor, captured with a 50mm f/1.8 lens, fast shutter speed to freeze the motion while maintaining soft background blur for a dreamlike effect.

Imagen 3

Midjourney V6.1

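Imagen 3's image stunned me. It clearly follows the prompt more closely: it shows the dancer's graceful pose and also makes it interact convincingly with the stage lighting and its shadows.

Midjourney seems to have been thrown off by the final phrase about a dreamlike effect. The staging in all of its batches is poor and lacks elegance. Midjourney is easily distracted by adjectives, and we will see a similar effect later.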

Multiple rounds of output, with Imagen 3 on top and Midjourney below.

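Building on single figures, the next test checks how well poses and appearance are followed when the scene contains more than one person.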

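Scene 3: in a cozy coffee shop, a man stands by the counter with a cup of coffee in his hand, looking intently at a woman seated at a table. She sits with her arms crossed on the table, head tilted slightly, listening closely. Her expression is open and her eyes are attentive, while his posture is slightly tense, as if they are in the middle of a serious conversation. The warm atmosphere and soft overhead light add a sense of intimacy and connection. Shot with a 35mm f/1.4 lens, with the background slightly blurred and the focus on their faces and body language, capturing the moment of connection.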

Prompt: In a cozy coffee shop, a man stands leaning against the counter, a coffee cup in hand, looking intently at the woman sitting at a table nearby. The woman, sitting with her arms crossed on the table, tilts her head slightly, listening intently. Her expression is open, her eyes engaged, while his stance, slightly tense, suggests a deep, meaningful conversation. The intimate environment of the café, with soft ambient light from overhead, adds a sense of warmth and connection. Shot with a 35mm f/1.4 lens, the background is subtly blurred, bringing the focus to their faces and body language, capturing a moment of connection.

Imagen 3

Midjourney V6.1

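With two people doing different things in the same scene, Imagen 3 again wins on understanding the details. Midjourney's man has clearly forgotten his coffee: of the three images it generated, only one shows a cup in his hand. The poses and expressions of the two people also look more natural in Imagen 3's output.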

Multiple rounds of output, with Imagen 3 on top and Midjourney below.

Across these people tests, Imagen 3 is well ahead of Midjourney, and its handling of detail is especially impressive.
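The coffee-cup check in scene 3 can be turned into a simple tally, which makes "follows the prompt better" a little less subjective. The sketch below assumes the images are annotated by hand with one True or False per image; the function name `adherence_rate` is hypothetical. The only data used is the Midjourney observation stated above (one of three images shows the cup), and the order of the values is arbitrary.

```python
from fractions import Fraction


def adherence_rate(checks):
    """Share of generated images in which a prompt detail is present."""
    if not checks:
        raise ValueError("need at least one annotated image")
    return Fraction(sum(checks), len(checks))


# Manual annotation of Midjourney's scene 3 batch: only one of the
# three images shows the coffee cup in the man's hand.
midjourney_coffee = [False, False, True]

rate = adherence_rate(midjourney_coffee)
print(f"coffee cup present: {rate} ({float(rate):.0%})")
# prints "coffee cup present: 1/3 (33%)"
```

With only three images per batch the number is a rough indicator at best, so a larger batch would be needed before reading much into it.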

Animals

Prompt: Cute two cats, big eyes, alternating black and white fur, silly demeanor, playing together in a spacious and tidy house

Imagen 3

Midjourney V6.1

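It is easy to see that Imagen 3 leans toward realistic photos, while Midjourney is more creative.

The same issue from the ballet dancer test shows up here too. Midjourney seems more sensitive to adjectives such as "Cute", and its images put more of their emphasis on that adjective.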

Multiple rounds of output, with Imagen 3 on top and Midjourney below.

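One interesting thing came up during testing: Imagen 3 occasionally fails to understand "two cats", which surprised me.

With both "Two cute cats" and "Cute two cats", Midjourney always produced exactly two cats, but Imagen 3 occasionally slipped in a third, as in this example.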

Besides the cats, let's also generate a fox and see how it turns out.

Prompt: Night. Close-up of a huge white fox standing and turning, with the magnificent golden palace of ancient China behind it and smoke all around. Soft blue-green lighting.

Imagen 3

Midjourney V6.1

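Imagen 3 leans toward a realistic look, while Midjourney blends in more stylized elements. Judging by the building in the background, Imagen 3's rendering is finer, so the image looks less AI-generated and more real.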

Still life

Prompt: red car, high resolution, high quality

Imagen 3

Midjourney V6.1

Product design

Prompt: Product photography of a violet-colored shampoo, with the brand name "VIOLET" displayed in gold text on an elegant purple bottle. The product is surrounded by white cream and splash designs, set against a dark, blackish-purple background with soft lighting. The image is highly detailed, hyper-realistic, and in a cinematic, minimalistic photography style with high resolution.

Imagen 3

Midjourney V6.1

Interior design

Prompt: Interior Design, a perspective of a study, large windows with natural light, Light colors, plants, modern furniture, modernist, modern interior design

Imagen 3

Midjourney V6.1

Landscape photography

Prompt: A hyper realistic photo of The Grand Canyon at sunrise, Canon RF 16mm f:2.8 STM Lens

Imagen 3

Midjourney V6.1

3D style design

Prompt: architectural cross section of a geometric architectural model in the cube display, 3D, with deep layered, cubic structures. The design features intricate lighting exploration, caverns, and pathways. Viewed from a frontal, parallel sectional view

Imagen 3

Midjourney V6.1

Prompt: 3D toy, ip, Cyberpunk style, simple background, Chinese style clothes, best quality, c4d, mixer, 3D model, toy, whole body, watching the audience, super details, clean background, ip by pop mart, physical blind box, vivid color, street style, high resolution, a lot of details, Pixar, candy color, fashion trend

Imagen 3

Midjourney V6.1

Cyberpunk style

Prompt: A futuristic cyberpunk city at night, with neon-lit streets reflecting off wet pavement, towering skyscrapers adorned with holographic advertisements, and a sleek red racing car hovering slightly above the ground, glowing blue vapor emitting from its exhaust. People in glowing exoskeletons walk along the streets, dynamic graffiti on walls, and drones patrol the hazy sky. The atmosphere is vibrant, with cool blue and purple tones mixed with warm orange and pink neon lights.

Imagen 3

Midjourney V6.1

Chinese ink painting style

Prompt: an Chinese ink painting of A landscape, black and white documentary style, japanese minimalism, Simple, large areas of white, detail shot, high quality

Imagen 3

Midjourney V6.1

Comic style

Prompt: Iron Man, red background, comic book art

Imagen 3

Midjourney V6.1

Some more abstract designs

Prompt: A close-up, macro photography stock photo of a strawberry intricately sculpted into the shape of a hummingbird in mid-flight, its wings a blur as it sips nectar from a vibrant, tubular flower. The backdrop features a lush, colorful garden with a soft, bokeh effect, creating a dreamlike atmosphere. The image is exceptionally detailed and captured with a shallow depth of field, ensuring a razor-sharp focus on the strawberry-hummingbird and gentle fading of the background. The high resolution, professional photographers style, and soft lighting illuminate the scene in a very detailed manner, professional color grading amplifies the vibrant colors and creates an image with exceptional clarity. The depth of field makes the hummingbird and flower stand out starkly against the bokeh background.

This prompt comes from Google itself. It describes a strawberry carved into the shape of a hummingbird, beating its wings while it sips nectar from a flower.

Imagen 3

Midjourney V6.1

Text rendering

Prompt: a new year greeting card showing beach shoreline filled with festive lights from afar offshore at night and the sky full of fireworks. Add the greeting "Happy New Year 2025".

Imagen 3

Midjourney V6.1

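Summary

Prompt adherence: Thanks, in my view, to the strong language understanding of Gemini 2.0, Imagen 3 follows the prompt very strictly, including the full detail of each part, and it does well on both semantic adherence and overall control of the composition.
Image detail: Imagen 3's images carry more complete detail across the board, which also makes them look much less AI-generated.
Stylistic variety: The cost of Imagen 3's strict prompt adherence is just as clear. Its styles lack variety: even after many rerolls, the subject, its features and the composition do not differ significantly.
Text control: In this basic test there is no significant gap between the two on text. Midjourney's results lean more toward an artistic style.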

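Imagen 3 is better suited to realistic images, where the goal is complete detail and less of an AI look, while Midjourney is better suited to stylistically varied design work and to hunting for inspiration and ideas.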

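Compared with Midjourney, Imagen 3's interaction design and web features are still very bare-bones and leave a lot of room for improvement. But Imagen 3 has one bigger advantage: at the time of writing it is free, with no quota, so every user can generate an unlimited number of images each day.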

AI image generation was somewhat quiet in 2024. The arrival of Imagen 3 was a pleasant surprise, and it is also a challenge to Midjourney.

2025 is a new year, and I look forward to more AI products and features!

Try Imagen 3

Link:
