The hottest AI models, what they do, and how to use them

[English original appended below]
Charles Rollet, March 6, 2025

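AI models are being developed at a dizzying pace by everyone from Big Tech companies like Google to startups like OpenAI and Anthropic. Keeping track of the latest ones can be overwhelming.

Adding to the confusion, AI models are usually promoted on the basis of industry benchmarks. But these technical metrics often reveal little about how real people and companies actually use the models.

To cut through the noise, TechCrunch has compiled an overview of the most advanced AI models released since 2024, with details on how to use them and what they are best for. We will keep this list updated with the latest launches, too.

There are more than a million AI models out there: Hugging Face, for example, hosts over 1.4 million. So this list might miss some models that perform better in one way or another.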

AI models released in 2025

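Cohere’s Aya Vision
Cohere has released a multimodal model called Aya Vision that it claims is best in class at tasks such as captioning images and answering questions about photos. Unlike other models, Cohere says, it also excels in languages other than English. It is available for free on WhatsApp.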

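OpenAI’s GPT 4.5 “Orion”
OpenAI calls Orion its largest model to date and touts its strong “world knowledge” and “emotional intelligence.” However, it underperforms newer reasoning models on certain benchmarks. Orion is available to subscribers of OpenAI’s $200-per-month plan.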

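Claude Sonnet 3.7
Anthropic says this is the industry’s first “hybrid” reasoning model, because it can both give quick answers and really think a problem through when needed. According to Anthropic, it also lets users control how long the model thinks. Sonnet 3.7 is available to all Claude users, but heavier users will need the $20-per-month Pro plan.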

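xAI’s Grok 3
Grok 3 is the latest flagship model from xAI, the startup founded by Elon Musk. It is claimed to outperform other leading models on math, science, and coding. The model requires X Premium ($50 per month). After one study found that Grok 2 leaned left, Musk pledged to shift Grok toward a more “politically neutral” stance, but it is not yet clear whether that has been achieved.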

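OpenAI o3-mini
This is OpenAI’s latest reasoning model, optimized for STEM-related tasks such as coding, math, and science. It is not OpenAI’s most powerful model, but because it is smaller, the company says it costs significantly less. It is available for free, but heavy users need a subscription.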

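OpenAI Deep Research
OpenAI’s Deep Research is designed for in-depth research on a topic, with clear citations. The service is only available with ChatGPT’s $200-per-month Pro subscription. OpenAI recommends it for everything from science to shopping research, but beware that hallucinations remain a problem for AI.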

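Mistral Le Chat
Mistral has launched app versions of Le Chat, a multimodal AI personal assistant. Mistral claims Le Chat responds faster than any other chatbot. It also has a paid version that includes up-to-date journalism from AFP. Tests by Le Monde found Le Chat’s performance impressive, although it made more errors than ChatGPT.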

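OpenAI Operator
OpenAI’s Operator is meant to be a personal intern that can do things independently, such as helping you buy groceries. It requires the $200-per-month ChatGPT Pro subscription. AI agents hold a lot of promise, but they are still experimental: a Washington Post reviewer says Operator decided on its own to order a dozen eggs for $31, paid with the reviewer’s credit card.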

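Google Gemini 2.0 Pro Experimental
Google says its much-awaited flagship Gemini model excels at coding and understanding general knowledge. It also has a super-long context window of 2 million tokens, which helps users who need to process large amounts of text quickly. The service requires, at minimum, a Google One AI Premium subscription at $19.99 a month.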

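AI models released in 2024

DeepSeek R1
This Chinese AI model took Silicon Valley by storm. DeepSeek’s R1 performs well on coding and math, and its open source nature means anyone can run it locally. Plus, it is free. However, R1 integrates Chinese government censorship and faces a growing number of bans over concerns that it may send user data back to China.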

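Gemini Deep Research
Deep Research summarizes Google’s search results in a simple, well-cited document. The service is helpful for students and anyone else who needs a quick research summary. However, its quality is nowhere near that of an actual peer-reviewed paper. Deep Research requires a $19.99 Google One AI Premium subscription.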

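Meta Llama 3.3 70B
This is the newest and most advanced version of Meta’s open source Llama AI models. Meta touts this version as its cheapest and most efficient yet, especially for math, general knowledge, and instruction following. It is free and open source.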

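OpenAI Sora
Sora is a model that creates realistic videos from text. While it can generate entire scenes rather than just clips, OpenAI admits that it often produces “unrealistic physics.” It is currently available only on paid versions of ChatGPT, starting with Plus at $20 a month.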

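Alibaba Qwen QwQ-32B-Preview
This model is one of the few to rival OpenAI’s o1 on certain industry benchmarks, and it excels at math and coding. Ironically for a “reasoning model,” it has “room for improvement in common sense reasoning,” Alibaba says. TechCrunch testing shows that it also incorporates Chinese government censorship. It is free and open source.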

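Anthropic’s Computer Use
Claude’s Computer Use is meant to take control of your computer to complete tasks such as coding or booking a plane ticket, making it a predecessor of OpenAI’s Operator. Computer Use, however, remains in beta. Pricing is via the API: $0.80 per million input tokens and $4 per million output tokens.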
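
To get a feel for what per-token pricing means in practice, here is a minimal sketch that turns the two rates quoted for Computer Use ($0.80 per million input tokens, $4 per million output tokens) into a cost estimate. The function name and the sample token counts are hypothetical, chosen only for illustration, and actual billing may include charges this sketch does not model.

```python
# Rates quoted in the article for Computer Use, in USD per million tokens.
INPUT_USD_PER_MILLION = 0.80
OUTPUT_USD_PER_MILLION = 4.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a workload from its input and output token counts.

    Hypothetical helper for illustration; it is not part of any vendor API.
    """
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Assumed workload: 250,000 input tokens and 50,000 output tokens.
    print(f"${estimate_cost_usd(250_000, 50_000):.2f}")  # prints $0.40
```

Because output tokens cost five times as much as input tokens at these rates, the 50,000 output tokens in this example cost the same as the 250,000 input tokens ($0.20 each).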

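xAI’s Grok 2
Elon Musk’s AI company, xAI, has launched an enhanced version of its flagship Grok 2 chatbot that it claims is “three times faster.” Free users are limited to 10 questions every two hours on Grok, while subscribers to X’s Premium and Premium+ plans get higher usage limits. xAI also launched an image generator, Aurora, which produces highly photorealistic images, including some graphic or violent content.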

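OpenAI o1
OpenAI’s o1 family is meant to produce better answers by “thinking” through responses with a hidden reasoning feature. OpenAI claims the model excels at coding, math, and safety, but it also has issues with trying to deceive humans. Using o1 requires a ChatGPT Plus subscription, which is $20 a month.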

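Anthropic’s Claude Sonnet 3.5
Claude Sonnet 3.5 is a model that Anthropic claims is best in class. It has become known for its coding capabilities and is considered a tech insider’s chatbot of choice. The model can be accessed for free on Claude, although heavy users will need the $20-per-month Pro subscription. While it can understand images, it cannot generate them.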

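OpenAI GPT 4o-mini
OpenAI touts GPT 4o-mini as its most affordable and fastest model yet, thanks to its small size. It is meant to enable a broad range of tasks, such as powering customer service chatbots. The model is available on ChatGPT’s free tier. It is better suited to high-volume simple tasks than to more complex ones.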

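Cohere Command R+
Cohere’s Command R+ model excels at complex retrieval-augmented generation (RAG) applications for enterprises. That means it can find and cite specific pieces of information very well. (The inventor of RAG actually works at Cohere.) Still, RAG does not fully solve AI’s hallucination problem.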

=English original=======
https://techcrunch.com/2025/03/05/the-hottest-ai-models-what-they-do-and-how-to-use-them/

AI models are being cranked out at a dizzying pace, by everyone from Big Tech companies like Google to startups like OpenAI and Anthropic. Keeping track of the latest ones can be overwhelming.

Adding to the confusion is that AI models are often promoted based on industry benchmarks. But these technical metrics often reveal little about how real people and companies actually use them.

To cut through the noise, TechCrunch has compiled an overview of the most advanced AI models released since 2024, with details on how to use them and what they’re best for. We’ll keep this list updated with the latest launches, too.

There are literally over a million AI models out there: Hugging Face, for example, hosts over 1.4 million. So this list might miss some models that perform better, in one way or another.

AI models released in 2025

Cohere’s Aya Vision
Cohere released a multimodal model called Aya Vision that it claims is best in class at doing things like captioning images and answering questions about photos. It also excels in languages other than English, unlike other models, Cohere claims. It is available for free on WhatsApp.

OpenAI’s GPT 4.5 “Orion”
OpenAI calls Orion their largest model to date, touting its strong “world knowledge” and “emotional intelligence.” However, it underperforms on certain benchmarks compared to newer reasoning models. Orion is available to subscribers of OpenAI’s $200-per-month plan.

Claude Sonnet 3.7
Anthropic says this is the industry’s first “hybrid” reasoning model, because it can both fire off quick answers and really think things through when needed. It also gives users control over how long the model can think for, per Anthropic. Sonnet 3.7 is available to all Claude users, but heavier users will need a $20-per-month Pro plan.

xAI’s Grok 3
Grok 3 is the latest flagship model from Elon Musk-founded startup xAI. It’s claimed to outperform other leading models on math, science, and coding. The model requires X Premium (which is $50 per month). After one study found Grok 2 leaned left, Musk pledged to shift Grok to be more “politically neutral,” but it’s not yet clear if that’s been achieved.

OpenAI o3-mini
This is OpenAI’s latest reasoning model and is optimized for STEM-related tasks like coding, math, and science. It’s not OpenAI’s most powerful model but because it’s smaller, the company says it’s significantly lower cost. It is available for free but requires a subscription for heavy users.

OpenAI Deep Research
OpenAI’s Deep Research is designed for doing in-depth research on a topic with clear citations. This service is only available with ChatGPT’s $200-per-month Pro subscription. OpenAI recommends it for everything from science to shopping research, but beware that hallucinations remain a problem for AI.

Mistral Le Chat
Mistral has launched app versions of Le Chat, a multimodal AI personal assistant. Mistral claims Le Chat responds faster than any other chatbot. It also has a paid version with up-to-date journalism from the AFP. Tests from Le Monde found Le Chat’s performance impressive, although it made more errors than ChatGPT.

OpenAI Operator
OpenAI’s Operator is meant to be a personal intern that can do things independently, like help you buy groceries. It requires a $200-per-month ChatGPT Pro subscription. AI agents hold a lot of promise, but they’re still experimental: A Washington Post reviewer says Operator decided on its own to order a dozen eggs for $31, paid with the reviewer’s credit card.

Google Gemini 2.0 Pro Experimental
Google Gemini’s much-awaited flagship model says it excels at coding and understanding general knowledge. It also has a super-long context window of 2 million tokens, helping users who need to quickly process massive chunks of text. The service requires (at minimum) a Google One AI Premium subscription of $19.99 a month.

AI models released in 2024

DeepSeek R1
This Chinese AI model took Silicon Valley by storm. DeepSeek’s R1 performs well on coding and math, while its open source nature means anyone can run it locally. Plus, it’s free. However, R1 integrates Chinese government censorship and faces rising bans for potentially sending user data back to China.

Gemini Deep Research
Deep Research summarizes Google’s search results in a simple and well-cited document. The service is helpful for students and anyone else who needs a quick research summary. However, its quality isn’t nearly as good as an actual peer-reviewed paper. Deep Research requires a $19.99 Google One AI Premium subscription.

Meta Llama 3.3 70B
This is the newest and most advanced version of Meta’s open source Llama AI models. Meta has touted this version as its cheapest and most efficient yet, especially for math, general knowledge, and instruction following. It is free and open source.

OpenAI Sora
Sora is a model that creates realistic videos based on text. While it can generate entire scenes rather than just clips, OpenAI admits that it often generates “unrealistic physics.” It’s currently only available on paid versions of ChatGPT, starting with Plus, which is $20 a month.

Alibaba Qwen QwQ-32B-Preview
This model is one of the few to rival OpenAI’s o1 on certain industry benchmarks, excelling in math and coding. Ironically for a “reasoning model,” it has “room for improvement in common sense reasoning,” Alibaba says. It also incorporates Chinese government censorship, TechCrunch testing shows. It’s free and open source.

Anthropic’s Computer Use
Claude’s Computer Use is meant to take control of your computer to complete tasks like coding or booking a plane ticket, making it a predecessor of OpenAI’s Operator. Computer use, however, remains in beta. Pricing is via API: $0.80 per million tokens of input and $4 per million tokens of output.

xAI’s Grok 2
Elon Musk’s AI company, xAI, has launched an enhanced version of its flagship Grok 2 chatbot it claims is “three times faster.” Free users are limited to 10 questions every two hours on Grok, while subscribers to X’s Premium and Premium+ plans enjoy higher usage limits. xAI also launched an image generator, Aurora, that produces highly photorealistic images, including some graphic or violent content.

OpenAI o1
OpenAI’s o1 family is meant to produce better answers by “thinking” through responses via a hidden reasoning feature. The model excels at coding, math, and safety, OpenAI claims, but has issues with trying to deceive humans, too. Using o1 requires subscribing to ChatGPT Plus, which is $20 a month.

Anthropic’s Claude Sonnet 3.5
Claude Sonnet 3.5 is a model Anthropic claims as being best in class. It’s become known for its coding capabilities and is considered a tech insider’s chatbot of choice. The model can be accessed for free on Claude, although heavy users will need a $20 monthly Pro subscription. While it can understand images, it can’t generate them.

OpenAI GPT 4o-mini
OpenAI has touted GPT 4o-mini as its most affordable and fastest model yet, thanks to its small size. It’s meant to enable a broad range of tasks like powering customer service chatbots. The model is available on ChatGPT’s free tier. It’s better suited for high-volume simple tasks compared to more complex ones.

Cohere Command R+
Cohere’s Command R+ model excels at complex retrieval-augmented generation (or RAG) applications for enterprises. That means it can find and cite specific pieces of information really well. (The inventor of RAG actually works at Cohere.) Still, RAG doesn’t fully solve AI’s hallucination problem.
