Taking Your First Steps Building with LLMs Using Google Gemini - CSDN Blog

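LLMs seem to offer limitless possibilities for innovation. If you're like me, you've already used GenAI applications and tools, such as ChatGPT built into Expedia, Copilot for writing code, or even DALL-E for generating images. But as a technologist, I want to do more than just use LLM-based tools. I want to build my own.

I told DALL-E to "generate a watercolor of a computer programmer thinking about all the things he could build with an LLM." Yep, that's pretty spot on.

With any new technology, becoming a builder means starting simple. That's true for any new programming language I'm learning and any new framework I'm checking out. Building with LLMs is no different. So that's what I'll walk through here. I'm going to build a quick and dirty API that interacts with Google Gemini, effectively giving me a little chatbot assistant of my own.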

Here's what we'll do:

1. Briefly introduce Google Gemini.
2. Build a simple Node.js application.
3. Deploy the application to Heroku.
4. Test it.

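What Is Google Gemini?

Most everyday consumers know about ChatGPT, which is built on the GPT-4 LLM. But when it comes to LLMs, GPT-4 isn't the only game in town. There's also Google Gemini (formerly known as Bard). Across most performance benchmarks (such as multi-discipline college-level reasoning problems or Python code generation), Gemini outperforms GPT-4.

What does Gemini say about itself?

As developers, we can access Gemini through the Gemini API in Google AI Studio. There are also SDKs available for Python, JavaScript, Swift, and Android.

All right. Let's get building.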

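Build the Node.js Application

Our Node.js application will be a simple Express API server that functions like a Gemini chatbot. It will listen on two endpoints. First, a POST request to /chat (which includes a JSON payload with a message attribute) will send the message to Gemini and then return the response. Our application will keep a running chat conversation with Gemini. This turns our chatbot into a helpful assistant that can hold onto notes for us.

Second, if we send a POST request to /reset, this will reset the chat conversation to start from scratch, effectively erasing Gemini's memory of its previous interactions with us.

If you want to skip this code walkthrough, you can see all of the code in my GitHub repository.

Initialize the Application

First, we initialize our Node.js application and install the dependencies.

Shell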

~/project$ npm init -y && npm pkg set type="module"
 
~/project$ npm install @google/generative-ai dotenv express

Then, we add this to the scripts section of our package.json file:

JSON

"scripts": {
    "start": "node index.js"
  },

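The index.js File

Our application consists of a single file, and it's pretty simple. We'll walk through it one section at a time.

First, we import all the packages we'll be using. Then, we initialize the SDK from Google AI. We'll use the gemini-pro model. Finally, we call startChat(), which creates a new ChatSession instance for what Google calls a multi-turn conversation.

JavaScript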

import 'dotenv/config';
import express from 'express';
import { GoogleGenerativeAI } from '@google/generative-ai';

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: "gemini-pro"});
let chat = model.startChat();

Next, we instantiate a new Express app, which is our API server.

JavaScript

const app = express();
app.use(express.json())

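Then, we set up our listener for POST requests to the /chat endpoint. We make sure that the JSON payload body includes a message. We use our chat object to send that message to Gemini. Then, we respond to our API caller with the response text from Gemini.

JavaScript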

app.post('/chat', async (req, res) => {
  if ((typeof req.body.message) === 'undefined' || !req.body.message.length) {
    res.status(400).send('"message" missing in request body');
    return;
  }

  const result = await chat.sendMessage(req.body.message);
  const response = await result.response;
  res.status(200).send(response.text());
})
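The handler above assumes the message is a string and that the call to Gemini succeeds. As a minimal sketch of a more defensive variant, here is the same logic pulled into a factory function. The names makeChatHandler and getChat are my own for illustration, and returning a 502 on failure is an assumption, not something the SDK prescribes.

```javascript
// Sketch only: builds an Express-style handler around a chat object.
// `getChat` is a hypothetical accessor, so the handler always sees the
// current session even after it has been replaced.
function makeChatHandler(getChat) {
  return async function chatHandler(req, res) {
    const message = req.body?.message;

    // Reject missing, empty, or non-string messages up front.
    if (typeof message !== 'string' || message.trim().length === 0) {
      res.status(400).send('"message" missing in request body');
      return;
    }

    try {
      const result = await getChat().sendMessage(message);
      const response = await result.response;
      res.status(200).send(response.text());
    } catch (err) {
      // Assumption: any SDK or network failure is reported as 502 Bad Gateway.
      console.error('Gemini request failed:', err);
      res.status(502).send('Upstream request to Gemini failed');
    }
  };
}

// Usage, assuming the app and chat variables from index.js:
//   app.post('/chat', makeChatHandler(() => chat));
```

Because the handler only touches status() and send() on the response, it can be exercised with plain stub objects, without starting a server.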

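Keep in mind that by using a ChatSession, a running history of our interaction with Gemini is kept across all API calls. Giving Gemini a "memory" of our conversation helps it understand context.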
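To make the idea of conversational memory concrete, here is a toy stand-in for a chat session. It is not the SDK's ChatSession: the ToyChatSession class and the countingModel function are invented for illustration, and no request is sent to Gemini. It only shows the general pattern of appending each turn to a history that travels with the next message.

```javascript
// Toy illustration of multi-turn memory. Not the real SDK class.
class ToyChatSession {
  constructor(generateReply) {
    this.generateReply = generateReply; // hypothetical stand-in for the model call
    this.history = [];
  }

  async sendMessage(text) {
    // The "model" sees every earlier turn plus the new user message.
    const turns = [...this.history, { role: 'user', text }];
    const reply = await this.generateReply(turns);
    this.history = [...turns, { role: 'model', text: reply }];
    return reply;
  }
}

// Fake model that just reports how many turns of context it received.
const countingModel = async (turns) => `I can see ${turns.length} turn(s) of context`;

async function demo() {
  let chat = new ToyChatSession(countingModel);
  console.log(await chat.sendMessage('Remember my list'));    // context: 1 turn
  console.log(await chat.sendMessage('What is on my list?')); // context: 3 turns

  // Replacing the session discards the history, so context starts over.
  chat = new ToyChatSession(countingModel);
  console.log(await chat.sendMessage('What is on my list?')); // context: 1 turn
}

demo().catch(console.error);
```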

But what if you want Gemini to start over completely and forget all previous context? For this, we have the /reset endpoint. This simply starts up a new ChatSession.

JavaScript

app.post('/reset', async (req, res) => {
  chat = model.startChat();
  res.status(200).send('OK');
})

Finally, we start up our server to begin listening.

JavaScript

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Server is running on port ${PORT}`)
})

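As a side note, this entire project is just a mini demo. It's not meant to be used in production! The way I've designed it for now (with no authentication), anybody with the URL can send requests to /chat and /reset. In a production setup, we would have proper authentication, and each user would have their own instance of a conversation with Gemini that nobody else could manipulate.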
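One way to picture per-user conversations is a small in-memory store that lazily creates a session for each user. This is only a sketch under assumptions: createSessionStore is a name I made up, the user ID would have to come from an authentication layer that isn't shown, and an in-memory Map is lost on restart and isn't shared between server instances.

```javascript
// Sketch: one chat session per user, created on first use.
// `startChat` is a hypothetical factory, e.g. () => model.startChat().
function createSessionStore(startChat) {
  const sessions = new Map(); // userId -> chat session

  return {
    // Return the user's session, creating it if this is their first request.
    get(userId) {
      if (!sessions.has(userId)) {
        sessions.set(userId, startChat());
      }
      return sessions.get(userId);
    },
    // Replace only this user's session; other users keep their history.
    reset(userId) {
      sessions.set(userId, startChat());
    },
    size() {
      return sessions.size;
    },
  };
}
```

With something like this in place, the /chat handler would look up the caller's session instead of sharing a single chat variable, and /reset would only affect the caller.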

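Obtain a Gemini API Key

At this point, we're almost ready to go. The last thing we need is an API key to access the Gemini API. To get an API key, start by signing up for a Google AI for Developers account.

Once you're logged in, select "Launch Google AI Studio" to start a new Google Gemini project.

Within the project, click "Get API key" to navigate to the API keys page. Then, click "Create API key" to generate a key. Copy the value.

In your project, copy the file called .env.template to a new file called .env. Paste in the value of your Gemini API key. Your .env file should look similar to this: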

GEMINI_API_KEY=ABCDEFGH0123456789_JJJ
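If the key is missing, it's easier to find out at startup than on the first request. Here is a minimal sketch of a fail-fast check. The requireEnv helper is hypothetical and not part of dotenv or the Gemini SDK.

```javascript
// Hypothetical helper: read a required environment variable or fail loudly.
// The `env` parameter defaults to process.env and exists to make testing easy.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage, assuming the imports from index.js:
//   const genAI = new GoogleGenerativeAI(requireEnv('GEMINI_API_KEY'));
```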

Test Our Application Locally

With everything in place, we can spin up our server locally to test it.

~/project$ npm start

> heroku-gemini-node@1.0.0 start
> node index.js

Server is running on port 3000

In a different terminal, we can send some curl requests:

$ curl -X POST -H 'content-type:application/json' \
  --data '{"message":"I would like to bake a shepherds pie to feed 8 people. As you come up with a recipe, please keep a grocery list for me with all of the ingredients that I would need to purchase."}' \
  http://localhost:3000/chat


**Shepherd's Pie Recipe for 8**

**Ingredients:**

**For the Filling:**
* 1 pound ground beef
* 1/2 pound ground lamb
* 2 medium onions, diced
…

**For the Mashed Potatoes:**
* 3 pounds potatoes, peeled and quartered
* 1/2 cup milk
…

**Instructions:**

**For the Filling:**

1. Heat a large skillet over medium heat. Add the ground beef and lamb and cook until browned.
…



$ curl -X POST -H 'content-type:application/json' \
  --data '{"message":"I also need to buy fresh basil, for a different dish (not the shepherds pie). Add that to my grocery list too."}' \
  http://localhost:3000/chat


**Updated Grocery List for Shepherd's Pie for 8, and Fresh Basil:**

* 1 pound ground beef
* 1/2 pound ground lamb
* 2 medium onions
* 2 carrots
* 2 celery stalks
* 1 bag frozen peas
* 1 bag frozen corn
* 1 tablespoon Worcestershire sauce
* 1 teaspoon dried thyme
* 1 cup beef broth
* 1/4 cup tomato paste
* 3 pounds potatoes
* 1/2 cup milk
* 1/4 cup butter
* **Fresh basil**



$ curl -X POST -H 'content-type:application/json' \
  --data '{"message":"What items on my grocery list can I find in the produce section?"}' \
  http://localhost:3000/chat


The following items on your grocery list can be found in the produce section:

* Onions
* Carrots
* Celery
* Potatoes
* Fresh basil



$ curl -X POST http://localhost:3000/reset


OK



$ curl -X POST -H 'content-type:application/json' \
  --data '{"message":"What items are on my grocery list?"}' \
  http://localhost:3000/chat


I do not have access to your grocery list, so I cannot give you the items on it.
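The same requests can also be sent from a small Node.js script instead of curl. This is a rough sketch that assumes Node 18 or later (for the global fetch) and a server like the one above listening at the base URL. The buildChatRequest and sendChat names are my own. Using JSON.stringify means quotes and line breaks in the message are escaped for us.

```javascript
// Build the fetch options for a POST to /chat with a JSON body.
function buildChatRequest(message) {
  return {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ message }),
  };
}

// Send one message and return the reply text.
// `fetchImpl` defaults to the global fetch and can be swapped out in tests.
async function sendChat(baseUrl, message, fetchImpl = fetch) {
  const res = await fetchImpl(`${baseUrl}/chat`, buildChatRequest(message));
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  return res.text();
}

// Usage, assuming the server is running locally:
//   sendChat('http://localhost:3000', 'What items are on my grocery list?')
//     .then(console.log)
//     .catch(console.error);
```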

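It's working. Looks like we're ready to deploy!

Deploy Our Application to Heroku

To deploy our application, I've opted to go with Heroku. It's quick, simple, and low-cost. I can get my code running in the cloud with just a few simple steps, without getting bogged down in all of the nitty-gritty infrastructure concerns. This way, I can just focus on building cool applications.

1. Add a Procfile to the Codebase

We need to include a file called Procfile that tells Heroku how to start up our application. The contents of Procfile look like this: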

web: npm start

We commit this file to our codebase repo.

2. Log in to Heroku (via the CLI)

~/project$ heroku login

3. Create the App

~/project$ heroku create gemini-chatbot


Creating ⬢ gemini-chatbot... done
https://gemini-chatbot-1933c7b1f717.herokuapp.com/ | https://git.heroku.com/gemini-chatbot.git

4. Add the Gemini API Key as a Config Environment Variable

~/project$ heroku config:add \
  --app gemini-chatbot \
  GEMINI_API_KEY=ABCDEFGH0123456789_JJJ


Setting GEMINI_API_KEY and restarting ⬢ gemini-chatbot... done, v3
GEMINI_API_KEY: ABCDEFGH0123456789_JJJ

5. Push the Code to the Heroku Remote

~/project$ git push heroku main

...
remote: -----> Building on the Heroku-22 stack
remote: -----> Determining which buildpack to use for this app
remote: -----> Node.js app detected
...
remote: -----> Build succeeded!
remote: -----> Discovering process types
remote:        Procfile declares types -> web
remote: 
remote: -----> Compressing...
remote:        Done: 45.4M
remote: -----> Launching...
remote:        Released v4
remote:        https://gemini-chatbot-1933c7b1f717.herokuapp.com/ deployed to Heroku

That's it? That's it.

Test Our Deployed Application

With our application deployed, let's send some curl requests to our Heroku app URL.

$ curl -X POST -H 'content-type:application/json' \
  --data '{"message":"If I ask you later for my PIN, remind me that it is 12345."}' \
  https://gemini-chatbot-1933c7b1f717.herokuapp.com/chat

Sure, if you ask me for your PIN later, I will remind you that it is 12345.

**Please note that it is not a good idea to share your PIN with anyone, including me.**
Your PIN is a secret code that should only be known to you.
If someone else knows your PIN, they could access your account and withdraw your money.



$ curl -X POST -H 'content-type:application/json' \
  --data '{"message":"What is my PIN?"}' \
  https://gemini-chatbot-1933c7b1f717.herokuapp.com/chat

Your PIN is 12345.



$ curl -X POST https://gemini-chatbot-1933c7b1f717.herokuapp.com/reset

OK



$ curl -X POST -H 'content-type:application/json' \
  --data '{"message":"What is my PIN?"}' \
  https://gemini-chatbot-1933c7b1f717.herokuapp.com/chat

Unfortunately, I am unable to provide your personal PIN
as I do not have access to your private information. 

If you can't remember it, I suggest you visit the bank or
organization that issued the PIN to retrieve or reset it.

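Now Is a Great Time to Build LLM-Based Applications. Ride the Wave!

We've walked through how to build a simple LLM-based application on top of Google Gemini. Our simple chatbot assistant is basic, but it's a great way to get familiar with the Gemini API and its associated SDKs. And by using Heroku for deployment, you can offload the secondary concerns so that you can focus on learning and building where it counts.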