Geek Time: Trying Out LLMs and Embeddings on Nvidia NIM with Node.js

Author: 新加坡内哥谈技术

Published 2024-07-15 00:30:00

Tags: artificial intelligence, language models, computer vision, natural language processing

Article link: https://blog.csdn.net/2301_79342058/article/details/140421339

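Nvidia NIM launched quite a while ago, but I had not seen it used in practice. Over the past few weeks I have been experimenting with the OpenAI API and local LLMs. Out of curiosity, I wanted to see how Nvidia NIM works, so this weekend I tried calling a Llama model and an embedding model on Nvidia NIM from Node.js.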

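Step 1: Sign up for an Nvidia NIM account

Signing up for an Nvidia NIM account is simple, and it is free for developers.

Step 2: Pick an LLM under "Models" and generate test code

For this test I chose the "llama3-70b-instruct" model. The model page shows a console where you can generate an API key.

For LLM integration, NIM also generates Node.js code that you can run on your local machine.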

// Llama model script (appNIM.js)
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'API Key',
  baseURL: 'https://integrate.api.nvidia.com/v1',
});

async function main() {
  try {
    const completion = await openai.chat.completions.create({
      model: "meta/llama3-70b-instruct", // plain ASCII hyphen; an en dash here makes the model ID invalid
      messages: [{"role":"user","content":"who are you?"}],
      temperature: 0.5,
      top_p: 1,
      max_tokens: 1024,
      stream: true,
    });

    for await (const chunk of completion) {
      process.stdout.write(chunk.choices[0]?.delta?.content || '');
    }
  } catch (error) {
    console.error('Error during completion:', error);
  }
}

main();
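
The script above writes each streamed chunk straight to stdout. If you want the whole reply as one string instead (for logging or further processing, say), the chunks can be accumulated. The following is only a minimal sketch: `collectStream` and `fakeStream` are hypothetical names that are not part of the NIM-generated code, and `fakeStream` is a stand-in that mimics the chunk shape used above so the helper can be tried without an API key.

```javascript
// Hypothetical helper: concatenates the delta text of every streamed chunk.
// It reads the same fields as the loop in appNIM.js.
async function collectStream(stream) {
  let text = '';
  for await (const chunk of stream) {
    text += chunk.choices[0]?.delta?.content || '';
  }
  return text;
}

// Stand-in for a streamed completion (an assumption for offline testing),
// yielding objects shaped like the chunks consumed in appNIM.js.
async function* fakeStream() {
  yield { choices: [{ delta: { content: 'Hello' } }] };
  yield { choices: [{ delta: { content: ', world' } }] };
  yield { choices: [{ delta: {} }] }; // a chunk may carry no content
}

collectStream(fakeStream()).then((text) => console.log(text));
```

In the real script you would pass the `completion` object returned by `openai.chat.completions.create` (with `stream: true`) in place of `fakeStream()`.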

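Step 3: Run the script

Run "npm start" and you will see the response from the Llama model on NIM. This assumes the openai package is installed and that your package.json defines a start script that runs appNIM.js and sets "type" to "module", so that the import statement works.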

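Step 4: Pick an embedding model under "Retrieval"

This step is similar to choosing an LLM. NVIDIA's embed-qa-4 is a GPU-accelerated model designed to generate text embeddings for QA retrieval tasks. It is part of NVIDIA's retrieval-augmented generation (RAG) applications and is currently in preview.

Unlike the LLM integration, NIM does not generate Node.js code here, so you have to write it yourself.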
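
To make "embeddings for QA retrieval" more concrete: in a typical setup, the query and each candidate passage are embedded as vectors, and passages are ranked by how close their vectors are to the query vector. The sketch below uses cosine similarity, a common choice for this, though the source does not say which metric the model is tuned for. `cosineSimilarity` and `rankPassages` are hypothetical helpers, and the three-dimensional vectors are made-up toy values, not real model output.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: sorts passages from most to least similar to the query.
function rankPassages(queryEmbedding, passages) {
  return passages
    .map((p) => ({ text: p.text, score: cosineSimilarity(queryEmbedding, p.embedding) }))
    .sort((x, y) => y.score - x.score);
}

// Toy vectors for illustration only; real embeddings have many more dimensions.
const queryEmbedding = [1, 0, 0];
const passages = [
  { text: 'Berlin is the capital of Germany.', embedding: [0, 1, 0] },
  { text: 'Paris is the capital of France.', embedding: [0.9, 0.1, 0] },
];

console.log(rankPassages(queryEmbedding, passages)[0].text);
// prints "Paris is the capital of France."
```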

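Step 5: Write a script that calls the embedding model

NIM provides the following curl reference for the embedding endpoint. The Node.js script after it sends the same request: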

curl -X POST https://ai.api.nvidia.com/v1/retrieval/nvidia/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer API_Key" \
  -d '{
    "input": ["What is the capital of France?"],
    "model": "NV-Embed-QA",
    "input_type": "query",
    "encoding_format": "float",
    "truncate": "NONE"
  }'

import fetch from 'node-fetch'; // make sure node-fetch is installed

async function main() {
  const url = 'https://ai.api.nvidia.com/v1/retrieval/nvidia/embeddings';
  const apiKey = 'API_key';
  const requestPayload = {
    input: ["What is the capital of France?"],
    model: "NV-Embed-QA",
    input_type: "query",
    encoding_format: "float",
    truncate: "NONE"
  };

  try {
    const response = await fetch(url, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${apiKey}`
      },
      body: JSON.stringify(requestPayload)
    });

    if (!response.ok) {
      const errorText = await response.text();
      throw new Error(`Request failed with status ${response.status}: ${errorText}`);
    }

    const data = await response.json();
    console.log('Response:', data.data[0].embedding);
  } catch (error) {
    console.error('Error:', error.message);
  }
}

main();
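
The script above embeds a single query. For retrieval you would usually also embed the documents being searched. The sketch below factors the request into a small builder so the same code can serve both cases. Treat it as an assumption-laden illustration: `buildEmbeddingRequest` is a hypothetical helper, `NVIDIA_API_KEY` is a hypothetical environment variable name, and the `"passage"` value for `input_type` is my assumption about what the API accepts for documents (the curl reference above only shows `"query"`). It only constructs the request and does not call the network.

```javascript
const EMBEDDINGS_URL = 'https://ai.api.nvidia.com/v1/retrieval/nvidia/embeddings';

// Hypothetical helper: builds the URL and fetch options for an embedding request.
// inputType is "query" as in the curl reference; "passage" is an assumption.
function buildEmbeddingRequest(texts, inputType, apiKey) {
  if (!apiKey) {
    throw new Error('Missing API key');
  }
  return {
    url: EMBEDDINGS_URL,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        input: texts,
        model: 'NV-Embed-QA',
        input_type: inputType,
        encoding_format: 'float',
        truncate: 'NONE',
      }),
    },
  };
}

// Reading the key from an environment variable (hypothetical name) keeps it
// out of source control; 'API_key' is the same placeholder used above.
const demo = buildEmbeddingRequest(
  ['Paris is the capital of France.'],
  'passage',
  process.env.NVIDIA_API_KEY || 'API_key'
);
console.log(demo.url);
console.log(demo.options.body);
```

To send the request, pass the two parts to fetch as in the script above: `await fetch(demo.url, demo.options)`.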