Phi-3CookBook Quick Start Guide: Deploying and Using Models with Azure AI

Article link: https://blog.csdn.net/gitblog_01061/article/details/148578691

Phi-3CookBook is a cookbook for getting started with Phi-3, a family of open AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and next size up across a variety of language, reasoning, coding, and math benchmarks. Project repository: https://gitcode.com/gh_mirrors/ph/Phi-3CookBook

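Introduction

Microsoft's Phi-3 model family has drawn wide attention for its strong performance and ease of use. This article explains how to deploy and call the Phi-3 models through the Azure AI platform, so that developers can get started quickly.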

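Phi-3 Model Family Overview

The Phi-3 family is Microsoft's new generation of AI models. It comes in several sizes and context lengths to suit different scenarios:

Phi-3-medium series:
- Phi-3-medium-128k-instruct, with a 128k context length
- Phi-3-medium-4k-instruct, with a 4k context length
Phi-3-mini series:
- Phi-3-mini-128k-instruct, with a 128k context length
- Phi-3-mini-4k-instruct, with a 4k context length
Phi-3-small series:
- Phi-3-small-128k-instruct, with a 128k context length
- Phi-3-small-8k-instruct, with an 8k context length

These models perform well at instruction following and multi-turn dialogue. Choose the variant whose size and context length fit your needs.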
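As a rough illustration of choosing a variant by context length, the sketch below reads the window size out of the model name and picks the smallest variant in a family that fits a prompt. The helper names `context_window` and `pick_model` are hypothetical, and treating "k" as 1024 tokens is an assumption made for the example, not a figure from the article.

```python
import re

# Model names as listed in the overview above.
MODELS = [
    "Phi-3-medium-128k-instruct",
    "Phi-3-medium-4k-instruct",
    "Phi-3-mini-128k-instruct",
    "Phi-3-mini-4k-instruct",
    "Phi-3-small-128k-instruct",
    "Phi-3-small-8k-instruct",
]


def context_window(model_name: str) -> int:
    """Return the nominal context size in tokens, parsed from the name.

    Assumption: "4k" means 4 * 1024 tokens, and so on.
    """
    match = re.search(r"-(\d+)k-", model_name)
    if match is None:
        raise ValueError(f"no context length in model name: {model_name}")
    return int(match.group(1)) * 1024


def pick_model(family: str, needed_tokens: int) -> str:
    """Pick the smallest-window model in a family that fits needed_tokens."""
    candidates = [m for m in MODELS if f"-{family.lower()}-" in m.lower()]
    fitting = [m for m in candidates if context_window(m) >= needed_tokens]
    if not fitting:
        raise ValueError(f"no {family} model fits {needed_tokens} tokens")
    return min(fitting, key=context_window)


print(pick_model("mini", 3000))    # Phi-3-mini-4k-instruct
print(pick_model("small", 10000))  # Phi-3-small-128k-instruct
```

Preferring the smallest window that fits is only one possible policy; a real choice would also weigh quality, latency, and cost.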

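Environment Setup

1. Get an access credential

Before using the Phi-3 models, create an access token (a GitHub personal access token, as the variable name GITHUB_TOKEN suggests). The token authenticates the API calls that follow, and the examples read it from an environment variable:

# Bash
export GITHUB_TOKEN="<your-token-here>"

# PowerShell
$Env:GITHUB_TOKEN="<your-token-here>"

rem Windows Command Prompt
set GITHUB_TOKEN=<your-token-here>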
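Before making any calls, it can help to confirm that the variable is actually visible to your program. The following is a minimal sketch; `load_token` is a hypothetical helper, and rejecting the literal placeholder text is an extra check added for illustration.

```python
import os
from typing import Mapping


def load_token(env: Mapping[str, str] = os.environ,
               var_name: str = "GITHUB_TOKEN") -> str:
    """Return the token from the environment or fail with a clear message."""
    token = env.get(var_name, "").strip()
    # Treat the unedited placeholder as "not set" (illustrative extra check).
    if not token or token == "<your-token-here>":
        raise RuntimeError(
            f"{var_name} is not set; export it before running the examples."
        )
    return token
```

Calling `load_token()` at startup surfaces a missing token immediately, rather than as an authentication error on the first request.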

2. Install the required dependencies

Install the SDK for your development language:

Python

pip install azure-ai-inference

JavaScript

Create a package.json file that declares the dependencies:

{
  "type": "module",
  "dependencies": {
    "@azure-rest/ai-inference": "latest",
    "@azure/core-auth": "latest",
    "@azure/core-sse": "latest"
  }
}

Then run:

npm install

Basic Usage Examples

Python sample code

Basic question answering

import os
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

endpoint = "https://models.inference.ai.azure.com"
model_name = "Phi-3-small-8k-instruct"
token = os.environ["GITHUB_TOKEN"]

client = ChatCompletionsClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(token),
)

response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="What is the capital of France?"),
    ],
    model=model_name
)

print(response.choices[0].message.content)

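Multi-turn conversation

from azure.ai.inference.models import AssistantMessage, SystemMessage, UserMessage

# Reuses client and model_name from the basic example above.
# Earlier turns are sent back so the model can resolve "What about Spain?".
messages = [
    SystemMessage(content="You are a helpful assistant."),
    UserMessage(content="What is the capital of France?"),
    AssistantMessage(content="The capital of France is Paris."),
    UserMessage(content="What about Spain?"),
]

response = client.complete(messages=messages, model=model_name)
print(response.choices[0].message.content)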
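In a real chat loop the history grows with each turn rather than being written out by hand. The sketch below keeps that bookkeeping in one place. `Conversation` and `complete_fn` are hypothetical names; `complete_fn` stands in for whatever wrapper you write around `client.complete`, and messages are kept as plain role/content dictionaries like those in the REST example.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class Conversation:
    """Accumulates chat history and sends the full history on every turn."""

    def __init__(self,
                 complete_fn: Callable[[List[Message]], str],
                 system_prompt: str = "You are a helpful assistant."):
        # complete_fn is a hypothetical callable: history in, reply text out.
        self._complete = complete_fn
        self.messages: List[Message] = [
            {"role": "system", "content": system_prompt}
        ]

    def ask(self, user_text: str) -> str:
        """Append the user turn, get a reply, and record it as assistant."""
        self.messages.append({"role": "user", "content": user_text})
        reply = self._complete(list(self.messages))
        self.messages.append({"role": "assistant", "content": reply})
        return reply


# Stand-in for a real model call, so the sketch runs without a network.
def echo_model(history: List[Message]) -> str:
    return f"(reply to: {history[-1]['content']})"


chat = Conversation(echo_model)
print(chat.ask("What is the capital of France?"))
print(chat.ask("What about Spain?"))
```

Injecting the completion function keeps the history logic independent of the SDK, which also makes it easy to exercise without calling the service.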

Streaming output

response = client.complete(
    stream=True,
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Give me 5 good reasons why I should exercise every day."),
    ],
    model=model_name,
)

for update in response:
    if update.choices:
        print(update.choices[0].delta.content or "", end="")

JavaScript sample code

Basic call

import ModelClient from "@azure-rest/ai-inference";
import { AzureKeyCredential } from "@azure/core-auth";

const token = process.env["GITHUB_TOKEN"];
const endpoint = "https://models.inference.ai.azure.com";
const modelName = "Phi-3-small-8k-instruct";

const client = new ModelClient(endpoint, new AzureKeyCredential(token));

const response = await client.path("/chat/completions").post({
    body: {
      messages: [
        { role:"system", content: "You are a helpful assistant." },
        { role:"user", content: "What is the capital of France?" }
      ],
      model: modelName
    }
});
console.log(response.body.choices[0].message.content);

REST API example

Basic call

curl -X POST "https://models.inference.ai.azure.com/chat/completions" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $GITHUB_TOKEN" \
    -d '{
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is the capital of France?"}
        ],
        "model": "Phi-3-small-8k-instruct"
    }'
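The same request can be built with nothing but the Python standard library, which is handy where installing the SDK is not an option. The sketch below mirrors the curl call above. `build_chat_request` and `send` are hypothetical helper names, and the response shape read in `send` is assumed to match the `choices[0].message.content` path used in the JavaScript example.

```python
import json
import urllib.request

ENDPOINT = "https://models.inference.ai.azure.com/chat/completions"


def build_chat_request(token: str, user_text: str,
                       model: str = "Phi-3-small-8k-instruct"
                       ) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_text},
        ],
        "model": model,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the first choice's text.

    Assumes the response body has choices[0].message.content.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

A call would look like `send(build_chat_request(token, "What is the capital of France?"))`, where `token` is the value of GITHUB_TOKEN.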

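Advanced Tips

Parameter tuning:
- temperature: controls the randomness of the output (0-2; lower values are more deterministic, and the accepted range may vary by model and endpoint)
- max_tokens: caps the length of the response, in tokens
- top_p: nucleus sampling; restricts sampling to the most probable tokens whose cumulative probability reaches top_p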
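These three settings are passed as keyword arguments to `client.complete`. A small validating helper can catch out-of-range values before a request is sent. This is a sketch: `sampling_options` is a hypothetical name, and the bounds it checks (0-2 for temperature, as stated above, and 0-1 for top_p) are assumptions that the service may not enforce in the same way.

```python
from typing import Any, Dict, Optional


def sampling_options(temperature: Optional[float] = None,
                     max_tokens: Optional[int] = None,
                     top_p: Optional[float] = None) -> Dict[str, Any]:
    """Return only the options that were set, after basic range checks."""
    options: Dict[str, Any] = {}
    if temperature is not None:
        if not 0 <= temperature <= 2:  # range as given in the list above
            raise ValueError("temperature must be between 0 and 2")
        options["temperature"] = temperature
    if max_tokens is not None:
        if max_tokens < 1:
            raise ValueError("max_tokens must be a positive integer")
        options["max_tokens"] = max_tokens
    if top_p is not None:
        if not 0 <= top_p <= 1:  # assumed bound for a probability mass
            raise ValueError("top_p must be between 0 and 1")
        options["top_p"] = top_p
    return options


print(sampling_options(temperature=0.2, max_tokens=256))
# {'temperature': 0.2, 'max_tokens': 256}
```

The result can be splatted into a call, for example `client.complete(messages=messages, model=model_name, **sampling_options(temperature=0.2, max_tokens=256))`.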

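Error handling:

from azure.core.exceptions import HttpResponseError

try:
    response = client.complete(messages=messages, model=model_name)
except HttpResponseError as e:
    # Raised for non-success HTTP responses (bad token, rate limit, bad request)
    print(f"Request failed (status {e.status_code}): {e.message}")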
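For transient failures such as rate limiting, a retry with exponential backoff is a common complement to plain error handling. The sketch below is generic: `call_with_retry` is a hypothetical helper, the attempt count and delays are arbitrary illustrative values, and which exceptions are worth retrying is for you to decide.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(call: Callable[[], T],
                    attempts: int = 3,
                    base_delay: float = 1.0,
                    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Run call(); on a listed exception, wait and retry with doubling delay.

    The final failure is re-raised so the caller still sees the error.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

With the SDK this might be used as `call_with_retry(lambda: client.complete(messages=messages, model=model_name), retry_on=(HttpResponseError,))`, retrying only HTTP-level failures.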

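Performance optimization:
- For long conversations, trim older messages periodically so the history stays within the model's context length
- Use streaming output so users see text as it is generated
- Set sensible timeouts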
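One simple way to keep a long conversation bounded is to retain the system prompt plus only the most recent turns. The sketch below works on role/content dictionaries. `trim_history` is a hypothetical helper, and counting messages rather than tokens is a simplification; a token-based budget would track the context limit more closely.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_history(messages: List[Message],
                 max_turn_messages: int = 6) -> List[Message]:
    """Keep system messages plus the last max_turn_messages other messages."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    kept = rest[-max_turn_messages:] if max_turn_messages > 0 else []
    # Drop leading replies so the trimmed history starts with a user turn.
    while kept and kept[0]["role"] != "user":
        kept = kept[1:]
    return system + kept


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "u1"},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "u2"},
    {"role": "assistant", "content": "a2"},
    {"role": "user", "content": "u3"},
]
print([m["content"] for m in trim_history(history, max_turn_messages=3)])
# ['You are a helpful assistant.', 'u2', 'a2', 'u3']
```

Dropping old turns loses information the model may still need; summarizing them into a single message is an alternative when that matters.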

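Limits and Caveats

Free-tier limits:
- Requests per minute
- Total requests per day
- Tokens per request
- Concurrent requests
Content safety:
- All requests pass through Azure AI content safety filtering
- Content filtering cannot be turned off in the free tier
Production recommendations:
- The free tier is suitable only for prototyping
- For production, use a paid Azure service
- Comply with Microsoft's terms of service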
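To stay under a requests-per-minute limit, a client can track its own recent calls and hold back when the budget is used up. The following is a minimal sliding-window sketch. `RequestBudget` is a hypothetical class, and the limit you pass in is a placeholder, since the article does not state the actual quota values.

```python
import time
from collections import deque
from typing import Callable, Deque


class RequestBudget:
    """Allows at most max_requests calls within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self._clock = clock
        self._stamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a request and return True, or return False if over budget."""
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_requests:
            return False
        self._stamps.append(now)
        return True
```

A caller would check `budget.try_acquire()` before each request and wait or queue the work when it returns False. The service still enforces its own limits, so this only reduces how often you run into them.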

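Conclusion

This article covered the basics of using the Phi-3 models: environment setup, basic calls in Python, JavaScript, and REST, and a few advanced tips. With these in hand you can start building AI applications quickly and then explore how the Phi-3 models behave in your own scenarios.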

Disclosure: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.