Semantic Kernel: Image to Text

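Multimodal input is now a standard capability of mainstream LLMs, and images are among the most common carriers of information. GPT has been able to interpret images for some time, and the results have improved with each model version. Semantic Kernel (SK) has also supported image-to-text for a while, although uploading local images was only added in recent versions (a little late).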
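One common way to send a local image to a vision-capable chat API is to embed it inline as a base64 data URI. The snippet below is only a sketch of that encoding, on the assumption that something similar happens when image bytes are passed to SK. `ToDataUri` is a hypothetical helper introduced for illustration; it is not part of Semantic Kernel and is not presented as its actual implementation.

```csharp
using System;

// Hypothetical helper (not an SK API): encode raw image bytes as a data URI,
// i.e. "data:<mime type>;base64,<base64 payload>".
static string ToDataUri(byte[] imageBytes, string mimeType)
{
    return $"data:{mimeType};base64,{Convert.ToBase64String(imageBytes)}";
}

// Three sample bytes stand in for real image data.
var sample = new byte[] { 1, 2, 3 };
Console.WriteLine(ToDataUri(sample, "image/png")); // prints data:image/png;base64,AQID
```

In the examples that follow, this step is not needed by hand: the image bytes are passed straight to `ImageContent`.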

Image scene recognition:

using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;


var chatModelId = "gpt-4o";
// Trim so that a trailing newline in the key file does not end up in the API key.
var key = File.ReadAllText(@"C:\GPT\key.txt").Trim();
#pragma warning disable SKEXP0070
#pragma warning disable SKEXP0010
#pragma warning disable SKEXP0001
#pragma warning disable SKEXP0110
var kernel = Kernel.CreateBuilder()
   .AddOpenAIChatCompletion(chatModelId, key)
   .Build();


var chat = kernel.GetRequiredService<IChatCompletionService>();
var chatHistory = new ChatHistory();
chatHistory.AddUserMessage(new ChatMessageContentItemCollection
{
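     // Prompt (Chinese): "Where is this, what is the weather like, what is everyone doing, and how many people are there in total?"
     new TextContent("请说明这是哪里,什么样的天气,大家在干什么?一共有多少人"),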
     new ImageContent(File.ReadAllBytes("tam.jpg"),"image/jpeg")
});
var settings = new Dictionary<string, object>
{
    ["max_tokens"] = 1000,
    ["temperature"] = 0.2,
    ["top_p"] = 0.8,
    ["presence_penalty"] = 0.0,
    ["frequency_penalty"] = 0.0
};


var content = chat.GetStreamingChatMessageContentsAsync(chatHistory, new PromptExecutionSettings
{
    ExtensionData = settings
});
await foreach (var item in content)
{
    Console.Write(item.Content);
}
Console.ReadLine();

Image:

3438491bc42d2a824ba4918b9fb73c4a.png

Result:

64300db6178656d40156242df312663f.png

Text recognition:

using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;


var chatModelId = "gpt-4o";
// Trim so that a trailing newline in the key file does not end up in the API key.
var key = File.ReadAllText(@"C:\GPT\key.txt").Trim();
#pragma warning disable SKEXP0070
#pragma warning disable SKEXP0010
#pragma warning disable SKEXP0001
#pragma warning disable SKEXP0110
var kernel = Kernel.CreateBuilder()
   .AddOpenAIChatCompletion(chatModelId, key)
   .Build();


var chat = kernel.GetRequiredService<IChatCompletionService>();
var chatHistory = new ChatHistory();
chatHistory.AddUserMessage(new ChatMessageContentItemCollection
{
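     // Prompt (Chinese): "Please recognize the text in the image and output it."
     new TextContent("请识别图片上的文字,并输出"),
     // The file is a PNG, so the MIME type must be image/png rather than image/jpeg.
     new ImageContent(File.ReadAllBytes("japancard.png"),"image/png")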
});
var settings = new Dictionary<string, object>
{
    ["max_tokens"] = 1000,
    ["temperature"] = 0.2,
    ["top_p"] = 0.8,
    ["presence_penalty"] = 0.0,
    ["frequency_penalty"] = 0.0
};


var content = chat.GetStreamingChatMessageContentsAsync(chatHistory, new PromptExecutionSettings
{
    ExtensionData = settings
});
await foreach (var item in content)
{
    Console.Write(item.Content);
}
Console.ReadLine();
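The second argument of `ImageContent` is the MIME type, and it should describe the actual file format. Rather than hard-coding it at each call site, it can be derived from the file extension. The following is a minimal sketch; `GuessImageMimeType` is a hypothetical helper (not part of Semantic Kernel), and the list of extensions is an assumption that covers only a few common formats.

```csharp
using System;
using System.IO;

// Hypothetical helper (not an SK API): map a file extension to an image MIME type.
static string GuessImageMimeType(string path)
{
    var ext = Path.GetExtension(path).ToLowerInvariant();
    return ext switch
    {
        ".jpg" or ".jpeg" => "image/jpeg",
        ".png" => "image/png",
        ".gif" => "image/gif",
        ".webp" => "image/webp",
        // Fail loudly instead of silently sending a wrong MIME type.
        _ => throw new NotSupportedException($"Unsupported image extension: '{ext}'")
    };
}

Console.WriteLine(GuessImageMimeType("tam.jpg"));       // prints image/jpeg
Console.WriteLine(GuessImageMimeType("japancard.png")); // prints image/png
```

With such a helper, the image could be attached as `new ImageContent(File.ReadAllBytes(path), GuessImageMimeType(path))`, which keeps the declared type in step with the file name.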

Image:

640fafaebb02c336859079a7e8bb073d.png

Result:

9f8a779e6a610588d19289720f75a0e5.png
