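txtai API Gallery
This tutorial series covers the main use cases of txtai, an AI-powered semantic search platform. Each chapter has associated code that can also be run in Colab.
Colab link
The txtai API is a web-based service backed by FastAPI. All txtai functionality, including similarity search, extractive QA and zero-shot labeling, is available through the API.
This article installs the txtai API and shows an example using each language binding that txtai supports.
Install dependencies
Install txtai and all dependencies. Since this article uses the API, we need to install the api extras package.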
pip install txtai[api]
Python
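The first method we'll try is direct access through Python. We'll use zero-shot labeling for all the examples here. See this article for more details on zero-shot classification.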
import os
from IPython.display import display, HTML
from txtai.pipeline import Labels
def table(rows):
    # Build an HTML table of (text, label) rows and render it in the notebook
    html = """
    <style type='text/css'>
    @import url('https://fonts.googleapis.com/css?family=Oswald&display=swap');
    table {
      border-collapse: collapse;
      width: 900px;
    }
    th, td {
      border: 1px solid #9e9e9e;
      padding: 10px;
      font: 20px Oswald;
    }
    </style>
    """
    html += "<table><thead><tr><th>Text</th><th>Label</th></tr></thead>"
    for text, label in rows:
        html += "<tr><td>%s</td><td>%s</td></tr>" % (text, label)
    html += "</table>"
    display(HTML(html))
# Create labels model
labels = Labels()
Apply labels to text
data = ["Wears a red suit and says ho ho",
"Pulls a flying sleigh",
"This is cut down and decorated",
"Santa puts these under the tree",
"Best way to spend the holidays"]
# List of labels
tags = ["🎅 Santa Clause", "🦌 Reindeer", "🍪 Cookies", "🎄 Christmas Tree", "🎁 Gifts", "👪 Family"]
# Render output to table
table([(text, tags[labels(text, tags)[0][0]]) for text in data])
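Text | Label |
---|---|
Wears a red suit and says ho ho | 🎅 Santa Clause |
Pulls a flying sleigh | 🦌 Reindeer |
This is cut down and decorated | 🎄 Christmas Tree |
Santa puts these under the tree | 🎁 Gifts |
Best way to spend the holidays | 👪 Family |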
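Once again we see the power of zero-shot labeling. The model wasn't trained on any data specific to this example. It's still surprising how much knowledge is stored in large NLP models.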
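The expression tags[labels(text, tags)[0][0]] in the code above is compact. Its indexing implies that the pipeline returns (label index, score) pairs with the best match first. The sketch below spells that out using invented scores in place of real model output; the helper name top_tag is hypothetical and not part of txtai.

```python
def top_tag(results, tags):
    """Return the tag named by the first (index, score) pair in results."""
    index, _score = results[0]
    return tags[index]


tags = ["🎅 Santa Clause", "🦌 Reindeer", "🍪 Cookies"]

# Hypothetical output for "Pulls a flying sleigh"; the scores are invented for illustration
fake_results = [(1, 0.91), (0, 0.06), (2, 0.03)]

print(top_tag(fake_results, tags))
```

Keeping the index-to-tag lookup in one small function makes it easier to add a score threshold later, if low-confidence labels need to be filtered out.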
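Start an API instance
Now we'll start an API instance to run the remaining examples. The API needs a configuration file to run. The example below is simplified to include only labeling. See this link for a more detailed configuration example.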
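The configuration file itself is not shown here. As a minimal sketch, the snippet below writes a file named index.yml (the name used by the startup command). It assumes that an empty labels section is enough to enable the labeling pipeline with default settings; that content is an assumption, not confirmed by this article, and write_config is a hypothetical helper.

```python
from pathlib import Path

# Assumption: an empty "labels" section enables the labels pipeline with defaults
CONFIG_TEXT = "# Labels settings\nlabels:\n"


def write_config(path="index.yml"):
    """Write the minimal configuration and return the path that was written."""
    target = Path(path)
    target.write_text(CONFIG_TEXT, encoding="utf-8")
    return target
```

Calling write_config() with no arguments creates index.yml in the current directory, which is where the CONFIG variable in the startup command expects to find it.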
The API instance is started in the background.
CONFIG=index.yml nohup uvicorn "txtai.api:app" &> api.log & sleep 90
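The fixed sleep 90 gives the server time to load its model. As an alternative sketch, a script could poll until the port accepts connections. The helper name wait_for_port is hypothetical, and the host and port are assumed to match the http://localhost:8000 address used by the examples in this article. An open port is a reasonable signal that the server is listening, though not a guarantee that every pipeline has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout=90.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out; wait briefly and retry
            time.sleep(interval)
    return False
```

A call such as wait_for_port("localhost", 8000) returns True as soon as the port opens, so a fast machine does not wait the full 90 seconds.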
JavaScript
txtai.js is available through NPM and can be installed as follows.
npm install txtai
For this example, we'll clone the txtai.js project to import the example build configuration.
git clone https://github.com/neuml/txtai.js
Create labels.js
The following file is a JavaScript version of the labels example.
import {Labels} from "txtai";
import {sprintf} from "sprintf-js";
const run = async () => {
    try {
        let labels = new Labels("http://localhost:8000");
        let data = ["Wears a red suit and says ho ho",
                    "Pulls a flying sleigh",
                    "This is cut down and decorated",
                    "Santa puts these under the tree",
                    "Best way to spend the holidays"];
        // List of labels
        let tags = ["🎅 Santa Clause", "🦌 Reindeer", "🍪 Cookies", "🎄 Christmas Tree", "🎁 Gifts", "👪 Family"];
        console.log(sprintf("%-40s %s", "Text", "Label"));
        console.log("-".repeat(75));
        for (let text of data) {
            let label = await labels.label(text, tags);
            label = tags[label[0].id];
            console.log(sprintf("%-40s %s", text, label));
        }
    }
    catch (e) {
        console.trace(e);
    }
};
run();
Build and run the labels example
cd txtai.js/examples/node
npm install
npm run build
node dist/labels.js
Text Label
---------------------------------------------------------------------------
Wears a red suit and says ho ho 🎅 Santa Clause
Pulls a flying sleigh 🦌 Reindeer
This is cut down and decorated 🎄 Christmas Tree
Santa puts these under the tree 🎁 Gifts
Best way to spend the holidays 👪 Family
The JavaScript program shows the same results as running natively through Python!
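The results match because each binding is constructed with the API's URL and sends its requests to the same service. As a rough sketch of what such a request might look like from plain Python, the snippet below builds one with the standard library. The /label path and the text and labels body keys are assumptions about the service interface, and build_label_request and send are hypothetical helpers rather than part of any txtai binding.

```python
import json
import urllib.request


def build_label_request(base_url, text, tags):
    """Build a POST request carrying the text and candidate labels as JSON.

    Assumption: the service accepts POST /label with "text" and "labels" keys.
    """
    body = json.dumps({"text": text, "labels": tags}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/label",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and parse the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With an API instance running, send(build_label_request("http://localhost:8000", text, tags)) would be expected to return a list whose first entry carries the id of the best label, mirroring the label[0].id access in the JavaScript example.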
Java
txtai.java integrates with standard Java build tools (Gradle, Maven, SBT). The following shows how to add txtai as a dependency in Gradle.
implementation 'com.github.neuml:txtai.java:v2.0.0'
For this example, we'll clone the txtai.java project to import the example build configuration.
git clone https://github.com/neuml/txtai.java
Create LabelsDemo.java
The following file is a Java version of the labels example.
import java.util.Arrays;
import java.util.List;
import txtai.API.IndexResult;
import txtai.Labels;
public class LabelsDemo {
    public static void main(String[] args) {
        try {
            Labels labels = new Labels("http://localhost:8000");
            List<String> data =
                Arrays.asList("Wears a red suit and says ho ho",
                              "Pulls a flying sleigh",
                              "This is cut down and decorated",
                              "Santa puts these under the tree",
                              "Best way to spend the holidays");
            // List of labels
            List<String> tags = Arrays.asList("🎅 Santa Clause", "🦌 Reindeer", "🍪 Cookies", "🎄 Christmas Tree", "🎁 Gifts", "👪 Family");
            System.out.printf("%-40s %s%n", "Text", "Label");
            System.out.println(new String(new char[75]).replace("\0", "-"));
            for (String text: data) {
                List<IndexResult> label = labels.label(text, tags);
                System.out.printf("%-40s %s%n", text, tags.get(label.get(0).id));
            }
        }
        catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
cd txtai.java/examples
../gradlew -q --console=plain labels 2> /dev/null
Text Label
---------------------------------------------------------------------------
Wears a red suit and says ho ho 🎅 Santa Clause
Pulls a flying sleigh 🦌 Reindeer
This is cut down and decorated 🎄 Christmas Tree
Santa puts these under the tree 🎁 Gifts
Best way to spend the holidays 👪 Family
The Java program shows the same results as running natively through Python!
Rust
txtai.rs is available through crates.io and can be installed by adding the following to your Cargo.toml file.
[dependencies]
txtai = { version = "2.0" }
tokio = { version = "0.2", features = ["full"] }
For this example, we'll clone the txtai.rs project to import the example build configuration. First we need to install Rust.
apt-get install rustc
git clone https://github.com/neuml/txtai.rs
Create labels.rs
The following file is a Rust version of the labels example.
use std::error::Error;
use txtai::labels::Labels;
pub async fn labels() -> Result<(), Box<dyn Error>> {
    let labels = Labels::new("http://localhost:8000");
    let data = ["Wears a red suit and says ho ho",
                "Pulls a flying sleigh",
                "This is cut down and decorated",
                "Santa puts these under the tree",
                "Best way to spend the holidays"];
    println!("{:<40} {}", "Text", "Label");
    println!("{}", "-".repeat(75));
    for text in data.iter() {
        let tags = vec!["🎅 Santa Clause", "🦌 Reindeer", "🍪 Cookies", "🎄 Christmas Tree", "🎁 Gifts", "👪 Family"];
        let label = labels.label(text, &tags).await?[0].id;
        println!("{:<40} {}", text, tags[label]);
    }
    Ok(())
}
Build and run the labels example
cd txtai.rs/examples/demo
cargo build
cargo run labels
Text Label
--------------------------------------------------------------------------------
Wears a red suit and says ho ho 🎅 Santa Clause
Pulls a flying sleigh 🦌 Reindeer
This is cut down and decorated 🎄 Christmas Tree
Santa puts these under the tree 🎁 Gifts
Best way to spend the holidays 👪 Family
The Rust program shows the same results as running natively through Python!
Go
txtai.go can be installed by adding the following import statement. When using modules, txtai.go is installed automatically; otherwise, use go get.
import "github.com/neuml/txtai.go"
For this example, we'll create a standalone labels program. First we need to install Go.
apt install golang-go
go get "github.com/neuml/txtai.go"
Create labels.go
The following file is a Go version of the labels example.
package main
import (
	"fmt"
	"strings"
	"github.com/neuml/txtai.go"
)
func main() {
	labels := txtai.Labels("http://localhost:8000")
	data := []string{"Wears a red suit and says ho ho",
		"Pulls a flying sleigh",
		"This is cut down and decorated",
		"Santa puts these under the tree",
		"Best way to spend the holidays"}
	// List of labels
	tags := []string{"🎅 Santa Clause", "🦌 Reindeer", "🍪 Cookies", "🎄 Christmas Tree", "🎁 Gifts", "👪 Family"}
	fmt.Printf("%-40s %s\n", "Text", "Label")
	fmt.Println(strings.Repeat("-", 75))
	for _, text := range data {
		label := labels.Label(text, tags)
		fmt.Printf("%-40s %s\n", text, tags[label[0].Id])
	}
}
go run labels.go
Text Label
--------------------------------------------------------------------------------
Wears a red suit and says ho ho 🎅 Santa Clause
Pulls a flying sleigh 🦌 Reindeer
This is cut down and decorated 🎄 Christmas Tree
Santa puts these under the tree 🎁 Gifts
Best way to spend the holidays 👪 Family
Reference
https://dev.to/neuml/tutorial-series-on-txtai-ibg