SentenceBERT text-matching training

Latest recommended article published on 2024-06-04 11:20:32

机器玄学实践者

Tags: python, programming languages, natural language processing, bert

Article link: https://blog.csdn.net/weixin_39673686/article/details/130087707

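This article walks through training SentenceBERT for text matching, with emphasis on the label type (int or float), the choice of loss function, and the use of evaluators. The reference code is available in UKPLab's Sentence Transformers GitHub repository.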

Summary generated automatically by CSDN.
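
The script's docstring states that the trained sentence embeddings are compared with cosine similarity. As a minimal sketch of that comparison, independent of the library, the function below computes it for two plain Python lists. The name `cosine_similarity` and the toy vectors are illustrative assumptions and do not appear in the original script.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings" (hypothetical values, for illustration only).
emb_a = [1.0, 2.0, 3.0]
emb_b = [2.0, 4.0, 6.0]     # same direction as emb_a
emb_c = [-1.0, -2.0, -3.0]  # opposite direction to emb_a

sim_same = cosine_similarity(emb_a, emb_b)
sim_opposite = cosine_similarity(emb_a, emb_c)
```

Vectors pointing the same way score close to 1 and opposite vectors close to -1. In practice the library's own `util.cos_sim` would likely be used on the model's tensor outputs instead of a hand-written loop.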

"""
This example trains BERT (or any other transformer model like RoBERTa, DistilBERT etc.) for the STSbenchmark from scratch. It generates sentence embeddings
that can be compared using cosine-similarity to measure the similarity.

Usage:
python training_nli.py

OR
python training_nli.py pretrained_transformer_model_name
"""
# Remote path
# /data/yangjie/yuhang/sentence-transformers/examples/training/sts


from torch.utils.data import DataLoader
import math
from sentence_transformers import SentenceTransformer, LoggingHandler, losses, models, util
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator, BinaryClassificationEvaluator
from sentence_transformers.readers import InputExample
import logging
from datetime import datetime
import sys
import os
import gzip
import csv
import pandas as pd
#### Just some code to print debug information to stdout
logging.basicConfig(format='%(asctime)s - %(message)s',