Reference: the official tutorial, MAUVE - a Hugging Face Space by evaluate-metric
This metric is a wrapper around the official implementation of MAUVE: https://github.com/krishnap25/mauve
So you also need to run pip install mauve-text
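Before loading the metric, it can help to confirm that both packages are importable. The following is a minimal sketch using only the standard library. The helper name missing_packages is hypothetical, and the sketch assumes that mauve-text installs under the import name mauve.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level import names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "evaluate" is the Hugging Face package; "mauve" is assumed to be the
    # import name provided by `pip install mauve-text`.
    missing = missing_packages(["evaluate", "mauve"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("evaluate and mauve are both importable.")
```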
Loading
First, load mauve.
For things to watch out for when calling load, see: this article
from evaluate import load
import pandas as pd
mauve = load('./metrics/mauve.py')
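Preparing the data
Build the data. This metric needs two sets of text: machine-generated (predictions) and human-written (references).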
predictions = ["我有信心完成论文,即使遇到困难也会坚持不懈,同时相信自己能够找到解决问题的方法。",
"当我遇到学习上的挑战时,我会相信自己能找到解决问题的方法,并能保持冷静应对。",
"当家里有突发情况时,我有信心妥善处理,因为我相信自己的解决问题能力。"]
references = ["....",
"...",
"..."]
Computing MAUVE
mauve_results = mauve.compute(predictions=predictions, references=references,
featurize_model_name='/mnt/workspace/model/Qwen2-1.5B',
device_id=0)
print(mauve_results.mauve)
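Before a call like the one above, it can be worth checking both lists, since featurization is the expensive step. This is a minimal sketch: clean_texts is a hypothetical helper (not part of evaluate or mauve) that strips whitespace and drops empty or non-string entries, and it could be applied to predictions and references alike.

```python
def clean_texts(texts):
    """Strip whitespace and drop entries that are empty or not strings."""
    cleaned = []
    for text in texts:
        if isinstance(text, str) and text.strip():
            cleaned.append(text.strip())
    return cleaned


if __name__ == "__main__":
    # Made-up sample input, only to show what gets dropped.
    sample = ["  first generated sentence ", "", "   ", "second generated sentence"]
    print(clean_texts(sample))
```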
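Parameters of mauve.compute
device_id: the ID of the GPU to use; I set it to 0.
featurize_model_name: the model name. The original author only allows GPT-2 models: gpt2, gpt2-medium, gpt2-large, gpt2-xl.
If you want to use a different model, an error is raised. For example, I passed in /mnt/workspace/model/Qwen2-1.5B and got: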
(ques) root@dsw-582560-844c7b5497-wtchp:/mnt/workspace/evaluation# python calculate_mavue.py
Loading tokenizer
Traceback (most recent call last):
  File "/mnt/workspace/evaluation/calculate_mavue.py", line 6, in <module>
    mauve_results = mauve.compute(predictions=predictions, references=references,
  File "/root/anaconda3/envs/ques/lib/python3.9/site-packages/evaluate/module.py", line 467, in compute
    output = self._compute(**inputs, **compute_kwargs)
  File "/root/.cache/huggingface/modules/evaluate_modules/metrics/mauve/b87f25fdd9447734bf0651854506123cb4ecc7820a56cf1eb7b9e41950e9a83b/mauve.py", line 136, in _compute
    out = compute_mauve(
  File "/root/anaconda3/envs/ques/lib/python3.9/site-packages/mauve/compute_mauve.py", line 92, in compute_mauve
    p_features = get_features_from_input(
  File "/root/anaconda3/envs/ques/lib/python3.9/site-packages/mauve/compute_mauve.py", line 172, in get_features_from_input
    TOKENIZER = get_tokenizer(featurize_model_name)
  File "/root/anaconda3/envs/ques/lib/python3.9/site-packages/mauve/utils.py", line 38, in get_tokenizer
    raise ValueError(f'Unknown model: {model_name}')
ValueError: Unknown model: /mnt/workspace/model/Qwen2-1.5B
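The traceback shows that the name is rejected in get_tokenizer, before any model is loaded. The sketch below mimics that check in plain Python so the behavior is easy to see. It assumes the unmodified check is a substring test for 'gpt2' or 'bert', as the patched code later in this post suggests. is_supported_name and check_model_name are hypothetical stand-ins, not functions in the mauve package.

```python
def is_supported_name(model_name):
    """Assumed form of the original check: a plain substring test."""
    return "gpt2" in model_name or "bert" in model_name


def check_model_name(model_name):
    """Raise the same kind of error as in the traceback for unknown names."""
    if not is_supported_name(model_name):
        raise ValueError(f"Unknown model: {model_name}")
    return model_name


if __name__ == "__main__":
    for name in ["gpt2", "gpt2-large", "/mnt/workspace/model/Qwen2-1.5B"]:
        try:
            check_model_name(name)
            print(f"{name}: accepted")
        except ValueError as err:
            print(f"{name}: rejected ({err})")
```

Under that assumption, the test looks only at substrings of the string passed in, so a local path is accepted only when the path itself happens to contain one of them.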
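The fix is as follows:
Go to the file where the error is raised, /root/anaconda3/envs/ques/lib/python3.9/site-packages/mauve/utils.py, and add the model you want in get_model and get_tokenizer. I changed the code as shown below, adding a 'Qwen2' in model_name check while touching the original source as little as possible.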
# Excerpt from mauve/utils.py. AutoModel, AutoTokenizer and get_device_from_arg
# are assumed to be imported or defined earlier in that file.
def get_model(model_name, tokenizer, device_id):
    device = get_device_from_arg(device_id)
    if 'gpt2' in model_name or "bert" in model_name:
        model = AutoModel.from_pretrained(model_name, pad_token_id=tokenizer.eos_token_id).to(device)
        model = model.eval()
    else:
        # Added branch: accept any model name or path containing 'Qwen2'.
        if 'Qwen2' in model_name:
            # The pad token ID is hard-coded to 151643 here.
            model = AutoModel.from_pretrained(model_name, pad_token_id=151643).to(device)
            model = model.eval()
        else:
            raise ValueError(f'Unknown model: {model_name}')
    return model


def get_tokenizer(model_name='gpt2'):
    # Added condition: "Qwen2" in model_name.
    if 'gpt2' in model_name or "bert" in model_name or "Qwen2" in model_name:
        tokenizer = AutoTokenizer.from_pretrained(model_name)
    else:
        raise ValueError(f'Unknown model: {model_name}')
    return tokenizer
In the end, it ran successfully.
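The patch hard-codes pad_token_id=151643 for Qwen2. One possible alternative is to read the ID from the tokenizer, falling back to the end-of-sequence ID the way the gpt2 branch does. This is a minimal sketch and was not part of the run described here: resolve_pad_token_id is a hypothetical helper, and the stand-in objects below only imitate the pad_token_id and eos_token_id attributes of a Hugging Face tokenizer.

```python
from types import SimpleNamespace


def resolve_pad_token_id(tokenizer):
    """Prefer the tokenizer's pad ID, otherwise fall back to its EOS ID."""
    pad_id = getattr(tokenizer, "pad_token_id", None)
    if pad_id is not None:
        return pad_id
    eos_id = getattr(tokenizer, "eos_token_id", None)
    if eos_id is None:
        raise ValueError("Tokenizer defines neither pad_token_id nor eos_token_id")
    return eos_id


if __name__ == "__main__":
    # Stand-in objects with made-up IDs, used here instead of real tokenizers.
    with_pad = SimpleNamespace(pad_token_id=7, eos_token_id=9)
    without_pad = SimpleNamespace(pad_token_id=None, eos_token_id=9)
    print(resolve_pad_token_id(with_pad))
    print(resolve_pad_token_id(without_pad))
```

In the patched get_model, the returned value could be passed as pad_token_id in place of the literal, assuming the tokenizer in use defines at least one of the two IDs.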
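Computing this metric also requires setting some hyperparameters and needs a large amount of input text. This exercise was only a trial run, and further tuning is still needed. For details, see: krishnap25/mauve: Package to compute Mauve, a similarity score between neural text and human text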