Why the loss value is not a suitable confidence metric for LSTM chatbot outputs

When you build a chatbot with an LSTM, the model produces an output for any input, even one that is completely ungrammatical. That output is obviously not always what we want, so how can we tell whether a given output is acceptable? We need an evaluation metric: a confidence score for the model's output.

Could the loss value serve as that metric? The idea is as follows: feed a sentence (string) into the LSTM model and obtain a predicted sentence; treat that prediction as the label, so that (string, label) forms one dialogue pair; feed the pair back into the LSTM in training mode and compute its loss value loss_val; then use loss_val divided by the average training loss as the model's confidence indicator.

To see whether this works, I wrote the test code below. From the test results (figures below): the training loss had minimum 0.0, maximum 0.000322, and mean 2.495e-6. Three in-corpus dialogue tests gave losses of 6.557e-7, 0.0, and 0.219; three out-of-corpus tests gave 2.187e-5, 0.0, and 0.0886. There is no clear boundary between in-corpus and out-of-corpus loss values, so this ratio cannot serve as a confidence indicator for the LSTM model.
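The proposed ratio can be sketched directly from the numbers reported above. This is a minimal illustration, not part of the project's code; the helper name `confidence_ratio` is made up for this sketch:

```python
# Proposed confidence metric: loss_val divided by the mean training loss.
# The loss values below are the ones reported in the text above.
mean_train_loss = 2.495e-6

in_corpus_losses = [6.557e-7, 0.0, 0.219]
out_corpus_losses = [2.187e-5, 0.0, 0.0886]

def confidence_ratio(loss_val, mean_loss=mean_train_loss):
    # A low ratio would suggest the (input, prediction) pair looks like
    # something the model was trained on.
    return loss_val / mean_loss

in_ratios = [confidence_ratio(l) for l in in_corpus_losses]
out_ratios = [confidence_ratio(l) for l in out_corpus_losses]

print("in-corpus ratios:     ", in_ratios)
print("out-of-corpus ratios: ", out_ratios)
# The two ranges overlap (0.219 appears in-corpus while 2.187e-5 appears
# out-of-corpus), so no single threshold separates the two groups.
```

Because the in-corpus and out-of-corpus ranges overlap in both directions, no threshold on this ratio can distinguish familiar inputs from unfamiliar ones, which is exactly the conclusion of the test.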

import pickle
import time

import numpy as np
import tensorflow as tf

# Project-local modules
import relavance
from sequence_to_sequence import SequenceToSequence
from data_utils import batch_flow

def test(params):

    # Load the corpus and the encoder/decoder vocabularies
    x_data, _ = pickle.load(open('pkl/chatbot.pkl', 'rb'))
    ws_encode = pickle.load(open('pkl/ws_encode.pkl', 'rb'))
    ws_decode = pickle.load(open('pkl/ws_decode.pkl', 'rb'))
    relv = pickle.load(open('pkl/relevance.pkl', 'rb'))

    # Print a few corpus samples as a sanity check
    for x in x_data[:5]:
        print(' '.join(x))

    # Run on CPU only
    config = tf.ConfigProto(
        device_count={'CPU': 1, 'GPU': 0},
        allow_soft_placement=True,
        log_device_placement=False
    )

    save_path = './model/s2ss_chatbot_anti.ckpt'

    # Two separate graphs: one for decoding (prediction), one for computing loss
    graph_pred = tf.Graph()
    graph_loss = tf.Graph()

    with graph_pred.as_default():
        model_pred = SequenceToSequence(
            input_vocab_size=len(ws_encode),
            target_vocab_size=len(ws_decode),
            batch_size=1,
            mode='decode',
            beam_width=0,
            max_decode_step=2,
            **params
        )
        init_pred = tf.global_variables_initializer()

    with graph_loss.as_default():
        # Built in the default (training) mode so the loss op is available
        model_loss = SequenceToSequence(
            input_vocab_size=len(ws_encode),
            target_vocab_size=len(ws_decode),
            batch_size=1,
            **params
        )
        init_loss = tf.global_variables_initializer()

    sess_loss = tf.Session(graph=graph_loss, config=config)
    sess_loss.run(init_loss)

    with tf.Session(graph=graph_pred, config=config) as sess:
        sess.run(init_pred)
        # Restore the same checkpoint into both graphs
        model_pred.load(sess, save_path)
        model_loss.load(sess_loss, save_path)

        while True:
            user_text = input('Enter a sentence: ')
            if len(user_text) < 1 or user_text in ('exit', 'quit'):
                sess_loss.close()
                print("test exit.")
                exit(0)

            # Tokenize to characters and build a batch of size 1
            x_test = [list(user_text.lower())]
            bar = batch_flow([x_test], ws_encode, 1)
            x, xl = next(bar)
            # Reverse the input sequence, matching how the encoder was trained
            x = np.flip(x, axis=1)

            # Predict an answer for the input sentence
            pred = model_pred.predict(sess, np.array(x), np.array(xl),
                                      attention=False, ws=None)

            # Feed (input, prediction) back as a training pair and compute
            # the loss only (loss_only=True skips the parameter update)
            loss = model_loss.train(sess_loss, np.array(x), np.array(xl),
                                    pred, np.array([len(pred[0])]),
                                    loss_only=True)
            print("loss:", loss)

            # Un-reverse and detokenize the input for display
            ask = ''.join(ws_encode.inverse_transform(x[0])[::-1])
            print("input:", ask)

            for p in pred:
                if len(p) > 0:
                    ans = ws_decode.inverse_transform(p)
                    print("answer:", ''.join(ans))
                    print("\n")

def main():
    import json

    time.sleep(1)
    print("start test...")
    test(json.load(open('params.json')))

if __name__ == '__main__':
    main()
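One plausible explanation for the overlap is that the label fed to the loss model is the chatbot's own greedy prediction, so the cross-entropy between the model's output distribution and that label tends to be small for any input, in-corpus or not. A toy illustration with random distributions (not the project's model; `cross_entropy` is a made-up helper):

```python
import numpy as np

def cross_entropy(probs, label_ids):
    # probs: [steps, vocab] per-step output distributions
    # label_ids: the token ids used as the label
    return float(-np.mean(np.log([probs[t, i] for t, i in enumerate(label_ids)])))

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))           # 4 decode steps, vocab of 10
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

greedy = probs.argmax(axis=1)               # model's own prediction as label
random_label = rng.integers(0, 10, size=4)  # an unrelated label

print("loss vs own prediction:", cross_entropy(probs, greedy))
print("loss vs random label:  ", cross_entropy(probs, random_label))
```

By construction, scoring the model against its own argmax output can never give a higher loss than scoring it against any other label, which squeezes the loss values toward zero regardless of how familiar the input is.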

Training loss values:

In-corpus dialogue test results:

Out-of-corpus dialogue test results: