AUGMENTED POINTER NETWORK
- process the input: $x = [\langle col\rangle;\ x^c_1; x^c_2; \ldots; x^c_N;\ \langle sql\rangle;\ x^s;\ \langle question\rangle;\ x^q]$
- encode: two-layer bidirectional LSTM; the output is $h_t$
- decode: two-layer unidirectional LSTM; the output is $g_t$
- produce scalar attention scores: $\alpha^{ptr}_{s,t} = W^{ptr}\tanh(U^{ptr} g_s + V^{ptr} h_t)$
- predict the next token: $\mathrm{softmax}(\alpha^{ptr}_{s,t})$ over input positions $t$
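The decode step above can be sketched in a few lines of NumPy (a minimal sketch: the random matrices `U`, `V`, `w`, the toy hidden size, and the input length are stand-ins for the trained parameters $U^{ptr}$, $V^{ptr}$, $W^{ptr}$ and real encoder states):

```python
import numpy as np

def pointer_attention(g_s, H, U, V, w):
    """Score each input position t against the decoder state g_s:
    alpha[t] = w . tanh(U @ g_s + V @ h_t), then softmax over t."""
    scores = np.array([w @ np.tanh(U @ g_s + V @ h_t) for h_t in H])
    exp = np.exp(scores - scores.max())   # numerically stable softmax
    return exp / exp.sum()

rng = np.random.default_rng(0)
d = 4                           # toy hidden size (assumption)
H = rng.normal(size=(6, d))     # encoder states h_t for 6 input tokens
g = rng.normal(size=d)          # decoder state g_s
U, V = rng.normal(size=(d, d)), rng.normal(size=(d, d))
w = rng.normal(size=d)

p = pointer_attention(g, H, U, V, w)
next_token = int(p.argmax())    # the decoder copies this input position
```

Because the output vocabulary is exactly the input sequence (column names, SQL keywords, question words), the network can only emit tokens that are legal in the query.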
SEQ2SQL
- First, the network classifies an aggregation operation for the query, with the addition of a null operation that corresponds to no aggregation.
- Next, the network points to a column in the input table corresponding to the SELECT column.
- Finally, the network generates the conditions for the query using a pointer network.
$Loss = L^{agg} + L^{sel} + L^{whe}$
Aggregation Operation
- compute scalar attention: $\alpha^{inp}_t = W^{inp} h^{enc}_t$
- softmax: $\beta^{inp} = \mathrm{softmax}(\alpha^{inp})$
- compute the input representation: $\kappa^{agg} = \sum_t \beta^{inp}_t h^{enc}_t$
- multi-layer perceptron: $\alpha^{agg} = W^{agg}\tanh(V^{agg}\kappa^{agg} + b^{agg}) + c^{agg}$
- softmax
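The four steps above fit in one function; a minimal NumPy sketch, with random stand-ins for the trained weights and an assumed set of six aggregation classes (NULL, COUNT, MIN, MAX, SUM, AVG):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def aggregation_probs(H_enc, W_inp, W_agg, V_agg, b_agg, c_agg):
    """Attention-pool the encoder states into kappa_agg, then score
    the aggregation operations with a small MLP."""
    alpha_inp = H_enc @ W_inp       # one scalar score per position t
    beta_inp = softmax(alpha_inp)   # attention weights over the input
    kappa_agg = beta_inp @ H_enc    # weighted sum of the h_t^enc
    return softmax(W_agg @ np.tanh(V_agg @ kappa_agg + b_agg) + c_agg)

rng = np.random.default_rng(1)
d, n_ops = 4, 6                     # toy sizes (assumption)
H = rng.normal(size=(7, d))         # encoder states for 7 input tokens
p = aggregation_probs(H,
                      rng.normal(size=d),
                      rng.normal(size=(n_ops, d)),
                      rng.normal(size=(d, d)),
                      rng.normal(size=d),
                      rng.normal(size=n_ops))
agg_op = int(p.argmax())            # index of the predicted aggregation
```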
SELECT Column
- encode each column name with an LSTM: $e^c_j$
- compute $\kappa^{sel}$: same method as for $\kappa^{agg}$
- multi-layer perceptron: $\alpha^{sel}_j = W^{sel}\tanh(V^{sel}\kappa^{sel} + V^c e^c_j)$
- softmax
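The column-scoring step can be sketched the same way (a minimal NumPy sketch; the column embeddings and weights are random stand-ins for the LSTM-encoded column names $e^c_j$ and the trained $W^{sel}$, $V^{sel}$, $V^c$):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def select_column_probs(E_col, kappa_sel, W_sel, V_sel, V_c):
    """Score each column embedding e_j^c against the pooled question
    representation kappa_sel; the softmax gives P(SELECT column = j)."""
    scores = np.array([W_sel @ np.tanh(V_sel @ kappa_sel + V_c @ e_j)
                       for e_j in E_col])
    return softmax(scores)

rng = np.random.default_rng(2)
d, n_cols = 4, 5                    # toy sizes (assumption)
E = rng.normal(size=(n_cols, d))    # one encoded vector per column name
p = select_column_probs(E,
                        rng.normal(size=d),
                        rng.normal(size=d),
                        rng.normal(size=(d, d)),
                        rng.normal(size=(d, d)))
sel_col = int(p.argmax())           # index of the predicted SELECT column
```

Scoring columns by their embeddings, rather than by vocabulary index, means the same network works for tables it has never seen.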
WHERE Clause
- generate the WHERE clause with the Augmented Pointer Network above
- apply RL: WHERE conditions can appear in any order, so instead of pure teacher forcing the decoder is trained with a policy gradient whose reward comes from executing the generated query against the database
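A minimal sketch of the execution-based reward and the REINFORCE-style loss (the reward values $-2$/$-1$/$+1$ follow the Seq2SQL paper; `token_log_probs` is a hypothetical list of log-probabilities of the sampled WHERE tokens):

```python
def execution_reward(pred_result, gold_result, valid):
    """Reward for a sampled WHERE clause: -2 if the query is invalid SQL,
    -1 if it runs but returns the wrong result, +1 if the result matches."""
    if not valid:
        return -2.0
    return 1.0 if pred_result == gold_result else -1.0

def rl_loss(token_log_probs, reward):
    """REINFORCE: scale the negative log-likelihood of the sampled
    tokens by the reward, so correct queries are reinforced."""
    return -reward * sum(token_log_probs)
```

Minimizing this loss increases the probability of token sequences that execute to the correct result, even when they differ token-by-token from the ground-truth query.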