AUGMENTED POINTER NETWORK
- process the input: $x = [\langle col\rangle;\ x^c_1; x^c_2; \ldots; x^c_N;\ \langle sql\rangle;\ x^s;\ \langle question\rangle;\ x^q]$
- encode: two-layer bidirectional LSTM; the output is $h_t$
- decode: two-layer unidirectional LSTM; the output is $g_t$
- produce scalar attention scores: $\alpha^{ptr}_{s,t} = W^{ptr}\tanh(U^{ptr} g_s + V^{ptr} h_t)$
- predict the next token: $\mathrm{softmax}(\alpha^{ptr}_{s,t})$ over input positions $t$
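The decode step above can be sketched in a few lines of NumPy (a minimal sketch: the random matrices `U`, `V`, `w`, the toy hidden size, and the input length are stand-ins for the trained parameters $U^{ptr}$, $V^{ptr}$, $W^{ptr}$ and real encoder states):

```python
import numpy as np

def pointer_attention(g_s, H, U, V, w):
    """Score each input position t against the decoder state g_s:
    alpha[t] = w . tanh(U @ g_s + V @ h_t), then softmax over t."""
    scores = np.array([w @ np.tanh(U @ g_s + V @ h_t) for h_t in H])
    exp = np.exp(scores - scores.max())   # numerically stable softmax
    return exp / exp.sum()

rng = np.random.default_rng(0)
d = 4                           # toy hidden size (assumption)
H = rng.normal(size=(6, d))     # encoder states h_t for 6 input tokens
g = rng.normal(size=d)          # decoder state g_s
U, V = rng.normal(size=(d, d)), rng.normal(size=(d, d))
w = rng.normal(size=d)

p = pointer_attention(g, H, U, V, w)
next_token = int(p.argmax())    # the decoder copies this input position
```

Because the output vocabulary is exactly the input sequence (column names, SQL keywords, question words), the network can only emit tokens that are legal in the query.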
SEQ2SQL
- First, the network classifies an aggregation operation for the query, with the addition of a null operation that corresponds to no aggregation.
- Next, the network points to a column in the input table corresponding to the SELECT column.
- Finally, the network generates the conditions for the query using a pointer network.
$Loss = L^{agg} + L^{sel} + L^{whe}$
Aggregation Operation
- compute scalar attention: $\alpha^{inp}_t = W^{inp} h^{enc}_t$
- softmax: $\beta^{inp} = \mathrm{softmax}(\alpha^{inp})$
- compute the input representation: $\kappa^{agg} = \sum_t \beta^{inp}_t h^{enc}_t$
- multi-layer perceptron: $\alpha^{agg} = W^{agg}\tanh(V^{agg}\kappa^{agg} + b^{agg}) + c^{agg}$
- softmax
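The four steps above fit in one function; a minimal NumPy sketch, with random stand-ins for the trained weights and an assumed set of six aggregation classes (NULL, COUNT, MIN, MAX, SUM, AVG):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def aggregation_probs(H_enc, W_inp, W_agg, V_agg, b_agg, c_agg):
    """Attention-pool the encoder states into kappa_agg, then score
    the aggregation operations with a small MLP."""
    alpha_inp = H_enc @ W_inp       # one scalar score per position t
    beta_inp = softmax(alpha_inp)   # attention weights over the input
    kappa_agg = beta_inp @ H_enc    # weighted sum of the h_t^enc
    return softmax(W_agg @ np.tanh(V_agg @ kappa_agg + b_agg) + c_agg)

rng = np.random.default_rng(1)
d, n_ops = 4, 6                     # toy sizes (assumption)
H = rng.normal(size=(7, d))         # encoder states for 7 input tokens
p = aggregation_probs(H,
                      rng.normal(size=d),
                      rng.normal(size=(n_ops, d)),
                      rng.normal(size=(d, d)),
                      rng.normal(size=d),
                      rng.normal(size=n_ops))
agg_op = int(p.argmax())            # index of the predicted aggregation
```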
SELECT Column
- encode each column name with an LSTM: $e^c_j$
- compute $\kappa^{sel}$: same method as for $\kappa^{agg}$
- multi-layer perceptron: $\alpha^{sel}_j = W^{sel}\tanh(V^{sel}\kappa^{sel} + V^c e^c_j)$
- softmax
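The column-scoring step can be sketched the same way (a minimal NumPy sketch; the column embeddings and weights are random stand-ins for the LSTM-encoded column names $e^c_j$ and the trained $W^{sel}$, $V^{sel}$, $V^c$):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def select_column_probs(E_col, kappa_sel, W_sel, V_sel, V_c):
    """Score each column embedding e_j^c against the pooled question
    representation kappa_sel; the softmax gives P(SELECT column = j)."""
    scores = np.array([W_sel @ np.tanh(V_sel @ kappa_sel + V_c @ e_j)
                       for e_j in E_col])
    return softmax(scores)

rng = np.random.default_rng(2)
d, n_cols = 4, 5                    # toy sizes (assumption)
E = rng.normal(size=(n_cols, d))    # one encoded vector per column name
p = select_column_probs(E,
                        rng.normal(size=d),
                        rng.normal(size=d),
                        rng.normal(size=(d, d)),
                        rng.normal(size=(d, d)))
sel_col = int(p.argmax())           # index of the predicted SELECT column
```

Scoring columns by their embeddings, rather than by vocabulary index, means the same network works for tables it has never seen.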
WHERE Clause
- generate the WHERE clause with the Augmented Pointer Network above
- apply RL: WHERE conditions can appear in any order, so instead of pure teacher forcing the decoder is trained with a policy gradient whose reward comes from executing the generated query against the database
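A minimal sketch of the execution-based reward and the REINFORCE-style loss (the reward values $-2$/$-1$/$+1$ follow the Seq2SQL paper; `token_log_probs` is a hypothetical list of log-probabilities of the sampled WHERE tokens):

```python
def execution_reward(pred_result, gold_result, valid):
    """Reward for a sampled WHERE clause: -2 if the query is invalid SQL,
    -1 if it runs but returns the wrong result, +1 if the result matches."""
    if not valid:
        return -2.0
    return 1.0 if pred_result == gold_result else -1.0

def rl_loss(token_log_probs, reward):
    """REINFORCE: scale the negative log-likelihood of the sampled
    tokens by the reward, so correct queries are reinforced."""
    return -reward * sum(token_log_probs)
```

Minimizing this loss increases the probability of token sequences that execute to the correct result, even when they differ token-by-token from the ground-truth query.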