Slot-Gated Modeling for Joint Slot Filling and Intent Prediction

Introduction

The paper's main contributions for joint slot filling and intent detection are:

1. The proposed slot-gated approach achieves better performance than the attention-based models;
2. Experiments on two SLU datasets show the generalization and effectiveness of the proposed slot gate;
3. The gating results help analyze the slot-intent relations.


Slot-Gated Model

Attention-Based RNN Model

In Figure 2, the BiLSTM takes the word sequence $\mathbf{x}=(x_{1},\dots,x_{T})$ as input and generates a forward hidden state $\overrightarrow{h_{i}}$ and a backward hidden state $\overleftarrow{h_{i}}$ at each step; the two are concatenated into the final hidden state $h_{i}=[\overrightarrow{h_{i}};\overleftarrow{h_{i}}]$.
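As a toy NumPy sketch (with made-up per-step states standing in for the BiLSTM outputs), the concatenation looks like:

```python
import numpy as np

T, d = 4, 3  # toy sequence length and per-direction hidden size
rng = np.random.default_rng(0)
fwd = rng.standard_normal((T, d))  # forward hidden states, one row per step
bwd = rng.standard_normal((T, d))  # backward hidden states
h = np.concatenate([fwd, bwd], axis=-1)  # h_i = [fwd_i ; bwd_i]
print(h.shape)  # (4, 6)
```

Each final state $h_i$ thus has twice the per-direction hidden size.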

Slot Filling:

The slot filling task maps the input $\mathbf{x}=(x_{1},\dots,x_{T})$ to the output sequence $\mathbf{y}=(y_{1}^{S},\dots,y_{T}^{S})$. For the hidden state $h_{i}$ of each input word, we first compute the slot context vector $c_{i}^{S}$ (this is effectively self-attention, corresponding to the slot attention in Figure 2):

where $\alpha_{i,j}^{S}$ is the attention score:
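Reconstructed from the original paper's formulation (using the symbols defined above, with $\sigma$ an activation function and $W_{he}^{S}$ a trainable matrix), the slot attention is:

$$c_{i}^{S}=\sum_{j=1}^{T}\alpha_{i,j}^{S}\,h_{j},\qquad \alpha_{i,j}^{S}=\frac{\exp(e_{i,j})}{\sum_{k=1}^{T}\exp(e_{i,k})},\qquad e_{i,k}=\sigma\!\left(W_{he}^{S}\,h_{k}\right)$$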

Intent Prediction

The intent context vector $c^{I}$ is computed in the same way as $c_{i}^{S}$, except that intent prediction uses only the last hidden state of the BiLSTM, $h_{T}$:
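A minimal NumPy sketch of the attention step (the bilinear scorer and all weights are toy assumptions, simplified from the paper's feed-forward scorer):

```python
import numpy as np

def attention_context(h, query_idx, W):
    """Context vector over hidden states h of shape (T, d).

    Scores e_k = h_query . (W @ h_k) -- a simplified bilinear score;
    the paper uses a small feed-forward scorer instead.
    """
    q = h[query_idx]
    e = h @ W @ q              # (T,) unnormalized scores
    a = np.exp(e - e.max())
    a /= a.sum()               # softmax attention weights
    return a @ h               # weighted sum of hidden states

rng = np.random.default_rng(1)
h = rng.standard_normal((5, 4))  # toy BiLSTM states, T=5, d=4
W = rng.standard_normal((4, 4))  # toy stand-in for trained weights
c_I = attention_context(h, query_idx=-1, W=W)  # intent context from h_T
print(c_I.shape)  # (4,)
```

Using `query_idx=-1` attends from the last hidden state $h_T$, mirroring how the intent context differs from the per-step slot contexts.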

Slot-Gated Mechanism

The main purpose of the slot gate is to use the intent context vector to improve slot filling. It is structured as follows:
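Reconstructed from the original paper, the gate $g$ combines the slot and intent context vectors:

$$g=\sum v\cdot\tanh\!\left(c_{i}^{S}+W\cdot c^{I}\right)$$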

where $v$ and $W$ are a trainable vector and matrix, respectively. The summation is over the elements of the vector within one time step, so $g$ is a scalar.
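A toy NumPy sketch of the gate computation (random stand-ins for the trainable $v$ and $W$; names are illustrative):

```python
import numpy as np

def slot_gate(c_slot, c_intent, v, W):
    """Slot gate: g = sum(v * tanh(c_slot + W @ c_intent)).

    The sum runs over the elements of the vector at one time step,
    yielding a scalar gate value.
    """
    return float(np.sum(v * np.tanh(c_slot + W @ c_intent)))

rng = np.random.default_rng(2)
d = 4
c_slot = rng.standard_normal(d)    # slot context vector c_i^S
c_intent = rng.standard_normal(d)  # intent context vector c^I
v = rng.standard_normal(d)         # toy stand-in for trainable vector
W = rng.standard_normal((d, d))    # toy stand-in for trainable matrix
g = slot_gate(c_slot, c_intent, v, W)
# the gated feature (h_i combined with g-weighted c_i^S) then feeds the
# softmax slot classifier, letting the intent modulate slot filling
```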

Joint Optimization

$$p(y^{S},y^{I}\mid\mathbf{x})=p(y^{I}\mid\mathbf{x})\prod_{t=1}^{T}p(y^{S}_{t}\mid\mathbf{x})=p(y^{I}\mid x_{1},\dots,x_{T})\prod_{t=1}^{T}p(y^{S}_{t}\mid x_{1},\dots,x_{T})$$
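Maximizing this joint probability is equivalent to minimizing the sum of the intent and per-step slot negative log-likelihoods. A toy sketch with made-up probability tables:

```python
import numpy as np

def joint_nll(intent_probs, intent_label, slot_probs, slot_labels):
    """Negative log of p(y^I | x) * prod_t p(y_t^S | x)."""
    nll = -np.log(intent_probs[intent_label])
    for t, y in enumerate(slot_labels):
        nll -= np.log(slot_probs[t, y])
    return nll

intent_probs = np.array([0.7, 0.2, 0.1])          # toy intent distribution
slot_probs = np.array([[0.6, 0.4], [0.1, 0.9]])   # toy per-step slot dists
loss = joint_nll(intent_probs, 0, slot_probs, [0, 1])
print(round(loss, 4))  # 0.9729  = -ln(0.7) - ln(0.6) - ln(0.9)
```

In training, both objectives share the BiLSTM encoder, so gradients from the intent loss also shape the slot representations.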

Experiment

Dataset

Compared to the single-domain ATIS dataset, Snips is more complicated, mainly due to its intent diversity and larger vocabulary.

Results and Analysis

Considering the different complexity of these datasets, the probable reason is that a simpler SLU task such as ATIS does not need additional slot attention to achieve good results: the slot gate alone provides enough cues for slot filling. Snips, on the other hand, is more complex, so slot attention is needed to model slot filling (and the semantic frame results) better.

This may be credited to the proposed slot gate, which learns slot-intent relations that provide helpful information for the global optimization of the joint model.

Conclusion

This paper focuses on learning explicit slot-intent relations by introducing a slot-gated mechanism into the state-of-the-art attention model, which allows slot filling to be conditioned on the learned intent result in order to achieve better SLU (joint slot filling and intent detection).