RNNLG

RNNLG is an open source benchmark toolkit for Natural Language Generation (NLG) in spoken dialogue system application domains. It is released by Tsung-Hsien (Shawn) Wen from Cambridge Dialogue Systems Group under Apache License 2.0.

Data

1.The raw data comes from four domains: hotel, laptop, restaurant, tv
2.Each dataset is split into three parts: train, valid, test
3.Each example is a triple: [MR/Dialogue Act, Human Authored Response, HDC baseline]

# example
[
    "inform(name='trattoria contadina';pricerange=moderate)",
    "trattoria contadina is a nice restaurant in the moderate price range",
    "trattoria contadina is a nice place it is in the moderate price range"
]
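
To make the MR/Dialogue Act format concrete, here is a minimal sketch of how such a string could be split into an act and its slot-value pairs. The function name `parse_dialogue_act` is hypothetical and is not taken from the RNNLG code; the sketch also assumes that values never contain `;` or `=`.

```python
import re

def parse_dialogue_act(mr):
    """Split 'act(slot=value;...)' into (act, [(slot, value), ...])."""
    match = re.match(r"^(\??\w+)\((.*)\)$", mr.strip())
    if match is None:
        raise ValueError("not a dialogue act: %r" % mr)
    act, body = match.group(1), match.group(2)
    pairs = []
    for item in filter(None, body.split(";")):
        slot, sep, value = item.partition("=")
        # A slot without '=' (as in '?request(near)') carries no value.
        pairs.append((slot.strip(), value.strip().strip("'") if sep else None))
    return act, pairs

act, pairs = parse_dialogue_act(
    "inform(name='trattoria contadina';pricerange=moderate)")
```

Here `act` is the act name and `pairs` keeps the slots in the order they appear in the string.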

Input of scLSTM

1.Data preprocessing: each triple in the raw data is expanded into a quadruple

# example
[
[('a', u'inform'), (u'area', '_1'), (u'name', '_1'), (u'pricerange', '_1')],
u"inform(name='alamo square seafood grill';area='friendship village';pricerange=moderate)",
u'SLOT_NAME is a nice restaurant in the area of SLOT_AREA and it is in the SLOT_PRICERANGE price range',
u'SLOT_NAME is a nice place , it is in the area of SLOT_AREA and it is in the SLOT_PRICERANGE price range'
]
[
[('a', u'?request'), (u'near', '?')],
u'?request(near)',
u'please confirm your area of interest',
u'where would you like it to be near to'
]
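
The expansion shown in the examples above can be approximated as follows. This is only a sketch under stated assumptions: the helper name `delexicalise`, the `SPECIAL_VALUES` set, and the naive exact-match string replacement are all hypothetical simplifications, and the toolkit's own matching is likely more elaborate.

```python
# Assumption: these values are kept as features instead of becoming placeholders.
SPECIAL_VALUES = {"dontcare", "none", "yes", "no"}

def delexicalise(act, pairs, sentences):
    """Return (features, delexicalised sentences) for one dialogue act.

    pairs is a list of (slot, value) tuples; value is None for a
    requested slot such as 'near' in '?request(near)'.
    """
    feats = []
    counts = {}
    out = list(sentences)
    for slot, value in pairs:
        if value is None:
            feats.append((slot, "?"))
        elif value in SPECIAL_VALUES:
            feats.append((slot, value))
        else:
            # Number repeated values of the same slot as _1, _2, ...
            counts[slot] = counts.get(slot, 0) + 1
            feats.append((slot, "_%d" % counts[slot]))
            # Naive exact-match replacement (hypothetical simplification).
            out = [s.replace(value, "SLOT_" + slot.upper()) for s in out]
    # The act comes first, followed by the slot features in sorted order.
    return [("a", act)] + sorted(feats), out

feats, sents = delexicalise(
    "inform",
    [("name", "alamo square seafood grill"),
     ("area", "friendship village"),
     ("pricerange", "moderate")],
    ["alamo square seafood grill is a nice restaurant in the area of "
     "friendship village and it is in the moderate price range"],
)
```

A quadruple would then be assembled as `[feats, mr, sents[0], sents[1]]` when both the human response and the baseline are passed in.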

2.feat_template: a, sv, s, v

# a: the dialogue act (intent) of the sentence
a = [
a.?compare,a.?confirm,a.?reqmore,a.?request,a.?select, a.bye,
a.goodbye,a.inform,a.inform_all,a.inform_count,a.inform_no_info,
a.inform_no_match,a.inform_only_match,a.recommend,a.suggest
]

# sv : sv.slot.slot_value
sv = [
sv.acceptscreditcards.dontcare,sv.acceptscreditcards.no,
sv.acceptscreditcards.yes,sv.accessories._1,sv.accessories._2,
sv.accessories.none,sv.address._1,sv.area.?,sv.area._1,
sv.area.dontcare,sv.audio._1,sv.audio._2,sv.audio.none,
sv.battery._1,sv.battery._2,
etc...
]

# s : s.slot
s = [
s.acceptscreditcards,s.accessories,s.address,s.area,
s.audio,s.battery,s.batteryrating,s.color,s.count,
s.design,s.dimension
etc...
]

# v : slot_value
v = [
v.?,v._1,v._2,v._3,v.dontcare,v.no,v.none,v.yes,v.NONE
]
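
One way to turn such a template into lookup tables is sketched below. It assumes that each feature is indexed within its own group (a, sv, s, v) and that the order of the template is preserved; the function name `build_feature_index` is hypothetical, and the short `names` list is only an illustrative subset of the entries listed above.

```python
def build_feature_index(feature_names):
    """Group feature names by prefix (a, sv, s, v) and index each group."""
    groups = {"a": [], "sv": [], "s": [], "v": []}
    for name in feature_names:
        # The prefix is everything before the first dot, e.g. 'sv' in 'sv.area._1'.
        prefix = name.split(".", 1)[0]
        groups[prefix].append(name)
    return {
        prefix: {name: i for i, name in enumerate(names)}
        for prefix, names in groups.items()
    }

# Illustrative subset of a feature template.
names = ["a.?request", "a.inform", "sv.area._1", "sv.area.dontcare",
         "s.area", "v.?", "v._1"]
index = build_feature_index(names)
```

With this layout, `index["sv"]["sv.area._1"]` gives the position of that slot-value pair inside the sv group.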

3.input of scLSTM for training

  • a: index of the sentence's dialogue act (inform, request, recommend, etc.) within the a set of feat_template
  • sv: index of each sv.slot_name.slot_value entry within the sv set of feat_template.txt
  • s: same as sv, for the s set
  • v: same as sv, for the v set
  • words: word indices of the third item of the quadruple, the Human Authored Response sentence
  • b_size: batch size
  • lengs: lengths of (a, sv, s, v, sent), a 2-D numpy array such as [[1], [3], [3], [3], [20]]
    # example
    # [('a', u'inform_count'), (u'count', '_1'), (u'goodformeal', '_1'), (u'type', '_1')],
    # u"inform_count(type=restaurant;count='2';goodformeal=brunch)",
    # u'there are SLOT_COUNT SLOT_TYPE -s good for SLOT_GOODFORMEAL',
    # u'there are SLOT_COUNT SLOT_TYPE -s good for SLOT_GOODFORMEAL'
    array([[7]], dtype=int32),
    array([[ 53, 78, 82, 101]], dtype=int32),
    array([[16, 22, 24, 31]], dtype=int32),
    array([[1, 7, 1, 1]], dtype=int32),
    array([[ 1],
    [ 5],
    [ 2],
    [ 4],
    [ 22],
    [140],
    [117],
    [113],
    [105],
    [ 41],
    [136],
    [147],
    [ 10],
    [ 1]], dtype=int32),
    1,
    array([[ 1],
    [ 4],
    [ 4],
    [ 4],
    [14]], dtype=int32)
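
The layout of the example above can be reproduced with a small helper. This is a sketch using plain nested lists so it runs without dependencies; the name `pack_example` is hypothetical, a batch size of 1 is assumed, and each nested list could be wrapped with `numpy.asarray(x, dtype="int32")` to obtain the arrays shown.

```python
def pack_example(a, sv, s, v, words):
    """Arrange one example (batch size 1) in the layout shown above.

    Feature vectors become rows of shape (1, n); the word sequence is
    stored one index per row, shape (len, 1); lengs holds the five lengths.
    """
    return {
        "a": [list(a)],
        "sv": [list(sv)],
        "s": [list(s)],
        "v": [list(v)],
        "words": [[w] for w in words],
        "b_size": 1,
        "lengs": [[len(a)], [len(sv)], [len(s)], [len(v)], [len(words)]],
    }

# Index values copied from the example above.
batch = pack_example(
    a=[7],
    sv=[53, 78, 82, 101],
    s=[16, 22, 24, 31],
    v=[1, 7, 1, 1],
    words=[1, 5, 2, 4, 22, 140, 117, 113, 105, 41, 136, 147, 10, 1],
)
```

The resulting `batch["lengs"]` matches the last array of the example, which records the length of a, sv, s, v and the sentence in that order.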

Output of scLSTM

1.input for test: a, sv, s, v
2.output of scLSTM: 20 candidates, each a (penalty, sentence) pair, where the sentence is a list of word indices

# ('gens : ', [(0.13399766372142932, [1, 5, 2, 4, 33, 105, 41, 138, 113, 126, 8, 2, 77, 15, 132, 1]), (0.14933728562324292, [1, 5, 138, 113, 126, 8, 2, 77, 15, 132, 1]), (0.1549744254930803, [1, 5, 2, 4, 113, 105, 41, 2, 77, 15, 132, 1]), (0.15852280774253719, [1, 5, 138, 113, 126, 15, 132, 1]), (0.19026726564154084, [1, 5, 2, 4, 33, 105, 41, 138, 113, 8, 2, 77, 15, 132, 1]), (0.20233138058456338, [1, 5, 2, 4, 33, 105, 41, 138, 113, 126, 15, 132, 1]), (0.20940732168664519, [1, 5, 2, 4, 33, 105, 41, 138, 113, 132, 8, 2, 77, 15, 132, 1]), (0.21145074659354887, [1, 5, 2, 4, 113, 105, 41, 138, 132, 1]), (0.21476023212302103, [1, 5, 2, 4, 33, 105, 41, 138, 113, 126, 8, 14, 2, 77, 15, 132, 1]), (0.22121748802971725, [1, 5, 2, 4, 33, 105, 41, 138, 113, 126, 8, 138, 132, 1]), (0.22331384956149691, [1, 5, 2, 4, 33, 105, 41, 138, 113, 8, 14, 2, 77, 15, 132, 1]), (0.22577840448106021, [1, 5, 2, 4, 33, 105, 41, 138, 113, 126, 8, 2, 4, 77, 202, 15, 132, 1]), (0.23494928307737384, [1, 5, 2, 4, 33, 105, 41, 138, 113, 15, 132, 1]), (0.2414091617709512, [1, 5, 2, 4, 33, 105, 41, 138, 113, 126, 8, 2, 77, 15, 132, 8, 2, 77, 15, 132, 1]), (0.25039630774077898, [1, 5, 2, 4, 33, 105, 41, 138, 113, 126, 8, 2, 111, 15, 132, 1]), (0.26162732976045144, [1, 5, 2, 4, 33, 105, 41, 138, 113, 126, 8, 14, 2, 4, 77, 202, 15, 132, 1]), (0.27529049174629477, [1, 5, 2, 4, 33, 105, 41, 138, 113, 126, 8, 2, 4, 77, 132, 1]), (0.30046496217825086, [1, 5, 2, 4, 33, 105, 41, 138, 113, 126, 8, 14, 2, 4, 77, 132, 1]), (0.3006930035044566, [1, 5, 2, 4, 33, 105, 41, 138, 113, 126, 1]), (0.31477015858324936, [1, 5, 2, 4, 33, 105, 41, 138, 113, 126, 8, 2, 77, 15, 1])])
('gen : ', 'SLOT_NAME is a nice restaurant that serves SLOT_FOOD food and is good for SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME serves SLOT_FOOD food and is good for SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME is a SLOT_FOOD restaurant that is good for SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME serves SLOT_FOOD food for SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME is a nice restaurant that serves SLOT_FOOD and is good for SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME is a nice restaurant that serves SLOT_FOOD food for SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME is a nice restaurant that serves SLOT_FOOD SLOT_GOODFORMEAL and is good for SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME is a SLOT_FOOD restaurant that serves SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME is a nice restaurant that serves SLOT_FOOD food and it is good for SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME is a nice restaurant that serves SLOT_FOOD food and serves SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME is a nice restaurant that serves SLOT_FOOD and it is good for SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME is a nice restaurant that serves SLOT_FOOD food and is a good place for SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME is a nice restaurant that serves SLOT_FOOD for SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME is a nice restaurant that serves SLOT_FOOD food and is good for SLOT_GOODFORMEAL and is good for SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME is a nice restaurant that serves SLOT_FOOD food and is great for SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME is a nice restaurant that serves SLOT_FOOD food and it is a good place for SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME is a nice restaurant that serves SLOT_FOOD food and is a good SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME is a nice restaurant that serves SLOT_FOOD food and it is a good SLOT_GOODFORMEAL')
('gen : ', 'SLOT_NAME is a nice restaurant that serves SLOT_FOOD food')
('gen : ', 'SLOT_NAME is a nice restaurant that serves SLOT_FOOD food and is good for')

3.A score is derived from the penalty, and the top-k candidates are selected
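
The selection step could be sketched as follows. The exact mapping from penalty to score is not given here, so the sketch assumes the score is simply the negated penalty (lower penalty means a better candidate); the function name `top_k` is hypothetical.

```python
import heapq

def top_k(gens, k):
    """Return the k candidates with the lowest penalty as (score, ids) pairs."""
    best = heapq.nsmallest(k, gens, key=lambda g: g[0])
    # Assumption: score is the negated penalty, so higher is better.
    return [(-penalty, ids) for penalty, ids in best]

# Toy candidates in the same (penalty, word indices) shape as 'gens'.
gens = [(0.3, [1, 5, 1]), (0.1, [1, 2, 1]), (0.2, [1, 4, 1])]
selected = top_k(gens, 2)
```

Each selected list of word indices would then be mapped back to words through the vocabulary to produce sentences like the 'gen' lines above.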

Evaluation for scLSTM

Evaluation

_explanation for bleu:_

BLEU