\n "]},{"cell_type":"markdown","metadata":{},"source":["While the overall performance of the model is not enough to outperform our extractive baseline, we have shown that it can incorporate a query and use that information to produce more focused summaries."]},{"cell_type":"markdown","metadata":{},"source":["![title](https://a2c.fyi/q/MSzTSWNWZU/f0.png)"]},{"cell_type":"markdown","metadata":{},"source":["Figure 1: Diagram of the GRU architecture. x_t corresponds to the input and h_t to the output of the GRU cell. Boxes with multiple input vectors have their inputs concatenated. A circle signifies an operation, where + is vector addition, · is elementwise multiplication, and \"1−\" computes the complement probability for the input elements in [0, 1]. (Gated Recurrent Units)"]},{"cell_type":"markdown","metadata":{},"source":["\n\\newcommand{\\larrowover}[1]{\\stackrel{\\leftarrow}{#1}}\n\\newcommand{\\rarrowover}[1]{\\stackrel{\\rightarrow}{#1}}\n\\newcommand{\\argmax}[1]{\\underset{#1}{\\operatorname{arg}\\,\\operatorname{max}}\\;}\\DeclareMathOperator{\\GRU}{GRU}\n\\begin{align*}\nr_t &= \\sigma(W^r[x_{t}, h_{t-1}] + b^r) \\\\\nz_t &= \\sigma(W^z[x_{t}, h_{t-1}] + b^z)\n\\\\[0.3cm]\nh'_{t} &= \\tanh(W^h[x_{t}, r_t \\odot h_{t-1}] + b^h)\n\\\\\nh_{t} &= z_t \\odot h_{t-1} + (1-z_t) \\odot h'_{t},\n\\end{align*}"]},{"cell_type":"code","execution_count":0,"outputs":[],"metadata":{},"source":["# One GRU step in numpy; [a, b] in the equations above denotes concatenation.\n","# The weights W_r, W_z, W_h and biases b_r, b_z, b_h are assumed to be given arrays.\n","import numpy as np\n","sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))  # the logistic function, sigma above\n","r_t = sigmoid(W_r @ np.concatenate([x_t, h_prev]) + b_r)  # reset gate\n","z_t = sigmoid(W_z @ np.concatenate([x_t, h_prev]) + b_z)  # update gate\n","h_cand = np.tanh(W_h @ np.concatenate([x_t, r_t * h_prev]) + b_h)  # candidate state h'_t\n","h_t = z_t * h_prev + (1 - z_t) * h_cand  # interpolate previous and candidate states"]},{"cell_type":"markdown","metadata":{},"source":["![title](https://a2c.fyi/q/MSzTSWNWZU/f2.png)"]},{"cell_type":"markdown","metadata":{},"source":["Figure 3: Overview of our model. It illustrates the connections between the parts of the model at a fixed decoder time step t. The bottom part, containing the labeled boxes, corresponds to the different RNNs. 
The top part visualizes the two ways the output word y_t can be selected, through the pointer and generator mechanisms, shown on the left and right respectively. (Model)"]},{"cell_type":"markdown","metadata":{},"source":["\n\\newcommand{\\larrowover}[1]{\\stackrel{\\leftarrow}{#1}}\n\\newcommand{\\rarrowover}[1]{\\stackrel{\\rightarrow}{#1}}\n\\newcommand{\\argmax}[1]{\\underset{#1}{\\operatorname{arg}\\,\\operatorname{max}}\\;}\\DeclareMathOperator{\\GRU}{GRU}\n\\begin{align*}\nh_t = \\GRU(h_{t-1}, x_t).\n\\end{align*}"]},{"cell_type":"code","execution_count":0,"outputs":[],"metadata":{},"source":["# One step of the recurrence h_t = GRU(h_{t-1}, x_t);\n","# gru_step is a hypothetical helper implementing the gated update defined below.\n","h_t = gru_step(h_prev, x_t)"]},{"cell_type":"markdown","metadata":{},"source":["\n\\newcommand{\\larrowover}[1]{\\stackrel{\\leftarrow}{#1}}\n\\newcommand{\\rarrowover}[1]{\\stackrel{\\rightarrow}{#1}}\n\\newcommand{\\argmax}[1]{\\underset{#1}{\\operatorname{arg}\\,\\operatorname{max}}\\;}\\DeclareMathOperator{\\GRU}{GRU}\n\\begin{align*}\nr_t &= \\sigma(W^r[x_{t}, h_{t-1}] + b^r) \\\\\nz_t &= \\sigma(W^z[x_{t}, h_{t-1}] + b^z)\n\\\\[0.3cm]\nh'_{t} &= \\tanh(W^h[x_{t}, r_t \\odot h_{t-1}] + b^h)\n\\\\\nh_{t} &= z_t \\odot h_{t-1} + (1-z_t) \\odot h'_{t},\n\\end{align*}"]},{"cell_type":"code","execution_count":0,"outputs":[],"metadata":{},"source":["# One GRU step in numpy; [a, b] in the equations above denotes concatenation.\n","# The weights W_r, W_z, W_h and biases b_r, b_z, b_h are assumed to be given arrays.\n","import numpy as np\n","sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))  # the logistic function, sigma above\n","r_t = sigmoid(W_r @ np.concatenate([x_t, h_prev]) + b_r)  # reset gate\n","z_t = sigmoid(W_z @ np.concatenate([x_t, h_prev]) + b_z)  # update gate\n","h_cand = np.tanh(W_h @ np.concatenate([x_t, r_t * h_prev]) + b_h)  # candidate state h'_t\n","h_t = z_t * h_prev + (1 - z_t) * h_cand  # interpolate previous and candidate states"]},{"cell_type":"markdown","metadata":{},"source":["![title](https://a2c.fyi/q/MSzTSWNWZU/f3.png)"]},{"cell_type":"markdown","metadata":{},"source":["Figure 4: Visualization of the attention distribution as the summary in Table ?? is generated. The words of the document are shown on the horizontal axis, from left to right. Only a limited number of document words are shown. The vertical axis shows the output words, from top to bottom, after the token. The darker a cell is, the higher the attention on that position. 
Figure 5: Visualization of the attention distribution, α_ti, as an output summary for a document-query pair is generated. The query is \"australia\". The format is the same as in Figure ??. Figure 6: Visualization of the attention distribution, α_ti, as an output summary for a document-query pair in the test set is generated. The query is \"only fools and horses\". The format is the same as in Figure ??. The ellipsis signifies that parts of the attention distribution have been skipped. (Attention Visualisations)"]},{"cell_type":"markdown","metadata":{},"source":["\n\\newcommand{\\larrowover}[1]{\\stackrel{\\leftarrow}{#1}}\n\\newcommand{\\rarrowover}[1]{\\stackrel{\\rightarrow}{#1}}\n\\newcommand{\\argmax}[1]{\\underset{#1}{\\operatorname{arg}\\,\\operatorname{max}}\\;}\\DeclareMathOperator{\\GRU}{GRU}\n\\begin{align*}\nh_t = \\GRU(h_{t-1}, x_t).\n\\end{align*}"]},{"cell_type":"code","execution_count":0,"outputs":[],"metadata":{},"source":["# One step of the recurrence h_t = GRU(h_{t-1}, x_t);\n","# gru_step is a hypothetical helper implementing the gated GRU update defined earlier.\n","h_t = gru_step(h_prev, x_t)"]}],"metadata":{"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"}},"nbformat":4,"nbformat_minor":2}