[1909.03186v1] On Extractive and Abstractive Neural Document Summarization with Transformer Language Models
Abstract: We present a method to produce abstractive summaries of long documents that exceed several thousand words via neural abstractive summarization. We perform a simple extractive step before generating a summary, which is then used to condition the transformer language model on relevant information before being tasked with generating a summary. We show that this extractive step significantly improves summarization results. We also show that this approach produces more abstractive summaries compared to prior work that employs a copy mechanism while still achieving higher ROUGE scores. Note: The abstract above was not written by the authors; it was generated by one of the models presented in this paper.

While we believe that this work is a step forward towards generating more abstractive summaries, it remains an open challenge to develop models that respect the underlying facts of the content being summarized while matching the creative ability of humans to coherently and concisely synthesize summaries.
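The two-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustration of the conditioning format only, not the authors' implementation: the function names and the use of plain string concatenation are assumptions, and the sentence-pointer extractor itself is abstracted away as a pre-computed list of extracted sentences.

```python
def build_training_sequence(intro: str, extracted: list[str], abstract: str, rest: str) -> str:
    # Training-time ordering described in the paper: introduction,
    # extracted sentences, abstract (the target summary), then the
    # rest of the paper, all fed to the language model as one sequence.
    return "\n".join([intro, " ".join(extracted), abstract, rest])


def build_inference_context(intro: str, extracted: list[str]) -> str:
    # At inference time only the introduction and the extracted
    # sentences are provided as context; the model then generates
    # the abstract as a continuation of this prefix.
    return "\n".join([intro, " ".join(extracted)])
```

Because the inference context is a prefix of the training sequence, the model learns to emit the abstract immediately after the extracted sentences; for domains such as news or patents, the paper replaces the introduction with the entire document.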
Figure 1: Proposed model for abstractive summarization of a scientific article. An older version of this paper is shown as the reference document. First, a sentence pointer network extracts important sentences from the paper. Next, these sentences are provided, along with the whole scientific article, arranged in the following order: introduction, extracted sentences, abstract, and the rest of the paper. A transformer language model is trained on articles organized in this format. During inference, the introduction and the extracted sentences are given to the language model as context to generate a summary. In domains like news and patent documents, the introduction is replaced by the entire document. (Introduction)

Figure 2: n-gram overlaps between the abstracts generated by different models and the input article on the arXiv dataset. We show in detail which part of the input was copied for our TLM conditioned on intro + extract. (Transformer Language Models (TLM))

Figure 3: t-SNE visualization of the TLM-learned word embeddings. The model appears to partition the space based on the broad paper category in which a word frequently occurs. (t-SNE of learned word embeddings)
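The abstractiveness analysis in Figure 2 rests on measuring how many n-grams of a generated summary also appear in the source article. A small sketch of that measurement, with function names chosen for illustration (the paper does not specify its exact tokenization or counting code):

```python
def ngrams(tokens: list[str], n: int) -> set[tuple[str, ...]]:
    # Set of all contiguous n-grams in a token sequence.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def novel_ngram_fraction(summary: str, article: str, n: int) -> float:
    # Fraction of summary n-grams that do NOT occur in the article.
    # Higher values indicate a more abstractive (less copied) summary.
    s = ngrams(summary.split(), n)
    a = ngrams(article.split(), n)
    if not s:
        return 0.0
    return len(s - a) / len(s)
```

For example, with whitespace tokenization, `novel_ngram_fraction("the quick red fox sleeps", "the quick brown fox jumps over the lazy dog", 2)` counts three of the four summary bigrams as novel, giving 0.75.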