

How they did it: AI2's abstractive model uses what's known as a transformer, a type of neural network architecture first invented in 2017 that has since powered all of the major leaps in NLP, including OpenAI's GPT-3. The researchers first trained the transformer on a generic corpus of text to establish its baseline familiarity with the English language. This process is known as "pre-training" and is part of what makes transformers so powerful. They then fine-tuned the model (in other words, trained it further) on the specific task of summarization.
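
To make the two-stage recipe concrete, here is a minimal sketch of pre-training followed by fine-tuning, written with the Hugging Face transformers library. This is not AI2's actual code: the BART checkpoint stands in for "a transformer pre-trained on a generic corpus," and the training data and hyperparameters are illustrative assumptions.

```python
# Minimal sketch: start from a pre-trained transformer, then fine-tune it on
# (paper, one-sentence summary) pairs. Model choice and settings are assumptions.
import torch
from transformers import BartTokenizer, BartForConditionalGeneration

# Step 1: load a transformer already pre-trained on generic English text.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

# Step 2: fine-tune on paper/summary pairs (stand-in data shown here).
pairs = [("<full text of a paper>", "<its one-sentence summary>")]
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

model.train()
for paper, summary in pairs:
    inputs = tokenizer(paper, truncation=True, max_length=1024, return_tensors="pt")
    labels = tokenizer(summary, truncation=True, max_length=64, return_tensors="pt").input_ids
    loss = model(**inputs, labels=labels).loss  # sequence-to-sequence cross-entropy
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Once fine-tuned, calling model.generate() on a new paper's tokenized text produces the abstractive summary.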

The fine-tuning data: The researchers first created a dataset called SciTldr, which contains roughly 5,400 pairs of scientific papers and corresponding single-sentence summaries. To find these high-quality summaries, they first went hunting for them on OpenReview, a public conference paper submission platform where researchers will often post their own one-sentence synopsis of their paper. The researchers then hired annotators to summarize more papers by reading and further condensing the synopses that had already been written by peer reviewers.
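
SciTldr has been publicly released; the snippet below is one hedged way to look at the pairs, assuming the copy published on the Hugging Face Hub under the ID allenai/scitldr (the dataset ID, configuration name, and split layout are assumptions, not details from the article).

```python
# Inspect the paper/summary pairs (dataset ID and config name are assumptions).
from datasets import load_dataset

scitldr = load_dataset("allenai/scitldr", "Abstract")     # abstracts paired with one-sentence TLDRs
print({split: len(scitldr[split]) for split in scitldr})  # split sizes
print(scitldr["train"][0])                                # one paper/summary record
```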

Extreme summarization: While many other research efforts have tackled the task of summarization, this one stands out for the level of compression it can achieve. The scientific papers included in the SciTldr dataset average 5,000 words, and each paper is compressed, on average, by a factor of 238, down to a summary of roughly 21 words. The next best abstractive method is trained to compress scientific papers by an average of only 36.5 times.
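
The compression figures above are just ratios of the average word counts; a quick back-of-the-envelope check:

```python
# Back-of-the-envelope check of the compression ratios quoted above.
avg_paper_words = 5_000   # average SciTldr paper length
this_work_ratio = 238     # AI2's average compression factor
next_best_ratio = 36.5    # next best abstractive method

print(avg_paper_words / this_work_ratio)   # ~21 words per summary
print(avg_paper_words / next_best_ratio)   # ~137 words for the next best method
```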
