CS Colloquium: Arman Cohan (Allen Institute for AI)
Adapting Transformer models for Document-level Natural Language Tasks
Representation learning forms the foundation of today’s natural language processing systems; Transformer models have been extremely effective at producing word- and sentence-level contextualized representations, achieving state-of-the-art results in many NLP tasks. However, applying these models to produce contextualized representations of entire documents faces challenges. These challenges include a lack of inter-document relatedness information, decreased performance in low-resource settings, and computational inefficiency when scaling to long documents.
In this talk, I will describe three recent works on developing Transformer-based models that target document-level natural language tasks. I will first introduce SPECTER, a method for producing scientific document representations using a Transformer model that incorporates document-level relatedness signals and achieves state-of-the-art results on multiple document-level tasks. Second, I will describe TLDR, the task of extreme summarization for scientific papers. I will introduce SciTLDR, our new multi-target summarization dataset of scientific papers, as well as a simple yet effective training strategy for generating summaries in low-resource settings. Finally, I will discuss the practical challenges of scaling existing Transformer models to long documents and our proposed solution, Longformer. Longformer introduces a new sparse self-attention pattern that scales linearly with the input length while capturing both local and global context in the document, and it achieves state-of-the-art results in both character-level language modeling and document-level NLP tasks.
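To make the sparse attention idea concrete, here is a minimal illustrative sketch (not AI2's actual implementation) of the kind of attention pattern the abstract describes: each token attends to a local sliding window of neighbors, while a few designated "global" tokens attend to, and are attended by, every position. The function name, parameters, and the use of a dense boolean mask are illustrative assumptions; an efficient implementation would never materialize the full matrix.

```python
import numpy as np

def longformer_style_mask(seq_len, window, global_idx):
    """Illustrative sparse attention pattern: local windows plus global tokens.

    This is a hypothetical sketch for exposition, not the Longformer codebase;
    a real implementation computes only the allowed pairs, never a dense mask.
    """
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        mask[i, lo:hi] = True          # local sliding-window attention
    for g in global_idx:
        mask[g, :] = True              # a global token attends everywhere
        mask[:, g] = True              # and every token attends to it
    return mask

mask = longformer_style_mask(seq_len=8, window=1, global_idx=[0])
```

The number of allowed attention pairs grows roughly as seq_len × (2·window + 1) plus the global rows and columns, i.e. linearly in the sequence length, versus seq_len² for full self-attention.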
Bio: Arman Cohan is a Research Scientist at the Allen Institute for AI. His research primarily focuses on developing NLP models with practical applications, with a special focus on the scientific domain.
Prior to AI2, he obtained his PhD from Georgetown University in May 2018. His research has been recognized with a best paper award at EMNLP 2017, an honorable mention award at COLING 2018, and the Harold N. Glassman Distinguished Doctoral Dissertation award in 2019.