Seminar: BERT and Friends - Pretrained LMs in Computational Semantics

Monday, 12:15 - 13:45

April 25 - July 18
This is the 2022 edition of the seminar. Looking for the BERT and Friends seminar in the summer semester of 2023? Please have a look here: BERT and Friends '23

Seminar Content

The advent of large-scale pretrained language models as "Swiss Army Knives" for various applications and problems in natural language understanding and computational semantics has drastically changed the natural language processing landscape.

The BERT model is only the most prominent example. Its publication had a huge impact on the NLP research community and led to a paradigm change: pretraining language models on large text collections and then adapting them to the task at hand has become the standard procedure for state-of-the-art systems, both in research and in industry applications.

Since BERT's release, research aimed at finding smaller, faster, and more accurate variants, and at adapting BERT-like transformer models to new tasks, has flourished. In this seminar, we will look at such variants and adaptations of pretrained language models. We will cover papers on diverse and effective pretraining methods for such language models, as well as papers that investigate how to use and adapt pretrained models for selected tasks in natural language understanding and computational semantics. We will look into prominent use cases such as machine reading comprehension and open question answering, but also read papers on multilinguality, natural language inference, or text classification (depending on the interests of the participants).
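To make the pretrain-then-adapt recipe described above concrete, here is a minimal sketch using the Hugging Face transformers and datasets libraries (not part of the seminar material; the checkpoint, dataset, and hyperparameters are illustrative choices, not a prescribed setup): a publicly available pretrained BERT checkpoint is loaded and fine-tuned for binary text classification.

  # Minimal sketch: adapt a pretrained BERT checkpoint to a
  # downstream text classification task (illustrative example).
  from transformers import (AutoTokenizer,
                            AutoModelForSequenceClassification,
                            Trainer, TrainingArguments)
  from datasets import load_dataset

  # 1) Start from a pretrained checkpoint and add a classification head.
  model_name = "bert-base-uncased"
  tokenizer = AutoTokenizer.from_pretrained(model_name)
  model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

  # 2) Load a downstream task (here: SST-2 sentiment classification) and tokenize it.
  dataset = load_dataset("glue", "sst2")

  def tokenize(batch):
      return tokenizer(batch["sentence"], truncation=True, max_length=128)

  encoded = dataset.map(tokenize, batched=True)

  # 3) Fine-tune ("adapt") the pretrained model on the task data.
  training_args = TrainingArguments(
      output_dir="bert-finetuned-sst2",  # hypothetical output directory
      num_train_epochs=3,
      per_device_train_batch_size=16,
  )
  trainer = Trainer(
      model=model,
      args=training_args,
      train_dataset=encoded["train"],
      eval_dataset=encoded["validation"],
      tokenizer=tokenizer,
  )
  trainer.train()
  print(trainer.evaluate())

The same two-step pattern (load a pretrained checkpoint, then fine-tune on task data) underlies most of the papers and use cases discussed in the seminar.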


Selection of Relevant Papers

Pretraining
  • XLNet: Generalized Autoregressive Pretraining for Language Understanding | paper
  • ALBERT: A Lite BERT for Self-Supervised Learning of Language Representations | paper
  • RoBERTa: A Robustly Optimized BERT Pretraining Approach | paper
  • GPT-2: Language Models are Unsupervised Multitask Learners | paper
  • GPT-3: Language Models are Few-Shot Learners | paper
  • T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | paper
  • ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators | paper
  • BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension | paper
  • SpanBERT: Improving Pre-training by Representing and Predicting Spans | paper
Tasks
Open Question Answering/Neural Retrieval
  • Dense Passage Retrieval for Open-Domain Question Answering | paper
  • How Much Knowledge Can You Pack Into the Parameters of a Language Model? | paper
  • RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering | paper
  • Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks | paper
  • REALM: Retrieval-Augmented Language Model Pre-Training | paper
Machine Comprehension
  • TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection | paper
  • The Cascade Transformer: an Application for Efficient Answer Sentence Selection | paper
  • Retrospective Reader for Machine Reading Comprehension | paper
Natural Language Inference, Entailment and Similarity
  • Semantics-aware BERT for Language Understanding | paper
  • Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks | paper
  • Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation | paper
  • Multi-Task Deep Neural Networks for Natural Language Understanding | paper
  • SimCSE: Simple Contrastive Learning of Sentence Embeddings | paper
Multilingual Models
  • mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer | paper
  • Unsupervised Cross-lingual Representation Learning at Scale | paper
  • Multilingual Denoising Pre-training for Neural Machine Translation | paper

Some words on grading: This seminar is meant to be as interactive as possible. Final grades will be based on students' presentations and (optional) term papers, as well as on participation and discussion in class.

Participants are expected to prepare for classes accordingly, by reading the relevant papers and, if necessary, doing additional background reading. Based on this preparation, they should be able to discuss the presented papers in depth and to follow the relevant context during the discussion.