Seminar: Efficient and Robust Natural Language Processing

Mon 14:15–15:45, Seminar Room 1.12

Thu 16:15–17:45, Room -1.05

Dr. Simon Ostermann, Tatiana Anikina, Natalia Skachkova
Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI)
If you would like to participate, please send an email to efficient-nlp-seminar@dfki.de
by April 17 (23:59).
In your email, please:
  • Give your name, semester, and study program
  • Write some words on why you want to take part in this course
  • List some of your previous experience:
    • your background in deep learning or machine learning
    • your background in natural language processing in general
Prerequisites: This seminar is primarily targeted at Master's students, but it is also open to advanced Bachelor's students. We expect you to have a curious mind and some familiarity with large language models. At the very least, we expect all students to have read (and understood :-)) the BERT paper and the Transformer paper.

Seminar Content

Nowadays, Large Language Models (LLMs) achieve impressive results on a variety of NLP tasks and languages. This increase in performance comes at a cost: better performance typically relies on more parameters and more training data. Better and larger models therefore also need more computational resources, more time, more memory, and more energy, which can make training prohibitively expensive.

How can we solve the problem of ever larger and ever hungrier models? Efficient NLP is an umbrella term for a wide range of approaches that address these problems through more efficient model design and more efficient use of data, for example via more efficient fine-tuning, inference, or hardware utilization. In this seminar we will mostly focus on model- and data-level efficiency and will explore various approaches, including efficient prompting methods, adapters, transfer learning, data augmentation, and active learning.
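To give a rough idea of what "parameter-efficient" means in practice, here is a minimal illustrative sketch of an adapter layer in the spirit of the "Parameter-Efficient Transfer Learning for NLP" paper listed under Model Efficiency: a small bottleneck network with a residual connection that is inserted into a frozen Transformer, so that only a small number of parameters per layer is trained. The hidden size (768, as in BERT-base) and the bottleneck size (64) are example values chosen for illustration, not taken from a specific paper.

  # Minimal adapter sketch (PyTorch): a bottleneck MLP with a residual
  # connection, trained while the backbone Transformer stays frozen.
  import torch
  import torch.nn as nn

  class Adapter(nn.Module):
      def __init__(self, hidden_size: int = 768, bottleneck_size: int = 64):
          super().__init__()
          self.down = nn.Linear(hidden_size, bottleneck_size)  # down-projection
          self.up = nn.Linear(bottleneck_size, hidden_size)    # up-projection
          self.act = nn.GELU()

      def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
          # The residual connection preserves the frozen model's representation;
          # the adapter only learns a small task-specific correction.
          return hidden_states + self.up(self.act(self.down(hidden_states)))

  adapter = Adapter()
  x = torch.randn(2, 16, 768)   # (batch, sequence length, hidden size)
  print(adapter(x).shape)       # torch.Size([2, 16, 768])

With these example sizes, each adapter adds only about 0.1M trainable parameters per layer, a tiny fraction of a full Transformer layer; the papers in the list below discuss this and related techniques in much more detail.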


List of Relevant Papers and Topics (subject to minor changes)

Data Efficiency
  • FreeAL: Towards Human-Free Active Learning in the Era of Large Language Models | paper
  • MELM: Data Augmentation with Masked Entity Language Modeling for Low-Resource NER | paper
  • A Survey of Data Augmentation Approaches for NLP | paper
  • Does the Order of Training Samples Matter? Improving Neural Data-to-Text Generation with Curriculum Learning | paper
  • Applying Natural Annotation and Curriculum Learning to Named Entity Recognition for Under-Resourced Languages | paper
  • Do We Need to Create Big Datasets to Learn a Task? | paper
  • NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework | paper
  • Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes | paper
Model Efficiency
  • ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts | paper
  • Parameter-Efficient Transfer Learning for NLP | paper
  • SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer | paper
  • The Power of Scale for Parameter-Efficient Prompt Tuning | paper
  • SparseGPT: Massive Language Models Can be Accurately Pruned in One-Shot | paper
  • An Efficient Memory-Augmented Transformer for Knowledge-Intensive NLP Tasks | paper
  • Hyperdecoders: Instance-specific decoders for multi-task NLP | paper
  • Don’t Stop Pretraining? Make Prompt-based Fine-tuning Powerful Learner | paper
  • Efficient Multimodal Fusion via Interactive Prompting | paper
  • HyperPrompt: Prompt-based Task-Conditioning of Transformers | paper
  • PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models | paper
  • Learning to Compress Prompts with Gist Tokens | paper

Some words on grading: This seminar is meant to be as interactive as possible. Final grades will be based on students' presentations and (optional) term papers, as well as on participation and discussion in class.

Participants are expected to prepare for each class by reading the relevant papers and, where necessary, doing additional background reading. Based on this preparation, they should be able to discuss the presented papers in depth and follow the relevant context during the discussion.