Seminar: Efficient and Robust Natural Language Processing
Dr. Simon Ostermann
Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI)
First Session: Wed 12:15–13:45, Room 1.14
(The final seminar slot will be set in the first session)
If you would like to participate, please send an email to efficient-nlp-seminar@dfki.de
by April 10th (23:59).
In your email, please:
- Give your name, semester, study program
- Write some words on why you want to take part in this course
- Outline your relevant prior experience:
- your background in deep learning or machine learning
- your background in natural language processing in general
Prerequisites
This seminar is primarily targeted at Master's students, but is also open to advanced Bachelor's students. We expect you to bring a curious mind and solid familiarity with large language models. At the very least, we expect all students to have read (and understood :-)) the BERT paper and the Transformer paper.
Seminar Content
Large Language Models (LLMs) achieve impressive results across a wide variety of NLP tasks and languages. This increase in performance comes at a cost: better models typically require more parameters, more training data, more memory, more energy, and longer inference times, making both training and deployment prohibitively expensive at scale.
This seminar asks: how can we build NLP systems that are not only powerful, but efficient and robust? We explore this question across five thematic parts:
- Efficient architecture: how model design choices such as Mixture-of-Experts routing and weight pruning reduce the computational cost of large models from the ground up.
- Efficient fine-tuning (PEFT): how to adapt large pre-trained models to new tasks without updating all parameters, covering adapters, prompt tuning, LoRA variants, and memory-efficient training strategies.
- Data efficiency: how to select, curate, and generate training data more intelligently, including pre-training data curation at scale, active learning for fine-tuning, and synthetic data generation via self-play and distillation.
- Efficient inference: how to reduce the cost of running models at deployment time through IO-aware attention algorithms, speculative decoding, multi-head decoding, and prompt compression.
- Robustness: how robust are efficient LLMs to adversarial manipulation and distribution shift, and what trade-offs arise between efficiency and robustness?
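To give a flavour of the second part, here is a minimal NumPy sketch of the core idea behind LoRA (low-rank adaptation): the pre-trained weight matrix stays frozen, and only a small low-rank update is trained. All names and dimensions below are illustrative toy choices, not code from any paper or library.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 16, 16, 4, 8  # toy sizes; the rank r is much smaller than d_in

W = rng.normal(size=(d_out, d_in))     # frozen pre-trained weight (not updated)
A = rng.normal(size=(r, d_in)) * 0.01  # trainable low-rank factor
B = np.zeros((d_out, r))               # initialised to zero, so training starts from the base model

def lora_forward(x):
    # Frozen base path plus the low-rank update B @ A, scaled by alpha / r.
    return x @ W.T + (x @ A.T) @ B.T * (alpha / r)

x = rng.normal(size=(2, d_in))
# Because B starts at zero, the adapted layer initially matches the frozen base layer.
assert np.allclose(lora_forward(x), x @ W.T)

# Only r * (d_in + d_out) parameters are trained, versus d_in * d_out for full fine-tuning.
trainable, full = r * (d_in + d_out), d_in * d_out
```

Even in this toy setting, the trainable parameter count (128) is half that of full fine-tuning (256); for realistic model dimensions the savings are several orders of magnitude.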
The seminar consists of ten thematic sessions across these five parts. Students present a paper of their choice from each session's paper pool and lead the class discussion. The paper pool spans foundational work as well as recent results from venues such as ACL, EMNLP, NeurIPS, and ICML.
A syllabus and preliminary list of papers to choose from can be found here.
Some words on grading: This seminar is meant to be as interactive as possible. Final grades will be based on students' presentations and (optional) term papers, as well as on participation and discussion in class.
Participants are expected to prepare for each session accordingly, by reading the relevant papers and doing additional background reading where necessary. Based on this preparation, participants should be able to discuss the presented papers in depth and to follow the relevant context during the discussion.
