MA-INF 4332: LARGE LANGUAGE MODELS (Seminar)

Winter Semester 2024 – 2025

Content:

What is the Large Language Models seminar about?

Large Language Models (LLMs), such as GPT-4, Gemini, and their successors, have had an enormous impact on various domains, including natural language processing, machine learning, and artificial intelligence. These models have redefined what’s possible in applications such as text generation, translation, summarization, and sentiment analysis. The aim of this seminar is to explore cutting-edge research, insights, and trends in the field of LLMs, such as:

  • Hallucination reduction and factual grounding
  • Explainability, reasoning, faithfulness
  • Safety, toxicity, fairness and bias
  • Social and moral alignment of LLMs
  • Style control and personalization
  • Sustainability, compression, model size reduction, knowledge distillation
  • Multilinguality and multimodality
  • LLMs as planning agents
  • and more

Logistics:

  • Seminars: Tuesdays, 10:15 AM - 11:45 AM, in B-IT 2.113 (Friedrich-Hirzebruch-Allee 6). ZOOM LINK
  • Note: The class is hybrid; attending or presenting online is fine, but be ready to share your screen for the hands-on results!
  • Course Materials: will be uploaded every week on eCampus.
  • Contact: Students should ask all course-related questions in our discussion forum on eCampus. For external inquiries, emergencies, or personal matters, you can contact Prof. Flek, Dr. Florian Mai, or Vahid Sadiri Javadi.
  • Office Hours: Please reach out to us via email first to arrange any in-person meeting.
    • Prof. Dr. Lucie Flek: Friedrich-Hirzebruch-Allee 6 (B-IT) – Room: 2.123
    • Dr. Florian Mai: Friedrich-Hirzebruch-Allee 6 (B-IT) – Room: 2.107
    • Vahid Sadiri Javadi: Friedrich-Hirzebruch-Allee 6 (B-IT) – Room: 2.126

NEWS / UPDATES:

  • 21.10.2024: The first class starts on Tuesday, 22.10.2024, at 10:00 AM.

Instructors:

Prof. Dr. Lucie Flek

flek(at)bit.uni-bonn.de

Head of CAISA Lab

Dr. Florian Mai

fmai(at)bit.uni-bonn.de

Course Instructor

Vahid Sadiri Javadi

vahidsj(at)bit.uni-bonn.de

Course Coordinator


Seminar Work:

1. Presentation (50%):

  • A group of 2-3 students presents every week on a selected topic:
    • You summarize a paper or a set of papers in a presentation.
    • You can showcase your point with a model API or web interface (see the sketch after this list).
    • You can prepare a short hands-on session for the group as part of your presentation (can others fool / hack / break / improve the LLMs in the aspect you discuss?).
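
For the API showcase, a minimal sketch of what a live demo could look like is given below, using the Hugging Face transformers library. The model name and prompt are illustrative assumptions, not course requirements; any model API or web interface is equally fine.

    # Minimal demo sketch (assumes `pip install transformers torch`).
    # The model and the prompt are illustrative placeholders, not course requirements.
    from transformers import pipeline

    # Load a small, openly available text-generation model so the demo runs live.
    generator = pipeline("text-generation", model="gpt2")

    # Let the audience probe the model: can they fool, break, or improve it
    # in the aspect your presentation discusses?
    prompt = "The capital of Australia is"
    result = generator(prompt, max_new_tokens=20, do_sample=False)
    print(result[0]["generated_text"])

A small local model keeps the demo fast and reproducible in the hybrid setting; a hosted API behind a web interface works just as well.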

2. Final Essay (50%):

  • In addition to understanding the technical foundations of LLMs, an important skill is to analyze and communicate complex ideas effectively. For this final essay, you are required to write a five-page paper that engages deeply with a specific topic covered in the seminar. Your essay should demonstrate a critical understanding of LLM technology, review the relevant research and perspectives on your chosen topic, and present a well-structured argument or analysis related to LLMs. You are encouraged to pursue an essay on the same topic as your oral presentation and build on the discussions from your presentation session.

You will develop the essay throughout the semester.

  • Deadlines:
    • Reserve your presentation slot by: 29.10.2024
    • Register your essay plan by: 15.12.2024
    • Hand in your essay by: 02.02.2025
  • Submission: Presentations and the Final Essay should be submitted via eCampus. Further instructions will be announced soon.

Allocation:

  • 3 + 1 SWS
  • Master in Media Informatics: 4 ECTS credits
  • Master in Computer Science: MA-INF 4332 - 4 CP
  • Students must register for the exam on POS/BASIS.

Schedule & Literature:

Week 0 (Tue Oct 22): Organization & Outline
  Presenter: Dr. Florian Mai

Week 1 (Tue Oct 29): Introduction to Foundation Models / LLMs
  • What exactly are foundation models?
  • How do you train foundation models?
  • How do you use foundation models?
  • Main contributions of LLMs to the field (Why/how did this happen?)
  • Open challenges (What still doesn’t work and what LLMs are not made for)
  Resources: [1, 2, 3]
  Presenter: Dr. Florian Mai

Week 2 (Tue Nov 5): Pretraining of LLMs
  • Neural architectures
  • Objectives
  • Data mix
  Resources: [4, 5, 6, 7]

Week 3 (Tue Nov 12): Post-training of LLMs
  • RLHF, DPO
  • Instruction tuning
  • Tool use
  Resources: [8, 9, 10, 11]

Week 4 (Tue Nov 19): Science of LLMs
  • (Why) does in-context learning work?
  • Emergent abilities of LLMs
  • Representational power of LLMs
  Resources: [12, 13, 14, 15]

Week 5 (Tue Nov 26): Bias, robustness, and hallucinations
  • Societal impact of LLMs
  Resources: [16, 17, 18]

Week 6 (Tue Dec 3): Multimodal Language Models
  • Vision-language models
  • Vision-language-action models (robotics)
  • Text-audio models
  Resources: [19, 20, 21]

Week 7 (Tue Dec 10): Knowledge in LLMs
  • Parametric knowledge in LLMs
  • Retrieval-augmented generation (RAG)
  • Unification of LLMs and knowledge graphs
  Resources: [22, 23, 24]

Week 8 (Tue Dec 17): Scalable alignment
  • How to train superhuman AIs?
  • “Debate” technique
  • “Iterative Amplification”
  • “Weak-to-strong generalization”
  Resources: [25, 26, 27, 28]

Week 9 (Tue Jan 7): LLM reasoning
  • Prompting techniques, chain-of-thought
  • Finetuning techniques
  • Reinforcement learning techniques
  Resources: [29, 30, 31, 32]

Week 10 (Tue Jan 14): LLMs as social agents
  • LLM evaluations and training on social tasks
  • Theory of mind in LLMs
  • LLMs’ values
  Resources: [33, 34, 35]

Week 11 (Tue Jan 21): LLM applications
  • LLMs for mathematics
  • LLMs for medicine
  • LLMs’ impact on productivity
  Resources: [36, 37, 38]

Week 12 (Tue Jan 28): CAISA ongoing research
  • Wrap-up
  • Discussion
  Presenter: Dr. Florian Mai