MA-INF 4332: LARGE LANGUAGE MODELS (Seminar)
Winter Semester 2024 – 2025
Content:
What is the Large Language Models seminar about?
Large Language Models (LLMs), such as GPT-4, Gemini, and their successors, have had an enormous impact on various domains, including natural language processing, machine learning, and artificial intelligence. These models have redefined what’s possible in applications such as text generation, translation, summarization, sentiment analysis, and more. The aim of this seminar is to explore cutting-edge research, insights, and trends in the field of LLMs, such as:
- Hallucination reduction and factual grounding
- Explainability, reasoning, faithfulness
- Safety, toxicity, fairness and bias
- Social and moral alignment of LLMs
- Style control and personalization
- Sustainability, compression, model size reduction, knowledge distillation
- Multilinguality and multimodality
- LLMs as planning agents
- and more
Logistics:
- Seminars: Tuesdays, 10:15 AM - 11:45 AM in B-IT 2.113 (Friedrich-Hirzebruch-Allee 6). ZOOM LINK
- Note: The class is hybrid. You may attend or present online, but be ready to share your screen for the hands-on results.
- Course Materials: will be uploaded every week on eCampus.
- Contact: Students should ask all course-related questions in our forum discussion on eCampus. For external inquiries, emergencies, or personal matters, you can contact Prof. Flek, Dr. Florian Mai, or Vahid.
- Office Hours: Please contact us by email first to arrange an in-person meeting.
- Prof. Dr. Lucie Flek: Friedrich-Hirzebruch-Allee 6 (B-IT) – Room: 2.123
- Dr. Florian Mai: Friedrich-Hirzebruch-Allee 6 (B-IT) – Room: 2.107
- Vahid Sadiri Javadi: Friedrich-Hirzebruch-Allee 6 (B-IT) – Room: 2.126
NEWS / UPDATES:
- 21.10.2024: The first class starts on Tuesday, 22.10.2024 at 10:00 AM
Seminar Work:
1. Presentation (50%):
- A group of 2-3 people presents every week on a selected topic:
- You summarize a paper or a set of papers in a presentation
- You can showcase your point with a model API or web interface
- You can prepare a short hands-on session for the group as part of your presentation (can others fool, hack, break, or improve the LLMs in the aspect you discuss?)
2. Final Essay (50%):
- In addition to understanding the technical foundations of LLMs, an important skill is to analyze and communicate complex ideas effectively. For this final essay, you are required to write a five-page paper that engages deeply with a specific topic covered in the seminar. Your essay should demonstrate a critical understanding of LLM technology, review the relevant research and perspectives on your chosen topic, and present a well-structured argument or analysis related to LLMs. You are encouraged to pursue an essay on the same topic as your oral presentation and build on the discussions from your presentation session.
You will develop the essay throughout the semester.
- Deadlines:
- Block your presentation slot by: 29.10.2024
- Register your essay plan by: 15.12.2024
- Hand in your essay by: 02.02.2025
- Submission: Presentations and the Final Essay should be submitted via eCampus. Further instructions will be announced soon.
Allocation:
- 3 + 1 SWS
- Master in Media Informatics: 4 ECTS credits
- Master in Computer Science: MA-INF 4332 - 4 CP
- Students must register for the exam on POS/BASIS.
Literature:
- Bommasani, Rishi, et al., 2021, arXiv preprint: On the opportunities and risks of foundation models
- Devlin, Jacob, et al., 2018, arXiv preprint: BERT: Pre-training of deep bidirectional transformers for language understanding
- Brown, Tom, et al., 2020, arXiv preprint: Language models are few-shot learners
- Zhao, Wayne Xin, et al., 2023, arXiv preprint: A survey of large language models
- Yang, Jingfeng, et al., 2023, arXiv preprint: Harnessing the power of LLMs in practice: A survey on ChatGPT and beyond
- Awesome Foundation Models: GitHub Repository
Week | Date | Description | Resources | Presenter |
---|---|---|---|---|
Week 0 | Tue Oct 22 | Organization & Outline | | Dr. Florian Mai |
Week 1 | Tue Oct 29 | Introduction to Foundation Models / LLMs | [1, 2, 3] | Dr. Florian Mai |
Week 2 | Tue Nov 5 | Pretraining of LLMs | [4, 5, 6, 7] | |
Week 3 | Tue Nov 12 | Posttraining of LLMs | [8, 9, 10, 11] | |
Week 4 | Tue Nov 19 | Science of LLMs | [12, 13, 14, 15] | |
Week 5 | Tue Nov 26 | Bias, robustness, and hallucinations | [16, 17, 18] | |
Week 6 | Tue Dec 3 | Multimodal Language Models | [19, 20, 21] | |
Week 7 | Tue Dec 10 | Knowledge in LLMs | [22, 23, 24] | |
Week 8 | Tue Dec 17 | Scalable alignment | [25, 26, 27, 28] | |
Week 9 | Tue Jan 7 | LLM reasoning | [29, 30, 31, 32] | |
Week 10 | Tue Jan 14 | LLMs as social agents | [33, 34, 35] | |
Week 11 | Tue Jan 21 | LLM applications | [36, 37, 38] | |
Week 12 | Tue Jan 28 | CAISA ongoing research | | Dr. Florian Mai |