
Stanford CME295 Transformers & LLMs | Autumn 2025 | Lecture 4 - LLM Training
For more information about Stanford’s graduate programs, visit: https://online.stanford.edu/graduate-education
October 17, 2025
This lecture covers:
• Pretraining
• Quantization
• Hardware optimization
• Supervised finetuning (SFT)
• Parameter-efficient finetuning (LoRA)
To follow along with the course schedule and syllabus, visit: https://cme295.stanford.edu/syllabus/
Chapters:
00:00:00 Introduction
00:07:19 Pretraining
00:13:26 FLOPs, FLOPS
00:16:34 Scaling laws, Chinchilla law
00:24:49 Training optimizations overview
00:31:09 Data parallelism with ZeRO
00:35:51 Model parallelism
00:38:26 Flash Attention
00:52:37 Quantization
00:56:00 Mixed precision training
01:02:31 Supervised finetuning
01:09:21 Instruction tuning
01:37:53 Parameter-efficient finetuning with LoRA
01:45:16 QLoRA
Afshine Amidi is an Adjunct Lecturer at Stanford University.
Shervine Amidi is an Adjunct Lecturer at Stanford University.
Stanford Online
You can gain access to a world of education through Stanford Online, the Stanford School of Engineering’s portal for academic and professional education offered by schools and units throughout Stanford University. https://online.stanford.edu/