July 18th | LLM Inference and Reasoning | Zhou Zijian, NUS
Inference: - What are the inputs and outputs of an LLM?
- Difference between pre-filling and auto-regressive decoding
- Auto-regressive decoding:
- How are tokens sampled from the model's output distribution?
- What are top-k, top-p, temperature?
- How does the LLM know when to stop?
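The sampling controls listed above (temperature, top-k, top-p) can be sketched in a few lines. This is an illustrative NumPy sketch, not code from the talk; the function name and defaults are assumptions:

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, rng=None):
    """Sample a token id from raw logits with temperature, top-k, and top-p filtering.

    Illustrative sketch only: real decoders apply the same ideas per batch on GPU.
    """
    rng = rng or np.random.default_rng(0)
    logits = np.asarray(logits, dtype=np.float64) / max(temperature, 1e-8)
    probs = np.exp(logits - logits.max())        # stable softmax
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]              # token ids, most probable first
    sorted_p = probs[order]
    keep = np.ones_like(sorted_p, dtype=bool)
    if top_k is not None:                        # keep only the k most probable tokens
        keep[top_k:] = False
    if top_p is not None:                        # keep the smallest prefix whose cumulative prob reaches top_p
        cum = np.cumsum(sorted_p)
        keep &= np.concatenate(([True], cum[:-1] < top_p))
    filtered = np.where(keep, sorted_p, 0.0)
    filtered /= filtered.sum()                   # renormalize over surviving tokens
    return int(order[rng.choice(len(order), p=filtered)])
```

Lower temperature sharpens the distribution toward greedy decoding; top-k and top-p instead truncate the tail before sampling. Stopping is handled separately: decoding ends when the sampled id equals the tokenizer's end-of-sequence token or a length limit is hit.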
Reasoning: - What is reasoning in its fundamental sense?
- Why is reasoning important for LLMs?
- Two approaches to achieving reasoning:
- Using a fine-tuned model
- Prompting
Slides · Recording
July 23rd | Post-Training Reasoning Models | Zhi Wang, UCSD
Key Topics: - Motivation for post-training: overcoming scaling limits of pre-training and enabling LLMs to "think"
- Introducing temporal reasoning via Chain-of-Thought (CoT) and Tree-of-Thought (ToT)
- Supervised Fine-Tuning (SFT) on reasoning data: objectives and benefits
- Reinforcement Learning with Verifiable Rewards (RLVR) and GRPO (Group Relative Policy Optimization)
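The core of GRPO is replacing a learned value baseline with group statistics: sample several responses per prompt, score each with a verifiable reward, and normalize within the group. A minimal sketch of that advantage computation, assuming binary correctness rewards (illustrative, not the speaker's code):

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages: normalize each reward against its group's mean/std.

    `rewards` holds verifiable scores (e.g., 1.0 if the final answer checks out,
    else 0.0) for G responses sampled from the same prompt.
    """
    r = np.asarray(rewards, dtype=np.float64)
    return (r - r.mean()) / (r.std() + eps)

# Hypothetical example: 4 sampled answers to one math prompt, two correct
adv = grpo_advantages([1.0, 0.0, 0.0, 1.0])
```

These advantages then weight a clipped PPO-style policy-gradient objective; because the baseline comes from the group itself, no separate critic network is trained.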
Applications & Insights: - Practical design of reasoning-oriented pipelines for math and code tasks
- Techniques to enhance reasoning during inference without retraining
- Discussion on current limitations and future research directions in scalable reasoning for LLMs
Slides · Recording
August 8th | Foundational Methods for Foundation Models for Scientific Machine Learning | Michael Mahoney, UC Berkeley, Amazon
Key Points: - Pre-train & fine-tune paradigm for SciML, adapted from NLP/CV
- Scaling laws: model size, data size vs. fine-tuning performance
- Out-of-distribution transfer across physics parameters
- Multi-task pre-training across physics problems
- Failure modes at the SciML–ML interface and mitigation strategies
- Deployment at scale using HPC environments like NERSC
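The scaling-law bullet above typically refers to empirical power-law fits of loss against model or data size. A toy illustration of recovering such an exponent by linear regression in log-log space; the synthetic data and exponent are assumptions for demonstration only:

```python
import numpy as np

# Hypothetical fine-tuning losses at increasing pre-training dataset sizes;
# a power law L(D) = a * D**(-alpha) is a common empirical scaling-law form.
D = np.array([1e3, 1e4, 1e5, 1e6])
L = 5.0 * D ** -0.25

# Fit alpha and log(a) by linear regression in log-log space:
# log L = log a - alpha * log D
slope, intercept = np.polyfit(np.log(D), np.log(L), 1)
alpha = -slope          # recovered exponent (~0.25 for this synthetic data)
a = np.exp(intercept)   # recovered coefficient (~5.0)
```

In practice such fits are run over real (model size, data size, loss) measurements to extrapolate how much pre-training a target fine-tuning performance requires.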
Slides · Recording
August 13th | Learning from Experience, AKA Reinforcement Learning (2024 Turing Award topic for research and business) | Yuxi Li, University of Alberta, AI4All Institute
First principles: - Learning from experience
- Iterative improvement based on ground truth
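The two first principles above, learning from experience and iterative improvement against ground truth, are captured by even the simplest RL setting. A minimal epsilon-greedy bandit sketch (an illustrative example, not from the talk; arm means and hyperparameters are made up):

```python
import random

def run_bandit(true_means, steps=5000, eps=0.1, seed=0):
    """Epsilon-greedy bandit: improve action-value estimates from sampled experience."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    values = [0.0] * n_arms        # running estimates of each arm's mean reward
    for _ in range(steps):
        # explore with probability eps, otherwise exploit the current best estimate
        if rng.random() < eps:
            a = rng.randrange(n_arms)
        else:
            a = max(range(n_arms), key=values.__getitem__)
        r = rng.gauss(true_means[a], 1.0)          # noisy ground-truth feedback
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]   # incremental mean update
    return values, counts

values, counts = run_bandit([0.1, 0.5, 0.9])
```

Each interaction both generates experience (the sampled reward) and iteratively refines the agent's estimates toward the ground-truth means, so the best arm ends up chosen most often.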
Research: - Pursuing truth vs. following trends
- Autonomous, optimal, and adaptive agents
- Simulation, integration of (world) model and data
- Exploring alternative approaches to data collection, architectures, and algorithms
- "Small" language models, modularity, generalist vs specialist
Business: - Value investment
- AI vs IT
- Code LLMs
- Experience data collection
- Decentralized AI, a.k.a. AI + blockchain, in particular for stablecoins
Slides · Recording