Series Ⅰ


Lecture Guests

Michael Mahoney
Professor at the University of California, Berkeley; Amazon Scholar
mmahoney@stat.berkeley.edu

Yuxi Li
PhD in Computer Science at the University of Alberta
yuxili@gmail.com

Zhi Wang
PhD in Data Science at the University of California San Diego
zhw119@ucsd.edu

Zhou Zijian
PhD in Computer Science at the National University of Singapore
zhou_zijian@u.nus.edu


Schedule

Date · Guest Lecture · Supplemental Readings
July 18th LLM Inference and Reasoning
Zhou Zijian, NUS

Inference:
  • What are the inputs and outputs of an LLM?
  • Difference between pre-filling and auto-regressive decoding
  • Auto-regressive decoding:
    • How are tokens sampled from the model's output distribution?
    • What are top-k, top-p, and temperature? (see the sampling sketch after this list)
    • How does the LLM know when to stop?
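
For concreteness, here is a minimal sketch of one sampling step during auto-regressive decoding. It assumes `logits` is the model's score vector over the vocabulary for the next token; the function name, defaults, and NumPy-only implementation are illustrative, not the API of any particular inference library.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=50, top_p=0.9, rng=None):
    """Illustrative next-token sampling with temperature, top-k, and top-p filtering."""
    rng = rng or np.random.default_rng()
    # Temperature: scale logits before softmax; <1 sharpens, >1 flattens the distribution.
    logits = np.asarray(logits, dtype=np.float64) / max(temperature, 1e-8)
    # Top-k: keep only the k highest-scoring tokens.
    if top_k is not None and top_k < logits.size:
        kth_largest = np.partition(logits, -top_k)[-top_k]
        logits = np.where(logits < kth_largest, -np.inf, logits)
    # Softmax over the remaining candidates.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # Top-p (nucleus): keep the smallest set of top tokens whose total mass reaches p.
    if top_p is not None and top_p < 1.0:
        order = np.argsort(probs)[::-1]
        cutoff = np.searchsorted(np.cumsum(probs[order]), top_p) + 1
        keep = order[:cutoff]
        filtered = np.zeros_like(probs)
        filtered[keep] = probs[keep]
        probs = filtered / filtered.sum()
    # Draw one token id from the filtered distribution.
    return int(rng.choice(probs.size, p=probs))
```

In a full decoding loop, this step repeats with the sampled token appended to the context, and generation stops when the sampled id is the tokenizer's end-of-sequence token or a maximum length is reached.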
Reasoning:
  • What is reasoning in its fundamental sense?
  • Why is reasoning important for LLMs?
  • Two approaches to achieving reasoning:
    • Using a fine-tuned model
    • Prompting
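
As a concrete instance of the prompting approach, the sketch below shows one-shot chain-of-thought prompting. `generate` is a placeholder for whatever text-completion call is available; it is not a specific library function.

```python
# One worked example with explicit intermediate steps, used as the prompt prefix.
COT_EXAMPLE = (
    "Q: A train travels 60 km in 1.5 hours. What is its average speed?\n"
    "A: Let's think step by step. Distance is 60 km and time is 1.5 hours, "
    "so speed = 60 / 1.5 = 40 km/h. The answer is 40 km/h.\n"
)

def answer_with_cot(question, generate):
    # The worked example plus the cue "Let's think step by step" nudges the model
    # to write out its reasoning before committing to a final answer.
    prompt = COT_EXAMPLE + f"Q: {question}\nA: Let's think step by step."
    return generate(prompt)
```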
Slides · Recording
July 23rd Post-Training Reasoning Models
Zhi Wang, UCSD

Key Topics:
  • Motivation for post-training: overcoming scaling limits of pre-training and enabling LLMs to "think"
  • Introducing temporal reasoning via Chain-of-Thought (CoT) and Tree-of-Thought (ToT)
  • Supervised Fine-Tuning (SFT) on reasoning data: objectives and benefits
  • Reinforcement Learning with Verifiable Rewards (RLVR) and GRPO (Group Relative Policy Optimization)
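
The core of GRPO can be sketched in a few lines: several completions are sampled for the same prompt, each is scored with a verifiable reward (for example, whether the final answer checks out), and each completion's advantage is its reward normalized within the group, so no learned value network is needed. This is only one ingredient; the full objective also uses a clipped importance ratio and a KL penalty against a reference model.

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Advantage of each completion relative to its own sampled group."""
    rewards = np.asarray(rewards, dtype=np.float64)
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Example: four sampled solutions to one math problem, two of which verify as correct.
advantages = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
# Correct completions receive positive advantage (up-weighted in the policy gradient);
# incorrect ones receive negative advantage (down-weighted).
```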
Applications & Insights:
  • Practical design of reasoning-oriented pipelines for math and code tasks
  • Techniques to enhance reasoning during inference without retraining (a self-consistency sketch follows this list)
  • Discussion on current limitations and future research directions in scalable reasoning for LLMs
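
One widely used inference-time technique of this kind is self-consistency: sample several reasoning paths and take a majority vote over their final answers. The sketch below assumes hypothetical `generate` and `extract_answer` helpers rather than any specific library.

```python
from collections import Counter

def self_consistent_answer(prompt, generate, extract_answer, n_samples=8):
    # Sampling with nonzero temperature yields diverse chains of thought.
    completions = [generate(prompt, temperature=0.8) for _ in range(n_samples)]
    answers = [extract_answer(c) for c in completions]
    # The most frequent final answer across the sampled paths wins.
    return Counter(answers).most_common(1)[0][0]
```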
Slides · Recording
August 8th Foundational Methods for Foundation Models for Scientific Machine Learning
Michael Mahoney, UC Berkeley, Amazon

Key Points:
  • Pre-train & fine-tune paradigm for SciML, adapted from NLP/CV
  • Scaling laws: how model size and data size relate to fine-tuning performance (see the fitting sketch after this list)
  • Out-of-distribution transfer across physics parameters
  • Multi-task pre-training across physics problems
  • Failure modes at SciML–ML interface & mitigation strategies
  • Deployment at scale using HPC environments like NERSC
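
As a toy illustration of the scaling-law point above, an empirical law of the form error ~ a * N**(-b) can be fit by linear regression in log-log space. The numbers below are invented for illustration and are not results from the lecture.

```python
import numpy as np

sizes = np.array([1e5, 1e6, 1e7, 1e8])        # e.g., model parameter counts (made up)
errors = np.array([0.30, 0.17, 0.10, 0.06])   # e.g., fine-tuning test errors (made up)

# Fit log(error) = log(a) - b * log(N), i.e., a straight line in log-log space.
slope, intercept = np.polyfit(np.log(sizes), np.log(errors), 1)
a, b = np.exp(intercept), -slope
print(f"error ~ {a:.3f} * N^(-{b:.3f})")
```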
Slides · Recording
August 13th Learning from Experience, a.k.a. Reinforcement Learning (the 2024 Turing Award topic), for Research and Business
Yuxi Li, University of Alberta, AI4All Institute

First principles:
  • Learning from experience
  • Iterative improvement based on ground truth
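
A minimal sketch of learning from experience is tabular Q-learning, where value estimates are improved iteratively from observed rewards. The environment interface below (reset() returning a state, step(action) returning (next_state, reward, done)) is a generic assumption, not a specific library.

```python
import random
from collections import defaultdict

def q_learning(env, n_actions, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    Q = defaultdict(float)  # Q[(state, action)] -> estimated return
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Epsilon-greedy: mostly exploit current estimates, occasionally explore.
            if random.random() < epsilon:
                action = random.randrange(n_actions)
            else:
                action = max(range(n_actions), key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            # Iterative improvement driven by the experienced reward signal.
            best_next = 0.0 if done else max(Q[(next_state, a)] for a in range(n_actions))
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = next_state
    return Q
```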
Research:
  • Pursuing truth vs. following trends
  • Autonomous, optimal, and adaptive agents
  • Simulation; integration of (world) models and data
  • Exploring alternative approaches to data collection, architectures, and algorithms
  • "Small" language models, modularity, generalist vs. specialist
Business:
  • Value investment
  • AI vs IT
  • Code LLMs
  • Experience data collection
  • Decentralized AI (AI + blockchain), in particular for stablecoins
Slides · Recording