Overview
This is the course website for M1399.000400/M3309.005200: “Deep Learning: A Statistical Perspective” at Seoul National University in Fall 2023. Course schedule and assignments will be available on this website.
Announcements
Course Information
Instructor: Joong-Ho (Johann) Won (wonj AT stats DOT snu DOT ac DOT kr)
Class Time: Tue/Thu 14:00 - 15:15 @ 25-405
Office Hours: By appointment.
Textbook: There is no required textbook, but [ESL] and [Ripley] will be frequently referred to.
Books
- Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press (2016).
- Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press (2014).
- [ESL] Trevor Hastie, Robert Tibshirani, and Jerome Friedman, The Elements of Statistical Learning, 2nd Ed., Springer (2009).
- Christopher Bishop, Pattern Recognition and Machine Learning, Springer (2006).
- Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification, 2nd Ed., Wiley (2001).
- [Ripley] Brian Ripley, Pattern Recognition and Neural Networks, Cambridge University Press (1996).
Review articles
- Yann LeCun, Yoshua Bengio, and Geoffrey Hinton, Deep Learning, Nature 521, 436–444 (2015).
- Jianqing Fan, Cong Ma, and Yiqiao Zhong, A Selective Overview of Deep Learning, Statistical Science 36(2), 264–290 (May 2021).
- Bing Cheng and D. M. Titterington, Neural Networks: A Review from a Statistical Perspective, Statistical Science 9(1), 2–30 (February 1994), with discussion.
Online resources
- Analyses of Deep Learning, Stanford University Stat, 2019.
- Foundations of Deep Learning, UCLA CS, 2019.
- Advanced Topics in Deep Learning, HKUST Math, 2020.
Course Objectives
- The goals of the course are to
  - learn some elements of learning theory;
  - learn deep neural networks (DNNs); and
  - solve a real-life problem, including implementation.
- We cover some theoretical elements of deep learning in both supervised and unsupervised settings.
- We do not cover computational issues such as parallel computing.
- An introduction to Python and deep learning frameworks (PyTorch and TensorFlow) will be provided to help you get started.
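Since the early weeks move from the Python introduction into linear classification, a warm-up along these lines gives the flavor of what the hands-on portion might look like. This is an illustrative sketch only (a NumPy-only logistic classifier on synthetic data; the data, learning rate, and iteration count are made up for the example, not course materials):

```python
import numpy as np

# Illustrative sketch: a linear (logistic) classifier trained by
# plain gradient descent on toy data. All names and numbers here
# are assumptions for demonstration, not course content.

rng = np.random.default_rng(0)

# Two Gaussian blobs in R^2, one per class.
X = np.vstack([rng.normal(-1.0, 1.0, (50, 2)),
               rng.normal(+1.0, 1.0, (50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])

w = np.zeros(2)
b = 0.0
lr = 0.1  # illustrative step size

for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid probabilities
    grad_w = X.T @ (p - y) / len(y)         # gradient of mean log-loss
    grad_b = np.mean(p - y)
    w -= lr * grad_w
    b -= lr * grad_b

acc = np.mean(((1.0 / (1.0 + np.exp(-(X @ w + b)))) > 0.5) == y)
print(f"training accuracy: {acc:.2f}")
```

The same model is a one-liner in PyTorch or TensorFlow (`nn.Linear` plus a sigmoid loss); writing the gradient by hand first makes the framework version easier to follow.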
Course Overview
Assessment
The course will be graded based on the following components:
- Attendance (10%): mandatory.
- Project (90%): projects may include
  - implementation of a challenging real-world data example;
  - numerical experiments or a theoretical exposition that shed light on poorly understood aspects of architecture or optimization; or
  - a critical literature review of a topic.
- A one- or two-page written proposal will be presented and evaluated for relevance to the course; this serves as the Midterm Exam.
- The Final Exam consists of a publication-worthy paper and a presentation, with an explicit statement of each team member's contribution.
Schedule
The following schedule is tentative and subject to change during the semester.
Week | Topic | Reading assignment | Due Date |
---|---|---|---|
1 (9/5, 9/7) | Introduction, linear classification | LeCun, Bengio, & Hinton, Cheng & Titterington, ESL Ch. 4 | - |
2 (9/12, 9/14) | Python/deep learning frameworks | - | - |
3 (9/19, 9/21) | Python/deep learning frameworks, linear classification | ESL Ch. 4 | - |
4 (9/26, 9/28) | SVM, RKHS | ESL Chs. 4, 12 | - |
5 (10/3, 10/5) | SVM, RKHS | ESL Chs. 5, 12 | - |
6 (10/10, 10/12) | SVM, RKHS, Multi-layer perceptron | ESL Ch. 11 | - |
7 (10/17, 10/19) | Multi-layer perceptron, backpropagation | Ripley Ch. 5 | project proposal |
8 (10/24, 10/26) | Proposal presentation, MLP (cont’d) | Universal approximation bounds for superpositions of a sigmoidal function | - |
9 (10/31, 11/2) | Benefits of deep models | Error bounds for approximations with deep ReLU networks | - |
10 (11/7, 11/9) | Deep supervised learning models I | Mad Max: Affine Spline Insights Into Deep Learning | - |
11 (11/14, 11/16) | Deep supervised learning models II | - | - |
12 (11/21, 11/23) | Deep unsupervised learning models | Representation Learning: A Review and New Perspectives; Auto-Encoding Variational Bayes; Variational Inference: A Review for Statisticians | - |
13 (11/28, 11/30) | Deep generative models | Generative adversarial nets | - |
14 (12/5, 12/7) | Statistical learning theory | Ripley Ch. 5, Shalev-Shwartz & Ben-David Chs. 6, 26, 27 | - |
15 (12/12, 12/14) | Project presentation | - | term paper |