STAT 157, Spring 19
Table Of Contents

Syllabus

This class provides a practical introduction to deep learning, covering both the theoretical motivations and how to implement the methods in practice. As part of the course we cover multilayer perceptrons, backpropagation, automatic differentiation, and stochastic gradient descent. We then introduce convolutional networks for image processing, starting from the simple LeNet and moving to more recent architectures such as ResNet for highly accurate models. Next, we discuss sequence models and recurrent networks, such as LSTMs and GRUs, and the attention mechanism. Throughout the course we emphasize efficient implementation, optimization, and scalability, e.g. to multiple GPUs and multiple machines. The goal of the course is to provide both a good understanding of modern nonparametric estimators and the ability to build them. The course loosely follows Dive into Deep Learning in terms of notebooks, slides, and assignments.

Date Topics
1/22 Logistics, Software, Linear Algebra
1/24 Probability and Statistics (Bayes Rule, Naive Bayes, Sampling)
1/29 Gradients, Chain Rule, Automatic differentiation
1/31 Linear Regression, Basic Optimization
2/5 Likelihood, Loss Functions, Logistic Regression, Information Theory
2/7 Multilayer Perceptron
2/12 Model Selection, Weight Decay, Dropout
2/14 Numerical Stability, Hardware
2/19 Environment
2/21 Layers, Parameters, GPUs
2/26 Convolutional Layers
2/28 LeNet, AlexNet, VGG, NiN
3/5 Project Midterm Presentation
3/7 Inception, Residual Networks
3/12 Computational Performance, Multi-GPU and Multi-Machine Training
3/14 Image Augmentation, Fine Tuning
3/19 Midterm Exam
3/21 Object Detection I
4/2 Object Detection II, CNN Training Tricks
4/4 Sequence models and Language
4/9 Recurrent neural networks, Making it work - language modeling
4/11 Truncated Backprop, Gated Recurrent Unit, Long Short-Term Memory
4/16 Bi-LSTM, Deep RNNs
4/18 Word2vec, FastText, GloVe, Sentiment Analysis
4/23 Encoder-Decoder, Seq2seq, Machine Translation
4/25 Attention, Transformer, GPT, BERT
4/30 Convex Optimization, Convergence Rate
5/2 Momentum, AdaGrad, RMSProp, AdaDelta, Adam
5/7 Project Final Presentation
5/10 Project Final Presentation