Math 565: Lecture Notes and Videos on Optimization for Machine Learning
|
Copyright: I (Bala Krishnamoorthy) hold the copyright
for all lecture scribes/notes, documents, and other
materials including videos posted on these course web
pages. These materials may not be used for commercial
purposes without my consent.
|
Scribes from all lectures so far (as a single big file)
| Lec | Date | Topic(s) | Scribe | Video |
| 1 | Jan 13 | syllabus, logistics, ML problems: clustering, classification, regression, optimization in 1D, regression via minimization | Scb1 | Vid1 |
| 2 | Jan 15 | \(\nabla J = \mathbf{0} \Rightarrow D^TD \mathbf{w} = D^T \mathbf{y}\), optimization in graphs, using \(D = QR\), Tikhonov regularization, binary classification | Scb2 | Vid2 |
| 3 | Jan 20 | support vector machine (SVM), Taylor expansion, local optimality in 1D, gradient descent (python), optimality in \(d\)-dim | Scb3 | Vid3 |
| 4 | Jan 22 | local optimality: second order conditions, convex (cvx) sets + functions, properties, \(f(g(\mathbf{w}))\) cvx when \(f\) cvx + \(g\) linear | Scb4 | Vid4 |
| 5 | Jan 27 | local min of cvx \(f \Rightarrow\) global min, first + second derivative conditions for convexity, strict convexity, computing \(\nabla J\), updating \(\alpha_t\) | Scb5 | Vid5 |
| 6 | Jan 29 | second order conditions example, line search for \(\alpha_t\), additively separable loss \(J = \sum_i J_i\), stochastic gradient descent (SGD) | Scb6 | Vid6 |
Last modified: Tue Jan 27 23:22:18 PST 2026
|