Main Literature Reading (2): Optimization Algorithms
# Outline

- First-order optimization
- SGD and its variants

# Machine Learning Review

A set of data: $X = \{x_n\}_{n=1}^N \subset \mathcal{X}$, optionally with labels $Y = \{y_n\}_{n=1}^N \subset \mathcal{Y}$.

A loss function $L : \mathcal{Y} \times \mathcal{Y} \mapsto \mathbb{R}$...