Proximal gradient methods for learning
Proximal gradient (forward-backward splitting) methods for learning are an area of research in optimization and statistical learning theory which studies algorithms for a general class of convex regularization problems where the regularization penalty may not be differentiable. One such example is ℓ1 regularization (also known as Lasso) of the form

min_{w ∈ R^d}  (1/n) Σ_{i=1}^n (y_i − ⟨w, x_i⟩)^2 + λ‖w‖_1,

where the x_i ∈ R^d are inputs, the y_i ∈ R are outputs, and λ > 0 controls the strength of the penalty.
Proximal gradient methods offer a general framework for solving regularization problems from statistical learning theory with penalties that are tailored to a specific problem application.[1][2] Such customized penalties can help to induce certain structure in problem solutions, such as sparsity (in the case of lasso) or group structure (in the case of group lasso).
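The forward-backward iteration alternates a gradient step on the smooth loss with an application of the proximity operator of the penalty; for the ℓ1 penalty this operator is elementwise soft-thresholding, giving the classical ISTA algorithm. A minimal sketch for the lasso problem above (function names and the fixed step-size choice are illustrative, not from the source):

```python
import numpy as np

def soft_threshold(v, t):
    # Proximity operator of t * ||.||_1: elementwise soft-thresholding.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(X, y, lam, step=None, n_iter=1000):
    """Proximal gradient (ISTA) for min_w (1/n)||Xw - y||^2 + lam*||w||_1."""
    n, d = X.shape
    if step is None:
        # 1/L, where L = (2/n)||X||_2^2 is the Lipschitz constant
        # of the gradient of the smooth least-squares term.
        step = n / (2.0 * np.linalg.norm(X, 2) ** 2)
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = (2.0 / n) * X.T @ (X @ w - y)          # forward (gradient) step
        w = soft_threshold(w - step * grad, step * lam)  # backward (proximal) step
    return w
```

On data generated from a sparse ground truth, the iterates set most irrelevant coordinates exactly to zero, which is the sparsity-inducing behavior the lead paragraph describes.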
^Combettes, Patrick L.; Wajs, Valérie R. (2005). "Signal Recovery by Proximal Forward-Backward Splitting". Multiscale Model. Simul. 4 (4): 1168–1200. doi:10.1137/050626090. S2CID 15064954.
^Mosci, S.; Rosasco, L.; Santoro, M.; Verri, A.; Villa, S. (2010). "Solving Structured Sparsity Regularization with Proximal Methods". Machine Learning and Knowledge Discovery in Databases. Lecture Notes in Computer Science. Vol. 6322. pp. 418–433. doi:10.1007/978-3-642-15883-4_27. ISBN 978-3-642-15882-7.