
Related: TFIDF
[1706.10207] Optimization Methods for Supervised Machine Learning: From Linear Models to Deep Learning
[1909.03550] Lecture Notes: Optimization for Machine Learning
[1401.7020] A Stochastic Quasi-Newton Method for Large-Scale Optimization
[1906.06821] A Survey of Optimization Methods from a Machine Learning Perspective
[1508.02087] A Linearly-Convergent Stochastic L-BFGS Algorithm
[1602.03943] Second-Order Stochastic Optimization for Machine Learning in Linear Time
[1805.08095] Small steps and giant leaps: Minimal Newton solvers for Deep Learning
[1309.1369] Semistochastic Quadratic Bound Methods
[1807.00172] Algorithms for solving optimization problems arising from deep neural net models: smooth problems
[1908.10400] On the Convergence Theory of Gradient-Based Model-Agnostic Meta-Learning Algorithms
Mentions
[1602.03943] Second-Order Stochastic Optimization for Machine Learning in Linear Time
[1412.1193] New insights and perspectives on the natural gradient method
[1505.02250] Newton Sketch: A Linear-time Optimization Algorithm with Linear-Quadratic Convergence
[1403.6382] CNN Features off-the-shelf: an Astounding Baseline for Recognition
[1601.04738] Sub-Sampled Newton Methods II: Local Convergence Rates
[1311.6547] Practical Inexact Proximal Quasi-Newton Method with Global Complexity Analysis
[1309.2388] Minimizing Finite Sums with the Stochastic Average Gradient
Related: Semantic Math
[1907.06032] Minimal Sample Subspace Learning: Theory and Algorithms
[1812.10426] Stochastic Trust Region Inexact Newton Method for Large-scale Machine Learning
[1411.6725] Accelerated Parallel Optimization Methods for Large Scale Machine Learning
[1803.03239] Fairness Through Computationally-Bounded Awareness
[1907.04472] The stochastic multi-gradient algorithm for multi-objective optimization and its application to supervised machine learning
[1303.6935] Efficiently Using Second Order Information in Large l1 Regularization Problems