Truncated Newton method
The Truncated Newton Method originated in a paper by Ron Dembo and Trond Steihaug first published in Mathematical Programming (Dembo, R.S., Steihaug, T. Truncated-Newton algorithms for large-scale unconstrained optimization. Mathematical Programming 26, 190–212)'. Convergence results for this algorithm can be found in: Dembo, Ron S., Stanley C. Eisenstat, and Trond Steihaug. "Inexact newton methods." SIAM Journal on Numerical analysis 19.2 (1982): 400-408. Also known as Hessian-free optimization,[1] are a family of optimization algorithms designed for optimizing non-linear functions with large numbers of independent variables. A truncated Newton method consists of repeated application of an iterative optimization algorithm to approximately solve Newton's equations, to determine an update to the function's parameters. The inner solver is truncated, i.e., run for only a limited number of iterations. It follows that, for truncated Newton methods to work, the inner solver needs to produce a good approximation in a finite number of iterations;[2] conjugate gradient has been suggested and evaluated as a candidate inner loop.[1] Another prerequisite is good preconditioning for the inner algorithm.[3]
References
- Martens, James (2010). Deep learning via Hessian-free optimization (PDF). Proc. International Conference on Machine Learning.
- Nash, Stephen G. (2000). "A survey of truncated-Newton methods". Journal of Computational and Applied Mathematics. 124 (1–2): 45–59. doi:10.1016/S0377-0427(00)00426-X.
- Nash, Stephen G. (1985). "Preconditioning of truncated-Newton methods" (PDF). SIAM J. Sci. Stat. Comput. 6 (3): 599–616.
Further reading
- Grippo, L.; Lampariello, F.; Lucidi, S. (1989). "A Truncated Newton Method with Nonmonotone Line Search for Unconstrained Optimization". J. Optimization Theory and Applications. 60 (3). CiteSeerX 10.1.1.455.7495.
- Nash, Stephen G.; Nocedal, Jorge (1991). "A numerical study of the limited memory BFGS method and the truncated-Newton method for large scale optimization". SIAM J. Optim. 1 (3): 358–372. CiteSeerX 10.1.1.474.3400.