Welcome to Week 3! This week we introduce one of the most important and influential methods in optimization: Newton’s method. Newton’s method uses “second-order” information, meaning it makes use of the second derivative of a function: it approximates the function of interest locally by a quadratic and minimizes that quadratic at each step. Besides Newton’s method, we will have a first look at machine learning, in particular classification problems, and see how these can be approached using gradient descent.
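As a small preview of the lecture, here is a minimal sketch of Newton’s method in one dimension: each iterate minimizes the local quadratic approximation, which amounts to the update x ← x − f′(x)/f″(x). The example function f(x) = x² + eˣ is an assumption chosen for illustration (it is strongly convex, so the method converges quadratically to the unique minimizer); it is not taken from the course notes.

```python
import math

def newton_1d(grad, hess, x0, tol=1e-10, max_iter=50):
    """Newton's method for minimizing a 1-D function.

    Each step minimizes the quadratic model
    f(x) + f'(x) * d + 0.5 * f''(x) * d**2,
    giving the update d = -f'(x) / f''(x).
    """
    x = x0
    for _ in range(max_iter):
        step = grad(x) / hess(x)
        x -= step
        if abs(step) < tol:  # step size is a convergence proxy
            break
    return x

# Illustrative (assumed) example: f(x) = x**2 + exp(x),
# so f'(x) = 2x + exp(x) and f''(x) = 2 + exp(x) > 0 everywhere.
x_star = newton_1d(lambda x: 2 * x + math.exp(x),
                   lambda x: 2 + math.exp(x),
                   x0=0.0)
```

Near the minimizer the number of correct digits roughly doubles per iteration; the conditions needed for this quadratic convergence are the subject of Lecture 5.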

**Learning outcomes**

- Understand Newton’s method and the conditions under which it gives quadratic convergence.
- Know how to formulate a machine learning problem and how optimization plays a role in it.
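To make the second outcome concrete, the following is a minimal sketch (not from the course notes) of how a classification problem becomes an optimization problem: we fit a 1-D logistic model p(y = 1 | x) = σ(wx + b) by gradient descent on the logistic (cross-entropy) loss. The toy data and step size are assumptions for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy, linearly separable data (assumed): label 0 left of zero, label 1 right.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

w, b, lr = 0.0, 0.0, 0.5
for _ in range(2000):
    gw = gb = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)       # predicted probability of class 1
        gw += (p - y) * x / len(xs)  # gradient of average loss w.r.t. w
        gb += (p - y) / len(xs)      # gradient of average loss w.r.t. b
    w -= lr * gw                     # gradient descent step
    b -= lr * gb
```

The same pattern (model, loss, gradient step) carries over unchanged to higher-dimensional data; only the gradient computation grows.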

**Tasks and Materials**

- The lecture notes are available in the Lectures section.
- Work through the problems from Part A and Part B. Part A will be discussed in class.

**Further reading**

- Lecture 5: Section 9.5 of (1), Section 3.3 of (2), Section 1.2.4 of (3).
- Lecture 6: See the paper *Optimization Methods for Large-Scale Machine Learning* by Bottou, Curtis, and Nocedal.

## Literature

1. Stephen Boyd and Lieven Vandenberghe. *Convex Optimization*. Cambridge University Press, 2004.
2. Jorge Nocedal and Stephen J. Wright. *Numerical Optimization*. Springer, 2006.
3. Yurii Nesterov. *Introductory Lectures on Convex Optimization: A Basic Course*. Springer, 2004.