### MACHINE LEARNING IN FINANCE

**25% Super Early Bird Discount until Friday January 11th 2019**

Prerequisites:

There are no formal prerequisites for the course, and we’ll endeavour to explain the foundational concepts during the training if required. However, since data science and machine learning rely on the following disciplines, it is good to brush up on:

- Linear algebra — we are dealing with datasets consisting of many data points and algorithms with many (hyper)parameters; linear algebra is the essential language in this multivariate setting;
- Probability theory — many of the models are expressed in the language of probability: for example, disturbances in linear regression models are random variables, while frequentist likelihoods and Bayesian priors and posteriors are probabilities. Somewhat less pertinently:
- Information theory — this branch of applied mathematics is concerned with quantifying how much information is present in our inputs; in machine learning we are concerned with extracting as much information as possible;
- Numerical computation — many of the algorithms rely on numerical methods rather than analytical solutions; in practice care is needed to avoid numerical issues such as overflow and underflow, poor conditioning, etc.;
- Optimisation theory — much of machine learning is concerned with optimising (hyper)parameters and therefore utilises the machinery from optimisation theory, such as gradient-based optimisation.
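
To illustrate the last point, here is a minimal sketch (illustrative only; not part of the course materials) of gradient-based optimisation: minimising a simple quadratic by repeatedly stepping against its gradient.

```python
# Minimise f(w) = (w - 3)**2, whose gradient is f'(w) = 2 * (w - 3).
w = 0.0        # starting point
lr = 0.1       # learning rate (step size)
for _ in range(100):
    w -= lr * 2 * (w - 3)   # step in the direction of steepest descent
print(w)       # converges towards the minimiser w = 3
```

The same idea, applied to the average loss of a model over a dataset, underlies most of the training algorithms covered on day two.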

In order to practise machine learning, one needs:

- A working knowledge of a convenient programming language, such as Python or R;
- Familiarity with the relevant libraries.

During our trainings, all code demonstrations will use Python.

DAY ONE

09:00 – 10:00 – Lecture 1: Probability and Statistics

- Interpretation of probability – classical, frequentist, Bayesian, axiomatic
- Statistical inference and estimation theory

10:00 – 10:30 – Tutorial 1: Statistical inference and estimation theory
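
To give a flavour of this tutorial, here is a minimal NumPy sketch (illustrative only; the actual tutorial materials may differ) of estimation theory in action: the maximum-likelihood estimators of a normal sample's mean and variance.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=2.0, scale=1.5, size=100_000)  # true mean 2.0, true std 1.5

mu_hat = sample.mean()                          # MLE of the mean
sigma2_hat = ((sample - mu_hat) ** 2).mean()    # MLE of the variance (biased: divides by n)

print(mu_hat, sigma2_hat)  # close to the true values 2.0 and 2.25
```

With a large sample, both estimates land close to the true parameters, illustrating consistency of the maximum-likelihood estimator.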

10:30 – 11:00 – Coffee break

11:00 – 12:00 – Lecture 2: Linear regression

- A geometric perspective
- Interpreting linear regression; multicollinearity

12:00 – 12:30 – Tutorial 2: Linear regression
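
A minimal NumPy sketch of what such a demo might cover (illustrative only; the actual tutorial materials may differ): ordinary least squares as an orthogonal projection of the response onto the column space of the design matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])            # design matrix: intercept + one regressor
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + 0.1 * rng.normal(size=n)    # small Gaussian disturbance

# OLS = orthogonal projection of y onto the column space of X
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # close to the true coefficients [1.0, 2.0]
```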

12:30 – 13:30 – Lunch

13:30 – 14:30 – Lecture 3: PCA and dimensionality reduction

- The geometry of eigenvectors and eigenvalues, covariance and correlation matrices
- PCA and dimensionality reduction

14:30 – 15:00 – Tutorial 3: Demo of PCA
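
As a flavour of this demo (an illustrative NumPy sketch; the actual tutorial materials may differ), PCA can be carried out directly via the eigendecomposition of the sample covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
# Correlated 2-D data: most of the variance lies along one direction
Z = rng.normal(size=(500, 2))
X = Z @ np.array([[3.0, 1.0], [0.0, 0.5]])     # mixing induces correlation
Xc = X - X.mean(axis=0)                        # centre the data before PCA

cov = np.cov(Xc, rowvar=False)                 # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)         # eigh returns eigenvalues in ascending order
order = eigvals.argsort()[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()            # fraction of variance per component
scores = Xc @ eigvecs[:, :1]                   # dimensionality reduction to one component
print(explained)
```

Here the first principal component captures the bulk of the variance, which is precisely what makes dimensionality reduction viable.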

15:00 – 15:30 – Coffee break

15:30 – 16:30 – Lecture 4: Unsupervised machine learning

- Anomaly detection
- Clustering

16:30 – 17:00 – Tutorial 4: Demo of clustering analysis
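
A minimal sketch of clustering (illustrative only; the actual tutorial materials may differ): Lloyd's k-means algorithm implemented from scratch in NumPy, applied to two well-separated groups of points.

```python
import numpy as np

rng = np.random.default_rng(3)
# Two well-separated blobs of points
X = np.vstack([rng.normal((0, 0), 0.5, size=(100, 2)),
               rng.normal((5, 5), 0.5, size=(100, 2))])

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm: assign points to the nearest centre, then re-centre."""
    r = np.random.default_rng(seed)
    centres = X[r.choice(len(X), k, replace=False)]   # initialise from the data
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)                 # assignment step
        centres = np.array([X[labels == j].mean(axis=0) for j in range(k)])  # update step
    return labels, centres

labels, centres = kmeans(X, k=2)
```

On well-separated data, the recovered clusters coincide with the two generating blobs.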

DAY TWO

09:00 – 10:00 – Lecture 5: From statistics to supervised machine learning

- Bias-variance tradeoff
- Underfitting and overfitting

10:00 – 10:30 – Tutorial 5: Demo of bias-variance tradeoff
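
To give a flavour of this demo (an illustrative NumPy sketch; the actual tutorial materials may differ): fitting polynomials of increasing degree to noisy data and comparing training and test error, showing how a too-simple model underfits while a too-flexible one overfits.

```python
import numpy as np

rng = np.random.default_rng(4)
f = lambda x: np.sin(2 * np.pi * x)            # the true signal
x_tr = np.sort(rng.uniform(0, 1, 30))
y_tr = f(x_tr) + 0.2 * rng.normal(size=30)     # small noisy training set
x_te = np.sort(rng.uniform(0, 1, 200))
y_te = f(x_te) + 0.2 * rng.normal(size=200)    # held-out test set

def errors(degree):
    coef = np.polyfit(x_tr, y_tr, degree)
    train = np.mean((np.polyval(coef, x_tr) - y_tr) ** 2)
    test = np.mean((np.polyval(coef, x_te) - y_te) ** 2)
    return train, test

for degree in (1, 4, 15):                      # underfit, reasonable fit, overfit
    print(degree, errors(degree))
```

Training error always falls as the degree grows, but test error is minimised at an intermediate complexity: that gap is the bias-variance tradeoff.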

10:30 – 11:00 – Coffee break

11:00 – 12:00 – Lecture 6: Model and feature selection

- Cross-validation
- Bootstrap
- Regularization: shrinkage methods

12:00 – 12:30 – Tutorial 6: Demo of model selection for market impact assessment
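
As a flavour of the model-selection machinery involved (an illustrative NumPy sketch on synthetic data, not the market-impact material itself): choosing the ridge shrinkage penalty by k-fold cross-validation.

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 100, 10
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.0, 0.5]                    # only three informative features
y = X @ beta + 0.5 * rng.normal(size=n)

def ridge_fit(X, y, lam):
    """Ridge (L2 shrinkage) estimate: (X'X + lam * I)^{-1} X'y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def cv_mse(lam, k=5):
    """k-fold cross-validated mean squared error for a given penalty."""
    folds = np.array_split(np.arange(n), k)
    errs = []
    for fold in folds:
        train = np.setdiff1d(np.arange(n), fold)
        b = ridge_fit(X[train], y[train], lam)
        errs.append(np.mean((X[fold] @ b - y[fold]) ** 2))
    return np.mean(errs)

grid = [0.01, 0.1, 1.0, 10.0, 100.0]
best_lam = min(grid, key=cv_mse)               # penalty with the lowest CV error
print(best_lam, cv_mse(best_lam))
```

The same recipe, with the estimator and error metric swapped out, applies to selecting among model families, not just among penalty values.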

12:30 – 13:30 – Lunch

13:30 – 14:30 – Lecture 7: Classification methods

- Logistic regression
- Decision Trees and Random Forests

14:30 – 15:00 – Tutorial 7: Solving a classification problem and selecting features with random forests

15:00 – 15:30 – Coffee break

15:30 – 16:30 – Lecture 8: Deep learning

- Optimization and gradient descent
- Inference with Neural Networks: the theory
- Feed-forward neural networks and backpropagation

16:30 – 17:00 – Tutorial 8: Construction of NN and backpropagation algorithm
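
As a flavour of this final tutorial (an illustrative NumPy sketch; the actual tutorial materials may differ): a one-hidden-layer feed-forward network trained by backpropagation to learn XOR, a function no linear classifier can represent.

```python
import numpy as np

rng = np.random.default_rng(7)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)    # XOR: not linearly separable

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)     # hidden layer: 8 tanh units
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)     # sigmoid output layer
lr = 1.0

for _ in range(5000):
    # Forward pass
    h = np.tanh(X @ W1 + b1)
    out = 1 / (1 + np.exp(-(h @ W2 + b2)))
    # Backward pass (chain rule); for sigmoid + cross-entropy, d(loss)/d(logit) = out - y
    d_logit = (out - y) / len(X)
    dW2 = h.T @ d_logit
    db2 = d_logit.sum(axis=0)
    d_h = (d_logit @ W2.T) * (1 - h ** 2)          # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ d_h
    db1 = d_h.sum(axis=0)
    # Gradient-descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(out.round().ravel())   # the network should learn the XOR pattern
```

Every step here, scaled up to many layers and implemented via automatic differentiation, is the training loop of modern deep-learning frameworks.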