Europe/Lisbon
Online

Sebastian Engelke, Research Center for Statistics, University of Geneva
Machine learning beyond the data range: extreme quantile regression

Machine learning methods perform well in prediction tasks within the range of the training data. When interest is in quantiles of the response that go beyond the observed records, these methods typically break down. Extreme value theory provides the mathematical foundation for estimating such extreme quantiles. A common approach is to approximate the exceedances over a high threshold by the generalized Pareto distribution. For conditional extreme quantiles, one may model the parameters of this distribution as functions of the predictors. Existing methods are either not flexible enough or do not generalize well in higher dimensions. We develop new approaches for extreme quantile regression that estimate the parameters of the generalized Pareto distribution with tree-based methods and recurrent neural networks. Our estimators outperform classical machine learning methods and methods from extreme value theory in simulation studies. We illustrate how the recurrent neural network model can be used for effective forecasting of flood risk.
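
The threshold-exceedance idea in the abstract can be sketched in a few lines. This is only an illustration of the standard peaks-over-threshold extrapolation formula, not the speaker's implementation: the function name `gpd_extreme_quantile` and its arguments are hypothetical. It assumes the threshold is the quantile at an intermediate level `tau0`, and that a scale `sigma` and shape `xi` of the generalized Pareto distribution have already been estimated. In the conditional setting these would be the values predicted at a given predictor point, for example by a tree-based or neural network model.

```python
import math


def gpd_extreme_quantile(tau, tau0, threshold, sigma, xi):
    """Extrapolate the quantile at level tau from a GPD fitted above a threshold.

    Hypothetical helper for illustration only.
    tau0      : intermediate probability level, P(Y <= threshold) = tau0
    threshold : the tau0-quantile of the response
    sigma, xi : GPD scale and shape for the exceedances Y - threshold
    """
    if not 0.0 < tau0 < tau < 1.0:
        raise ValueError("need 0 < tau0 < tau < 1")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")

    # Ratio of tail probabilities; greater than 1 because tau > tau0.
    ratio = (1.0 - tau0) / (1.0 - tau)

    # Shape zero is the exponential-tail limit of the general formula.
    if abs(xi) < 1e-12:
        return threshold + sigma * math.log(ratio)
    return threshold + sigma / xi * (ratio ** xi - 1.0)


# Usage with made-up parameter values (assumptions, not results from the talk).
q = gpd_extreme_quantile(tau=0.999, tau0=0.9, threshold=4.3, sigma=3.2, xi=0.5)
```

The point of the formula is that `tau` may lie far beyond the largest empirical level available in the sample; only the threshold, scale, and shape need to be estimated from data.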

Additional file

Slides

Joint seminar CEMAT and CEAUL