Europe/Lisbon
Online

Catarina Padrela Loureiro, Departamento de Matemática, Instituto Superior Técnico
Air Quality Data Analysis with Symbolic Principal Components

Air pollution is a global challenge with deep implications for public health and the environment. We examine air quality data from a monitoring station in Entrecampos, Lisbon, using Symbolic Data Analysis. The dataset consists of hourly concentrations of nine pollutants over three years, which are log-transformed and aggregated into daily intervals bounded by each day's minimum and maximum values. The symbolic mean and variance of each variable are estimated by the method of moments, and pairwise dependencies are captured using a bivariate copula. Symbolic principal component scores are obtained from the estimated covariance matrix and used to fit generalized extreme value distributions. Control charts based on the quantiles of these distributions are used to identify outlying observations. The approach is compared with outlier detection methods based on daily averages. The results show the relevance of Symbolic Data Analysis in revealing new insights into air quality.
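As a rough illustration of the first steps described in the abstract, the sketch below aggregates hourly values into daily log-scale intervals and computes a symbolic mean and variance for one variable. It is not the speaker's implementation: the function names are hypothetical, the data are synthetic, and the moment formulas assume values are uniformly distributed within each interval, which is a common convention for interval-valued data and may differ from the estimators used in the talk. The copula, principal component, and extreme value steps are not covered.

```python
import numpy as np


def daily_log_intervals(hourly):
    """Turn hourly concentrations into daily [min, max] intervals on the log scale.

    hourly: array of shape (n_days, 24) with strictly positive values.
    Returns two arrays of length n_days: lower and upper interval bounds.
    """
    logs = np.log(hourly)
    return logs.min(axis=1), logs.max(axis=1)


def symbolic_mean_var(lower, upper):
    """Symbolic sample mean and variance of interval-valued data.

    ASSUMPTION: values are uniformly distributed within each interval,
    so each interval [a, b] contributes first moment (a + b) / 2 and
    second moment (a^2 + a*b + b^2) / 3.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    mean = np.mean((lower + upper) / 2.0)
    second_moment = np.mean((lower**2 + lower * upper + upper**2) / 3.0)
    return mean, second_moment - mean**2


# Synthetic stand-in for one pollutant over five days (not real measurements).
rng = np.random.default_rng(0)
hourly = rng.lognormal(mean=3.0, sigma=0.5, size=(5, 24))

lower, upper = daily_log_intervals(hourly)
sym_mean, sym_var = symbolic_mean_var(lower, upper)
print(f"symbolic mean = {sym_mean:.3f}, symbolic variance = {sym_var:.3f}")
```

Under the uniform assumption, the variance accounts for the spread inside each day's interval as well as the spread between days, which is information a daily-average summary discards.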

Joint seminar of CEMAT and CEAUL