Recent seminars

Europe/Lisbon
Room 6.2.33, Faculty of Sciences of the Universidade de Lisboa — Online

Lídia André, Lancaster University

Modelling and inference for the body and tail regions of multivariate data

When an accurate representation of multivariate data is required across both the body (described by non-extreme observations) and the tail (defined by the extreme observations) regions, it is crucial to have a model that is able to characterise the joint behaviour across both regions. In this work, we propose dependence models that represent the entire distribution without the need to explicitly define each region. We do so by constructing copulas that are based on mixture distributions defined on the full support of the data. For such models, we derive (sub-)asymptotic dependence properties for specific model configurations, and show through simulation studies that they are flexible in capturing a broad range of extremal dependence structures. Motivated by the computational resources required to evaluate the likelihood function of the proposed models, we also explore likelihood-free approaches that use neural networks to perform inference. In particular, we assess the performance of neural Bayes estimators in estimating the model parameters, both for one of the models introduced for the joint body and tail and for further complex extremal dependence models. We also propose a neural Bayes classifier for model selection. In this way, we provide a toolbox for simple fitting and model selection of complex extremal dependence models.
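
As a rough, self-contained illustration of the neural Bayes estimation idea (not the authors' architecture or their copula models), the sketch below trains a permutation-invariant network in PyTorch on simulated (data, parameter) pairs; a toy bivariate-Gaussian correlation stands in for the dependence parameters of interest, and all names are hypothetical.

# Minimal sketch of a neural Bayes estimator (amortised point estimation).
# Illustrative only: a toy bivariate-Gaussian correlation stands in for the
# copula models of the talk, and the architecture is hypothetical.
import torch
import torch.nn as nn

def simulate(n_datasets, n_obs=200):
    # theta = correlation rho ~ Uniform(-0.9, 0.9); each dataset has n_obs pairs
    rho = torch.rand(n_datasets) * 1.8 - 0.9
    z1 = torch.randn(n_datasets, n_obs)
    z2 = rho[:, None] * z1 + torch.sqrt(1 - rho[:, None] ** 2) * torch.randn(n_datasets, n_obs)
    return torch.stack([z1, z2], dim=-1), rho[:, None]

class DeepSetEstimator(nn.Module):
    # permutation-invariant network: embed each observation, average, map to theta
    def __init__(self, hidden=64):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
        self.head = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1))
    def forward(self, x):
        return self.head(self.phi(x).mean(dim=1))

net = DeepSetEstimator()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(500):
    x, theta = simulate(256)                      # fresh (data, parameter) pairs from the prior
    loss = ((net(x) - theta) ** 2).mean()         # squared loss -> posterior-mean estimator
    opt.zero_grad(); loss.backward(); opt.step()
x_test, theta_test = simulate(5)
print(torch.cat([net(x_test), theta_test], dim=1))  # estimates next to true parameters

Minimising the average squared loss over pairs simulated from the prior makes the trained network approximate the posterior mean, which is the sense in which it acts as a Bayes estimator.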

Joint work with: Jennifer Wadsworth, Jonathan Tawn, Raphaël Huser and Adrian O’Hagan.

Joint seminar CEMAT and CEAUL

Europe/Lisbon
Online

Anderson Ara, Universidade Federal do Paraná, Brazil

Can a set of strong learners create a single stronger learner?

Since the question coined in 1986, “Can a set of weak learners create a single strong learner?”, ensemble learning has focused on merging simple machine learning methods in order to increase predictive performance. Characteristics such as stability and diversity are important when choosing these weak learners in a bagging procedure. In the same field, Support Vector Models (SVM) are strong and stable learners that have been drawing the attention of the community, since these models have properties that are easy to characterise and, at the same time, provide an estimation process with global optimisation properties. In this talk, we present the Random Machines method, a new machine learning method based on SVM ensemble learning that shows how a set of strong learners can create a single stronger learner.
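
A rough sketch of the underlying idea, using scikit-learn: base SVMs are fitted to bootstrap samples with kernels drawn at random, with kernel probabilities and vote weights tied to validation accuracy. This is a simplified reading for illustration, not the published Random Machines algorithm.

# Rough sketch of an SVM-based bagging ensemble in the spirit of Random Machines.
# Simplified for illustration: kernel probabilities and vote weights come from a
# single validation split; the published method's exact scheme may differ.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

kernels = ["linear", "rbf", "poly", "sigmoid"]
# probability of drawing each kernel, proportional to its validation accuracy
acc = np.array([SVC(kernel=k).fit(X_tr, y_tr).score(X_val, y_val) for k in kernels])
probs = acc / acc.sum()

models, weights = [], []
for b in range(25):
    idx = rng.integers(0, len(X_tr), len(X_tr))   # bootstrap sample
    k = rng.choice(kernels, p=probs)              # randomly drawn kernel
    m = SVC(kernel=k).fit(X_tr[idx], y_tr[idx])
    models.append(m)
    weights.append(m.score(X_val, y_val))         # weight each learner by its accuracy

def predict(X_new):
    votes = sum(w * (2 * m.predict(X_new) - 1) for m, w in zip(models, weights))
    return (votes > 0).astype(int)                # weighted majority vote

print("ensemble accuracy:", (predict(X_val) == y_val).mean())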

Joint seminar CEMAT and CEAUL


SASlab (6.4.29), Faculty of Sciences of the Universidade de Lisboa

Renato Assunção, ESRI Inc., USA and Department of Computer Science, Universidade Federal de Minas Gerais, Brazil

Advancing Monte Carlo simulation with GANs, diffusion models, and normalizing flows

Recent years have seen remarkable progress in Monte Carlo simulation methods, driven by the integration of cutting-edge machine learning techniques such as Generative Adversarial Networks (GANs), diffusion models, and normalizing flows. These innovations enable the generation of complex, high-dimensional data, from highly realistic human faces to artistic transformations, such as converting a landscape photo into a Van Gogh-style painting. These breakthroughs, which often make headlines, capture widespread interest, yet the distributions involved remain challenging to simulate using traditional Monte Carlo techniques. GANs operate by training two networks in a competitive framework, yielding impressive results in high-dimensional sampling. Diffusion models offer a compelling alternative to Monte Carlo sampling by iteratively refining samples, reversing a noise-adding process, and producing smooth transitions critical for many applications. Normalizing flows map simple, tractable distributions (e.g., Gaussians) to complex target distributions through a sequence of invertible transformations, enabling efficient density estimation and sample generation. These advancements significantly expand the scope of Monte Carlo simulations, allowing statisticians and researchers to model more complex and non-standard distributions with greater accuracy and computational efficiency. This talk will explore these transformative methods, highlighting their principles, applications, and potential to redefine simulation in modern statistics and data science.
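
As a concrete, minimal example of the last of these ingredients, the PyTorch sketch below builds a two-layer affine-coupling normalizing flow, trains it by maximum likelihood on a toy two-dimensional "banana" target, and then samples from the learned density; it only illustrates the change-of-variables mechanism and is not tied to any specific model from the talk.

# Minimal normalizing-flow sketch: two affine coupling layers in 2-D, fitted by
# maximum likelihood to a toy "banana" target. Illustrative only; not an
# implementation of any model discussed in the talk.
import math
import torch
import torch.nn as nn

class Coupling(nn.Module):
    # leaves the first coordinate fixed, affinely transforms the second,
    # with scale and shift produced by a small conditioner network
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 2))
    def to_latent(self, x):   # data -> base variable, plus log|det dz/dx|
        x1, x2 = x[:, :1], x[:, 1:]
        s, t = self.net(x1).chunk(2, dim=1)
        return torch.cat([x1, (x2 - t) * torch.exp(-s)], dim=1), -s.sum(dim=1)
    def to_data(self, z):     # exact inverse: base variable -> data
        z1, z2 = z[:, :1], z[:, 1:]
        s, t = self.net(z1).chunk(2, dim=1)
        return torch.cat([z1, z2 * torch.exp(s) + t], dim=1)

flow = nn.ModuleList([Coupling(), Coupling()])

def log_prob(x):              # change of variables: log p(x) = log N(z) + sum log|det|
    logdet = torch.zeros(x.shape[0])
    for i, layer in enumerate(flow):
        if i > 0:
            x = x.flip(dims=[1])                  # swap coordinates between layers
        x, ld = layer.to_latent(x)
        logdet = logdet + ld
    return -0.5 * (x ** 2).sum(dim=1) - math.log(2 * math.pi) + logdet

def sample(n):                # push standard-normal draws through the inverse maps
    z = torch.randn(n, 2)
    for i, layer in enumerate(reversed(list(flow))):
        z = layer.to_data(z)
        if i < len(flow) - 1:
            z = z.flip(dims=[1])
    return z

def target(n):                # toy banana-shaped distribution to learn
    x1 = torch.randn(n)
    return torch.stack([x1, 0.5 * x1 ** 2 + 0.3 * torch.randn(n)], dim=1)

opt = torch.optim.Adam(flow.parameters(), lr=1e-3)
for step in range(2000):
    loss = -log_prob(target(256)).mean()          # maximum likelihood training
    opt.zero_grad(); loss.backward(); opt.step()
print(sample(5))                                  # new draws from the fitted density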

Joint seminar CEMAT and CEAUL

Europe/Lisbon
Online

Alejandra Andrea Tapia Silva, Pontificia Universidad Católica de Chile

Improving binary-response regression modelling through new statistical diagnostic proposals, with applications to medical data

In this study, we propose a diagnostic method based on a new family of link functions, which allows the sensitivity of symmetric links (such as the logit) to be assessed relative to asymmetric versions. This approach aims to improve the fit of binary regression models, especially in medical contexts, where the logit link is commonly used because of its interpretability via odds ratios. The proposed general model, estimated by likelihood-based methods, allows the standard link to be replaced when it is inadequate. We also develop local influence tools under different perturbation schemes, with emphasis on the sensitivity of the link function and of the odds ratio. Monte Carlo simulations and applications to medical data illustrate the usefulness of the proposal, highlighting the importance of adequate statistical diagnostics in modelling.
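
A basic illustration of link-function sensitivity, using only the standard symmetric (logit) and asymmetric (complementary log-log) links available in statsmodels, rather than the new link family or the local-influence diagnostics proposed in the talk:

# Fit the same binary data with a symmetric (logit) and an asymmetric (cloglog)
# link and compare the fits; standard statsmodels GLM machinery only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1000
x = rng.normal(size=n)
X = sm.add_constant(x)
p = 1 - np.exp(-np.exp(-0.5 + 1.2 * x))   # responses generated from a cloglog model
y = rng.binomial(1, p)

for link in [sm.families.links.Logit(), sm.families.links.CLogLog()]:
    fit = sm.GLM(y, X, family=sm.families.Binomial(link=link)).fit()
    print(type(link).__name__, "log-likelihood:", round(fit.llf, 1), "AIC:", round(fit.aic, 1))

In this setup the correctly specified cloglog link should tend to attain a lower AIC than the logit fit, illustrating the kind of link misspecification that such diagnostics aim to detect.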

Joint seminar CEMAT and CEAUL

Europe/Lisbon
SASlab (6.4.29), Faculty of Sciences of the Universidade de Lisboa — Online

Louiza Soltane, Laboratoire de Mathématiques Appliquées, Université Mohamed Khider, Biskra, Algeria

Gaussian approximation of the net premium estimator for a heavy-tailed distribution under random censoring

The net premium is a practical risk-management tool and one of the best-known risk measures in insurance. It is also known as the mean value principle ($\mu = E[X]$), the simplest premium principle, equal to the expectation of the claim-size (risk) variable. Under this principle, the premium rate is set equal to the expected value of the risk $X$ ($X \ge 0$), with distribution function $F$, which is defined by \[\Pi = E[X] := \int_0^\infty (1 - F(x))\,dx,\] where $\Pi : \chi \to \mathbb{R}$ is called the risk measure function and $\chi$ is a set of real-valued random variables. Indeed, there is a wide variety of asymptotically normal estimators in the literature based on complete data.

In practice, however, the data are often censored for one reason or another, so that estimation techniques based on complete data become inappropriate. In this work, we use extreme value theory to propose an alternative estimator for randomly right-censored data. Under mild conditions, we establish the asymptotic normality of the proposed estimator. The proof relies heavily on the Gaussian approximation of Brahmi et al. (2015), on Stute (1995), and on the results of Kaplan and Meier (1958).
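
For reference, the naive plug-in estimate of $\Pi$ under right censoring integrates the Kaplan-Meier survival function over the observed range, as in the sketch below (which assumes the lifelines package is available); the extreme-value-theory correction for the heavy tail, which is the subject of the talk, is not reproduced here.

# Naive plug-in benchmark: integrate the Kaplan-Meier survival curve to estimate
# the net premium Pi under random right censoring (lifelines assumed installed).
import numpy as np
from lifelines import KaplanMeierFitter

rng = np.random.default_rng(2)
n = 5000
claims = rng.pareto(3.0, n) + 1.0               # Pareto(3) claim sizes, true E[X] = 1.5
censor = rng.exponential(scale=5.0, size=n)     # independent censoring times
observed = np.minimum(claims, censor)
event = claims <= censor                        # True when the claim is fully observed

kmf = KaplanMeierFitter().fit(observed, event_observed=event)
t = kmf.survival_function_.index.values         # jump times of the KM step function
s = kmf.survival_function_.values.ravel()       # estimated survival S_hat(t)
pi_hat = np.sum(s[:-1] * np.diff(t))            # integral of S_hat over the observed range
print("Kaplan-Meier plug-in net premium:", round(pi_hat, 3), "| true value: 1.5")

Because the Kaplan-Meier curve is truncated at the largest observation, this plug-in estimator tends to underestimate $\Pi$ for heavy-tailed claims, which is precisely the regime the proposed estimator targets.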

Joint work with: D. Meraghni and A. Necir.

Joint seminar CEMAT and CEAUL