Probability and Statistics Seminar
Linear Regression With Unmatched Data: A Deconvolution Perspective
13 March 2025
Speaker: Fadoua Balabdaoui
Institution: ETH Zurich
Time: 15:30 - 16:30
Location: 3L15

Consider the regression problem where the response $Y \in \mathbb{R}$ and the covariate $X \in \mathbb{R}^d$, $d \geq 1$, are \textit{unmatched}. In this setting, we do not have access to pairs of observations from the distribution of $(X, Y)$; instead, we have separate data sets $\{Y_i\}_{i=1}^n$ and $\{X_j\}_{j=1}^m$, possibly collected from different sources. We study this problem assuming that the regression function is linear and that the noise distribution is known or can be estimated. We introduce an estimator of the regression vector based on deconvolution and demonstrate its consistency and asymptotic normality under an identifiability assumption. In the general case, we show that our estimator (DLSE: Deconvolution Least Squares Estimator) is consistent in terms of an extended $\ell_2$ norm. Using this observation, we devise a method for semi-supervised learning, i.e., for the case where we also have access to a small sample of matched pairs $(X_k, Y_k)$. Several applications with synthetic and real data sets are considered to illustrate the theory.
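To make the unmatched setting concrete, here is a minimal Python sketch for $d = 1$. It is an illustration under stated assumptions, not necessarily the estimator presented in the talk: the criterion (a squared distance between the empirical CDF of the $Y_i$ and the CDF of $\beta X + \varepsilon$ implied by the $X_j$ sample), the Laplace noise, the exponential covariates, the grid search, and the names `laplace_cdf`, `dlse_criterion` and `fit_dlse` are all choices made here for the example.

```python
import numpy as np


def laplace_cdf(t, scale=1.0):
    # CDF of a centered Laplace distribution (assumed known noise law),
    # written with |t| so that np.exp never overflows.
    return 0.5 * (1.0 + np.sign(t) * (1.0 - np.exp(-np.abs(t) / scale)))


def dlse_criterion(beta, x, y, noise_cdf):
    # Empirical CDF of Y, evaluated at the sorted responses.
    y_sorted = np.sort(y)
    ecdf_y = np.arange(1, y.size + 1) / y.size
    # CDF of beta * X + noise implied by the X sample:
    # average over j of F_noise(y - beta * X_j).
    model_cdf = noise_cdf(y_sorted[:, None] - beta * x[None, :]).mean(axis=1)
    # Mean squared gap between the two CDFs over the observed responses.
    return np.mean((ecdf_y - model_cdf) ** 2)


def fit_dlse(x, y, noise_cdf, grid):
    # Grid search is used for simplicity; it is an assumption of this sketch.
    values = np.array([dlse_criterion(b, x, y, noise_cdf) for b in grid])
    return grid[np.argmin(values)]


rng = np.random.default_rng(0)
beta_true = 2.0

# Responses are generated from covariates that are never observed.
x_hidden = rng.exponential(1.0, size=1000)
y = beta_true * x_hidden + rng.laplace(0.0, 1.0, size=1000)

# Independent covariate sample from the same law: no pairing with y.
x = rng.exponential(1.0, size=1000)

grid = np.linspace(-4.0, 4.0, 81)
beta_hat = fit_dlse(x, y, laplace_cdf, grid)
print(f"true beta = {beta_true}, estimate = {beta_hat:.2f}")
```

The covariate law is deliberately asymmetric here. With a distribution of $X$ that is symmetric about zero, $\beta$ and $-\beta$ would give the same distribution for $Y$, which is the kind of situation the identifiability assumption in the abstract is meant to exclude.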

 

The talk is based on joint work with Mona Azadkia.
