The LUR models performed similarly well in terms of model adjusted R2 and cross-validation R2, which ranged from 0.62 to 0.75 and from 0.63 to 0.73, respectively. The ESCAPE model performed well at the ESCAPE sites in Sabadell (R2 = 0.69) and moderately well at the ESCAPE sites in Girona province (R2 = 0.53). It predicted the external sites less well: R2 was 0.51 in Sabadell and 0.36 in Girona province. The INMA-Sabadell and REGICOR-Girona models showed a similar pattern: the R2 for the INMA model dropped from 0.69 to 0.50 at INMA versus ESCAPE sites in Sabadell, while the R2 for the REGICOR model dropped from 0.63 to 0.44 at REGICOR versus ESCAPE sites in Girona province. The drop in performance at external sites is likely due to a combination of overfitting and differences between the sampling campaigns (years, site selection). When air pollution levels predicted at cohort addresses were classified as low, medium, or high, agreement between models ranged from 53% to 74%. Despite the drop in performance, the three models still explained a substantial fraction of the variation at independent sites, especially in Sabadell, supporting their use in epidemiological studies.
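The two evaluation quantities reported above, R2 at external sites and agreement between models on low/medium/high classes, can be illustrated with a minimal sketch. It rests on assumptions not stated in the text: R2 is computed as 1 - SSE/SST (the study may instead have used the squared correlation), the three classes are rank-based tertiles, and the function names `r_squared`, `tertile_labels`, and `percent_agreement` are hypothetical.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Validation R2 as 1 - SSE/SST (assumed definition, see lead-in)."""
    obs_mean = mean(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - sse / sst


def tertile_labels(values):
    """Label each value 0 (low), 1 (medium) or 2 (high) by rank-based tertile."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = min(2, rank * 3 // n)
    return labels


def percent_agreement(pred_a, pred_b):
    """Percentage of addresses that two models place in the same tertile."""
    labels_a = tertile_labels(pred_a)
    labels_b = tertile_labels(pred_b)
    same = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
    return 100.0 * same / len(labels_a)
```

In this sketch, `r_squared` would be applied to measured concentrations at sites that were not used to fit a model, paired with that model's predictions at the same sites, and `percent_agreement` to two models' predictions at the same cohort addresses.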