Calibration and XGBoost reweighting to reduce coverage and non-response biases in overlapping panel surveys: application to the Healthcare and Social Survey.

Publication date: Feb 15, 2024

Surveys have been used worldwide to provide information on the impact of the COVID-19 pandemic in order to prepare and deliver an effective public health response. Overlapping panel surveys allow longitudinal estimates to be obtained and, thanks to the larger sample size, more accurate cross-sectional estimates. However, non-response is a particularly serious problem in panel surveys because of population fatigue with repeated surveys. The aim of this study is to develop a new reweighting method for overlapping panel surveys affected by non-response. We chose the Healthcare and Social Survey, which has an overlapping panel design with measurements throughout 2020 and 2021 and random samples stratified by province and degree of urbanization. Each measurement comprises two samples: a longitudinal sample taken from previous measurements and a new sample drawn at that measurement. Our reweighting approach is a two-step process: the original sampling design weights are first corrected by modelling non-response with respect to the longitudinal sample obtained in a previous measurement using machine learning techniques, and the corrected weights are then calibrated using the auxiliary information available at the population level. The method is applied to the estimation of totals, proportions, ratios, and differences between measurements, and to gender gaps in the variable of self-perceived general health. The proposed method produces suitable estimators for both cross-sectional and longitudinal samples. To address future health crises such as COVID-19, it is therefore necessary to reduce potential coverage and non-response biases in surveys by using reweighting techniques such as those proposed in this study.
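The two-step reweighting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. A design-weighted response rate within weighting classes stands in for the XGBoost propensity model, and raking (iterative proportional fitting) is assumed as the calibration step. The function names, the toy sample, and the population margins are all hypothetical.

```python
from collections import defaultdict

def response_propensity(units, class_key):
    """Design-weighted response rate per class (a simple stand-in for an ML model)."""
    responded, total = defaultdict(float), defaultdict(float)
    for u in units:
        total[u[class_key]] += u["design_weight"]
        if u["responded"]:
            responded[u[class_key]] += u["design_weight"]
    return {c: responded[c] / total[c] for c in total}

def adjust_for_nonresponse(units, propensity, class_key):
    """Step 1: keep respondents and divide their design weight by the propensity."""
    adjusted = []
    for u in units:
        if u["responded"]:
            row = dict(u)
            row["weight"] = u["design_weight"] / propensity[u[class_key]]
            adjusted.append(row)
    return adjusted

def rake(respondents, margins, max_iter=100, tol=1e-10):
    """Step 2: rescale weights until weighted totals match each population margin."""
    rows = [dict(r) for r in respondents]
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            totals = defaultdict(float)
            for r in rows:
                totals[r[var]] += r["weight"]
            for r in rows:
                factor = targets[r[var]] / totals[r[var]]
                r["weight"] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break
    return rows

def weighted_total(rows, var, category):
    """Sum of weights over the rows that fall in the given category."""
    return sum(r["weight"] for r in rows if r[var] == category)

# Hypothetical toy sample (not survey data): (sex, area, responded), design weight 100.
ROWS = [
    ("F", "urban", True), ("F", "urban", True), ("F", "rural", True), ("F", "rural", False),
    ("M", "urban", True), ("M", "urban", False), ("M", "rural", True), ("M", "rural", False),
]
units = [{"sex": s, "area": a, "responded": r, "design_weight": 100.0} for s, a, r in ROWS]

# Hypothetical known population totals used as auxiliary information.
margins = {"sex": {"F": 410.0, "M": 390.0}, "area": {"urban": 480.0, "rural": 320.0}}

propensity = response_propensity(units, "sex")
adjusted = adjust_for_nonresponse(units, propensity, "sex")
calibrated = rake(adjusted, margins)
```

In this toy sample the estimated propensities are 0.75 for women and 0.5 for men, so respondents in the group with lower response receive larger adjusted weights before raking aligns the weighted totals with the assumed margins. In a setting closer to the one in the abstract, the class-based response rate would be replaced by predicted response probabilities from a fitted classifier.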


Concepts: Biases, Covid, Healthcare, Urbanization
Keywords: COVID-19, Machine learning, Non-response bias, Panel surveys, Public health, Sampling

Semantics

Type Source Name
disease MESH COVID-19 pandemic
disease VO effective
disease VO population
disease IDO process
pathway REACTOME Reproduction
disease IDO infected population
drug DRUGBANK L-Phenylalanine
disease MESH chronically ill
drug DRUGBANK Trestolone
disease VO time
disease VO Gap
drug DRUGBANK Coenzyme M
drug DRUGBANK Ademetionine
disease MESH infection
disease VO protocol
drug DRUGBANK Aspartame
disease VO effectiveness
drug DRUGBANK Flunarizine
drug DRUGBANK L-Valine
disease VO data set
drug DRUGBANK Dimercaprol
disease VO Canada
disease IDO algorithm
drug DRUGBANK Tropicamide
disease MESH morbidity
disease VO USA
drug DRUGBANK Indoleacetic acid
disease VO efficiency
disease MESH infertility
disease VO efficient

