What's the Weight? Estimating Controlled Outcome Differences in Complex Surveys for Health Disparities Research
Published 6/27/2024
A basic descriptive question in statistics is whether mean outcomes differ between groups defined by the levels of a discrete covariate (e.g., racial disparities in health outcomes). However, when this categorical covariate of interest is correlated with other factors related to the outcome, direct comparisons without appropriate adjustment may lead to biased estimates and invalid inferential conclusions. Propensity score methods are broadly employed with observational data as a tool to achieve covariate balance, but how to implement them in complex surveys is less studied, particularly when the survey weights depend on the group variable under comparison. In this work, we focus on a specific example in which sample selection depends on race. We propose identification formulas to properly estimate the average controlled difference (ACD) in outcomes between Black and White individuals, with appropriate weighting both for covariate imbalance across the two racial groups and for generalizability. Via extensive simulation, we show that our proposed methods outperform traditional analytic approaches in terms of bias, mean squared error, and coverage. We are motivated by the interplay between race and social determinants of health when estimating racial differences in telomere length using data from the National Health and Nutrition Examination Survey. We build a propensity score model for race to properly adjust for other social determinants while characterizing the controlled effect of race on telomere length. We find that evidence of racial differences in telomere length between Black and White individuals attenuates after accounting for confounding by socioeconomic factors and after applying appropriate propensity score and survey weighting techniques. Software to implement these methods can be found in the R package svycdiff at https://github.com/salernos/svycdiff.
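
To make the idea of combining propensity score weights with survey weights concrete, the following is a minimal sketch in Python of a generic normalized (Hajek-style) weighted difference in means. It is not the identification formulas proposed in this work and it does not use the svycdiff API; in particular, it simply multiplies the two sets of weights and does not address the case where the survey weights depend on the group variable. The function names `weighted_mean` and `controlled_difference`, and the assumption that propensity scores and survey weights are already supplied as plain lists, are hypothetical choices made for illustration.

```python
def weighted_mean(values, weights):
    """Return the weighted mean of `values` under `weights`."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


def controlled_difference(y, a, propensity, survey_weight):
    """Generic weighted difference in mean outcomes, group 1 minus group 0.

    Illustrative only (not the paper's estimator):
      y             -- outcomes
      a             -- group indicator (1 or 0)
      propensity    -- assumed estimates of P(A = 1 | X), strictly in (0, 1)
      survey_weight -- assumed sampling weights, one per unit
    Each unit's weight is its survey weight times its inverse propensity weight.
    """
    y1, w1, y0, w0 = [], [], [], []
    for yi, ai, ei, si in zip(y, a, propensity, survey_weight):
        if not 0.0 < ei < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        if ai == 1:
            y1.append(yi)
            w1.append(si / ei)          # survey weight x 1 / e(x)
        else:
            y0.append(yi)
            w0.append(si / (1.0 - ei))  # survey weight x 1 / (1 - e(x))
    return weighted_mean(y1, w1) - weighted_mean(y0, w0)
```

With equal survey weights and a constant propensity score, this reduces to the unadjusted difference in group means, which is a quick sanity check on any implementation of this kind.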