Abstract

Under model misspecification, the MLE generally converges to the pseudo-true parameter, the parameter corresponding to the distribution within the model that is closest to the distribution from which the data are sampled. In many problems, the pseudo-true parameter corresponds to a population parameter of interest, and so a misspecified model can provide consistent estimation for this parameter. Furthermore, the well-known sandwich variance formula of Huber (1967) provides an asymptotically accurate sampling distribution for the MLE, even under model misspecification. However, confidence intervals based on a sandwich variance estimate may behave poorly for small sample sizes, partly due to the use of a plug-in estimate of the variance. From a Bayesian perspective, plug-in estimates of nuisance parameters generally underrepresent uncertainty in the unknown parameters, and averaging over such parameters is expected to give better performance. With this in mind, we present a Bayesian sandwich posterior distribution, whose likelihood is based on the sandwich sampling distribution of the MLE. This Bayesian approach allows for the incorporation of prior information about the parameter of interest, averages over uncertainty in the nuisance parameter, and is asymptotically robust to model misspecification. In a small simulation study on estimating a regression parameter under heteroscedasticity, the addition of accurate prior information and the averaging over the nuisance parameter are both seen to improve the accuracy and calibration of confidence intervals for the parameter of interest.

Keywords: estimating equations, exponential family, model misspecification, pivotal quantity.