The interpretability of learned representations is a central concern in both scientific and statistical analysis. This talk introduces a set of algorithmic, statistical, and mathematical tools for ascribing meaning to learned representations. We cast interpretability as a non-linear functional support recovery problem: finding parameterizations of data manifolds from within sets of user-defined, interpretable covariate functions. We introduce a family of regression models for this problem based on local linearization of non-linear structure and sparsity-inducing penalties for support estimation. This family includes methods both for explaining learned representations and for learning representations de novo. We draw connections between these estimators and techniques in convex optimization, non-parametric dimension reduction, and errors-in-variables regression, and demonstrate the resulting methodology on experimental examples of scientific and statistical interest.
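The core recipe (locally linearize the non-linear structure, then apply a sparsity-inducing penalty to recover which covariate functions parameterize it) can be sketched on a toy problem. The helix data, the particular covariate dictionary, and the use of a plain lasso penalty below are illustrative assumptions, not the talk's actual estimators:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Points on a helix in R^3, sampled at irregular parameter values t.
t = np.sort(rng.uniform(0, 4 * np.pi, 400))
X = np.column_stack([np.cos(t), np.sin(t), t])

# "Learned representation": here simply the manifold parameter t itself.
y = t

# User-defined dictionary of candidate interpretable covariate functions
# g_j(x). Only g1 (the height coordinate) actually parameterizes the helix.
G = np.column_stack([
    X[:, 2],            # g1: height (equals t)
    X[:, 0],            # g2: oscillatory, not a parameterization
    X[:, 0] * X[:, 1],  # g3: oscillatory product, not a parameterization
])

# Local linearization: finite differences between neighboring points on the
# manifold approximate the differentials dy and dg_j along it.
dy = np.diff(y)
dG = np.diff(G, axis=0)

# Sparsity-inducing penalty (lasso) estimates the functional support:
# the coefficient for g1 should be large, the others near zero.
model = Lasso(alpha=1e-4, fit_intercept=False).fit(dG, dy)
print(model.coef_)
```

The selected support (the nonzero coefficients) names the interpretable functions that explain the representation, which is the sense in which support recovery ascribes meaning to it.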