Abstract

Despite increasing interest across a range of scientific applications in modeling and understanding social network structure, collecting complete network data remains logistically and financially challenging, especially in the social sciences. This paper introduces a latent space representation of social network structure for partially observed network data. We derive a multivariate measure of expected (latent) distance between an observed actor and unobserved actors with given features. We also draw novel parallels between our work and dependent data in spatial and ecological statistics. An application using a random digit-dial telephone survey further demonstrates the contribution of our model. The latent space model for networks represents high-dimensional network structure through a projection to a low-dimensional latent geometric space, encoding dependence as distance in the space. We develop a latent space model for cases when complete network data are unavailable. We focus specifically on Aggregated Relational Data (ARD), which measure network structure indirectly by asking respondents how many connections they have with members of a certain subpopulation (e.g., "How many individuals living with HIV/AIDS do you know?") and are easily added to existing surveys. Instead of conditioning on the (latent) distance between two members of the network, the latent space model for ARD conditions on the expected distance between a survey respondent and the center of a subpopulation in the latent space. A spherical latent space facilitates tractable computation of this expectation. This model estimates relative homogeneity between groups in the population and variation in the propensity for interaction between respondents and group members. The model also estimates features of groups that are difficult to reach using standard surveys (the homeless, for example).

Key Words: Bayesian methods, density estimation, partially observed social network