Preference data, such as rankings and ratings, are prevalent in the social sciences for expressing and measuring attitudes or opinions. Preferences are often aggregated using deterministic algorithms or summary statistics, which cannot quantify uncertainty or identify preference heterogeneity in a population. This thesis proposes new methodologies for statistical preference analysis that aid accurate estimation, inference, and decision-making with preference data in social science applications.

First, motivated by previous attempts to integrate ordinal and cardinal data in psychometrics and computer science, we propose two joint statistical models for rankings and ratings, which are among the first of their kind. Our models exploit the distinct and complementary properties of rankings and ratings to estimate fine-grained preferences in a population and identify potential heterogeneity. The proposed models impose few assumptions and permit many common preference data types, allowing their use in a variety of applications. We propose computationally efficient frequentist and Bayesian estimation frameworks, and apply the models to real peer review data and preference survey data.

Second, we propose a Bayesian methodology for estimating rank-clusters from rankings. Rank-clusters arise when objects in a collection are equal in quality and thus should be clustered in their population-level rank. We extend previous frequentist work on rank-clustering of pairwise comparison data to permit analysis of more flexible ordinal data types. Furthermore, the model relies on a Bayesian framework that naturally accommodates prior information and uncertainty quantification. We apply our model to real ranked-choice election data to analyze voters' perceptions of candidates.