Estimating causal effects of natural language from text experiments
Seminar presented by Eli Ben-Michael

As language technologies become widespread, text experiments, in which texts are randomly assigned to readers, are increasingly used by social scientists and technology developers to understand how language affects perceptions and behavior. This talk presents a framework for estimating causal effects of language from such experiments. A key challenge is the high-dimensional nature of language, which leads to positivity violations and low effective sample sizes. We address this by characterizing language-encoded interventions as stochastic interventions that average over the distribution of texts in a corpus. We distinguish between "natural effects," which capture the effect of a language attribute together with correlated attributes, and "isolated effects," which capture the effect of the attribute while holding others fixed. We show that natural effects are easily identified and estimated in text experiments; however, because experimental corpora are not always representative of naturally occurring language, we propose a method to generalize from a randomized text corpus to any target corpus. In contrast, estimating isolated effects is challenging even in randomized text experiments, as it requires approximating and adjusting for all non-focal language attributes. We link this to learning text representations and use principles of omitted variable bias to evaluate isolated effect estimation along two axes: the fidelity and the overlap of the text representations. Finally, we apply these ideas to large language model (LLM) alignment, showing how to learn generative text models that optimally cause desired impacts and how to de-bias correlational approaches to LLM alignment. Throughout, we demonstrate these approaches in applied settings including political persuasion and the perception of hate speech.
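To make the distinction concrete, here is a minimal simulation sketch (not from the talk; all data and variable names are hypothetical) of a "natural effect" estimate in a randomized text experiment: readers are shown texts drawn at random, and we contrast mean outcomes between texts that do and do not carry a focal language attribute, implicitly averaging over whatever other attributes co-occur with it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical randomized text experiment: each reader sees one text drawn
# uniformly from an experimental corpus. A is a binary focal attribute
# (e.g., whether the text uses emotional language); C is a correlated
# non-focal attribute; Y is the reader's response.
n = 2000
A = rng.integers(0, 2, size=n)          # focal language attribute
C = 0.5 * A + rng.normal(size=n)        # non-focal attribute, correlated with A
Y = 1.0 * A + 0.8 * C + rng.normal(size=n)

# Natural effect: simple difference in means. Because texts are randomized,
# this is identified, but it bundles the focal attribute's effect (1.0)
# with the pathway through the correlated attribute (0.8 * 0.5).
natural_effect = Y[A == 1].mean() - Y[A == 0].mean()
print(f"natural effect estimate: {natural_effect:.2f}")  # close to 1.4 by construction
```

The point of the sketch is that the contrast is easy to estimate under randomization, yet it deliberately includes the contribution of correlated non-focal attributes.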

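By contrast, an "isolated effect" requires adjusting for the non-focal attributes. A minimal sketch under the same hypothetical data-generating process, assuming a text representation that perfectly captures the single non-focal attribute (in practice the talk's point is that representations are imperfect, motivating the fidelity/overlap analysis):

```python
import numpy as np

rng = np.random.default_rng(1)

# Same hypothetical setup: focal attribute A, correlated non-focal
# attribute C, reader response Y.
n = 2000
A = rng.integers(0, 2, size=n)
C = 0.5 * A + rng.normal(size=n)
Y = 1.0 * A + 0.8 * C + rng.normal(size=n)

# Isolated effect: adjust for the non-focal attribute by including the
# representation in an outcome regression (OLS via least squares).
X = np.column_stack([np.ones(n), A, C])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(f"isolated effect estimate: {beta[1]:.2f}")  # close to 1.0 by construction
```

If the representation omitted C (or captured it with error), the coefficient on A would be biased toward the natural effect, which is exactly the omitted-variable-bias logic the abstract invokes.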