Directed graphical models are commonly used to model causal relations between random variables and to understand conditional independencies in their joint distributions. We focus on the crucial task of structure learning, which aims to recover graphical structures using observational data sampled from distributions that obey certain underlying graphical model.

A common challenge in structure learning is the computational and statistical cost of learning large graphs or using high dimensional data. In this dissertation, we study four cases where the efficiency of structure learning could be improved over existing methods. We propose new algorithms and provide theoretical guarantees for the consistency of these algorithms.

First, we study a simple setting of linear structural equation model (SEM) with equal error variances. It is known that in this setting the DAG can be uniquely identified from observational data. We proposed a simple yet state-of-the-art procedure that sequentially estimates the causal ordering of the random variables. This procedure is consistent and readily extendable to high-dimensional setting. Next we consider the problem of structure learning in sparse high-dimensional settings that may be subject to the presence of unmeasured confounders, as well as selection bias. We propose a new local notion of sparsity for consistent structure learning in the presence of latent and selection variables, and develop a new version of the Fast Causal Inference (FCI) algorithm with reduced computational and sample complexity, which we refer to as local FCI (lFCI). The new notion of sparsity allows the presence of highly connected hub nodes, which are common in real-world networks, but problematic for existing methods. Our numerical experiments indicate that the lFCI algorithm achieves state-of-the-art performance across many classes of large random networks containing hub nodes. As part of the investigation on efficiency improvement, we also study the graphical characterization of ancestral relations via CPDAGs and d-separation relations. We propose a framework that can learn definite non-ancestral relations without first learning the skeleton. We demonstrated that this framework yields structural information that can be used in both score- and constraint-based algorithms to learn causal DAGs more efficiently. Finally, we consider a DAG learning problems where partial ordering of variables is available. We discuss a general procedure for discovering DAGs with arbitrary structure from partial orderings. We also present an efficient framework for estimation of high-dimensional, sparse directed acyclic graphs.