Loew Hall

# Location estimation for symmetric log-concave densities

We revisit the problem of adaptive estimation of the center of symmetry of an unknown symmetric distribution under the additional shape constraint of log-concavity. This problem was investigated by early authors such as Stone (1975), van Eeden (1970), and Sacks (1975), who constructed adaptive estimators that depend on tuning parameters. The additional assumption of log-concavity lets us construct simpler estimators that can be computed efficiently without tuning parameters. To estimate the center of symmetry we consider truncated one-step estimators.
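As an illustrative sketch (not the truncated estimator of the talk), a one-step location estimator starts from a preliminary root-$n$ consistent estimate such as the sample median and applies a single Newton step using the score of an assumed density; here the logistic density, which is symmetric and log-concave, stands in for the unknown $f$, and `one_step_location` is a hypothetical helper name:

```python
import numpy as np

def one_step_location(x, score, score_deriv):
    """One Newton step from the sample median.

    score(u): psi(u) = -f'(u)/f(u), the location score of the assumed density
    score_deriv(u): its derivative, used for the information term
    """
    theta0 = np.median(x)              # preliminary root-n consistent estimate
    u = x - theta0
    # One-step update: theta0 + sum of scores / sum of score derivatives
    return theta0 + score(u).sum() / score_deriv(u).sum()

# Logistic density: psi(u) = tanh(u/2), psi'(u) = 0.5 / cosh(u/2)**2
rng = np.random.default_rng(0)
x = rng.logistic(loc=3.0, scale=1.0, size=5000)
theta = one_step_location(x, lambda u: np.tanh(u / 2),
                          lambda u: 0.5 / np.cosh(u / 2) ** 2)
```

With the score correctly specified, the one-step estimator is asymptotically as efficient as the MLE.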

# Statistical Models for Social Network Data and Processes

This work deals with three areas of network modeling. First, in the area of latent space modeling of social networks, it develops and extends latent cluster social network models by adding random effects and providing efficient algorithms for fitting these models. Second, it explores properties of exponential-family random graph models (ERGMs) and ERGM-based models under changing network size, and proposes a way of addressing the problems that arise.

# Fast Automatic Unsupervised Image Segmentation and Curve Detection in Spatial Point Processes

Advisor: Adrian Raftery

# A Flexible Framework for Bayesian Learning and Estimation Using Gaussian Graphical Models

We develop a framework for the modeling of joint distributions of high-dimensional data that is robust to a variety of data types and modeling paradigms. Central to our considerations are the issues of structural learning and posterior parameter estimation. By building our framework from Gaussian Graphical Models (GGMs) we are able to separate the learning and estimation problems, thereby proposing a methodology possessing desirable computational features.
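A minimal numerical illustration of the structural-learning idea behind GGMs, under an assumed three-variable chain: a missing edge corresponds to conditional independence given the remaining variables, which appears as a (near-)zero entry in the precision matrix:

```python
import numpy as np

# Chain X - Y - Z: X and Z are conditionally independent given Y,
# so the precision (inverse covariance) entry Omega[X, Z] is zero.
rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)      # Y depends on X
z = 0.8 * y + rng.normal(size=n)      # Z depends on Y only
cov = np.cov(np.vstack([x, y, z]))    # rows are variables
omega = np.linalg.inv(cov)            # estimated precision matrix
```

Here `omega[0, 2]` is close to zero (no X-Z edge) while `omega[0, 1]` is clearly nonzero (X-Y edge present).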

# Lifetime and Disease Onset Distributions from Incomplete Observations

Advisor: Jon Wellner

# Goodness-of-Fit Statistics Based on Phi-Divergences

# Lattice Conditional Independence Models for Missing Observations in Categorical Data, in Continuous Data, and for Seemingly Unrelated Regressions

Advisor: Michael Perlman

# Pricing Options in Incomplete Markets: Regression with Constraints

**Joint Computational Finance and Optimization Seminar**

We consider a new approach to pricing options in incomplete markets. The algorithm replicates an option by a portfolio consisting of a stock and a bond, and simultaneously calculates prices of options at all strikes. We apply a linear regression framework with constraints that can accommodate various assumptions on the stochastic process of the underlying security. The model can be calibrated directly to historical prices of the underlying security using assumptions on the class of replication policies.
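As a hedged sketch of the regression-with-constraints idea (not the seminar's actual algorithm), one can regress a call payoff on stock and bond holdings under a nonnegativity (no-short) constraint; `nnls_pg` is an illustrative projected-gradient solver, not a named method from the talk:

```python
import numpy as np

def nnls_pg(A, b, iters=5000, lr=None):
    """Nonnegative least squares min ||Ax - b||^2 s.t. x >= 0,
    via projected gradient descent (illustrative, not production code)."""
    if lr is None:
        lr = 1.0 / np.linalg.norm(A, 2) ** 2   # step 1/L, L = spectral norm^2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = np.maximum(0.0, x - lr * (A.T @ (A @ x - b)))
    return x

# Toy replication: call payoff with strike 100 regressed on the terminal
# stock price and a bond, with nonnegative holdings in both.
rng = np.random.default_rng(2)
S_T = 100 * np.exp(rng.normal(0.0, 0.2, size=1000))   # terminal stock prices
payoff = np.maximum(S_T - 100, 0.0)                    # call payoff
A = np.column_stack([S_T, np.ones_like(S_T)])          # stock and bond columns
w = nnls_pg(A, payoff)                                 # replication weights
```

Swapping the constraint set changes the class of admissible replication policies; the projection step is all that needs to change.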

# Multiscale Testing of Physical Processes Using the Discrete Wavelet Transform

Advisors: Peter Guttorp & Don Percival

# Nonparametric Probability Density Estimation: Theory, Modeling and Computation

We formulate nonparametric density estimation as a constrained maximum likelihood problem whose constraints model any prior information available about the density. This technique will be used as a vehicle to illustrate the importance of including non-data information in the formulation of an estimation problem. For example, this non-data information may take the form of bounds on moments or specification of support, shape or smoothness.
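Schematically, the constrained maximum likelihood formulation described above can be written as (the constraint class $\mathcal{C}$ is our notation for whatever prior information is imposed):

```latex
\hat{f} = \arg\max_{f} \sum_{i=1}^{n} \log f(x_i)
\quad \text{subject to} \quad
f \ge 0, \quad \int f(x)\,dx = 1, \quad f \in \mathcal{C},
```

where $\mathcal{C}$ might encode a moment bound such as $\int x^2 f(x)\,dx \le \sigma^2$, a specified support, a shape constraint such as unimodality, or a smoothness bound such as $\int (f'')^2 \le c$.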

# Morphological and Bayesian Approaches in Image Restoration

Two of the most successful approaches in image restoration are mathematical morphology and Bayesian image restoration. They arise from different philosophies and are formulated very differently. Mathematical morphology involves a basic set of elementary operators that are usually combined to perform non-linear filtering of noisy images. The Bayesian approach involves finding the best interpretation of the images assuming a probabilistic model. Our aim is to investigate the possible relationships between these two approaches. The interest of such a study is twofold.

# Nonparametric Bayesian Classification

A Bayesian approach to the classification problem is proposed in which random partitions play a central role. It is argued that the partitioning approach has the capacity to take advantage of a variety of large-scale spatial structures, if they are present in the unknown regression function $f_0$. An idealized one-dimensional problem is considered in detail. The proposed nonparametric prior is found to provide a consistent estimate of the regression function in the $L^p$ topology, for any $1 \leq p < \infty$, and for arbitrary measurable $f_0:[0,1] \rightarrow [0,1]$.

# A Hidden Two-Dimensional Irreversible Stochastic Compartment Model

Advisor: Peter Guttorp

We consider a special case of the two-dimensional stochastic $n$-compartment or stepping-stone model. This class of models represents a special type of Markov population process in which the state of the process at a given time $t$ is represented by $(\mathbf{X}_1(t), \mathbf{X}_2(t)) = (X_{11}(t), \ldots, X_{1n}(t); X_{21}(t), \ldots, X_{2n}(t))$. In our example, all compartments except the $n$th are assumed completely unobservable, while in the $n$th we are able to obtain only a sample of each dimension at discrete time intervals. The first compartment is a hidden linear birth-emigration process.
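A minimal simulation sketch of a first compartment of this type, assuming a linear birth-emigration process with per-individual birth rate $\lambda$ and emigration rate $\mu$ (Gillespie-style event simulation; the rates and horizon are purely illustrative):

```python
import numpy as np

def simulate_birth_emigration(x0, birth, emig, t_end, rng):
    """Exact event-by-event simulation: each of the x current individuals
    gives birth at rate `birth` and emigrates (to the next compartment)
    at rate `emig`, so the total event rate is x * (birth + emig)."""
    t, x, emigrants = 0.0, x0, 0
    while x > 0:
        t += rng.exponential(1.0 / (x * (birth + emig)))
        if t > t_end:
            break
        if rng.random() < birth / (birth + emig):
            x += 1                      # birth event
        else:
            x -= 1                      # emigration to compartment 2
            emigrants += 1
    return x, emigrants

rng = np.random.default_rng(6)
x_final, n_emig = simulate_birth_emigration(
    50, birth=1.0, emig=0.5, t_end=1.0, rng=rng)
```

In the hidden-compartment setting only downstream counts (or samples of them) would be observed, not `x_final` itself.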

# Extended Linear Modeling with Splines

Extended linear models form a very general framework for statistical modeling. Many practically important contexts fit into this framework, including regression, logistic or Poisson regression, density estimation, spectral density estimation, and conditional density estimation. Moreover, hazard regression, proportional hazard regression, marked counting process regression, and diffusion processes with or without jumps, all perhaps with time-dependent covariates, also fit into this framework.
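For the plain regression case of this framework, a minimal sketch using a truncated power spline basis (the knot placement, degree, and `spline_basis` helper are illustrative assumptions, not the talk's specification):

```python
import numpy as np

def spline_basis(x, knots, degree=3):
    """Truncated power basis for a polynomial spline:
    1, x, ..., x^degree, plus (x - k)_+^degree for each interior knot k."""
    cols = [x ** j for j in range(degree + 1)]
    cols += [np.maximum(x - k, 0.0) ** degree for k in knots]
    return np.column_stack(cols)

# Least-squares spline fit: the ordinary regression instance of an
# extended linear model.
rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 1, 300))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, size=x.size)
B = spline_basis(x, knots=np.linspace(0.1, 0.9, 8))
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
fitted = B @ coef
```

The other contexts listed above (logistic, Poisson, hazard regression, and so on) keep the same spline-space structure but replace least squares with the appropriate likelihood.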

# In Memory of Lucien Le Cam (1924 - 2000): Seven of Le Cam's Theorems Every Statistician Should Know

Lucien Le Cam spent most of his career at the University of California, Berkeley. He created and developed large parts of the current large sample theory of statistics, including contiguity theory, approximation of experiments, rates of Poisson approximation, tightness and basic theory for empirical processes, Poissonization inequalities, preservation of local asymptotic normality under information loss, and methods for establishing rates of convergence in nonparametric problems in terms of measures of metric dimension investigated by Kolmogorov.

# The Forensic and Statistical Science of Evidence Derived from Fragments of Glass

When someone breaks a window or another type of glass, tiny fragments of glass are transferred onto their clothing. If the breaking of this window, or some related event, is a criminal offence, then these fragments become evidence. The physical properties of the recovered glass fragments can be matched with those of the putative source. However, other processes are at work that affect the quantity and strength of the evidence.

# Estimating Local Trends in Large Environmental Spatial Temporal Databases

Many large-scale environmental databases have been produced in recent years for the purpose of knowledge discovery related to processes such as greenhouse gas cycling and large-scale hydrology. These databases typically extend over a period of 20 to 50 years and over large spatial domains, such as continents, hemispheres, or even the entire global terrestrial domain. The possible existence of (temporal) trends is one of the primary topics of interest to environmental scientists.
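As a minimal sketch of local trend estimation (a baseline, not the method of the talk), per-cell least-squares slopes can be computed over a space-time array in one vectorized step:

```python
import numpy as np

# data has shape (T, ny, nx): a time series in every grid cell.
# Fit value ~ a + b * t cell by cell; b is the local trend.
rng = np.random.default_rng(7)
T, ny, nx = 30, 4, 5
t = np.arange(T, dtype=float)
true_slope = rng.normal(0, 0.1, size=(ny, nx))          # synthetic trends
data = true_slope * t[:, None, None] + rng.normal(0, 0.5, size=(T, ny, nx))

tc = t - t.mean()                                        # centered time
# OLS slope per cell: sum_t tc * (y - ybar) / sum_t tc^2, vectorized
slopes = np.tensordot(tc, data - data.mean(axis=0), axes=(0, 0)) / (tc @ tc)
```

Real databases would of course also require handling of missing values, autocorrelation, and spatial dependence, which this sketch ignores.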

# Probabilities on Pedigrees: What? How? and Why?

Patterns of inheritance of genes on pedigrees underlie similarities among relatives, and hence approaches to the analysis of genetic data observed on related individuals. With modern genetic technology, data are often available for large numbers of genetic loci, sometimes on large sets of interrelated individuals. The space of underlying inheritance patterns consistent with the data is then not only huge, but also tightly constrained by the laws of genetics.

# Empirical Margin Distributions and Bounding the Generalization Error in Learning Problems

We will consider a new class of probabilistic upper bounds on generalization error of complex classifiers that are "combinations" of simpler classifiers from a base class of functions. Such combinations can be implemented by neural networks or by voting methods of combining the classifiers, such as boosting and bagging. The resulting combined classifiers often have a large classification margin. The bounds on the generalization error are expressed in such cases in terms of the empirical distribution of the margins of the combined classifier.
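A small sketch of the quantity the bounds are stated in terms of: for a voted combination $f(x) = \sum_j w_j h_j(x)$ of $\pm 1$-valued base classifiers, the margin of example $(x, y)$ is $y f(x)$, and the bounds involve the empirical fraction of margins at or below a threshold $\delta$ (the simulated base classifiers here are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 500, 25
y = rng.choice([-1, 1], size=n)                   # labels
# Base classifiers that each agree with y with probability 0.7.
h = np.where(rng.random((m, n)) < 0.7, y, -y)     # shape (m, n), values +/-1
w = np.full(m, 1.0 / m)                           # uniform voting weights
margins = y * (w @ h)                             # empirical margins in [-1, 1]
frac_below = np.mean(margins <= 0.2)              # P_n(margin <= delta)
```

A large typical margin (here around 0.4) with a small `frac_below` is exactly the regime in which these margin-based bounds are informative.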

# Causal Inference From Graphical Time Series Models

Graphical models have become an important tool for analyzing multivariate data. While originally the interpretation of graphical models has been restricted to conditional independences between variables, there has recently been growing interest in graphical models as a general framework for causal modelling and inference in experimental and observational studies. In this talk we discuss two approaches for the identification of cause-effect relationships from multivariate time series data. Both approaches base causal inference on the fact that an effect cannot precede its cause in time.

# Bayesian Multidimensional Scaling and Choice of Dimension

Multidimensional scaling is widely used to handle data which consist of dissimilarity measures between pairs of objects or people, and consists of estimating an object configuration in Euclidean space such that the estimated distances are related to the dissimilarities. Problems of this kind are pervasive in psychology and social science, and have arisen recently in areas such as document clustering, classification of Web sites, gene expression data, and data mining.
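For background, the classical (non-Bayesian) MDS solution can be sketched in a few lines: double-center the squared dissimilarities and take the top eigenpairs. This is a standard baseline, not the Bayesian method of the talk:

```python
import numpy as np

def classical_mds(D, dim=2):
    """Classical (Torgerson) multidimensional scaling: embed objects in
    R^dim so Euclidean distances approximate the dissimilarities D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:dim]           # top `dim` eigenpairs
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

# With an exact Euclidean distance matrix, the configuration is recovered
# up to rotation and translation, so the distances are reproduced exactly.
rng = np.random.default_rng(5)
X = rng.normal(size=(10, 2))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
Y = classical_mds(D, dim=2)
D_hat = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
```

The Bayesian treatment adds a probability model for the observed dissimilarities, which among other things gives a principled way to choose the embedding dimension.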

# Detecting and Extracting Complex Patterns from Images and Realizations of Spatial Point Processes

Advisor: Adrian Raftery

# Likelihood Inference for Parametric Models of Dispersal

Advisor: Elizabeth Thompson