Fang Han
Associate Professor, University of Washington
Email | fanghan@uw.edu |
Phone | +1 206 221-6560 |
UW Box Number | 354322 |
ORCID iD | 0000-0003-2996-5693 |
Preprints
On Rosenbaum's Rank-based Matching Estimator
Matias D. Cattaneo, Fang Han, Zhexiao Lin
In two influential contributions, Rosenbaum (2005, 2020) advocated for using the distances between component-wise ranks, instead of the original data values,…
On regression-adjusted imputation estimators of the average treatment effect
Zhexiao Lin, Fang Han
Imputing missing potential outcomes using an estimated regression function is a natural idea for estimating causal effects. In the literature, estimators that…
On the failure of the bootstrap for Chatterjee's rank correlation
Zhexiao Lin, Fang Han
While researchers commonly use the bootstrap for statistical inference, many of us have realized that the standard bootstrap, in general, does not work for…
On the adaptation of causal forests to manifold data
Yiyi Huo, Yingying Fan, Fang Han
Researchers often hold the belief that random forests are "the cure to the world's ills" (Bickel, 2010). But how exactly do they achieve this? Focused on the…
Limit theorems of Chatterjee's rank correlation
Zhexiao Lin, Fang Han
Establishing the limiting distribution of Chatterjee's rank correlation for a general, possibly non-independent, pair of random variables has been eagerly…
On propensity score matching with a diverging number of matches
Yihui He, Fang Han
This paper reexamines Abadie and Imbens (2016)'s work on propensity score matching for average treatment effect estimation. We explore the asymptotic behavior…
On boosting the power of Chatterjee's rank correlation
Zhexiao Lin, Fang Han
Chatterjee (2021)'s ingenious approach to estimating a measure of dependence first proposed by Dette et al. (2013) based on simple rank statistics has quickly…
Nonparametric mixture MLEs under Gaussian-smoothed optimal transport distance
Fang Han, Zhen Miao, Yandi Shen
The Gaussian-smoothed optimal transport (GOT) framework, pioneered in Goldfeld et al. (2020) and followed up by a series of subsequent papers, has quickly…
Estimation based on nearest neighbor matching: from density ratio to average treatment effect
Zhexiao Lin, Peng Ding, Fang Han
Nearest neighbor (NN) matching as a tool to align data sampled from different groups is both conceptually natural and practically well-used. In a landmark…
On Azadkia-Chatterjee's conditional dependence coefficient
Hongjian Shi, Mathias Drton, Fang Han
In recent work, Azadkia and Chatterjee (2021) laid out an ingenious approach to defining consistent measures of conditional dependence. Their fully…
Azadkia-Chatterjee's correlation coefficient adapts to manifold data
Fang Han, Zhihan Huang
In their seminal work, Azadkia and Chatterjee (2021) initiated graph-based methods for measuring variable dependence strength. By appealing to nearest neighbor…
Fisher-Pitman permutation tests based on nonparametric Poisson mixtures with application to single cell genomics
Zhen Miao, Weihao Kong, Ramya Korlakai Vinayak, Wei Sun, Fang Han
This paper investigates the theoretical and empirical performance of Fisher-Pitman-type permutation tests for assessing the equality of unknown Poisson mixture…
On universally consistent and fully distribution-free rank tests of vector independence
Hongjian Shi, Marc Hallin, Mathias Drton, Fang Han
Rank correlations have found many innovative applications in the last decade. In particular, suitable rank correlations have been used for consistent tests of…
On the power of Chatterjee rank correlation
Hongjian Shi, Mathias Drton, Fang Han
Chatterjee (2021) introduced a simple new rank correlation coefficient that has attracted much recent attention. The coefficient has the unusual appeal that it…
Distribution-free consistent independence tests via center-outward ranks and signs
Hongjian Shi, Mathias Drton, Fang Han
This paper investigates the problem of testing independence of two random vectors of general dimensions. For this, we give for the first time a distribution…
On a phase transition in general order spline regression
Yandi Shen, Qiyang Han, Fang Han
In the Gaussian sequence model $Y= \theta_0 + \varepsilon$ in $\mathbb{R}^n$, we study the fundamental limit of approximating the signal $\theta_0$ by a class …
Optimal estimation of variance in nonparametric regression with random design
Yandi Shen, Chao Gao, Daniela Witten, Fang Han
Consider the heteroscedastic nonparametric regression model with random design \begin{align*} Y_i = f(X_i) + V^{1/2}(X_i)\varepsilon_i, \quad i=1,2,\ldots,n, …
High dimensional consistent independence testing with maxima of rank correlations
Mathias Drton, Fang Han, Hongjian Shi
Testing mutual independence for high-dimensional observations is a fundamental statistical challenge. Popular tests based on linear and simple rank…
Exponential inequalities for dependent V-statistics via random Fourier features
Yandi Shen, Fang Han, Daniela Witten
We establish exponential inequalities for a class of V-statistics under strong mixing conditions. Our theory is developed via a novel kernel expansion based on…
On rank estimators in increasing dimensions
Yanqin Fan, Fang Han, Wei Li, Xiao-Hua Zhou
The family of rank estimators, including Han's maximum rank correlation (Han, 1987) as a notable example, has been widely exploited in studying regression…
On Estimation of Isotonic Piecewise Constant Signals
Chao Gao, Fang Han, Cun-Hui Zhang
Consider a sequence of real data points $X_1,\ldots, X_n$ with underlying means $\theta^*_1,\dots,\theta^*_n$. This paper starts from studying the setting that…
Probability inequalities for high dimensional time series under a triangular array framework
Fang Han, Weibiao Wu
Study of time series data often involves measuring the strength of temporal dependence, on which statistical properties like consistency and central limit…
Asymptotic joint distribution of extreme eigenvalues and trace of large sample covariance matrix in a generalized spiked population model
Zeng Li, Fang Han, Jianfeng Yao
This paper studies the joint limiting behavior of extreme eigenvalues and trace of large sample covariance matrix in a generalized spiked population model,…
Moment bounds for large autocovariance matrices under dependence
Fang Han, Yicheng Li
The goal of this paper is to obtain expectation bounds for the deviation of large sample autocovariance matrices from their means under weak data dependence…
Tail behavior of dependent V-statistics and its applications
Yandi Shen, Fang Han, Daniela Witten
We establish exponential inequalities and Cramer-type moderate deviation theorems for a class of V-statistics under strong mixing conditions. Our theory is…
On inference validity of weighted U-statistics under data heterogeneity
Fang Han, Tianchen Qian
Motivated by challenges on studying a new correlation measurement being popularized in evaluating online ranking algorithms' performance, this manuscript…
An Extreme-Value Approach for Testing the Equality of Large U-Statistic Based Correlation Matrices
Cheng Zhou, Fang Han, Xinsheng Zhang, Han Liu
There has been an increasing interest in testing the equality of large Pearson's correlation matrices. However, in many applications it is more important to…
Pairwise Difference Estimation of High Dimensional Partially Linear Model
Fang Han, Zhao Ren, Yuxin Zhu
This paper proposes a regularized pairwise difference approach for estimating the linear component coefficient in a partially linear model, with consistency…
Distribution-Free Tests of Independence in High Dimensions
Fang Han, Shizhe Chen, Han Liu
We consider the testing of mutual independence among all entries in a $d$-dimensional random vector based on $n$ independent observations. We study two…
A Provable Smoothing Approach for High Dimensional Generalized Regression with Applications in Genomics
Fang Han, Hongkai Ji, Zhicheng Ji, Honglang Wang
In many applications, linear models fit the data poorly. This article studies an appealing alternative, the generalized regression model. This model only…
On Gaussian Comparison Inequality and Its Application to Spectral Analysis of Large Random Matrices
Fang Han, Sheng Xu, Wen-Xin Zhou
Recently, Chernozhukov, Chetverikov, and Kato [Ann. Statist. 42 (2014) 1564--1597] developed a new Gaussian comparison inequality for approximating the suprema…
An Exponential Inequality for U-Statistics under Mixing Conditions
Fang Han
The family of U-statistics plays a fundamental role in statistics. This paper proves a novel exponential inequality for U-statistics under the time series…
ECA: High Dimensional Elliptical Component Analysis in non-Gaussian Distributions
Fang Han, Han Liu
We present a robust alternative to principal component analysis (PCA) --- called elliptical component analysis (ECA) --- for analyzing high dimensional,…
Statistical analysis of latent generalized correlation matrix estimation in transelliptical distribution
Fang Han, Han Liu
Correlation matrices play a key role in many multivariate methods (e.g., graphical model estimation and factor analysis). The current state-of-the-art in…
Robust Inference of Risks of Large Portfolios
Jianqing Fan, Fang Han, Han Liu, Byron Vickers
We propose a bootstrap-based robust high-confidence level upper bound (Robust H-CLUB) for assessing the risks of large portfolios. The proposed approach…
Challenges of Big Data Analysis
Jianqing Fan, Fang Han, Han Liu
Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promises for discovering subtle…
A Direct Estimation of High Dimensional Stationary Vector Autoregressions
Fang Han, Huanran Lu, Han Liu
The vector autoregressive (VAR) model is a powerful tool in modeling complex time series and has been exploited in many fields. However, fitting high…
On the Impact of Dimension Reduction on Graphical Structures
Fang Han, Huitong Qiu, Han Liu, Brian Caffo
Statisticians and quantitative neuroscientists have actively promoted the use of independence relationships for investigating brain networks, genomic networks,…
Joint Estimation of Multiple Graphical Models from High Dimensional Time Series
Huitong Qiu, Fang Han, Han Liu, Brian Caffo
In this manuscript we consider the problem of jointly estimating multiple graphical models in high dimensions. We assume that the data are collected from n…
High Dimensional Semiparametric Scale-Invariant Principal Component Analysis
Fang Han, Han Liu
We propose a new high dimensional semiparametric principal component analysis (PCA) method, named Copula Component Analysis (COCA). The semiparametric model…
Sparse Median Graphs Estimation in a High Dimensional Semiparametric Model
Fang Han, Han Liu, Brian Caffo
In this manuscript a unified framework for conducting inference on complex aggregated data in high dimensional settings is proposed. The data are assumed to be…
Sparse Principal Component Analysis for High Dimensional Vector Autoregressive Models
Zhaoran Wang, Fang Han, Han Liu
We study sparse principal component analysis for high dimensional vector autoregressive time series under a doubly asymptotic framework, which allows the…