Distance Profiles and Conformal Inference for Random Objects
Random objects are metric-space valued data. Under mild regularity conditions, their underlying probability measure can be characterized by distance profiles, which are one-dimensional distributions of the probability mass falling into balls of increasing radius.
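As a rough illustration of this definition, the sketch below computes empirical distance profiles from a pairwise distance matrix. The function name `distance_profiles`, the radius grid, and the toy data on the real line are assumptions made for illustration and are not taken from the talk; the estimator used in the underlying work may differ in its details.

```python
import numpy as np

def distance_profiles(D, radii):
    """Empirical distance profiles from a pairwise distance matrix.

    D[i, j] is the metric distance between objects i and j.
    Row i of the result is the fraction of the other objects that lie
    within each radius of object i (an empirical CDF evaluated on `radii`).
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # within[i, j, k] is True when object j is within radii[k] of object i
    within = D[:, :, None] <= np.asarray(radii)[None, None, :]
    # Exclude each object's zero distance to itself
    within[np.arange(n), np.arange(n), :] = False
    return within.sum(axis=1) / (n - 1)

# Toy example (assumed data): four points on the real line, absolute-value metric
x = np.array([0.0, 1.0, 2.0, 10.0])
D = np.abs(x[:, None] - x[None, :])
radii = np.array([1.0, 2.0, 10.0])
profiles = distance_profiles(D, radii)
```

In this toy example the isolated point at 10 has a profile that stays at zero until the largest radius, whereas the points near the bulk of the data accumulate mass at small radii.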
Harvesting pairwise optimal transports between estimated distance profiles leads to a measure of centrality for random objects that is useful for data analysis in general spaces. In the presence of Euclidean (vector) predictors, the conditional average cost of transporting a given distance profile to all other distance profiles can be used as a conditional conformity score. When used in the split conformal algorithm, these scores lead to conditional prediction sets with asymptotic conditional validity. Illustrations are based on network data from New York taxi trips and on compositional data that reflect the energy sourcing of U.S. states. This talk is based on joint work with Yaqing Chen (Rutgers) and Paromita Dubey (USC), and with Hang Zhou (Davis).
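The transport-cost idea in the abstract can be sketched in a simplified, unconditional form. The code below is a minimal illustration under stated assumptions, not the method presented in the talk: it uses the 1-Wasserstein cost between empirical distance profiles, ignores predictors entirely, and pairs the resulting scores with the standard marginal split conformal quantile. The names `average_transport_cost` and `split_conformal_threshold` are hypothetical.

```python
import math
import numpy as np

def average_transport_cost(D):
    """Average 1-Wasserstein cost from each distance profile to all others.

    Assumption: profile i is represented by the n - 1 distances from
    object i to the other objects. For two one-dimensional samples of
    equal size, the 1-Wasserstein cost is the mean absolute difference
    of their order statistics.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    # Row i: sorted distances from object i to the other n - 1 objects
    sorted_d = np.sort(D[off_diag].reshape(n, n - 1), axis=1)
    # w1[i, k]: transport cost between profile i and profile k
    w1 = np.abs(sorted_d[:, None, :] - sorted_d[None, :, :]).mean(axis=2)
    # Average over the other n - 1 profiles (w1[i, i] is zero)
    return w1.sum(axis=1) / (n - 1)

def split_conformal_threshold(calibration_scores, alpha):
    """Standard split conformal cutoff: the ceil((n + 1)(1 - alpha))-th
    smallest calibration score, or infinity if that rank exceeds n."""
    scores = np.sort(np.asarray(calibration_scores, dtype=float))
    n = scores.size
    k = math.ceil((n + 1) * (1 - alpha))
    return math.inf if k > n else float(scores[k - 1])

# Toy example (assumed data): four points on the real line
x = np.array([0.0, 1.0, 2.0, 10.0])
D = np.abs(x[:, None] - x[None, :])
costs = average_transport_cost(D)
```

Here a large average cost marks an object whose distance profile is unlike the others, so the isolated point at 10 receives the highest score. In a split conformal procedure, scores computed on a held-out calibration set would be passed to `split_conformal_threshold`, and candidate objects with scores at or below the cutoff would form the prediction set.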