Abstract

We introduce a novel geometric framework for separating the phase and the amplitude variability in functional data of the type frequently studied in growth curve analysis. This framework uses the Fisher-Rao Riemannian metric to derive a proper distance on the quotient space of functions modulo the time-warping group. A convenient square-root velocity function (SRVF) representation transforms the Fisher-Rao metric into the standard L2 metric, simplifying the computations. This distance is then used to define a Karcher mean template, and the individual functions are warped to align them with this template. The strength of this framework is demonstrated by deriving a consistent estimator of a signal observed under random warping, scaling, and vertical translation. The new method is shown empirically to outperform several recently published methods for registration and alignment of functional data.
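
For readers unfamiliar with the SRVF, its standard definition and the invariance property that makes the resulting distance well behaved under warping can be sketched as follows. The symbols f, q and γ are notation assumed here for illustration; they do not appear in the text above.

```latex
% SRVF of an absolutely continuous function f on [0,1]
q(t) = \operatorname{sign}\bigl(\dot f(t)\bigr)\,\sqrt{\lvert \dot f(t)\rvert}

% Warping f by \gamma (f \mapsto f \circ \gamma) acts on the SRVF as
(q,\gamma) = (q \circ \gamma)\,\sqrt{\dot\gamma}

% A common warping of two functions preserves the L2 distance between their
% SRVFs (substitute s = \gamma(t), ds = \dot\gamma(t)\,dt):
\bigl\lVert (q_1,\gamma) - (q_2,\gamma) \bigr\rVert^2
  = \int_0^1 \bigl\lvert q_1(\gamma(t)) - q_2(\gamma(t)) \bigr\rvert^2\,\dot\gamma(t)\,dt
  = \int_0^1 \bigl\lvert q_1(s) - q_2(s) \bigr\rvert^2\,ds
  = \lVert q_1 - q_2 \rVert^2
```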

Project Description

The problem of statistical analysis in function spaces is important in a wide variety of applications arising in nearly every branch of science, ranging from speech processing to geology, biology and chemistry. One frequently encounters problems in which the observations are real-valued functions on an interval and the goal is their statistical analysis. By statistical analysis we mean comparing, aligning, averaging, and modeling a collection of such random observations. These problems can, in principle, be addressed using tools from functional analysis, e.g. using the L2 Hilbert structure of the function spaces, where one can compute L2 distances, cross-sectional (i.e. point-wise) means and variances, and principal components of the observed functions. However, a serious challenge arises when the functions are observed with flexibility, or domain warping, along the x axis. This warping may come from uncertainty in the measurement process, or it may reflect an inherent variability in the underlying process itself that needs to be separated from the variability along the y axis (the vertical axis), such as variations in maturity in the context of growth curves. As another possibility, the warping may be introduced as a tool to horizontally align the observed functions, reduce their variance, and increase parsimony in the resulting model. We will call these functions elastic functions, keeping in mind that we allow only the x axis (the domain) to be warped, with the y-values changing only as a consequence.
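
The difficulty with cross-sectional statistics under domain warping can be seen in a small numerical sketch. The Gaussian bump shape, the grid, and the names `bump`, `centers` and `peak_of_mean` below are hypothetical choices made only for illustration; they are not taken from the project's data or code.

```python
import numpy as np

# Common time grid on [0, 1].
t = np.linspace(0.0, 1.0, 501)

def bump(center, height=1.0, width=0.05):
    """A single Gaussian-shaped peak (hypothetical test function)."""
    return height * np.exp(-0.5 * ((t - center) / width) ** 2)

# Seven functions with identical height but different peak locations,
# i.e. they differ only in phase.
centers = np.linspace(0.35, 0.65, 7)
functions = np.stack([bump(c) for c in centers])

# Cross-sectional (point-wise) mean, as in ordinary L2 analysis.
cross_sectional_mean = functions.mean(axis=0)

individual_peaks = functions.max(axis=1)
peak_of_mean = cross_sectional_mean.max()

print("individual peak heights:", np.round(individual_peaks, 3))
print("peak height of point-wise mean:", round(float(peak_of_mean), 3))
```

Every individual function peaks at height close to 1, yet the point-wise mean is a much flatter, wider curve that resembles none of them. This is the kind of distortion that aligning the functions before averaging is meant to avoid.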

Consider the set of functions shown in the above left panel. These functions differ from each other in both the heights and the locations of their peaks and valleys. One would like to separate the variability associated with the heights, called the amplitude variability, from the variability associated with the locations, called the phase variability. Extracting the amplitude variability implies temporally aligning the given functions using nonlinear time warping, with the result shown in the bottom right. The corresponding set of warping functions, shown in the top right, represents the phase variability. The phase component can also be illustrated by applying these warping functions to the same function, as shown in the top right. The main reason for separating functional data into these components is to better preserve the structure of the observed data, since separate modeling of the amplitude and phase variability is more natural, parsimonious and efficient.
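
Applying several warping functions to one function, as described above, can be sketched as follows. The two-peak `template` and the one-parameter warping family `warp(a)` are hypothetical choices for illustration only; any smooth, strictly increasing map of [0, 1] onto itself that fixes the endpoints would serve equally well.

```python
import numpy as np

# Common time grid on [0, 1].
t = np.linspace(0.0, 1.0, 1001)

def template(s):
    """A two-peak function (hypothetical), evaluated at points s in [0, 1]."""
    return (np.exp(-0.5 * ((s - 0.3) / 0.06) ** 2)
            + 0.6 * np.exp(-0.5 * ((s - 0.7) / 0.06) ** 2))

def warp(a):
    """Warping gamma_a(t) = (exp(a t) - 1) / (exp(a) - 1).

    Strictly increasing with gamma(0) = 0 and gamma(1) = 1; a = 0 gives
    the identity warp.
    """
    if abs(a) < 1e-12:
        return t.copy()
    return np.expm1(a * t) / np.expm1(a)

# A small family of warping functions: this is the phase component.
warps = [warp(a) for a in (-1.0, -0.5, 0.0, 0.5, 1.0)]

# Apply every warp to the same function: f composed with gamma.
warped = np.stack([template(g) for g in warps])

peak_heights = warped.max(axis=1)
peak_locations = t[warped.argmax(axis=1)]

print("peak heights:  ", np.round(peak_heights, 3))
print("peak locations:", np.round(peak_locations, 3))
```

The warped copies share the same peak heights up to grid resolution, while the peak locations move along the time axis, so all the variability in this synthetic family is phase variability.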

Related Publications