Statistical Shape Analysis & Modeling Group

Statistical Inferences in the Spike Train Space

The goal of this project is to develop a novel, data-driven framework for statistical inference in the space of neural spike trains. Statistical inference methods have played an essential role in the analysis of neural spike trains. However, current approaches are mainly based on probabilistic representations at each time point and therefore cannot directly address the statistical structure of the spike train space itself. Motivated by the abundance of spike train data from experimental recordings, we propose a data-driven framework that addresses this issue by treating each spike train as a single point in an infinite-dimensional function space. The new framework will be based on novel metric (distance) systems on spike trains. We will construct new tools for: 1) quantifying differences between spike trains, 2) computing summary statistics such as the mean and covariance of spike trains, and 3) performing statistical inference in the spike train space.
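To make "quantifying differences in spike trains" concrete, the sketch below computes one well-known spike train metric, the Victor-Purpura spike-time distance, by dynamic programming. This metric is an assumption chosen purely for illustration; it is not the novel metric system proposed in this project. The function name and the cost parameter `q` (cost per unit time of shifting a spike) are likewise illustrative.

```python
from typing import Sequence


def victor_purpura_distance(a: Sequence[float], b: Sequence[float], q: float) -> float:
    """Victor-Purpura distance between two sorted lists of spike times.

    Illustrative stand-in metric (an assumption, not the project's own):
    inserting or deleting a spike costs 1, shifting a spike by dt costs q * |dt|.
    """
    n, m = len(a), len(b)
    # prev[j] is the distance between the first i-1 spikes of a and the first j of b.
    prev = [float(j) for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [float(i)] + [0.0] * m
        for j in range(1, m + 1):
            curr[j] = min(
                prev[j] + 1.0,                               # delete a[i-1]
                curr[j - 1] + 1.0,                           # insert b[j-1]
                prev[j - 1] + q * abs(a[i - 1] - b[j - 1]),  # shift a[i-1] onto b[j-1]
            )
        prev = curr
    return prev[m]


# Two spikes that are close in time are cheaper to shift than to delete and re-insert.
d_near = victor_purpura_distance([0.10], [0.20], q=1.0)
# Two spikes that are far apart (relative to 1/q) are deleted and re-inserted instead.
d_far = victor_purpura_distance([0.10], [0.90], q=10.0)
```

With a small `q` the distance mostly compares spike counts, while a large `q` makes it sensitive to precise spike timing, so the choice of `q` sets the time scale at which two spike trains are considered different.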

Time-dependent information is represented in the nervous system via sequences of stereotyped spike waveforms. Precise characterization and analysis of these waveform sequences (or spike trains) is one of the central problems in neural coding. Because spike trains are random samples from experimental recordings, probabilistic and statistical methods have been investigated extensively to examine the underlying firing patterns. The current approach to statistical analysis of neural data is summarized in Fig. 1A (Kass et al., 2005), where a parametric model (e.g., a Poisson process or, more generally, a point process) is typically a prerequisite for formal inference procedures. This analysis paradigm is now a standard procedure for drawing scientific conclusions from experimental results, and many important and useful tools have been built for appropriate modeling and efficient inference.

However, these analysis methods focus only on parametric representations at each given time and are therefore limited for data-driven problems posed directly in the space of spike trains. If we view each spike train in a sample, nonparametrically, as a single point in an infinite-dimensional function space of spike trains, we may naturally ask questions like “What is the central tendency of the sample?” and “What is the variability of the sample?” Basic as they are, these questions cannot be answered by a commonly used spike train probability model, which characterizes variability only at each specific time. What is needed instead is an overall measure of spike train variability across the entire time domain. We propose a principled, metric-based framework to address this issue. The new procedure is illustrated in Fig. 1B. In contrast to the informal exploratory analysis that precedes a parametric model in the classical case, we propose to compute metric-based summary statistics in the spike train space and then use the results to build a (parametric or nonparametric) probability model in the function space. Fig. 2A illustrates the neural activity of one motor cortical neuron under 4 different motor behaviors (6 spike trains in each case). The typical pattern for each behavior is characterized by a mean spike train in Fig. 2B. Once the model is built, one can perform classical statistical inference in the spike train space, such as confidence intervals, hypothesis tests, ANOVA, regression, bootstrap, and PCA. Moreover, inference can also be performed directly from the summary statistics (dashed line in Fig. 1B). For example, a sample mean, which characterizes the typical firing pattern of a neuron, can be used in classification problems to distinguish the origin of a spike train or to decode the associated stimulus/behavior.
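The idea of using metric-based summary statistics directly for classification can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's method: the sample medoid (the observed spike train that minimizes the sum of squared distances to the rest of the sample) stands in for the mean spike train, the mean squared distance to that center stands in for a scalar variability measure, and `count_distance` is a simple placeholder metric (the L1 distance between cumulative spike counts, approximated on a grid). All function names, behavior labels, and spike times are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

SpikeTrain = Sequence[float]
Distance = Callable[[SpikeTrain, SpikeTrain], float]


def count_distance(a: SpikeTrain, b: SpikeTrain, t_end: float = 1.0, steps: int = 1000) -> float:
    """Placeholder metric (assumption): integral over [0, t_end] of the absolute
    difference between cumulative spike counts, by the midpoint rule."""
    dt = t_end / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        n_a = sum(1 for s in a if s <= t)
        n_b = sum(1 for s in b if s <= t)
        total += abs(n_a - n_b) * dt
    return total


def medoid(trains: List[SpikeTrain], dist: Distance) -> SpikeTrain:
    """Sample member with the smallest sum of squared distances to the sample."""
    return min(trains, key=lambda c: sum(dist(c, t) ** 2 for t in trains))


def dispersion(trains: List[SpikeTrain], center: SpikeTrain, dist: Distance) -> float:
    """Mean squared distance to the center: a scalar analogue of variance."""
    return sum(dist(center, t) ** 2 for t in trains) / len(trains)


def classify(train: SpikeTrain, prototypes: Dict[str, SpikeTrain], dist: Distance) -> str:
    """Assign the label whose prototype spike train is nearest under dist."""
    return min(prototypes, key=lambda label: dist(train, prototypes[label]))


# Hypothetical recordings (spike times in seconds) for two made-up behaviors.
samples = {
    "behavior_A": [[0.10, 0.20, 0.30], [0.12, 0.21, 0.33], [0.09, 0.22, 0.29]],
    "behavior_B": [[0.60, 0.70, 0.80], [0.62, 0.69, 0.83], [0.58, 0.72, 0.79]],
}
prototypes = {label: medoid(trains, count_distance) for label, trains in samples.items()}
spread = {label: dispersion(samples[label], prototypes[label], count_distance) for label in samples}
predicted = classify([0.11, 0.19, 0.31], prototypes, count_distance)
```

Because every step depends on the spike trains only through the distance function, any other spike train metric can be passed in place of `count_distance` without changing the rest of the code.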