Team ID: 5

Name (tasks): Oscar Alvardo

Name (tasks): Karshima Johnson (Spike Trains and Spike Train Analysis, Statistics Regarding Neurons, Population and Study Design, Correlation Matrices for entire time frame, Visuals of Krebs, Data Transformation)

Name (tasks): Christina Thai (Introduction, Background, Summary of Data, Visual of Probe Locations in Brain, Time Series Visual of Average Number of Neurons/Region, Kendall Tau’s correlation)

0.1 Introduction

This document is the final report for the Spring 2019 course STA 141A at the University of California, Davis. It examines whether activity in one brain region affects the activity in another region.

Cortical responses to sensory stimuli are highly variable. To reduce this variability, we analyze the neural spikes in multiple brain regions during spontaneous activity, which should yield more accurate, less noisy results.

Using the data gathered in the experiment by Nick Steinmetz, Marius Pachitariu, Carsen Stringer, Matteo Carandini, and Kenneth Harris, we seek to analyze which areas of the brain are used in conjunction, regardless of stimuli. The original experiment investigates fluctuations in arousal levels as indicated by running, pupil area, and whisking. Our variable of interest is the cortical response while these behaviors are ongoing.

0.2 Background

In this project, we analyze the reduced data set Reduced_Krebs.mat, which contains Neuropixels recordings of a mouse named Krebs during spontaneous activity. The full data set can be found at this webpage.

The data used has already been preprocessed using a MATLAB script, which can be found in the code appendix of the report.

1 Summary of Data

This report focuses on data collected from an experiment done with a mouse named Krebs. Neural activation was recorded together with whisking behavior, which was measured by the summed videographic motion energy within the whisker region.

This data set contains spike trains of neurons in nine regions in the mouse brain.

Each point in the plots below represents a neuron firing within its brain region at the given time frame. A spike corresponds to all of the neural firings in a time frame. Each time frame index corresponds to 30 ms. In total, we have 39053 time frames, corresponding to 1171.59 s of recorded data for Krebs. Sequences of spikes form a spike train, and the spike trains for the duration of the recording period are visualized as follows.

Within the spike trains, we notice that certain brain regions (e.g. Region 4) have missing spikes, which we believe could be the result of a technical fault or a recording error with the sensory probes.
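The conversion between time frame indices and seconds quoted above can be checked with a minimal Python sketch (not the report's own R code). The names FRAME_MS, N_FRAMES, and frame_to_seconds are hypothetical; the two constants are simply the figures stated in the text.

```python
# Hypothetical constants, taken from the numbers quoted in the text.
FRAME_MS = 30      # each time frame index covers 30 ms
N_FRAMES = 39053   # total number of time frames recorded for Krebs


def frame_to_seconds(frame_index: int) -> float:
    """Elapsed time in seconds at the end of a 1-based time frame index."""
    # Multiply in integer milliseconds first, then convert to seconds once.
    return frame_index * FRAME_MS / 1000


total_seconds = frame_to_seconds(N_FRAMES)
print(total_seconds)  # 1171.59
```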

The following represent elements of the behavior data:

avgframe Average Frame of Recording
faceSVD Singular Value Decomposition of Compressed Image of Face
motionMask Corresponding Masks to Singular Values

The referenced data set, Reduced_Krebs.mat, contains the first 50 singular values and their masks. These are used to reproduce an image based on the average of the frames within the data set, which were captured using an infrared camera.

The following columns represent the nine regions for Krebs in the mouse brain:

stall.CP Caudate Putamen
stall.FrMoCtx Frontal Motor Cortex
stall.HPF Hippocampal Formation
stall.LS Lateral Septum
stall.MB Midbrain
stall.SC Superior Colliculus
stall.SomMoCtxx Somatomotor Cortex
stall.TH Thalamus
stall.V1 Primary Visual Cortex

Each entry listed above corresponds to a specific region of the brain; for example, stall.CP holds the activity of the neurons located in the Caudate Putamen.

In more detail, the data for each region is a matrix in which each row represents a neuron within the region. A value of 0 denotes no neural activity at the specific time frame, and a positive value denotes neural activity. The values are mostly 0 or 1, but, as Section 3.1 notes, a neuron can fire more than once within a 30 ms time frame, so the matrix is not strictly binary.

1.1 Visual of Brain Regions

Probe Locations of Mouse Brain

The figure above shows the reconstructed probe locations of the recordings in Krebs. The poles are generated using the borders of the brain regions and map the depth of each region in microns relative to the position of the principal axes of the body (AP/DV/LR).

Note: AP refers to the antero-posterior axis, DV to the dorso-ventral axis, and LR to the left-right axis.

1.2 Statistical questions of interest

We wish to examine which brain regions are activated relatively simultaneously. For example, we seek to examine whether brain activity in the Thalamus also implies activity in the Primary Visual Cortex.

Additionally, we want to examine neural activity throughout the entire time period and locate where there may be significant changes in neural activity.

2 Analysis Plan

2.1 Population and study design

The initial study monitored large populations of neurons in awake, head-fixed mice. These mice were not engaged in any behavioral task, and their spontaneous behaviors (such as whisking, sniffing, and other facial movements) were monitored videographically. The study recorded simultaneous neurological data from six mice over nine sessions and worked to identify the relationship between these behaviors and neural activity.

Further, the recordings were made in various settings, such as in darkness or with a gray screen. The authors observed no differences between these recordings; we take this to imply that Krebs’ sample data is representative of the mouse’s overall behavior in this experiment.

In addition, we acknowledge that mouse models for genetic research prove to be excellent tools for insight into the immune, endocrine, cardiovascular, skeletal, and, in particular, nervous systems shared by mammals. Thus, conclusions obtained through the study may be a gateway to understanding the human brain.

2.2 Data Transformation

In order to compare neural activity between brain regions (e.g. comparing the Primary Visual Cortex to the Midbrain), we consider the ratio of the number of neurons firing in a given time frame (30 ms) to the total number of neurons in the brain region. We also refer to this proportion as the amount of neural activity in a region at a given time, or the neural firing rate.

Neurons Per Brain Region
                    CP   FrMoCtx  HPF  LS   MB   SC  SoMoCtxx  TH   V1
Neurons per Region  176  78       265  122  127  42  91        227  334
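The transformation described in this section can be sketched in Python as follows. This is a minimal illustration rather than the report's own R code: the function name firing_proportion and the toy matrix are hypothetical, and the sketch assumes that rows are neurons and columns are time frames.

```python
from typing import List


def firing_proportion(region: List[List[int]]) -> List[float]:
    """Per time frame, divide the firings in a region by its neuron count.

    region[i][t] is the activity of neuron i in time frame t
    (assumed layout: rows are neurons, columns are time frames).
    """
    n_neurons = len(region)
    n_frames = len(region[0])
    return [
        sum(row[t] for row in region) / n_neurons
        for t in range(n_frames)
    ]


# Made-up toy region with 4 neurons (rows) and 5 time frames (columns).
toy_region = [
    [0, 1, 0, 0, 1],
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1],
]
proportions = firing_proportion(toy_region)
print(proportions)  # [0.25, 0.75, 0.0, 0.0, 0.75]
```

Dividing by the neuron count puts regions of very different sizes (for example 42 neurons in SC versus 334 in V1) on a common scale.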

2.3 Kendall’s Tau Correlation

We examine the pairwise correlations with Kendall’s rank-based method, since it does not require any assumptions on the distribution of the data. This non-parametric correlation method compares the number of concordant and discordant pairs.

Assume there are pairs \((X_i, Y_i)\) and \((X_j, Y_j)\) where \(i < j\). If \(X_i < X_j\) and \(Y_i < Y_j\) (or \(X_i > X_j\) and \(Y_i > Y_j\)), then \((X_i - X_j) \times (Y_i - Y_j) > 0\). Thus \((X_i, Y_i)\), \((X_j, Y_j)\) is a concordant pair: the larger value of X goes with the larger value of Y. Similar logic applies to discordant pairs, which satisfy \((X_i - X_j) \times (Y_i - Y_j) < 0\). Tied pairs satisfy \((X_i - X_j) \times (Y_i - Y_j) = 0\).

Kendall’s \(\tau = \frac{C - D}{n(n-1)/2}\), where C is the number of concordant pairs, D is the number of discordant pairs, and n is the number of observations, so that \(n(n-1)/2\) is the total number of pairs. If most pairs are concordant, then there is a positive monotonic relationship. Similarly, if most pairs are discordant, then the data suggest a negative monotonic relationship. If there are an equal number of discordant and concordant pairs, this suggests no monotonic relationship, which is consistent with, but does not by itself establish, X and Y being independent.
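The definition above can be turned into a short Python sketch. This is a direct, quadratic-time transcription of the formula as stated (concordant minus discordant pairs over the total number of pairs), not the routine used to produce the report's results; the function name kendall_tau is hypothetical, and library implementations may additionally adjust the denominator for ties.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau as (C - D) / (n(n-1)/2), following the text's formula."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = 0
    discordant = 0
    # Visit every unordered pair of observations exactly once.
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        product = (xi - xj) * (yi - yj)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 is a tied pair and adds to neither count.
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Toy data: 8 concordant and 2 discordant pairs out of 10.
tau = kendall_tau([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(tau)  # 0.6
```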

3 Results

3.1 Descriptive Analysis

Within our data, we observe the frequencies of neural firings from each brain region.

To highlight features of each brain region, we report the maximum, minimum, average, and variance of the proportion of fired neurons relative to the total number of neurons per region. We do not count zero as a minimum for neural activity because it represents a lack of activity, which is outside our interest.

Statistics Regarding Proportion of Firing Neurons
                         CP        FrMoCtx   HPF       LS        MB        SC        SoMoCtxx  TH        V1
Maximum Firing Rate      0.664773  0.820513  0.943396  0.901639  1.559055  1.285714  0.967033  1.044053  0.904192
Minimum Firing Rate      0.005682  0.012821  0.003774  0.008197  0.007874  0.023810  0.010989  0.004405  0.002994
Average Firing Rate      0.057040  0.134761  0.286161  0.118649  0.413877  0.275907  0.168206  0.262155  0.142400
Variance of Firing Rate  0.002815  0.011700  0.023139  0.004305  0.028875  0.037070  0.013314  0.012252  0.011586
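Summary statistics of this kind can be computed along the following lines. This is a minimal Python sketch on made-up values, not the report's R code: summarize_rates is a hypothetical helper, the minimum skips zeros as described above, and whether the report's average and variance include the zero frames is not stated, so including them here is an assumption.

```python
from statistics import mean, variance
from typing import Dict, List, Optional


def summarize_rates(rates: List[float]) -> Dict[str, Optional[float]]:
    """Maximum, nonzero minimum, mean, and sample variance of firing rates."""
    # Zero means no activity, so it is excluded from the minimum only.
    nonzero = [r for r in rates if r > 0]
    return {
        "max": max(rates),
        "min_nonzero": min(nonzero, default=None),
        "mean": mean(rates),          # assumption: zero frames are included
        "variance": variance(rates),  # sample variance (n - 1 denominator)
    }


# Made-up per-frame firing proportions for one region.
toy_rates = [0.25, 0.75, 0.0, 0.0, 0.75]
summary = summarize_rates(toy_rates)
print(summary)
```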

It is notable that some recorded firing rates are greater than one. This can be attributed to the fact that one time frame is equivalent to 30 ms: neurons in mice can fire multiple times within 30 ms, so the number of firings counted in one time frame can exceed the total number of neurons within the region. We note that the Midbrain has the highest maximum and average firing rate, which suggests that the Midbrain has the most neural activity. The Superior Colliculus has the highest minimum firing rate.

3.2 Interactive Plots of Brain Region Activity

Recall that the brain regions are:

stall.CP Caudate Putamen
stall.FrMoCtx Frontal Motor Cortex
stall.HPF Hippocampal Formation
stall.LS Lateral Septum
stall.MB Midbrain
stall.SC Superior Colliculus
stall.SomMoCtxx Somatomotor Cortex
stall.TH Thalamus
stall.V1 Primary Visual Cortex

The following interactive plot overlays the spike trains of each brain region across the entire recording period.

Viewing them as overlapping spikes allows us to see which regions, if any, have similar “spikes” or patterns. We can compare the spikes in two ways:
1. Whether regions have a similar density of neurons firing, measured as the proportion of neurons in the area that fire.
2. Whether there are time periods with a similar spike trend/pattern in the proportion of neurons firing.

3.3 Reducing Variability in Each Region

Since the plot of the proportion of neurons firing at each 30 ms time frame is extremely volatile, we instead analyze the average proportion of neurons firing over a new index running from 1 through 1302, where each new “time” is a window of 30 consecutive time frames (39053 / 30 ≈ 1302).
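The windowed averaging can be sketched in Python as follows. This is an illustration rather than the report's own R code: window_means is a hypothetical helper, and the window length of 30 time frames is an assumption inferred from 39053 time frames being reduced to 1302 windows.

```python
from typing import List, Sequence


def window_means(values: Sequence[float], window: int = 30) -> List[float]:
    """Average consecutive, non-overlapping blocks of `window` time frames.

    The final block may be shorter when len(values) is not a multiple
    of `window`; it is averaged over the frames it actually contains.
    """
    means = []
    for start in range(0, len(values), window):
        block = values[start:start + window]
        means.append(sum(block) / len(block))
    return means


# Small example with a window of 3 frames.
print(window_means([1, 2, 3, 4, 5, 6, 7], window=3))  # [2.0, 5.0, 7.0]

# With 39053 frames and 30 frames per window we get 1302 windows.
n_windows = len(window_means([0.0] * 39053, window=30))
print(n_windows)  # 1302
```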

As expected, we found that the basic trends of the previous (and volatile) plot remain the same. Thus, we can identify any time periods where there are significant changes in brain activity (stationary or otherwise). See the section Patterns of Brain Region Activity over Entire Time Period for more details.

The above plot shows the variance within each time window for each brain region. Although there seems to be slight

3.3.1 Patterns of Brain Region Activity over Entire Time Period

Stationary        Spike
1 - 10,000        10,001 - 12,000
12,001 - 19,500   20,000 - 29,000
end

3.4 Correlation Analysis

We choose to examine correlation using the Kendall method because it does not require the assumption of a linear relationship. Another reason we chose Kendall’s Tau is that ranking the data points mitigates the effects of outliers.

3.5 Pearson’s Correlation


Here, Pearson’s correlation

3.6 Kendall’s Correlation Matrix

Kendall’s correlation values are much smaller than Pearson’s, which we believe is due to the mitigation of outliers.
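How a single outlier can affect the two coefficients differently is illustrated by the following Python sketch. It uses made-up numbers, not the Krebs data, and the helper names pearson and kendall_tau are hypothetical; it shows one mechanism that is consistent with the explanation above and does not establish that outliers are the cause in our data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def kendall_tau(x, y):
    """Kendall's tau as (C - D) / (n(n-1)/2); ties count toward neither."""
    pairs = list(zip(x, y))
    score = 0
    for (xi, yi), (xj, yj) in combinations(pairs, 2):
        product = (xi - xj) * (yi - yj)
        score += (product > 0) - (product < 0)
    n = len(pairs)
    return score / (n * (n - 1) / 2)


# Made-up data: a loosely increasing pattern, then the same data plus
# one extreme observation at (20, 30).
x_clean, y_clean = [1, 2, 3, 4, 5, 6], [3, 1, 4, 2, 6, 5]
x_out, y_out = x_clean + [20], y_clean + [30]

pearson_shift = pearson(x_out, y_out) - pearson(x_clean, y_clean)
kendall_shift = kendall_tau(x_out, y_out) - kendall_tau(x_clean, y_clean)
print(pearson(x_clean, y_clean), pearson(x_out, y_out))
print(kendall_tau(x_clean, y_clean), kendall_tau(x_out, y_out))
```

In this toy example the single extreme point moves Pearson's coefficient by about twice as much as Kendall's, because Kendall's method only uses the ordering of the values and not their magnitudes.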

4 Discussion

This project shed light on how to better visualize spike trains. Although our analysis was relatively elementary, visualizing spike trains is a first step that can supplement more comprehensive studies. We hope this visual tool can be replicated on other Neuropixels datasets as well.

Because analyzing the whole data set from the original experiment requires massive computational power, we were at a disadvantage and restricted our analysis to the reduced data set for a single mouse.

5 Session information

## R version 3.6.0 (2019-04-26)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Mojave 10.14
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] R.matlab_3.6.2     dygraphs_1.1.1.6   dplyr_0.8.0.1     
##  [4] qlcMatrix_0.9.7    sparsesvd_0.1-4    slam_0.1-45       
##  [7] Matrix_1.2-17      matrixStats_0.54.0 knitr_1.22        
## [10] corrplot_0.84      ggcorrplot_0.1.3   ggplot2_3.1.1     
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.1        highr_0.8         pillar_1.3.1     
##  [4] compiler_3.6.0    plyr_1.8.4        xts_0.11-2       
##  [7] R.utils_2.8.0     R.methodsS3_1.7.1 tools_3.6.0      
## [10] docopt_0.6.1      digest_0.6.18     jsonlite_1.6     
## [13] evaluate_0.13     tibble_2.1.1      gtable_0.3.0     
## [16] lattice_0.20-38   pkgconfig_2.0.2   rlang_0.3.4      
## [19] yaml_2.2.0        xfun_0.6          withr_2.1.2      
## [22] stringr_1.4.0     htmlwidgets_1.3   grid_3.6.0       
## [25] tidyselect_0.2.5  glue_1.3.1        R6_2.4.0         
## [28] rmarkdown_1.12    purrr_0.3.2       magrittr_1.5     
## [31] scales_1.0.0      htmltools_0.3.6   assertthat_0.2.1 
## [34] colorspace_1.4-1  stringi_1.4.3     lazyeval_0.2.2   
## [37] munsell_0.5.0     crayon_1.3.4      R.oo_1.22.0      
## [40] zoo_1.8-5