This report analyzes neuron shape statistics for Q140 versus WT genotypes in six mouse brains: FMQC7-5, MQC15-1, MQC18-3, MQC6-2, MQC6-3, and MQC9-3. The shape statistics were computed with L-Measure.

1. Data Exploration

1.1 Frequency distribution of Genotype

There are 552 neurons in total.

## 
## Q140   WT 
##  267  285

1.2 Frequency distribution of mice

## 
## FMQC7-5 MQC15-1 MQC18-3  MQC6-2  MQC6-3  MQC9-3 
##      37      40     173      57     211      34


1.3 Distribution of Numeric Variables

Most variables are approximately normally distributed. The exceptions are Diameter, Volume, PathDistance, Contraction, and Fractal Dim, which look skewed and may contain outliers.


1.4 Shape statistics correlation heatmap, clustering

The heatmap shows the robust correlations between shape statistics. Darker red indicates a stronger positive correlation, and darker blue indicates a stronger negative correlation. Highly correlated shape statistics are grouped into the same leaves of the clustering tree on the left. For example, Euc Distance and Path Distance form one group, and N bifs, N branch, and N tips form another.
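The grouping of correlated variables can be sketched as follows. This is a minimal illustration on synthetic data, not the code behind the heatmap: it assumes plain Pearson correlation (the report uses a robust correlation whose exact form is not stated), a distance of 1 - r, average linkage, and a cut height of 0.5. The column names `a1`, `a2`, `b1`, `b2` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(1)
n = 200

# Synthetic stand-ins: two pairs of variables, each pair driven by one shared signal.
base_a = rng.normal(size=n)
base_b = rng.normal(size=n)
names = ["a1", "a2", "b1", "b2"]  # hypothetical variable names
X = np.column_stack([
    base_a + 0.1 * rng.normal(size=n),
    base_a + 0.1 * rng.normal(size=n),
    base_b + 0.1 * rng.normal(size=n),
    base_b + 0.1 * rng.normal(size=n),
])

# Correlation between columns (variables), turned into a dissimilarity.
corr = np.corrcoef(X, rowvar=False)
dist = 1.0 - corr
np.fill_diagonal(dist, 0.0)

# Average-linkage hierarchical clustering on the condensed distance matrix.
tree = linkage(squareform(dist, checks=False), method="average")

# Cut the tree at an assumed height of 0.5 to obtain variable clusters.
labels = fcluster(tree, t=0.5, criterion="distance")
for name, label in zip(names, labels):
    print(name, label)
```

With 1 - r as the distance, strongly negatively correlated variables end up far apart; using 1 - |r| instead would group them together.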

The following clustering graph further shows that some variables should be collapsed to reduce redundancy. We combine highly correlated shape statistics into clusters and represent each cluster by an eigen-statistic. In the following analysis, shape statistics are standardized to mean = 0 and variance = 1 so that they share a common scale without distorting the differences in the ranges of their values.
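As an illustration of the two steps above, the sketch below standardizes a cluster of variables and summarizes it by one score per neuron. The report does not define the eigen-statistic, so this assumes it is the first principal component of the standardized cluster (by analogy with a module eigengene). The function names and the toy columns are hypothetical.

```python
import numpy as np

def standardize(X):
    """Scale each column to mean 0 and variance 1 (sample variance, ddof=1)."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def eigen_statistic(X_cluster):
    """Assumed definition: first principal component score of one variable cluster."""
    Z = standardize(X_cluster)
    # SVD of the standardized data; U[:, 0] * S[0] is the first PC score per row.
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    score = U[:, 0] * S[0]
    # The sign of a principal component is arbitrary; orient it so that it
    # correlates positively with the average of the standardized variables.
    if np.corrcoef(score, Z.mean(axis=1))[0, 1] < 0:
        score = -score
    return score

# Toy cluster of three perfectly collinear columns (synthetic, not the report's data).
rng = np.random.default_rng(0)
bifs = rng.poisson(10, size=50).astype(float)
toy = np.column_stack([bifs, 2 * bifs + 1, bifs + 1])

Z = standardize(toy)
es = eigen_statistic(toy)
print(Z.mean(axis=0).round(6), Z.var(axis=0, ddof=1).round(6))
print(es[:5])
```

Because the toy columns are exact linear functions of one another, the single score carries all of their variation; for real clusters it carries only the shared part.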


1.5 Neuron Clustering

Cells are clustered and annotated with Q140/WT genotype, mouse ID, and brain section. Genotype, mouse ID, and brain section are similarly distributed across the clusters.


2. Data Visualization

2.1 PCA of eigen-statistics of cell shape statistics by Q140/WT, mouse ID and brain section

Point shapes represent the different clusters; point colors indicate Q140/WT, mouse ID, and brain section.

2.2 t-SNE of eigen-statistics of cell shape statistics by Q140/WT, mouse ID and brain section

Point shapes represent the different clusters; point colors indicate Q140/WT, mouse ID, and brain section.

2.3 Violin plots and Kruskal-Wallis test of shape statistics

2.3.1 By brain section and mouse ID

A violin plot shows the full distribution of the data in addition to the summary statistics. A Kruskal-Wallis rank sum test with p-value < 0.05 indicates a statistically significant difference in the shape statistic (in ranks) across groups. Some shape statistics differ significantly not only across brain sections but also across mouse IDs, for example N bifs, N branch, Depth, and Pk classic.
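For readers unfamiliar with the test, the sketch below computes the Kruskal-Wallis H statistic and its chi-square p-value using only the Python standard library. It is an illustration of the standard formula (with tie correction), not the code used to produce the results in this report; the helper names and the toy values are hypothetical.

```python
import math
from collections import Counter

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def chi2_sf(x, df):
    """Upper tail probability of a chi-square distribution with integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        q, a = math.exp(-z), 1.0
    else:
        q, a = math.erfc(math.sqrt(z)), 0.5
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1.0
    return q

def kruskal_wallis(groups):
    """Return (H, p) for a list of groups; assumes the values are not all identical."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    h = max(h / (1.0 - ties / (n ** 3 - n)), 0.0)
    return h, chi2_sf(h, len(groups) - 1)

# Toy values for two groups (not taken from the report's data).
h, p = kruskal_wallis([[1, 2, 3], [4, 5, 6]])
print(h, p)
```

With only two groups, as in the genotype comparison, the test has one degree of freedom and is closely related to the Wilcoxon rank sum test.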

2.3.2 By genotype

Some shape statistics differ significantly between Q140 and WT: N bifs (p: 0.0064), N branch (p: 0.0066), N tips (p: 0.0073), Width (p: 4.9e-06), Height (p: 6.5e-06), Depth (p: 0.00011), EucDistance (p: 9.4e-09), PathDistance (p: 3.8e-10), Contraction (p: 0.0058), Pk classic (p: 8.9e-06), and Bif ampl local (p: 9.7e-08). Their figures are shown below.

2.3.3 By genotype and brain section

Some shape statistics differ significantly between Q140 and WT within individual brain sections:

  • In brain section 1, Width (p: 0.0018), Height (p: 0.00044), Length (p: 0.011), EucDistance (p: 0.00044), PathDistance (p: 0.00038), Fragmentation (p: 0.0088)
  • In brain section 2, no statistically significant differences
  • In brain section 3, Width (p: 0.0068), Depth (p: 0.022), Volume (p: 0.018), PathDistance (p: 0.016)
  • In brain section 4, Width (p: 0.0047), Depth (p: 0.027), Diameter (p: 0.00066), Surface (p: 0.0097), Volume (p: 0.0037), EucDistance (p: 0.0037), PathDistance (p: 0.0044), Pk classic (p: 0.012), Bif ampl local (p: 1.5e-05)
  • In brain section 5, Width (p: 0.011), N branch (p: 0.01), N tips (p: 0.0096), Depth (p: 0.0029), Contraction (p: 0.02), Pk classic (p: 0.0017), Bif ampl local (p: 0.011)
  • In brain section 6, Pk classic (p: 6.5e-05), Bif ampl local (p: 2.2e-07)


3. Classification of Q140/WT

We use Random Forest and Random GLM to classify Q140/WT from cell shape statistics.


The preliminary results indicate further hyperparameter tuning may improve the classification accuracy of Q140/WT.
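One way such tuning could be set up is sketched below with scikit-learn's RandomForestClassifier and GridSearchCV. This is an assumption-laden illustration rather than the pipeline used in this report: the data are synthetic (generated by `make_classification`), the parameter grid is arbitrary, and Random GLM is not covered.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for a (neurons x shape statistics) matrix with a binary label.
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=4, random_state=0
)

# Small, arbitrary grid; a real search would cover more values.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 5],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,                # 3-fold cross-validation
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Because neurons from the same mouse are not independent, a grouped split (for example scikit-learn's GroupKFold with mouse ID as the group) may give a more honest accuracy estimate than the plain folds used here.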