# Read in the transcript quantification table (tab-separated, with a header row)
myGex <- read.table("ENCFF166SFX.tsv", header = TRUE, sep = "\t")
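
# Sanity check before plotting (a minimal sketch, not part of the original analysis).
# The plots in this script assume numeric columns named "TPM" and "length";
# check_columns() is a hypothetical helper defined here, not a function from any package.
check_columns <- function(df, required = c("TPM", "length")) {
  # Report any expected column that is absent from the table
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing expected column(s): ", paste(missing_cols, collapse = ", "))
  }
  # Report any expected column that was not parsed as numeric
  not_numeric <- required[!vapply(df[required], is.numeric, logical(1))]
  if (length(not_numeric) > 0) {
    stop("Non-numeric column(s): ", paste(not_numeric, collapse = ", "))
  }
  invisible(TRUE)
}
# The exists() guard only lets this block run on its own; in this script myGex is already loaded.
if (exists("myGex")) check_columns(myGex)
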
# Histogram of transcript expression.
# Most transcripts are expressed at low levels, which is typical of RNA-seq data, while a few are highly expressed.
# TPM is plotted as log(TPM + 1) because expression spans a very wide range; adding 1 keeps zero-TPM transcripts on the scale.
library(ggplot2)
ggplot(myGex, aes(x = log1p(TPM))) +
  geom_histogram(bins = 50, fill = "red") +
  labs(
    title = "Distribution of Transcript Expression",
    x = "log(TPM + 1)",
    y = "Number of Transcripts"
  )
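
# To put numbers on "most transcripts are expressed at low levels", TPM can be summarised directly.
# A minimal sketch: summarise_expression() is a hypothetical helper, and the 1 TPM cutoff
# is an assumed threshold for "low expression", not a value taken from this analysis.
summarise_expression <- function(tpm, cutoff = 1) {
  c(
    frac_zero = mean(tpm == 0, na.rm = TRUE),             # share of transcripts with no expression
    frac_below_cutoff = mean(tpm < cutoff, na.rm = TRUE), # share below the assumed cutoff
    median_tpm = median(tpm, na.rm = TRUE),
    max_tpm = max(tpm, na.rm = TRUE)
  )
}
# The exists() guard only lets this block run on its own.
if (exists("myGex")) print(summarise_expression(myGex$TPM))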

# Here the majority of transcripts are expressed at low levels, which is common for neural tissue.
# Some functionally important transcripts may also be under-expressed because of the progression of Alzheimer's disease,
# although a single sample without a control cannot confirm this.

# Distribution of transcript lengths in the dataset.
# This shows whether the data consist mostly of short, medium, or long transcripts, which is useful context for later analysis.
ggplot(myGex, aes(x = length)) +
  geom_histogram(binwidth = 100, fill = "red", color = "black") +
  labs(
    title = "Distribution of Transcript Lengths",
    x = "Transcript Length (bp)",
    y = "Number of Transcripts"
  )
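
# The histogram can be backed up with length quantiles and the share of short transcripts.
# A minimal sketch: summarise_lengths() is a hypothetical helper, and the 1000 bp cutoff
# for "short" is an assumption chosen for illustration, not a value from this analysis.
summarise_lengths <- function(len, short_cutoff = 1000) {
  list(
    quantiles = quantile(len, probs = c(0.25, 0.5, 0.75), na.rm = TRUE),
    frac_short = mean(len < short_cutoff, na.rm = TRUE) # share below the assumed cutoff
  )
}
# The exists() guard only lets this block run on its own.
if (exists("myGex")) print(summarise_lengths(myGex$length))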

# The transcripts here are mostly short. Short transcripts may correspond to different gene functions.
# Excessively short transcripts could signal sample quality issues.
# Another possible explanation, which this single sample cannot confirm, is tissue damage from advancing Alzheimer's disease.

# Scatterplot of the same data: this checks whether very long transcripts tend to have higher or lower expression.
ggplot(myGex, aes(x = length, y = TPM)) +
  geom_point(alpha = 0.2, color = "black") +
  labs(
    title = "Transcript Length vs Expression (TPM)",
    x = "Transcript Length (bp)",
    y = "TPM"
  )
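
# The visual impression from the scatterplot can be checked with a rank correlation,
# which is less sensitive to the few very highly expressed transcripts than Pearson's r.
# A minimal sketch: length_expression_cor() is a hypothetical helper, not part of the original analysis.
length_expression_cor <- function(len, tpm) {
  # Spearman's rho; rows with a missing value in either vector are dropped
  cor(len, tpm, method = "spearman", use = "complete.obs")
}
# The exists() guard only lets this block run on its own.
if (exists("myGex")) print(length_expression_cor(myGex$length, myGex$TPM))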

# This plot indicates that most transcripts are short and lowly expressed, which suggests the sample did not capture many long transcripts.
# The sample may have degraded during processing; alternatively, this could be an effect of Alzheimer's disease.
# If so, the patient's disease may be at an advanced stage, with serious damage to brain tissue and likely to its function.