Reading data on people with Schizophrenia/Schizoaffective Disorder
# Load required library
library(readr)
# Load the dataset from the URL without specifying col_types
url <- "https://vincentarelbundock.github.io/Rdatasets/csv/heplots/NeuroCog.csv"
neuro_cog <- read_csv(url)
## New names:
## • `` -> `...1`
## Rows: 242 Columns: 11
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Dx, Sex
## dbl (9): ...1, Speed, Attention, Memory, Verbal, Visual, ProbSolv, SocialCog...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
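The messages above come from two things: readr guessing the column types, and the CSV having an unnamed first column, which readr repairs to `...1`. The following is a minimal sketch of one way to quiet the type message and give that column a readable name. It assumes readr 2.0 or later (for `I()` literal input), and both the inline CSV text and the name `row_id` are illustrative stand-ins, not part of the original analysis.

```r
library(readr)

# Hypothetical three-row stand-in for NeuroCog.csv, with the same
# unnamed first column that triggers the "New names" message.
csv_text <- ",Dx,Memory,Age\n14,Control,44,30\n15,Schizophrenia,38,41\n16,Schizoaffective,47,52\n"

# I() marks the string as literal data rather than a file path.
# show_col_types = FALSE silences the column-specification message.
toy <- read_csv(I(csv_text), show_col_types = FALSE)

# Give the repaired index column a descriptive (assumed) name.
names(toy)[names(toy) == "...1"] <- "row_id"

names(toy)
```

The same two steps should carry over to the real URL, since `read_csv()` accepts `show_col_types` regardless of where the data comes from.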
# Use summary() to get an overview of the data set
summary(neuro_cog)
##       ...1              Dx                Speed         Attention
##  Min.   : 14.00   Length:242         Min.   :-1.00   Min.   : 1.00
##  1st Qu.: 81.25   Class :character   1st Qu.:33.00   1st Qu.:32.00
##  Median :142.50   Mode  :character   Median :41.00   Median :41.00
##  Mean   :140.42                      Mean   :40.31   Mean   :39.75
##  3rd Qu.:202.75                      3rd Qu.:48.75   3rd Qu.:50.00
##  Max.   :263.00                      Max.   :74.00   Max.   :67.00
##      Memory          Verbal          Visual         ProbSolv
##  Min.   : 3.00   Min.   :20.00   Min.   :12.00   Min.   :29.00
##  1st Qu.:36.00   1st Qu.:33.00   1st Qu.:29.00   1st Qu.:39.00
##  Median :44.00   Median :40.00   Median :38.00   Median :45.00
##  Mean   :42.44   Mean   :41.37   Mean   :37.12   Mean   :45.83
##  3rd Qu.:51.00   3rd Qu.:48.00   3rd Qu.:44.00   3rd Qu.:52.00
##  Max.   :71.00   Max.   :78.00   Max.   :65.00   Max.   :65.00
##    SocialCog          Age            Sex
##  Min.   :10.00   Min.   :18.00   Length:242
##  1st Qu.:34.00   1st Qu.:32.00   Class :character
##  Median :44.00   Median :40.00   Mode  :character
##  Mean   :43.93   Mean   :40.89
##  3rd Qu.:53.00   3rd Qu.:50.00
##  Max.   :72.00   Max.   :66.00
# Calculate the mean and median for the "Age" attribute
mean_age <- mean(neuro_cog$Age)
median_age <- median(neuro_cog$Age)
# Calculate the mean and median for the "Memory" attribute
mean_memory <- mean(neuro_cog$Memory)
median_memory <- median(neuro_cog$Memory)
# Display the mean and median for the selected attributes
cat("Mean Age:", mean_age, "\n")
## Mean Age: 40.89256
cat("Median Age:", median_age, "\n")
## Median Age: 40
cat("Mean Memory:", mean_memory, "\n")
## Mean Memory: 42.44215
cat("Median Memory:", median_memory, "\n")
## Median Memory: 44
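The mean and median above are computed one column at a time. When more columns are of interest, the same statistics can be gathered in one pass. This is a minimal sketch on a hypothetical toy data frame whose column names mirror the data set; the values are made up for illustration. `na.rm = TRUE` is included only as a precaution, since the summary output above reports no missing values.

```r
# Hypothetical toy data; column names mirror the data set above.
toy <- data.frame(
  Dx     = c("Control", "Schizophrenia", "Control", "Schizoaffective"),
  Age    = c(30, 41, 52, 25),
  Memory = c(44, 38, 47, 51)
)

# Keep only the numeric columns, then apply both statistics to each.
numeric_cols <- toy[sapply(toy, is.numeric)]
centre <- sapply(numeric_cols, function(x) {
  c(mean = mean(x, na.rm = TRUE), median = median(x, na.rm = TRUE))
})

# One column per variable, one row per statistic.
centre
```

Applied to `neuro_cog`, this would also include the `...1` index column, so in practice that column would likely be dropped first.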
# Columns to keep in the new data frame
selected_columns <- c("Age", "Memory", "Dx")
# Create the new data frame with subset(): keep rows with Memory > 41
# and only the selected columns (arguments named for clarity)
new_data_subset <- subset(neuro_cog, subset = Memory > 41, select = selected_columns)
# Display the first 10 rows of the filtered subset (Step 2)
head(new_data_subset, 10)
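The `subset()` call above filters rows and selects columns in one step. The equivalent bracket-indexing form is shown below as a minimal sketch on a hypothetical toy data frame (values made up). One difference worth knowing: `subset()` drops rows where the condition evaluates to `NA`, whereas bracket indexing returns them as all-`NA` rows.

```r
# Hypothetical toy data; column names mirror the data set above.
toy <- data.frame(
  Dx     = c("Control", "Schizophrenia", "Control", "Schizoaffective"),
  Age    = c(30, 41, 52, 25),
  Memory = c(44, 38, 47, 51),
  Speed  = c(40, 35, 48, 42)
)
keep_cols <- c("Age", "Memory", "Dx")

# Same filter expressed two ways: rows with Memory > 41, three columns.
via_subset  <- subset(toy, subset = Memory > 41, select = keep_cols)
via_bracket <- toy[toy$Memory > 41, keep_cols]

# Check that both approaches return the same rows and columns.
all.equal(via_subset, via_bracket)
```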
# Create new column names
new_column_names <- c("ParticipantAge", "MemoryScore", "GroupType")
# Assign the new column names to the data frame
colnames(new_data_subset) <- new_column_names
# Display the first 5 rows to check the renamed columns (Step 3)
head(new_data_subset, 5)
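Assigning a vector to `colnames()` relies on the new names being listed in the same order as the existing columns. A name-based mapping avoids that dependence. This is a minimal sketch on a hypothetical toy data frame; `rename_map` is an illustrative helper name, not something from the original analysis.

```r
# Hypothetical toy data with the same three columns as the subset.
toy <- data.frame(
  Age    = c(30, 52),
  Memory = c(44, 47),
  Dx     = c("Control", "Schizophrenia")
)

# Old name on the left, new name on the right.
rename_map <- c(Age = "ParticipantAge", Memory = "MemoryScore", Dx = "GroupType")

# Guard: every existing column must have an entry in the map.
stopifnot(all(names(toy) %in% names(rename_map)))

# Look up each existing name in the map, whatever the column order.
names(toy) <- unname(rename_map[names(toy)])
names(toy)
```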
Comparison of Original Data and New Subset Data: The subset keeps only the Age, Memory, and Dx columns and only participants (patients and controls alike) with a Memory score above 41, which leaves 145 of the original 242 rows. Because the filter removes every score of 41 or below, the subset's memory statistics are higher by construction: the mean rises from 42.44 to 50.43 and the median from 44 to 49. This reflects the selection rule and is not evidence of a group difference. Age is slightly lower in the subset (mean 39.62 versus 40.89, median 38 versus 40). The gap is small and may be due to chance or to a weak association between age and memory score; the summary statistics alone cannot distinguish the two.
# Use summary() to get an overview of the new data frame
summary(new_data_subset)
##  ParticipantAge   MemoryScore     GroupType
##  Min.   :19.00   Min.   :42.00   Length:145
##  1st Qu.:31.00   1st Qu.:45.00   Class :character
##  Median :38.00   Median :49.00   Mode  :character
##  Mean   :39.62   Mean   :50.43
##  3rd Qu.:48.00   3rd Qu.:54.00
##  Max.   :64.00   Max.   :71.00
# Calculate the mean and median for the selected attributes in the new data frame
mean_participant_age <- mean(new_data_subset$ParticipantAge)
median_participant_age <- median(new_data_subset$ParticipantAge)
mean_memory_score <- mean(new_data_subset$MemoryScore)
median_memory_score <- median(new_data_subset$MemoryScore)
# Print the mean and median for the selected attributes
cat("Mean Participant Age (New Data Frame):", mean_participant_age, "\n")
## Mean Participant Age (New Data Frame): 39.62069
cat("Median Participant Age (New Data Frame):", median_participant_age, "\n")
## Median Participant Age (New Data Frame): 38
cat("Mean Memory Score (New Data Frame):", mean_memory_score, "\n")
## Mean Memory Score (New Data Frame): 50.43448
cat("Median Memory Score (New Data Frame):", median_memory_score, "\n")
## Median Memory Score (New Data Frame): 49
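To put numbers on the comparison between the original data and the subset, the figures printed above can be combined directly. The sketch below only re-uses values already reported in the output (row counts and means); the variable names are illustrative.

```r
# Figures copied from the printed output above.
n_original           <- 242
n_subset             <- 145
mean_memory_original <- 42.44215
mean_memory_subset   <- 50.43448
mean_age_original    <- 40.89256
mean_age_subset      <- 39.62069

# Share of rows that survive the Memory > 41 filter.
share_retained <- n_subset / n_original

# Change in the means after filtering (subset minus original).
memory_shift <- mean_memory_subset - mean_memory_original
age_shift    <- mean_age_subset - mean_age_original

cat(sprintf("Rows retained: %.1f%%\n", 100 * share_retained))
cat(sprintf("Shift in mean Memory: %+.2f\n", memory_shift))
cat(sprintf("Shift in mean Age: %+.2f\n", age_shift))
```

The memory shift is several times larger than the age shift, which is what one would expect when the filter acts on Memory directly and on Age only indirectly.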
# Rename values in the "GroupType" column
new_data_subset$GroupType <- gsub("Schizophrenia", "SZ", new_data_subset$GroupType)
new_data_subset$GroupType <- gsub("Schizoaffective", "SA", new_data_subset$GroupType)
new_data_subset$GroupType <- gsub("Control", "CTL", new_data_subset$GroupType)
# Display the updated data frame
new_data_subset
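The three `gsub()` calls above replace matching text wherever it occurs inside a value. An alternative that matches whole values only is a named lookup vector. This is a minimal sketch on a hypothetical character vector; it assumes the group column contains exactly the three labels that the `gsub()` patterns imply.

```r
# Hypothetical group labels, assumed to match the values in the data.
group <- c("Control", "Schizophrenia", "Schizoaffective", "Control")

# Full label on the left, abbreviation on the right.
labels <- c(Schizophrenia = "SZ", Schizoaffective = "SA", Control = "CTL")

# Index the lookup vector by the existing labels, then drop the names.
recoded <- unname(labels[group])
recoded
```

A value that is not in the lookup vector comes back as `NA`, which makes unexpected labels easy to spot, whereas `gsub()` would leave them unchanged.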