# Enter the data manually
mydata <- data.frame("ID" = c(1, 2, 3),
                     "Age" = c(24, 28, 30),
                     "Gender" = c("M", "F", "F"))
I would like to create a new dataset that includes only ID and Age. Inside the brackets [ ], whatever is written before the comma selects rows, and whatever is written after the comma selects columns.
mydata2 <- mydata[ ,-3] #Deleting the third column
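The row-before-comma, column-after-comma convention can be tried out on a small stand-alone data frame. This is a minimal sketch: the object demo and the other names below are hypothetical and are used only so that mydata is left untouched.

```r
# Hypothetical example data with the same structure as mydata
demo <- data.frame(ID = c(1, 2, 3),
                   Age = c(24, 28, 30),
                   Gender = c("M", "F", "F"))

# Rows go before the comma, columns after it
by_position <- demo[, -3]            # drop the third column
by_name <- demo[, c("ID", "Age")]    # keep columns by name instead
first_row <- demo[1, ]               # first row, all columns
one_value <- demo[1, 2]              # row 1, column 2 (Age of the first student)

print(by_position)
print(by_name)
```

Selecting columns by name gives the same two columns as dropping the third one by position, and it keeps working if the column order changes later.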
Create mydata3, which includes only the first row of the data
mydata3 <- mydata [1 , ]
mydata[1,2] <- 22 #Changing particular value in a data frame
Take mydata and add a new variable called Height, with values 180, 170 and 172
mydata$Height <- c(180, 170, 172)
print (mydata)
## ID Age Gender Height
## 1 1 22 M 180
## 2 2 28 F 170
## 3 3 30 F 172
Create a new variable called Height1, which increases each student's height by 2 cm
mydata$Height1 <- mydata$Height + 2
print (mydata)
## ID Age Gender Height Height1
## 1 1 22 M 180 182
## 2 2 28 F 170 172
## 3 3 30 F 172 174
summary (mydata [ ,c(-1,-3)])
## Age Height Height1
## Min. :22.00 Min. :170 Min. :172
## 1st Qu.:25.00 1st Qu.:171 1st Qu.:173
## Median :28.00 Median :172 Median :174
## Mean :26.67 Mean :174 Mean :176
## 3rd Qu.:29.00 3rd Qu.:176 3rd Qu.:178
## Max. :30.00 Max. :180 Max. :182
Explanation of results: the first quartile (Q1) of Age is 25. The range of Height is max - min = 180 - 170 = 10 cm. The interquartile range of Age is Q3 - Q1 = 29 - 25 = 4.
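The quartiles, range and interquartile range read off the summary can also be computed directly. A minimal sketch, assuming the same Age and Height values as in mydata; the vectors age and height and the result names are hypothetical stand-ins.

```r
# Hypothetical stand-alone copies of the Age and Height columns
age <- c(22, 28, 30)
height <- c(180, 170, 172)

# Quartiles of Age (quantile() uses its default method, type 7)
q <- quantile(age, probs = c(0.25, 0.75))

# Interquartile range of Age: Q3 - Q1
age_iqr <- IQR(age)

# Range of Height as a single number: max - min
height_range <- diff(range(height))

print(q)
print(age_iqr)
print(height_range)
```

Note that range() on its own returns the minimum and maximum, so diff() is used here to turn them into the 10 cm spread.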
mean(mydata$Age)
## [1] 26.66667
Computing the standard deviation in R
sd(mydata$Height)
## [1] 5.291503
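To see where this number comes from, the sample standard deviation can be rebuilt step by step. A minimal sketch, assuming the same three heights; the names height, deviations and manual_sd are hypothetical.

```r
# Hypothetical stand-alone copy of the Height column
height <- c(180, 170, 172)
n <- length(height)

# Deviations of each value from the mean
deviations <- height - mean(height)

# Sample standard deviation: divide the sum of squares by n - 1, not n
manual_sd <- sqrt(sum(deviations^2) / (n - 1))

print(manual_sd)
print(all.equal(manual_sd, sd(height)))
```

The division by n - 1 is what sd() does as well, which is why the two results agree.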
We would like to use a function called describe. First, we need to install the psych package and load it
#install.packages("psych")
library(psych)
describe(mydata[ , c(-1, -3) ])
## vars n mean sd median trimmed mad min max range skew
## Age 1 3 26.67 4.16 28 26.67 2.97 22 30 8 -0.29
## Height 2 3 174.00 5.29 172 174.00 2.97 170 180 10 0.32
## Height1 3 3 176.00 5.29 174 176.00 2.97 172 182 10 0.32
## kurtosis se
## Age -2.33 2.40
## Height -2.33 3.06
## Height1 -2.33 3.06
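Several columns of the describe output can be cross-checked with base R alone. The sketch below does this for Age, on the assumption that describe was called with its default settings (a 10% trimmed mean, and se computed as sd divided by the square root of n). The vector age and the result names are hypothetical.

```r
# Hypothetical stand-alone copy of the Age column
age <- c(22, 28, 30)
n <- length(age)

age_se <- sd(age) / sqrt(n)            # standard error of the mean
age_mad <- mad(age)                    # median absolute deviation (scaled by 1.4826)
age_range <- max(age) - min(age)       # range as a single number
age_trimmed <- mean(age, trim = 0.1)   # 10% trimmed mean

print(round(c(se = age_se, mad = age_mad,
              range = age_range, trimmed = age_trimmed), 2))
```

With only three observations the 10% trim removes nothing, so the trimmed mean equals the ordinary mean.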