Let’s create a dataframe using 3 vectors :
vec1 = c(1,2,3)
vec2 = c("R","Python","Java")
vec3 = c("For Prototyping", "For Prototyping", "For Scaleup")
df = data.frame(vec1,vec2,vec3)
print(df)
## vec1 vec2 vec3
## 1 1 R For Prototyping
## 2 2 Python For Prototyping
## 3 3 Java For Scaleup
To access the 1st and 2nd row of the dataframe df, we have to write the following code :
print(df[1:2,])
## vec1 vec2 vec3
## 1 1 R For Prototyping
## 2 2 Python For Prototyping
To access the 1st and 2nd column of the dataframe df, we have to write the following code :
print(df[,1:2])
or,
print(df[1:2])
## vec1 vec2
## 1 1 R
## 2 2 Python
## 3 3 Java
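The bracket forms above differ when only one column is selected. The following is a minimal sketch of that difference, assuming the same three vectors as before; the names col_as_vector, col_as_df and cols_by_name are only illustrative.

```r
# Rebuild the example dataframe so this snippet runs on its own
vec1 = c(1,2,3)
vec2 = c("R","Python","Java")
vec3 = c("For Prototyping", "For Prototyping", "For Scaleup")
df = data.frame(vec1,vec2,vec3)

# df[, 1] simplifies a single column to a plain vector
col_as_vector = df[, 1]
# df[1] keeps a one-column dataframe
col_as_df = df[1]
# Columns can also be selected by name instead of position
cols_by_name = df[, c("vec1","vec2")]

print(is.data.frame(col_as_vector))
## [1] FALSE
print(is.data.frame(col_as_df))
## [1] TRUE
print(names(cols_by_name))
## [1] "vec1" "vec2"
```

Selecting by name is usually safer than selecting by position, because it keeps working if the column order changes.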
Let’s create another dataframe :
pd = data.frame("Name"= c("Senthil","Senthil","Sam","Sam"),
"Month"= c("Jan","Feb","Jan","Feb"),
"BS"= c(141.2,139.3,135.2,160.1),
"BP"= c(90,78,80,81))
print(pd)
## Name Month BS BP
## 1 Senthil Jan 141.2 90
## 2 Senthil Feb 139.3 78
## 3 Sam Jan 135.2 80
## 4 Sam Feb 160.1 81
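The function subset() extracts the rows of a dataframe that satisfy a given condition. Values are compared with == (a single = would be treated as a named argument and silently ignored), and several conditions are combined with & :
pd2 = subset(pd, Name == "Sam" & BS > 150)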
print(pd2)
## Name Month BS BP
## 4 Sam Feb 160.1 81
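As a further sketch, subset() can also restrict the columns through its select argument, and the same result can be obtained with plain logical indexing. The condition (BP below 80 for Senthil) and the names senthil_low_bp and same_rows are chosen here only for illustration.

```r
# Rebuild the example dataframe so this snippet runs on its own
pd = data.frame("Name"= c("Senthil","Senthil","Sam","Sam"),
                "Month"= c("Jan","Feb","Jan","Feb"),
                "BS"= c(141.2,139.3,135.2,160.1),
                "BP"= c(90,78,80,81))

# Rows where Name is "Senthil" and BP is below 80; keep only Month and BP
senthil_low_bp = subset(pd, Name == "Senthil" & BP < 80, select = c(Month, BP))
print(senthil_low_bp)

# The same rows and columns with logical indexing
same_rows = pd[pd$Name == "Senthil" & pd$BP < 80, c("Month","BP")]
print(same_rows)
```

Both forms return the single February row for Senthil. Inside subset() the column names can be written bare, while logical indexing needs the pd$ prefix.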
Let’s change the value in the 2nd row and 2nd column of dataframe df to “R” :
df[[2]][2]="R"
print(df)
## vec1 vec2 vec3
## 1 1 R For Prototyping
## 2 2 R For Prototyping
## 3 3 Java For Scaleup
or, we can do this from a GUI by using the following command (edit() returns the edited copy, so the result must be assigned back to df) :
df = edit(df)
We can also create and edit dataframes from the GUI by using the following simple commands :
For creating an empty dataframe :
MyTable = data.frame()
For adding data to the above created dataframe :
MyTable = edit(MyTable)
To view the created dataframe :
View(MyTable)
For adding an extra row, we can use the rbind() function :
df = rbind(df,data.frame(vec1 = 4,
vec2 = "C",
vec3 = "For Scaleup"))
print(df)
## vec1 vec2 vec3
## 1 1 R For Prototyping
## 2 2 R For Prototyping
## 3 3 Java For Scaleup
## 4 4 C For Scaleup
For adding an extra column, we can use the cbind() function :
df = cbind(df, vec4 = c(10,20,30,40))
print(df)
## vec1 vec2 vec3 vec4
## 1 1 R For Prototyping 10
## 2 2 R For Prototyping 20
## 3 3 Java For Scaleup 30
## 4 4 C For Scaleup 40
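A column can also be added by assigning to a new column name with $. The sketch below rebuilds the four-row dataframe so that it runs on its own; vec5 is a hypothetical extra column added only to show a computed column.

```r
# Four-row dataframe matching the state of df before the extra column
df = data.frame(vec1 = c(1,2,3,4),
                vec2 = c("R","R","Java","C"),
                vec3 = c("For Prototyping","For Prototyping",
                         "For Scaleup","For Scaleup"))

# Assigning to a new name appends the column, like cbind(df, vec4 = ...)
df$vec4 = c(10,20,30,40)

# A new column can also be computed from existing ones (vec5 is illustrative)
df$vec5 = df$vec1 * df$vec4
print(df)
```

The $ form modifies df in place, while cbind() returns a new dataframe that has to be assigned.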
We can also do the same operations from a GUI by using the edit() function as discussed earlier.
Let’s create another dataframe that shows the data of df except the 3rd row and the 1st column :
df2 = df[-3,-1]
print(df2)
## vec2 vec3 vec4
## 1 R For Prototyping 10
## 2 R For Prototyping 20
## 4 C For Scaleup 40
So, just by putting a negative sign (-), we can remove the desired entries of a dataframe.
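Negative indices also accept a vector, so several rows or columns can be removed at once. The following is a small sketch on a reduced version of df; the names no_rows_1_3, one_col and as_vector are only illustrative.

```r
# Reduced version of the example dataframe
df = data.frame(vec1 = c(1,2,3,4),
                vec2 = c("R","R","Java","C"),
                vec4 = c(10,20,30,40))

# Remove the 1st and 3rd rows in one step
no_rows_1_3 = df[-c(1,3), ]
print(no_rows_1_3)

# If only one column remains, the result is simplified to a vector
as_vector = df[, -c(2,3)]
# drop = FALSE keeps it as a one-column dataframe
one_col = df[, -c(2,3), drop = FALSE]

print(is.data.frame(as_vector))
## [1] FALSE
print(is.data.frame(one_col))
## [1] TRUE
```

Note that positive and negative indices cannot be mixed in the same position of the brackets.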
To delete the 3rd column :
df3 = df[,!names(df) %in% c("vec3")]
print(df3)
## vec1 vec2 vec4
## 1 1 R 10
## 2 2 R 20
## 3 3 Java 30
## 4 4 C 40
To delete the 3rd row, we keep only the rows whose vec1 value is not 3 :
df4 = df[df$vec1 != 3,]
print(df4)
## vec1 vec2 vec3 vec4
## 1 1 R For Prototyping 10
## 2 2 R For Prototyping 20
## 4 4 C For Scaleup 40
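As the output shows, the remaining rows keep their original row names (1, 2, 4). A minimal sketch of resetting them, assuming a reduced two-column version of df; old_names is only an illustrative name.

```r
# Reduced version of the example dataframe
df = data.frame(vec1 = c(1,2,3,4),
                vec4 = c(10,20,30,40))

# Keep every row whose vec1 value is not 3
df4 = df[df$vec1 != 3, ]
old_names = rownames(df4)
print(old_names)
## [1] "1" "2" "4"

# Setting the row names to NULL renumbers the rows from 1
rownames(df4) = NULL
print(rownames(df4))
## [1] "1" "2" "3"
```

Resetting the row names is optional, but it avoids confusion when rows are later accessed by position.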
In R versions before 4.0.0, character columns in a data frame are stored as factors by default.
The command df[3,1] = 3.1 changes the value in the 3rd row and 1st column of the dataframe df to 3.1; vec1 is already a numeric (double) column, so its datatype does not change. But when vec3 is a factor, the command df[3,3] = "Others" does not change the value in the 3rd row and 3rd column to Others: "Others" is not one of the existing levels of vec3, so R stores NA there and issues an "invalid factor level" warning.
To avoid this factor issue with character columns, we have to pass an additional argument to the data.frame() command, as follows :
df=data.frame(vec1,vec2,vec3,stringsAsFactors = FALSE)
print(df)
## vec1 vec2 vec3
## 1 1 R For Prototyping
## 2 2 Python For Prototyping
## 3 3 Java For Scaleup
In R versions before 4.0.0, the stringsAsFactors argument is TRUE by default, which is why a character column is taken as a factor; from R 4.0.0 onwards the default is FALSE.
After passing the stringsAsFactors argument as FALSE, the command df[3,3] = "Others" changes the value in the 3rd row and 3rd column to Others.
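The difference can be checked directly by setting stringsAsFactors explicitly, which makes the behaviour independent of the R version. This is a minimal sketch; the names as_factor and as_char are only illustrative.

```r
vec3 = c("For Prototyping", "For Prototyping", "For Scaleup")

# Same column stored once as a factor and once as character
as_factor = data.frame(vec3, stringsAsFactors = TRUE)
as_char = data.frame(vec3, stringsAsFactors = FALSE)
print(class(as_factor$vec3))
## [1] "factor"
print(class(as_char$vec3))
## [1] "character"

# "Others" is not an existing level, so the factor cell becomes NA
# (the warning is suppressed here; <- is required inside a function call)
suppressWarnings(as_factor[3,1] <- "Others")
print(is.na(as_factor[3,1]))
## [1] TRUE

# Adding the level first makes the assignment valid for a factor
levels(as_factor$vec3) = c(levels(as_factor$vec3), "Others")
as_factor[3,1] = "Others"

# A character column accepts the new value directly
as_char[3,1] = "Others"
print(as_char)
```

If a column is meant to hold free text, keeping it as character is simpler; factors are better suited to columns with a fixed set of categories.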