Creating A Dataframe

Let’s create a dataframe using 3 vectors :

vec1 = c(1,2,3)
vec2 = c("R","Python","Java")
vec3 = c("For Prototyping", "For Prototyping", "For Scaleup")

df = data.frame(vec1,vec2,vec3)
print(df)
##   vec1   vec2            vec3
## 1    1      R For Prototyping
## 2    2 Python For Prototyping
## 3    3   Java     For Scaleup
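
After creating a dataframe, it is often useful to check its size and column names. The following is a minimal sketch using base R functions; df_check is a hypothetical name, used here so that df above is left untouched :

```r
# Rebuild the dataframe from the same three vectors.
vec1 = c(1, 2, 3)
vec2 = c("R", "Python", "Java")
vec3 = c("For Prototyping", "For Prototyping", "For Scaleup")
df_check = data.frame(vec1, vec2, vec3)

# dim() returns the number of rows and the number of columns.
print(dim(df_check))
## [1] 3 3

# names() returns the column names, which are taken from the vector names.
print(names(df_check))
## [1] "vec1" "vec2" "vec3"

# str() summarises the datatype of every column (output depends on the R version).
str(df_check)
```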

Accessing Elements From A Dataframe

To access the 1st and 2nd rows of the dataframe df, we can write the following code :

print(df[1:2,])
##   vec1   vec2            vec3
## 1    1      R For Prototyping
## 2    2 Python For Prototyping

To access the 1st and 2nd columns of the dataframe df, we can write either of the following :

print(df[,1:2])

or,

print(df[1:2])
##   vec1   vec2
## 1    1      R
## 2    2 Python
## 3    3   Java
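
Columns and single cells can also be accessed by name. The sketch below assumes a small stand-in dataframe (df_acc is a hypothetical name) and passes stringsAsFactors = FALSE so that the result is the same on every R version :

```r
df_acc = data.frame(vec1 = c(1, 2, 3),
                    vec2 = c("R", "Python", "Java"),
                    stringsAsFactors = FALSE)

# A single column, selected by name, comes back as a plain vector.
langs = df_acc$vec2
print(langs)
## [1] "R"      "Python" "Java"

# A single cell: 2nd row of the column named vec2.
print(df_acc[2, "vec2"])
## [1] "Python"

# Selecting one column with [ , ] drops to a vector;
# drop = FALSE keeps a one-column dataframe instead.
print(class(df_acc[, 1]))
## [1] "numeric"
print(class(df_acc[, 1, drop = FALSE]))
## [1] "data.frame"
```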

Extracting Subset From A Dataframe

Let’s create another dataframe :

pd = data.frame("Name"= c("Senthil","Senthil","Sam","Sam"),
                "Month"= c("Jan","Feb","Jan","Feb"),
                "BS"= c(141.2,139.3,135.2,160.1),
                "BP"= c(90,78,80,81))
print(pd)
##      Name Month    BS BP
## 1 Senthil   Jan 141.2 90
## 2 Senthil   Feb 139.3 78
## 3     Sam   Jan 135.2 80
## 4     Sam   Feb 160.1 81

The function subset() extracts the rows of a dataframe that satisfy a given logical condition. Comparisons are written with == (not =), and several conditions are combined with & :

pd2 = subset(pd, Name == "Sam" & BS > 150)
print(pd2)
##   Name Month    BS BP
## 4  Sam   Feb 160.1 81
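
subset() can also restrict the columns through its select argument, and the same rows can be obtained with square-bracket indexing. A minimal sketch, assuming a copy of the data above under the hypothetical name pd_demo :

```r
pd_demo = data.frame(Name = c("Senthil", "Senthil", "Sam", "Sam"),
                     Month = c("Jan", "Feb", "Jan", "Feb"),
                     BS = c(141.2, 139.3, 135.2, 160.1),
                     BP = c(90, 78, 80, 81))

# Rows where both conditions hold; select keeps only the listed columns.
high_bs = subset(pd_demo, Name == "Sam" & BS > 150, select = c(Month, BS))
print(high_bs)
##   Month    BS
## 4   Feb 160.1

# The same selection through logical indexing with square brackets.
same_rows = pd_demo[pd_demo$Name == "Sam" & pd_demo$BS > 150, c("Month", "BS")]
print(same_rows)
```

Inside subset() the column names can be used directly, while square-bracket indexing needs the pd_demo$ prefix.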

Editing A Dataframe

Let’s change the value in the 2nd row and 2nd column of the dataframe df to “R” :

df[[2]][2]="R"
print(df)
##   vec1 vec2            vec3
## 1    1    R For Prototyping
## 2    2    R For Prototyping
## 3    3 Java     For Scaleup

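Alternatively, we can make the change in a GUI editor. The edit() function returns the modified copy rather than changing df in place, so its result has to be assigned back :

df = edit(df)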
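
Apart from df[[2]][2], the same cell can be addressed in a few equivalent ways. A minimal sketch, assuming a small stand-in dataframe (df_edit is a hypothetical name) built with stringsAsFactors = FALSE so that any string can be assigned :

```r
df_edit = data.frame(vec1 = c(1, 2, 3),
                     vec2 = c("R", "Python", "Java"),
                     stringsAsFactors = FALSE)

# Three equivalent ways to overwrite the 2nd row of the 2nd column.
df_edit[2, 2] = "R"        # by row and column position
df_edit[2, "vec2"] = "R"   # by row position and column name
df_edit$vec2[2] = "R"      # by column name, then element position
print(df_edit$vec2)
## [1] "R"    "R"    "Java"
```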

Utilising GUI to Create & Edit Dataframe

We can create and edit dataframes from a GUI by using the following commands :

For creating an empty dataframe :

MyTable = data.frame()

For adding data to the above created dataframe :

MyTable = edit(MyTable)

To view the created dataframe :

View(MyTable)

Adding Extra Rows & Columns To Dataframe

For adding an extra row, we can use the rbind() function :

df = rbind(df,data.frame(vec1 = 4,
                         vec2 = "C",
                         vec3 = "For Scaleup"))
print(df)
##   vec1 vec2            vec3
## 1    1    R For Prototyping
## 2    2    R For Prototyping
## 3    3 Java     For Scaleup
## 4    4    C     For Scaleup

For adding an extra column, we can use the cbind() function :

df = cbind(df, vec4 = c(10,20,30,40))
print(df)
##   vec1 vec2            vec3 vec4
## 1    1    R For Prototyping   10
## 2    2    R For Prototyping   20
## 3    3 Java     For Scaleup   30
## 4    4    C     For Scaleup   40
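
A column can also be added by assigning to a new column name, and rbind() matches the columns of the new row by name rather than by position. A minimal sketch, assuming a stand-in dataframe under the hypothetical name df_add :

```r
df_add = data.frame(vec1 = c(1, 2, 3),
                    vec2 = c("R", "Python", "Java"),
                    stringsAsFactors = FALSE)

# Assigning to a new column name appends the column, like cbind().
df_add$vec4 = c(10, 20, 30)

# The new row must supply every column; the order does not matter
# because rbind() matches dataframe columns by name.
new_row = data.frame(vec4 = 40, vec2 = "C", vec1 = 4,
                     stringsAsFactors = FALSE)
df_add = rbind(df_add, new_row)
print(df_add)
##   vec1   vec2 vec4
## 1    1      R   10
## 2    2 Python   20
## 3    3   Java   30
## 4    4      C   40
```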

We can also do the same operations from a GUI by using the edit() function as discussed earlier.

Deleting Rows & Columns From Dataframe

Let’s create another dataframe that contains the data of df except the 3rd row and the 1st column :

df2 = df[-3,-1]
print(df2)
##   vec2            vec3 vec4
## 1    R For Prototyping   10
## 2    R For Prototyping   20
## 4    C     For Scaleup   40

So, just by putting a negative sign (-) in front of an index, we can exclude the corresponding rows or columns from the result. The original dataframe df is not modified.
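
Negative indices also work with several positions at once. A minimal sketch, assuming a made-up dataframe (df_neg and its columns are hypothetical) :

```r
df_neg = data.frame(id = 1:5,
                    score = c(10, 20, 30, 40, 50),
                    grade = c("a", "b", "c", "d", "e"))

# Drop rows 2 and 4 and the 3rd column in one step.
trimmed = df_neg[-c(2, 4), -3]
print(trimmed)
##   id score
## 1  1    10
## 3  3    30
## 5  5    50

# The source dataframe still has all of its rows.
print(nrow(df_neg))
## [1] 5
```

The remaining rows keep their original row names (1, 3, 5), which is why the numbering in the output has gaps.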

Conditional Deletion

To delete the 3rd column :

df3 = df[,!names(df) %in% c("vec3")]
print(df3)
##   vec1 vec2 vec4
## 1    1    R   10
## 2    2    R   20
## 3    3 Java   30
## 4    4    C   40

To delete the 3rd row by matching a value (here, the row where vec1 equals 3) :

df4 = df[df$vec1 != 3,]
print(df4)
##   vec1 vec2            vec3 vec4
## 1    1    R For Prototyping   10
## 2    2    R For Prototyping   20
## 4    4    C     For Scaleup   40
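
Conditions can be combined, and several columns can be dropped by name at once. A minimal sketch, assuming a stand-in dataframe under the hypothetical name df_cond :

```r
df_cond = data.frame(vec1 = c(1, 2, 3, 4),
                     vec2 = c("R", "R", "Java", "C"),
                     vec4 = c(10, 20, 30, 40),
                     stringsAsFactors = FALSE)

# Keep only the rows where vec2 is not "R" and vec4 is below 40.
kept = df_cond[df_cond$vec2 != "R" & df_cond$vec4 < 40, ]
print(kept)
##   vec1 vec2 vec4
## 3    3 Java   30

# Drop several columns by name; drop = FALSE keeps a dataframe
# even when only a single column remains.
only_vec1 = df_cond[, !names(df_cond) %in% c("vec2", "vec4"), drop = FALSE]
print(names(only_vec1))
## [1] "vec1"
```

Without drop = FALSE, the last selection would return a plain vector instead of a one-column dataframe.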

Manipulating Rows - The Factor Issue

When character columns are created in a data frame, they are stored as factors by default in R versions before 4.0.0.

The command df[3,1] = 3.1 changes the value in the 3rd row and 1st column of the dataframe df to 3.1. Since c(1,2,3) already creates a numeric (double) vector, the datatype of vec1 stays the same; only the printed values gain a decimal place.

However, when vec3 is a factor, the command df[3,3] = "Others" does not change the value in the 3rd row and 3rd column to Others. "Others" is not one of the existing levels of vec3, so R issues an "invalid factor level, NA generated" warning and stores NA instead.

To avoid this factor issue with character columns, we have to pass an additional argument to data.frame(), as follows :

df=data.frame(vec1,vec2,vec3,stringsAsFactors = FALSE)
print(df)
##   vec1   vec2            vec3
## 1    1      R For Prototyping
## 2    2 Python For Prototyping
## 3    3   Java     For Scaleup

In R versions before 4.0.0, the stringsAsFactors argument is TRUE by default, which is why a character column is stored as a factor. From R 4.0.0 onwards the default is FALSE.

After setting stringsAsFactors to FALSE, the command df[3,3] = "Others" changes the value in the 3rd row and 3rd column to Others.
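
The factor behaviour can be reproduced on any R version by passing stringsAsFactors = TRUE explicitly. The following is a minimal sketch (df_fac is a hypothetical name); it also shows one possible way to fix an existing factor column without rebuilding the dataframe :

```r
# stringsAsFactors = TRUE is set explicitly so that the behaviour
# is the same on every R version.
df_fac = data.frame(vec3 = c("For Prototyping", "For Prototyping", "For Scaleup"),
                    stringsAsFactors = TRUE)
print(levels(df_fac$vec3))
## [1] "For Prototyping" "For Scaleup"

# "Others" is not an existing level, so R warns
# ("invalid factor level, NA generated") and stores NA.
df_fac[3, 1] = "Others"
print(is.na(df_fac$vec3[3]))
## [1] TRUE

# Converting the column to character allows any string to be assigned.
df_fac$vec3 = as.character(df_fac$vec3)
df_fac[3, 1] = "Others"
print(df_fac$vec3)
## [1] "For Prototyping" "For Prototyping" "Others"
```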