Question 1: Tidy Data
What is tidy data?
Tidy data is the output of the data-tidying process, which is in turn a part of data cleaning. Tidy datasets provide a standardized way to link the structure of a dataset (its physical layout) with its semantics (its meaning). A dataset is messy or tidy depending on how rows, columns and tables are matched up with observations, variables and types.
Structure is the form and shape of your data. In statistics, most datasets are rectangular data tables (data frames) made up of rows and columns.
Semantics is the meaning of the dataset. Datasets are collections of values, either quantitative or qualitative, and every value belongs to a variable and an observation.
The three rules of tidy data:
- Each variable is a column
- Each observation is a row
- Each type of observational unit is a table
Question 2: Long and Wide
What are wide and long data formats? Give two use cases of each data frame structure, along with an example script in R. Which packages can be used for creating long and wide data formats?
The Long Format
A table stored in ‘long’ format has a single column for each variable in the system. An example dataset is given below:
## ID Name Year Score
## 1 1 Avinash 2017 93
## 2 2 Shreyas 2017 97
## 3 3 Rajesh 2017 94
## 4 4 Sai 2017 97
## 5 5 Deepa 2017 86
## 6 1 Avinash 2018 79
## 7 2 Shreyas 2018 83
## 8 3 Rajesh 2018 88
## 9 4 Sai 2018 93
## 10 5 Deepa 2018 100
## 11 1 Avinash 2019 76
## 12 2 Shreyas 2019 79
## 13 3 Rajesh 2019 93
## 14 4 Sai 2019 75
## 15 5 Deepa 2019 85
## 16 1 Avinash 2020 87
## 17 2 Shreyas 2020 85
## 18 3 Rajesh 2020 85
## 19 4 Sai 2020 79
## 20 5 Deepa 2020 99
## 21 1 Avinash 2021 86
## 22 2 Shreyas 2021 83
## 23 3 Rajesh 2021 99
## 24 4 Sai 2021 93
## 25 5 Deepa 2021 91
In the above case, each data point represents the score of an individual in a particular year, so our variables are ID, Name, Year and Score.
The Wide Format
A table stored in ‘wide’ format spreads a variable across several columns. The wide format of the same sample dataset above is given below:
## ID Name 2017 2018 2019 2020 2021
## 1 1 Avinash 93 79 76 87 86
## 2 2 Shreyas 97 83 79 85 83
## 3 3 Rajesh 94 88 93 85 99
## 4 4 Sai 97 93 75 79 93
## 5 5 Deepa 86 100 85 99 91
Many R functions expect data in the long format, and it is often easier to process data in long format.
On the other hand, the wide format is easier for people to view and comprehend, especially when data is being entered and validated, where human comprehension is important for ensuring quality and accuracy.
Datasets tend to start life in wide format and become long as they are used more for processing. Fortunately, converting back and forth is easy nowadays, especially with the tidyr package.
Let’s see this with an example:
library(tidyr)

set.seed(12345)
longdata <- data.frame(ID = 1:5,
                       expand.grid(
                         Name = c("Avinash", "Shreyas", "Rajesh", "Sai", "Deepa"),
                         Year = 2017:2021),
                       Score = round(runif(25, 75, 100), 0))
head(longdata, 10)
## ID Name Year Score
## 1 1 Avinash 2017 93
## 2 2 Shreyas 2017 97
## 3 3 Rajesh 2017 94
## 4 4 Sai 2017 97
## 5 5 Deepa 2017 86
## 6 1 Avinash 2018 79
## 7 2 Shreyas 2018 83
## 8 3 Rajesh 2018 88
## 9 4 Sai 2018 93
## 10 5 Deepa 2018 100
As we can see, this data is in the long format and is very easy to work with in R; we can visualise or even summarize data easily in this format. But if a layperson wants to get some insights just by looking at the data, without visualization, it is difficult to comprehend. So we will now convert this long data to wide data using the spread() function from the tidyr package:
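The conversion chunk isn't shown in the rendered output; presumably it was something like:

widedata <- spread(longdata, key = Year, value = Score)
widedata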
## ID Name 2017 2018 2019 2020 2021
## 1 1 Avinash 93 79 76 87 86
## 2 2 Shreyas 97 83 79 85 83
## 3 3 Rajesh 94 88 93 85 99
## 4 4 Sai 97 93 75 79 93
## 5 5 Deepa 86 100 85 99 91
As we can see now, the data is much more appealing and we can draw initial insights just by looking at it. Initial insights help give direction to further data analysis.
Now we can also convert this same wide data back to the long format we saw earlier, using the gather() function of the tidyr package:
longdata2 <- widedata %>%
  gather("2017", "2018", "2019", "2020", "2021", key = Year, value = Score)
head(longdata2, 10)
## ID Name Year Score
## 1 1 Avinash 2017 93
## 2 2 Shreyas 2017 97
## 3 3 Rajesh 2017 94
## 4 4 Sai 2017 97
## 5 5 Deepa 2017 86
## 6 1 Avinash 2018 79
## 7 2 Shreyas 2018 83
## 8 3 Rajesh 2018 88
## 9 4 Sai 2018 93
## 10 5 Deepa 2018 100
longdata and longdata2 contain the same information: converting to wide and back to long caused no loss of information. We can validate that the two datasets are the same:
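The validation chunk isn't shown; one possible check (an assumption; note that gather() returns Year as character, so it is coerced back to integer first):

longdata2$Year <- as.integer(as.character(longdata2$Year))
all(longdata == longdata2)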
## [1] TRUE
Some packages work well with the wide format, while most work best with long data. It is up to the analyst which format to use, depending on their goals. Besides tidyr (whose newer pivot_longer() and pivot_wider() functions supersede gather() and spread()), the reshape2 package with melt() and dcast() can also be used to create long and wide formats.
Question 3: Barplot and Histogram
Import the ‘iris’ dataset available in R and visualise the following two types of charts using the dataset:
- Barplot
- Histogram
Explain where these two types of plots differ with their respective implementations.
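The setup chunk isn't shown in the rendered output; a minimal version (assuming these are the packages used) would be:

library(ggplot2)   # plotting
library(gridExtra) # grid.arrange() for arranging several plots in a grid
library(grid)      # textGrob() and gpar() used for the grid titles below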
Histogram
A histogram is a type of bar chart used to represent statistical information, with bars showing the frequency distribution of continuous data. It indicates the number of observations that lie within each range of values, known as a class or bin.
# Sepal length
HistSl <- ggplot(data = iris, aes(x = Sepal.Length)) +
  geom_histogram(binwidth = 0.2, color = "black", aes(fill = Species)) +
  xlab("Sepal Length (cm)") +
  ylab("Frequency") +
  theme(legend.position = "none") +
  ggtitle("Histogram of Sepal Length") +
  geom_vline(data = iris, aes(xintercept = mean(Sepal.Length)),
             linetype = "dashed", color = "grey")
# Sepal width
HistSw <- ggplot(data = iris, aes(x = Sepal.Width)) +
  geom_histogram(binwidth = 0.2, color = "black", aes(fill = Species)) +
  xlab("Sepal Width (cm)") +
  ylab("Frequency") +
  theme(legend.position = "none") +
  ggtitle("Histogram of Sepal Width") +
  geom_vline(data = iris, aes(xintercept = mean(Sepal.Width)),
             linetype = "dashed", color = "grey")
# Petal length
HistPl <- ggplot(data = iris, aes(x = Petal.Length)) +
  geom_histogram(binwidth = 0.2, color = "black", aes(fill = Species)) +
  xlab("Petal Length (cm)") +
  ylab("Frequency") +
  theme(legend.position = "none") +
  ggtitle("Histogram of Petal Length") +
  geom_vline(data = iris, aes(xintercept = mean(Petal.Length)),
             linetype = "dashed", color = "grey")
# Petal width
HistPw <- ggplot(data = iris, aes(x = Petal.Width)) +
  geom_histogram(binwidth = 0.2, color = "black", aes(fill = Species)) +
  xlab("Petal Width (cm)") +
  ylab("Frequency") +
  theme(legend.position = "right") +
  ggtitle("Histogram of Petal Width") +
  geom_vline(data = iris, aes(xintercept = mean(Petal.Width)),
             linetype = "dashed", color = "grey")
# Arrange all four histograms in a grid
grid.arrange(HistSl + ggtitle(""),
             HistSw + ggtitle(""),
             HistPl + ggtitle(""),
             HistPw + ggtitle(""),
             nrow = 2,
             top = textGrob("Iris Frequency Histogram",
                            gp = gpar(fontsize = 15)))

Given above is a grid comparing the histograms of the continuous numerical variables, namely Petal.Length, Petal.Width, Sepal.Length and Sepal.Width.
The x-axis in each of the above graphs is a continuous variable, while the y-axis plots the frequency (the count of observations). The bars have been differentiated and color-coded by the categorical variable Species so that we can gain insights into the data.
Barplot
A bar plot is a chart that graphically represents comparisons between categories of data. It displays grouped data by way of parallel rectangular bars of equal width but varying length. Each rectangular block indicates a specific category, and the length of each bar depends on the value it holds.
# Petal length
barpl <- ggplot(data = iris, aes(x = Species, y = Petal.Length, fill = Species)) +
  stat_summary(geom = "bar", fun = "mean", color = "black", width = 0.75) +
  ylab("Mean Petal Length") + theme(legend.position = "none")
# Petal width
barpw <- ggplot(data = iris, aes(x = Species, y = Petal.Width, fill = Species)) +
  stat_summary(geom = "bar", fun = "mean", color = "black", width = 0.75) +
  ylab("Mean Petal Width") + theme(legend.position = "none")
# Sepal length
barsl <- ggplot(data = iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
  stat_summary(geom = "bar", fun = "mean", color = "black", width = 0.75) +
  ylab("Mean Sepal Length") + theme(legend.position = "none")
# Sepal width
barsw <- ggplot(data = iris, aes(x = Species, y = Sepal.Width, fill = Species)) +
  stat_summary(geom = "bar", fun = "mean", color = "black", width = 0.75) +
  ylab("Mean Sepal Width")
# Arrange all four bar plots in a grid
grid.arrange(barpl + ggtitle(""),
             barpw + ggtitle(""),
             barsl + ggtitle(""),
             barsw + ggtitle(""),
             nrow = 2,
             top = textGrob("Iris Mean Lengths/Widths Barplot",
                            gp = gpar(fontsize = 15)))

Given above is a grid comparing the bar plots of the means of the variables Petal.Length, Petal.Width, Sepal.Length and Sepal.Width across the categorical variable Species, which has the levels setosa, versicolor and virginica.
The x-axis in the above graphs shows discrete categories, while the y-axis is continuous and plots the mean of the length/width.
Question 4: Outliers
In the same ‘iris’ dataset, you have been assigned the task of identifying outliers. You must be able to provide a comprehensive glimpse of the dataset using a suitable visualisation. What method would you choose and why?
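The code chunk isn't shown in the rendered output; it is presumably just:

summary(iris)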
## Sepal.Length Sepal.Width Petal.Length Petal.Width
## Min. :4.300 Min. :2.000 Min. :1.000 Min. :0.100
## 1st Qu.:5.100 1st Qu.:2.800 1st Qu.:1.600 1st Qu.:0.300
## Median :5.800 Median :3.000 Median :4.350 Median :1.300
## Mean :5.843 Mean :3.057 Mean :3.758 Mean :1.199
## 3rd Qu.:6.400 3rd Qu.:3.300 3rd Qu.:5.100 3rd Qu.:1.800
## Max. :7.900 Max. :4.400 Max. :6.900 Max. :2.500
## Species
## setosa :50
## versicolor:50
## virginica :50
##
##
##
Boxplot
When trying to visualise outliers, box plots are the most intuitive first choice.
Box plots show the five-number summary of a set of data: the minimum, first (lower) quartile, median, third (upper) quartile, and maximum.
Following are the boxplots for each of the variables across the three species.
plbox <- ggplot(data = iris, aes(x = Species, y = Petal.Length, color = Species)) +
  geom_boxplot() + theme_minimal() + theme(legend.position = "none")
slbox <- ggplot(data = iris, aes(x = Species, y = Sepal.Length, color = Species)) +
  geom_boxplot() + theme_minimal() + theme(legend.position = "none")
pwbox <- ggplot(data = iris, aes(x = Species, y = Petal.Width, color = Species)) +
  geom_boxplot() + theme_minimal() + theme(legend.position = "none")
swbox <- ggplot(data = iris, aes(x = Species, y = Sepal.Width, color = Species)) +
  geom_boxplot() + theme_minimal() + theme(legend.position = "none")
grid.arrange(plbox + ggtitle(""),
             slbox + ggtitle(""),
             pwbox + ggtitle(""),
             swbox + ggtitle(""),
             nrow = 2,
             top = textGrob("Boxplots for Outlier Visualisation",
                            gp = gpar(fontsize = 15)))

The dots represent outliers.
Outliers are defined as values more than 3 times the IQR above the third quartile or more than 3 times the IQR below the first quartile.
Suspected outliers are values more than 1.5 times the IQR (interquartile range) above the third quartile or below the first quartile.
We can easily apply filters on the basis of the above information and pinpoint the exact outlier data points.
k-means Clustering
Another way to detect outliers is clustering. By grouping data into clusters, points that do not fit well into any cluster can be treated as outliers. With k-means, the data are partitioned into k groups by assigning each point to the closest cluster center. After that, we can calculate the distance (or dissimilarity) between each object and its cluster center, and pick those with the largest distances as outliers.
By using clustering to find outliers, we find points that are outlying jointly across all the variables, whereas with box plots we can only inspect outliers one variable at a time.
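The chunk that prepares the data and fits the clusters isn't shown; a minimal sketch is below. No seed appears in the original, so the exact center values printed next depend on the random initialization.

iris2 <- iris[, 1:4]                        # numeric columns only; drop Species
kmeans.result <- kmeans(iris2, centers = 3) # partition into k = 3 clusters
kmeans.result$centers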
## Sepal.Length Sepal.Width Petal.Length Petal.Width
## 1 6.314583 2.895833 4.973958 1.7031250
## 2 5.175758 3.624242 1.472727 0.2727273
## 3 4.738095 2.904762 1.790476 0.3523810
# Calculate distances between objects and cluster centers
centers <- kmeans.result$centers[kmeans.result$cluster,]
distances <- sqrt(rowSums((iris2 - centers)^2))
# Picking top 5 largest distances
outliers <- order(distances, decreasing=T)[1:5]
print(iris2[outliers,])
## Sepal.Length Sepal.Width Petal.Length Petal.Width
## 119 7.7 2.6 6.9 2.3
## 118 7.7 3.8 6.7 2.2
## 132 7.9 3.8 6.4 2.0
## 123 7.7 2.8 6.7 2.0
## 106 7.6 3.0 6.6 2.1
Below we are plotting the data with Sepal.Width on the y-axis and Sepal.Length on the x-axis.
# Plot clusters, colored by cluster assignment
plot(iris2[, c("Sepal.Length", "Sepal.Width")], pch = "o",
     col = kmeans.result$cluster, cex = 0.3)
# Plot cluster centers
points(kmeans.result$centers[, c("Sepal.Length", "Sepal.Width")],
       col = 1:3, pch = 8, cex = 1.5)
# Plot outliers
points(iris2[outliers, c("Sepal.Length", "Sepal.Width")], pch = "+", col = 4, cex = 1.5)

In the above figure, cluster centers are labeled with asterisks ("*") and outliers with plus signs ("+").
Question 5: Depth for Diamonds
Import the ‘diamonds’ dataset that is provided by the R ggplot package. Your aim is to create a scatter plot of price vs depth. How will you proceed to find the depth value where most diamonds are found? Show in chart.
We have to create a scatter plot of price vs depth. We will use ggplot() and then geom_point() to get what we need. But since there is a very high number of data points that would overlap, to make our visualisation a little clearer we will use alpha = 0.1, which changes the opacity of the points to 10%.
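The plotting chunk isn't shown in the rendered output; presumably (with ggplot2 already loaded from Question 3):

ggplot(diamonds, aes(x = depth, y = price)) +
  geom_point(alpha = 0.1)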
Still, we cannot draw any conclusions about the depth value where most diamonds are found. Even though we have changed the alpha to 0.1, when 10 points overlap they produce a completely dark point, and we cannot distinguish between 10 overlapping points and 100.
To deal with this issue, we will now use geom_bin2d() from the ggplot2 package to create a scatter plot color-coded by density. One may see this as similar to a heat map.
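Presumably:

ggplot(diamonds, aes(x = depth, y = price)) +
  geom_bin2d()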
We can see that the highest density is at around the depth value of 61. We can clearly see this with the help of a histogram as given below:
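The histogram chunk isn't shown; a version like the following would produce it (the binwidth is an assumption):

ggplot(diamonds, aes(x = depth)) +
  geom_histogram(binwidth = 0.1)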
The histogram verifies our conclusion from the scatter plot: the depth value where most diamonds are found is around 61, where the count is approximately 2500.
Question 6: State Names and Vowels
In this case, we will use the ‘USArrests’ dataset provided by R
- Abbreviate the names of states.
- Select the states that contain the letter ‘b’
- Count the frequency of the vowels in names and plot their frequency distribution plot.
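The loading chunk isn't shown in the rendered output; it is presumably just:

data("USArrests")
head(USArrests)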
## Murder Assault UrbanPop Rape
## Alabama 13.2 236 58 21.2
## Alaska 10.0 263 48 44.5
## Arizona 8.1 294 80 31.0
## Arkansas 8.8 190 50 19.5
## California 9.0 276 91 40.6
## Colorado 7.9 204 78 38.7
The USArrests dataset has been loaded successfully, and its first few rows are shown above via head().
We will now see the rownames of our dataset, and also store them in states_full
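Presumably:

states_full <- rownames(USArrests)
states_full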
## [1] "Alabama" "Alaska" "Arizona" "Arkansas"
## [5] "California" "Colorado" "Connecticut" "Delaware"
## [9] "Florida" "Georgia" "Hawaii" "Idaho"
## [13] "Illinois" "Indiana" "Iowa" "Kansas"
## [17] "Kentucky" "Louisiana" "Maine" "Maryland"
## [21] "Massachusetts" "Michigan" "Minnesota" "Mississippi"
## [25] "Missouri" "Montana" "Nebraska" "Nevada"
## [29] "New Hampshire" "New Jersey" "New Mexico" "New York"
## [33] "North Carolina" "North Dakota" "Ohio" "Oklahoma"
## [37] "Oregon" "Pennsylvania" "Rhode Island" "South Carolina"
## [41] "South Dakota" "Tennessee" "Texas" "Utah"
## [45] "Vermont" "Virginia" "Washington" "West Virginia"
## [49] "Wisconsin" "Wyoming"
These are complete state names, which we need to abbreviate. Luckily for us, R comes with a list of the state names stored in state.name and their abbreviations in state.abb. Our task is now only to map state.name to states_full, with the condition state.name == states_full, which we pass inside the which() function.
We then store the result in states_abbr.
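The chunk is presumably as below; match(states_full, state.name) would be more robust, but the which() form matches the description, since both vectors list the states in the same alphabetical order:

states_abbr <- state.abb[which(state.name == states_full)]
states_abbr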
## [1] "AL" "AK" "AZ" "AR" "CA" "CO" "CT" "DE" "FL" "GA" "HI" "ID" "IL" "IN" "IA"
## [16] "KS" "KY" "LA" "ME" "MD" "MA" "MI" "MN" "MS" "MO" "MT" "NE" "NV" "NH" "NJ"
## [31] "NM" "NY" "NC" "ND" "OH" "OK" "OR" "PA" "RI" "SC" "SD" "TN" "TX" "UT" "VT"
## [46] "VA" "WA" "WV" "WI" "WY"
Now, for subsections 2 and 3 of this question, it is ambiguous whether the full state names or the abbreviated state names are to be used. To deal with this ambiguity, both cases have been considered.
To check whether a pattern occurs in a string, we use the grepl() function, which returns TRUE if the pattern is present. This gives us a vector of logical values that we can use to subset the data.
The pattern we pass is [Bb], which checks whether B or b is present in the string.
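The hidden chunk is presumably:

full <- grepl("[Bb]", states_full)
full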
## [1] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [25] FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [49] FALSE FALSE
We now use full to subset the data and select the states that contain the letter ‘b’
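Presumably:

states_full[full]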
## [1] "Alabama" "Nebraska"
“Alabama” and “Nebraska” have a b in them.
To cover the other case for abbreviated state names, we follow the same procedure.
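Presumably:

abbr <- grepl("[Bb]", states_abbr)
states_abbr[abbr]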
## character(0)
We have no abbreviated state names with the letter b in them.
Now we have to count the frequency of the vowels in the state names and plot their frequency distribution. We will start with the case of the full state names.
We will use the tokenizers package to tokenize the state names into their constituent characters.
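The chunk is presumably as below; note that tokenize_characters() lowercases by default, which is why the output is lower case ("Massachusetts" is element 21 of states_full):

library(tokenizers)
tokenized_states_full <- tokenize_characters(states_full)
tokenized_states_full[[21]] # "Massachusetts"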
## [1] "m" "a" "s" "s" "a" "c" "h" "u" "s" "e" "t" "t" "s"
We now create a data frame of zero counts, which will keep a running record of our total vowel frequencies:
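This is the same initialisation that reappears in the loop further below:

freq_full <- data.frame(Var1 = c("a","e","i","o","u"), Freq = c(0,0,0,0,0))
freq_full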
## Var1 Freq
## 1 a 0
## 2 e 0
## 3 i 0
## 4 o 0
## 5 u 0
We will first calculate the number of vowels in the first state name, and then run a for loop to do the same for the others.
We create a temporary data frame with the frequency of each letter, using the table() function:
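Presumably:

temp <- as.data.frame(table(tokenized_states_full[[1]])) # letters of "Alabama"
temp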
## Var1 Freq
## 1 a 4
## 2 b 1
## 3 l 1
## 4 m 1
We now proceed to merge this into the frequency data frame with just the vowels, which we created earlier:
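Presumably (this mirrors the merge() call in the loop below):

freq_full <- merge(freq_full, temp, by.x = "Var1", by.y = "Var1", all.x = TRUE)
freq_full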
## Var1 Freq.x Freq.y
## 1 a 0 4
## 2 e 0 NA
## 3 i 0 NA
## 4 o 0 NA
## 5 u 0 NA
Only the letter a was common to the two data frames, so its count was carried over; the remaining vowels were filled with NA.
Now we want to add the Freq.x and Freq.y columns together, but we cannot add anything to NA. So, we will replace all NA with 0.
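Presumably:

freq_full[is.na(freq_full)] <- 0
freq_full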
## Var1 Freq.x Freq.y
## 1 a 0 4
## 2 e 0 0
## 3 i 0 0
## 4 o 0 0
## 5 u 0 0
Now we add the two columns together into a new column, Freq:
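Presumably:

freq_full$Freq <- freq_full$Freq.x + freq_full$Freq.y
freq_full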
## Var1 Freq.x Freq.y Freq
## 1 a 0 4 4
## 2 e 0 0 0
## 3 i 0 0 0
## 4 o 0 0 0
## 5 u 0 0 0
Now we no longer need the Freq.x and Freq.y columns, so we remove them to get our final result for the first state:
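Presumably:

freq_full <- freq_full[-(2:3)] # drop Freq.x and Freq.y
freq_full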
## Var1 Freq
## 1 a 4
## 2 e 0
## 3 i 0
## 4 o 0
## 5 u 0
Now we need to repeat this process for the other states, updating our running table freq_full each time. We run a for loop for the whole procedure below, re-initialising freq_full first so that Alabama is not counted twice:
freq_full <- data.frame(Var1 = c("a","e","i","o","u"), Freq = c(0,0,0,0,0))
for (n in 1:length(tokenized_states_full)) {
  temp <- as.data.frame(table(tokenized_states_full[[n]]))
  freq_full <- merge(freq_full, temp, by.x = "Var1", by.y = "Var1", all.x = TRUE)
  freq_full[is.na(freq_full)] <- 0
  freq_full$Freq <- freq_full$Freq.x + freq_full$Freq.y
  freq_full <- freq_full[-(2:3)]
}
colnames(freq_full) <- c("Vowels", "Frequency") # Renaming the columns
freq_full
## Vowels Frequency
## 1 a 61
## 2 e 28
## 3 i 44
## 4 o 36
## 5 u 8
The above frequency table shows the frequencies of the vowels in the full state names.
For visualising the frequency distribution, we will now plot a pie chart. A pie chart is just a bar chart in polar coordinates.
ggplot(freq_full, aes(x = "", y = Frequency, fill = Vowels)) +
  geom_bar(stat = "identity", width = 1, color = "white") +
  coord_polar("y", start = 0) + theme_void() +
  labs(title = "Frequency Distribution for Vowels in State Names",
       y = "Frequency", x = "") +
  scale_fill_brewer(palette = "Set1")

Now we repeat the whole procedure for the abbreviated state names.
tokenized_states_abbr <- tokenize_characters(states_abbr)

freq_abbr <- data.frame(Var1 = c("a","e","i","o","u"), Freq = c(0,0,0,0,0))
for (n in 1:length(tokenized_states_abbr)) {
  temp <- as.data.frame(table(tokenized_states_abbr[[n]]))
  freq_abbr <- merge(freq_abbr, temp, by.x = "Var1", by.y = "Var1", all.x = TRUE)
  freq_abbr[is.na(freq_abbr)] <- 0
  freq_abbr$Freq <- freq_abbr$Freq.x + freq_abbr$Freq.y
  freq_abbr <- freq_abbr[-(2:3)]
}
colnames(freq_abbr) <- c("Vowels", "Frequency")
freq_abbr
## Vowels Frequency
## 1 a 12
## 2 e 3
## 3 i 8
## 4 o 5
## 5 u 1
ggplot(freq_abbr, aes(x = "", y = Frequency, fill = Vowels)) +
  geom_bar(stat = "identity", width = 1, color = "white") +
  coord_polar("y", start = 0) + theme_void() +
  labs(title = "Frequency Distribution for Vowels in Abbreviated State Names",
       y = "Frequency", x = "") +
  scale_fill_brewer(palette = "Set2")

Question 7: Mystery Method
What is the value of fn(3)? Can you explain what is happening at each step?

mystery_method <- function(x) {
  function(z) Reduce(function(y, w) w(y), x, z)
}
fn <- mystery_method(c(function(x) x + 1, function(x) x * x))
First, we will run the given code and find the value of fn(3):
mystery_method <- function(x) {
  function(z) Reduce(function(y, w) w(y), x, z)
}
fn <- mystery_method(c(function(x) x + 1, function(x) x * x))
fn(3)
## [1] 16
We can see that the value of fn(3) is 16. We will need this value to validate our breaking down of the function in each step given below.
fn is the function returned by mystery_method(). In mystery_method() we pass c(function(x) x + 1, function(x) x * x) as x. Hence we can replace x with c(function(x) x + 1, function(x) x * x), create a new function fn2, run it, and validate our progress.
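The original chunk isn't shown; presumably it was as follows (fn2 is just fn with x substituted inline):

fn2 <- function(z) Reduce(function(y, w) w(y),
                          c(function(x) x + 1, function(x) x * x), z)
fn2(3)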
## [1] 16
We still get the same answer, 16, so we are on the right path. Many anonymous functions are being passed around, which makes the analysis hard to follow. So we will name these functions fnA, fnB and fnC and create a new function fn3 which is easier to analyse.
fnA <- function(y, w) w(y)
fnB <- function(x) x + 1
fnC <- function(x) x * x

fn3 <- function(z) Reduce(fnA, c(fnB, fnC), z)
fn3(3)
## [1] 16
Output is still 16, validating our progress.
Reduce() reduces a vector, x, to a single value by repeatedly calling a function, f, on two arguments at a time. It combines the first two elements with f, then combines the result of that call with the third element, and so on.
Now, the Reduce() call takes three arguments: fnA, c(fnB, fnC) and z. A three-argument Reduce() call initialises the accumulation with the third argument, z. So z is first combined with fnB via fnA(z, fnB): the inner function fnA takes a value and a function, and applies that function to the value, giving fnB(z). The accumulated value fnB(z) and the next element fnC are then passed to fnA in turn, giving fnC(fnB(z)).
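Written out explicitly (this chunk is an assumption; the original isn't shown):

fnA(fnA(3, fnB), fnC) # what Reduce(fnA, c(fnB, fnC), 3) evaluates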
## [1] 16
The result is still 16, validating our progress. Now let us take an even deeper look and solve our mystery_method().
The following is what is happening inside fnA at the final step:
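Presumably:

fnC(fnB(3)) # w(y) with y = fnB(3) and w = fnC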
## [1] 16
The following is the output of fnB(3):
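Presumably:

fnB(3) # 3 + 1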
## [1] 4
fnC then takes the output of fnB(3), i.e. 4:
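Presumably:

fnC(4) # 4 * 4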
## [1] 16
And this is how we arrive at the value 16: fn(3) = fnC(fnB(3)) = (3 + 1) * (3 + 1) = 16.
Created using R Markdown by Shreyas Khadse shreyaskhadse9976@gmail.com