ALY6000 Introduction To Analytics Using R
M2 Project Report on R
Northeastern University
Zeeshan Ahmad Ansari
Date: 09 April, 2024



Library

#The report utilizes a set of libraries for various data processing and visualization tasks.
library(tidyverse)
library(readxl)
library(kableExtra)
library(dplyr)
library(knitr)
library(readr)
library(RColorBrewer)

#DATA_SETS_USED_IN_THIS_REPORT 
M2Data = read_excel("M2Project_Data_2023.xlsx")


Introduction

Statistics is the science of conducting studies to collect, organize, summarize, analyze, and draw conclusions from data.

1. Descriptive and Inferential statistics importance

Descriptive statistics organize and summarize data, providing simple numerical and graphical insights into the main characteristics of a dataset. This is crucial for understanding patterns and variations within the data. On the other hand, inferential statistics go beyond the sample data and help draw conclusions about the larger population through probability and hypothesis testing. These statistical methods enable researchers to make confident predictions, test hypotheses, and make informed decisions across various fields. Together, descriptive and inferential statistics form a powerful combination for comprehensive data analysis, allowing researchers to gain insights and make evidence-based decisions.

Bluman, in his book “Elementary Statistics: A Step-by-Step Approach,” emphasizes the importance of both descriptive and inferential statistics in statistical analysis. He explains that descriptive statistics provide the foundation for understanding the basic characteristics of data, which is essential for any meaningful data analysis. Without descriptive statistics, it would be challenging to make sense of complex datasets and identify key patterns or trends.

Furthermore, Bluman highlights how inferential statistics enable researchers to go beyond describing the sample data and draw conclusions about the larger population. He emphasizes that inferential statistics help researchers make predictions, determine causality, and make informed decisions in different fields, like business, healthcare and more.

In summary, descriptive and inferential statistics are essential elements of statistical analysis. Descriptive statistics offer an initial overview and comprehension of the data, while inferential statistics enable researchers to draw broader conclusions and make meaningful inferences based on the sample data. The integration of both branches of statistics forms a comprehensive approach to data analysis, empowering researchers to gain valuable insights and make informed decisions supported by evidence.
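The distinction can be illustrated with a minimal sketch. It uses simulated values rather than the project dataset; the sample itself and the hypothesized population mean of 100 are assumptions made only for this example.

```r
# Simulated sample; values are illustrative and do not come from M2Data
set.seed(42)
sample_sales <- rnorm(50, mean = 105, sd = 20)

# Descriptive statistics: summarize the sample itself
desc <- c(mean   = mean(sample_sales),
          median = median(sample_sales),
          sd     = sd(sample_sales))
print(round(desc, 2))

# Inferential statistics: use the sample to test a claim about the population
# (here, whether the population mean differs from an assumed value of 100)
t_result <- t.test(sample_sales, mu = 100)
print(t_result$p.value)
print(t_result$conf.int)
```

The first part only describes the 50 observed values, while the t-test and its confidence interval make a statement about the population the sample was drawn from.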

Reference:

Bluman, A. G. (2010). Elementary statistics: A step by step approach (9th ed., pp. 2–4). McGraw-Hill.

2. Need for proper data presentation

Proper data presentation is essential in many facets of data analysis and communication. Effective data presentation leads to increased clarity and understanding of the information being given. Complex data can be simplified and made more accessible to a broader audience, including non-technical stakeholders, by using well-organized charts, graphs, and visualizations. This accessibility is critical for enabling stakeholders to make educated decisions and take appropriate actions based on a thorough comprehension of the data.

Effective data display also improves comprehension. It helps data analysts and decision-makers extract significant patterns, trends, and relationships from data, resulting in more accurate and informed decisions. In addition, well-designed visuals capture the audience's attention and increase engagement during presentations or reports, making the exchange of information more effective.

References:

  1. Heer, J., & Boy, J. (2016). Data Visualization and Communication: A Guide for Researchers and Practitioners. O’Reilly Media. ISBN: 978-149190-391-6.

  2. Cairo, A. (2013). The Wall Street Journal Guide to Information Graphics: The Dos and Don’ts of Presenting Data, Facts, and Figures. W. W. Norton & Company. ISBN: 978-039334-728-9.

3. Practical applications of R in analysis

R is a versatile and widely-used programming language in the field of data analysis. It finds practical applications in a variety of domains. In statistical analysis, R is instrumental in conducting hypothesis testing, regression analysis, and performing ANOVA. It helps researchers identify patterns, trends, and relationships within datasets. R’s extensive graphing capabilities make it a valuable tool for data visualization, enabling the creation of compelling visual representations of data to aid in better understanding and interpretation. Additionally, R plays a crucial role in machine learning applications. Its diverse set of machine learning libraries allows data scientists to build and evaluate predictive models for tasks such as classification, clustering, and time series forecasting. R’s open-source nature, active community, and continuous development make it a popular choice for data analysts and researchers seeking to gain valuable insights and make data-driven decisions across various industries and disciplines.
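As a brief sketch of the statistical tasks mentioned above, the example below fits a regression and a one-way ANOVA. It uses `mtcars`, a dataset bundled with base R, purely as a stand-in; the choice of variables is an assumption for illustration and is unrelated to the project data.

```r
# mtcars ships with base R and stands in for a real business dataset here

# Regression: how does fuel economy (mpg) change with car weight (wt)?
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))  # intercept and slope

# ANOVA: does mean mpg differ between cylinder groups?
anova_fit <- aov(mpg ~ factor(cyl), data = mtcars)
print(summary(anova_fit))
```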

Reference:

Kuhn, M., & Johnson, K. (2013). Applied Predictive Modeling. Springer. ISBN: 978-1461468486.


Analysis

TASK 1
Display the first and last 5 records of the dataset.

# Records of 1st 5
f_5f = head(M2Data, n = 5)

#Records of last 5
l_5l = tail(M2Data, n = 5)

#Combined data
c_d = rbind(f_5f, l_5l)

#Table creation for 1st & last 5 entries
knitr::kable(c_d,
             format = "html",
             align = "l",
             font = 5,
             caption = "1st_and_Last 5 Entries") %>%
             kable_styling(full_width = FALSE)
1st_and_Last 5 Entries
Region | Market | Company_Segment | Product_Category | Product_SubCategory | Price | Quantity | Sales | Profits | ShippingCost
Central US | USCA | Consumer | Technology | Phones | 221.98 | 2 | 443.96 | 62.15 | 40.77
Oceania | Asia Pacific | Corporate | Furniture | Chairs | 3709.40 | 9 | 33384.60 | -288.77 | 923.63
Oceania | Asia Pacific | Consumer | Technology | Phones | 5175.17 | 9 | 46576.53 | 919.97 | 915.49
Western Europe | Europe | Home Office | Technology | Phones | 2892.51 | 5 | 14462.55 | -96.54 | 910.16
Western Africa | Africa | Consumer | Technology | Copiers | 2832.96 | 8 | 22663.68 | 311.52 | 903.04
Eastern Asia | Asia Pacific | Consumer | Furniture | Tables | 2614.69 | 7 | 18302.83 | -821.96 | 203.26
Western US | USCA | Corporate | Office Supplies | Appliances | 69.48 | 1 | 69.48 | 20.84 | 12.04
Oceania | Asia Pacific | Consumer | Technology | Copiers | 636.78 | 2 | 1273.56 | 286.50 | 203.20
South America | LATAM | Corporate | Furniture | Bookcases | 2751.20 | 10 | 27512.00 | 110.00 | 203.13
Southeastern Asia | Asia Pacific | Corporate | Technology | Phones | 1587.00 | 3 | 4761.00 | -76.56 | 203.08

Observations to Task 1:

Based on the table displayed, we can draw some inferences regarding the sales and profits of products in different regions and market segments:

  1. Within these ten records, the “Technology” category generates the highest sales figures across several regions and market segments. Its “Phones” and “Copiers” subcategories record sales of $46,576.53 in the Asia Pacific market, $22,663.68 in Africa, and $14,462.55 in Europe.

  2. The table shows variations in profits among market segments. Within the “Technology” category, the “Consumer” segment is profitable in every record shown, for example $919.97 in Oceania and $311.52 in Western Africa. The “Corporate” and “Home Office” segments, however, show losses of $76.56 in Southeastern Asia and $96.54 in Western Europe.

  3. Oceania in the Asia Pacific region appears to have high sales figures, as evidenced by $46,576.53 in sales for the “Technology” category and $33,384.60 for the “Furniture” category.

  4. Although some product subcategories show high sales figures, the corresponding profits are not always high or even positive. For instance, “Phones” in the “Technology” category has $46,576.53 in sales but only $919.97 in profits in the Asia Pacific market. Similarly, “Tables” in the “Furniture” category has $18,302.83 in sales but records a loss of $821.96 in Eastern Asia.

  5. There are variations in sales and profits across different regions. For instance, Western Europe has $14,462.55 in sales for “Phones,” while Western Africa has $22,663.68 in sales for “Copiers.” However, the corresponding profits in Western Europe are negative at -$96.54, while Western Africa experiences profits of $311.52.
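The gap between sales and profits can be made concrete by computing a profit margin (profit as a percentage of sales). The sketch below retypes four rows from the table above into a small data frame; the name `task1_rows` and the column `Margin_Pct` are introduced here for illustration only.

```r
# Four rows copied from the Task 1 table (not the full dataset)
task1_rows <- data.frame(
  SubCategory = c("Phones", "Chairs", "Tables", "Copiers"),
  Region      = c("Oceania", "Oceania", "Eastern Asia", "Western Africa"),
  Sales       = c(46576.53, 33384.60, 18302.83, 22663.68),
  Profits     = c(919.97, -288.77, -821.96, 311.52)
)

# Profit margin: profit as a percentage of sales
task1_rows$Margin_Pct <- round(100 * task1_rows$Profits / task1_rows$Sales, 2)
print(task1_rows)
```

Even the most profitable of these records keeps only a small fraction of its sales as profit, which is why sales alone are a weak indicator of profitability.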

TASK 2
Category of Market V/s frequency

#Categories of Market with their freq
m_free = M2Data %>%
             group_by(Market) %>%
             summarise(Frequency = n())


knitr::kable(m_free,
             align = "l",
             format = "html",
             caption = "Category of Market V/s freq") %>%
       kable_styling(bootstrap_options = "striped",
                     font_size = 14,
                     full_width = TRUE)
Category of Market V/s freq
Market | Frequency
Africa | 54
Asia Pacific | 365
Europe | 248
LATAM | 133
USCA | 200

Task_2 Comments

  1. The “Asia Pacific” market accounts for the largest share of records, with 365 of 1,000 occurrences (36.5%), so a significant portion of the business transactions belong to this market. Its high frequency compared to the other markets suggests that it may be a crucial market for the company’s operations.

  2. The “USCA” (United States and Canada) market shows a considerable frequency of 200, indicating a notable presence of data points related to this region.

  3. The data suggests that the company has a wide geographic presence, with data points scattered across multiple regions, including Africa, Europe, and LATAM (Latin America).
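The market shares behind these comments can be derived directly from the frequency table. The sketch below retypes the five counts shown above; in the report itself the same calculation could be applied to `m_free`. The names `market_freq` and `Share_Pct` are introduced here for illustration.

```r
# Frequencies copied from the Task 2 table
market_freq <- data.frame(
  Market    = c("Africa", "Asia Pacific", "Europe", "LATAM", "USCA"),
  Frequency = c(54, 365, 248, 133, 200)
)

# Share of each market as a percentage of all records
market_freq$Share_Pct <- round(100 * market_freq$Frequency /
                                 sum(market_freq$Frequency), 1)

# Largest market first
print(market_freq[order(-market_freq$Frequency), ])
```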

TASK 3
Bar Graph showing Market frequency.

# Creating Horizontal Bar Plot

par(mai=c(1.5, 1, 1, 1.5))

# Sort the table once so that bar heights, names, and labels stay aligned
m_sorted = m_free[order(m_free$Frequency, decreasing = TRUE), ]

plot11=barplot(m_sorted$Frequency,
        names.arg = m_sorted$Market, 
        col = c( "yellow", "blue", "red", "aquamarine3", "maroon"),
        cex.names = 0.8,
        las = 1,
        space = 0.4,
        border = "black",
        cex.axis = 0.8,
        horiz = TRUE, 
        main = "Frequency of Markets", 
        xlab = "Frequency", 
        ylab = "")

# Print each frequency just inside the end of its bar
text(m_sorted$Frequency, 
     plot11, 
     m_sorted$Frequency, 
     cex=0.8, 
     pos=2)

Observations:

  1. From the above figure we can clearly state that Asia-Pacific has the largest market presence and might be a crucial market for the company’s operations.

  2. Africa has the smallest share, with 5.4% of records (54 of 1,000), while USCA accounts for 20% (200 records).

TASK 4
Pie chart of the number of products per category in Africa.

# Keep only the records for the African market
t4Africa <- dplyr::filter(M2Data, Market == "Africa")

# Count the records in each product category
category_f <- t4Africa %>%
  group_by(Product_Category) %>%
  summarise(Frequency = n())

# My_Fav_Color_palette
C_Color <- c("red", "green", "blue")


pieLabels <- paste(category_f$Product_Category, round(100 * category_f$Frequency / sum(category_f$Frequency), 1), "%")


par(mar = c(0.1, 0.12, 0.1, 0.12))

pchart <- pie(category_f$Frequency,
                 labels = pieLabels,
                 col = C_Color)

# Legend to the pie chart
legend("topright", legend = category_f$Product_Category, fill = C_Color)

# Add a title to the pie chart
title(main = "Product Category Frequencies in Africa", cex.main = 0.8, line = -1)


Observations:

The pie chart above does not look right, which suggests that I made a mistake somewhere in the code. I still need to figure out where things went wrong.
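One candidate cause, stated here as an untested assumption, is the near-zero margins set with `par(mar = ...)`, which leave little room for the slice labels and the title. The sketch below draws the same kind of chart with roomier margins. The category counts are hypothetical placeholders; in the report the values would come from `category_f`.

```r
# Hypothetical counts for illustration; real values come from category_f
category_demo <- data.frame(
  Product_Category = c("Furniture", "Office Supplies", "Technology"),
  Frequency        = c(10, 30, 14)
)

# Percentage labels, e.g. "Furniture (18.5%)"
pct <- round(100 * category_demo$Frequency / sum(category_demo$Frequency), 1)
pie_labels <- paste0(category_demo$Product_Category, " (", pct, "%)")

# Wider margins leave room for the labels and the title
op <- par(mar = c(1, 4, 3, 4))
pie(category_demo$Frequency,
    labels = pie_labels,
    col = c("red", "green", "blue"),
    main = "Product Category Frequencies in Africa (demo data)")
par(op)  # restore the previous graphics settings
```

Because the labels already name each category, a separate legend is not needed in this version.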

TASK 5
Recreating table given

# Creating the table as mentioned in the task
task5_t = table(t4Africa$Product_SubCategory)

# Plotting the table as a bar plot, as mentioned in the task
t5_bar = barplot(task5_t)

# Adding the count above each bar
text(x = t5_bar,
     y = task5_t,
     labels = task5_t,
     pos = 3,
     cex = 0.8)


Observations:

The default bar plot lacks a clear structure: it has no title, no axis labels, and no distinguishing colors, and the bars are not ordered by value. During a presentation, it is crucial that the audience understands the content and finds it visually engaging. In this form, the plot is not ready to be presented.

TASK 6
Recreating the above table

par(mai=c(1, 1.4, 1, 2))


# Data for African market

t4Africa <- dplyr::filter(M2Data, Market == "Africa")

# Table based on the count of occurrences of each unique Product_SubCategory

task5_t <- table(t4Africa$Product_SubCategory)

# Horizontal Bar PLOT

t5_bar <- barplot(task5_t, 
                  horiz = TRUE,
                  col = c("blue", "yellow", "pink", "green",  "orange", "red", 
                                "aquamarine3", "coral", "slategrey"),
                  
                  cex.axis = 0.8,
                  las = 1,
                  cex.names = 0.8,
                  space = 0.4,
                  border = "black",
                  main = "SubCategory Product Frequency in Africa", 
                  xlab = "Frequency", 
                  ylab = NULL)


text(y = t5_bar, 
     x = task5_t, 
     labels = task5_t, 
     cex = 0.8, 
     pos = 2, 
     offset = 0.5)

TASK 7
African Market mean sales

t4Africa <- dplyr::filter(M2Data, Market == "Africa")

# Using Tapply

subcat_mean_S <- tapply(t4Africa$Sales,   
                                     t4Africa$Product_SubCategory, 
                                     mean, 
                                     na.rm = TRUE)

# dFrame

M_Sales_Df <- data.frame(Product_SubCategory = names(subcat_mean_S),
                            Mean_Sales = subcat_mean_S)


# Data frame in descending order of meanSales
M_Sales_Df <- M_Sales_Df[order(-M_Sales_Df$Mean_Sales), ]

# Colors using RColorBrewer

num_colors <- nrow(M_Sales_Df)
colors <- brewer.pal(num_colors, "Set1")

# dot plot using dotchart()

dotchart(M_Sales_Df$Mean_Sales,
         labels = M_Sales_Df$Product_SubCategory,
         cex = 0.7,
         col = colors,
         pch = 19,
         lcolor = "gray", # Set line color for segment
         lty = "dashed",  # Set line type for segment
         pt.cex = 1.5 ,   # Set point size
         xlab = "Average_Sales",
         main = "African Market AVG SALES",
         xlim = c(0, max(M_Sales_Df$Mean_Sales, na.rm = TRUE) * 1.1),
         ylim = c(0.5, num_colors + 0.5))


Observations:

Copiers have the highest mean sales in the African market, which shows that this subcategory outperforms all others in terms of average sales. The average sales for Storage and Chairs are also high, implying that these subcategories have been performing well in the market.

The Accessories and Machines subcategories have the lowest average sales in the African market, making them the weakest performers by this measure. The remaining subcategories fall in the middle range, with average sales figures that reflect the market performance of each.

TASK 8
African Market Regional Sales

# Margins
par(mai = c(0.5, 1.4, 1, 1.2))

t4Africa <- dplyr::filter(M2Data, Market == "Africa")

# Use of SUM() in tapply()
total_sales_task13 <- tapply(t4Africa$Sales, t4Africa$Region, sum)

# Sort the total sales in descending order
sorted_total_sales <- sort(total_sales_task13, decreasing = TRUE)

sorted_regions <- names(sorted_total_sales)

# Horizontal Bar Plot (the returned bar midpoints are kept for the labels)
bar_mid <- barplot(sorted_total_sales, 
        horiz = TRUE,  # Set horiz to TRUE for horizontal bar plot
        names.arg = sorted_regions,
        col = c("blue", "green", "red", 
                      "aquamarine3", "slategrey"),
        main = "Total Regional Sales in the African Market",
        xlab = "Total Sales",
        ylab = NULL,
        las = 1,
        space = 0.15,
        border = "black",
        cex.names = 0.8)

# Text labels with values at the end of each bar,
# positioned at the actual bar midpoints rather than at 1, 2, 3, ...
text(x = sorted_total_sales + 1000,
     y = bar_mid,
     labels = round(sorted_total_sales),
     pos = 2,
     cex = 0.8, 
     offset = 0.5)


Observations:

Total sales in Eastern Africa and Western Africa are lower than in the other African regions, which indicates weaker performance. Central Africa has the highest total sales of any African region, making it the most successful in terms of revenue. Total sales in North Africa and Southern Africa are also substantial, which indicates that these regions performed well overall.

TASK 9
Average regional shipping cost in Africa

par(mai = c(1.1, 1.1, 1.1, 1.1))


avg_shipping <- t4Africa %>%
  group_by(Region) %>%
  summarize(Mean_Shipping_Cost = mean(ShippingCost, na.rm = TRUE))


avg_shipping <- avg_shipping[order(-avg_shipping$Mean_Shipping_Cost), ]

# Create a bar plot using barplot()
barplot_heights <- barplot(avg_shipping$Mean_Shipping_Cost,
                           names.arg = avg_shipping$Region,
                           col = c("blue", "green", "red", 
                                         "aquamarine3", "grey"),
                           main = "Average regional shipping cost in Africa",
                           xlab = NULL,
                           ylab = "AVG Shipping Cost",
                           cex.names = 0.7,
                           las = 2,
                           ylim = c(0, max(avg_shipping$Mean_Shipping_Cost) * 1.2))


text(x = barplot_heights,
     y = avg_shipping$Mean_Shipping_Cost + 2,
     labels = round(avg_shipping$Mean_Shipping_Cost, 2),
     pos = 3,
     cex = 0.8) 


Observations:
Based on the analysis, we can conclude that the average shipping cost for the African Market is approximately $354.39. Among the regions, Eastern Africa stands out with the highest average shipping cost of around $386.96, followed by Eastern and Western Africa, which have nearly the same average shipping cost, differing by approximately $3.23. Lastly, Northern and Southern Africa show the least variation in average shipping cost, differing by around $1.

TASK 10
Description of data types

1. Integer:

An integer is a type of numeric data in which the values are whole numbers (i.e. without decimals).

Examples include the number of students in a class, the number of cases of COVID-19, the number of births, and so on. These are discrete variables that never take fractional values.

There are two ways to declare a value as an integer in R, these are:

A. add letter L after the integer (no space).

# Example to show how to represent Integer datatype
b= 5L
typeof(b)
## [1] "integer"

B. use the function as.integer().

# Example to show how to represent Integer datatype
a = as.integer(5)
a
## [1] 5
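A short sketch may help show how the two approaches differ from a plain number. The variable names `x` and `y` are introduced here only for illustration.

```r
x <- 5    # a plain number is stored as a double
y <- 5L   # the L suffix stores it as an integer

typeof(x)  # "double"
typeof(y)  # "integer"

# as.integer() drops the fractional part (truncates toward zero);
# it does not round
as.integer(5.9)   # 5
as.integer(-5.9)  # -5

# Both ways of declaring an integer give the same result
identical(y, as.integer(x))  # TRUE
```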

References:

Dee Chiluiza. (2022, June 25). RPubs. https://rpubs.com/STEMResearch/data-types-in-r

2. Factor:

In R, categorical (nominal) and ordered categorical (ordinal) variables are referred to as factors.
Factors are important in R because they govern how data is evaluated and visually presented.

Categorical values are represented as integer vectors from 1 to k, where k is the count of distinct values in the nominal variable. These integers correspond to a separate internal vector containing the original character strings, which are converted into integers using the factor() function.
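The integer codes and the level labels described above can be inspected directly. The sketch below uses a small hand-made vector; the names `segment`, `segment_f`, and `priority` are introduced here for illustration.

```r
# A character vector with three distinct values
segment <- c("Consumer", "Corporate", "Home Office", "Consumer")

# factor() stores integer codes plus a vector of level labels
segment_f <- factor(segment)
levels(segment_f)      # "Consumer" "Corporate" "Home Office"
as.integer(segment_f)  # 1 2 3 1

# An ordered factor (ordinal variable) supports comparisons between levels
priority <- factor(c("Low", "High", "Medium"),
                   levels = c("Low", "Medium", "High"),
                   ordered = TRUE)
priority[1] < priority[2]  # TRUE, because Low < High
```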

References:

Dee Chiluiza. (2022, June 25). RPubs. https://rpubs.com/tskam/R4DSA-Hands-on_Ex01

3. Double and Numeric:

In R, real numbers (numbers that may have a decimal part) are stored as double-precision floating-point values. “Numeric” is the class R reports for such values, while “double” is their underlying storage type, so for decimal data the two terms are used interchangeably. By default, R stores any number typed without the L suffix as a double, even when it has no decimal part. Integers also count as numeric (is.numeric() returns TRUE for them), but they are stored as type integer rather than double.

References:

Dee Chiluiza. (2022, June 25). RPubs. https://rpubs.com/STEMResearch/data-types-in-r

# numeric series without decimals
num_data <- c(3, 7, 2)
num_data
## [1] 3 7 2
class(num_data)
## [1] "numeric"
# numeric series with decimals
num_data_dec <- c(3.4, 7.1, 2.9)
num_data_dec
## [1] 3.4 7.1 2.9
class(num_data_dec)
## [1] "numeric"
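To see how the class and the storage type relate, `class()` and `typeof()` can be compared side by side. This is a minimal sketch; the variable names `d` and `i` are introduced here for illustration.

```r
d <- 3.4  # decimal number
i <- 3L   # integer

class(d)   # "numeric"
typeof(d)  # "double"

class(i)   # "integer"
typeof(i)  # "integer"

# Integers are still treated as numeric, but they are not doubles
is.numeric(i)  # TRUE
is.double(i)   # FALSE
```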

References:

Soetewey, A. (n.d.). Data types in R. Stats and R. https://statsandr.com/blog/data-types-in-r/#numeric

TASK 11
Replication of the code

par(mfcol=c(2,1),
    mai=c(1, 1, 1, 1),
    mar=c(3, 2, 1, 2))


boxplot(M2Data$Profits, 
        horizontal = TRUE,
        notch = TRUE, #Notch for CI for Medians
        col = c("lightblue"),
        boxwex= 1, # Box Width
        whisklty = 1, # Whisker line
        main = "BoxPlot showing Profit",
        xlab = NULL,
        ylab = "Profits",
        ylim = c(-3500, 4500))


hist(M2Data$Profits,
     breaks = 30,
     xlab = "Profits",
     ylab = "Frequency",
     main = "Histogram showing Profits",
     col = "thistle1",
     border = "black")

TASK 12
Latin American Market Profits

# Margin setting 

par(plt = c(0.1, 0.9, 0.1, 0.8), las = 1)

# Data for Latin America (reused by the histogram and the following tasks)
t13_LATAM <- filter(M2Data, Market == "LATAM")

# BOX PLOT for TASK 12
boxplot(t13_LATAM$Profits,
        main = "Latin America's Profit shown in BOX PLOT",
        horizontal = TRUE,
        xlab = "Profits",
        col = "lightblue",
        notch = TRUE,  # Notch for Confidence Interval for Medians
        ylim = c(-2000, 1500),
        boxwex = 1,    # Box Width
        whisklty = 1,  # Whisker line
        pch = 20)


# Histogram of profits in Latin America

hist(t13_LATAM$Profits,
     breaks = seq(-2000, 1500, by = 50),
     col = "slategrey",
     main = "Latin America's Profit shown in Histogram",
     xlab = "Profits",
     ylab = "Frequency")

# Mean of profits in Latin America
mean_profit <- mean(t13_LATAM$Profits, na.rm = TRUE)

# Vertical_line_representing_the_mean
abline(v = mean_profit, col = "red", lty = "dashed", lwd = 2)

# LEGENDS_FOR_THE_MEAN_LINE
legend("topright", legend = paste("Mean:", round(mean_profit, 2)),
       col = "red", lty = "dashed", lwd = 2, bg = "white")


Observations:

  1. Profits are distributed fairly evenly around the mean, as shown by the histogram’s balanced, roughly symmetrical shape. This points to a stable market in which a sizable proportion of enterprises earn profits close to the average. The absence of extreme outliers suggests that most organizations perform consistently, which contributes to the symmetrical shape. Such an even profit distribution is a good sign, suggesting a robust and sustainable market in which a significant share of enterprises operate profitably and in line with the market average.

  2. The notch within the box plot marks the median profit together with its approximate confidence interval, while the vertical red dashed line in the histogram marks the mean profit. These visual features are key reference points for understanding profits in the Latin American market: they show where the center of the distribution lies and provide a baseline for evaluating the overall performance of firms in the region.
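The difference between the mean and the median reference points can be shown with a small sketch. The data are simulated, with two large losses added on purpose, so they do not represent the Latin American profits; `profits_demo` is a name introduced for illustration. `boxplot.stats()` returns the values that `boxplot()` draws, including the notch limits.

```r
# Simulated profits with two large losses added deliberately
set.seed(1)
profits_demo <- c(rnorm(100, mean = 50, sd = 20), -900, -1200)

# The mean is pulled down by the two losses; the median barely moves
mean(profits_demo)
median(profits_demo)

# boxplot.stats() exposes what the box plot draws
stats <- boxplot.stats(profits_demo)
stats$stats[3]  # the median line inside the box
stats$conf      # lower and upper limits of the notch around the median
```

When a distribution has a few extreme values, the mean line in a histogram and the notch in a box plot can therefore sit in noticeably different places.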

TASK 13
Latin America total sales by region

t_sales_task13 <- tapply(t13_LATAM$Sales, t13_LATAM$Region, sum)

# Dataframe
salesTable_df <- data.frame(Region = names(t_sales_task13), 
                            Total_Sales = t_sales_task13)


sales_table <- salesTable_df[order(-salesTable_df$Total_Sales), ]


knitr::kable(sales_table, 
             caption = "Total regional sales of L.America", 
             row.names = FALSE) %>%
  kableExtra::kable_styling(bootstrap_options = "striped", 
                            full_width = FALSE)
Total regional sales of L.America
Region | Total_Sales
Central America | 924226.2
South America | 457623.3
Caribbean | 196775.2


Observations:

According to the data in the table, Central America has the largest total sales among the Latin American regions, roughly double those of South America, which ranks second. The Caribbean has the lowest total sales of the three regions.
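The relative sizes of the three regions can be checked with a short calculation. The sketch below retypes the totals from the table above into a named vector; `latam_sales` is a name introduced for illustration.

```r
# Totals copied from the Task 13 table
latam_sales <- c("Central America" = 924226.2,
                 "South America"   = 457623.3,
                 "Caribbean"       = 196775.2)

# Share of each region in total Latin American sales, in percent
share_pct <- round(100 * latam_sales / sum(latam_sales), 1)
print(share_pct)

# How many times larger Central America is than South America
ratio <- latam_sales[["Central America"]] / latam_sales[["South America"]]
print(ratio)
```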

TASK 14
BoxPlot showing profits for regions in L.America

t13_LATAM <- filter(M2Data, Market == "LATAM")

par(mfcol = c(1, 1), mar = c(5, 4, 2, 1))


boxplot(Profits ~ Region, data = t13_LATAM,
        horizontal = FALSE,
        col = c("red", "lightgreen", "blue"), # Custom colors for the boxes
        border = "black", # Border color of the boxes
        notch = FALSE, # Set notch to FALSE to remove the notches
        main = "Profit Distribution in Latin America",
        xlab = NULL,
        ylab = "Profits",
        ylim=c(-2000, 1500),
        las = 1, # for better readability of the x-axis labels
        boxwex = 0.5) 


Observations:

According to the box plots, Central America has the lowest profits of the three regions. Profits in the Caribbean and South America, on the other hand, are nearly identical.


CONCLUSION

This project was quite informative and gave me valuable insights into data analysis with R. However, I must admit that it was time-consuming, and I found it difficult to finish within the time limit. The exercise covered a variety of topics, including practical applications of R, descriptive and inferential statistics, data visualization, R Markdown for building HTML reports, and working with various data sources.

Despite the time constraints, I see this exercise’s enormous learning potential. It exposed me to new concepts and techniques, and I believe that devoting more time to practice and exploration will help me improve my skills even more. I’m very enthusiastic about the possibilities for greater understanding and advancement in data analysis.

Throughout the exercise, I learned the value of presenting data correctly in order to communicate ideas successfully. I also acquired hands-on experience with ggplot2 (though I’m still struggling with it), a strong tool for making visually pleasing plots, and experimented with different data formats in R.

“Mastering data analysis and R programming requires dedication and consistent effort,” the professor remarked in the first lesson. I am enjoying the process of learning and appreciate each step toward becoming effective in data analysis. I am excited to expand my knowledge and expertise in this profession.


BIBLIOGRAPHY

  1. Bluman, A. G. (2010). Elementary statistics: A step by step approach (9th ed., pp. 2–4). McGraw-Hill.

  2. Heer, J., & Boy, J. (2016). Data Visualization and Communication: A Guide for Researchers and Practitioners. O’Reilly Media. ISBN: 978-149190-391-6.

  3. Cairo, A. (2013). The Wall Street Journal Guide to Information Graphics: The Dos and Don’ts of Presenting Data, Facts, and Figures. W. W. Norton & Company. ISBN: 978-039334-728-9.

  4. Kuhn, M., & Johnson, K. (2013). Applied Predictive Modeling. Springer. ISBN: 978-1461468486.

  5. Dee Chiluiza. (2022, June 25). RPubs. https://rpubs.com/STEMResearch/data-types-in-r

  6. Dee Chiluiza. (2022, June 25). RPubs. https://rpubs.com/tskam/R4DSA-Hands-on_Ex01

  7. Soetewey, A. (n.d.). Data types in R. Stats and R. https://statsandr.com/blog/data-types-in-r/#numeric


Appendix

This report is accompanied by an R Markdown file named Ansari_ALY6000Project2.Rmd.