A Small Project to Understand Basic Features in R

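This mock project uses a small data set of bakery sales. Each row records the date (Fecha), the item name (Nombre), the unit price (Cuesta), the quantity sold (Cantidad), and the money taken in (Dinero, which equals Cuesta times Cantidad).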

Look at the data in the form of a Table
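The report prints df but never shows how it is built. A minimal sketch that rebuilds it from the printed rows, assuming base R only, might look like the following. The values are copied from the table below; the use of data.frame() and the step that derives Dinero are assumptions about how the data could be constructed, not the original loading code. The later calls to glimpse(), skim_without_charts(), and flextable() additionally assume that dplyr, skimr, and flextable are loaded.

```r
# Rebuild the data frame from the rows printed in this report.
# stringsAsFactors = FALSE keeps Fecha and Nombre as character columns,
# matching the <chr> types reported by glimpse() and str().
df <- data.frame(
  Fecha = c("3/25", "3/25", "3/25", "3/26", "3/26", "3/27", "3/27", "3/27",
            "3/28", "3/28", "3/29", "3/29", "3/29", "3/30", "3/30",
            "3/31", "3/31", "3/31", "3/31", "3/31", "3/31"),
  Nombre = c("Cupcake", "Cookie", "Muffin", "Cupcake", "Pie", "Cupcake",
             "Cookie", "Muffin", "Cupcake", "Pie", "Cupcake", "Cookie",
             "Muffin", "Cupcake", "Pie", "Cupcake", "Cookie", "Muffin",
             "Pie", "Cupcake", "Cookie"),
  Cuesta = c(2L, 1L, 3L, 2L, 5L, 2L, 1L, 3L, 2L, 5L, 2L, 1L, 3L, 2L, 5L,
             2L, 1L, 3L, 5L, 2L, 1L),
  Cantidad = c(30L, 20L, 12L, 40L, 15L, 35L, 25L, 14L, 32L, 16L, 38L, 22L,
               13L, 36L, 17L, 39L, 24L, 15L, 18L, 41L, 26L),
  stringsAsFactors = FALSE
)

# In the printed table, Dinero is the unit price times the quantity sold.
df$Dinero <- df$Cuesta * df$Cantidad
```

Integer literals (the L suffix) are used so that Cuesta, Cantidad, and Dinero come out as int columns, as in the str() output.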

# Summarize key features of the data:
df
##    Fecha  Nombre Cuesta Cantidad Dinero
## 1   3/25 Cupcake      2       30     60
## 2   3/25  Cookie      1       20     20
## 3   3/25  Muffin      3       12     36
## 4   3/26 Cupcake      2       40     80
## 5   3/26     Pie      5       15     75
## 6   3/27 Cupcake      2       35     70
## 7   3/27  Cookie      1       25     25
## 8   3/27  Muffin      3       14     42
## 9   3/28 Cupcake      2       32     64
## 10  3/28     Pie      5       16     80
## 11  3/29 Cupcake      2       38     76
## 12  3/29  Cookie      1       22     22
## 13  3/29  Muffin      3       13     39
## 14  3/30 Cupcake      2       36     72
## 15  3/30     Pie      5       17     85
## 16  3/31 Cupcake      2       39     78
## 17  3/31  Cookie      1       24     24
## 18  3/31  Muffin      3       15     45
## 19  3/31     Pie      5       18     90
## 20  3/31 Cupcake      2       41     82
## 21  3/31  Cookie      1       26     26
glimpse(df)
## Rows: 21
## Columns: 5
## $ Fecha    <chr> "3/25", "3/25", "3/25", "3/26", "3/26", "3/27", "3/27", "3/27…
## $ Nombre   <chr> "Cupcake", "Cookie", "Muffin", "Cupcake", "Pie", "Cupcake", "…
## $ Cuesta   <int> 2, 1, 3, 2, 5, 2, 1, 3, 2, 5, 2, 1, 3, 2, 5, 2, 1, 3, 5, 2, 1
## $ Cantidad <int> 30, 20, 12, 40, 15, 35, 25, 14, 32, 16, 38, 22, 13, 36, 17, 3…
## $ Dinero   <int> 60, 20, 36, 80, 75, 70, 25, 42, 64, 80, 76, 22, 39, 72, 85, 7…
str(df)
## 'data.frame':    21 obs. of  5 variables:
##  $ Fecha   : chr  "3/25" "3/25" "3/25" "3/26" ...
##  $ Nombre  : chr  "Cupcake" "Cookie" "Muffin" "Cupcake" ...
##  $ Cuesta  : int  2 1 3 2 5 2 1 3 2 5 ...
##  $ Cantidad: int  30 20 12 40 15 35 25 14 32 16 ...
##  $ Dinero  : int  60 20 36 80 75 70 25 42 64 80 ...
summary(df)
##     Fecha              Nombre              Cuesta         Cantidad    
##  Length:21          Length:21          Min.   :1.000   Min.   :12.00  
##  Class :character   Class :character   1st Qu.:2.000   1st Qu.:16.00  
##  Mode  :character   Mode  :character   Median :2.000   Median :24.00  
##                                        Mean   :2.524   Mean   :25.14  
##                                        3rd Qu.:3.000   3rd Qu.:35.00  
##                                        Max.   :5.000   Max.   :41.00  
##      Dinero     
##  Min.   :20.00  
##  1st Qu.:36.00  
##  Median :64.00  
##  Mean   :56.71  
##  3rd Qu.:78.00  
##  Max.   :90.00
skim_without_charts(df)
Data summary
Name                     df
Number of rows           21
Number of columns        5
_______________________
Column type frequency:
  character              2
  numeric                3
________________________
Group variables          None

Variable type: character

skim_variable  n_missing  complete_rate  min  max  empty  n_unique  whitespace
Fecha                  0              1    4    4      0         7           0
Nombre                 0              1    3    7      0         4           0

Variable type: numeric

skim_variable  n_missing  complete_rate   mean     sd   p0  p25  p50  p75  p100
Cuesta                 0              1   2.52   1.40    1    2    2    3     5
Cantidad               0              1  25.14  10.01   12   16   24   35    41
Dinero                 0              1  56.71  24.43   20   36   64   78    90

Summarize Via Table

We can aggregate the data by item name: group the rows by Nombre with group_by(), then use summarise() to compute, for each item, the total revenue (Total_Price, the sum of Cuesta times Cantidad) and the total number of units sold (Total_Sold).

TP <- df %>% group_by(Nombre) %>% summarise(Total_Price = sum(Cuesta * Cantidad), Total_Sold = sum(Cantidad))
TP %>% flextable()

Nombre   Total_Price  Total_Sold
Cookie           117         117
Cupcake          582         291
Muffin           162          54
Pie              330          66
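The per-item totals can be cross-checked without dplyr. The sketch below is one possible base R equivalent, using tapply() in place of group_by() and summarise(). The names cantidad, cuesta, and sales are hypothetical helpers introduced here for illustration; the quantities and prices are copied from the printed data.

```r
# Quantities sold per item, taken from the printed data (in date order).
cantidad <- list(
  Cookie  = c(20, 25, 22, 24, 26),
  Cupcake = c(30, 40, 35, 32, 38, 36, 39, 41),
  Muffin  = c(12, 14, 13, 15),
  Pie     = c(15, 16, 17, 18)
)
# Unit price per item (the Cuesta column is constant within each item).
cuesta <- c(Cookie = 1, Cupcake = 2, Muffin = 3, Pie = 5)

# Long-format table with one row per sale, like the original df.
sales <- data.frame(
  Nombre   = rep(names(cantidad), lengths(cantidad)),
  Cantidad = unlist(cantidad, use.names = FALSE),
  stringsAsFactors = FALSE
)
sales$Cuesta <- unname(cuesta[sales$Nombre])

# tapply() applies sum() within each Nombre group, like
# group_by(Nombre) %>% summarise(...), and orders groups alphabetically.
Total_Price <- tapply(sales$Cuesta * sales$Cantidad, sales$Nombre, sum)
Total_Sold  <- tapply(sales$Cantidad, sales$Nombre, sum)

print(cbind(Total_Price, Total_Sold))
```

Both approaches should give the same four rows; tapply() returns named vectors rather than a tibble, which is enough for a quick check.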

Slide with Graphing
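The chart for this slide is not reproduced in the text. As a stand-in, here is a minimal sketch of one plausible pair of bar charts, assuming base R graphics and the per-item totals from the summary table. The choice of chart, the titles, and the variable names total_sold and total_price are assumptions, not the original plot.

```r
# Per-item totals, copied from the summary table.
total_sold  <- c(Cookie = 117, Cupcake = 291, Muffin = 54, Pie = 66)
total_price <- c(Cookie = 117, Cupcake = 582, Muffin = 162, Pie = 330)

# Draw the two charts side by side, then restore the previous layout.
op <- par(mfrow = c(1, 2))
barplot(total_sold,  main = "Units sold by item", ylab = "Cantidad")
barplot(total_price, main = "Revenue by item",    ylab = "Cuesta x Cantidad")
par(op)
```

Placing units and revenue next to each other makes it easy to see that Pies sell in modest numbers but rank second in revenue.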

Takeaways

Cupcakes are the most popular item at the bakery: 291 were sold this week, compared with 54 Muffins, the least popular item. Although Pies and Muffins sold in similar quantities (Cantidad of 66 and 54), Pies brought in about twice as much revenue (330 versus 162) because each Pie costs $5 while each Muffin costs $3.