Lab 3.1

Customer Dataset

Robin Chavez

Lab 3.1 Make a Plot

Summary of Data (Customers)

    Channel          Region          Fresh             Milk      
 Min.   :1.000   Min.   :1.000   Min.   :     3   Min.   :   55  
 1st Qu.:1.000   1st Qu.:2.000   1st Qu.:  3128   1st Qu.: 1533  
 Median :1.000   Median :3.000   Median :  8504   Median : 3627  
 Mean   :1.323   Mean   :2.543   Mean   : 12000   Mean   : 5796  
 3rd Qu.:2.000   3rd Qu.:3.000   3rd Qu.: 16934   3rd Qu.: 7190  
 Max.   :2.000   Max.   :3.000   Max.   :112151   Max.   :73498  
    Grocery          Frozen        Detergents_Paper    Delicassen     
 Min.   :    3   Min.   :   25.0   Min.   :    3.0   Min.   :    3.0  
 1st Qu.: 2153   1st Qu.:  742.2   1st Qu.:  256.8   1st Qu.:  408.2  
 Median : 4756   Median : 1526.0   Median :  816.5   Median :  965.5  
 Mean   : 7951   Mean   : 3071.9   Mean   : 2881.5   Mean   : 1524.9  
 3rd Qu.:10656   3rd Qu.: 3554.2   3rd Qu.: 3922.0   3rd Qu.: 1820.2  
 Max.   :92780   Max.   :60869.0   Max.   :40827.0   Max.   :47943.0  

First and Last 10 Rows

   Channel Region Fresh  Milk Grocery Frozen Detergents_Paper Delicassen
1        2      3 12669  9656    7561    214             2674       1338
2        2      3  7057  9810    9568   1762             3293       1776
3        2      3  6353  8808    7684   2405             3516       7844
4        1      3 13265  1196    4221   6404              507       1788
5        2      3 22615  5410    7198   3915             1777       5185
6        2      3  9413  8259    5126    666             1795       1451
7        2      3 12126  3199    6975    480             3140        545
8        2      3  7579  4956    9426   1669             3321       2566
9        1      3  5963  3648    6192    425             1716        750
10       2      3  6006 11093   18881   1159             7425       2098
    Channel Region Fresh  Milk Grocery Frozen Detergents_Paper Delicassen
431       1      3  3097  4230   16483    575              241       2080
432       1      3  8533  5506    5160  13486             1377       1498
433       1      3 21117  1162    4754    269             1328        395
434       1      3  1982  3218    1493   1541              356       1449
435       1      3 16731  3922    7994    688             2371        838
436       1      3 29703 12051   16027  13135              182       2204
437       1      3 39228  1431     764   4510               93       2346
438       2      3 14531 15488   30243    437            14841       1867
439       1      3 10290  1981    2232   1038              168       2125
440       1      3  2787  1698    2510     65              477         52
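
A listing like the one above is typically produced with `head()` and `tail()`. The sketch below rebuilds a small data frame from the 20 printed rows (three columns only, to keep it short) so that it runs without the original dataset; the name `customers_sample` is an assumption, not the object used in the lab.

```r
# Rebuild the 20 printed rows (rows 1-10 and 431-440), keeping three columns.
# `customers_sample` is a hypothetical name chosen for this illustration.
customers_sample <- data.frame(
  Channel = c(2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L,
              1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L),
  Fresh   = c(12669L, 7057L, 6353L, 13265L, 22615L, 9413L, 12126L, 7579L,
              5963L, 6006L, 3097L, 8533L, 21117L, 1982L, 16731L, 29703L,
              39228L, 14531L, 10290L, 2787L),
  Grocery = c(7561L, 9568L, 7684L, 4221L, 7198L, 5126L, 6975L, 9426L,
              6192L, 18881L, 16483L, 5160L, 4754L, 1493L, 7994L, 16027L,
              764L, 30243L, 2232L, 2510L),
  row.names = as.character(c(1:10, 431:440))
)

# First and last 10 rows; the original row labels are preserved.
first_rows <- head(customers_sample, 10)
last_rows  <- tail(customers_sample, 10)

print(first_rows)
print(last_rows)
```

With the full dataset loaded, the same two calls would be applied to the complete data frame instead of the sample.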

Structure of Data

'data.frame':   440 obs. of  8 variables:
 $ Channel         : int  2 2 2 1 2 2 2 2 1 2 ...
 $ Region          : int  3 3 3 3 3 3 3 3 3 3 ...
 $ Fresh           : int  12669 7057 6353 13265 22615 9413 12126 7579 5963 6006 ...
 $ Milk            : int  9656 9810 8808 1196 5410 8259 3199 4956 3648 11093 ...
 $ Grocery         : int  7561 9568 7684 4221 7198 5126 6975 9426 6192 18881 ...
 $ Frozen          : int  214 1762 2405 6404 3915 666 480 1669 425 1159 ...
 $ Detergents_Paper: int  2674 3293 3516 507 1777 1795 3140 3321 1716 7425 ...
 $ Delicassen      : int  1338 1776 7844 1788 5185 1451 545 2566 750 2098 ...
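
The structure above shows Channel and Region stored as integers even though they appear to be category codes. As a minimal sketch, assuming they should be treated as categories when plotting, they could be converted to factors. The data frame `sample_rows` is a hypothetical stand-in built from the first 10 printed rows.

```r
# Hypothetical stand-in for the first 10 rows of the dataset.
sample_rows <- data.frame(
  Channel = c(2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L),
  Region  = rep(3L, 10),
  Fresh   = c(12669L, 7057L, 6353L, 13265L, 22615L,
              9413L, 12126L, 7579L, 5963L, 6006L)
)

# Inspect the column types as read: all three are integers.
str(sample_rows)

# Assumption: Channel and Region are category codes, so treat them as factors.
sample_rows$Channel <- factor(sample_rows$Channel)
sample_rows$Region  <- factor(sample_rows$Region)

# The spending column stays numeric; the two code columns are now factors.
str(sample_rows)
```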

Summary Statistics

Fresh and Grocery Variables

     Fresh           Grocery     
 Min.   :     3   Min.   :    3  
 1st Qu.:  3128   1st Qu.: 2153  
 Median :  8504   Median : 4756  
 Mean   : 12000   Mean   : 7951  
 3rd Qu.: 16934   3rd Qu.:10656  
 Max.   :112151   Max.   :92780  
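
A two-column summary like this can be obtained by selecting the columns by name before calling `summary()`. The sketch below uses only the first 10 printed rows, under the hypothetical name `first_ten`, so its numbers differ from the full-dataset summary above.

```r
# Hypothetical stand-in: Fresh and Grocery for the first 10 printed rows.
first_ten <- data.frame(
  Fresh   = c(12669L, 7057L, 6353L, 13265L, 22615L,
              9413L, 12126L, 7579L, 5963L, 6006L),
  Grocery = c(7561L, 9568L, 7684L, 4221L, 7198L,
              5126L, 6975L, 9426L, 6192L, 18881L)
)

# Select the two variables of interest by name, then summarise them.
fresh_grocery_summary <- summary(first_ten[, c("Fresh", "Grocery")])
print(fresh_grocery_summary)
```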

Create a table

Channel

Table of Frequencies

 Channel  Frequency
       1        298
       2        142
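
One way to build a frequency table like this is `table()` followed by `as.data.frame()`. The sketch below counts the Channel values of the 20 printed rows only, so its counts are not the 298 and 142 reported for all 440 observations; the names `channel` and `freq` are assumptions.

```r
# Channel codes for the 20 printed rows (rows 1-10 and 431-440).
channel <- c(2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L,
             1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L)

# Count each code, then turn the counts into a two-column data frame.
freq <- as.data.frame(table(channel))
names(freq) <- c("Channel", "Frequency")

print(freq)
```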

Versions of Charts

Version 1

Version 2

Version 3

Version 4

Version 5

Final Version

Interpretation of Data

The chart shows a positive correlation between spending on fresh products and spending on grocery items: customers who spend more on fresh products also tend to spend more on groceries. Several factors could explain this pattern:

 1. Customers who cook more at home may be more likely to buy both fresh and grocery items.
 2. Customers with higher incomes may be able to afford more of both fresh and grocery items.
 3. Some stores sell both fresh and grocery items, so customers who shop there may be more likely to buy both types of products.

The chart therefore suggests possible links among lifestyle choices, income, and the shopping environment in how customers spend on fresh and grocery products.
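
A relationship read off a scatter plot can be checked numerically with `cor()`. The sketch below computes Pearson's correlation on the 20 printed rows only, so the result describes that small sample and not the full 440 observations; running the same call on the complete dataset would be the actual check. The vector names are assumptions.

```r
# Fresh and Grocery spending for the 20 printed rows (1-10 and 431-440).
fresh   <- c(12669, 7057, 6353, 13265, 22615, 9413, 12126, 7579, 5963, 6006,
             3097, 8533, 21117, 1982, 16731, 29703, 39228, 14531, 10290, 2787)
grocery <- c(7561, 9568, 7684, 4221, 7198, 5126, 6975, 9426, 6192, 18881,
             16483, 5160, 4754, 1493, 7994, 16027, 764, 30243, 2232, 2510)

# Pearson correlation: values near +1 or -1 indicate a strong linear
# relationship, values near 0 indicate a weak one.
r <- cor(fresh, grocery)
print(r)
```

A value close to zero on the full data would suggest that the visual impression deserves a second look, for example by checking whether a few large customers drive the pattern.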

Improvements

After sharing an earlier version of my chart with my peers, I made the following changes for the final version:
1. I removed geom_smooth() because, according to my peers, the graph was easier to read without the smoothed trend line.

2. I scaled the y-axis. This was new to me because I did not know that both the x and y axes could be scaled. The graph now gives a clearer picture of how Fresh and Grocery spending is distributed across the different channels.

3. I added the scales::dollar function, which makes the chart more meaningful by showing that the axes measure the dollars customers tend to spend on fresh and grocery products.

4. Finally, I added where the Customers dataset came from to the source note. It previously read only 'Source: Customers'; it now reads 'Source: Customers dataset from datasetICR package', which supports the credibility of the information and shows where it was retrieved from.
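
The changes listed above could be combined roughly as in the sketch below. It is not the lab's actual plotting code: the data frame name, the plot title, the axis labels, the colour mapping to Channel, and the use of plain continuous scales are all assumptions, and only the 20 printed rows are plotted. It assumes the ggplot2 and scales packages are installed.

```r
library(ggplot2)

# Hypothetical stand-in: the 20 printed rows (1-10 and 431-440).
customers_sample <- data.frame(
  Channel = c(2, 2, 2, 1, 2, 2, 2, 2, 1, 2,
              1, 1, 1, 1, 1, 1, 1, 2, 1, 1),
  Fresh   = c(12669, 7057, 6353, 13265, 22615, 9413, 12126, 7579, 5963, 6006,
              3097, 8533, 21117, 1982, 16731, 29703, 39228, 14531, 10290, 2787),
  Grocery = c(7561, 9568, 7684, 4221, 7198, 5126, 6975, 9426, 6192, 18881,
              16483, 5160, 4754, 1493, 7994, 16027, 764, 30243, 2232, 2510)
)

# Scatter plot with no geom_smooth() layer, dollar labels on both axes,
# and a caption naming the data source (caption text taken from the report).
p <- ggplot(customers_sample,
            aes(x = Fresh, y = Grocery, color = factor(Channel))) +
  geom_point() +
  scale_x_continuous(labels = scales::dollar) +
  scale_y_continuous(labels = scales::dollar) +
  labs(
    title   = "Fresh vs. Grocery spending",  # hypothetical title
    x       = "Fresh",
    y       = "Grocery",
    color   = "Channel",
    caption = "Source: Customers dataset from datasetICR package"
  )

print(p)
```

Passing `labels = scales::dollar` changes only how the axis ticks are labelled; the underlying values and their positions are unchanged.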