A company that manufactures riding mowers wants to identify the best sales prospects for an intensive sales campaign. In particular, the manufacturer is interested in classifying households as prospective owners or nonowners on the basis of Income (in $1000s) and Lot Size (in 1000 ft²). The marketing expert looked at a random sample of 24 households, included in the file RidingMowers.xls.
The file LaptopSalesJanuary2008.xls contains data for all sales of laptops at a computer chain in London in January 2008. This is a subset of the full dataset that includes data for the entire year.
Data file: Cereals.csv. "Data were collected on the nutritional information and consumer rating of 77 breakfast cereals. The consumer rating is a rating of cereal 'healthiness' for consumer information (not a rating by consumers). For each cereal, the data include 13 numerical variables. The information is based on a bowl of cereal rather than a serving size, because most people simply fill a cereal bowl (resulting in constant volume, but not weight). The description of the different variables is given in the following table."
For each of the datasets described above:
Create a variable dictionary (description, type of variable, number of null (missing) values, unit).
This part is provided in a separate Word document.
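The type and missing-value columns of such a dictionary can be derived directly from the data. The sketch below is a minimal illustration in base R on a made-up data frame: `toy` and its values are hypothetical, and `Ownership` is an assumed column name. Descriptions and units cannot be inferred from the data and still have to be entered by hand.

```r
# Hypothetical toy data standing in for one of the datasets (values are made up).
toy <- data.frame(
  Income    = c(60.0, 85.5, NA, 64.8),
  Lot_Size  = c(18.4, 16.8, 21.6, 20.8),
  Ownership = c("Owner", "Nonowner", "Owner", NA),
  stringsAsFactors = FALSE
)

# One row per variable: storage type and number of missing values.
var_dict <- data.frame(
  variable  = names(toy),
  type      = vapply(toy, function(col) class(col)[1], character(1)),
  n_missing = vapply(toy, function(col) sum(is.na(col)), integer(1)),
  row.names = NULL,
  stringsAsFactors = FALSE
)

# Description and unit are filled in manually.
var_dict$description <- NA_character_
var_dict$unit        <- NA_character_

print(var_dict)
```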
Summary statistics, LaptopSalesJanuary2008:
##                             mean      median       Std Dev        max       min
## Retail Price            487.9357     490.000    61.5305867     665.00     300.0
## Screen Size (Inches)    15.00000      15.000     0.0000000      15.00      15.0
## Battery Life (Hours)    5.138707       5.000     0.8152124       6.00       4.0
## RAM (GB)                1.547409       2.000     0.4977786       2.00       1.0
## Processor Speeds (GHz)  1.757482       2.000     0.2499037       2.00       1.5
## HD Size (GB)            150.3823     120.000   102.5037081     300.00      40.0
## OS X Customer           530867.8  531150.500  4414.4817427  549065.00  512253.0
## OS Y Customer           179886.9  181106.000  4647.0432672  199846.00  164886.0
## OS X Store              530747.8  529902.000  4159.4985492  541428.00  517917.0
## OS Y Store              179807.7  179641.000  3995.1250317  190628.00  168302.0
## CustomerStoreDistance   3679.882    3382.458  2068.9050186   19892.14       0.0
Summary statistics, RidingMowers:
##             mean median   Std Dev   max min
## Income   68.4375   64.8 19.793144 110.1  33
## Lot_Size 18.9500   19.0  2.428275  23.6  14
Summary statistics, Cereals:
##                 mean    median    Std Dev       max      min
## calories 107.0270270 110.00000 19.8438928 160.00000 50.00000
## protein    2.5135135   2.50000  1.0758016   6.00000  1.00000
## fat        1.0000000   1.00000  1.0068260   5.00000  0.00000
## sodium   162.3648649 180.00000 82.7697871 320.00000  0.00000
## fiber      2.1756757   2.00000  2.4233912  14.00000  0.00000
## carbo     14.7297297  14.50000  3.8916746  23.00000  5.00000
## sugars     7.1081081   7.00000  4.3591113  15.00000  0.00000
## potass    98.5135135  90.00000 70.8786815 330.00000 15.00000
## vitamins  29.0540541  25.00000 22.2943521 100.00000  0.00000
## shelf      2.2162162   2.00000  0.8320674   3.00000  1.00000
## weight     1.0308108   1.00000  0.1534155   1.50000  0.50000
## cups       0.8216216   0.75000  0.2357153   1.50000  0.25000
## rating    42.3717869  40.25309 14.0337125  93.70491 18.04285
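Tables in this layout (mean, median, standard deviation, max and min per numeric column) can be produced with a small helper. This is a sketch under assumptions, not necessarily how the tables above were generated: `summarise_numeric` is a hypothetical helper name, the data are made up, and missing values are dropped with `na.rm = TRUE`.

```r
# Hypothetical helper: one row per numeric column, one column per statistic.
summarise_numeric <- function(df) {
  num <- df[vapply(df, is.numeric, logical(1))]
  t(sapply(num, function(col) {
    c(mean      = mean(col, na.rm = TRUE),
      median    = median(col, na.rm = TRUE),
      `Std Dev` = sd(col, na.rm = TRUE),
      max       = max(col, na.rm = TRUE),
      min       = min(col, na.rm = TRUE))
  }))
}

# Made-up data; the character column is skipped automatically.
toy <- data.frame(
  x     = c(1, 2, 3, 4),
  y     = c(10, 20, 60, NA),
  label = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

stats_table <- summarise_numeric(toy)
print(stats_table)
```

Whether to drop missing values per column (as here) or to remove incomplete rows first with `na.omit()` is a design choice; the two approaches can give different results when missing values are spread across columns.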
Correlation matrix, LaptopSalesJanuary2008:
## Warning in cor(LaptopSalesJanuary2008): the standard deviation is zero
##                        Retail Price Screen Size (Inches)
## Retail Price            1.000000000                   NA
## Screen Size (Inches)             NA                    1
## Battery Life (Hours)    0.491384279                   NA
## RAM (GB)                0.288121734                   NA
## Processor Speeds (GHz)  0.151104411                   NA
## HD Size (GB)            0.486015284                   NA
## OS X Customer           0.003470388                   NA
## OS Y Customer          -0.005723961                   NA
## OS X Store             -0.005961426                   NA
## OS Y Store             -0.010147738                   NA
## CustomerStoreDistance   0.012849144                   NA
##                        Battery Life (Hours)     RAM (GB)
## Retail Price                    0.491384279  0.288121734
## Screen Size (Inches)                     NA           NA
## Battery Life (Hours)            1.000000000 -0.080518951
## RAM (GB)                       -0.080518951  1.000000000
## Processor Speeds (GHz)         -0.028400218 -0.013973477
## HD Size (GB)                   -0.165880556 -0.059340743
## OS X Customer                  -0.007483103  0.015278243
## OS Y Customer                  -0.001145769 -0.009150758
## OS X Store                     -0.013375189 -0.002301917
## OS Y Store                     -0.003097381 -0.011536512
## CustomerStoreDistance           0.002879215  0.003550472
##                        Processor Speeds (GHz)  HD Size (GB) OS X Customer
## Retail Price                      0.151104411  0.4860152838   0.003470388
## Screen Size (Inches)                       NA            NA            NA
## Battery Life (Hours)             -0.028400218 -0.1658805562  -0.007483103
## RAM (GB)                         -0.013973477 -0.0593407434   0.015278243
## Processor Speeds (GHz)            1.000000000 -0.0249308433  -0.006822880
## HD Size (GB)                     -0.024930843  1.0000000000  -0.001047220
## OS X Customer                    -0.006822880 -0.0010472202   1.000000000
## OS Y Customer                     0.015387950 -0.0003341631   0.127147965
## OS X Store                       -0.002632854 -0.0003881928   0.791713058
## OS Y Store                        0.003843190  0.0012811391   0.128052036
## CustomerStoreDistance            -0.011324659  0.0128208013  -0.085233169
##                        OS Y Customer    OS X Store   OS Y Store
## Retail Price           -0.0057239606 -0.0059614262 -0.010147738
## Screen Size (Inches)              NA            NA           NA
## Battery Life (Hours)   -0.0011457691 -0.0133751891 -0.003097381
## RAM (GB)               -0.0091507576 -0.0023019169 -0.011536512
## Processor Speeds (GHz)  0.0153879499 -0.0026328536  0.003843190
## HD Size (GB)           -0.0003341631 -0.0003881928  0.001281139
## OS X Customer           0.1271479651  0.7917130579  0.128052036
## OS Y Customer           1.0000000000  0.2295129854  0.739738605
## OS X Store              0.2295129854  1.0000000000  0.214007940
## OS Y Store              0.7397386048  0.2140079397  1.000000000
## CustomerStoreDistance  -0.2649425080 -0.1049821186 -0.073016796
##                        CustomerStoreDistance
## Retail Price                     0.012849144
## Screen Size (Inches)                      NA
## Battery Life (Hours)             0.002879215
## RAM (GB)                         0.003550472
## Processor Speeds (GHz)          -0.011324659
## HD Size (GB)                     0.012820801
## OS X Customer                   -0.085233169
## OS Y Customer                   -0.264942508
## OS X Store                      -0.104982119
## OS Y Store                      -0.073016796
## CustomerStoreDistance            1.000000000
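The warning and the `NA` entries arise because Screen Size (Inches) is constant in this sample (its standard deviation is 0), so its correlation with any other variable is undefined. One possible way to avoid this, sketched here on made-up data with hypothetical column names, is to drop constant columns before calling `cor()`.

```r
# Made-up data: `screen` is constant, like Screen Size (Inches) in the laptop data.
toy <- data.frame(
  price   = c(300, 450, 490, 665),
  screen  = c(15, 15, 15, 15),
  battery = c(4, 5, 5, 6)
)

# Flag columns whose standard deviation is exactly zero.
is_constant <- vapply(
  toy,
  function(col) isTRUE(sd(col, na.rm = TRUE) == 0),
  logical(1)
)

# Correlate only the columns that actually vary; no warning, no NA entries.
cor_mat <- cor(toy[!is_constant])
print(round(cor_mat, 3))
```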
Correlation matrix, RidingMowers:
##            Income Lot_Size
## Income   1.000000 0.172151
## Lot_Size 0.172151 1.000000
Correlation matrix, Cereals:
##             calories     protein           fat        sodium       fiber
## calories  1.00000000  0.03399166  0.5073732397  0.2962474981 -0.29521183
## protein   0.03399166  1.00000000  0.2023533963  0.0115588913  0.51400610
## fat       0.50737324  0.20235340  1.0000000000  0.0008219036  0.01403587
## sodium    0.29624750  0.01155889  0.0008219036  1.0000000000 -0.07073492
## fiber    -0.29521183  0.51400610  0.0140358654 -0.0707349230  1.00000000
## carbo     0.27060605 -0.03674326 -0.2849336855  0.3284091857 -0.37908370
## sugars    0.56912054 -0.28658397  0.2871524866  0.0370589612 -0.15094850
## potass   -0.07136125  0.57874284  0.1996367171 -0.0394380876  0.91150392
## vitamins  0.25984556  0.05479952 -0.0305139099  0.3315759640 -0.03871734
## shelf     0.08924278  0.19563468  0.2779797246 -0.1218968162  0.31378736
## weight    0.69645215  0.23067141  0.2217141647  0.3125335701  0.24629218
## cups      0.08919615 -0.24209861 -0.1575787041  0.1195841083 -0.51369716
## rating   -0.69378466  0.46716218 -0.4050501988 -0.3830123581  0.60341090
##                carbo       sugars       potass    vitamins       shelf
## calories  0.27060605  0.569120535 -0.071361247  0.25984556  0.08924278
## protein  -0.03674326 -0.286583967  0.578742837  0.05479952  0.19563468
## fat      -0.28493369  0.287152487  0.199636717 -0.03051391  0.27797972
## sodium    0.32840919  0.037058961 -0.039438088  0.33157596 -0.12189682
## fiber    -0.37908370 -0.150948502  0.911503921 -0.03871734  0.31378736
## carbo     1.00000000 -0.452069189 -0.365002934  0.25357897 -0.18899627
## sugars   -0.45206919  1.000000000  0.001413982  0.07295438  0.06144909
## potass   -0.36500293  0.001413982  1.000000000 -0.00263583  0.39458548
## vitamins  0.25357897  0.072954382 -0.002635830  1.00000000  0.28440479
## shelf    -0.18899627  0.061449088  0.394585485  0.28440479  1.00000000
## weight    0.14480528  0.460547135  0.420561534  0.32043480  0.19284304
## cups      0.35828371 -0.032436100 -0.501688318  0.13362965 -0.35103354
## rating    0.05594129 -0.755955089  0.415782443 -0.21448095  0.05103975
##              weight        cups      rating
## calories  0.6964521  0.08919615 -0.69378466
## protein   0.2306714 -0.24209861  0.46716218
## fat       0.2217142 -0.15757870 -0.40505020
## sodium    0.3125336  0.11958411 -0.38301236
## fiber     0.2462922 -0.51369716  0.60341090
## carbo     0.1448053  0.35828371  0.05594129
## sugars    0.4605471 -0.03243610 -0.75595509
## potass    0.4205615 -0.50168832  0.41578244
## vitamins  0.3204348  0.13362965 -0.21448095
## shelf     0.1928430 -0.35103354  0.05103975
## weight    1.0000000 -0.20171465 -0.30046104
## cups     -0.2017146  1.00000000 -0.22250440
## rating   -0.3004610 -0.22250440  1.00000000
For the cereals, we observe a strong positive correlation between potassium and fiber (r ≈ 0.91) and a strong negative correlation between calories and rating (r ≈ -0.69).
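Reading the strongest relationships off a large correlation matrix by eye is error-prone. A minimal sketch of one way to rank variable pairs by absolute correlation follows; the data are made up (only the column names echo the cereal variables) and `ranked` is a hypothetical name.

```r
# Made-up data; only the column names echo the cereal variables.
toy <- data.frame(
  fiber  = c(0, 1, 2, 5, 10),
  potass = c(20, 40, 60, 160, 300),
  sugars = c(12, 10, 9, 5, 1),
  rating = c(30, 35, 40, 60, 90)
)

cor_mat <- cor(toy)

# Keep each pair once by taking the upper triangle (diagonal excluded).
upper <- which(upper.tri(cor_mat), arr.ind = TRUE)
ranked <- data.frame(
  var1 = rownames(cor_mat)[upper[, "row"]],
  var2 = colnames(cor_mat)[upper[, "col"]],
  r    = cor_mat[upper],
  stringsAsFactors = FALSE
)

# Strongest relationships first, regardless of sign.
ranked <- ranked[order(-abs(ranked$r)), ]
print(ranked, row.names = FALSE)
```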
ggplot2.scatterplot(data=RidingMowers_, xName='Lotsize(in 1000ft2)', yName='income(in $1000s)', groupName="owner/nonowner", main="Mowers data")
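For comparison, a similar grouped scatterplot can be drawn with base R graphics alone. This is a sketch on made-up values: the grouping column name `Ownership` is an assumption, while `Income` and `Lot_Size` match the column names that appear in the output above.

```r
# Made-up households; `Ownership` is an assumed name for the grouping column.
toy <- data.frame(
  Income    = c(60.0, 85.5, 64.8, 51.0, 43.2, 75.0),
  Lot_Size  = c(18.4, 16.8, 21.6, 14.0, 20.4, 19.6),
  Ownership = factor(c("Owner", "Owner", "Owner",
                       "Nonowner", "Nonowner", "Nonowner"))
)

# One color per group; factor levels map to the integer codes 1 and 2.
group_cols <- as.integer(toy$Ownership)

plot(toy$Lot_Size, toy$Income,
     col = group_cols, pch = 19,
     xlab = "Lot size (1000 ft2)",
     ylab = "Income ($1000s)",
     main = "Riding mowers data")
legend("topleft",
       legend = levels(toy$Ownership),
       col = seq_along(levels(toy$Ownership)),
       pch = 19)
```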
## [1] 168302
## [1] 517917
## [1] 190628
## [1] 541428
OS Y Store has the lowest minimum (168302), while OS X Store has the highest maximum (541428). These are minimum and maximum values rather than means: OS Y Store ranges from 168302 to 190628, and OS X Store from 517917 to 541428.
The distribution of OS Y Store appears closer to normal, with relatively symmetric outliers, although it still shows a skew to the left. The distribution of OS X Store has outliers only on the right and is markedly skewed to the right.
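Visual impressions of skew and outliers from a boxplot can be cross-checked numerically. The sketch below uses a made-up vector, not the store coordinates; it lists the points a boxplot would flag via `boxplot.stats()` and computes a simple moment-based skewness, whose sign indicates the direction of the skew.

```r
# Made-up sample with one large value on the right.
x <- c(10, 11, 12, 12, 13, 13, 14, 30)

# Points beyond 1.5 * IQR from the hinges, i.e. what a boxplot draws as outliers.
outliers <- boxplot.stats(x)$out

# Moment-based skewness: positive means a longer right tail, negative a longer left tail.
skewness <- mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5

# A mean above the median is another quick hint of right skew.
mean_above_median <- mean(x) > median(x)

print(outliers)
print(skewness)
print(mean_above_median)
```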
Use the breakfast cereal data (cereals2).
The variables with the most variability include "shelf", "protein", "weight", "vitamins", and "sugars".
The variables with skewed distributions include "fiber", "fat", "potass", and "rating".
The distribution of the hot cereals shows a median skewed to the left, although the rest of the distribution appears relatively symmetric apart from the outliers. These outliers should be examined more carefully, since there are relatively many of them. The distribution of the cold cereals seems to have too few observations to be analyzed with a boxplot: it shows neither variability nor outliers.
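Because a boxplot says little about a group with very few observations, it helps to check group sizes before interpreting a side-by-side comparison. A minimal sketch on made-up data follows; the column name `type` with codes "C" (cold) and "H" (hot) is an assumption about how the cereal type is stored.

```r
# Made-up data; `type` with codes "C"/"H" is an assumed encoding of cold/hot.
toy <- data.frame(
  type     = c("C", "C", "C", "C", "C", "H"),
  calories = c(70, 110, 110, 120, 150, 100)
)

# Size, center and spread of each group.
n_by_type   <- tapply(toy$calories, toy$type, length)
med_by_type <- tapply(toy$calories, toy$type, median)
iqr_by_type <- tapply(toy$calories, toy$type, IQR)

group_summary <- data.frame(
  type   = names(n_by_type),
  n      = as.vector(n_by_type),
  median = as.vector(med_by_type),
  iqr    = as.vector(iqr_by_type)
)
print(group_summary)

# varwidth = TRUE scales box widths with group size, making small groups visible.
boxplot(calories ~ type, data = toy, varwidth = TRUE,
        xlab = "Cereal type", ylab = "Calories")
```

A group with a single observation collapses to a flat line (IQR of 0), which is what "no variability and no outliers" looks like in a boxplot.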
## [1] 3 1 2
Examining the boxplots, the distributions of consumer rating by shelf height do not differ significantly from one another, so shelf height does not appear to influence the consumer rating.
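A visual judgment that the groups "do not differ significantly" can be backed by per-group medians and a formal test. The sketch below applies a Kruskal-Wallis rank-sum test to made-up ratings; this is one possible choice of test, assumed here for illustration, and not something the analysis above reports having run.

```r
# Made-up ratings for three shelf heights.
toy <- data.frame(
  shelf  = factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3)),
  rating = c(40, 45, 50, 30, 35, 38, 42, 48, 55)
)

# Median rating per shelf.
med <- aggregate(rating ~ shelf, data = toy, FUN = median)
print(med)

# Kruskal-Wallis test: do the rating distributions differ across shelves?
kw <- kruskal.test(rating ~ shelf, data = toy)
print(kw$p.value)
```

A large p-value would be consistent with the conclusion that shelf height does not influence the rating, although with small groups the test has little power to detect a difference.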