Sandra Meneses
17.01.2017
A small firm produces special coffee machines.
The number of machines produced varies from day to day. The daily output was recorded on 50 days. Based on this sample, statistical measures are calculated to describe the frequency distribution.
See YouTube for how these machines work.
You will learn to:
Raw data give a very detailed picture of the production capacity.
num_per_day <- c(7,6,6,10,6,6,7,4,7,4,5,7,9,6,5,5,9,7,5,9,5,5,4,8,5,8,
5,6,4,6,4,7,6,5,8,4,5,5,6,5,8,5,6,6,5,7,7,6,7,8)
machine <- as.data.frame(num_per_day)
But we need a managerial summary, e.g. to be able to forecast.
First we can look at a frequency table:
table(machine)
machine
4 5 6 7 8 9 10
6 14 12 9 5 3 1
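The absolute counts above can be turned into relative and cumulative frequencies, which are often easier to read in a summary. The following is a minimal sketch using base R; the variable names abs_freq, rel_freq and cum_freq are chosen here for illustration, and the sample is rebuilt from the frequency table so the snippet runs on its own.

```r
# Rebuild the 50 observations from the frequency table shown above
num_per_day <- rep(4:10, times = c(6, 14, 12, 9, 5, 3, 1))

abs_freq <- table(num_per_day)     # absolute frequencies h(x_i)
rel_freq <- prop.table(abs_freq)   # relative frequencies f(x_i) = h(x_i) / n
cum_freq <- cumsum(rel_freq)       # cumulative relative frequencies

# One row per distinct value, one column per kind of frequency
round(cbind(abs_freq, rel_freq, cum_freq), 2)
```

The cumulative column shows directly which share of days had at most a given number of machines.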
Now we can use the functions mean and median, or get all the summary statistics at once with the function summary.
summary(machine)
num_per_day
Min. : 4.00
1st Qu.: 5.00
Median : 6.00
Mean : 6.12
3rd Qu.: 7.00
Max. :10.00
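The individual functions mean and median can also be called directly on the vector. A short sketch, with the sample rebuilt from the frequency table so that it is self-contained (avg and med are illustrative names):

```r
# Same 50 observations, rebuilt from the frequency table
num_per_day <- rep(4:10, times = c(6, 14, 12, 9, 5, 3, 1))

avg <- mean(num_per_day)     # sum of all observations divided by n
med <- median(num_per_day)   # middle value of the sorted sample

c(mean = avg, median = med)
```

Both values agree with the Mean and Median rows reported by summary.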
R has no built-in function to calculate the mode, so we use one of the many possible ways to do it.
as.numeric(names(sort(-table(machine)))[1])
[1] 5
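The one-liner above returns only one value even if several values share the highest frequency. A possible alternative is a small helper that returns all modes; stat_mode is a hypothetical name chosen here, not a base R function.

```r
# stat_mode() is an illustrative helper, not part of base R.
# It returns every value that attains the maximum frequency.
stat_mode <- function(x) {
  freq <- table(x)
  as.numeric(names(freq)[freq == max(freq)])
}

# Same 50 observations, rebuilt from the frequency table
num_per_day <- rep(4:10, times = c(6, 14, 12, 9, 5, 3, 1))

stat_mode(num_per_day)        # single mode of the production data
stat_mode(c(1, 1, 2, 2, 3))   # bimodal toy example: both modes are returned
```

Note that the helper assumes numeric input, because the table names are converted back with as.numeric.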
For this example the plotly library is used:
library(plotly)
plot_ly(machine, x = ~num_per_day, width = 500, height = 400) %>% add_histogram()
The mode is the value with the highest frequency.
The mean of a set of observations is their average.
It is equal to the sum of all observations divided by the number of observations in the set.
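\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \]
– If the frequency table is given:
\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{m} x_i \, h(x_i) = \sum_{i=1}^{m} x_i \, f(x_i) \]
Here \(m\) is the number of distinct values \(x_i\), \(h(x_i)\) is the absolute frequency of \(x_i\), and \(f(x_i) = h(x_i)/n\) is its relative frequency.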
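As a quick check, the frequency-table formula can be applied to the coffee machine data. This is a minimal sketch; x, h, f, mean_abs and mean_rel are illustrative names, with h holding the absolute and f the relative frequencies.

```r
x <- 4:10                          # distinct values x_i
h <- c(6, 14, 12, 9, 5, 3, 1)      # absolute frequencies h(x_i)
n <- sum(h)                        # sample size (50 days)
f <- h / n                         # relative frequencies f(x_i)

mean_abs <- sum(x * h) / n         # (1/n) * sum of x_i * h(x_i)
mean_rel <- sum(x * f)             # sum of x_i * f(x_i)

c(mean_abs, mean_rel, weighted.mean(x, h))
```

All three expressions give the same mean as the raw data, since weighted.mean with the counts as weights is the same calculation.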
The position of a value within a group of numbers is described in terms of a percentile.
Definition of a percentile:
The Pth percentile of a group of numbers is the value below which lie P% (P percent) of the numbers in the group. The position of the Pth percentile is given by (n +1)*P/100, where n is the sample size.
wealth <- c(18,18,18,18,19,20,20,20,21,22,22,23,24,26,27,32,33,49,52,56)
forbes <- as.data.frame(wealth)
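Find the 50th and 80th percentiles of this set of the world's top 20 net worths! The function quantile returns the respective quantiles. Note that its default method (type 7) interpolates slightly differently from the (n + 1)*P/100 rule, so the results can differ in the decimals.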
quantile(forbes$wealth,probs = c(0.5,0.8))
50% 80%
22.0 32.2
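The position rule (n + 1)*P/100 from the definition can be worked through by hand. The sketch below uses a hypothetical helper, percentile_by_position, which interpolates linearly between the two neighbouring sorted values; in R this rule corresponds to quantile with type = 6, whereas the default is type 7.

```r
wealth <- c(18,18,18,18,19,20,20,20,21,22,22,23,24,26,27,32,33,49,52,56)

# percentile_by_position() is an illustrative helper, not a base R function.
percentile_by_position <- function(x, P) {
  x <- sort(x)
  n <- length(x)
  pos <- (n + 1) * P / 100           # position of the Pth percentile
  lower <- floor(pos)                # index of the value just below the position
  frac <- pos - lower                # fractional part used for interpolation
  lo <- min(max(lower, 1), n)        # keep both indices inside 1..n
  hi <- min(max(lower + 1, 1), n)
  x[lo] + frac * (x[hi] - x[lo])
}

p50 <- percentile_by_position(wealth, 50)   # position 10.5
p80 <- percentile_by_position(wealth, 80)   # position 16.8
c(p50, p80)

# The same rule via quantile(), selecting type 6 instead of the default type 7
quantile(wealth, probs = c(0.5, 0.8), type = 6)
```

For the 80th percentile the position is 16.8, so the result lies 0.8 of the way between the 16th value (32) and the 17th value (33), i.e. 32.8, compared with 32.2 from the default method.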
A boxplot is shown to see the distribution of the data.
plot_ly(forbes, x = ~wealth, width = 400, height = 150) %>% add_boxplot()
Explain how to calculate the mode, median and mean based on raw data.
The following table shows a tabular comparison of scales of measurements and measures of central tendency. Fill in the details and explain your decision in detail!
| | Nominal | Ordinal | Metric |
|---|---|---|---|
| Mode | | | |
| Median | | | |
| Mean | | | |
Define the term percentile/quantile and quartile.