Problem 3: Cotton Content Analysis

Data

The percentage of cotton in material used to manufacture men’s shirts is given below:

# Data input (entered manually from the provided Excel file)
cotton_content <- c(34.2, 37.8, 33.6, 32.6, 33.8, 35.8, 34.7, 34.6, 33.1, 36.6, 34.7, 33.1, 34.2, 37.6, 33.6, 33.6, 34.5, 35.4, 35.5, 34.6, 33.4, 37.3, 32.5, 34.1, 35.6, 34.6, 35.4, 35.9, 34.7, 34.6, 34.1, 34.7, 36.3, 33.8, 36.2, 34.7, 34.6, 35.5, 35.1, 35.7, 35.1, 37.1, 36.8, 33.6, 35.2, 32.8, 36.8, 36.8, 34.7, 34.0, 35.1, 32.9, 35.5, 32.1, 37.9, 34.3, 33.6, 34.1, 35.3, 33.5, 34.9, 34.5, 36.4, 32.7)

(a) Compute the Sample Mean, Variance, and Median

We will now calculate the sample mean, variance, and median of the cotton content data.

# Sample mean
mean_cotton <- mean(cotton_content)

# Sample variance
var_cotton <- var(cotton_content)

# Median
median_cotton <- median(cotton_content)

Printing the Results

# Output results
mean_cotton
## [1] 34.81406
var_cotton
## [1] 1.874878
median_cotton
## [1] 34.7
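
As a cross-check on the built-in functions, the same three statistics can be computed from their definitions. This is only a sketch: the names n, x_bar, s2, sorted_cotton, and med are illustrative and not part of the original solution, and the data vector is repeated so the chunk runs on its own.

```r
# Same data as above, repeated so this chunk runs on its own
cotton_content <- c(
  34.2, 37.8, 33.6, 32.6, 33.8, 35.8, 34.7, 34.6, 33.1, 36.6, 34.7, 33.1, 34.2, 37.6, 33.6, 33.6,
  34.5, 35.4, 35.5, 34.6, 33.4, 37.3, 32.5, 34.1, 35.6, 34.6, 35.4, 35.9, 34.7, 34.6, 34.1, 34.7,
  36.3, 33.8, 36.2, 34.7, 34.6, 35.5, 35.1, 35.7, 35.1, 37.1, 36.8, 33.6, 35.2, 32.8, 36.8, 36.8,
  34.7, 34.0, 35.1, 32.9, 35.5, 32.1, 37.9, 34.3, 33.6, 34.1, 35.3, 33.5, 34.9, 34.5, 36.4, 32.7
)

n <- length(cotton_content)                       # sample size
x_bar <- sum(cotton_content) / n                  # sample mean
s2 <- sum((cotton_content - x_bar)^2) / (n - 1)   # sample variance, n - 1 divisor

# With an even n, the median is the average of the two middle ordered values
sorted_cotton <- sort(cotton_content)
med <- (sorted_cotton[n / 2] + sorted_cotton[n / 2 + 1]) / 2

# Each comparison should print TRUE
all.equal(x_bar, mean(cotton_content))
all.equal(s2, var(cotton_content))
all.equal(med, median(cotton_content))
```

Note that var() uses the n - 1 divisor, so it returns the sample variance rather than the population variance.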

(b) Construct a Stem-and-Leaf Display

The stem() function creates a stem-and-leaf display of the cotton content data. Each stem is the whole-number percentage and each leaf is the tenths digit.

# Stem-and-leaf display
stem(cotton_content)
## 
##   The decimal point is at the |
## 
##   32 | 156789
##   33 | 11456666688
##   34 | 011122355666667777779
##   35 | 11123445556789
##   36 | 2346888
##   37 | 13689

(c) Construct a Histogram for the Cotton Content

To create a histogram, we can use the hist() function in R:

hist(cotton_content, main = "Histogram of Cotton Content", 
     xlab = "Cotton Percentage", 
     col = "red", 
     border = "black")

The histogram is unimodal, peaking between 34% and 35%, with a slightly longer tail to the right; the mean (34.81) sits just above the median (34.7), which is consistent with mild right skew. About 72% of the values (46 of 64) fall between 33% and 36%, with fewer values at the higher end.
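
To back up a reading of the plot with numbers, the bin counts can be tabulated without drawing anything. The sketch below assumes one-percent bins with edges at 32, 33, ..., 38 (set explicitly here, so they may not match the bins R chooses for the plot above); the name h is illustrative, and the data vector is repeated so the chunk runs on its own.

```r
# Same data as above, repeated so this chunk runs on its own
cotton_content <- c(
  34.2, 37.8, 33.6, 32.6, 33.8, 35.8, 34.7, 34.6, 33.1, 36.6, 34.7, 33.1, 34.2, 37.6, 33.6, 33.6,
  34.5, 35.4, 35.5, 34.6, 33.4, 37.3, 32.5, 34.1, 35.6, 34.6, 35.4, 35.9, 34.7, 34.6, 34.1, 34.7,
  36.3, 33.8, 36.2, 34.7, 34.6, 35.5, 35.1, 35.7, 35.1, 37.1, 36.8, 33.6, 35.2, 32.8, 36.8, 36.8,
  34.7, 34.0, 35.1, 32.9, 35.5, 32.1, 37.9, 34.3, 33.6, 34.1, 35.3, 33.5, 34.9, 34.5, 36.4, 32.7
)

# Count observations per bin; intervals are right-closed, e.g. (33, 34]
h <- hist(cotton_content, breaks = 32:38, plot = FALSE)
data.frame(lower = head(h$breaks, -1), upper = h$breaks[-1], count = h$counts)

# A positive mean-minus-median difference is consistent with mild right skew
mean(cotton_content) - median(cotton_content)
```

With these bin edges the counts are 6, 12, 20, 14, 7, and 5, so the 34 to 35% bin is the tallest and the counts fall off more slowly on the right than on the left.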

(d) Construct a Box Plot of the Data and Comment on the Information

To create a box plot, we can use the boxplot() function:

boxplot(cotton_content, main = "Box Plot of Cotton Content", 
        ylab = "Cotton Percentage", 
        col = "lightgreen", 
        border = "black")

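The median is 34.7, and the box spans from the first quartile (approximately 33.8) to the third quartile (approximately 35.6), so the central 50% of the data lies in this range and the interquartile range (IQR) is about 1.7. The whiskers reach the minimum (32.1) and the maximum (37.9), and no points are flagged as outliers. The upper whisker is somewhat longer than the lower one, which agrees with the mild right skew seen in the histogram.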
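
The values read off the box plot can be checked numerically. A minimal sketch, assuming R's default quantile definition (type 7) and the default whisker rule of 1.5 times the box length; the names q and out are illustrative, and the data vector is repeated so the chunk runs on its own.

```r
# Same data as above, repeated so this chunk runs on its own
cotton_content <- c(
  34.2, 37.8, 33.6, 32.6, 33.8, 35.8, 34.7, 34.6, 33.1, 36.6, 34.7, 33.1, 34.2, 37.6, 33.6, 33.6,
  34.5, 35.4, 35.5, 34.6, 33.4, 37.3, 32.5, 34.1, 35.6, 34.6, 35.4, 35.9, 34.7, 34.6, 34.1, 34.7,
  36.3, 33.8, 36.2, 34.7, 34.6, 35.5, 35.1, 35.7, 35.1, 37.1, 36.8, 33.6, 35.2, 32.8, 36.8, 36.8,
  34.7, 34.0, 35.1, 32.9, 35.5, 32.1, 37.9, 34.3, 33.6, 34.1, 35.3, 33.5, 34.9, 34.5, 36.4, 32.7
)

# Quartiles under R's default quantile definition (type 7)
q <- quantile(cotton_content, probs = c(0.25, 0.50, 0.75))
q

# Interquartile range, Q3 - Q1
IQR(cotton_content)

# Tukey five-number summary: min, lower hinge, median, upper hinge, max
# (boxplot() draws the box at the hinges)
fivenum(cotton_content)

# Points beyond the whiskers; an empty vector means no outliers
out <- boxplot.stats(cotton_content)$out
out
```

The hinges used by boxplot() and the quartiles from quantile() are computed slightly differently, so the upper edge of the box (35.55) differs a little from the third quartile (35.525); the lower values agree at 33.8.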