1. Introduction

1.1 Dataset Selection

Objective: Introduce the “mtcars” dataset and explain the reason for its selection.

The “mtcars” dataset is a well-known built-in dataset in R that describes 32 car models. It was extracted from the 1974 Motor Trend US magazine and records fuel consumption in miles per gallon (mpg) together with 10 aspects of automobile design and performance, such as number of cylinders (cyl), displacement (disp), and horsepower (hp).

I chose the “mtcars” dataset because it is widely used in introductory data analysis and statistical modeling. It is well suited to examining how car features relate to fuel efficiency and performance, and its small size and real-world subject matter make it convenient for practicing data manipulation, visualization, and modeling in R.

# Load the mtcars dataset
data(mtcars)

# Display the first few rows of the dataset
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

2. Description of the Dataset

2.1 Variables

Objective: Describe the variables present in the “mtcars” dataset.

# Display the column (variable) names
names(mtcars)
##  [1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear"
## [11] "carb"

Interpretation of these results:

The “mpg” column represents fuel consumption in miles per (US) gallon (continuous variable).

The “cyl” column represents the number of cylinders (a count with three distinct values, treated here as a categorical variable).

The “disp” column represents engine displacement in cubic inches (continuous variable).

The “hp” column represents gross horsepower (continuous variable).

The “drat” column represents rear axle ratio (continuous variable).

The “wt” column represents weight in units of 1000 lbs (continuous variable).

The “qsec” column represents quarter mile time in seconds (continuous variable).

The “vs” column represents engine shape (0 = V-shaped, 1 = straight; categorical variable).

The “am” column represents transmission (0 = automatic, 1 = manual; categorical variable).

The “gear” column represents the number of forward gears (a count, treated here as a categorical variable).

The “carb” column represents the number of carburetors (a count, treated here as a categorical variable).
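
Every column in “mtcars” is stored as numeric, so R does not treat the categorical variables as categories until they are converted. A minimal sketch of one way to do that conversion is shown here; the names `mtcars_cat` and `cat_vars` are illustrative choices and not part of the original analysis.

```r
# Work on a copy so the built-in dataset stays unchanged
# (the name `mtcars_cat` is an illustrative choice)
mtcars_cat <- mtcars

# Columns that the description above treats as categorical
cat_vars <- c("cyl", "vs", "am", "gear", "carb")

# Convert each of those columns from numeric to factor
mtcars_cat[cat_vars] <- lapply(mtcars_cat[cat_vars], factor)

# Attach readable labels to the two binary variables
# (original codes: 0 is the first level, 1 is the second)
levels(mtcars_cat$vs) <- c("V-shaped", "straight")
levels(mtcars_cat$am) <- c("automatic", "manual")

# Inspect the resulting column types
str(mtcars_cat)
```

Keeping the conversion on a copy leaves the original numeric columns available for functions that expect numbers.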


2.2 Observations

Objective: Discuss the number of observations in the “mtcars” dataset.

# Check the number of rows/observations in the dataset
nrow(mtcars)
## [1] 32

Each row in the “mtcars” dataset represents a different car model, so the dataset contains 32 observations in total.
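
The row count can also be read together with the column count, and the car models themselves are stored as row names rather than as a column. A short sketch using base R functions only:

```r
# Number of rows (cars) and columns (variables) in one call
dim(mtcars)

# Car model names are stored as row names, not as a column
head(rownames(mtcars))

# Confirm that no model name is repeated, i.e. one row per car model
anyDuplicated(rownames(mtcars)) == 0
```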


2.3 Missing Values

Objective: Check whether the dataset contains any missing values.

# Check for missing values
any(is.na(mtcars))
## [1] FALSE

There are no missing values in the “mtcars” dataset.
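
A single TRUE/FALSE answer does not show where missing values would be if there were any. As a possible complement, the sketch below counts them per column and counts the fully observed rows:

```r
# Count missing values in each column separately
colSums(is.na(mtcars))

# Count the rows that have no missing value in any column
sum(complete.cases(mtcars))
```

For “mtcars” every per-column count is zero, which agrees with the result above.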


3. Conclusion

3.1 Summary Statistics for Numeric Variables

summary(mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")])
##       mpg             disp             hp             drat      
##  Min.   :10.40   Min.   : 71.1   Min.   : 52.0   Min.   :2.760  
##  1st Qu.:15.43   1st Qu.:120.8   1st Qu.: 96.5   1st Qu.:3.080  
##  Median :19.20   Median :196.3   Median :123.0   Median :3.695  
##  Mean   :20.09   Mean   :230.7   Mean   :146.7   Mean   :3.597  
##  3rd Qu.:22.80   3rd Qu.:326.0   3rd Qu.:180.0   3rd Qu.:3.920  
##  Max.   :33.90   Max.   :472.0   Max.   :335.0   Max.   :4.930  
##        wt             qsec      
##  Min.   :1.513   Min.   :14.50  
##  1st Qu.:2.581   1st Qu.:16.89  
##  Median :3.325   Median :17.71  
##  Mean   :3.217   Mean   :17.85  
##  3rd Qu.:3.610   3rd Qu.:18.90  
##  Max.   :5.424   Max.   :22.90

The summary statistics describe the central tendency and spread of each numeric variable through its minimum, first quartile, median, mean, third quartile, and maximum. For example, mpg ranges from 10.40 to 33.90 with a mean of 20.09 and a median of 19.20.
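
The output of summary() does not include the standard deviation or the interquartile range. If those measures of spread are wanted, one way to compute them is sketched below; the names `num_vars` and `spread` are illustrative.

```r
# The same numeric columns that were passed to summary() above
num_vars <- c("mpg", "disp", "hp", "drat", "wt", "qsec")

# Standard deviation and interquartile range for each numeric variable;
# the result is a matrix with one row per statistic and one column per variable
spread <- sapply(mtcars[num_vars], function(x) c(sd = sd(x), IQR = IQR(x)))

# Round for display
round(spread, 2)
```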


3.2 Frequency Tables for Categorical Variables

table(mtcars$cyl)
## 
##  4  6  8 
## 11  7 14
table(mtcars$vs)
## 
##  0  1 
## 18 14
table(mtcars$am)
## 
##  0  1 
## 19 13
table(mtcars$gear)
## 
##  3  4  5 
## 15 12  5
table(mtcars$carb)
## 
##  1  2  3  4  6  8 
##  7 10  3 10  1  1

The frequency tables display the number of observations in each category of the categorical variables. For example, 8-cylinder cars are the most common (14 of 32), and cars with automatic transmission (19) outnumber those with manual transmission (13).
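
Counts can be easier to compare as proportions, and two categorical variables can be tabulated against each other. The sketch below shows one way to do both with base R; the pairing of cyl with am and the name `cyl_by_am` are illustrative choices.

```r
# Share of cars in each cylinder class
prop.table(table(mtcars$cyl))

# Cross-tabulate number of cylinders against transmission type
cyl_by_am <- table(cyl = mtcars$cyl, am = mtcars$am)
cyl_by_am

# Add row and column totals to the cross-tabulation
addmargins(cyl_by_am)
```

The row totals of the cross-tabulation reproduce the counts from table(mtcars$cyl), and the column totals reproduce those from table(mtcars$am).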

We perform these descriptive analyses to better understand the characteristics of the “mtcars” dataset and the distribution of its variables.