Name:

  1. Use the data file “mpg” to answer the following questions. The data contain a sample of cars manufactured from 1999 to 2008.
  1. What are the elements for this data set and how many observations does it include?

The elements in this data set are the individual cars, each identified by its Car_ID label. The data set includes 234 observations, which is the sum of the counts across all classes.

  1. Does this data represent a sample or a population?

This is a sample of cars produced between 1999 and 2008.

  1. Characterize each variable as categorical or quantitative. If the variable is categorical, determine if it is nominal or ordinal. If the variable is quantitative, specify if it is discrete or continuous.
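| Variable | Type | Subtype |
|---|---|---|
| Year | Quantitative | Discrete |
| Manufacturer | Categorical | Nominal |
| Model | Categorical | Nominal |
| City MPG | Quantitative | Continuous |
| HWY MPG | Quantitative | Continuous |
| Class | Categorical | Nominal |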
  1. Construct a percentage frequency table and bar plot for the variable “Class”. Are the classes represented equally? Which class is represented the most? Label the bar plot correctly with a main title and axis titles.
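| Class | Count | Percent |
|---|---|---|
| minivan | 11 | 4.7% |
| suv | 62 | 26.5% |
| compact | 47 | 20.1% |
| 2seater | 5 | 2.1% |
| midsize | 41 | 17.5% |
| subcompact | 35 | 15.0% |
| pickup | 33 | 14.1% |
| Total | 234 | 100% |

The classes are not represented equally. The suv class is represented the most (62 cars, 26.5%), and the 2seater class the least (5 cars, 2.1%).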
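
The percentages in the table can be checked from the counts alone. The following is a minimal sketch in Python using only the counts listed above; the names `class_counts` and `percent_frequencies` are chosen here for illustration and are not part of the assignment.

```python
# Class counts copied from the frequency table above.
class_counts = {
    "minivan": 11,
    "suv": 62,
    "compact": 47,
    "2seater": 5,
    "midsize": 41,
    "subcompact": 35,
    "pickup": 33,
}


def percent_frequencies(counts):
    """Return each category's share of the total as a percentage (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


percents = percent_frequencies(class_counts)
total = sum(class_counts.values())
most_common = max(class_counts, key=class_counts.get)

# Print the table sorted from the most to the least represented class.
for name, n in sorted(class_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<12}{n:>5}{percents[name]:>7.1f}%")
print(f"{'total':<12}{total:>5}")
print("most represented:", most_common)
```

The same dictionary could be passed to a plotting library to draw the bar plot, with the class names on the horizontal axis, the counts or percentages on the vertical axis, and a main title added.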
  1. Construct a histogram for the variable HWY MPG. Create exactly six bins. Include a title and axis labels in the chart. Discuss the shape of the distribution relating to symmetry and skew.

The distribution looks relatively symmetric, with no pronounced skew in either direction.
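
To show what "exactly six bins" means in practice, here is a small sketch of equal-width binning in Python. The function name `equal_width_bins` and the values in `sample_hwy` are hypothetical and made up for illustration; they are not the mpg data, so the resulting shape says nothing about the real distribution.

```python
def equal_width_bins(values, n_bins=6):
    """Count values in n_bins equal-width bins spanning min(values) to max(values)."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("all values are identical, so the bin width would be zero")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a seventh one.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical highway MPG values for illustration only, not the mpg data.
sample_hwy = [12, 15, 17, 18, 20, 22, 24, 24, 25, 26, 26, 27, 29, 31, 36]
edges, counts = equal_width_bins(sample_hwy)

# Text histogram: one '#' per observation in each bin.
for left, right, c in zip(edges, edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {'#' * c}")
```

A roughly symmetric distribution has bars of similar height on both sides of the tallest bin, while a skewed one has a longer run of short bars on one side.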

  1. Calculate the average and median for HWY MPG. Use this information to discuss the symmetry in the distribution. If you were to choose one measure to best describe the center of the distribution, which statistic would you use?
| Average | Median |
|---|---|
| 23.44 | 24 |

The average (23.44) and median (24) for HWY MPG are close, which suggests little skew in the distribution. This is consistent with the histogram.
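
The reasoning behind comparing the two measures can be illustrated with a short Python sketch. The two lists below are hypothetical values chosen for illustration, not the mpg data: a single large value pulls the mean upward while the median stays put, so a wide gap between the two signals skew.

```python
from statistics import mean, median

# Hypothetical values for illustration only, not the mpg data.
symmetric = [20, 22, 24, 26, 28]
right_skewed = [20, 22, 24, 26, 48]  # same values, but the largest is an outlier

for label, values in [("symmetric", symmetric), ("right-skewed", right_skewed)]:
    print(f"{label}: mean={mean(values):.1f}, median={median(values)}")
```

In the symmetric list the mean and median coincide, while in the skewed list the mean moves toward the outlier. For HWY MPG the reported gap between the average and the median is small, about half a mile per gallon.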

  1. Calculate the standard deviation and the coefficient of variation (CV). Use these two measures to discuss how much variation there is in the data.
| Standard deviation | CV |
|---|---|
| 5.95 | 0.25 |
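
The CV is the standard deviation divided by the mean, so it can be checked from the two summary values reported for HWY MPG. A minimal sketch in Python, with variable names chosen here for illustration:

```python
# Summary statistics reported above for HWY MPG.
mean_hwy = 23.44
stdev_hwy = 5.95

# Coefficient of variation: the standard deviation as a fraction of the mean.
cv = stdev_hwy / mean_hwy
print(f"CV = {cv:.2f} ({cv:.0%} of the mean)")
```

Read this way, the standard deviation is about a quarter of the average highway mileage. Because the CV has no units, it can also be compared with the CV of another variable such as City MPG.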