1. Use the data file “mpg” to answer the following questions. The data contains a sample of cars manufactured from 1998-2008.
  1. What are the elements for this data set and how many observations does it include?
  1. Does this data represent a sample or a population?
  1. Characterize each variable as categorical or quantitative. If the variable is categorical, determine whether it is nominal or ordinal. If the variable is quantitative, specify whether it is discrete or continuous.
  1. Construct a percentage frequency table and bar plot for the variable “Class”. Are the classes represented equally? Which class is represented the most? Label the bar plot correctly with a main title and axis titles.

SUV is the most represented class, with 26% of the observations. The classes are not represented equally: with seven categories, an equal split would put about 14% of the observations in each class. Instead, the shares range from 2% (2seater) to 26% (SUV), with varying levels in between.

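| class      | Count_Class | Percent |
|------------|-------------|---------|
| suv        | 62          | 26      |
| compact    | 47          | 20      |
| midsize    | 41          | 18      |
| subcompact | 35          | 15      |
| pickup     | 33          | 14      |
| minivan    | 11          | 5       |
| 2seater    | 5           | 2       |
| Total      | 234         | 100     |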
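
The percentages in the table can be checked from the counts alone. The following is a minimal Python sketch, not the method prescribed by the assignment: the counts are copied from the table above, and the variable names (`counts`, `percent`, `equal_share`) are chosen here only for illustration.

```python
# Class counts copied from the frequency table above.
counts = {
    "suv": 62,
    "compact": 47,
    "midsize": 41,
    "subcompact": 35,
    "pickup": 33,
    "minivan": 11,
    "2seater": 5,
}

total = sum(counts.values())

# Percentage of observations in each class, rounded to whole percents.
percent = {name: round(100 * n / total) for name, n in counts.items()}

# Share each class would have if all classes were represented equally.
equal_share = 100 / len(counts)

for name, n in counts.items():
    print(f"{name:<12}{n:>5}{percent[name]:>5}")
print(f"{'Total':<12}{total:>5}{sum(percent.values()):>5}")
print(f"Equal share per class: {equal_share:.1f}%")
```

Comparing each class's percentage with the equal share (about 14%) shows directly which classes are over- or under-represented.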

  1. Construct a histogram for the variable HWY MPG. Group the data as you see fit so that the histogram provides a good representation of the distribution. Include a title and axis labels in the chart. Discuss the shape of the distribution in terms of symmetry and skew.

  1. The following three histograms describe the distribution for City MPG. Which of the three do you believe provides the best representation of the distribution? Explain your reasoning.

Figure B provides the best overall representation of the distribution. Figure C would also be acceptable. Figure A does not provide enough information.
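
When judging whether a histogram has too few or too many groups, a rule of thumb can give a starting point. The sketch below uses Sturges' rule, which is one common convention and not something the assignment specifies. The function name `sturges_bins` is hypothetical, and the sample size of 234 is taken from the frequency table total.

```python
import math


def sturges_bins(n: int) -> int:
    """Suggested number of histogram bins by Sturges' rule: ceil(log2(n)) + 1."""
    if n < 1:
        raise ValueError("n must be a positive sample size")
    return math.ceil(math.log2(n)) + 1


# Sample size taken from the 'Total' row of the frequency table.
n_observations = 234
suggested = sturges_bins(n_observations)
print(f"Suggested number of bins for n = {n_observations}: {suggested}")
```

The result is only a guide. A histogram with far fewer groups than suggested tends to hide the shape of the distribution, while one with far more tends to show noise, so the final grouping is still a judgment call.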