Name:
- Use the data file “mpg” to answer the following questions. The data
contains a sample of cars manufactured from 1999-2008.
- What are the elements for this data set and how many observations
does it include?
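The elements in this data set are the individual cars, each identified
by its Car_ID label. The data set includes 234 observations, which is
the total of the class counts in the frequency table.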
- Does this data represent a sample or a population?
This is a sample of cars produced between 1999 and 2008.
- Characterize each variable as categorical or quantitative. If the
variable is categorical, determine if it is nominal or ordinal. If the
variable is quantitative, specify if it is discrete or continuous.
| Variable | Type | Scale |
| --- | --- | --- |
| Year | Quantitative | Discrete |
| Manufacturer | Categorical | Nominal |
| Model | Categorical | Nominal |
| City MPG | Quantitative | Continuous |
| HWY MPG | Quantitative | Continuous |
| Class | Categorical | Nominal |
- Construct a percentage frequency table and bar plot for the variable
“Class”. Are the classes represented equally? Which class is represented
the most? Label the bar plot correctly with a main title and axis
titles.
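| Class | Count | Percent |
| --- | --- | --- |
| minivan | 11 | 4.7% |
| suv | 62 | 26.5% |
| compact | 47 | 20.1% |
| 2seater | 5 | 2.1% |
| midsize | 41 | 17.5% |
| subcompact | 35 | 15.0% |
| pickup | 33 | 14.1% |

The classes are not represented equally. The suv class is represented
the most (26.5%), and the 2seater class the least (2.1%).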
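As a cross-check on the percentages, the frequency table can be recomputed from the counts. This is a minimal sketch in plain Python: the counts are copied from the table, while the helper name `percent_frequency` is an assumption made for illustration and is not part of the assignment.

```python
# Class counts copied from the frequency table in this worksheet.
counts = {
    "minivan": 11,
    "suv": 62,
    "compact": 47,
    "2seater": 5,
    "midsize": 41,
    "subcompact": 35,
    "pickup": 33,
}


def percent_frequency(counts):
    """Return {category: percent of total}, rounded to one decimal place."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


percents = percent_frequency(counts)
total = sum(counts.values())
most_common = max(counts, key=counts.get)

# Print the table sorted from the most to the least frequent class.
for name, n in sorted(counts.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<12}{n:>5}{percents[name]:>7.1f}%")
print(f"total observations: {total}; most common class: {most_common}")
```

The same dictionary could feed a bar plot, for example with matplotlib's `plt.bar`, `plt.title`, `plt.xlabel`, and `plt.ylabel`, if that library is available.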
- Construct a histogram for the variable HWY MPG. Create exactly six
bins. Include a title and axis labels in the chart. Discuss the shape of
the distribution relating to symmetry and skew.
The distribution looks relatively symmetric.
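To make the "exactly six bins" requirement concrete, the sketch below counts values in six equal-width bins between the minimum and the maximum. The function name `six_bin_histogram` and the sample values are hypothetical; the values are made up for illustration and are not taken from the mpg file.

```python
def six_bin_histogram(values, n_bins=6):
    """Count values in n_bins equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; bin width would be zero")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum sits on the final edge, so clamp it into the last bin.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical highway MPG readings, for illustration only (NOT from the mpg file).
hwy_sample = [12, 15, 17, 18, 20, 22, 23, 24, 24, 25, 26, 26, 27, 28, 29, 31, 33, 36]
edges, bin_counts = six_bin_histogram(hwy_sample)

# Text rendering of the histogram: one row per bin.
for left, right, n in zip(edges[:-1], edges[1:], bin_counts):
    print(f"{left:5.1f} to {right:5.1f} | {'#' * n}")
```

A roughly symmetric distribution shows bars of similar height on either side of the tallest bin. A long tail on one side indicates skew in that direction.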
- Calculate the average and median for HWY MPG. Use this information
to discuss the symmetry in the distribution. If you were to choose one
measure to best describe the center of the distribution, which
statistic would you use?
The average and median HWY MPG are close to each other, which suggests
little skew in the distribution. This is consistent with the histogram.
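The comparison of mean and median can be sketched with Python's standard `statistics` module. The helper `compare_center` and the sample values are hypothetical and not taken from the mpg file. The rule of thumb used here (mean above median hints at right skew, mean below median at left skew) is a common heuristic, not a formal test.

```python
from statistics import mean, median


def compare_center(values):
    """Return (mean, median, hint) where hint is a rough skew heuristic."""
    avg = mean(values)
    med = median(values)
    if avg > med:
        hint = "mean > median: possible right skew"
    elif avg < med:
        hint = "mean < median: possible left skew"
    else:
        hint = "mean == median: no skew indicated"
    return avg, med, hint


# Hypothetical highway MPG readings, for illustration only (NOT from the mpg file).
hwy_sample = [12, 15, 17, 18, 20, 22, 23, 24, 24, 25, 26, 26, 27, 28, 29, 31, 33, 36]
avg, med, hint = compare_center(hwy_sample)
print(f"mean = {avg:.2f}, median = {med:.2f} ({hint})")
```

When the two values are close relative to the spread of the data, as in this sample, the hint should be read as weak evidence at most.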
- Calculate the standard deviation and the coefficient of variation
(CV). Use these two measures to discuss how much variation there is in
the data.
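The coefficient of variation expresses the standard deviation as a percentage of the mean, CV = 100 × s / mean, so it measures spread relative to the size of the typical value. A minimal sketch, assuming the sample standard deviation (divisor n − 1) is intended; the function name and the sample values are hypothetical and not taken from the mpg file.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample CV in percent: 100 * (sample standard deviation) / mean."""
    return 100 * stdev(values) / mean(values)


# Hypothetical highway MPG readings, for illustration only (NOT from the mpg file).
hwy_sample = [20, 22, 24, 26, 28]
s = stdev(hwy_sample)  # sample standard deviation, divisor n - 1
cv = coefficient_of_variation(hwy_sample)
print(f"standard deviation = {s:.2f}, CV = {cv:.1f}%")
```

A CV of this size means the standard deviation is a modest fraction of the mean. Larger CV values indicate more variation relative to the center.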