Review of R
This workbook is based on Assignment 1 to Assignment 9, previously covered during the semester.
There are no examples included in this workbook; instead, you are expected to go through your previous assignments to find similar questions and use your answers there to complete this workbook (Assignment 10).
Exercise 1 (see Assignment 2 for similar examples)
A car dealership collected data on car sales and organised it by make and engine type, obtaining the following data:
| Make | Petrol | Diesel |
|--------------|--------|--------|
| Audi A4 | 6 | 7 |
| Audi A6 | 4 | 4 |
| BMW 3 Series | 5 | 4 |
| BMW 5 Series | 4 | 2 |
| VW Golf | 7 | 9 |
| VW Passat | 6 | 8 |
Given this data, answer the following:
Identify the data type given.
Create a .csv file to tabulate this data.
Create two clustered bar plots from this data file, using different colours in each plot.
Reorder these plots in order of decreasing Petrol car sales.
Which of these plots conveys the data in the clearest way?
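One possible way to approach the plotting steps is sketched below. It is only an illustration under stated assumptions: the object names `sales`, `csv_path`, `sales_sorted` and `counts`, the file name `car_sales.csv`, and the colour choices are all made up for this sketch and are not taken from Assignment 2.

```r
# Sales counts from the table above, entered by hand for illustration
sales <- data.frame(
  Make   = c("Audi A4", "Audi A6", "BMW 3 Series",
             "BMW 5 Series", "VW Golf", "VW Passat"),
  Petrol = c(6, 4, 5, 4, 7, 6),
  Diesel = c(7, 4, 4, 2, 9, 8)
)

# Write the data to a .csv file (hypothetical name and location), then read it back
csv_path <- file.path(tempdir(), "car_sales.csv")
write.csv(sales, csv_path, row.names = FALSE)
sales <- read.csv(csv_path)

# Sort the rows by decreasing petrol sales
sales_sorted <- sales[order(sales$Petrol, decreasing = TRUE), ]

# barplot() expects a matrix: one row per series, one column per group
counts <- t(as.matrix(sales_sorted[, c("Petrol", "Diesel")]))
colnames(counts) <- as.character(sales_sorted$Make)

# Clustered (side-by-side) bars; the colours are an arbitrary choice
barplot(counts, beside = TRUE, col = c("steelblue", "orange"),
        legend.text = rownames(counts), las = 2,
        ylab = "Cars sold", main = "Car sales by make and engine type")
```

Changing the `col` argument gives the second plot with different colours.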
Exercise 2 (see Assignment 4 for similar examples)
The GDP per capita of 17 eurozone countries is given in the data set Assignment10_Exercise2.csv, which was sourced from the World Bank and is available on Moodle.
The data is shown in the table below:
Data2 <- read.csv('Assignment10_Exercise2.csv')
Data2
Using the data given in this table, answer the following:
Identify the data source type.
Use a stem and leaf plot to represent the data.
From the stem and leaf plot, determine if the data is skewed or centred.
Find the median of the data set.
Find the first and third quartiles (\(Q_1\) and \(Q_3\)).
Determine if the data set has any extreme outliers.
Determine if the data set has any mild outliers.
Identify the fences for the data set.
Use a box plot to represent this data set.
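The quartile and fence calculations can be sketched as follows. This is a hedged illustration, not the method from Assignment 4: the vector `gdp` holds made-up values (not the World Bank figures), the quartiles use R's default `quantile()` rule (other textbook rules can give slightly different values), and the sketch assumes the common convention that the inner fences lie 1.5 IQR and the outer fences 3 IQR beyond the quartiles.

```r
# Hypothetical values for illustration only; NOT the World Bank figures
gdp <- c(12, 15, 17, 18, 20, 21, 22, 23, 24, 25, 26, 27, 29, 31, 33, 45, 80)

stem(gdp)                       # stem and leaf plot

med <- median(gdp)
q   <- quantile(gdp, c(0.25, 0.75), names = FALSE)  # Q1 and Q3 (default rule)
iqr <- q[2] - q[1]

# Assumed convention: inner fences at 1.5 * IQR, outer fences at 3 * IQR
inner <- c(q[1] - 1.5 * iqr, q[2] + 1.5 * iqr)
outer <- c(q[1] - 3 * iqr,   q[2] + 3 * iqr)

# Extreme outliers lie beyond the outer fences;
# mild outliers lie between the inner and outer fences
extreme <- gdp[gdp < outer[1] | gdp > outer[2]]
mild    <- gdp[(gdp < inner[1] | gdp > inner[2]) & !(gdp %in% extreme)]

boxplot(gdp, horizontal = TRUE, main = "Hypothetical GDP per capita")
```

With the real data, `gdp` would be replaced by the relevant numeric column of `Data2`.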
Exercise 3 (see Assignment 5 for similar examples)
Using the data from Exercise 2, answer the following questions:
- Find the mean GDP per capita for the 17 eurozone countries.
- Find the standard deviation of this data set.
- Find the number of data points in this data set.
- Plot the GDP per capita data as a normal distribution.
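A minimal sketch of these summary statistics is given below. It makes two assumptions: the vector `gdp` contains made-up values rather than the real data, and "plot as a normal distribution" is read as drawing the normal density curve with the sample mean and standard deviation. Assignment 5 may have used a different approach.

```r
# Hypothetical values for illustration only; NOT the World Bank figures
gdp <- c(12, 15, 17, 18, 20, 21, 22, 23, 24, 25, 26, 27, 29, 31, 33, 45, 80)

m <- mean(gdp)     # sample mean
s <- sd(gdp)       # sample standard deviation (n - 1 in the denominator)
n <- length(gdp)   # number of data points

# Normal density with the same mean and standard deviation as the sample
x <- seq(m - 4 * s, m + 4 * s, length.out = 200)
plot(x, dnorm(x, mean = m, sd = s), type = "l",
     xlab = "GDP per capita (hypothetical units)", ylab = "Density",
     main = "Normal curve fitted to the sample mean and sd")
rug(gdp)           # mark the individual observations on the x-axis
```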
Exercise 4 (see Assignment 6 for similar examples)
The following data relate the weekly natural gas consumption in a U.S. city to the hourly average temperature in that city for the same week.
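| Week | Hourly Avg. Temp. (°F) | Weekly Natural Gas Consumption (MMcf) |
|------|------------------------|---------------------------------------|
| 1 | 32.0 | 10.4 |
| 2 | 31.0 | 10.7 |
| 3 | 34 | 12.1 |
| 4 | 38 | 11.3 |
| 5 | 42 | 10.0 |
| 6 | 54 | 9.7 |
| 7 | 52 | 9.0 |
| 8 | 61 | 8.2 |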
Using this data, answer the following:
- Create a data file to represent this data.
- Import this data into R.
- Create two data vectors corresponding to temperature and consumption.
- Create a scatter plot to represent this data.
- Use the lm() function to create a linear model for these data vectors.
- Using this model, estimate the parameters \(a\) and \(b\) for the simple linear model.
- Plot the line of best fit along with the scatter plot for the data set.
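The steps above can be sketched roughly as follows. The names `gas_data`, `gas_path`, `temp`, `gas` and `fit`, and the file name `gas.csv`, are hypothetical. The sketch assumes the model is written as \(y = a + bx\), with \(a\) the intercept and \(b\) the slope; if Assignment 6 used the opposite labelling, swap the two.

```r
# Data from the table above, entered by hand
gas_data <- data.frame(
  Week = 1:8,
  Temp = c(32.0, 31.0, 34, 38, 42, 54, 52, 61),           # degrees Fahrenheit
  Gas  = c(10.4, 10.7, 12.1, 11.3, 10.0, 9.7, 9.0, 8.2)   # MMcf
)

# Save to a data file (hypothetical name and location) and import it again
gas_path <- file.path(tempdir(), "gas.csv")
write.csv(gas_data, gas_path, row.names = FALSE)
gas_data <- read.csv(gas_path)

# Two data vectors: temperature (predictor) and consumption (response)
temp <- gas_data$Temp
gas  <- gas_data$Gas

# Simple linear model: gas = a + b * temp (assumed labelling of a and b)
fit <- lm(gas ~ temp)
a <- coef(fit)[["(Intercept)"]]
b <- coef(fit)[["temp"]]

# Scatter plot with the fitted line of best fit
plot(temp, gas, pch = 19,
     xlab = "Hourly average temperature (F)",
     ylab = "Weekly gas consumption (MMcf)")
abline(fit, col = "red")
```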
Exercise 5 (see Assignments 8 and 9 for similar examples)
Eight makes of car were compared using three different criteria:
- Price (euro)
- Engine (cc)
- Efficiency (km/L)
The data collected are given in the table below:
| Make | Price (euro) | Engine (cc) | Efficiency (km/L) |
|------------|--------------|-------------|-------------------|
| Audi | 42881 | 1968 | 24.7 |
| BMW | 44151 | 1990 | 23.9 |
| Citroen | 24523 | 1610 | 25.3 |
| Hyundai | 27770 | 1685 | 26.1 |
| Jaguar | 54456 | 1999 | 25.8 |
| Mercedes | 47976 | 1950 | 26.0 |
| Mitsubishi | 29455 | 2300 | 20.8 |
| Toyota | 28867 | 1995 | 22.6 |
Using the data in this table, answer the following:
Create 8 data vectors, one for each car make.
Combine these data vectors using the rbind() function, to create a single data structure.
Rescale the dimensions of this data structure, so that each dimension is of the same order of magnitude.
Create a table of Euclidean and Manhattan distances for this re-scaled data structure.
Perform a cluster analysis of the Euclidean and Manhattan distances, using both single linkage and complete linkage.
Create two heat maps to represent these distances (Euclidean and Manhattan) using the function fviz_dist().
Create 4 dendrograms to represent the clustering using the function fviz_dend(). Each dendrogram should have 4 clusters.
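A rough sketch of this workflow is shown below. Several choices in it are assumptions rather than the method from Assignments 8 and 9: the object names are made up, and the rescaling uses `scale()` (centre each column and divide by its standard deviation), which is only one of several ways to bring the dimensions to a comparable order. The plotting step uses `fviz_dist()` and `fviz_dend()` from the factoextra package and is skipped if that package is not installed.

```r
# One data vector per make: price (euro), engine (cc), efficiency (km/L)
audi       <- c(42881, 1968, 24.7)
bmw        <- c(44151, 1990, 23.9)
citroen    <- c(24523, 1610, 25.3)
hyundai    <- c(27770, 1685, 26.1)
jaguar     <- c(54456, 1999, 25.8)
mercedes   <- c(47976, 1950, 26.0)
mitsubishi <- c(29455, 2300, 20.8)
toyota     <- c(28867, 1995, 22.6)

# Combine the vectors row-wise into a single matrix
cars <- rbind(Audi = audi, BMW = bmw, Citroen = citroen, Hyundai = hyundai,
              Jaguar = jaguar, Mercedes = mercedes,
              Mitsubishi = mitsubishi, Toyota = toyota)
colnames(cars) <- c("Price", "Engine", "Efficiency")

# Assumed rescaling: standardise each column (mean 0, standard deviation 1)
cars_scaled <- scale(cars)

# Distance tables for the rescaled data
d_euc <- dist(cars_scaled, method = "euclidean")
d_man <- dist(cars_scaled, method = "manhattan")

# Hierarchical clustering: two distances x two linkage methods
hc <- list(
  euc_single   = hclust(d_euc, method = "single"),
  euc_complete = hclust(d_euc, method = "complete"),
  man_single   = hclust(d_man, method = "single"),
  man_complete = hclust(d_man, method = "complete")
)

# Heat maps and dendrograms (only if factoextra is available)
if (requireNamespace("factoextra", quietly = TRUE)) {
  print(factoextra::fviz_dist(d_euc))
  print(factoextra::fviz_dist(d_man))
  for (h in hc) print(factoextra::fviz_dend(h, k = 4))  # cut into 4 clusters
}
```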