This is a final project to show off what you have learned. Select your data set from the list below: http://vincentarelbundock.github.io/Rdatasets/ (click on the csv index for a list). Another good source is found here: https://archive.ics.uci.edu/ml/datasets.html
The presentation approach is up to you but it should contain the following:
Data Exploration: This should include summary statistics, means, medians, quartiles, or any other relevant information about the dataset. Please include some conclusions in the R Markdown text.
Data wrangling: Please perform some basic transformations. They will need to make sense but could include column renaming, creating a subset of the data, replacing values, or creating new columns with derived data (for example, if it makes sense, you could sum two columns together).
Graphics: Please make sure to display at least one scatter plot, box plot and histogram. Don’t be limited to this. Please explore the many other options in R packages such as ggplot2.
Meaningful question for analysis: Please state at the beginning a meaningful question for analysis. Use the first three steps and anything else that would be helpful to answer the question you are posing from the data set you chose. Please write a brief conclusion paragraph in R markdown at the end.
BONUS: place the original .csv file in a GitHub repository and have R read from the link. This will be a very useful skill as you progress in your data science education and career.
Please submit your .Rmd file and the .csv file as well as a link to your RPubs.
Taken from: http://vincentarelbundock.github.io/Rdatasets/csv/datasets/quakes.csv
GitHub Raw file location: https://raw.githubusercontent.com/dvillalobos/CUNY-Bridge/master/quakes.csv
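Reading the file straight from the raw GitHub link can be sketched as follows. This is a minimal sketch, not necessarily the code used for this report: the names `csv_url` and `quakes_raw` are illustrative, and the fallback to R's built-in `datasets::quakes` is an assumption added only so the snippet still runs without a network connection.

```r
# URL of the raw CSV, as listed above.
csv_url <- "https://raw.githubusercontent.com/dvillalobos/CUNY-Bridge/master/quakes.csv"

# read.csv() accepts a URL directly. If the download fails (for example,
# no network), fall back to the built-in copy of the data set. The fallback
# is an assumption for illustration, not part of the original report.
quakes_raw <- tryCatch(
  read.csv(csv_url, stringsAsFactors = FALSE),
  error = function(e) datasets::quakes
)

# Inspect the column names and types that were actually read.
str(quakes_raw)
```

Checking `str()` right after the import is a cheap way to confirm whether the CSV carries an extra row-name column before any wrangling starts.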
The data set gives the locations of 1000 seismic events of MB > 4.0. The events occurred in a cube near Fiji since 1964.
quakes
A data frame with 1000 observations on 5 variables.
[1] lat numeric Latitude of event
[2] long numeric Longitude
[3] depth numeric Depth (km)
[4] mag numeric Richter Magnitude
[5] stations numeric Number of stations reporting
Details:
There are two clear planes of seismic activity. One is a major plate junction; the other is the Tonga trench off New Zealand. These data constitute a subsample from a larger dataset containing 5000 observations.
Source
This is one of the Harvard PRIM-H project data sets. They in turn obtained it from Dr. John Woodhouse, Dept. of Geophysics, Harvard University.
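The code later in this report refers to `quakes$Magnitude` and `quakes$Depth`, so the short column names were evidently replaced with descriptive ones. A minimal sketch of such a renaming step is shown here, assuming the built-in `datasets::quakes` matches the CSV. The names `Latitude`, `Longitude` and `Stations` are assumptions chosen to match the style of `Magnitude` and `Depth`.

```r
# Start from the built-in copy of the data (assumed identical to the CSV).
quakes <- datasets::quakes

# Map each original column name to a descriptive one. Only "Magnitude" and
# "Depth" appear in the report; the other three names are assumed.
new_names <- c(lat = "Latitude", long = "Longitude", depth = "Depth",
               mag = "Magnitude", stations = "Stations")
names(quakes) <- unname(new_names[names(quakes)])

head(quakes)
```

Mapping by name rather than by position keeps the renaming correct even if the column order of the input changes.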
This project reports an analysis inspired by the earthquakes near Fiji. In this study we explore the frequency of magnitudes and the frequency of depths for all the registered quakes, as well as their geographical locations.
Now, let’s draw a series of scatter plots to visualize the relationships between Magnitude vs Depth, Stations vs Depth, and Magnitude vs Stations.
Hover over a point to see its values.
From the above graph we can see that the strongest quake was not very deep at all.
Hover over a point to see its values.
From the above graph we can see that many stations report readings for deep quakes.
Hover over a point to see its values.
From the above graph we can see a tendency: the stronger the quake, the more stations register the event.
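The three scatter plots can be approximated with base R graphics, as sketched here. The interactive charts in this report were presumably built with a different plotting package, so this is a static stand-in rather than the original code. It assumes the built-in `datasets::quakes` and the renamed columns, of which `Latitude`, `Longitude` and `Stations` are assumed names.

```r
# Built-in data with descriptive column names (partly assumed names).
quakes <- datasets::quakes
names(quakes) <- c("Latitude", "Longitude", "Depth", "Magnitude", "Stations")

# Three panels side by side; restore the graphics settings afterwards.
op <- par(mfrow = c(1, 3))
plot(quakes$Depth, quakes$Magnitude,
     xlab = "Depth (km)", ylab = "Magnitude", main = "Magnitude vs Depth")
plot(quakes$Depth, quakes$Stations,
     xlab = "Depth (km)", ylab = "Stations", main = "Stations vs Depth")
plot(quakes$Stations, quakes$Magnitude,
     xlab = "Stations", ylab = "Magnitude", main = "Magnitude vs Stations")
par(op)

# A correlation coefficient puts a number on the visual tendency
# between magnitude and the number of reporting stations.
cor_mag_stations <- cor(quakes$Magnitude, quakes$Stations)
cor_mag_stations
```

A positive coefficient here is consistent with the observation that stronger quakes are registered by more stations.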
The table below shows the frequency of quakes by magnitude.
## Magnitude Frequency
## 1 4.0 46
## 2 4.1 55
## 3 4.2 90
## 4 4.3 85
## 5 4.4 101
## 6 4.5 107
## 7 4.6 101
## 8 4.7 98
## 9 4.8 65
## 10 4.9 54
## 11 5.0 47
## 12 5.1 43
## 13 5.2 29
## 14 5.3 21
## 15 5.4 20
## 16 5.5 14
## 17 5.6 9
## 18 5.7 8
## 19 5.9 2
## 20 6.0 3
## 21 6.1 1
## 22 6.4 1
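A frequency table like the one above can be built with `table()`. This is a minimal sketch assuming the built-in `datasets::quakes` and the renamed `Magnitude` column. The name `mag_freq` is illustrative.

```r
# Built-in data with descriptive column names (partly assumed names).
quakes <- datasets::quakes
names(quakes) <- c("Latitude", "Longitude", "Depth", "Magnitude", "Stations")

# Count how many quakes were recorded at each distinct magnitude.
mag_freq <- as.data.frame(table(quakes$Magnitude), stringsAsFactors = FALSE)
names(mag_freq) <- c("Magnitude", "Frequency")

# table() stores the magnitudes as text labels; convert them back to numbers.
mag_freq$Magnitude <- as.numeric(mag_freq$Magnitude)

mag_freq
```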
The chart below shows each magnitude together with its frequency, both as a count and as a percentage.
Hover over the chart and play with the data to explore even further.
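The counts and percentages behind such a chart can be computed with `table()` and `prop.table()`. The sketch below draws a static bar chart in base R as a stand-in for the interactive one, assuming the built-in `datasets::quakes` and the renamed `Magnitude` column. The names `mag_counts` and `mag_pct` are illustrative.

```r
# Built-in data with descriptive column names (partly assumed names).
quakes <- datasets::quakes
names(quakes) <- c("Latitude", "Longitude", "Depth", "Magnitude", "Stations")

# Absolute counts per magnitude and the same counts as percentages.
mag_counts <- table(quakes$Magnitude)
mag_pct <- round(100 * prop.table(mag_counts), 1)

# Static bar chart of the counts.
barplot(mag_counts, xlab = "Magnitude", ylab = "Number of quakes",
        main = "Quakes by magnitude", las = 2)

# Counts and percentages side by side.
cbind(Count = mag_counts, Percent = mag_pct)
```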
By taking all the readings we can find some important results as follows:
summary(quakes$Magnitude)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 4.00 4.30 4.60 4.62 4.90 6.40
The weakest quake had a Richter magnitude of 4.0.
The median magnitude for all quakes is 4.6.
The mean magnitude for all quakes is 4.62.
The strongest quake had a Richter magnitude of 6.4.
The table below shows the frequency of quakes by depth. For reporting purposes I am including only the first few rows, since the full table is rather long.
## Depth Frequency
## 1 40 12
## 2 41 4
## 3 42 11
## 4 43 5
## 5 44 5
## 6 45 8
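The truncated depth table can be reproduced by building the full frequency table and printing only its first rows with `head()`. This is a sketch assuming the built-in `datasets::quakes` and the renamed `Depth` column. The name `depth_freq` is illustrative.

```r
# Built-in data with descriptive column names (partly assumed names).
quakes <- datasets::quakes
names(quakes) <- c("Latitude", "Longitude", "Depth", "Magnitude", "Stations")

# Count how many quakes were recorded at each distinct depth (km).
depth_freq <- as.data.frame(table(quakes$Depth), stringsAsFactors = FALSE)
names(depth_freq) <- c("Depth", "Frequency")
depth_freq$Depth <- as.integer(depth_freq$Depth)

# Show only the first six rows, since the full table is long.
head(depth_freq)
```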
From the above table, we can plot the data as follows:
From the above graph, we can see the depth and frequency distribution of the quakes.
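A histogram and a box plot are two simple ways to look at the depth distribution. The sketch below uses base R, the built-in `datasets::quakes` and the renamed `Depth` column. The 50 km bin width is an arbitrary choice for illustration, not a value taken from this report.

```r
# Built-in data with descriptive column names (partly assumed names).
quakes <- datasets::quakes
names(quakes) <- c("Latitude", "Longitude", "Depth", "Magnitude", "Stations")

# Histogram with 50 km bins covering the full 40-680 km range.
depth_hist <- hist(quakes$Depth, breaks = seq(0, 700, by = 50),
                   xlab = "Depth (km)", main = "Distribution of quake depths")

# The box plot shows the median and quartiles reported by summary().
boxplot(quakes$Depth, horizontal = TRUE,
        xlab = "Depth (km)", main = "Quake depths")
```

Because the mean depth (311.4 km) sits well above the median (247 km), the two plots are worth reading together: the histogram shows where the quakes cluster, and the box plot shows how far the upper half of the data stretches.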
By taking all the readings we can find some important results as follows:
summary(quakes$Depth)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 40.0 99.0 247.0 311.4 543.0 680.0
The shallowest quake was registered at 40 km.
The median depth for all quakes is 247 km.
The mean depth for all quakes is 311.4 km.
The deepest quake was registered at 680 km.
Based on the above information, we can calculate some interesting findings, for example the energy released by each quake.
The formula, where \(R\) is the Richter magnitude and \(E\) is the energy in joules, is: \[E = 10^{1.5 \cdot R + 4.8}\]
Energy <- 10^(1.5 * quakes$Magnitude + 4.8)
summary(Energy)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 6.310e+10 1.778e+11 5.012e+11 2.064e+12 1.413e+12 2.512e+14
From which we can deduce:
The minimum energy, released by the smallest quake, was 6.310e+10 joules.
The mean energy released was about 2.064e+12 joules.
The median energy released was about 5.012e+11 joules.
The maximum energy released was about 2.512e+14 joules.
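To see how the formula behaves, it can be wrapped in a small function and evaluated at the extreme magnitudes. This is a sketch using the formula stated above. The function name `energy_joules` and the variable `step_ratio` are illustrative.

```r
# Energy in joules for a given Richter magnitude: E = 10^(1.5 * R + 4.8).
energy_joules <- function(magnitude) {
  10^(1.5 * magnitude + 4.8)
}

# Energy of the weakest (4.0) and strongest (6.4) quakes in the data set.
energy_joules(c(4.0, 6.4))

# Each full magnitude step multiplies the energy by 10^1.5, about 31.6.
step_ratio <- energy_joules(5) / energy_joules(4)
step_ratio
```

The constant ratio per magnitude step is why the mean energy is so much larger than the median: a handful of strong quakes dominate the total.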
One ton of TNT releases \(4.184 \cdot 10^9\) joules, so the energy can also be expressed in tons of TNT:
\[ 1 \text{ ton of TNT} = 4.184 \cdot 10^{9} \text{ J} \]
TNT <- Energy / (4.184 * 10^9)
summary(TNT)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 15.08 42.50 119.80 493.30 337.60 60040.00
From which we can deduce:
The minimum energy, released by the smallest quake, is equivalent to about 15.08 tons of TNT.
The mean energy released is equivalent to about 493.30 tons of TNT.
The median energy released is equivalent to about 119.80 tons of TNT.
The maximum energy released is equivalent to about 60040 tons of TNT.
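The conversion can be checked by hand for the two extreme quakes. The sketch below assumes the energy formula given earlier and the standard value of 4.184e9 joules per ton of TNT. The names `joules_per_ton_tnt` and `to_tons_tnt` are illustrative.

```r
# Energy equivalent of one ton of TNT, in joules.
joules_per_ton_tnt <- 4.184e9

# Convert an energy in joules to tons of TNT.
to_tons_tnt <- function(energy) {
  energy / joules_per_ton_tnt
}

# Energies of the weakest (4.0) and strongest (6.4) quakes, from
# E = 10^(1.5 * R + 4.8), converted to tons of TNT.
min_tnt <- to_tons_tnt(10^(1.5 * 4.0 + 4.8))
max_tnt <- to_tons_tnt(10^(1.5 * 6.4 + 4.8))

# Rounded to four significant digits, as summary() reports them.
signif(c(min_tnt, max_tnt), 4)
```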
Let’s map all the quakes in Fiji, to see the distribution.
Let’s draw all quakes reported.
Let’s zoom a little bit.
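Without a mapping package, the spatial distribution can still be sketched by plotting longitude against latitude. This is a bare stand-in for the maps in this report, assuming the built-in `datasets::quakes` and the assumed column names `Longitude` and `Latitude`. The zoom window is an arbitrary choice for illustration.

```r
# Built-in data with descriptive column names (partly assumed names).
quakes <- datasets::quakes
names(quakes) <- c("Latitude", "Longitude", "Depth", "Magnitude", "Stations")

# All reported quakes.
plot(quakes$Longitude, quakes$Latitude, pch = 20, cex = 0.5,
     xlab = "Longitude", ylab = "Latitude", main = "All quakes")

# Zoom in by restricting the axis limits (hypothetical window).
plot(quakes$Longitude, quakes$Latitude, pch = 20, cex = 0.5,
     xlim = c(175, 190), ylim = c(-30, -10),
     xlab = "Longitude", ylab = "Latitude", main = "Zoomed view")

# Bounding box of the data, useful when choosing a map extent.
lon_range <- range(quakes$Longitude)
lat_range <- range(quakes$Latitude)
```

Plotting raw coordinates shows the two planes of seismic activity, but a base map is still needed to judge whether a given point lies on land or at sea.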
From the above we can conclude that the bigger the quake, the more damage it is capable of. We can also conclude that the most frequent magnitude is 4.5 and that the median depth of the quakes is 247 km.
We can conclude that the bigger the quake, the more destructive it becomes, since its energy grows exponentially with magnitude.
Based on the plotted coordinates, it appears that not even one of the quakes occurred on land, which suggests that a tsunami alert may be warranted for each of them.