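library(tibble)      # enframe()
library(sjlabelled)  # get_label(); assumed to be the package this function comes from
# Build a two-column lookup table of variable names and their descriptions.
x <- enframe(get_label(quakes))
colnames(x) <- c("variable", "details")
x$details[1] <- "Latitude of event"
x$details[2] <- "Longitude"
x$details[3] <- "Depth (km)"
x$details[4] <- "Richter Magnitude"
x$details[5] <- "Number of stations reporting"
x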
## # A tibble: 5 x 2
## variable details
## <chr> <chr>
## 1 lat Latitude of event
## 2 long Longitude
## 3 depth Depth (km)
## 4 mag Richter Magnitude
## 5 stations Number of stations reporting
new_codebook_rmd()
codebook(quakes)
## No missing values.
Dataset name: quakes
The dataset has N=1000 rows and 5 columns. 1000 rows have no missing values on any column.
#Variables
Distribution of values for lat
0 missing values.
| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
|---|---|---|---|---|---|---|---|---|---|---|
| lat | numeric | 0 | 1 | -39 | -20 | -11 | -20.64275 | 5.028791 | ▁▁▅▇▃ | NA |
Distribution of values for long
0 missing values.
| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
|---|---|---|---|---|---|---|---|---|---|---|
| long | numeric | 0 | 1 | 166 | 181 | 188 | 179.462 | 6.069497 | ▂▁▁▇▃ | NA |
Distribution of values for depth
0 missing values.
| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
|---|---|---|---|---|---|---|---|---|---|---|
| depth | numeric | 0 | 1 | 40 | 247 | 680 | 311.371 | 215.5355 | ▇▃▂▃▅ | NA |
Distribution of values for mag
0 missing values.
| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
|---|---|---|---|---|---|---|---|---|---|---|
| mag | numeric | 0 | 1 | 4 | 4.6 | 6.4 | 4.6204 | 0.402773 | ▇▇▃▁▁ | NA |
Distribution of values for stations
0 missing values.
| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
|---|---|---|---|---|---|---|---|---|---|---|
| stations | numeric | 0 | 1 | 10 | 27 | 132 | 33.418 | 21.90039 | ▇▂▁▁▁ | NA |
| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
|---|---|---|---|---|---|---|---|---|---|---|
| lat | numeric | 0 | 1 | -39 | -20.3 | -10.7 | -20.64275 | 5.028791 | ▁▁▅▇▃ | NA |
| long | numeric | 0 | 1 | 166 | 181.4 | 188.1 | 179.46202 | 6.069497 | ▂▁▁▇▃ | NA |
| depth | numeric | 0 | 1 | 40 | 247.0 | 680.0 | 311.37100 | 215.535498 | ▇▃▂▃▅ | NA |
| mag | numeric | 0 | 1 | 4 | 4.6 | 6.4 | 4.62040 | 0.402773 | ▇▇▃▁▁ | NA |
| stations | numeric | 0 | 1 | 10 | 27.0 | 132.0 | 33.41800 | 21.900386 | ▇▂▁▁▁ | NA |
The following JSON-LD can be found by search engines, if you share this codebook publicly on the web.
{
"name": "quakes",
"datePublished": "2021-12-31",
"description": "The dataset has N=1000 rows and 5 columns.\n1000 rows have no missing values on any column.\n\n\n## Table of variables\nThis table contains variable names, labels, and number of missing values.\nSee the complete codebook for more.\n\n|name |label | n_missing|\n|:--------|:-----|---------:|\n|lat |NA | 0|\n|long |NA | 0|\n|depth |NA | 0|\n|mag |NA | 0|\n|stations |NA | 0|\n\n### Note\nThis dataset was automatically described using the [codebook R package](https://rubenarslan.github.io/codebook/) (version 0.9.2).",
"keywords": ["lat", "long", "depth", "mag", "stations"],
"@context": "http://schema.org/",
"@type": "Dataset",
"variableMeasured": [
{
"name": "lat",
"@type": "propertyValue"
},
{
"name": "long",
"@type": "propertyValue"
},
{
"name": "depth",
"@type": "propertyValue"
},
{
"name": "mag",
"@type": "propertyValue"
},
{
"name": "stations",
"@type": "propertyValue"
}
]
}
ggplot(quakes, aes(x = depth, y = mag)) + geom_bin2d() + labs(title = "Earthquake magnitude by depth", x = "Depth (km)", y = "Magnitude (Richter scale)")
Analysis: no apparent relationship between depth and magnitude.
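A reading like this can be backed up with a number as well as a plot. The sketch below is an addition to the original analysis: it computes Pearson and Spearman correlations between depth and magnitude with base R's cor(). The choice of these two measures and the variable names r_pearson and r_spearman are assumptions made for illustration.

```r
# quakes ships with base R (datasets package), so no extra packages are needed.
data(quakes)

# Pearson measures linear association; Spearman measures monotonic association.
r_pearson  <- cor(quakes$depth, quakes$mag, method = "pearson")
r_spearman <- cor(quakes$depth, quakes$mag, method = "spearman")

# Values close to 0 would support the "no relationship" reading;
# values further from 0 would suggest a trend the binned plot hides.
print(c(pearson = r_pearson, spearman = r_spearman))
```

Both coefficients lie between -1 and 1, and the sign shows whether magnitude tends to fall or rise with depth.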
ggplot(quakes, aes(x = depth, y = stations)) + geom_bar(stat = "identity") + labs(title = "Number of stations reporting, by depth", x = "Depth (km)", y = "Number of stations")
Analysis: few stations report earthquakes at depths between 300 km and 400 km, a moderate number report at depths between 500 km and 700 km, and the most report at depths shallower than 300 km.
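Because geom_bar(stat = "identity") stacks every row that shares a depth value, bar height reflects both how many earthquakes occurred at that depth and how many stations reported each one. A minimal sketch of one way to separate the two, assuming 100 km depth bands (the band width and the names depth_band, n_events, total_stations and mean_stations are illustrative, not from the original analysis):

```r
library(dplyr)

# Bin depth into 100 km bands; depth in quakes ranges from 40 to 680 km.
stations_by_depth <- quakes %>%
  mutate(depth_band = cut(depth, breaks = seq(0, 700, by = 100))) %>%
  group_by(depth_band) %>%
  summarise(
    n_events       = n(),            # number of earthquakes in the band
    total_stations = sum(stations),  # summed station counts across those earthquakes
    mean_stations  = mean(stations)  # average stations reporting per earthquake
  )

stations_by_depth
```

Comparing total_stations with mean_stations shows whether a tall bar comes from many events or from widely reported ones.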
ggplot(quakes, aes(long, lat)) + geom_point(size = .25, show.legend = FALSE) + coord_quickmap() + labs(title = "Earthquakes plotted on map", x = "Longitude", y = "Latitude")
Analysis: earthquakes occurred most often at latitudes between -25 and -5 and longitudes between 180 and 185.
mean(quakes$depth)
## [1] 311.371
median(quakes$depth)
## [1] 247
IQR(quakes$depth)
## [1] 444
mean(quakes$mag)
## [1] 4.6204
median(quakes$mag)
## [1] 4.6
IQR(quakes$mag)
## [1] 0.6
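The six calls above can also be collapsed into a single pipeline. This is a sketch using dplyr's across() (available from dplyr 1.0.0); the object name summary_stats is an assumption for illustration.

```r
library(dplyr)

# Apply mean, median and IQR to both columns in one summarise() call.
# Result columns are named <column>_<function>, e.g. depth_mean, mag_iqr.
summary_stats <- quakes %>%
  summarise(across(c(depth, mag),
                   list(mean = mean, median = median, iqr = IQR)))

summary_stats
```

The result is a one-row data frame, which is easier to reuse in a table or report than six separate printed values.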
i) filter()
Only show rows with mag greater than 5.5
quakes %>% filter(mag>5.5)
## lat long depth mag stations
## 1 -20.70 169.92 139 6.1 94
## 2 -13.64 165.96 50 6.0 83
## 3 -22.55 185.90 42 5.7 76
## 4 -23.34 184.50 56 5.7 106
## 5 -15.56 167.62 127 6.4 122
## 6 -26.00 182.12 205 5.6 98
## 7 -32.22 180.20 216 5.7 90
## 8 -22.13 180.38 577 5.7 104
## 9 -24.57 178.40 562 5.6 80
## 10 -15.33 186.75 48 5.7 123
## 11 -17.84 181.30 535 5.7 112
## 12 -22.91 183.95 64 5.9 118
## 13 -34.68 179.82 75 5.6 79
## 14 -19.89 174.46 546 5.7 99
## 15 -18.82 182.21 417 5.6 129
## 16 -37.03 177.52 153 5.6 87
## 17 -11.40 166.07 93 5.6 94
## 18 -15.93 167.91 183 5.6 109
## 19 -21.08 180.85 627 5.9 119
## 20 -21.14 174.21 40 5.7 78
## 21 -12.23 167.02 242 6.0 132
## 22 -17.85 181.44 589 5.6 115
## 23 -20.25 184.75 107 5.6 121
## 24 -21.59 170.56 165 6.0 119
ii) arrange()
Sort i) in ascending order of stations
quakes %>% filter(mag>5.5) %>% arrange(stations)
## lat long depth mag stations
## 1 -22.55 185.90 42 5.7 76
## 2 -21.14 174.21 40 5.7 78
## 3 -34.68 179.82 75 5.6 79
## 4 -24.57 178.40 562 5.6 80
## 5 -13.64 165.96 50 6.0 83
## 6 -37.03 177.52 153 5.6 87
## 7 -32.22 180.20 216 5.7 90
## 8 -20.70 169.92 139 6.1 94
## 9 -11.40 166.07 93 5.6 94
## 10 -26.00 182.12 205 5.6 98
## 11 -19.89 174.46 546 5.7 99
## 12 -22.13 180.38 577 5.7 104
## 13 -23.34 184.50 56 5.7 106
## 14 -15.93 167.91 183 5.6 109
## 15 -17.84 181.30 535 5.7 112
## 16 -17.85 181.44 589 5.6 115
## 17 -22.91 183.95 64 5.9 118
## 18 -21.08 180.85 627 5.9 119
## 19 -21.59 170.56 165 6.0 119
## 20 -20.25 184.75 107 5.6 121
## 21 -15.56 167.62 127 6.4 122
## 22 -15.33 186.75 48 5.7 123
## 23 -18.82 182.21 417 5.6 129
## 24 -12.23 167.02 242 6.0 132
Sort i) in descending order of stations
quakes %>% filter(mag>5.5) %>% arrange(desc(stations))
## lat long depth mag stations
## 1 -12.23 167.02 242 6.0 132
## 2 -18.82 182.21 417 5.6 129
## 3 -15.33 186.75 48 5.7 123
## 4 -15.56 167.62 127 6.4 122
## 5 -20.25 184.75 107 5.6 121
## 6 -21.08 180.85 627 5.9 119
## 7 -21.59 170.56 165 6.0 119
## 8 -22.91 183.95 64 5.9 118
## 9 -17.85 181.44 589 5.6 115
## 10 -17.84 181.30 535 5.7 112
## 11 -15.93 167.91 183 5.6 109
## 12 -23.34 184.50 56 5.7 106
## 13 -22.13 180.38 577 5.7 104
## 14 -19.89 174.46 546 5.7 99
## 15 -26.00 182.12 205 5.6 98
## 16 -20.70 169.92 139 6.1 94
## 17 -11.40 166.07 93 5.6 94
## 18 -32.22 180.20 216 5.7 90
## 19 -37.03 177.52 153 5.6 87
## 20 -13.64 165.96 50 6.0 83
## 21 -24.57 178.40 562 5.6 80
## 22 -34.68 179.82 75 5.6 79
## 23 -21.14 174.21 40 5.7 78
## 24 -22.55 185.90 42 5.7 76
iii) mutate()
Add a new column, depthinmeter, derived from the existing depth column (in km)
quakes %>% filter(mag>5.5) %>% select(depth) %>% mutate(depthinmeter = depth*1000)
## depth depthinmeter
## 1 139 139000
## 2 50 50000
## 3 42 42000
## 4 56 56000
## 5 127 127000
## 6 205 205000
## 7 216 216000
## 8 577 577000
## 9 562 562000
## 10 48 48000
## 11 535 535000
## 12 64 64000
## 13 75 75000
## 14 546 546000
## 15 417 417000
## 16 153 153000
## 17 93 93000
## 18 183 183000
## 19 627 627000
## 20 40 40000
## 21 242 242000
## 22 589 589000
## 23 107 107000
## 24 165 165000
iv) select()
Select a specific column (here, depth)
quakes %>% filter(mag>5.5) %>% select(depth)
## depth
## 1 139
## 2 50
## 3 42
## 4 56
## 5 127
## 6 205
## 7 216
## 8 577
## 9 562
## 10 48
## 11 535
## 12 64
## 13 75
## 14 546
## 15 417
## 16 153
## 17 93
## 18 183
## 19 627
## 20 40
## 21 242
## 22 589
## 23 107
## 24 165
v) summarise()
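Creates a new data frame from the given expressions. summarise() normally collapses each group to a single row of aggregates; because depth and mag are passed here without an aggregate function, all 24 rows are returned. In dplyr 1.1.0 and later this multi-row use is deprecated in favour of reframe().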
quakes %>% filter(mag>5.5) %>% summarise(depth,mag)
## depth mag
## 1 139 6.1
## 2 50 6.0
## 3 42 5.7
## 4 56 5.7
## 5 127 6.4
## 6 205 5.6
## 7 216 5.7
## 8 577 5.7
## 9 562 5.6
## 10 48 5.7
## 11 535 5.7
## 12 64 5.9
## 13 75 5.6
## 14 546 5.7
## 15 417 5.6
## 16 153 5.6
## 17 93 5.6
## 18 183 5.6
## 19 627 5.9
## 20 40 5.7
## 21 242 6.0
## 22 589 5.6
## 23 107 5.6
## 24 165 6.0
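For comparison, here is a sketch of summarise() used with aggregate functions, which collapses the filtered rows into one summary row. The names strong, strong_summary, n_quakes, mean_depth and max_mag are illustrative and not from the original notebook.

```r
library(dplyr)

# Keep only the strong earthquakes, as in i).
strong <- quakes %>% filter(mag > 5.5)

# Each argument reduces a column to a single value, so the result has one row.
strong_summary <- strong %>%
  summarise(
    n_quakes   = n(),          # how many earthquakes passed the filter
    mean_depth = mean(depth),  # average depth in km
    max_mag    = max(mag)      # largest magnitude observed
  )

strong_summary
```

Adding group_by() before summarise() would give one such row per group instead of one row overall.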