This project was conducted as a quantitative research analysis using basic statistical concepts. The main idea behind the project was to analyze the top speeds of 7 machine models (6 cars and 1 motorcycle) and 7 animals from different perspectives.
install.packages("tidyverse")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.9
## ✔ tidyr 1.2.0 ✔ stringr 1.4.1
## ✔ readr 2.1.2 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(readr)
library(dplyr)
library(ggplot2)
df <- read_csv("machine_nature.csv")
## Rows: 14 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): name, sex, physical_category, speed_locomotion_type
## dbl (3): top_speed_mph, top_speed_km_h, weight_kg
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
df
## # A tibble: 14 × 7
## name sex top_s…¹ top_s…² physi…³ speed…⁴ weigh…⁵
## <chr> <chr> <dbl> <dbl> <chr> <chr> <dbl>
## 1 Bugatti Chiron Super Sport 300… <NA> 304 489 Machine moving 1.98e+3
## 2 Peregrine Falcon Male 242 389 Nature flying 1 e+0
## 3 Peregrine Falcon Fema… 242 389 Nature flying 1.5 e+0
## 4 Porsche 911 Turbo S (2021) <NA> 205 330 Machine moving 1.65e+3
## 5 Golden Eagle Male 200 322 Nature flying 4.6 e+0
## 6 Golden Eagle Fema… 200 322 Nature flying 6.7 e+0
## 7 Chevrolet Corvette Stingray (2… <NA> 194 312 Machine moving 1.65e+3
## 8 Honda Civic (2021) <NA> 137 220 Machine moving 1.31e+3
## 9 Toyota RAV4 (2021) <NA> 120 193 Machine moving 1.58e+3
## 10 Ford F-150 Raptor (2020) <NA> 107 172 Machine moving 2.58e+3
## 11 Mexican free-tailed bat <NA> 101 163 Nature running 1.3 e-2
## 12 Cheetah <NA> 75 121 Nature running 7.2 e+1
## 13 Sailfish <NA> 68 109 Nature swimmi… 9 e+1
## 14 Honda Ruckus (2020) <NA> 40 64 Machine moving 8.8 e+1
## # … with abbreviated variable names ¹top_speed_mph, ²top_speed_km_h,
## # ³physical_category, ⁴speed_locomotion_type, ⁵weight_kg
glimpse(df)
## Rows: 14
## Columns: 7
## $ name <chr> "Bugatti Chiron Super Sport 300+ (2019)", "Pereg…
## $ sex <chr> NA, "Male", "Female", NA, "Male", "Female", NA, …
## $ top_speed_mph <dbl> 304, 242, 242, 205, 200, 200, 194, 137, 120, 107…
## $ top_speed_km_h <dbl> 489, 389, 389, 330, 322, 322, 312, 220, 193, 172…
## $ physical_category <chr> "Machine", "Nature", "Nature", "Machine", "Natur…
## $ speed_locomotion_type <chr> "moving", "flying", "flying", "moving", "flying"…
## $ weight_kg <dbl> 1978.000, 1.000, 1.500, 1646.000, 4.600, 6.700, …
This is quite a small data set, with 7 variables and 14 rows. The variables are:
* name - the name of the machine model or animal,
* sex - recorded only for animals, specifically for birds, because the weights of male and female birds differ,
* top_speed_mph - top speed of the machine or animal in miles per hour,
* top_speed_km_h - top speed of the machine or animal in kilometers per hour,
* physical_category - whether the object is a machine (Machine) or an animal (Nature),
* speed_locomotion_type - the type of movement with which the object reaches its top speed,
* weight_kg - the weight of the machine model or animal in kilograms.
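The two speed columns carry the same information in different units. As a quick consistency check, the sketch below re-enters the values printed in the table above as plain vectors (the names mph, km_h and km_per_mile are illustrative and not part of the original script) and checks whether each km/h value equals the mph value multiplied by 1.609344 and rounded to the nearest integer.

```r
# Values copied from the printed tibble above, in row order.
mph  <- c(304, 242, 242, 205, 200, 200, 194, 137, 120, 107, 101, 75, 68, 40)
km_h <- c(489, 389, 389, 330, 322, 322, 312, 220, 193, 172, 163, 121, 109, 64)

# One international mile is 1.609344 kilometers.
km_per_mile <- 1.609344

# Convert mph to km/h and round to whole numbers, as in the data set.
converted <- round(mph * km_per_mile)

# TRUE when every converted value matches the km/h column.
units_agree <- all(converted == km_h)
units_agree
```

Because the two columns are equivalent up to rounding, the rest of the analysis can work with either one; the report uses top_speed_km_h throughout.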
The glimpse() output showed that the types of some variables should be changed. These variables are:
* sex - should be converted to factor,
* physical_category - should be converted to factor,
* speed_locomotion_type - should be converted to factor.
These conversions were applied to the variables in order to get valid results in the analysis phase.
df$sex <- df$sex %>% as_factor()
df$physical_category <- df$physical_category %>% as_factor()
df$speed_locomotion_type <- df$speed_locomotion_type %>% as_factor()
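One detail worth keeping in mind with these conversions is the order of the resulting factor levels. forcats::as_factor() is documented to keep the levels of a character vector in order of first appearance, whereas base R's factor() sorts them alphabetically. The sketch below illustrates the difference with base R only, on a short illustrative vector (the vector and the names f_sorted and f_seen are assumptions made for this example).

```r
# Illustrative values, not the full column from the data set.
locomotion <- c("moving", "flying", "flying", "moving", "running")

# Base R default: levels are sorted alphabetically.
f_sorted <- factor(locomotion)

# Levels in order of first appearance, which is the ordering
# as_factor() is documented to use for character input.
f_seen <- factor(locomotion, levels = unique(locomotion))

levels(f_sorted)
levels(f_seen)
```

The level order does not change the values themselves, but it does affect the order of groups in tables and plot legends.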
Additionally, some stages of the analysis require separate data frames for machines and animals, so two subsets of the base data frame were created.
df_machines <- df %>% dplyr::filter(physical_category == "Machine")
df_machines
## # A tibble: 7 × 7
## name sex top_s…¹ top_s…² physi…³ speed…⁴ weigh…⁵
## <chr> <fct> <dbl> <dbl> <fct> <fct> <dbl>
## 1 Bugatti Chiron Super Sport 300+… <NA> 304 489 Machine moving 1978
## 2 Porsche 911 Turbo S (2021) <NA> 205 330 Machine moving 1646
## 3 Chevrolet Corvette Stingray (20… <NA> 194 312 Machine moving 1654
## 4 Honda Civic (2021) <NA> 137 220 Machine moving 1309
## 5 Toyota RAV4 (2021) <NA> 120 193 Machine moving 1583
## 6 Ford F-150 Raptor (2020) <NA> 107 172 Machine moving 2584
## 7 Honda Ruckus (2020) <NA> 40 64 Machine moving 88
## # … with abbreviated variable names ¹top_speed_mph, ²top_speed_km_h,
## # ³physical_category, ⁴speed_locomotion_type, ⁵weight_kg
df_animals <- df %>% dplyr::filter(physical_category == "Nature")
df_animals
## # A tibble: 7 × 7
## name sex top_speed_mph top_spe…¹ physi…² speed…³ weigh…⁴
## <chr> <fct> <dbl> <dbl> <fct> <fct> <dbl>
## 1 Peregrine Falcon Male 242 389 Nature flying 1
## 2 Peregrine Falcon Female 242 389 Nature flying 1.5
## 3 Golden Eagle Male 200 322 Nature flying 4.6
## 4 Golden Eagle Female 200 322 Nature flying 6.7
## 5 Mexican free-tailed bat <NA> 101 163 Nature running 0.013
## 6 Cheetah <NA> 75 121 Nature running 72
## 7 Sailfish <NA> 68 109 Nature swimmi… 90
## # … with abbreviated variable names ¹top_speed_km_h, ²physical_category,
## # ³speed_locomotion_type, ⁴weight_kg
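Separate data frames are convenient, but per-group statistics can also be computed without splitting the data. The following is a minimal sketch using base R's tapply() on a small data frame rebuilt from the speeds printed above (the object names speeds, group_means and group_medians are illustrative and not part of the original script).

```r
# Speeds copied from the two subsets printed above.
speeds <- data.frame(
  physical_category = rep(c("Machine", "Nature"), each = 7),
  top_speed_km_h = c(489, 330, 312, 220, 193, 172, 64,
                     389, 389, 322, 322, 163, 121, 109)
)

# Apply a summary function to the speed column within each category.
group_means   <- tapply(speeds$top_speed_km_h, speeds$physical_category, mean)
group_medians <- tapply(speeds$top_speed_km_h, speeds$physical_category, median)

group_means
group_medians
```

The result is a named vector with one entry per category, which makes the two groups easy to compare side by side.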
After the separate data frames were created, individual calculations were applied to each of them, starting with summary statistics. First, the sex variable was dropped from the machines data frame, since it is recorded only for animals.
df_machines <- df_machines %>% select(-sex)
df_machines
## # A tibble: 7 × 6
## name top_s…¹ top_s…² physi…³ speed…⁴ weigh…⁵
## <chr> <dbl> <dbl> <fct> <fct> <dbl>
## 1 Bugatti Chiron Super Sport 300+ (2019) 304 489 Machine moving 1978
## 2 Porsche 911 Turbo S (2021) 205 330 Machine moving 1646
## 3 Chevrolet Corvette Stingray (2020) 194 312 Machine moving 1654
## 4 Honda Civic (2021) 137 220 Machine moving 1309
## 5 Toyota RAV4 (2021) 120 193 Machine moving 1583
## 6 Ford F-150 Raptor (2020) 107 172 Machine moving 2584
## 7 Honda Ruckus (2020) 40 64 Machine moving 88
## # … with abbreviated variable names ¹top_speed_mph, ²top_speed_km_h,
## # ³physical_category, ⁴speed_locomotion_type, ⁵weight_kg
After creating the subsets and applying the necessary conversions, the results were checked with the glimpse() function again.
glimpse(df)
## Rows: 14
## Columns: 7
## $ name <chr> "Bugatti Chiron Super Sport 300+ (2019)", "Pereg…
## $ sex <fct> NA, Male, Female, NA, Male, Female, NA, NA, NA, …
## $ top_speed_mph <dbl> 304, 242, 242, 205, 200, 200, 194, 137, 120, 107…
## $ top_speed_km_h <dbl> 489, 389, 389, 330, 322, 322, 312, 220, 193, 172…
## $ physical_category <fct> Machine, Nature, Nature, Machine, Nature, Nature…
## $ speed_locomotion_type <fct> moving, flying, flying, moving, flying, flying, …
## $ weight_kg <dbl> 1978.000, 1.000, 1.500, 1646.000, 4.600, 6.700, …
glimpse(df_machines)
## Rows: 7
## Columns: 6
## $ name <chr> "Bugatti Chiron Super Sport 300+ (2019)", "Porsc…
## $ top_speed_mph <dbl> 304, 205, 194, 137, 120, 107, 40
## $ top_speed_km_h <dbl> 489, 330, 312, 220, 193, 172, 64
## $ physical_category <fct> Machine, Machine, Machine, Machine, Machine, Mac…
## $ speed_locomotion_type <fct> moving, moving, moving, moving, moving, moving, …
## $ weight_kg <dbl> 1978, 1646, 1654, 1309, 1583, 2584, 88
glimpse(df_animals)
## Rows: 7
## Columns: 7
## $ name <chr> "Peregrine Falcon", "Peregrine Falcon", "Golden …
## $ sex <fct> Male, Female, Male, Female, NA, NA, NA
## $ top_speed_mph <dbl> 242, 242, 200, 200, 101, 75, 68
## $ top_speed_km_h <dbl> 389, 389, 322, 322, 163, 121, 109
## $ physical_category <fct> Nature, Nature, Nature, Nature, Nature, Nature, …
## $ speed_locomotion_type <fct> flying, flying, flying, flying, running, running…
## $ weight_kg <dbl> 1.000, 1.500, 4.600, 6.700, 0.013, 72.000, 90.000
max(df_machines$top_speed_km_h)
## [1] 489
min(df_machines$top_speed_km_h)
## [1] 64
mean(df_machines$top_speed_km_h)
## [1] 254.2857
median(df_machines$top_speed_km_h)
## [1] 220
range(df_machines$top_speed_km_h)
## [1] 64 489
mode(df_machines$top_speed_km_h)
## [1] "numeric"
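Note that base R's mode() returns the storage mode of an object ("numeric" here), not the statistical mode. Since all seven machine speeds are distinct, the speed variable of machines has no mode.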
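The minimum, median and maximum computed above, together with the quartiles, can be obtained in a single call. The sketch below uses base R's fivenum(), which returns Tukey's five-number summary (minimum, lower hinge, median, upper hinge, maximum); these are the values a standard box plot is built from. The vector machine_speeds is re-entered from the output above and the names five and hinge_spread are illustrative.

```r
# Machine speeds in km/h, copied from the glimpse() output above.
machine_speeds <- c(489, 330, 312, 220, 193, 172, 64)

# Tukey's five-number summary: min, lower hinge, median, upper hinge, max.
five <- fivenum(machine_speeds)
five

# Spread of the middle half of the data (upper hinge minus lower hinge).
hinge_spread <- five[4] - five[2]
hinge_spread
```

For this sample the summary is 64, 182.5, 220, 321 and 489, so the middle half of the machine speeds spans 138.5 km/h.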
df_machines$top_speed_km_h %>% boxplot(xlab="Speed (km/h): minimum, median, IQR and maximum", horizontal=TRUE)
df_machines$top_speed_km_h %>% hist(main="Histogram of speeds of machines", xlab="Speed range (km/h)", ylab="Frequency")
max(df_animals$top_speed_km_h)
## [1] 389
min(df_animals$top_speed_km_h)
## [1] 109
mean(df_animals$top_speed_km_h)
## [1] 259.2857
median(df_animals$top_speed_km_h)
## [1] 322
range(df_animals$top_speed_km_h)
## [1] 109 389
mode(df_animals$top_speed_km_h)
## [1] "numeric"
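As with the machines, mode() only reports the storage mode here. The animal speeds do contain repeated values: 389 and 322 km/h each occur twice, because the male and female entries of the peregrine falcon and the golden eagle share the same top speed.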
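Base R has no built-in function for the statistical mode, since mode() reports the storage mode of an object. A minimal sketch of one way to find the most frequent value(s) with table() follows; stat_modes is a hypothetical helper written for this example, not part of the original analysis or of any package.

```r
# Hypothetical helper: returns the most frequent value(s) of a numeric
# vector, or an empty vector when no value occurs more than once.
stat_modes <- function(x) {
  counts <- table(x)
  if (max(counts) == 1) {
    return(x[0])  # every value is unique, so there is no mode
  }
  # table() stores the values as names, so convert them back to numbers.
  as.numeric(names(counts)[counts == max(counts)])
}

# Speeds in km/h, copied from the glimpse() output above.
machine_speeds <- c(489, 330, 312, 220, 193, 172, 64)
animal_speeds  <- c(389, 389, 322, 322, 163, 121, 109)

stat_modes(machine_speeds)  # empty: all machine speeds are distinct
stat_modes(animal_speeds)   # 322 and 389 each occur twice
```

The repeated animal values come from listing male and female birds of the same species separately, so they say more about how the data set is structured than about the distribution of animal speeds.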
df_animals$top_speed_km_h %>% boxplot(xlab="Speed (km/h): minimum, median, IQR and maximum", horizontal=TRUE)
df_animals$top_speed_km_h %>% hist(main="Histogram of speeds of animals", xlab="Speed range (km/h)", ylab="Frequency")
Do the speeds of machines differ from the speeds of animals?
Null hypothesis. The mean speed of machines is the same as the mean speed of animals.
Alternative hypothesis. The mean speed of machines differs from the mean speed of animals.
t.test(df_machines$top_speed_km_h, df_animals$top_speed_km_h)
##
## Welch Two Sample t-test
##
## data: df_machines$top_speed_km_h and df_animals$top_speed_km_h
## t = -0.071644, df = 11.891, p-value = 0.9441
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -157.2122 147.2122
## sample estimates:
## mean of x mean of y
## 254.2857 259.2857
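To make the test output easier to read, the sketch below recomputes the Welch statistic by hand as the difference in sample means divided by the unpooled standard error, and compares it with the value reported by t.test(). The speed vectors are re-entered from the output above; the names se, t_manual and res are illustrative.

```r
# Speeds in km/h, copied from the glimpse() output above.
machine_speeds <- c(489, 330, 312, 220, 193, 172, 64)
animal_speeds  <- c(389, 389, 322, 322, 163, 121, 109)

# Unpooled standard error of the difference in means (Welch).
se <- sqrt(var(machine_speeds) / length(machine_speeds) +
           var(animal_speeds) / length(animal_speeds))

# Welch t statistic: difference in means over its standard error.
t_manual <- (mean(machine_speeds) - mean(animal_speeds)) / se

# t.test() runs the Welch test by default (var.equal = FALSE).
res <- t.test(machine_speeds, animal_speeds)

t_manual
unname(res$statistic)
res$p.value
```

With a p-value of about 0.94, far above the conventional 0.05 threshold, the null hypothesis of equal mean speeds is not rejected; the 95 percent confidence interval for the difference also includes 0.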
Is there a relationship between the speeds and weights of machines and animals?
Null hypothesis. There is no relationship between the speeds and weights of machines and animals.
Alternative hypothesis. There is a relationship between the speeds and weights of machines and animals.
cor.test(df_machines$top_speed_km_h, df_machines$weight_kg)
##
## Pearson's product-moment correlation
##
## data: df_machines$top_speed_km_h and df_machines$weight_kg
## t = 1.2984, df = 5, p-value = 0.2508
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.4034984 0.9107907
## sample estimates:
## cor
## 0.5021383
ggplot(df_machines,
       aes(top_speed_km_h, weight_kg,
           color = name)) +
  geom_point(size = 3) +
  labs(
    title = "Weight v. Speed",
    x = "Speed (km/h)",
    y = "Weight (kg)"
  )
cor.test(df_animals$top_speed_km_h, df_animals$weight_kg)
##
## Pearson's product-moment correlation
##
## data: df_animals$top_speed_km_h and df_animals$weight_kg
## t = -2.7947, df = 5, p-value = 0.03823
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.96591723 -0.06739484
## sample estimates:
## cor
## -0.7808244
ggplot(df_animals,
       aes(top_speed_km_h, weight_kg, color = name)) +
  geom_point(size = 3) +
  labs(
    title = "Weight v. Speed",
    x = "Speed (km/h)",
    y = "Weight (kg)"
  )
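The t values reported by cor.test() follow directly from the correlation coefficient r and the sample size n through t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below applies this to both groups, using speeds and weights re-entered from the output above (t_from_r is a hypothetical helper and the vector names are illustrative).

```r
# Speeds (km/h) and weights (kg), copied from the glimpse() output above.
machine_speeds  <- c(489, 330, 312, 220, 193, 172, 64)
machine_weights <- c(1978, 1646, 1654, 1309, 1583, 2584, 88)
animal_speeds   <- c(389, 389, 322, 322, 163, 121, 109)
animal_weights  <- c(1, 1.5, 4.6, 6.7, 0.013, 72, 90)

# Hypothetical helper: t statistic implied by Pearson's r for n pairs.
t_from_r <- function(r, n) r * sqrt(n - 2) / sqrt(1 - r^2)

r_machines <- cor(machine_speeds, machine_weights)
r_animals  <- cor(animal_speeds, animal_weights)

t_machines <- t_from_r(r_machines, length(machine_speeds))
t_animals  <- t_from_r(r_animals, length(animal_speeds))

c(r = r_machines, t = t_machines)
c(r = r_animals, t = t_animals)
```

With only 5 degrees of freedom, the same formula shows why a moderate correlation such as 0.50 does not reach significance (p = 0.2508), while the stronger correlation of -0.78 for animals does at the 0.05 level (p = 0.03823).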
According to the statistics, the mean speeds of machines (254.3 km/h) and animals (259.3 km/h) are quite close, even though the maximum speed among the machines (Bugatti Chiron Super Sport 300+ (2019), 489 km/h) is much higher than the maximum among the animals (389 km/h). The Welch t-test (p = 0.9441) gives no evidence against the null hypothesis that the mean speeds are equal. The median speed of animals (322 km/h) is higher than the median speed of machines (220 km/h). The other major findings are a negative correlation between the weights and speeds of animals (r = -0.78, p = 0.038) and a positive, but not statistically significant, correlation between the weights and speeds of machines (r = 0.50, p = 0.25). One of the major limitations of the analysis is the sample size and composition: there are no insect species in the sample, and the movement categories are limited. Also, there is only one motorcycle model (Honda Ruckus) in the sample data.
What’s Faster, Nature or Machine? (Accessed: 31/08/2022)
Bugatti Chiron Super Sport 300+ (2019) (Accessed: 31/08/2022)
Porsche 911 Turbo S (2021) (Accessed: 31/08/2022)
Chevrolet Corvette Stingray (2020) (Accessed: 31/08/2022)
Honda Civic (2021) (Accessed: 31/08/2022)
Toyota RAV4 (2021) (Accessed: 31/08/2022)
Ford F-150 Raptor (2020) (Accessed: 31/08/2022)
Honda Ruckus (2020) (Accessed: 31/08/2022)
Peregrine falcon (Accessed: 31/08/2022)
Golden eagle (Accessed: 31/08/2022)
Mexican free-tailed bat (Accessed: 31/08/2022)
Cheetah (Accessed: 31/08/2022)
Sailfish (Accessed: 31/08/2022)