---
title: "First R Markdown document – exercise 1"
author: "Hana Belohorcová"
date: "03.11.2025"
output:
  html_document:
    toc: true
    toc_depth: 2
    toc_float: true
    number_sections: true
    theme: cosmo
---
We load all required packages in one place at the start of the script.
We use the built-in mtcars data (shipped with R) and convert it to a
tibble so that it prints more readably.
library(tidyverse)
library(knitr)
# Data preparation ---------------------------
data_raw <- mtcars |>
  rownames_to_column(var = "model") |>
  as_tibble()
# For readability, rename a selected column to snake_case
# (demonstration only; hp and mpg already follow the convention).
# Note: wt is stored in thousands of pounds, and weight_lbs keeps that unit.
data <- data_raw |>
  rename(weight_lbs = wt)
# Quick look at the data
head(data)
## # A tibble: 6 × 12
## model mpg cyl disp hp drat weight_lbs qsec vs am gear carb
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Mazda … 21 6 160 110 3.9 2.62 16.5 0 1 4 4
## 2 Mazda … 21 6 160 110 3.9 2.88 17.0 0 1 4 4
## 3 Datsun… 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
## 4 Hornet… 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
## 5 Hornet… 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
## 6 Valiant 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
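The preparation step can also be reproduced in base R, which makes it easier to see what `rownames_to_column()` and `rename()` do. The following is a minimal sketch, not the document's own method: the object name `cars` and the extra column `weight_kg` are hypothetical and only illustrate that `wt` is recorded in thousands of pounds.

```r
# Base R sketch of the preparation step (assumed names: cars, weight_kg)
cars <- cbind(model = rownames(mtcars), mtcars)  # copy row names into a column
rownames(cars) <- NULL                           # drop the original row names

# Rename wt to weight_lbs, as in the tidyverse version
names(cars)[names(cars) == "wt"] <- "weight_lbs"

# wt is in units of 1000 lbs; 1 lb = 0.45359237 kg
cars$weight_kg <- cars$weight_lbs * 1000 * 0.45359237

head(cars[, c("model", "weight_lbs", "weight_kg")], 3)
```

Keeping the unit in mind matters later, because the plot axis is labelled in thousands of pounds.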
Using the native pipe |> and named arguments, let's compute a few summary
statistics by number of cylinders (cyl).
summary_tbl <- data |>
  summarise(
    avg_mpg = mean(mpg),
    avg_hp = mean(hp),
    avg_weight = mean(weight_lbs),
    .by = cyl
  ) |>
  arrange(cyl)
kable(summary_tbl, caption = "Summary statistics by number of cylinders")
| cyl | avg_mpg | avg_hp | avg_weight |
|---|---|---|---|
| 4 | 26.66364 | 82.63636 | 2.285727 |
| 6 | 19.74286 | 122.28571 | 3.117143 |
| 8 | 15.10000 | 209.21429 | 3.999214 |
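As a cross-check of the grouped means, the same aggregation can be sketched in base R with `aggregate()`. The object name `check` is an assumption made for this example; the columns keep their original mtcars names (`mpg`, `hp`, `wt`) because the sketch works on the raw data.

```r
# Base R cross-check: mean of mpg, hp and wt for each value of cyl
check <- aggregate(cbind(mpg, hp, wt) ~ cyl, data = mtcars, FUN = mean)
check
```

The result has one row per cylinder count, sorted by `cyl`, and should agree with the table above up to rounding.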
Relationship between weight and fuel economy (miles per gallon). We add a linear trend for each cylinder group.
data |>
  ggplot(aes(x = weight_lbs, y = mpg, color = factor(cyl))) +
  geom_point(size = 2, alpha = 0.8) +
  geom_smooth(method = "lm", se = FALSE) +
  scale_color_brewer(palette = "Dark2", name = "Cylinders") +
  labs(
    x = "Weight (1000 lbs)",
    y = "Fuel economy (mpg)",
    title = "Fuel economy vs. car weight",
    subtitle = "Dataset mtcars",
    caption = "Source: built-in R data"
  ) +
  theme_minimal(base_size = 12)
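Because colour is mapped to `factor(cyl)`, `geom_smooth(method = "lm")` draws a separate regression line for each cylinder group. A rough way to inspect the numbers behind those lines is to fit the models directly. This is a sketch on the raw mtcars columns; the names `fits` and `slopes` are assumptions for the example.

```r
# Fit mpg ~ wt separately within each cylinder group
fits <- lapply(
  split(mtcars, mtcars$cyl),
  function(d) coef(lm(mpg ~ wt, data = d))
)

# Extract the slope (change in mpg per additional 1000 lbs) for each group
slopes <- sapply(fits, function(cf) cf[["wt"]])
slopes
```

A negative slope means that heavier cars in that group travel fewer miles per gallon.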
Let's define a simple function that converts mpg to fuel consumption in
litres per 100 km. We keep the notation simple and return a vector in the
same order as the input.
mpg_to_l_per_100km <- function(mpg) {
  # 1 mile = 1.60934 km, 1 US gallon = 3.78541 l
  # l/100km = 100 * (3.78541 / (1.60934 * mpg))
  100 * (3.78541 / (1.60934 * mpg))
}
# Apply the function and add the result to the data
with_consumption <- data |>
  mutate(l_per_100km = mpg_to_l_per_100km(mpg))
head(with_consumption |> select(model, mpg, l_per_100km))
## # A tibble: 6 × 3
## model mpg l_per_100km
## <chr> <dbl> <dbl>
## 1 Mazda RX4 21 11.2
## 2 Mazda RX4 Wag 21 11.2
## 3 Datsun 710 22.8 10.3
## 4 Hornet 4 Drive 21.4 11.0
## 5 Hornet Sportabout 18.7 12.6
## 6 Valiant 18.1 13.0
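The conversion can be sanity-checked with a small round trip. The sketch below repeats the function with named constants and adds an inverse; `KM_PER_MILE`, `L_PER_GALLON` and `l_per_100km_to_mpg()` are hypothetical names introduced only for this check.

```r
# Constants as used in the function above (US gallon)
KM_PER_MILE <- 1.60934
L_PER_GALLON <- 3.78541

mpg_to_l_per_100km <- function(mpg) {
  100 * L_PER_GALLON / (KM_PER_MILE * mpg)
}

# The formula is its own inverse: x -> k / x with k = 100 * L_PER_GALLON / KM_PER_MILE
l_per_100km_to_mpg <- function(l_per_100km) {
  100 * L_PER_GALLON / (KM_PER_MILE * l_per_100km)
}

mpg_values <- c(21, 22.8, 18.1)
consumption <- mpg_to_l_per_100km(mpg_values)

round(consumption, 1)  # 11.2 10.3 13.0, matching the rows printed above
all.equal(l_per_100km_to_mpg(consumption), mpg_values)  # TRUE
```

Since the whole conversion reduces to dividing a constant (about 235.2) by the input, the function is vectorized without any extra work.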
For demonstration, we set the random seed and draw a random subsample of 10 cars.
set.seed(42)
sampled <- with_consumption |>
  slice_sample(n = 10) |>
  arrange(desc(mpg))
kable(sampled |> select(model, mpg, l_per_100km, hp),
      caption = "Randomly selected subsample (n = 10)")
| model | mpg | l_per_100km | hp |
|---|---|---|---|
| Fiat 128 | 32.4 | 7.259724 | 66 |
| Hornet 4 Drive | 21.4 | 10.991358 | 110 |
| Volvo 142E | 21.4 | 10.991358 | 109 |
| Mazda RX4 | 21.0 | 11.200717 | 110 |
| Pontiac Firebird | 19.2 | 12.250784 | 175 |
| Merc 280 | 19.2 | 12.250784 | 123 |
| Hornet Sportabout | 18.7 | 12.578345 | 175 |
| Chrysler Imperial | 14.7 | 16.001024 | 230 |
| Duster 360 | 14.3 | 16.448605 | 245 |
| Cadillac Fleetwood | 10.4 | 22.616832 | 205 |
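The reason for calling `set.seed()` is reproducibility: resetting the seed before a draw yields the same draw again. The sketch below shows the principle with base R's `sample()`. It is an illustration only and makes no claim that these indices are the rows `slice_sample()` picks; `first_draw` and `second_draw` are assumed names.

```r
# Same seed, same draw: row indices sampled from mtcars
set.seed(42)
first_draw <- sample(nrow(mtcars), size = 10)

set.seed(42)
second_draw <- sample(nrow(mtcars), size = 10)

identical(first_draw, second_draw)  # TRUE

# The sampled car models
rownames(mtcars)[first_draw]
```

Without the second `set.seed(42)`, the next call to `sample()` would continue the random stream and would generally return different rows.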
## R version 4.5.1 (2025-06-13)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 20.04.6 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3; LAPACK version 3.9.0
##
## locale:
## [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
## [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
## [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
## [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
##
## time zone: UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] knitr_1.50 lubridate_1.9.4 forcats_1.0.1 stringr_1.5.2
## [5] dplyr_1.1.4 purrr_1.1.0 readr_2.1.5 tidyr_1.3.1
## [9] tibble_3.3.0 ggplot2_4.0.0 tidyverse_2.0.0
##
## loaded via a namespace (and not attached):
## [1] Matrix_1.7-3 gtable_0.3.6 jsonlite_2.0.0 compiler_4.5.1
## [5] tidyselect_1.2.1 jquerylib_0.1.4 splines_4.5.1 scales_1.4.0
## [9] fastmap_1.2.0 lattice_0.22-7 R6_2.6.1 labeling_0.4.3
## [13] generics_0.1.4 bslib_0.9.0 pillar_1.11.1 RColorBrewer_1.1-3
## [17] tzdb_0.5.0 rlang_1.1.6 utf8_1.2.6 stringi_1.8.7
## [21] cachem_1.1.0 xfun_0.54 sass_0.4.10 S7_0.2.0
## [25] timechange_0.3.0 cli_3.6.5 mgcv_1.9-3 withr_3.0.2
## [29] magrittr_2.0.4 digest_0.6.37 grid_4.5.1 rstudioapi_0.17.1
## [33] hms_1.1.4 nlme_3.1-168 lifecycle_1.0.4 vctrs_0.6.5
## [37] evaluate_1.0.5 glue_1.8.0 farver_2.1.2 rmarkdown_2.30
## [41] tools_4.5.1 pkgconfig_2.0.3 htmltools_0.5.8.1