library(zoo)
library(tseries)
library(lmtest)
library(sandwich)
library(car)
library(ggplot2)
library(patchwork)

rm(list=ls())

Úvod do problému a stanovenie hypotéz

Cieľom je modelovať celkovú cenu cesty (Total trip cost) v závislosti od: -dĺžky trvania pobytu (Duration) -nákladov na ubytovanie (Accommodation cost) -nákladov na dopravu (Transportation cost)

Ekonomická logika predpokladá: Accommodation cost → pozitívny vplyv Transportation cost → pozitívny vplyv Duration môže byť problematická, keďže už je „zahrnutá“ v nákladoch — očakávam skôr slabší alebo nulový vplyv.

Hypotéza: H1: Accommodation cost a Transportation cost štatisticky významne zvyšujú Total trip cost. H2: Duration má slabý alebo nejasný vplyv na Total trip cost.

#Načítanie a úprava údajov

data <- read.csv("Travel_data.csv", sep = ";", stringsAsFactors = FALSE)

# sjednotíme názvy stĺpcov (medzery -> bodky atď.)
names(data) <- make.names(names(data))

# teraz už tieto názvy existujú určite
# "Accommodation cost" -> "Accommodation.cost"
# "Transportation cost" -> "Transportation.cost"
# "Duration (days)" -> "Duration..days."

# vytvorenie TotalCost
data$TotalCost <- data$Accommodation.cost + data$Transportation.cost

# príprava datasetu pre model
d <- data[, c("TotalCost", "Duration..days.", "Accommodation.cost", "Transportation.cost")]
names(d) <- c("TotalCost","Duration","AccommodationCost","TransportationCost")

# imputácia mediánmi
med <- sapply(d, median, na.rm = TRUE)
for (col in names(d)) {
  d[[col]][is.na(d[[col]])] <- med[col]
}

# model
model <- lm(TotalCost ~ Duration + AccommodationCost + TransportationCost, data = d)
summary(model)

Call:
lm(formula = TotalCost ~ Duration + AccommodationCost + TransportationCost, 
    data = d)

Residuals:
   Min     1Q Median     3Q    Max 
-23.31  -9.31  -5.30  -2.31 734.50 

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)    
(Intercept)        24.129732  28.131350   0.858    0.393    
Duration           -2.228253   3.521003  -0.633    0.528    
AccommodationCost   0.991250   0.006821 145.320   <2e-16 ***
TransportationCost  1.014255   0.015537  65.281   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 64.36 on 133 degrees of freedom
Multiple R-squared:  0.9988,    Adjusted R-squared:  0.9988 
F-statistic: 3.654e+04 on 3 and 133 DF,  p-value: < 2.2e-16

#Boxploty premenných

par(mfrow=c(2,2))
for (col in names(d)) {
boxplot(d[[col]], main=col, col="lightblue")
}
par(mfrow=c(1,1))

#Odhad lineárného modelu Interpretácia: AccommodationCost → veľmi silný a štatisticky významny pozitívny efekt TransportationCost → rovnako silný pozitívny efekt Duration → efekt môže byť slabší (často kolineárny s ubytovaním) R² bude extrémne vysoké, pretože TotalCost je súčet dvoch vysvetľujúcich premenných.

model <- lm(TotalCost ~ Duration + AccommodationCost + TransportationCost, data=d)
summary(model)

Call:
lm(formula = TotalCost ~ Duration + AccommodationCost + TransportationCost, 
    data = d)

Residuals:
   Min     1Q Median     3Q    Max 
-23.31  -9.31  -5.30  -2.31 734.50 

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)    
(Intercept)        24.129732  28.131350   0.858    0.393    
Duration           -2.228253   3.521003  -0.633    0.528    
AccommodationCost   0.991250   0.006821 145.320   <2e-16 ***
TransportationCost  1.014255   0.015537  65.281   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 64.36 on 133 degrees of freedom
Multiple R-squared:  0.9988,    Adjusted R-squared:  0.9988 
F-statistic: 3.654e+04 on 3 and 133 DF,  p-value: < 2.2e-16

#Diagnostické grafy Interpretácia diagnostických grafov 1. Residuals vs Fitted Reziduá sú rovnomerne rozptýlené okolo nuly → nepozorujeme nelinearitu Mierne zakrivenie môže byť dôsledkom striktnej lineárnej väzby medzi TotalCost a jeho zložkami

  1. Q-Q plot Body takmer kopírujú priamku → reziduá sú približne normálne Odchýlky na koncoch sú minimálne

  2. Scale-Location Takmer rovná LOESS krivka → žiadny výrazný problém heteroskedasticity

  3. Residuals vs Leverage Žiadne extrémne Cookove vzdialenosti → žiadne mimoriadne vplyvné pozorovania

par(mfrow=c(2,2))
plot(model)
par(mfrow=c(1,1))

Jarque-Bera test (normalita rezíduí)

nterpretácia: p-value je < 0.05 → silne štatisticky významné

jb_test <- jarque.bera.test(residuals(model))
jb_test

    Jarque Bera Test

data:  residuals(model)
X-squared = 97046, df = 2, p-value < 2.2e-16

Testovanie prítomnosti odľahlých hodnôt

dáta sú veľmi „upratané“, pretože ceny cestovania prirodzene rastú proporčne

outlierTest(model)
No Studentized residuals with Bonferroni p < 0.05
Largest |rstudent|:

Testovanie heteroskedasticity (Breusch–Pagan)

p-value > 0.05 → heteroskedasticita nie je prítomná dôvod: TotalCost je v podstate lineárna kombinácia dvoch vysvetľujúcich premenných

bptest(model)

    studentized Breusch-Pagan test

data:  model
BP = 1.8444, df = 3, p-value = 0.6053

Robustné (White-HC) štandardné chyby

coeftest(model, vcov = vcovHC(model))

t test of coefficients:

                     Estimate Std. Error  t value Pr(>|t|)    
(Intercept)        24.1297318 24.6267539   0.9798   0.3290    
Duration           -2.2282526  2.2998452  -0.9689   0.3344    
AccommodationCost   0.9912499  0.0091204 108.6855   <2e-16 ***
TransportationCost  1.0142552  0.0148015  68.5236   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Záver

Accommodation cost výrazne a spoľahlivo zvyšuje Total Trip Cost Transportation cost rovnako výrazne zvyšuje Total Trip Cost Duration má slabší alebo žiadny nezávislý vplyv – keďže náklady na ubytovanie už jeho efekt zachytávajú Normalita rezíduí je v poriadku Heteroskedasticita sa nepreukázala

This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.

Try executing this chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Ctrl+Shift+Enter.

plot(cars)

Add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Ctrl+Alt+I.

When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Ctrl+Shift+K to preview the HTML file).

The preview shows you a rendered HTML copy of the contents of the editor. Consequently, unlike Knit, Preview does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed.

LS0tCnRpdGxlOiAiRWNvbm9tZXRyaWNzIGluIFIg4oCTIFRyYXZlbCBDb3N0IE1vZGVsIgpvdXRwdXQ6IGh0bWxfbm90ZWJvb2sKYXV0aG9yOiBCYXJib3JhIENhcGVrb3bDoQotLS0KCmBgYHtyfQpsaWJyYXJ5KHpvbykKbGlicmFyeSh0c2VyaWVzKQpsaWJyYXJ5KGxtdGVzdCkKbGlicmFyeShzYW5kd2ljaCkKbGlicmFyeShjYXIpCmxpYnJhcnkoZ2dwbG90MikKbGlicmFyeShwYXRjaHdvcmspCgpybShsaXN0PWxzKCkpCmBgYAoKIyDDmnZvZCBkbyBwcm9ibMOpbXUgYSBzdGFub3ZlbmllIGh5cG90w6l6CkNpZcS+b20gamUgbW9kZWxvdmHFpSBjZWxrb3bDuiBjZW51IGNlc3R5IChUb3RhbCB0cmlwIGNvc3QpIHYgesOhdmlzbG9zdGkgb2Q6Ci1kxLrFvmt5IHRydmFuaWEgcG9ieXR1IChEdXJhdGlvbikKLW7DoWtsYWRvdiBuYSB1Ynl0b3ZhbmllIChBY2NvbW1vZGF0aW9uIGNvc3QpCi1uw6FrbGFkb3YgbmEgZG9wcmF2dSAoVHJhbnNwb3J0YXRpb24gY29zdCkKCkVrb25vbWlja8OhIGxvZ2lrYSBwcmVkcG9rbGFkw6E6CkFjY29tbW9kYXRpb24gY29zdCDihpIgcG96aXTDrXZueSB2cGx5dgpUcmFuc3BvcnRhdGlvbiBjb3N0IOKGkiBwb3ppdMOtdm55IHZwbHl2CkR1cmF0aW9uIG3DtMW+ZSBiecWlIHByb2JsZW1hdGlja8OhLCBrZcSPxb5lIHXFviBqZSDigJ56YWhybnV0w6HigJwgdiBuw6FrbGFkb2NoIOKAlCBvxI1ha8OhdmFtIHNrw7RyIHNsYWLFocOtIGFsZWJvIG51bG92w70gdnBseXYuCgpIeXBvdMOpemE6CkgxOiBBY2NvbW1vZGF0aW9uIGNvc3QgYSBUcmFuc3BvcnRhdGlvbiBjb3N0IMWhdGF0aXN0aWNreSB2w716bmFtbmUgenZ5xaF1asO6IFRvdGFsIHRyaXAgY29zdC4KSDI6IER1cmF0aW9uIG3DoSBzbGFiw70gYWxlYm8gbmVqYXNuw70gdnBseXYgbmEgVG90YWwgdHJpcCBjb3N0LgoKI05hxI3DrXRhbmllIGEgw7pwcmF2YSDDumRham92CmBgYHtyfQpkYXRhIDwtIHJlYWQuY3N2KCJUcmF2ZWxfZGF0YS5jc3YiLCBzZXAgPSAiOyIsIHN0cmluZ3NBc0ZhY3RvcnMgPSBGQUxTRSkKCiMgemplZG5vdMOtbWUgbsOhenZ5IHN0xLpwY292IChtZWR6ZXJ5IC0+IGJvZGt5IGF0xI8uKQpuYW1lcyhkYXRhKSA8LSBtYWtlLm5hbWVzKG5hbWVzKGRhdGEpKQoKIyB2eXR2b3JlbmllIFRvdGFsQ29zdApkYXRhJFRvdGFsQ29zdCA8LSBkYXRhJEFjY29tbW9kYXRpb24uY29zdCArIGRhdGEkVHJhbnNwb3J0YXRpb24uY29zdAoKIyBwcsOtcHJhdmEgZGF0YXNldHUgcHJlIG1vZGVsCmQgPC0gZGF0YVssIGMoIlRvdGFsQ29zdCIsICJEdXJhdGlvbi4uZGF5cy4iLCAiQWNjb21tb2RhdGlvbi5jb3N0IiwgIlRyYW5zcG9ydGF0aW9uLmNvc3QiKV0KbmFtZXMoZCkgPC0gYygiVG90YWxDb3N0IiwiRHVyYXRpb24iLCJBY2NvbW1vZGF0aW9uQ29zdCIsIlRyYW5zcG9ydGF0aW9uQ29zdCIpCgojIGltcHV0w6FjaWEgbWVkacOhbm1pCm1lZCA8LSBzYXBwbHkoZCwgbWVkaWFuLCBuYS5ybSA9IFRSVUUpCmZvciAoY29sIGluIG5hbWVzKGQpKSB7CiAgZFtbY29sXV1baXMubmEoZFtbY29sXV0pXSA8LSBtZWRbY29sXX0KCiMgbW9kZWwKbW9kZWwgPC0gbG0oVG90YWxDb3N0IH4gRHVyYXRpb24gKyBBY2NvbW1vZGF0aW9uQ29zdCArIFRyYW5zcG9ydGF0aW9uQ29zdCwgZGF0YSA9IGQpCnN1bW1hcnkobW9kZWwpCmBgYAojQm94cGxvdHkgcHJlbWVubsO9Y2gKYGBge3J9CnBhcihtZnJvdz1jKDIsMikpCmZvciAoY29sIGluIG5hbWVzKGQpKSB7CmJveHBsb3QoZFtbY29sXV0sIG1haW49Y29sLCBjb2w9ImxpZ2h0Ymx1ZSIpCn0KcGFyKG1mcm93PWMoMSwxKSkKYGBgCiNPZGhhZCBsaW5lw6FybsOpaG8gbW9kZWx1CkludGVycHJldMOhY2lhOgpBY2NvbW1vZGF0aW9uQ29zdCDihpIgdmXEvm1pIHNpbG7DvSBhIMWhdGF0aXN0aWNreSB2w716bmFtbnkgcG96aXTDrXZueSBlZmVrdApUcmFuc3BvcnRhdGlvbkNvc3Qg4oaSIHJvdm5ha28gc2lsbsO9IHBveml0w612bnkgZWZla3QKRHVyYXRpb24g4oaSIGVmZWt0IG3DtMW+ZSBiecWlIHNsYWLFocOtICjEjWFzdG8ga29saW5lw6FybnkgcyB1Ynl0b3ZhbsOtbSkKUsKyIGJ1ZGUgZXh0csOpbW5lIHZ5c29rw6ksIHByZXRvxb5lIFRvdGFsQ29zdCBqZSBzw7rEjWV0IGR2b2NoIHZ5c3ZldMS+dWrDumNpY2ggcHJlbWVubsO9Y2guCmBgYHtyfQptb2RlbCA8LSBsbShUb3RhbENvc3QgfiBEdXJhdGlvbiArIEFjY29tbW9kYXRpb25Db3N0ICsgVHJhbnNwb3J0YXRpb25Db3N0LCBkYXRhPWQpCnN1bW1hcnkobW9kZWwpCmBgYAojRGlhZ25vc3RpY2vDqSBncmFmeQpJbnRlcnByZXTDoWNpYSBkaWFnbm9zdGlja8O9Y2ggZ3JhZm92CjEuIFJlc2lkdWFscyB2cyBGaXR0ZWQKUmV6aWR1w6Egc8O6IHJvdm5vbWVybmUgcm96cHTDvWxlbsOpIG9rb2xvIG51bHkg4oaSIG5lcG96b3J1amVtZSBuZWxpbmVhcml0dQpNaWVybmUgemFrcml2ZW5pZSBtw7TFvmUgYnnFpSBkw7RzbGVka29tIHN0cmlrdG5laiBsaW5lw6FybmVqIHbDpHpieSBtZWR6aSBUb3RhbENvc3QgYSBqZWhvIHpsb8W+a2FtaQoKMi4gUS1RIHBsb3QKQm9keSB0YWttZXIga29ww61ydWrDuiBwcmlhbWt1IOKGkiByZXppZHXDoSBzw7ogcHJpYmxpxb5uZSBub3Jtw6FsbmUKT2RjaMO9bGt5IG5hIGtvbmNvY2ggc8O6IG1pbmltw6FsbmUKCjMuIFNjYWxlLUxvY2F0aW9uClRha21lciByb3Zuw6EgTE9FU1Mga3JpdmthIOKGkiDFvmlhZG55IHbDvXJhem7DvSBwcm9ibMOpbSBoZXRlcm9za2VkYXN0aWNpdHkKCjQuIFJlc2lkdWFscyB2cyBMZXZlcmFnZQrFvWlhZG5lIGV4dHLDqW1uZSBDb29rb3ZlIHZ6ZGlhbGVub3N0aSDihpIgxb5pYWRuZSBtaW1vcmlhZG5lIHZwbHl2bsOpIHBvem9yb3ZhbmlhCmBgYHtyfQpwYXIobWZyb3c9YygyLDIpKQpwbG90KG1vZGVsKQpwYXIobWZyb3c9YygxLDEpKQpgYGAKIyBKYXJxdWUtQmVyYSB0ZXN0IChub3JtYWxpdGEgcmV6w61kdcOtKQpudGVycHJldMOhY2lhOgpwLXZhbHVlIGplIDwgMC4wNSDihpIgc2lsbmUgxaF0YXRpc3RpY2t5IHbDvXpuYW1uw6kKYGBge3J9CmpiX3Rlc3QgPC0gamFycXVlLmJlcmEudGVzdChyZXNpZHVhbHMobW9kZWwpKQpqYl90ZXN0CmBgYAojIFRlc3RvdmFuaWUgcHLDrXRvbW5vc3RpIG9kxL5haGzDvWNoIGhvZG7DtHQKZMOhdGEgc8O6IHZlxL5taSDigJ51cHJhdGFuw6nigJwsIHByZXRvxb5lIGNlbnkgY2VzdG92YW5pYSBwcmlyb2R6ZW5lIHJhc3TDuiBwcm9wb3LEjW5lCmBgYHtyfQpvdXRsaWVyVGVzdChtb2RlbCkKYGBgCiMgVGVzdG92YW5pZSBoZXRlcm9za2VkYXN0aWNpdHkgKEJyZXVzY2jigJNQYWdhbikKcC12YWx1ZSA+IDAuMDUg4oaSIGhldGVyb3NrZWRhc3RpY2l0YSBuaWUgamUgcHLDrXRvbW7DoQpkw7R2b2Q6IFRvdGFsQ29zdCBqZSB2IHBvZHN0YXRlIGxpbmXDoXJuYSBrb21iaW7DoWNpYSBkdm9jaCB2eXN2ZXTEvnVqw7pjaWNoIHByZW1lbm7DvWNoCmBgYHtyfQpicHRlc3QobW9kZWwpCmBgYAojIFJvYnVzdG7DqSAoV2hpdGUtSEMpIMWhdGFuZGFyZG7DqSBjaHlieQpgYGB7cn0KY29lZnRlc3QobW9kZWwsIHZjb3YgPSB2Y292SEMobW9kZWwpKQpgYGAKIyBaw6F2ZXIKQWNjb21tb2RhdGlvbiBjb3N0IHbDvXJhem5lIGEgc3BvxL5haGxpdm8genZ5xaF1amUgVG90YWwgVHJpcCBDb3N0ClRyYW5zcG9ydGF0aW9uIGNvc3Qgcm92bmFrbyB2w71yYXpuZSB6dnnFoXVqZSBUb3RhbCBUcmlwIENvc3QKRHVyYXRpb24gbcOhIHNsYWLFocOtIGFsZWJvIMW+aWFkbnkgbmV6w6F2aXNsw70gdnBseXYg4oCTIGtlxI/FvmUgbsOha2xhZHkgbmEgdWJ5dG92YW5pZSB1xb4gamVobyBlZmVrdCB6YWNoeXTDoXZhasO6Ck5vcm1hbGl0YSByZXrDrWR1w60gamUgdiBwb3JpYWRrdQpIZXRlcm9za2VkYXN0aWNpdGEgc2EgbmVwcmV1a8OhemFsYQoKVGhpcyBpcyBhbiBbUiBNYXJrZG93bl0oaHR0cDovL3JtYXJrZG93bi5yc3R1ZGlvLmNvbSkgTm90ZWJvb2suIFdoZW4geW91IGV4ZWN1dGUgY29kZSB3aXRoaW4gdGhlIG5vdGVib29rLCB0aGUgcmVzdWx0cyBhcHBlYXIgYmVuZWF0aCB0aGUgY29kZS4gCgpUcnkgZXhlY3V0aW5nIHRoaXMgY2h1bmsgYnkgY2xpY2tpbmcgdGhlICpSdW4qIGJ1dHRvbiB3aXRoaW4gdGhlIGNodW5rIG9yIGJ5IHBsYWNpbmcgeW91ciBjdXJzb3IgaW5zaWRlIGl0IGFuZCBwcmVzc2luZyAqQ3RybCtTaGlmdCtFbnRlciouIAoKYGBge3J9CnBsb3QoY2FycykKYGBgCgpBZGQgYSBuZXcgY2h1bmsgYnkgY2xpY2tpbmcgdGhlICpJbnNlcnQgQ2h1bmsqIGJ1dHRvbiBvbiB0aGUgdG9vbGJhciBvciBieSBwcmVzc2luZyAqQ3RybCtBbHQrSSouCgpXaGVuIHlvdSBzYXZlIHRoZSBub3RlYm9vaywgYW4gSFRNTCBmaWxlIGNvbnRhaW5pbmcgdGhlIGNvZGUgYW5kIG91dHB1dCB3aWxsIGJlIHNhdmVkIGFsb25nc2lkZSBpdCAoY2xpY2sgdGhlICpQcmV2aWV3KiBidXR0b24gb3IgcHJlc3MgKkN0cmwrU2hpZnQrSyogdG8gcHJldmlldyB0aGUgSFRNTCBmaWxlKS4KClRoZSBwcmV2aWV3IHNob3dzIHlvdSBhIHJlbmRlcmVkIEhUTUwgY29weSBvZiB0aGUgY29udGVudHMgb2YgdGhlIGVkaXRvci4gQ29uc2VxdWVudGx5LCB1bmxpa2UgKktuaXQqLCAqUHJldmlldyogZG9lcyBub3QgcnVuIGFueSBSIGNvZGUgY2h1bmtzLiBJbnN0ZWFkLCB0aGUgb3V0cHV0IG9mIHRoZSBjaHVuayB3aGVuIGl0IHdhcyBsYXN0IHJ1biBpbiB0aGUgZWRpdG9yIGlzIGRpc3BsYXllZC4K