1.

Correlation: It measures the strength and direction of the linear relationship between two variables. Correlation can be positive (the variables tend to move in the same direction) or negative (they tend to move in opposite directions). The correlation coefficient ranges from -1 to +1: values near -1 or +1 indicate a strong relationship, and values near 0 indicate a weak one. Correlation describes association and does not imply causation.
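As a minimal sketch on made-up numbers (the vectors `x`, `y_up`, `y_down`, and `y_none` are illustrative and not taken from the Airbnb data), the sign and range of the correlation coefficient can be seen with base R's `cor()`:

```r
# Toy vectors (hypothetical values, not from the Airbnb data)
x      <- c(1, 2, 3, 4, 5)
y_up   <- c(2, 4, 5, 4, 6)    # tends to rise as x rises
y_down <- -2 * x + 10         # exact decreasing linear function of x
y_none <- c(3, 1, 4, 1, 3)    # no linear trend in x

r_up   <- cor(x, y_up)        # positive, below +1
r_down <- cor(x, y_down)      # -1: perfect negative linear relationship
r_none <- cor(x, y_none)      # 0 (up to rounding): no linear relationship

print(round(c(r_up = r_up, r_down = r_down, r_none = r_none), 3))
```

`cor()` computes the Pearson coefficient by default, which captures only linear association; a value near 0 does not rule out a nonlinear relationship.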

2.

Covariance: It measures the extent to which two variables vary together. Just as variance measures the variability of a single variable, covariance measures the joint variability of two variables. A positive value means the variables tend to move in the same direction, and a negative value means they tend to move in opposite directions. Unlike correlation, the magnitude of the covariance depends on the units of the variables, so it is not bounded between -1 and +1.
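The unit dependence of covariance can be illustrated with a small sketch. The vectors `price_usd` and `capacity` below are hypothetical values chosen for the example, not rows from the listings data:

```r
# Toy data (hypothetical): nightly price in dollars and guest capacity
price_usd <- c(50, 80, 120, 200, 310)
capacity  <- c(1, 2, 2, 4, 6)

cov_usd   <- cov(price_usd, capacity)        # price measured in dollars
cov_cents <- cov(price_usd * 100, capacity)  # same data, price in cents

# Rescaling one variable rescales the covariance by the same factor ...
print(c(cov_usd = cov_usd, cov_cents = cov_cents))

# ... while the correlation is unchanged, because it divides the
# covariance by the product of the two standard deviations
cor_from_cov <- cov_usd / (sd(price_usd) * sd(capacity))
print(c(cor_from_cov = cor_from_cov,
        cor_usd      = cor(price_usd, capacity),
        cor_cents    = cor(price_usd * 100, capacity)))
```

This is why the sign of a covariance is easy to interpret while its size is not: the size only becomes comparable across variable pairs after it is standardized into a correlation.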

3&4.

# Load the data
Listings <- read.csv("Listings.csv")
Reviews <- read.csv("Reviews.csv")

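# Merge the datasets on listing_id; all = TRUE keeps unmatched rows from both tables (full outer join)
merged_data <- merge(Listings, Reviews, by = "listing_id", all = TRUE)

# Summary statistics for every column of the merged data
summary(merged_data)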
##    listing_id           name              host_id           host_since       
##  Min.   :    2577   Length:5459299     Min.   :     1822   Length:5459299    
##  1st Qu.: 5425967   Class :character   1st Qu.:  8939609   Class :character  
##  Median :14746073   Mode  :character   Median : 31142417   Mode  :character  
##  Mean   :16278585                      Mean   : 65862881                     
##  3rd Qu.:24504409                      3rd Qu.: 96238999                     
##  Max.   :48343530                      Max.   :390187445                     
##                                                                              
##  host_location      host_response_time host_response_rate host_acceptance_rate
##  Length:5459299     Length:5459299     Min.   :0.0        Min.   :0.0         
##  Class :character   Class :character   1st Qu.:1.0        1st Qu.:0.9         
##  Mode  :character   Mode  :character   Median :1.0        Median :1.0         
##                                        Mean   :0.9        Mean   :0.9         
##                                        3rd Qu.:1.0        3rd Qu.:1.0         
##                                        Max.   :1.0        Max.   :1.0         
##                                        NA's   :1523184    NA's   :798117      
##  host_is_superhost  host_total_listings_count host_has_profile_pic
##  Length:5459299     Min.   :   0.000          Length:5459299      
##  Class :character   1st Qu.:   1.000          Class :character    
##  Mode  :character   Median :   2.000          Mode  :character    
##                     Mean   :   7.804                              
##                     3rd Qu.:   5.000                              
##                     Max.   :7235.000                              
##                     NA's   :3992                                  
##  host_identity_verified neighbourhood        district        
##  Length:5459299         Length:5459299     Length:5459299    
##  Class :character       Class :character   Class :character  
##  Mode  :character       Mode  :character   Mode  :character  
##                                                              
##                                                              
##                                                              
##                                                              
##      city              latitude        longitude       property_type     
##  Length:5459299     Min.   :-34.26   Min.   :-99.340   Length:5459299    
##  Class :character   1st Qu.: 13.76   1st Qu.:-43.377   Class :character  
##  Mode  :character   Median : 40.77   Median :  2.377   Mode  :character  
##                     Mean   : 24.19   Mean   :  4.104                     
##                     3rd Qu.: 41.91   3rd Qu.: 18.394                     
##                     Max.   : 48.90   Max.   :151.340                     
##                                                                          
##   room_type          accommodates       bedrooms       amenities        
##  Length:5459299     Min.   : 0.000   Min.   : 1.0     Length:5459299    
##  Class :character   1st Qu.: 2.000   1st Qu.: 1.0     Class :character  
##  Mode  :character   Median : 3.000   Median : 1.0     Mode  :character  
##                     Mean   : 3.445   Mean   : 1.5                       
##                     3rd Qu.: 4.000   3rd Qu.: 2.0                       
##                     Max.   :16.000   Max.   :50.0                       
##                                      NA's   :550476                     
##      price          minimum_nights     maximum_nights      review_scores_rating
##  Min.   :     0.0   Min.   :   1.000   Min.   :1.000e+00   Min.   : 20.00      
##  1st Qu.:    67.0   1st Qu.:   1.000   1st Qu.:4.500e+01   1st Qu.: 93.00      
##  Median :   116.0   Median :   2.000   Median :1.125e+03   Median : 96.00      
##  Mean   :   403.9   Mean   :   5.989   Mean   :1.346e+05   Mean   : 94.58      
##  3rd Qu.:   334.0   3rd Qu.:   3.000   3rd Qu.:1.125e+03   3rd Qu.: 98.00      
##  Max.   :625216.0   Max.   :9999.000   Max.   :2.147e+09   Max.   :100.00      
##                                                            NA's   :92260       
##  review_scores_accuracy review_scores_cleanliness review_scores_checkin
##  Min.   : 2.00          Min.   : 2.0              Min.   : 2.00        
##  1st Qu.:10.00          1st Qu.: 9.0              1st Qu.:10.00        
##  Median :10.00          Median :10.0              Median :10.00        
##  Mean   : 9.73          Mean   : 9.5              Mean   : 9.83        
##  3rd Qu.:10.00          3rd Qu.:10.0              3rd Qu.:10.00        
##  Max.   :10.00          Max.   :10.0              Max.   :10.00        
##  NA's   :125419         NA's   :125227            NA's   :125494       
##  review_scores_communication review_scores_location review_scores_value
##  Min.   : 2.00               Min.   : 2.00          Min.   : 2.00      
##  1st Qu.:10.00               1st Qu.:10.00          1st Qu.: 9.00      
##  Median :10.00               Median :10.00          Median :10.00      
##  Mean   : 9.83               Mean   : 9.75          Mean   : 9.47      
##  3rd Qu.:10.00               3rd Qu.:10.00          3rd Qu.:10.00      
##  Max.   :10.00               Max.   :10.00          Max.   :10.00      
##  NA's   :125390              NA's   :125502         NA's   :125515     
##  instant_bookable     review_id             date            reviewer_id       
##  Length:5459299     Min.   :      282   Length:5459299     Min.   :        1  
##  Class :character   1st Qu.:166643479   Class :character   1st Qu.: 23902058  
##  Mode  :character   Median :342572666   Mode  :character   Median : 66978139  
##                     Mean   :348675319                      Mean   : 98081330  
##                     3rd Qu.:533404482                      3rd Qu.:152893599  
##                     Max.   :735623741                      Max.   :390338478  
##                     NA's   :86156                          NA's   :86156
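
The 86,156 NA's in review_id and reviewer_id are consistent with listings that have no matching review being retained by `all = TRUE`. A minimal sketch of that behaviour, using two hypothetical miniature tables (`listings` and `reviews` here are made up and much smaller than the real files):

```r
# Hypothetical miniature versions of the two tables
listings <- data.frame(listing_id = c(1, 2, 3),
                       price      = c(100, 80, 150))
reviews  <- data.frame(listing_id = c(1, 1, 3),
                       review_id  = c(11, 12, 13))

# all = TRUE keeps unmatched rows from both sides (full outer join)
full <- merge(listings, reviews, by = "listing_id", all = TRUE)
print(full)

# Listing 1 appears twice (two reviews); listing 2 has no review,
# so its review_id is filled with NA
nrow(full)                   # 4
sum(is.na(full$review_id))   # 1
```

Because each listing is repeated once per review, listing-level columns such as price are weighted by review count in the merged summary.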

4.

library(stargazer)
## 
## Please cite as:
##  Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
##  R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
stargazer(merged_data, type = "text", 
          title = "Summary Statistics")
## 
## Summary Statistics
## ===========================================================================================
## Statistic                       N          Mean          St. Dev.       Min        Max     
## -------------------------------------------------------------------------------------------
## listing_id                  5,459,299 16,278,585.000  12,188,827.000   2,577   48,343,530  
## host_id                     5,459,299 65,862,881.000  79,734,618.000   1,822   390,187,445 
## host_response_rate          3,936,115      0.934           0.192       0.000      1.000    
## host_acceptance_rate        4,661,182      0.903           0.201       0.000      1.000    
## host_total_listings_count   5,455,307      7.804          68.330         0        7,235    
## latitude                    5,459,299     24.187          29.890      -34.264    48.905    
## longitude                   5,459,299      4.104          69.585      -99.340    151.340   
## accommodates                5,459,299      3.445           2.027         0         16      
## bedrooms                    4,908,823      1.452           1.090         1         50      
## price                       5,459,299     403.920        2,495.727       0       625,216   
## minimum_nights              5,459,299      5.989          33.308         1        9,999    
## maximum_nights              5,459,299   134,584.200   16,574,518.000     1    2,147,483,647
## review_scores_rating        5,367,039     94.581           4.603        20         100     
## review_scores_accuracy      5,333,880      9.729           0.516         2         10      
## review_scores_cleanliness   5,334,072      9.498           0.669         2         10      
## review_scores_checkin       5,333,805      9.835           0.426         2         10      
## review_scores_communication 5,333,909      9.828           0.436         2         10      
## review_scores_location      5,333,797      9.751           0.491         2         10      
## review_scores_value         5,333,784      9.470           0.599         2         10      
## review_id                   5,373,143 348,675,319.000 206,101,858.000   282    735,623,741 
## reviewer_id                 5,373,143 98,081,330.000  90,805,956.000     1     390,338,478 
## -------------------------------------------------------------------------------------------

5.

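# Correlation and covariance between host response rate and host acceptance rate,
# computed on the listings table using only rows where both values are present
correlation <- cor(Listings$host_response_rate, Listings$host_acceptance_rate, use = "complete.obs")
covariance <- cov(Listings$host_response_rate, Listings$host_acceptance_rate, use = "complete.obs")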
print(correlation)
## [1] 0.3215103
print(covariance)
## [1] 0.02118452
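
Both rate columns contain missing values, which is why `use = "complete.obs"` is needed. The sketch below shows what that option does, on hypothetical vectors (`response` and `acceptance` are made-up values, not the real columns):

```r
# Hypothetical rates with missing values, mimicking the NA's in the real columns
response   <- c(1.0, 0.9, NA, 0.5, 1.0, 0.8)
acceptance <- c(0.9, 1.0, 0.7, NA, 1.0, 0.6)

# Without a 'use' argument, any NA makes the result NA
r_default <- cor(response, acceptance)

# Rows where both values are present
keep <- complete.cases(response, acceptance)

# "complete.obs" drops the incomplete rows before computing the statistic
r_complete <- cor(response, acceptance, use = "complete.obs")
r_manual   <- cor(response[keep], acceptance[keep])

is.na(r_default)                          # TRUE
sum(keep)                                 # 4 complete pairs out of 6
isTRUE(all.equal(r_complete, r_manual))   # TRUE
```

The same row filtering applies to `cov()`, so the correlation and covariance reported above are based on the same set of listings.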