1. Purpose

2. Preprocess The Data

library(readr)
USArain <- read_csv("C:/Users/manhphi2811/Downloads/USArain.csv")
head(USArain)
## # A tibble: 6 × 9
##   Date       Location Temperature Humidity WindSpeed Precipitation CloudCover
##   <date>     <chr>          <dbl>    <dbl>     <dbl>         <dbl>      <dbl>
## 1 2024-01-01 New York        87.5     75.7     28.4         0            69.6
## 2 2024-01-02 New York        83.3     28.7     12.4         0.527        41.6
## 3 2024-01-03 New York        80.9     64.7     14.2         0.917        77.4
## 4 2024-01-04 New York        78.1     59.7     19.4         0.0941       52.5
## 5 2024-01-05 New York        37.1     34.8      3.69        1.36         85.6
## 6 2024-01-06 New York        35.3     56.6     21.4         0.583        22.8
## # ℹ 2 more variables: Pressure <dbl>, RainTomorrow <dbl>
library(dplyr)

# Keep New York rows dated before the report date (15/10/2024)
NewYork_rain <- USArain %>%
  filter(Location == "New York", Date < as.Date("2024-10-15")) %>%
  select(Date, Location, Temperature)

tail(NewYork_rain)
## # A tibble: 6 × 3
##   Date       Location Temperature
##   <date>     <chr>          <dbl>
## 1 2024-10-09 New York        38.6
## 2 2024-10-10 New York        41.3
## 3 2024-10-11 New York        87.3
## 4 2024-10-12 New York        40.5
## 5 2024-10-13 New York        32.5
## 6 2024-10-14 New York        47.8
# Split Date into numeric Day, Month and Year columns
time <- as.Date(NewYork_rain$Date)
Year <- as.numeric(format(time, "%Y"))
Month <- as.numeric(format(time, "%m"))
Day <- as.numeric(format(time, "%d"))
NewYork_rain <- cbind(NewYork_rain, Day, Month, Year)
head(NewYork_rain)
##         Date Location Temperature Day Month Year
## 1 2024-01-01 New York    87.52480   1     1 2024
## 2 2024-01-02 New York    83.25932   2     1 2024
## 3 2024-01-03 New York    80.94305   3     1 2024
## 4 2024-01-04 New York    78.09755   4     1 2024
## 5 2024-01-05 New York    37.05996   5     1 2024
## 6 2024-01-06 New York    35.29865   6     1 2024
tail(NewYork_rain)
##            Date Location Temperature Day Month Year
## 1435 2024-10-09 New York    38.59452   9    10 2024
## 1436 2024-10-10 New York    41.31892  10    10 2024
## 1437 2024-10-11 New York    87.28387  11    10 2024
## 1438 2024-10-12 New York    40.52369  12    10 2024
## 1439 2024-10-13 New York    32.49538  13    10 2024
## 1440 2024-10-14 New York    47.81928  14    10 2024

Overview and conclusion about our data

The initial dataset contains predicted Temperature values from the beginning of 2024 to the end of 2025. For our analysis, however, we used only the data dated before this report was written (15/10/2024). After preprocessing, the data therefore comprise Date (with Day, Month and Year), Location and Temperature for New York City from 01/01/2024 to 14/10/2024. The processed dataset has 1440 observations, which we consider sufficient for applying statistical methods.
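
The period from 01/01/2024 to 14/10/2024 spans 288 calendar days, while the processed dataset has 1440 rows, so it may be worth checking how many rows fall on each date before modelling. The sketch below shows one way to do that in base R. The helper name `check_coverage` and the toy data frame are hypothetical and only stand in for `NewYork_rain`; they are not part of the original analysis.

```r
# Hypothetical helper (not from the original report): summarise how a data
# frame with a Date column covers the calendar.
check_coverage <- function(df) {
  dates <- as.Date(df$Date)
  list(
    n_rows = nrow(df),                       # total observations
    n_dates = length(unique(dates)),         # distinct calendar days
    first_date = min(dates),
    last_date = max(dates),
    max_rows_per_date = max(table(dates))    # largest number of rows on one day
  )
}

# Toy data standing in for NewYork_rain; the values are made up.
toy <- data.frame(
  Date = as.Date(c("2024-01-01", "2024-01-01", "2024-01-02")),
  Temperature = c(50.0, 52.5, 48.0)
)

coverage <- check_coverage(toy)
str(coverage)
```

Calling `check_coverage(NewYork_rain)` on the real data would show whether `n_rows` and `n_dates` agree, which matters if later steps assume one observation per day.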

3. Methodology

Observe the pattern of the Temperature in New York
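
As a minimal sketch of how the temperature pattern could be inspected, the example below computes a mean per month and draws a line plot of the series using base R only. It runs on a toy data frame with made-up values that mimics the `Date`, `Temperature` and `Month` columns of `NewYork_rain`; the choice of monthly means and a line plot is an assumption for illustration, not necessarily the method used in this report.

```r
# Toy stand-in for NewYork_rain (made-up values, for illustration only):
# 31 days of January at 40 and 29 days of February at 50.
toy <- data.frame(
  Date = as.Date("2024-01-01") + 0:59,
  Temperature = c(rep(40, 31), rep(50, 29))
)
toy$Month <- as.numeric(format(toy$Date, "%m"))

# Mean temperature per month: a coarse view of any seasonal pattern.
monthly_mean <- aggregate(Temperature ~ Month, data = toy, FUN = mean)
print(monthly_mean)

# Line plot of the raw series to inspect trend and variability by eye.
plot(toy$Date, toy$Temperature, type = "l",
     xlab = "Date", ylab = "Temperature",
     main = "Temperature over time (toy data)")
```

Replacing `toy` with `NewYork_rain` would apply the same two views to the processed data.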

4. Analysis

5. Prediction and Evaluation

6. Conclusion