About Dataset

The “Superstore Sales” dataset is a comprehensive and versatile collection of data that provides valuable insights into sales, customer behavior, and product performance. This dataset offers a rich resource for in-depth analysis.

Containing information from diverse regions and segments, the dataset enables exploration of trends, patterns, and correlations in sales and customer preferences. The dataset encompasses sales transactions, enabling researchers and analysts to understand buying patterns, identify high-demand products, and assess the effectiveness of different shipping modes.

Moreover, the dataset provides an opportunity to examine the impact of various factors such as discounts, geographical locations, and product categories on profitability. By analyzing this dataset, businesses and data enthusiasts can uncover actionable insights for optimizing pricing strategies, supply chain management, and customer engagement.

Whether used for educational purposes, business strategy formulation, or data analysis practice, the “Superstore Sales” dataset offers a comprehensive platform to delve into the dynamics of sales operations, customer interactions, and the factors that drive business success.

Packages Used

library(dplyr)  
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats   1.0.0     ✔ readr     2.1.4
## ✔ ggplot2   3.4.3     ✔ stringr   1.5.0
## ✔ lubridate 1.9.2     ✔ tibble    3.2.1
## ✔ purrr     1.0.2     ✔ tidyr     1.3.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(reshape2)
## 
## Attaching package: 'reshape2'
## 
## The following object is masked from 'package:tidyr':
## 
##     smiths
library(pwr)
library(caret)
## Loading required package: lattice
## 
## Attaching package: 'caret'
## 
## The following object is masked from 'package:purrr':
## 
##     lift
library(caTools)
library(ROSE)
## Loaded ROSE 0.0-4
library(moments)

Reading The Data

df <- read.csv('/Users/fahadmehfooz/Desktop/IUPUI/First Semester/Intro to Statistics/Intro to Stats Dataset/Dataset 1/Superstore.csv')
head(df, 5)
##   Row.ID       Order.ID Order.Date  Ship.Date      Ship.Mode Customer.ID
## 1      1 CA-2013-152156 09-11-2013 12-11-2013   Second Class    CG-12520
## 2      2 CA-2013-152156 09-11-2013 12-11-2013   Second Class    CG-12520
## 3      3 CA-2013-138688 13-06-2013 17-06-2013   Second Class    DV-13045
## 4      4 US-2012-108966 11-10-2012 18-10-2012 Standard Class    SO-20335
## 5      5 US-2012-108966 11-10-2012 18-10-2012 Standard Class    SO-20335
##     Customer.Name   Segment       Country            City      State
## 1     Claire Gute  Consumer United States       Henderson   Kentucky
## 2     Claire Gute  Consumer United States       Henderson   Kentucky
## 3 Darrin Van Huff Corporate United States     Los Angeles California
## 4  Sean O'Donnell  Consumer United States Fort Lauderdale    Florida
## 5  Sean O'Donnell  Consumer United States Fort Lauderdale    Florida
##   Postal.Code Region      Product.ID        Category Sub.Category
## 1       42420  South FUR-BO-10001798       Furniture    Bookcases
## 2       42420  South FUR-CH-10000454       Furniture       Chairs
## 3       90036   West OFF-LA-10000240 Office Supplies       Labels
## 4       33311  South FUR-TA-10000577       Furniture       Tables
## 5       33311  South OFF-ST-10000760 Office Supplies      Storage
##                                                  Product.Name    Sales Quantity
## 1                           Bush Somerset Collection Bookcase 261.9600        2
## 2 Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back 731.9400        3
## 3   Self-Adhesive Address Labels for Typewriters by Universal  14.6200        2
## 4               Bretford CR4500 Series Slim Rectangular Table 957.5775        5
## 5                              Eldon Fold 'N Roll Cart System  22.3680        2
##   Discount    Profit
## 1     0.00   41.9136
## 2     0.00  219.5820
## 3     0.00    6.8714
## 4     0.45 -383.0310
## 5     0.20    2.5164

Descriptive Statistics:

colnames(df)
##  [1] "Row.ID"        "Order.ID"      "Order.Date"    "Ship.Date"    
##  [5] "Ship.Mode"     "Customer.ID"   "Customer.Name" "Segment"      
##  [9] "Country"       "City"          "State"         "Postal.Code"  
## [13] "Region"        "Product.ID"    "Category"      "Sub.Category" 
## [17] "Product.Name"  "Sales"         "Quantity"      "Discount"     
## [21] "Profit"
dim(df)
## [1] 9994   21

Checking the statistics of the dataset

summary(df)
##      Row.ID       Order.ID          Order.Date         Ship.Date        
##  Min.   :   1   Length:9994        Length:9994        Length:9994       
##  1st Qu.:2499   Class :character   Class :character   Class :character  
##  Median :4998   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :4998                                                           
##  3rd Qu.:7496                                                           
##  Max.   :9994                                                           
##   Ship.Mode         Customer.ID        Customer.Name        Segment         
##  Length:9994        Length:9994        Length:9994        Length:9994       
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##    Country              City              State            Postal.Code   
##  Length:9994        Length:9994        Length:9994        Min.   : 1040  
##  Class :character   Class :character   Class :character   1st Qu.:23223  
##  Mode  :character   Mode  :character   Mode  :character   Median :56430  
##                                                           Mean   :55190  
##                                                           3rd Qu.:90008  
##                                                           Max.   :99301  
##     Region           Product.ID          Category         Sub.Category      
##  Length:9994        Length:9994        Length:9994        Length:9994       
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##  Product.Name           Sales              Quantity        Discount     
##  Length:9994        Min.   :    0.444   Min.   : 1.00   Min.   :0.0000  
##  Class :character   1st Qu.:   17.280   1st Qu.: 2.00   1st Qu.:0.0000  
##  Mode  :character   Median :   54.490   Median : 3.00   Median :0.2000  
##                     Mean   :  229.858   Mean   : 3.79   Mean   :0.1562  
##                     3rd Qu.:  209.940   3rd Qu.: 5.00   3rd Qu.:0.2000  
##                     Max.   :22638.480   Max.   :14.00   Max.   :0.8000  
##      Profit         
##  Min.   :-6599.978  
##  1st Qu.:    1.729  
##  Median :    8.666  
##  Mean   :   28.657  
##  3rd Qu.:   29.364  
##  Max.   : 8399.976

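Every column has 9,994 values, matching the row count returned by dim(df).

The summary shows the central tendency of each numeric column. Row.ID and Postal.Code are identifiers stored as numbers, so their means and quartiles carry no analytical meaning; Sales, Quantity, Discount, and Profit are the columns of interest.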

Checking unique values in categorical columns and their counts

columns_to_count <- c("Category", "Region", "Country", "Segment", "State")

# Looping through the categorical columns and printing the counts
for (col in columns_to_count) {
    if (col %in% names(df)) {
        cat("Counts for", col, ":\n")
        print(table(df[[col]]))
        cat("\n")  # Blank line between tables
    } else {
        cat(col, "is not a column in the dataframe.\n\n")
    }
}
## Counts for Category :
## 
##       Furniture Office Supplies      Technology 
##            2121            6026            1847 
## 
## Counts for Region :
## 
## Central    East   South    West 
##    2323    2848    1620    3203 
## 
## Counts for Country :
## 
## United States 
##          9994 
## 
## Counts for Segment :
## 
##    Consumer   Corporate Home Office 
##        5191        3020        1783 
## 
## Counts for State :
## 
##              Alabama              Arizona             Arkansas 
##                   61                  224                   60 
##           California             Colorado          Connecticut 
##                 2001                  182                   82 
##             Delaware District of Columbia              Florida 
##                   96                   10                  383 
##              Georgia                Idaho             Illinois 
##                  184                   21                  492 
##              Indiana                 Iowa               Kansas 
##                  149                   30                   24 
##             Kentucky            Louisiana                Maine 
##                  139                   42                    8 
##             Maryland        Massachusetts             Michigan 
##                  105                  135                  255 
##            Minnesota          Mississippi             Missouri 
##                   89                   53                   66 
##              Montana             Nebraska               Nevada 
##                   15                   38                   39 
##        New Hampshire           New Jersey           New Mexico 
##                   27                  130                   37 
##             New York       North Carolina         North Dakota 
##                 1128                  249                    7 
##                 Ohio             Oklahoma               Oregon 
##                  469                   66                  124 
##         Pennsylvania         Rhode Island       South Carolina 
##                  587                   56                   42 
##         South Dakota            Tennessee                Texas 
##                   12                  183                  985 
##                 Utah              Vermont             Virginia 
##                   53                   11                  224 
##           Washington        West Virginia            Wisconsin 
##                  506                    4                  110 
##              Wyoming 
##                    1

Checking the missing values proportions

# Viewing the missing values proportions

missing_vals <- colSums(is.na(df)) / nrow(df)
missing_vals
##        Row.ID      Order.ID    Order.Date     Ship.Date     Ship.Mode 
##             0             0             0             0             0 
##   Customer.ID Customer.Name       Segment       Country          City 
##             0             0             0             0             0 
##         State   Postal.Code        Region    Product.ID      Category 
##             0             0             0             0             0 
##  Sub.Category  Product.Name         Sales      Quantity      Discount 
##             0             0             0             0             0 
##        Profit 
##             0

Checking central tendency:

get_mode <- function(x) {
  unique_x <- unique(x)
  unique_x[which.max(tabulate(match(x, unique_x)))]
}

summarize_numeric_columns <- function(df) {
  numeric_columns <- sapply(df, is.numeric)  
  stats <- data.frame() 

  for (col_name in names(df)[numeric_columns]) {
    column_data <- df[[col_name]]
    stats[col_name, "Mean"] <- mean(column_data, na.rm = TRUE)
    stats[col_name, "Median"] <- median(column_data, na.rm = TRUE)
    stats[col_name, "Mode"] <- get_mode(column_data)
    col_IQR <- IQR(column_data, na.rm = TRUE)
    stats[col_name, "IQR"] <- col_IQR
    stats[col_name, "Lower Bound"] <- quantile(column_data, 0.25, na.rm = TRUE) - 1.5 * col_IQR
    stats[col_name, "Upper Bound"] <- quantile(column_data, 0.75, na.rm = TRUE) + 1.5 * col_IQR
  }

  return(stats)
}

summary_stats_df <- summarize_numeric_columns(df)
summary_stats_df
##                     Mean     Median     Mode         IQR  Lower Bound
## Row.ID      4.997500e+03  4997.5000     1.00  4996.50000  -4995.50000
## Postal.Code 5.519038e+04 56430.5000 10035.00 66785.00000 -76954.50000
## Sales       2.298580e+02    54.4900    12.96   192.66000   -271.71000
## Quantity    3.789574e+00     3.0000     3.00     3.00000     -2.50000
## Discount    1.562027e-01     0.2000     0.00     0.20000     -0.30000
## Profit      2.865690e+01     8.6665     0.00    27.63525    -39.72413
##              Upper Bound
## Row.ID       14990.50000
## Postal.Code 190185.50000
## Sales          498.93000
## Quantity         9.50000
## Discount         0.50000
## Profit          70.81687
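
The Lower Bound and Upper Bound columns are the usual 1.5 × IQR fences. As a minimal sketch of how such fences can be used to flag outliers, the hypothetical helper `flag_iqr_outliers()` below (not part of the original analysis) marks values that fall outside them. The toy vector is made up and only stands in for a column such as `df$Profit`.

```r
# Hypothetical helper: TRUE where a value lies outside the k * IQR fences
flag_iqr_outliers <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  spread <- q[[2]] - q[[1]]                      # interquartile range
  x < q[[1]] - k * spread | x > q[[2]] + k * spread
}

toy_profit <- c(2, 5, 7, 8, 9, 11, 12, 250)      # made-up values, one extreme
outlier_flags <- flag_iqr_outliers(toy_profit)
toy_profit[outlier_flags]                        # values outside the fences
sum(outlier_flags)                               # how many were flagged
```

Applied to a real column, `mean(flag_iqr_outliers(x))` would give the share of rows outside the fences, which is often easier to compare across columns than the raw bounds.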

Checking Skewness and Kurtosis for the columns

selected_columns <- df[, c("Sales", "Quantity", "Discount", "Profit")]

# Calculating kurtosis and skewness
kurtosis_values <- sapply(selected_columns, kurtosis)
skewness_values <- sapply(selected_columns, skewness)

# Combining the results into a data frame
skew_kurt_result <- rbind(kurtosis = kurtosis_values, skewness = skewness_values)
print(skew_kurt_result)
##              Sales Quantity Discount     Profit
## kurtosis 308.15843 4.990293 5.407740 399.989229
## skewness  12.97081 1.278353 1.684042   7.560297

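All four columns have kurtosis above 3, so each has heavier tails than a normal distribution. Sales (308.2) and Profit (400.0) are extreme cases.

All four columns are also positively skewed, so none of them is close to normally distributed.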

Interpretation:

Skewness:

Definition: Skewness measures the degree of asymmetry of a distribution. A distribution can be asymmetrical on either the right (positive skew) or left (negative skew) side.

Interpretation:

Positive Skew (Right-skewed): The tail on the right side of the distribution is longer or fatter than the left side. It indicates that a large number of data points are clustered on the left, with a few exceptionally large values to the right.

Negative Skew (Left-skewed): The tail on the left side of the distribution is longer or fatter than the right side. This suggests that a large number of observations are clustered on the right, with some exceptionally small values to the left.

Zero or Close to Zero: If the skewness is zero or close to zero, the data is considered to be fairly symmetrical.

Kurtosis:

Definition: Kurtosis measures the “tailedness” of a distribution, that is, how much of its spread comes from extreme values, relative to a normal distribution.

Interpretation:

High Kurtosis (Leptokurtic): A distribution with a kurtosis greater than 3 (excess kurtosis greater than 0) is said to be leptokurtic. Leptokurtic distributions have heavy tails and are more prone to outliers. They are often drawn with a higher, sharper peak than a normal distribution, although the statistic is driven mainly by the tails.

Low Kurtosis (Platykurtic): A distribution with a kurtosis less than 3 (excess kurtosis less than 0) is platykurtic. Platykurtic distributions have lighter tails or fewer outliers. The peak is lower and broader than the peak of a normal distribution.

Normal Kurtosis (Mesokurtic): A kurtosis of exactly 3 (excess kurtosis of 0) indicates a mesokurtic distribution, which has a shape similar to a normal distribution.
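
To make the two definitions concrete, here is a minimal base R sketch that computes both statistics from central moments. It uses the population-moment forms (no small-sample correction), which to my understanding is what the moments package computes as well; the helper names and the toy vector are illustrative assumptions.

```r
# Central moment of a given order: mean of (x - mean(x))^order
central_moment <- function(x, order) mean((x - mean(x))^order)

# Skewness: third central moment scaled by the variance to the power 1.5
skew_manual <- function(x) central_moment(x, 3) / central_moment(x, 2)^1.5

# Kurtosis: fourth central moment scaled by the squared variance (normal = 3)
kurt_manual <- function(x) central_moment(x, 4) / central_moment(x, 2)^2

toy <- c(1, 2, 2, 3, 3, 3, 4, 20)   # made-up values with one large observation
skew_manual(toy)                    # positive, because of the long right tail
kurt_manual(toy)                    # compare against 3
```

A single large value is enough to push both statistics up sharply, which is why the very large values for Sales and Profit point to a few extreme rows rather than to the bulk of the data.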

Exploratory Data Analysis

Univariate Analysis:

Distribution for Profit Variable

# Creating a histogram of Profit on the density scale
hist(df$Profit, 
     main = "Distribution of Profit",  
     xlab = "Profit",              # X-axis label
     ylab = "Density",             # Y-axis label (freq = FALSE plots densities)
     col = "green",                # Bar color
     border = "black",            # Border color
     breaks = 20,                  # Suggested number of bins
     freq = FALSE)                 # Plot density rather than counts

# Displaying density also
lines(density(df$Profit), col = "black", lwd = 2)

qqnorm(df$Profit)
qqline(df$Profit, col = "red")  

It's a right-skewed distribution. Profit reaches roughly 8,400 on the right and -6,600 on the left (see the summary above), and these extreme values are likely outliers. We would have to dig deeper into this.

Distribution for Quantity Variable

hist(df$Quantity, 
     main = "Distribution of Quantity",  
     xlab = "Quantity",            # X-axis label
     ylab = "Density",             # Y-axis label (freq = FALSE plots densities)
     col = "green",                # Bar color
     border = "black",            # Border color
     breaks = 20,                  # Suggested number of bins
     freq = FALSE)                 # Plot density rather than counts

# Displaying density also
lines(density(df$Quantity), col = "black", lwd = 2)

qqnorm(df$Quantity)
qqline(df$Quantity, col = "red")

> This is also a right-skewed distribution.

The Q-Q plot shows the same thing: the points pull away from the reference line in the upper right.

Checking the Categories Variable

category_table <- table(df$Category)
category_percentages <- round(100 * category_table / sum(category_table), 1)
labels <- paste(names(category_percentages), "-", category_percentages, "%", sep = "")

pie(category_table,
    labels = labels,
    main = "Categories - Variable Distribution",
    xlab = "Categories",
    ylab = "Frequency")

Office Supplies is the most frequent category, with 6,026 of the 9,994 order lines (about 60%).

barplot(table(df$Region),
        main = "Region Variable Distribution",
        xlab = "Region",
        ylab = "Frequency")

The West region has the most order lines (3,203), followed by East (2,848).

Boxplots to check outliers in Sales:

ggplot(df, aes(y = Sales)) +
  geom_boxplot(coef = 1.5) +  
  labs(y = "Sales") +
  ggtitle("Box Plot of Sales")

> Sales has a large number of outliers above the upper whisker. Each one is an individual order line with an unusually high sales value, which could be related to the discounts given or to other factors.

Boxplots to check outliers in Profit:

ggplot(df, aes(y = Profit)) +
  geom_boxplot(coef = 1.5) +
  labs(y = "Profit") +
  ggtitle("Box Plot of Profit")

Segment Wise Division

ggplot(data = df, aes(x = Segment)) +
  geom_bar() +
  labs(x = "Segment", y = "Count") +
  ggtitle("Bar Plot of Segment")

Multivariate Analysis

Is there a relationship between Sales and Profit?

plot(df$Sales, df$Profit, 
     main = "Scatter Plot: Sales vs. Profit",
     xlab = "Sales",
     ylab = "Profit",
     col = "red"
)

Is there a relationship between Quantity and Discount?

ggplot(df, aes(x = Quantity, y = Discount)) +
  geom_point() +
  labs(x = "Quantity", y = "Discount") +
  ggtitle("Scatter Plot of Quantity vs. Discount")

The points look fairly evenly spread, which suggests there is no clear relationship between quantity and discount.

df$Category <- as.factor(df$Category)
df$Segment <- as.factor(df$Segment)

# Create a grouped bar plot
ggplot(df, aes(x = Category, fill = Segment)) +
  geom_bar(position = "dodge", color = "black", stat = "count") +
  labs(title = "Grouped Bar Plot of Category and Segment",
       x = "Category",
       y = "Count") +
  scale_fill_manual(values = c("red", "blue", "green"))  

# df_num is containing only numeric columns of df
df_num <- df[sapply(df, is.numeric)]

# Calculating the correlation matrix
df_corr <- cor(df_num)

# Melting the correlation matrix for ggplot
df_corr_melted <- melt(df_corr)

# Creating the heatmap plot
heatmap_plot <- ggplot(data = df_corr_melted, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  geom_text(aes(label = round(value, 2)), vjust = 1) +
  scale_fill_gradientn(colors = c("blue", "white", "red")) + # Custom color gradient
  labs(title = "Correlation Heatmap for Continuous Variables", x = "Features", y = "Features", fill = "Correlation")

# Printing the heatmap plot
print(heatmap_plot)

> No strong correlations can be seen here.

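The numeric columns of this dataset include Row.ID and Postal.Code, which are identifiers rather than measurements, so their correlations are not meaningful. The following is a minimal sketch, on a made-up toy data frame, of restricting the matrix to the measurement columns and comparing Pearson correlation with Spearman rank correlation. Spearman is swapped in here as an alternative that is less sensitive to extreme values; it is not part of the original analysis.

```r
# Toy stand-in for the Superstore columns; all values are made up
toy_df <- data.frame(
  Row.ID   = 1:6,
  Sales    = c(10, 20, 35, 50, 400, 90),
  Quantity = c(1, 2, 2, 3, 5, 4),
  Discount = c(0, 0.2, 0, 0.2, 0.45, 0),
  Profit   = c(2, 3, 8, 6, -120, 15)
)

# Identifier columns are left out on purpose
measure_cols <- c("Sales", "Quantity", "Discount", "Profit")

pearson_corr  <- cor(toy_df[measure_cols])                       # linear association
spearman_corr <- cor(toy_df[measure_cols], method = "spearman")  # rank-based association

round(pearson_corr, 2)
round(spearman_corr, 2)
```

Either matrix can be passed to `melt()` and plotted with the same heatmap code as above.
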
Inferential Statistics:

Hypothesis Testing:

Choice of test:

ANOVA is used for comparing means of continuous data across multiple groups defined by categorical variables.

Chi-Square Test is used for testing relationships between categorical variables or checking the distribution of categorical variables against expected distributions.

Hypothesis 1:

Test Used: ANOVA

H0: There is no significant difference in quantity ordered between different states

HA: There is a significant difference in quantity ordered between different states.

Significance level: 0.05, Power level: 0.8, Minimum Effect Size: 0.3.
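
The stated power and effect size can be turned into a rough sample-size check. The sketch below is a base R illustration, assuming the effect size of 0.3 is Cohen's f and assuming a balanced design with the same number of rows in each of the 49 State groups (the real data are far from balanced). The function name `anova_power()` is hypothetical; the pwr package loaded earlier offers `pwr.anova.test()` for the same kind of calculation.

```r
# Power of a balanced one-way ANOVA for Cohen's f, via the noncentral F distribution
# k = number of groups, n = observations per group
anova_power <- function(k, n, f, alpha = 0.05) {
  df1  <- k - 1
  df2  <- k * (n - 1)
  ncp  <- k * n * f^2                              # noncentrality parameter
  crit <- qf(alpha, df1, df2, lower.tail = FALSE)  # rejection threshold under H0
  pf(crit, df1, df2, ncp = ncp, lower.tail = FALSE)
}

# Smallest per-group n that reaches power 0.8 for 49 groups and f = 0.3
k_states <- 49
n_needed <- 2
while (anova_power(k_states, n_needed, f = 0.3) < 0.8) {
  n_needed <- n_needed + 1
}
n_needed                                  # required rows per group (balanced case)
anova_power(k_states, n_needed, f = 0.3)  # achieved power at that n
```

Comparing `n_needed` with the per-state counts printed earlier gives a sense of which states contribute too few rows for an effect of this size to be detectable.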

# Quantity is numeric and State is categorical, so a one-way ANOVA is used
result_statevsquantity <- aov(Quantity ~ State, data = df)
summary(result_statevsquantity)
##               Df Sum Sq Mean Sq F value Pr(>F)
## State         48    219   4.556    0.92  0.631
## Residuals   9945  49258   4.953

Since the p-value is 0.631, i.e. greater than 0.05, we fail to reject the null hypothesis. The data provide no evidence of a difference in mean quantity ordered between states.

Hypothesis 2:

Test Used: Chi-Square

Null Hypothesis (H0): There is no association between “State” and “Category”

Alternative Hypothesis (HA): There is an association between “State” and “Category”

Significance level: 0.05, Power level: 0.8, Minimum Effect Size: 0.3.

state_vs_category <- chisq.test(table(df$State, df$Category))
## Warning in chisq.test(table(df$State, df$Category)): Chi-squared approximation
## may be incorrect
state_vs_category
## 
##  Pearson's Chi-squared test
## 
## data:  table(df$State, df$Category)
## X-squared = 102.86, df = 96, p-value = 0.2974

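Since the p-value is 0.2974, i.e. greater than 0.05, we fail to reject the null hypothesis: there is no evidence of an association between State and Category. Note the warning above: some State/Category cells have small expected counts, so the chi-squared approximation may be unreliable and the p-value should be read with caution.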
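
An effect size helps put this result in context. As a small worked sketch, Cramér's V can be computed directly from the numbers in the test output above, where the table has 49 states and 3 categories (consistent with df = 96). The variable names are illustrative.

```r
# Values copied from the chi-squared output and dataset dimensions above
x_squared <- 102.86
n_obs     <- 9994   # rows in the dataset
n_rows    <- 49     # states in the contingency table
n_cols    <- 3      # categories in the contingency table

# Cramér's V: sqrt(X^2 / (N * (min(rows, cols) - 1)))
cramers_v <- sqrt(x_squared / (n_obs * (min(n_rows, n_cols) - 1)))
round(cramers_v, 3)
```

The result is about 0.07, well below the minimum effect size of 0.3 stated for this test.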

Hypothesis 3:

Test Used: ANOVA

H0: There’s no effect of subcategory on sales.

HA: Subcategory does have an effect on sales.

# If there are more than 10 subcategories, consolidate the rarest ones
if (length(unique(df$Sub.Category)) > 10) {
  # Keep the 10 most frequent subcategories and group the rest into an "Other" category
  subcat_counts <- table(df$Sub.Category)
  small_subcats <- names(subcat_counts)[order(subcat_counts)][1:(length(subcat_counts)-10)]
  df$Sub.Category[df$Sub.Category %in% small_subcats] <- 'Other'
}

result <- aov(Sales ~ Sub.Category, data = df)
anova_summary <- summary(result)
print(anova_summary)
##                Df    Sum Sq  Mean Sq F value Pr(>F)    
## Sub.Category   10 3.111e+08 31105056   86.97 <2e-16 ***
## Residuals    9983 3.571e+09   357666                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
significance_level <- 0.05
p_value <- anova_summary[[1]]$'Pr(>F)'[1]

if (p_value < significance_level) {
  print("Reject the null hypothesis: Sales differ among the subcategories.")
} else {
  print("Do not reject the null hypothesis: There's no significant difference in sales among the subcategories.")
}
## [1] "Reject the null hypothesis: Sales differ among the subcategories."
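
With nearly 10,000 rows, a very small p-value does not by itself say how large the effect is. As a short worked sketch, eta squared (the share of variance in Sales associated with sub-category) can be computed from the sums of squares in the ANOVA table above. The variable names are illustrative.

```r
# Sums of squares copied from the ANOVA table above
ss_between <- 3.111e+08   # Sub.Category
ss_within  <- 3.571e+09   # Residuals

# Eta squared: between-group sum of squares over total sum of squares
eta_squared <- ss_between / (ss_between + ss_within)
round(eta_squared, 3)
```

This comes to roughly 0.08, meaning sub-category accounts for about 8% of the variation in Sales.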

A significance level of 0.05 corresponds to a 95% confidence level. It means that, when the null hypothesis is true, the test will wrongly reject it about 5% of the time. Equivalently, if the same study were repeated many times, about 95% of the calculated 95% confidence intervals would contain the true population parameter.

Feature Engineering

# Converting date strings to Date objects
df$Order.Date <- as.Date(df$Order.Date, format = "%d-%m-%Y")
df$Ship.Date <- as.Date(df$Ship.Date, format = "%d-%m-%Y")

# Calculating the number of days between order date and ship date
df$Days_Between <- as.numeric(df$Ship.Date - df$Order.Date)
head(df["Days_Between"], 5)
##   Days_Between
## 1            3
## 2            3
## 3            4
## 4            7
## 5            7

One-Hot Encoding for Categorical Variables

cols_to_encode <- c("Ship.Mode", "Segment", "Country", "City", "State", "Region", "Category", "Sub.Category")

# Identify columns with more than one level
cols_with_multiple_levels <- sapply(df[cols_to_encode], function(x) length(unique(x)) > 1)

# Create a formula for one-hot encoding with only columns having multiple levels
formula <- as.formula(paste("~", paste(cols_to_encode[cols_with_multiple_levels], collapse = " + ")))

# Create dummy variables
dummy_vars <- dummyVars(formula, data = df)
encoded_data <- predict(dummy_vars, newdata = df)

# Select the columns from 'df' that you want to keep
continuous_cols <- c("Sales", "Quantity", "Discount", "Profit", "Days_Between")

# Combine the selected columns from 'df' with 'encoded_data'
final_data <- cbind(encoded_data, df[continuous_cols])

# Print the result
head(final_data, 1)
##   Ship.ModeFirst Class Ship.ModeSame Day Ship.ModeSecond Class
## 1                    0                 0                     1
##   Ship.ModeStandard Class Segment.Consumer Segment.Corporate
## 1                       0                1                 0
##   Segment.Home Office CityAberdeen CityAbilene CityAkron CityAlbuquerque
## 1                   0            0           0         0               0
##   CityAlexandria CityAllen CityAllentown CityAltoona CityAmarillo CityAnaheim
## 1              0         0             0           0            0           0
##   CityAndover CityAnn Arbor CityAntioch CityApopka CityApple Valley
## 1           0             0           0          0                0
##   CityAppleton CityArlington CityArlington Heights CityArvada CityAsheville
## 1            0             0                     0          0             0
##   CityAthens CityAtlanta CityAtlantic City CityAuburn CityAurora CityAustin
## 1          0           0                 0          0          0          0
##   CityAvondale CityBakersfield CityBaltimore CityBangor CityBartlett
## 1            0               0             0          0            0
##   CityBayonne CityBaytown CityBeaumont CityBedford CityBelleville CityBellevue
## 1           0           0            0           0              0            0
##   CityBellingham CityBethlehem CityBeverly CityBillings CityBloomington
## 1              0             0           0            0               0
##   CityBoca Raton CityBoise CityBolingbrook CityBossier City CityBowling Green
## 1              0         0               0                0                 0
##   CityBoynton Beach CityBozeman CityBrentwood CityBridgeton CityBristol
## 1                 0           0             0             0           0
##   CityBroken Arrow CityBroomfield CityBrownsville CityBryan CityBuffalo
## 1                0              0               0         0           0
##   CityBuffalo Grove CityBullhead City CityBurbank CityBurlington CityCaldwell
## 1                 0                 0           0              0            0
##   CityCamarillo CityCambridge CityCanton CityCarlsbad CityCarol Stream
## 1             0             0          0            0                0
##   CityCarrollton CityCary CityCedar Hill CityCedar Rapids CityChampaign
## 1              0        0              0                0             0
##   CityChandler CityChapel Hill CityCharlotte CityCharlottesville
## 1            0               0             0                   0
##   CityChattanooga CityChesapeake CityChester CityCheyenne CityChicago CityChico
## 1               0              0           0            0           0         0
##   CityChula Vista CityCincinnati CityCitrus Heights CityClarksville
## 1               0              0                  0               0
##   CityCleveland CityClifton CityClinton CityClovis CityCoachella
## 1             0           0           0          0             0
##   CityCollege Station CityColorado Springs CityColumbia CityColumbus
## 1                   0                    0            0            0
##   CityCommerce City CityConcord CityConroe CityConway CityCoon Rapids
## 1                 0           0          0          0               0
##   CityCoppell CityCoral Gables CityCoral Springs CityCorpus Christi
## 1           0                0                 0                  0
##   CityCosta Mesa CityCottage Grove CityCovington CityCranston
## 1              0                 0             0            0
##   CityCuyahoga Falls CityDallas CityDanbury CityDanville CityDavis
## 1                  0          0           0            0         0
##   CityDaytona Beach CityDearborn CityDearborn Heights CityDecatur CityDeer Park
## 1                 0            0                    0           0             0
##   CityDelray Beach CityDeltona CityDenver CityDes Moines CityDes Plaines
## 1                0           0          0              0               0
##   CityDetroit CityDover CityDraper CityDublin CityDubuque CityDurham CityEagan
## 1           0         0          0          0           0          0         0
##   CityEast Orange CityEast Point CityEau Claire CityEdinburg CityEdmond
## 1               0              0              0            0          0
##   CityEdmonds CityEl Cajon CityEl Paso CityElkhart CityElmhurst CityElyria
## 1           0            0           0           0            0          0
##   CityEncinitas CityEnglewood CityEscondido CityEugene CityEvanston CityEverett
## 1             0             0             0          0            0           0
##   CityFairfield CityFargo CityFarmington CityFayetteville CityFlorence
## 1             0         0              0                0            0
##   CityFort Collins CityFort Lauderdale CityFort Worth CityFrankfort
## 1                0                   0              0             0
##   CityFranklin CityFreeport CityFremont CityFresno CityFrisco CityGaithersburg
## 1            0            0           0          0          0                0
##   CityGarden City CityGarland CityGastonia CityGeorgetown CityGilbert
## 1               0           0            0              0           0
##   CityGladstone CityGlendale CityGlenview CityGoldsboro CityGrand Island
## 1             0            0            0             0                0
##   CityGrand Prairie CityGrand Rapids CityGrapevine CityGreat Falls CityGreeley
## 1                 0                0             0               0           0
##   CityGreen Bay CityGreensboro CityGreenville CityGreenwood CityGresham
## 1             0              0              0             0           0
##   CityGrove City CityGulfport CityHackensack CityHagerstown CityHaltom City
## 1              0            0              0              0               0
##   CityHamilton CityHampton CityHarlingen CityHarrisonburg CityHattiesburg
## 1            0           0             0                0               0
##   CityHelena CityHempstead CityHenderson CityHendersonville CityHesperia
## 1          0             0             1                  0            0
##   CityHialeah CityHickory CityHighland Park CityHillsboro CityHolland
## 1           0           0                 0             0           0
##   CityHollywood CityHolyoke CityHomestead CityHoover CityHot Springs
## 1             0           0             0          0               0
##   CityHouston CityHuntington Beach CityHuntsville CityIndependence
## 1           0                    0              0                0
##   CityIndianapolis CityInglewood CityIowa City CityIrving CityJackson
## 1                0             0             0          0           0
##   CityJacksonville CityJamestown CityJefferson City CityJohnson City
## 1                0             0                  0                0
##   CityJonesboro CityJupiter CityKeller CityKenner CityKenosha CityKent
## 1             0           0          0          0           0        0
##   CityKirkwood CityKissimmee CityKnoxville CityLa Crosse CityLa Mesa
## 1            0             0             0             0           0
##   CityLa Porte CityLa Quinta CityLafayette CityLaguna Niguel CityLake Charles
## 1            0             0             0                 0                0
##   CityLake Elsinore CityLake Forest CityLakeland CityLakeville CityLakewood
## 1                 0               0            0             0            0
##   CityLancaster CityLansing CityLaredo CityLas Cruces CityLas Vegas CityLaurel
## 1             0           0          0              0             0          0
##   CityLawrence CityLawton CityLayton CityLeague City CityLebanon CityLehi
## 1            0          0          0               0           0        0
##   CityLeominster CityLewiston CityLincoln Park CityLinden CityLindenhurst
## 1              0            0                0          0               0
##   CityLittle Rock CityLittleton CityLodi CityLogan CityLong Beach CityLongmont
## 1               0             0        0         0              0            0
##   CityLongview CityLorain CityLos Angeles CityLouisville CityLoveland
## 1            0          0               0              0            0
##   CityLowell CityLubbock CityMacon CityMadison CityMalden CityManchester
## 1          0           0         0           0          0              0
##   CityManhattan CityMansfield CityManteca CityMaple Grove CityMargate
## 1             0             0           0               0           0
##   CityMarietta CityMarion CityMarlborough CityMarysville CityMason CityMcallen
## 1            0          0               0              0         0           0
##   CityMedford CityMedina CityMelbourne CityMemphis CityMentor CityMeriden
## 1           0          0             0           0          0           0
##   CityMeridian CityMesa CityMesquite CityMiami CityMiddletown CityMidland
## 1            0        0            0         0              0           0
##   CityMilford CityMilwaukee CityMinneapolis CityMiramar CityMishawaka
## 1           0             0               0           0             0
##   CityMission Viejo CityMissoula CityMissouri City CityMobile CityModesto
## 1                 0            0                 0          0           0
##   CityMonroe CityMontebello CityMontgomery CityMoorhead CityMoreno Valley
## 1          0              0              0            0                 0
##   CityMorgan Hill CityMorristown CityMount Pleasant CityMount Vernon
## 1               0              0                  0                0
##   CityMurfreesboro CityMurray CityMurrieta CityMuskogee CityNaperville
## 1                0          0            0            0              0
##   CityNashua CityNashville CityNew Albany CityNew Bedford CityNew Brunswick
## 1          0             0              0               0                 0
##   CityNew Castle CityNew Rochelle CityNew York City CityNewark CityNewport News
## 1              0                0                 0          0                0
##   CityNiagara Falls CityNoblesville CityNorfolk CityNormal CityNorman
## 1                 0               0           0          0          0
##   CityNorth Charleston CityNorth Las Vegas CityNorth Miami CityNorwich
## 1                    0                   0               0           0
##   CityOak Park CityOakland CityOceanside CityOdessa CityOklahoma City
## 1            0           0             0          0                 0
##   CityOlathe CityOlympia CityOmaha CityOntario CityOrange CityOrem
## 1          0           0         0           0          0        0
##   CityOrland Park CityOrlando CityOrmond Beach CityOswego CityOverland Park
## 1               0           0                0          0                 0
##   CityOwensboro CityOxnard CityPalatine CityPalm Coast CityPark Ridge
## 1             0          0            0              0              0
##   CityParker CityParma CityPasadena CityPasco CityPassaic CityPaterson
## 1          0         0            0         0           0            0
##   CityPearland CityPembroke Pines CityPensacola CityPeoria CityPerth Amboy
## 1            0                  0             0          0               0
##   CityPharr CityPhiladelphia CityPhoenix CityPico Rivera CityPine Bluff
## 1         0                0           0               0              0
##   CityPlainfield CityPlano CityPlantation CityPleasant Grove CityPocatello
## 1              0         0              0                  0             0
##   CityPomona CityPompano Beach CityPort Arthur CityPort Orange
## 1          0                 0               0               0
##   CityPort Saint Lucie CityPortage CityPortland CityProvidence CityProvo
## 1                    0           0            0              0         0
##   CityPueblo CityQuincy CityRaleigh CityRancho Cucamonga CityRapid City
## 1          0          0           0                    0              0
##   CityReading CityRedding CityRedlands CityRedmond CityRedondo Beach
## 1           0           0            0           0                 0
##   CityRedwood City CityReno CityRenton CityRevere CityRichardson CityRichmond
## 1                0        0          0          0              0            0
##   CityRio Rancho CityRiverside CityRochester CityRochester Hills CityRock Hill
## 1              0             0             0                   0             0
##   CityRockford CityRockville CityRogers CityRome CityRomeoville CityRoseville
## 1            0             0          0        0              0             0
##   CityRoswell CityRound Rock CityRoyal Oak CitySacramento CitySaginaw
## 1           0              0             0              0           0
##   CitySaint Charles CitySaint Cloud CitySaint Louis CitySaint Paul
## 1                 0               0               0              0
##   CitySaint Peters CitySaint Petersburg CitySalem CitySalinas
## 1                0                    0         0           0
##   CitySalt Lake City CitySan Angelo CitySan Antonio CitySan Bernardino
## 1                  0              0               0                  0
##   CitySan Clemente CitySan Diego CitySan Francisco CitySan Gabriel CitySan Jose
## 1                0             0                 0               0            0
##   CitySan Luis Obispo CitySan Marcos CitySan Mateo CitySandy Springs
## 1                   0              0             0                 0
##   CitySanford CitySanta Ana CitySanta Barbara CitySanta Clara CitySanta Fe
## 1           0             0                 0               0            0
##   CitySanta Maria CityScottsdale CitySeattle CitySheboygan CityShelton
## 1               0              0           0             0           0
##   CitySierra Vista CitySioux Falls CitySkokie CitySmyrna CitySouth Bend
## 1                0               0          0          0              0
##   CitySouthaven CitySparks CitySpokane CitySpringdale CitySpringfield
## 1             0          0           0              0               0
##   CitySterling Heights CityStockton CitySuffolk CitySummerville CitySunnyvale
## 1                    0            0           0               0             0
##   CitySuperior CityTallahassee CityTamarac CityTampa CityTaylor CityTemecula
## 1            0               0           0         0          0            0
##   CityTempe CityTexarkana CityTexas City CityThe Colony CityThomasville
## 1         0             0              0              0               0
##   CityThornton CityThousand Oaks CityTigard CityTinley Park CityToledo
## 1            0                 0          0               0          0
##   CityTorrance CityTrenton CityTroy CityTucson CityTulsa CityTuscaloosa
## 1            0           0        0          0         0              0
##   CityTwin Falls CityTyler CityUrbandale CityUtica CityVacaville CityVallejo
## 1              0         0             0         0             0           0
##   CityVancouver CityVineland CityVirginia Beach CityVisalia CityWaco
## 1             0            0                  0           0        0
##   CityWarner Robins CityWarwick CityWashington CityWaterbury CityWaterloo
## 1                 0           0              0             0            0
##   CityWatertown CityWaukesha CityWausau CityWaynesboro CityWest Allis
## 1             0            0          0              0              0
##   CityWest Jordan CityWest Palm Beach CityWestfield CityWestland
## 1               0                   0             0            0
##   CityWestminster CityWheeling CityWhittier CityWichita CityWilmington
## 1               0            0            0           0              0
##   CityWilson CityWoodbury CityWoodland CityWoodstock CityWoonsocket CityYonkers
## 1          0            0            0             0              0           0
##   CityYork CityYucaipa CityYuma StateAlabama StateArizona StateArkansas
## 1        0           0        0            0            0             0
##   StateCalifornia StateColorado StateConnecticut StateDelaware
## 1               0             0                0             0
##   StateDistrict of Columbia StateFlorida StateGeorgia StateIdaho StateIllinois
## 1                         0            0            0          0             0
##   StateIndiana StateIowa StateKansas StateKentucky StateLouisiana StateMaine
## 1            0         0           0             1              0          0
##   StateMaryland StateMassachusetts StateMichigan StateMinnesota
## 1             0                  0             0              0
##   StateMississippi StateMissouri StateMontana StateNebraska StateNevada
## 1                0             0            0             0           0
##   StateNew Hampshire StateNew Jersey StateNew Mexico StateNew York
## 1                  0               0               0             0
##   StateNorth Carolina StateNorth Dakota StateOhio StateOklahoma StateOregon
## 1                   0                 0         0             0           0
##   StatePennsylvania StateRhode Island StateSouth Carolina StateSouth Dakota
## 1                 0                 0                   0                 0
##   StateTennessee StateTexas StateUtah StateVermont StateVirginia
## 1              0          0         0            0             0
##   StateWashington StateWest Virginia StateWisconsin StateWyoming RegionCentral
## 1               0                  0              0            0             0
##   RegionEast RegionSouth RegionWest Category.Furniture Category.Office Supplies
## 1          0           1          0                  1                        0
##   Category.Technology Sub.CategoryAccessories Sub.CategoryAppliances
## 1                   0                       0                      0
##   Sub.CategoryArt Sub.CategoryBinders Sub.CategoryChairs
## 1               0                   0                  0
##   Sub.CategoryFurnishings Sub.CategoryLabels Sub.CategoryOther
## 1                       0                  0                 1
##   Sub.CategoryPaper Sub.CategoryPhones Sub.CategoryStorage  Sales Quantity
## 1                 0                  0                   0 261.96        2
##   Discount  Profit Days_Between
## 1        0 41.9136            3
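
The one-row preview above shows an indicator column for every level of each categorical variable (for example, all four Region columns). When such a fully one-hot encoded table is passed to lm() with an intercept, the indicators for one variable sum to 1, so at least one of them is linearly dependent on the rest and its coefficient is reported as NA. This is a likely reason for the "not defined because of singularities" note in the regression summary in this section, along with indicator columns that are all zero in the training rows. The sketch below illustrates the effect on a small toy data frame; the values and the names toy, full, fit_full and fit_ref are hypothetical and not taken from the Superstore data.

```r
# Toy data: hypothetical values, not taken from the Superstore file
toy <- data.frame(
  Sales  = c(10, 12, 20, 22, 30, 33),
  Region = factor(c("East", "East", "South", "South", "West", "West"))
)

# Full one-hot encoding: one indicator column per level, no reference level dropped
full <- as.data.frame(model.matrix(~ Region - 1, data = toy))
full$Sales <- toy$Sales

# With an intercept, the three indicators sum to 1, so the last one
# (RegionWest) is aliased and its coefficient is reported as NA
fit_full <- lm(Sales ~ ., data = full)
print(coef(fit_full))

# Passing the factor directly lets lm() use treatment contrasts, which
# drop one reference level and leave no NA coefficients
fit_ref <- lm(Sales ~ Region, data = toy)
print(coef(fit_ref))

# Both fits give the same fitted values; only the parameterisation differs
print(all.equal(unname(fitted(fit_full)), unname(fitted(fit_ref))))
```

The NA entries therefore do not change the predictions; they only mark redundant columns. Dropping one level per categorical variable before fitting, or passing the factors directly, would give a coefficient table without them.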

Modelling: Linear Regression - Predicting Store Sales on the Basis of Demographic Factors

# caTools provides sample.split()
library(caTools)

# Set a random seed for reproducibility
set.seed(123)

# Split the data into training (70%) and test (30%) sets
split <- sample.split(final_data$Sales, SplitRatio = 0.7)
train_data <- final_data[split, ]
test_data <- final_data[!split, ]

# Fit a linear regression of Sales on all remaining columns
lm_model <- lm(Sales ~ ., data = train_data)

# Print the summary of the fitted model
summary(lm_model)
## 
## Call:
## lm(formula = Sales ~ ., data = train_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1181.8  -131.4   -25.2    60.6  8466.5 
## 
## Coefficients: (53 not defined because of singularities)
##                               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                  6.722e+02  3.965e+02   1.695   0.0900 .  
## `Ship.ModeFirst Class`      -5.309e+00  2.121e+01  -0.250   0.8023    
## `Ship.ModeSame Day`         -2.328e+01  3.579e+01  -0.650   0.5154    
## `Ship.ModeSecond Class`      2.246e+01  1.676e+01   1.340   0.1802    
## `Ship.ModeStandard Class`           NA         NA      NA       NA    
## Segment.Consumer             8.854e+00  1.442e+01   0.614   0.5393    
## Segment.Corporate           -2.995e+00  1.577e+01  -0.190   0.8493    
## `Segment.Home Office`               NA         NA      NA       NA    
## CityAberdeen                        NA         NA      NA       NA    
## CityAbilene                  1.717e-02  5.313e+02   0.000   1.0000    
## CityAkron                    2.216e+02  3.451e+02   0.642   0.5208    
## CityAlbuquerque              3.227e+02  4.122e+02   0.783   0.4338    
## CityAlexandria               4.114e+02  3.531e+02   1.165   0.2440    
## CityAllen                    8.518e+01  4.143e+02   0.206   0.8371    
## CityAllentown                2.951e+02  4.003e+02   0.737   0.4611    
## CityAltoona                  8.566e+01  4.683e+02   0.183   0.8548    
## CityAmarillo                 6.690e+02  3.864e+02   1.731   0.0834 .  
## CityAnaheim                  3.291e+02  3.467e+02   0.949   0.3425    
## CityAndover                  8.743e+01  3.983e+02   0.220   0.8263    
## `CityAnn Arbor`              2.515e+01  4.139e+02   0.061   0.9516    
## CityAntioch                         NA         NA      NA       NA    
## CityApopka                  -2.658e+01  4.173e+02  -0.064   0.9492    
## `CityApple Valley`           2.464e+02  3.923e+02   0.628   0.5300    
## CityAppleton                 6.421e+02  4.877e+02   1.317   0.1880    
## CityArlington                2.584e+02  3.390e+02   0.762   0.4458    
## `CityArlington Heights`      4.753e+01  5.139e+02   0.092   0.9263    
## CityArvada                   7.889e+01  4.052e+02   0.195   0.8456    
## CityAsheville                2.674e+02  3.903e+02   0.685   0.4934    
## CityAthens                   2.179e+02  3.746e+02   0.582   0.5609    
## CityAtlanta                  2.571e+02  3.449e+02   0.745   0.4560    
## `CityAtlantic City`                 NA         NA      NA       NA    
## CityAuburn                   2.279e+02  3.475e+02   0.656   0.5121    
## CityAurora                   1.148e+02  3.221e+02   0.357   0.7214    
## CityAustin                   1.824e+02  3.497e+02   0.522   0.6020    
## CityAvondale                 5.596e+01  2.884e+02   0.194   0.8461    
## CityBakersfield              1.873e+02  3.539e+02   0.529   0.5967    
## CityBaltimore                1.017e+02  3.684e+02   0.276   0.7825    
## CityBangor                   4.337e+02  4.870e+02   0.891   0.3732    
## CityBartlett                -8.990e+01  5.317e+02  -0.169   0.8657    
## CityBayonne                         NA         NA      NA       NA    
## CityBaytown                  2.736e+02  5.308e+02   0.516   0.6062    
## CityBeaumont                 5.028e+01  3.969e+02   0.127   0.8992    
## CityBedford                  7.275e+01  4.139e+02   0.176   0.8605    
## CityBelleville              -7.683e+01  3.956e+02  -0.194   0.8460    
## CityBellevue                 2.577e+02  4.026e+02   0.640   0.5221    
## CityBellingham               7.610e+02  4.357e+02   1.747   0.0807 .  
## CityBethlehem                5.951e+02  4.218e+02   1.411   0.1584    
## CityBeverly                  5.417e+02  4.148e+02   1.306   0.1916    
## CityBillings                        NA         NA      NA       NA    
## CityBloomington              1.124e+02  3.418e+02   0.329   0.7424    
## `CityBoca Raton`             6.401e+01  4.645e+02   0.138   0.8904    
## CityBoise                    3.476e+02  8.071e+02   0.431   0.6667    
## CityBolingbrook             -2.047e+01  3.737e+02  -0.055   0.9563    
## `CityBossier City`           5.726e+02  3.973e+02   1.441   0.1495    
## `CityBowling Green`          1.279e+02  3.598e+02   0.355   0.7224    
## `CityBoynton Beach`          1.101e+02  4.003e+02   0.275   0.7832    
## CityBozeman                  4.212e+02  5.660e+02   0.744   0.4568    
## CityBrentwood                3.885e+02  3.592e+02   1.082   0.2794    
## CityBridgeton               -1.792e+02  4.601e+02  -0.390   0.6969    
## CityBristol                  1.516e+02  3.598e+02   0.421   0.6735    
## `CityBroken Arrow`           4.805e+02  4.431e+02   1.084   0.2782    
## CityBroomfield              -2.019e+01  4.383e+02  -0.046   0.9633    
## CityBrownsville              2.565e+01  3.867e+02   0.066   0.9471    
## CityBryan                    3.198e+01  3.966e+02   0.081   0.9357    
## CityBuffalo                  5.046e+02  3.669e+02   1.375   0.1690    
## `CityBuffalo Grove`          2.145e+02  5.139e+02   0.417   0.6764    
## `CityBullhead City`          8.073e+01  3.532e+02   0.229   0.8192    
## CityBurbank                  8.245e+02  4.421e+02   1.865   0.0622 .  
## CityBurlington               7.491e+02  3.685e+02   2.033   0.0421 *  
## CityCaldwell                 4.065e+02  8.068e+02   0.504   0.6143    
## CityCamarillo                1.745e+02  3.922e+02   0.445   0.6563    
## CityCambridge                1.530e+02  3.873e+02   0.395   0.6928    
## CityCanton                   1.489e+02  4.303e+02   0.346   0.7293    
## CityCarlsbad                 3.331e+02  4.429e+02   0.752   0.4519    
## `CityCarol Stream`           1.591e+02  3.922e+02   0.406   0.6850    
## CityCarrollton               3.423e+02  3.622e+02   0.945   0.3447    
## CityCary                     1.537e+02  3.974e+02   0.387   0.6989    
## `CityCedar Hill`             1.233e+02  5.312e+02   0.232   0.8164    
## `CityCedar Rapids`           3.078e+02  5.463e+02   0.563   0.5732    
## CityChampaign                       NA         NA      NA       NA    
## CityChandler                 6.648e+01  2.736e+02   0.243   0.8080    
## `CityChapel Hill`            2.204e+02  5.388e+02   0.409   0.6825    
## CityCharlotte                2.667e+02  3.606e+02   0.740   0.4596    
## CityCharlottesville          1.939e+02  5.259e+02   0.369   0.7124    
## CityChattanooga             -1.476e+02  3.797e+02  -0.389   0.6976    
## CityChesapeake               1.508e+02  3.517e+02   0.429   0.6681    
## CityChester                  1.512e+02  3.880e+02   0.390   0.6967    
## CityCheyenne                 1.332e+03  5.661e+02   2.352   0.0187 *  
## CityChicago                  1.352e+02  3.146e+02   0.430   0.6673    
## CityChico                    1.917e+02  3.648e+02   0.526   0.5992    
## `CityChula Vista`            6.233e+02  5.272e+02   1.182   0.2371    
## CityCincinnati               1.528e+02  3.478e+02   0.439   0.6603    
## `CityCitrus Heights`         2.922e+02  5.274e+02   0.554   0.5796    
## CityClarksville              8.257e+02  3.976e+02   2.077   0.0379 *  
## CityCleveland                2.123e+02  3.399e+02   0.625   0.5322    
## CityClifton                         NA         NA      NA       NA    
## CityClinton                  1.236e+02  3.711e+02   0.333   0.7392    
## CityClovis                   3.847e+02  5.659e+02   0.680   0.4966    
## CityCoachella                2.516e+02  4.419e+02   0.569   0.5692    
## `CityCollege Station`        6.026e+01  5.314e+02   0.113   0.9097    
## `CityColorado Springs`       1.274e+02  3.487e+02   0.365   0.7149    
## CityColumbia                 2.894e+02  3.450e+02   0.839   0.4017    
## CityColumbus                 1.785e+02  3.319e+02   0.538   0.5908    
## `CityCommerce City`          9.795e+01  5.243e+02   0.187   0.8518    
## CityConcord                  2.619e+02  3.561e+02   0.736   0.4621    
## CityConroe                          NA         NA      NA       NA    
## CityConway                          NA         NA      NA       NA    
## `CityCoon Rapids`            2.062e+02  5.399e+02   0.382   0.7026    
## CityCoppell                  4.371e+01  4.463e+02   0.098   0.9220    
## `CityCoral Gables`          -1.366e+02  5.463e+02  -0.250   0.8025    
## `CityCoral Springs`          5.664e+01  4.172e+02   0.136   0.8920    
## `CityCorpus Christi`         1.306e+02  3.790e+02   0.345   0.7304    
## `CityCosta Mesa`             2.286e+02  3.690e+02   0.620   0.5355    
## `CityCottage Grove`          1.741e+02  5.399e+02   0.322   0.7471    
## CityCovington                1.943e+02  4.194e+02   0.463   0.6432    
## CityCranston                 4.146e+02  4.106e+02   1.010   0.3127    
## `CityCuyahoga Falls`         8.679e+01  4.386e+02   0.198   0.8431    
## CityDallas                   1.556e+02  3.429e+02   0.454   0.6500    
## CityDanbury                  2.885e+02  5.384e+02   0.536   0.5921    
## CityDanville                 1.860e+02  3.748e+02   0.496   0.6198    
## CityDavis                    2.040e+02  5.274e+02   0.387   0.6990    
## `CityDaytona Beach`          4.706e+01  4.339e+02   0.108   0.9136    
## CityDearborn                 1.057e+02  4.303e+02   0.246   0.8060    
## `CityDearborn Heights`       1.463e+02  4.042e+02   0.362   0.7173    
## CityDecatur                  8.020e+01  3.297e+02   0.243   0.8078    
## `CityDeer Park`             -4.057e+01  5.304e+02  -0.076   0.9390    
## `CityDelray Beach`          -1.773e+01  4.645e+02  -0.038   0.9695    
## CityDeltona                 -1.236e+01  4.074e+02  -0.030   0.9758    
## CityDenver                   1.290e+02  3.385e+02   0.381   0.7032    
## `CityDes Moines`             3.364e+02  3.723e+02   0.904   0.3662    
## `CityDes Plaines`            5.964e+01  3.918e+02   0.152   0.8790    
## CityDetroit                  8.452e+01  3.633e+02   0.233   0.8160    
## CityDover                    2.065e+02  3.605e+02   0.573   0.5669    
## CityDraper                   1.480e+02  4.871e+02   0.304   0.7613    
## CityDublin                   2.080e+02  3.576e+02   0.581   0.5609    
## CityDubuque                  3.071e+02  4.639e+02   0.662   0.5081    
## CityDurham                   1.655e+02  3.854e+02   0.429   0.6676    
## CityEagan                    3.003e+02  3.867e+02   0.777   0.4375    
## `CityEast Orange`           -9.376e+01  4.129e+02  -0.227   0.8204    
## `CityEast Point`             2.989e+02  4.421e+02   0.676   0.4991    
## `CityEau Claire`             2.866e+02  4.336e+02   0.661   0.5086    
## CityEdinburg                 9.248e+01  4.462e+02   0.207   0.8358    
## CityEdmond                   5.021e+02  4.876e+02   1.030   0.3031    
## CityEdmonds                  2.948e+02  3.912e+02   0.753   0.4512    
## `CityEl Cajon`               2.536e+02  5.276e+02   0.481   0.6308    
## `CityEl Paso`                1.803e+02  3.579e+02   0.504   0.6146    
## CityElkhart                  2.306e+02  5.254e+02   0.439   0.6608    
## CityElmhurst                 2.936e+02  3.921e+02   0.749   0.4540    
## CityElyria                   1.879e+02  5.247e+02   0.358   0.7203    
## CityEncinitas                8.106e+01  3.926e+02   0.206   0.8364    
## CityEnglewood                1.872e+01  4.381e+02   0.043   0.9659    
## CityEscondido                2.695e+02  4.096e+02   0.658   0.5106    
## CityEugene                   2.800e+02  3.771e+02   0.742   0.4579    
## CityEvanston                 7.583e+01  4.258e+02   0.178   0.8587    
## CityEverett                  1.647e+02  3.551e+02   0.464   0.6427    
## CityFairfield                2.177e+02  3.408e+02   0.639   0.5230    
## CityFargo                    4.768e+02  4.435e+02   1.075   0.2823    
## CityFarmington               4.315e+02  4.875e+02   0.885   0.3761    
## CityFayetteville             1.147e+02  3.654e+02   0.314   0.7536    
## CityFlorence                 1.484e+02  3.476e+02   0.427   0.6695    
## `CityFort Collins`           7.289e+01  3.695e+02   0.197   0.8437    
## `CityFort Lauderdale`        1.073e+02  3.865e+02   0.278   0.7813    
## `CityFort Worth`             1.675e+02  3.546e+02   0.472   0.6367    
## CityFrankfort                1.684e+01  4.257e+02   0.040   0.9684    
## CityFranklin                 2.360e+02  3.484e+02   0.677   0.4982    
## CityFreeport                 2.003e+02  3.537e+02   0.566   0.5712    
## CityFremont                  2.811e+02  4.269e+02   0.659   0.5102    
## CityFresno                   2.819e+02  3.494e+02   0.807   0.4199    
## CityFrisco                   4.043e+01  4.459e+02   0.091   0.9278    
## CityGaithersburg            -5.669e+01  4.608e+02  -0.123   0.9021    
## `CityGarden City`            4.005e+02  4.337e+02   0.924   0.3557    
## CityGarland                  1.967e+02  4.462e+02   0.441   0.6594    
## CityGastonia                 1.405e+02  4.560e+02   0.308   0.7581    
## CityGeorgetown               2.274e+02  3.721e+02   0.611   0.5412    
## CityGilbert                  1.772e+02  2.634e+02   0.673   0.5012    
## CityGladstone                1.846e+02  4.430e+02   0.417   0.6769    
## CityGlendale                 5.204e+01  2.294e+02   0.227   0.8206    
## CityGlenview                        NA         NA      NA       NA    
## CityGoldsboro                       NA         NA      NA       NA    
## `CityGrand Island`           4.312e+02  5.664e+02   0.761   0.4465    
## `CityGrand Prairie`          9.140e+01  3.738e+02   0.245   0.8068    
## `CityGrand Rapids`           1.845e+01  3.921e+02   0.047   0.9625    
## CityGrapevine                       NA         NA      NA       NA    
## `CityGreat Falls`            3.673e+02  4.226e+02   0.869   0.3848    
## CityGreeley                  6.870e+01  4.382e+02   0.157   0.8754    
## `CityGreen Bay`              4.055e+02  4.871e+02   0.833   0.4051    
## CityGreensboro               1.998e+02  3.813e+02   0.524   0.6003    
## CityGreenville               3.004e+02  3.852e+02   0.780   0.4356    
## CityGreenwood                3.083e+02  5.253e+02   0.587   0.5573    
## CityGresham                  2.557e+02  3.950e+02   0.647   0.5175    
## `CityGrove City`            -3.103e+02  5.250e+02  -0.591   0.5546    
## CityGulfport                 1.376e+02  5.506e+02   0.250   0.8026    
## CityHackensack              -1.098e+02  3.953e+02  -0.278   0.7812    
## CityHagerstown               3.766e+01  5.437e+02   0.069   0.9448    
## `CityHaltom City`            1.910e+02  3.973e+02   0.481   0.6307    
## CityHamilton                 2.688e+02  4.395e+02   0.612   0.5408    
## CityHampton                  7.933e+01  3.597e+02   0.221   0.8254    
## CityHarlingen               -5.750e+00  4.145e+02  -0.014   0.9889    
## CityHarrisonburg             3.852e+02  3.722e+02   1.035   0.3007    
## CityHattiesburg              8.544e+01  4.015e+02   0.213   0.8315    
## CityHelena                   1.529e+02  5.662e+02   0.270   0.7871    
## CityHempstead                1.624e+02  3.711e+02   0.438   0.6617    
## CityHenderson                2.904e+02  3.395e+02   0.855   0.3923    
## CityHendersonville           1.284e+02  4.146e+02   0.310   0.7568    
## CityHesperia                 3.209e+02  5.272e+02   0.609   0.5427    
## CityHialeah                 -9.876e+01  3.786e+02  -0.261   0.7942    
## CityHickory                  3.324e+01  5.387e+02   0.062   0.9508    
## `CityHighland Park`          1.153e+02  3.548e+02   0.325   0.7453    
## CityHillsboro                2.888e+02  4.443e+02   0.650   0.5157    
## CityHolland                  1.699e+02  4.614e+02   0.368   0.7127    
## CityHollywood                6.949e+01  3.890e+02   0.179   0.8582    
## CityHolyoke                 -1.083e+02  5.316e+02  -0.204   0.8386    
## CityHomestead                2.081e+02  5.463e+02   0.381   0.7033    
## CityHoover                   1.725e+02  4.449e+02   0.388   0.6983    
## `CityHot Springs`            2.631e+01  4.535e+02   0.058   0.9537    
## CityHouston                  1.928e+02  3.417e+02   0.564   0.5727    
## `CityHuntington Beach`       2.908e+02  3.919e+02   0.742   0.4582    
## CityHuntsville               1.829e+02  3.458e+02   0.529   0.5969    
## CityIndependence             6.093e+02  5.287e+02   1.153   0.2491    
## CityIndianapolis             2.461e+02  3.455e+02   0.712   0.4764    
## CityInglewood                1.751e+02  3.590e+02   0.488   0.6259    
## `CityIowa City`                     NA         NA      NA       NA    
## CityIrving                   3.019e+02  4.465e+02   0.676   0.4990    
## CityJackson                  1.829e+02  3.604e+02   0.508   0.6118    
## CityJacksonville             1.525e+02  3.599e+02   0.424   0.6717    
## CityJamestown                2.519e+03  5.296e+02   4.757 2.01e-06 ***
## `CityJefferson City`         2.789e+02  5.281e+02   0.528   0.5975    
## `CityJohnson City`           9.855e+01  3.710e+02   0.266   0.7905    
## CityJonesboro                5.996e+01  4.282e+02   0.140   0.8886    
## CityJupiter                         NA         NA      NA       NA    
## CityKeller                   2.421e+02  5.316e+02   0.455   0.6488    
## CityKenner                   5.719e+02  5.439e+02   1.052   0.2930    
## CityKenosha                  3.856e+02  4.269e+02   0.903   0.3664    
## CityKent                     2.122e+02  3.737e+02   0.568   0.5702    
## CityKirkwood                 3.159e+02  5.282e+02   0.598   0.5498    
## CityKissimmee                4.971e+02  5.465e+02   0.910   0.3631    
## CityKnoxville                1.230e+02  3.567e+02   0.345   0.7302    
## `CityLa Crosse`              2.710e+02  4.427e+02   0.612   0.5406    
## `CityLa Mesa`                2.555e+02  4.419e+02   0.578   0.5632    
## `CityLa Porte`               9.274e+01  3.697e+02   0.251   0.8019    
## `CityLa Quinta`              1.630e+02  5.273e+02   0.309   0.7572    
## CityLafayette                5.374e+02  3.519e+02   1.527   0.1267    
## `CityLaguna Niguel`          2.809e+01  4.417e+02   0.064   0.9493    
## `CityLake Charles`           2.451e+02  4.617e+02   0.531   0.5956    
## `CityLake Elsinore`         -3.750e+01  5.275e+02  -0.071   0.9433    
## `CityLake Forest`            2.953e+02  4.418e+02   0.668   0.5039    
## CityLakeland                 7.055e+01  3.865e+02   0.183   0.8552    
## CityLakeville                2.294e+02  3.803e+02   0.603   0.5463    
## CityLakewood                 1.528e+02  3.465e+02   0.441   0.6592    
## CityLancaster                1.496e+02  3.401e+02   0.440   0.6601    
## CityLansing                  4.368e+01  3.918e+02   0.112   0.9112    
## CityLaredo                   2.005e+02  3.740e+02   0.536   0.5919    
## `CityLas Cruces`             4.218e+02  4.875e+02   0.865   0.3870    
## `CityLas Vegas`              3.423e+01  4.312e+02   0.079   0.9367    
## CityLaurel                   1.531e+02  5.434e+02   0.282   0.7781    
## CityLawrence                 1.284e+02  3.442e+02   0.373   0.7092    
## CityLawton                   2.895e+02  4.878e+02   0.594   0.5528    
## CityLayton                          NA         NA      NA       NA    
## `CityLeague City`            1.566e+02  3.868e+02   0.405   0.6856    
## CityLebanon                  1.038e+02  5.318e+02   0.195   0.8453    
## CityLehi                     1.201e+02  5.657e+02   0.212   0.8319    
## CityLeominster               2.053e+02  3.874e+02   0.530   0.5961    
## CityLewiston                 4.362e+02  5.663e+02   0.770   0.4412    
## `CityLincoln Park`           9.643e+01  4.614e+02   0.209   0.8344    
## CityLinden                  -2.788e+02  5.424e+02  -0.514   0.6072    
## CityLindenhurst              3.918e+02  5.286e+02   0.741   0.4586    
## `CityLittle Rock`            5.620e+01  4.000e+02   0.140   0.8883    
## CityLittleton               -2.963e+02  5.242e+02  -0.565   0.5719    
## CityLodi                            NA         NA      NA       NA    
## CityLogan                    2.631e+02  4.336e+02   0.607   0.5439    
## `CityLong Beach`             2.470e+02  3.398e+02   0.727   0.4674    
## CityLongmont                 1.818e+02  4.052e+02   0.449   0.6537    
## CityLongview                 1.663e+02  4.663e+02   0.357   0.7214    
## CityLorain                   4.038e+02  3.651e+02   1.106   0.2688    
## `CityLos Angeles`            2.458e+02  3.357e+02   0.732   0.4642    
## CityLouisville               2.542e+02  3.332e+02   0.763   0.4457    
## CityLoveland                 7.417e+01  3.877e+02   0.191   0.8483    
## CityLowell                   1.994e+02  3.653e+02   0.546   0.5852    
## CityLubbock                  1.414e+02  3.969e+02   0.356   0.7217    
## CityMacon                    1.807e+02  3.819e+02   0.473   0.6360    
## CityMadison                  6.170e+02  4.191e+02   1.472   0.1410    
## CityMalden                   2.250e+02  4.472e+02   0.503   0.6149    
## CityManchester               2.169e+02  3.842e+02   0.564   0.5724    
## CityManhattan                4.464e+02  5.659e+02   0.789   0.4303    
## CityMansfield                8.410e+01  4.463e+02   0.188   0.8505    
## CityManteca                  1.505e+02  4.418e+02   0.341   0.7334    
## `CityMaple Grove`            3.407e+02  4.256e+02   0.800   0.4235    
## CityMargate                         NA         NA      NA       NA    
## CityMarietta                 6.477e+01  3.816e+02   0.170   0.8652    
## CityMarion                   2.670e+02  3.541e+02   0.754   0.4510    
## CityMarlborough              1.674e+02  5.312e+02   0.315   0.7526    
## CityMarysville              -1.088e+02  5.482e+02  -0.199   0.8427    
## CityMason                    2.158e+02  4.063e+02   0.531   0.5952    
## CityMcallen                  1.301e+02  3.557e+02   0.366   0.7145    
## CityMedford                  2.578e+02  4.118e+02   0.626   0.5313    
## CityMedina                   1.022e+02  3.653e+02   0.280   0.7796    
## CityMelbourne                3.176e+01  5.464e+02   0.058   0.9537    
## CityMemphis                  2.689e+02  3.526e+02   0.763   0.4458    
## CityMentor                   2.849e+02  3.888e+02   0.733   0.4638    
## CityMeriden                  2.556e+02  3.753e+02   0.681   0.4958    
## CityMeridian                 3.991e+02  7.357e+02   0.542   0.5875    
## CityMesa                    -2.232e+01  2.227e+02  -0.100   0.9202    
## CityMesquite                 8.958e+01  4.133e+02   0.217   0.8284    
## CityMiami                    1.387e+02  3.697e+02   0.375   0.7076    
## CityMiddletown               3.380e+02  3.895e+02   0.868   0.3855    
## CityMidland                  1.709e+02  3.855e+02   0.443   0.6576    
## CityMilford                  1.283e+02  4.067e+02   0.316   0.7523    
## CityMilwaukee                3.212e+02  3.997e+02   0.804   0.4216    
## CityMinneapolis              5.196e+02  3.676e+02   1.413   0.1576    
## CityMiramar                 -1.745e+01  4.075e+02  -0.043   0.9658    
## CityMishawaka                2.112e+02  5.254e+02   0.402   0.6877    
## `CityMission Viejo`          1.127e+02  4.096e+02   0.275   0.7831    
## CityMissoula                 6.365e+02  5.660e+02   1.125   0.2607    
## `CityMissouri City`         -9.021e+01  5.302e+02  -0.170   0.8649    
## CityMobile                   4.717e+01  3.687e+02   0.128   0.8982    
## CityModesto                  4.267e+01  4.419e+02   0.097   0.9231    
## CityMonroe                   4.493e+02  3.690e+02   1.218   0.2234    
## CityMontebello               2.514e+02  5.271e+02   0.477   0.6334    
## CityMontgomery               1.402e+02  3.686e+02   0.380   0.7037    
## CityMoorhead                 1.742e+02  4.566e+02   0.382   0.7028    
## `CityMoreno Valley`          2.711e+02  3.689e+02   0.735   0.4625    
## `CityMorgan Hill`           -1.421e+00  4.420e+02  -0.003   0.9974    
## CityMorristown              -1.739e+01  3.906e+02  -0.045   0.9645    
## `CityMount Pleasant`         1.451e+02  5.434e+02   0.267   0.7895    
## `CityMount Vernon`           2.739e+02  3.835e+02   0.714   0.4750    
## CityMurfreesboro             1.245e+02  3.705e+02   0.336   0.7369    
## CityMurray                   2.594e+02  4.582e+02   0.566   0.5714    
## CityMurrieta                        NA         NA      NA       NA    
## CityMuskogee                 2.891e+02  4.877e+02   0.593   0.5534    
## CityNaperville               8.991e+01  3.547e+02   0.253   0.7999    
## CityNashua                   2.372e+02  5.451e+02   0.435   0.6635    
## CityNashville                2.603e+02  3.521e+02   0.739   0.4597    
## `CityNew Albany`             1.561e+02  4.394e+02   0.355   0.7224    
## `CityNew Bedford`            1.440e+02  4.467e+02   0.322   0.7471    
## `CityNew Brunswick`          5.546e+01  5.427e+02   0.102   0.9186    
## `CityNew Castle`            -3.804e+01  4.394e+02  -0.087   0.9310    
## `CityNew Rochelle`           1.282e+02  3.671e+02   0.349   0.7269    
## `CityNew York City`          2.587e+02  3.379e+02   0.766   0.4440    
## CityNewark                   1.975e+02  3.390e+02   0.583   0.5602    
## `CityNewport News`           1.923e+02  3.629e+02   0.530   0.5961    
## `CityNiagara Falls`          1.343e+02  4.438e+02   0.303   0.7623    
## CityNoblesville              2.069e+02  5.256e+02   0.394   0.6938    
## CityNorfolk                         NA         NA      NA       NA    
## CityNormal                   1.971e+02  5.140e+02   0.383   0.7014    
## CityNorman                          NA         NA      NA       NA    
## `CityNorth Charleston`       3.371e+02  5.371e+02   0.628   0.5302    
## `CityNorth Las Vegas`        4.122e+02  4.284e+02   0.962   0.3361    
## `CityNorth Miami`           -8.090e+00  5.468e+02  -0.015   0.9882    
## CityNorwich                  3.088e+02  4.236e+02   0.729   0.4660    
## `CityOak Park`               1.928e+02  4.092e+02   0.471   0.6375    
## CityOakland                  2.254e+02  3.474e+02   0.649   0.5165    
## CityOceanside                1.672e+02  3.455e+02   0.484   0.6285    
## CityOdessa                   1.211e+02  4.143e+02   0.292   0.7700    
## `CityOklahoma City`          4.500e+02  4.066e+02   1.107   0.2684    
## CityOlathe                   4.639e+02  4.432e+02   1.047   0.2954    
## CityOlympia                  1.936e+02  4.196e+02   0.461   0.6446    
## CityOmaha                    3.840e+02  4.038e+02   0.951   0.3417    
## CityOntario                         NA         NA      NA       NA    
## CityOrange                  -7.829e+01  4.126e+02  -0.190   0.8495    
## CityOrem                     4.805e+02  4.225e+02   1.137   0.2555    
## `CityOrland Park`            2.164e+02  5.138e+02   0.421   0.6736    
## CityOrlando                 -7.708e+01  4.004e+02  -0.193   0.8474    
## `CityOrmond Beach`                  NA         NA      NA       NA    
## CityOswego                   5.322e+02  4.255e+02   1.251   0.2111    
## `CityOverland Park`          3.820e+02  4.428e+02   0.863   0.3884    
## CityOwensboro                2.615e+02  4.398e+02   0.594   0.5522    
## CityOxnard                   3.683e+02  3.687e+02   0.999   0.3178    
## CityPalatine                -2.387e+01  5.138e+02  -0.046   0.9630    
## `CityPalm Coast`             3.114e+01  4.335e+02   0.072   0.9427    
## `CityPark Ridge`             4.234e+02  4.255e+02   0.995   0.3197    
## CityParker                   5.155e+01  3.638e+02   0.142   0.8873    
## CityParma                    3.397e+02  3.712e+02   0.915   0.3601    
## CityPasadena                 2.134e+02  3.428e+02   0.622   0.5337    
## CityPasco                    3.768e+02  4.098e+02   0.920   0.3578    
## CityPassaic                 -1.800e+02  4.025e+02  -0.447   0.6548    
## CityPaterson                -1.700e+02  3.765e+02  -0.452   0.6516    
## CityPearland                 1.345e+02  4.138e+02   0.325   0.7453    
## `CityPembroke Pines`        -4.426e+00  3.891e+02  -0.011   0.9909    
## CityPensacola               -2.965e+01  5.464e+02  -0.054   0.9567    
## CityPeoria                   2.554e+01  2.380e+02   0.107   0.9145    
## `CityPerth Amboy`           -3.741e-01  4.600e+02  -0.001   0.9994    
## CityPharr                    1.341e+02  5.308e+02   0.253   0.8005    
## CityPhiladelphia             2.551e+02  3.699e+02   0.690   0.4904    
## CityPhoenix                  1.265e+02  2.130e+02   0.594   0.5524    
## `CityPico Rivera`            2.175e+02  5.274e+02   0.412   0.6801    
## `CityPine Bluff`            -2.863e+01  5.619e+02  -0.051   0.9594    
## CityPlainfield               1.210e+02  3.839e+02   0.315   0.7527    
## CityPlano                    1.919e+02  3.646e+02   0.526   0.5988    
## CityPlantation               2.787e+01  3.955e+02   0.070   0.9438    
## `CityPleasant Grove`         3.160e+02  4.337e+02   0.728   0.4664    
## CityPocatello                3.616e+02  7.202e+02   0.502   0.6156    
## CityPomona                   2.546e+02  3.921e+02   0.649   0.5161    
## `CityPompano Beach`         -5.998e+00  4.335e+02  -0.014   0.9890    
## `CityPort Arthur`           -1.042e+01  3.972e+02  -0.026   0.9791    
## `CityPort Orange`            7.375e+01  5.469e+02   0.135   0.8927    
## `CityPort Saint Lucie`      -3.538e+02  5.465e+02  -0.647   0.5174    
## CityPortage                  2.775e+02  5.254e+02   0.528   0.5974    
## CityPortland                 1.941e+02  3.509e+02   0.553   0.5803    
## CityProvidence               3.859e+02  4.026e+02   0.959   0.3378    
## CityProvo                    4.527e+02  4.224e+02   1.072   0.2839    
## CityPueblo                   1.898e+02  3.638e+02   0.522   0.6019    
## CityQuincy                   4.928e+01  3.369e+02   0.146   0.8837    
## CityRaleigh                  1.608e+02  3.676e+02   0.438   0.6618    
## `CityRancho Cucamonga`       2.377e+02  3.923e+02   0.606   0.5447    
## `CityRapid City`                    NA         NA      NA       NA    
## CityReading                  1.835e+02  4.218e+02   0.435   0.6635    
## CityRedding                  2.673e+02  5.274e+02   0.507   0.6123    
## CityRedlands                 2.935e+02  3.571e+02   0.822   0.4112    
## CityRedmond                  1.493e+02  3.631e+02   0.411   0.6809    
## `CityRedondo Beach`          2.153e+02  3.742e+02   0.575   0.5650    
## `CityRedwood City`           1.136e+02  5.271e+02   0.215   0.8294    
## CityReno                            NA         NA      NA       NA    
## CityRenton                          NA         NA      NA       NA    
## CityRevere                   9.943e+01  4.147e+02   0.240   0.8105    
## CityRichardson               6.911e+02  5.311e+02   1.301   0.1932    
## CityRichmond                 1.786e+02  3.321e+02   0.538   0.5907    
## `CityRio Rancho`             3.601e+02  4.876e+02   0.738   0.4602    
## CityRiverside                2.868e+02  3.689e+02   0.777   0.4369    
## CityRochester                2.519e+02  3.439e+02   0.732   0.4639    
## `CityRochester Hills`        1.122e+01  5.437e+02   0.021   0.9835    
## `CityRock Hill`              2.363e+02  5.370e+02   0.440   0.6599    
## CityRockford                 3.506e+02  3.414e+02   1.027   0.3045    
## CityRockville                6.564e+01  4.302e+02   0.153   0.8787    
## CityRogers                  -1.888e+02  5.621e+02  -0.336   0.7370    
## CityRome                     1.672e+02  3.838e+02   0.436   0.6631    
## CityRomeoville               6.372e+01  5.136e+02   0.124   0.9013    
## CityRoseville                1.394e+02  3.489e+02   0.400   0.6895    
## CityRoswell                  3.126e+02  3.542e+02   0.883   0.3775    
## `CityRound Rock`            -2.804e+00  4.143e+02  -0.007   0.9946    
## `CityRoyal Oak`              1.625e+02  4.614e+02   0.352   0.7247    
## CitySacramento               1.955e+02  3.688e+02   0.530   0.5960    
## CitySaginaw                 -1.999e+01  3.973e+02  -0.050   0.9599    
## `CitySaint Charles`          9.950e+01  3.529e+02   0.282   0.7780    
## `CitySaint Cloud`            3.717e+02  5.395e+02   0.689   0.4908    
## `CitySaint Louis`            2.932e+01  4.104e+02   0.071   0.9430    
## `CitySaint Paul`             1.064e+02  5.399e+02   0.197   0.8438    
## `CitySaint Peters`                  NA         NA      NA       NA    
## `CitySaint Petersburg`       1.035e+02  3.845e+02   0.269   0.7878    
## CitySalem                    1.693e+02  3.423e+02   0.495   0.6208    
## CitySalinas                  1.908e+02  3.923e+02   0.486   0.6268    
## `CitySalt Lake City`         3.472e+02  4.273e+02   0.812   0.4166    
## `CitySan Angelo`            -3.135e+01  4.141e+02  -0.076   0.9397    
## `CitySan Antonio`            5.945e+02  3.461e+02   1.718   0.0859 .  
## `CitySan Bernardino`         2.414e+02  3.691e+02   0.654   0.5131    
## `CitySan Clemente`           2.080e+02  5.275e+02   0.394   0.6934    
## `CitySan Diego`              2.776e+02  3.376e+02   0.822   0.4109    
## `CitySan Francisco`          2.628e+02  3.359e+02   0.782   0.4340    
## `CitySan Gabriel`            6.418e+02  4.421e+02   1.452   0.1467    
## `CitySan Jose`               1.641e+02  3.430e+02   0.478   0.6325    
## `CitySan Luis Obispo`        2.377e+02  5.275e+02   0.451   0.6524    
## `CitySan Marcos`             1.213e+02  4.463e+02   0.272   0.7858    
## `CitySan Mateo`              1.796e+02  5.279e+02   0.340   0.7336    
## `CitySandy Springs`          1.642e+02  3.621e+02   0.453   0.6502    
## CitySanford                 -1.843e+01  5.461e+02  -0.034   0.9731    
## `CitySanta Ana`             -8.262e+01  3.812e+02  -0.217   0.8284    
## `CitySanta Barbara`          2.675e+02  3.923e+02   0.682   0.4954    
## `CitySanta Clara`            2.271e+02  3.922e+02   0.579   0.5626    
## `CitySanta Fe`               2.132e+02  4.874e+02   0.437   0.6618    
## `CitySanta Maria`            1.523e+02  5.277e+02   0.289   0.7728    
## CityScottsdale               1.474e+01  2.499e+02   0.059   0.9530    
## CitySeattle                  2.714e+02  3.675e+02   0.739   0.4601    
## CitySheboygan                3.693e+02  4.435e+02   0.833   0.4051    
## CityShelton                  2.004e+02  5.385e+02   0.372   0.7097    
## `CitySierra Vista`           5.484e+01  3.531e+02   0.155   0.8766    
## `CitySioux Falls`            3.449e+02  4.187e+02   0.824   0.4102    
## CitySkokie                   1.740e+02  3.734e+02   0.466   0.6413    
## CitySmyrna                   1.425e+02  3.464e+02   0.411   0.6809    
## `CitySouth Bend`             6.766e+01  3.898e+02   0.174   0.8622    
## CitySouthaven                1.068e+02  4.226e+02   0.253   0.8005    
## CitySparks                   3.103e+02  4.756e+02   0.652   0.5141    
## CitySpokane                  1.017e+02  4.193e+02   0.243   0.8083    
## CitySpringdale               1.470e+02  5.618e+02   0.262   0.7936    
## CitySpringfield              2.477e+02  3.323e+02   0.745   0.4561    
## `CitySterling Heights`       9.008e+01  4.615e+02   0.195   0.8453    
## CityStockton                 1.704e+02  3.921e+02   0.434   0.6640    
## CitySuffolk                  2.587e+02  3.795e+02   0.682   0.4954    
## CitySummerville              1.081e+02  4.535e+02   0.238   0.8115    
## CitySunnyvale                1.696e+02  3.815e+02   0.445   0.6566    
## CitySuperior                 3.637e+02  4.273e+02   0.851   0.3948    
## CityTallahassee              2.841e+02  3.815e+02   0.745   0.4565    
## CityTamarac                  4.132e+02  4.174e+02   0.990   0.3223    
## CityTampa                    1.007e+02  3.741e+02   0.269   0.7878    
## CityTaylor                   3.734e+01  4.615e+02   0.081   0.9355    
## CityTemecula                 3.222e+02  4.098e+02   0.786   0.4318    
## CityTempe                    8.192e+00  2.558e+02   0.032   0.9745    
## CityTexarkana               -8.434e+01  4.824e+02  -0.175   0.8612    
## `CityTexas City`            -5.758e+01  4.145e+02  -0.139   0.8895    
## `CityThe Colony`                    NA         NA      NA       NA    
## CityThomasville              1.177e+02  5.387e+02   0.218   0.8271    
## CityThornton                 3.976e+01  3.696e+02   0.108   0.9143    
## `CityThousand Oaks`          1.916e+02  3.814e+02   0.502   0.6155    
## CityTigard                   3.757e+02  3.716e+02   1.011   0.3121    
## `CityTinley Park`           -8.752e+01  5.134e+02  -0.170   0.8647    
## CityToledo                   2.406e+02  3.421e+02   0.703   0.4818    
## CityTorrance                 3.869e+02  4.093e+02   0.945   0.3445    
## CityTrenton                 -8.741e+01  3.808e+02  -0.230   0.8184    
## CityTroy                     2.770e+02  3.436e+02   0.806   0.4202    
## CityTucson                   2.191e+01  2.192e+02   0.100   0.9204    
## CityTulsa                    3.369e+02  4.038e+02   0.834   0.4042    
## CityTuscaloosa                      NA         NA      NA       NA    
## `CityTwin Falls`             4.532e+02  7.539e+02   0.601   0.5478    
## CityTyler                    3.595e+02  4.138e+02   0.869   0.3850    
## CityUrbandale                3.576e+02  4.334e+02   0.825   0.4094    
## CityUtica                    1.866e+02  3.669e+02   0.509   0.6111    
## CityVacaville                       NA         NA      NA       NA    
## CityVallejo                  2.197e+02  4.097e+02   0.536   0.5919    
## CityVancouver                2.702e+02  4.356e+02   0.620   0.5351    
## CityVineland                -1.367e+02  4.025e+02  -0.340   0.7342    
## `CityVirginia Beach`         2.261e+02  3.549e+02   0.637   0.5241    
## CityVisalia                  1.183e+02  4.095e+02   0.289   0.7727    
## CityWaco                     2.372e+02  3.791e+02   0.626   0.5316    
## `CityWarner Robins`          1.017e+02  4.423e+02   0.230   0.8183    
## CityWarwick                  3.544e+02  4.874e+02   0.727   0.4673    
## CityWashington               2.324e+02  4.227e+02   0.550   0.5825    
## CityWaterbury                2.363e+02  3.776e+02   0.626   0.5315    
## CityWaterloo                 3.891e+02  5.462e+02   0.712   0.4763    
## CityWatertown                3.182e+02  3.711e+02   0.858   0.3911    
## CityWaukesha                 1.484e+02  5.664e+02   0.262   0.7933    
## CityWausau                   2.865e+02  4.432e+02   0.646   0.5181    
## CityWaynesboro               3.256e+02  3.669e+02   0.887   0.3749    
## `CityWest Allis`             7.056e+01  5.664e+02   0.125   0.9009    
## `CityWest Jordan`            3.284e+02  4.431e+02   0.741   0.4586    
## `CityWest Palm Beach`        1.314e+01  4.337e+02   0.030   0.9758    
## CityWestfield               -5.691e+01  4.127e+02  -0.138   0.8903    
## CityWestland                 9.996e+01  3.856e+02   0.259   0.7954    
## CityWestminster              3.214e+02  3.572e+02   0.900   0.3682    
## CityWheeling                 1.680e+02  4.255e+02   0.395   0.6930    
## CityWhittier                        NA         NA      NA       NA    
## CityWichita                  3.299e+02  4.336e+02   0.761   0.4468    
## CityWilmington               1.336e+02  3.495e+02   0.382   0.7023    
## CityWilson                   1.310e+02  4.241e+02   0.309   0.7573    
## CityWoodbury                 2.843e+02  4.567e+02   0.623   0.5336    
## CityWoodland                 2.914e+02  4.419e+02   0.660   0.5096    
## CityWoodstock                1.530e+02  3.786e+02   0.404   0.6861    
## CityWoonsocket               4.065e+02  4.875e+02   0.834   0.4043    
## CityYonkers                  3.001e+02  3.592e+02   0.835   0.4035    
## CityYork                     1.767e+02  4.115e+02   0.429   0.6676    
## CityYucaipa                         NA         NA      NA       NA    
## CityYuma                            NA         NA      NA       NA    
## StateAlabama                 2.247e+02  2.553e+02   0.880   0.3789    
## StateArizona                 2.870e+02  3.365e+02   0.853   0.3937    
## StateArkansas                3.142e+02  3.063e+02   1.026   0.3051    
## StateCalifornia              1.224e+02  2.395e+02   0.511   0.6092    
## StateColorado                2.603e+02  2.490e+02   1.046   0.2958    
## StateConnecticut             1.447e+02  2.559e+02   0.566   0.5717    
## StateDelaware                1.884e+02  2.475e+02   0.761   0.4466    
## `StateDistrict of Columbia`         NA         NA      NA       NA    
## StateFlorida                 2.634e+02  2.770e+02   0.951   0.3418    
## StateGeorgia                 1.640e+02  2.359e+02   0.695   0.4869    
## StateIdaho                  -6.915e+00  5.752e+02  -0.012   0.9904    
## StateIllinois                1.936e+02  2.386e+02   0.811   0.4172    
## StateIndiana                 1.986e+02  2.314e+02   0.858   0.3908    
## StateIowa                   -1.342e+01  2.746e+02  -0.049   0.9610    
## StateKansas                         NA         NA      NA       NA    
## StateKentucky                1.283e+02  2.406e+02   0.533   0.5938    
## StateLouisiana              -1.633e+02  2.716e+02  -0.601   0.5477    
## StateMaine                          NA         NA      NA       NA    
## StateMaryland                1.979e+02  2.626e+02   0.754   0.4512    
## StateMassachusetts           2.035e+02  2.095e+02   0.971   0.3314    
## StateMichigan                2.912e+02  2.648e+02   1.100   0.2715    
## StateMinnesota               8.582e+01  2.663e+02   0.322   0.7472    
## StateMississippi             1.917e+02  2.768e+02   0.693   0.4886    
## StateMissouri                2.125e+02  2.396e+02   0.887   0.3751    
## StateMontana                        NA         NA      NA       NA    
## StateNebraska                       NA         NA      NA       NA    
## StateNevada                  2.730e+02  3.434e+02   0.795   0.4266    
## `StateNew Hampshire`         1.089e+02  2.749e+02   0.396   0.6919    
## `StateNew Jersey`            4.498e+02  2.700e+02   1.666   0.0957 .  
## `StateNew Mexico`                   NA         NA      NA       NA    
## `StateNew York`              1.285e+02  2.450e+02   0.525   0.5999    
## `StateNorth Carolina`        1.604e+02  2.623e+02   0.612   0.5408    
## `StateNorth Dakota`                 NA         NA      NA       NA    
## StateOhio                    1.201e+02  2.319e+02   0.518   0.6045    
## StateOklahoma                       NA         NA      NA       NA    
## StateOregon                  7.871e+01  2.426e+02   0.324   0.7456    
## StatePennsylvania            1.671e+02  2.838e+02   0.589   0.5560    
## `StateRhode Island`                 NA         NA      NA       NA    
## `StateSouth Carolina`        1.021e+02  2.510e+02   0.407   0.6840    
## `StateSouth Dakota`                 NA         NA      NA       NA    
## StateTennessee               2.128e+02  2.258e+02   0.942   0.3460    
## StateTexas                   1.493e+02  2.495e+02   0.598   0.5497    
## StateUtah                           NA         NA      NA       NA    
## StateVermont                -3.329e+02  3.207e+02  -1.038   0.2993    
## StateVirginia                1.875e+02  2.357e+02   0.795   0.4266    
## StateWashington              1.139e+02  2.759e+02   0.413   0.6797    
## `StateWest Virginia`         2.703e+02  4.410e+02   0.613   0.5400    
## StateWisconsin                      NA         NA      NA       NA    
## StateWyoming                        NA         NA      NA       NA    
## RegionCentral                       NA         NA      NA       NA    
## RegionEast                          NA         NA      NA       NA    
## RegionSouth                         NA         NA      NA       NA    
## RegionWest                          NA         NA      NA       NA    
## Category.Furniture          -5.294e+02  4.343e+01 -12.189  < 2e-16 ***
## `Category.Office Supplies`  -1.012e+03  4.355e+01 -23.232  < 2e-16 ***
## Category.Technology                 NA         NA      NA       NA    
## Sub.CategoryAccessories     -1.106e+03  5.017e+01 -22.046  < 2e-16 ***
## Sub.CategoryAppliances      -6.761e+01  2.973e+01  -2.274   0.0230 *  
## Sub.CategoryArt             -2.044e+02  2.489e+01  -8.212 2.60e-16 ***
## Sub.CategoryBinders         -1.773e+02  2.593e+01  -6.837 8.80e-12 ***
## Sub.CategoryChairs          -2.395e+02  4.009e+01  -5.975 2.43e-09 ***
## Sub.CategoryFurnishings     -6.457e+02  3.835e+01 -16.837  < 2e-16 ***
## Sub.CategoryLabels          -2.089e+02  3.163e+01  -6.604 4.33e-11 ***
## Sub.CategoryOther           -1.116e+02  2.650e+01  -4.209 2.60e-05 ***
## Sub.CategoryPaper           -2.031e+02  2.204e+01  -9.216  < 2e-16 ***
## Sub.CategoryPhones          -9.462e+02  4.908e+01 -19.278  < 2e-16 ***
## Sub.CategoryStorage                 NA         NA      NA       NA    
## Quantity                     4.442e+01  2.264e+00  19.615  < 2e-16 ***
## Discount                     2.182e+02  4.991e+01   4.373 1.24e-05 ***
## Profit                       1.514e+00  2.303e-02  65.737  < 2e-16 ***
## Days_Between                -9.531e-01  5.362e+00  -0.178   0.8589    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 406.5 on 6438 degrees of freedom
## Multiple R-squared:  0.5793, Adjusted R-squared:  0.543 
## F-statistic: 15.94 on 556 and 6438 DF,  p-value: < 2.2e-16
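The `NA` rows in the coefficient table above are what `lm()` reports when a predictor column is an exact linear combination of columns that came earlier in the design matrix. That is plausibly the case here, since each city sits inside one state and each state inside one region, but the source does not confirm the cause. The following is a minimal sketch on a made-up data frame; `toy`, `city`, `state`, and `y` are hypothetical names, not part of the Superstore data.

```r
# Toy data: every city belongs to exactly one state, so the state dummy
# is fully determined by the city dummies (perfect collinearity).
set.seed(1)
toy <- data.frame(
  city  = factor(rep(c("A", "B", "C", "D"), each = 5)),
  state = factor(rep(c("S1", "S1", "S2", "S2"), each = 5))
)
toy$y <- rnorm(nrow(toy))

fit <- lm(y ~ city + state, data = toy)

# The redundant column gets an NA coefficient instead of an estimate.
print(coef(fit))

# Names of the coefficients lm() could not estimate.
aliased <- names(coef(fit))[is.na(coef(fit))]
print(aliased)

# alias() shows which estimable columns each aliased term depends on.
print(alias(fit)$Complete)

# The rank of the fit is smaller than the number of coefficients.
cat("rank:", fit$rank, "of", length(coef(fit)), "coefficients\n")
```

Applied to the model above, `names(coef(lm_model))[is.na(coef(lm_model))]` would list the aliased terms. Dropping one of the nested variables (for example, keeping `City` but not `State` and `Region`) is one common way to remove the redundancy.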
# Making predictions on the test set
predictions <- predict(lm_model, newdata = test_data)
## Warning in predict.lm(lm_model, newdata = test_data): prediction from
## rank-deficient fit; attr(*, "non-estim") has doubtful cases
# Calculating a performance metric on the test set (RMSE)
rmse <- sqrt(mean((predictions - test_data$Sales)^2))

# Printing the RMSE
print(paste("Root Mean Squared Error (RMSE):", rmse))
## [1] "Root Mean Squared Error (RMSE): 652.608684865285"
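An RMSE is easier to interpret next to a reference value. One common reference is the RMSE of a naive model that always predicts the training-set mean. The sketch below uses made-up numbers and a hypothetical helper named `rmse_of`; none of the values come from the Superstore data.

```r
# Hypothetical helper: root mean squared error of predictions against actuals.
rmse_of <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# Made-up numbers for illustration only (not taken from the Superstore data).
train_sales <- c(120, 80, 300, 45, 610, 95)
test_sales  <- c(100, 250, 40, 900, 15)
model_preds <- c(120, 200, 60, 700, 30)

# RMSE of the model's predictions on the held-out values.
model_rmse <- rmse_of(test_sales, model_preds)

# RMSE of a baseline that predicts the training mean for every test row.
baseline_rmse <- rmse_of(test_sales, mean(train_sales))

cat("Model RMSE:   ", round(model_rmse, 2), "\n")
cat("Baseline RMSE:", round(baseline_rmse, 2), "\n")
```

With the objects from this analysis, the same comparison would be `rmse_of(test_data$Sales, predictions)` against `rmse_of(test_data$Sales, mean(train_data$Sales))`. A model whose RMSE is not clearly below the baseline adds little over predicting the average.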

Checking the predictions of the model

head(predictions, 5)
##        2        4        5        8       11 
## 816.0427 144.5988 169.4996 545.6848 976.0285
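Predicted values are easier to judge when shown next to the actual values for the same rows. The sketch below illustrates the idea on R's built-in `mtcars` data as a stand-in, with a simple base-R 85/15 split. The names `train_toy`, `test_toy`, `fit_toy`, and `comparison` are hypothetical, and the model formula is chosen only for illustration.

```r
# Stand-in example on mtcars; the Superstore objects are not used here.
set.seed(123)
n <- nrow(mtcars)

# Base-R split: 85% of the rows for training, the rest for testing.
train_idx <- sample(seq_len(n), size = floor(0.85 * n))
train_toy <- mtcars[train_idx, ]
test_toy  <- mtcars[-train_idx, ]

# Fit on the training rows only.
fit_toy <- lm(mpg ~ wt + hp, data = train_toy)

# Put actual and predicted values side by side, with the residual.
comparison <- data.frame(
  actual    = test_toy$mpg,
  predicted = predict(fit_toy, newdata = test_toy),
  row.names = rownames(test_toy)
)
comparison$residual <- comparison$actual - comparison$predicted

print(head(comparison, 5))
```

For the model in this analysis, the equivalent table would be `data.frame(actual = test_data$Sales, predicted = predictions)`.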
# Set a random seed for reproducibility
set.seed(123)


# Splitting the data into training and test sets (85% for training and 15% for testing)
split <- sample.split(final_data$Sales, SplitRatio = 0.85)
train_data <- final_data[split, ]
test_data <- final_data[!split, ]

# Creating a linear regression model
lm_model2 <- lm(Sales ~ ., data = train_data)

# Printing the summary of the linear regression model
summary(lm_model2)
## 
## Call:
## lm(formula = Sales ~ ., data = train_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1600.2  -140.7   -24.3    70.1 22745.9 
## 
## Coefficients: (33 not defined because of singularities)
##                               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                  1.160e+03  4.430e+02   2.619  0.00883 ** 
## `Ship.ModeFirst Class`       7.661e+00  2.362e+01   0.324  0.74565    
## `Ship.ModeSame Day`         -1.001e+01  4.000e+01  -0.250  0.80232    
## `Ship.ModeSecond Class`      1.376e+01  1.862e+01   0.739  0.45989    
## `Ship.ModeStandard Class`           NA         NA      NA       NA    
## Segment.Consumer            -8.083e+00  1.609e+01  -0.502  0.61551    
## Segment.Corporate           -2.062e+01  1.759e+01  -1.172  0.24111    
## `Segment.Home Office`               NA         NA      NA       NA    
## CityAberdeen                        NA         NA      NA       NA    
## CityAbilene                  2.210e+02  6.380e+02   0.346  0.72903    
## CityAkron                    3.419e+02  4.027e+02   0.849  0.39591    
## CityAlbuquerque              1.536e+02  4.626e+02   0.332  0.73982    
## CityAlexandria               4.555e+02  4.088e+02   1.114  0.26521    
## CityAllen                    2.749e+02  4.893e+02   0.562  0.57424    
## CityAllentown                8.180e+02  4.733e+02   1.728  0.08397 .  
## CityAltoona                  6.308e+02  5.598e+02   1.127  0.25988    
## CityAmarillo                 6.654e+02  4.322e+02   1.540  0.12370    
## CityAnaheim                  5.603e+02  4.044e+02   1.385  0.16597    
## CityAndover                  1.087e+02  4.679e+02   0.232  0.81633    
## `CityAnn Arbor`              1.505e+02  4.679e+02   0.322  0.74776    
## CityAntioch                  5.373e+02  6.351e+02   0.846  0.39754    
## CityApopka                  -2.704e+02  4.682e+02  -0.578  0.56355    
## `CityApple Valley`           5.205e+02  4.498e+02   1.157  0.24722    
## CityAppleton                 5.662e+02  5.646e+02   1.003  0.31593    
## CityArlington                3.800e+02  3.935e+02   0.966  0.33425    
## `CityArlington Heights`      1.166e+02  6.210e+02   0.188  0.85105    
## CityArvada                   2.368e+02  4.582e+02   0.517  0.60528    
## CityAsheville                3.099e+02  4.577e+02   0.677  0.49837    
## CityAthens                   2.801e+02  4.406e+02   0.636  0.52495    
## CityAtlanta                  3.694e+02  3.997e+02   0.924  0.35543    
## `CityAtlantic City`          2.893e+00  6.514e+02   0.004  0.99646    
## CityAuburn                   3.871e+02  4.051e+02   0.956  0.33928    
## CityAurora                   2.081e+02  3.757e+02   0.554  0.57976    
## CityAustin                   3.672e+02  4.033e+02   0.910  0.36263    
## CityAvondale                 1.808e+02  3.549e+02   0.510  0.61033    
## CityBakersfield              4.197e+02  4.112e+02   1.021  0.30743    
## CityBaltimore                1.962e+02  4.235e+02   0.463  0.64325    
## CityBangor                   1.949e+02  5.054e+02   0.386  0.69983    
## CityBartlett                 2.622e+01  6.386e+02   0.041  0.96725    
## CityBayonne                 -2.440e+01  6.516e+02  -0.037  0.97013    
## CityBaytown                  4.743e+02  6.375e+02   0.744  0.45690    
## CityBeaumont                 2.540e+02  4.670e+02   0.544  0.58657    
## CityBedford                  2.745e+02  4.536e+02   0.605  0.54519    
## CityBelleville               8.894e+01  4.486e+02   0.198  0.84284    
## CityBellevue                 3.628e+02  4.716e+02   0.769  0.44175    
## CityBellingham               9.044e+02  5.141e+02   1.759  0.07859 .  
## CityBethlehem                1.009e+03  4.880e+02   2.067  0.03878 *  
## CityBeverly                  6.173e+02  4.892e+02   1.262  0.20700    
## CityBillings                        NA         NA      NA       NA    
## CityBloomington              1.820e+02  3.975e+02   0.458  0.64698    
## `CityBoca Raton`            -2.501e+02  5.507e+02  -0.454  0.64967    
## CityBoise                    2.015e+02  8.820e+02   0.228  0.81929    
## CityBolingbrook              1.138e+02  4.296e+02   0.265  0.79105    
## `CityBossier City`           6.915e+02  4.656e+02   1.485  0.13752    
## `CityBowling Green`          2.355e+02  4.149e+02   0.567  0.57040    
## `CityBoynton Beach`         -2.059e+02  4.531e+02  -0.454  0.64957    
## CityBozeman                  2.778e+02  6.663e+02   0.417  0.67675    
## CityBrentwood                5.963e+02  4.161e+02   1.433  0.15186    
## CityBridgeton               -1.149e+02  5.467e+02  -0.210  0.83352    
## CityBristol                  3.043e+02  4.209e+02   0.723  0.46974    
## `CityBroken Arrow`           3.472e+02  5.059e+02   0.686  0.49253    
## CityBroomfield               1.198e+02  4.583e+02   0.261  0.79377    
## CityBrownsville              2.045e+02  4.539e+02   0.450  0.65237    
## CityBryan                    2.929e+02  4.438e+02   0.660  0.50927    
## CityBuffalo                  6.351e+02  4.274e+02   1.486  0.13727    
## `CityBuffalo Grove`          2.398e+02  5.099e+02   0.470  0.63810    
## `CityBullhead City`          2.277e+02  4.346e+02   0.524  0.60038    
## CityBurbank                  1.073e+03  5.273e+02   2.035  0.04193 *  
## CityBurlington               7.068e+02  4.285e+02   1.649  0.09912 .  
## CityCaldwell                 1.741e+02  9.053e+02   0.192  0.84754    
## CityCamarillo                4.480e+02  4.499e+02   0.996  0.31930    
## CityCambridge                1.749e+02  4.445e+02   0.394  0.69395    
## CityCanton                   2.920e+02  5.025e+02   0.581  0.56113    
## CityCarlsbad                 2.387e+02  4.783e+02   0.499  0.61774    
## `CityCarol Stream`           2.192e+02  4.302e+02   0.510  0.61029    
## CityCarrollton               5.245e+02  4.180e+02   1.255  0.20963    
## CityCary                     1.894e+02  4.668e+02   0.406  0.68499    
## `CityCedar Hill`             3.297e+02  6.379e+02   0.517  0.60526    
## `CityCedar Rapids`           3.796e+02  6.560e+02   0.579  0.56284    
## CityChampaign                4.138e+02  6.205e+02   0.667  0.50485    
## CityChandler                 1.614e+02  3.144e+02   0.513  0.60786    
## `CityChapel Hill`            2.745e+02  6.470e+02   0.424  0.67132    
## CityCharlotte                2.483e+02  4.169e+02   0.596  0.55149    
## CityCharlottesville          2.595e+02  6.337e+02   0.409  0.68221    
## CityChattanooga             -2.588e+01  4.384e+02  -0.059  0.95292    
## CityChesapeake               2.193e+02  4.088e+02   0.536  0.59172    
## CityChester                  6.865e+02  4.527e+02   1.516  0.12947    
## CityCheyenne                 1.212e+03  6.664e+02   1.819  0.06888 .  
## CityChicago                  2.108e+02  3.678e+02   0.573  0.56647    
## CityChico                    4.264e+02  4.284e+02   0.995  0.31967    
## `CityChula Vista`            6.595e+02  4.857e+02   1.358  0.17453    
## CityCincinnati               2.452e+02  4.059e+02   0.604  0.54580    
## `CityCitrus Heights`         5.279e+02  6.352e+02   0.831  0.40597    
## CityClarksville              6.570e+02  4.454e+02   1.475  0.14025    
## CityCleveland                3.610e+02  3.954e+02   0.913  0.36129    
## CityClifton                  4.331e+01  6.516e+02   0.066  0.94700    
## CityClinton                  1.768e+02  4.268e+02   0.414  0.67870    
## CityClovis                   2.530e+02  6.662e+02   0.380  0.70412    
## CityCoachella                4.934e+02  5.271e+02   0.936  0.34923    
## `CityCollege Station`        2.303e+02  6.381e+02   0.361  0.71814    
## `CityColorado Springs`       2.006e+02  3.988e+02   0.503  0.61503    
## CityColumbia                 3.900e+02  3.980e+02   0.980  0.32716    
## CityColumbus                 2.625e+02  3.863e+02   0.679  0.49692    
## `CityCommerce City`          2.027e+02  6.314e+02   0.321  0.74816    
## CityConcord                  6.782e+02  4.103e+02   1.653  0.09837 .  
## CityConroe                   9.008e+00  6.381e+02   0.014  0.98874    
## CityConway                          NA         NA      NA       NA    
## `CityCoon Rapids`            3.835e+02  5.439e+02   0.705  0.48075    
## CityCoppell                  2.275e+02  4.895e+02   0.465  0.64211    
## `CityCoral Gables`          -4.772e+02  6.546e+02  -0.729  0.46606    
## `CityCoral Springs`         -2.236e+02  4.683e+02  -0.478  0.63300    
## `CityCorpus Christi`         3.291e+02  4.371e+02   0.753  0.45160    
## `CityCosta Mesa`             4.354e+02  4.245e+02   1.026  0.30514    
## `CityCottage Grove`          3.505e+02  6.490e+02   0.540  0.58916    
## CityCovington                2.827e+02  4.932e+02   0.573  0.56648    
## CityCranston                 2.990e+02  4.593e+02   0.651  0.51505    
## `CityCuyahoga Falls`         2.090e+02  4.820e+02   0.434  0.66463    
## CityDallas                   3.581e+02  3.967e+02   0.903  0.36670    
## CityDanbury                  4.246e+02  6.478e+02   0.655  0.51222    
## CityDanville                 3.901e+02  4.428e+02   0.881  0.37841    
## CityDavis                    4.201e+02  6.351e+02   0.661  0.50833    
## `CityDaytona Beach`         -3.265e+02  4.904e+02  -0.666  0.50563    
## CityDearborn                 2.121e+02  4.680e+02   0.453  0.65034    
## `CityDearborn Heights`       2.445e+02  4.594e+02   0.532  0.59466    
## CityDecatur                  2.017e+02  3.814e+02   0.529  0.59689    
## `CityDeer Park`              1.561e+02  6.371e+02   0.245  0.80648    
## `CityDelray Beach`          -2.972e+02  5.113e+02  -0.581  0.56110    
## CityDeltona                 -3.487e+02  4.773e+02  -0.731  0.46503    
## CityDenver                   2.207e+02  3.931e+02   0.561  0.57459    
## `CityDes Moines`             3.906e+02  4.312e+02   0.906  0.36504    
## `CityDes Plaines`            1.506e+02  4.300e+02   0.350  0.72617    
## CityDetroit                  2.566e+02  4.139e+02   0.620  0.53537    
## CityDover                    5.403e+02  4.194e+02   1.288  0.19767    
## CityDraper                   1.030e+02  5.258e+02   0.196  0.84468    
## CityDublin                   2.727e+02  4.118e+02   0.662  0.50787    
## CityDubuque                  4.379e+02  5.125e+02   0.855  0.39285    
## CityDurham                   1.964e+02  4.423e+02   0.444  0.65701    
## CityEagan                    4.637e+02  4.540e+02   1.021  0.30714    
## `CityEast Orange`           -2.934e+01  4.862e+02  -0.060  0.95188    
## `CityEast Point`             3.631e+02  5.270e+02   0.689  0.49088    
## `CityEau Claire`             1.519e+02  4.932e+02   0.308  0.75807    
## CityEdinburg                 2.923e+02  5.302e+02   0.551  0.58145    
## CityEdmond                   3.934e+02  5.645e+02   0.697  0.48596    
## CityEdmonds                  4.015e+02  4.473e+02   0.898  0.36939    
## `CityEl Cajon`               4.315e+02  5.273e+02   0.818  0.41319    
## `CityEl Paso`                3.950e+02  4.151e+02   0.951  0.34143    
## CityElkhart                  2.496e+02  5.240e+02   0.476  0.63389    
## CityElmhurst                 3.317e+02  4.444e+02   0.746  0.45545    
## CityElyria                   3.406e+02  6.321e+02   0.539  0.58999    
## CityEncinitas                3.302e+02  4.503e+02   0.733  0.46339    
## CityEnglewood                1.342e+02  5.224e+02   0.257  0.79726    
## CityEscondido                5.082e+02  4.859e+02   1.046  0.29562    
## CityEugene                   4.510e+02  4.443e+02   1.015  0.31015    
## CityEvanston                 1.195e+02  4.673e+02   0.256  0.79825    
## CityEverett                  1.982e+02  4.097e+02   0.484  0.62856    
## CityFairfield                3.723e+02  3.969e+02   0.938  0.34831    
## CityFargo                    3.254e+02  4.935e+02   0.659  0.50965    
## CityFarmington               3.121e+02  5.644e+02   0.553  0.58033    
## CityFayetteville             1.636e+02  4.232e+02   0.387  0.69913    
## CityFlorence                 2.727e+02  3.992e+02   0.683  0.49449    
## `CityFort Collins`           1.965e+02  4.347e+02   0.452  0.65121    
## `CityFort Lauderdale`       -2.367e+02  4.502e+02  -0.526  0.59908    
## `CityFort Worth`             3.360e+02  4.089e+02   0.822  0.41124    
## CityFrankfort                8.417e+01  5.099e+02   0.165  0.86890    
## CityFranklin                 2.717e+02  4.018e+02   0.676  0.49895    
## CityFreeport                 3.367e+02  4.168e+02   0.808  0.41918    
## CityFremont                  2.423e+02  4.782e+02   0.507  0.61241    
## CityFresno                   4.994e+02  4.059e+02   1.230  0.21858    
## CityFrisco                   2.308e+02  5.298e+02   0.436  0.66315    
## CityGaithersburg             1.768e+01  5.452e+02   0.032  0.97414    
## `CityGarden City`            2.784e+02  4.933e+02   0.564  0.57259    
## CityGarland                  4.022e+02  5.302e+02   0.759  0.44812    
## CityGastonia                 1.618e+02  5.419e+02   0.299  0.76529    
## CityGeorgetown               3.177e+02  4.377e+02   0.726  0.46796    
## CityGilbert                  2.848e+02  2.872e+02   0.992  0.32125    
## CityGladstone                2.861e+02  5.283e+02   0.542  0.58817    
## CityGlendale                 1.608e+02  2.773e+02   0.580  0.56197    
## CityGlenview                        NA         NA      NA       NA    
## CityGoldsboro                1.221e+02  6.477e+02   0.188  0.85052    
## `CityGrand Island`           3.136e+02  6.667e+02   0.470  0.63805    
## `CityGrand Prairie`          2.993e+02  4.222e+02   0.709  0.47835    
## `CityGrand Rapids`           1.397e+02  4.526e+02   0.309  0.75756    
## CityGrapevine                2.528e+02  6.376e+02   0.397  0.69174    
## `CityGreat Falls`            2.377e+02  4.739e+02   0.502  0.61599    
## CityGreeley                  1.960e+02  5.226e+02   0.375  0.70762    
## `CityGreen Bay`              2.594e+02  5.257e+02   0.494  0.62165    
## CityGreensboro               2.465e+02  4.363e+02   0.565  0.57219    
## CityGreenville               1.914e+02  4.422e+02   0.433  0.66508    
## CityGreenwood                5.382e+02  5.240e+02   1.027  0.30436    
## CityGresham                  3.951e+02  4.674e+02   0.845  0.39788    
## `CityGrove City`             2.695e+01  5.238e+02   0.051  0.95897    
## CityGulfport                 1.135e+02  5.108e+02   0.222  0.82415    
## CityHackensack              -2.849e+01  4.637e+02  -0.061  0.95101    
## CityHagerstown               9.308e+01  6.506e+02   0.143  0.88623    
## `CityHaltom City`            3.847e+02  4.675e+02   0.823  0.41057    
## CityHamilton                 3.782e+02  4.605e+02   0.821  0.41155    
## CityHampton                  2.108e+02  4.162e+02   0.507  0.61249    
## CityHarlingen                2.135e+02  4.678e+02   0.457  0.64801    
## CityHarrisonburg             5.037e+02  4.386e+02   1.149  0.25079    
## CityHattiesburg              2.196e+02  4.570e+02   0.480  0.63089    
## CityHelena                   6.519e+01  5.259e+02   0.124  0.90135    
## CityHempstead                3.350e+02  4.316e+02   0.776  0.43764    
## CityHenderson                3.840e+02  3.940e+02   0.975  0.32977    
## CityHendersonville           2.195e+02  4.681e+02   0.469  0.63908    
## CityHesperia                 4.167e+02  4.855e+02   0.858  0.39078    
## CityHialeah                 -4.579e+02  4.378e+02  -1.046  0.29560    
## CityHickory                  8.167e+01  6.469e+02   0.126  0.89953    
## `CityHighland Park`          1.983e+02  4.126e+02   0.481  0.63076    
## CityHillsboro                3.671e+02  4.670e+02   0.786  0.43178    
## CityHolland                  1.940e+02  5.027e+02   0.386  0.69953    
## CityHollywood               -2.701e+02  4.504e+02  -0.600  0.54875    
## CityHolyoke                 -8.512e+01  6.379e+02  -0.133  0.89384    
## CityHomestead               -2.226e+02  5.506e+02  -0.404  0.68603    
## CityHoover                   2.000e+02  4.873e+02   0.410  0.68156    
## `CityHot Springs`            5.247e+00  5.349e+02   0.010  0.99217    
## CityHouston                  3.745e+02  3.953e+02   0.947  0.34348    
## `CityHuntington Beach`       5.068e+02  4.634e+02   1.094  0.27412    
## CityHuntsville               3.362e+02  3.992e+02   0.842  0.39979    
## CityIndependence             7.977e+02  5.285e+02   1.510  0.13121    
## CityIndianapolis             3.198e+02  4.020e+02   0.796  0.42626    
## CityInglewood                4.382e+02  4.125e+02   1.062  0.28811    
## `CityIowa City`              5.859e+02  6.554e+02   0.894  0.37141    
## CityIrving                   4.065e+02  4.675e+02   0.870  0.38455    
## CityJackson                  3.216e+02  4.103e+02   0.784  0.43321    
## CityJacksonville             1.766e+02  4.166e+02   0.424  0.67164    
## CityJamestown                1.739e+03  5.295e+02   3.285  0.00103 ** 
## `CityJefferson City`         3.629e+02  6.359e+02   0.571  0.56827    
## `CityJohnson City`           2.243e+02  4.263e+02   0.526  0.59879    
## CityJonesboro               -1.304e+01  4.796e+02  -0.027  0.97831    
## CityJupiter                         NA         NA      NA       NA    
## CityKeller                   4.625e+02  6.382e+02   0.725  0.46869    
## CityKenner                   6.766e+02  6.526e+02   1.037  0.29991    
## CityKenosha                  2.898e+02  4.736e+02   0.612  0.54052    
## CityKent                     3.250e+02  4.250e+02   0.765  0.44446    
## CityKirkwood                 4.181e+02  6.360e+02   0.657  0.51101    
## CityKissimmee                1.818e+02  6.549e+02   0.278  0.78134    
## CityKnoxville                3.992e+02  4.113e+02   0.971  0.33180    
## `CityLa Crosse`              1.376e+02  5.055e+02   0.272  0.78542    
## `CityLa Mesa`                4.761e+02  5.271e+02   0.903  0.36650    
## `CityLa Porte`               1.989e+02  4.188e+02   0.475  0.63483    
## `CityLa Quinta`              3.875e+02  6.351e+02   0.610  0.54182    
## CityLafayette                6.824e+02  4.088e+02   1.670  0.09505 .  
## `CityLaguna Niguel`          2.139e+02  4.855e+02   0.440  0.65960    
## `CityLake Charles`           3.522e+02  5.483e+02   0.642  0.52070    
## `CityLake Elsinore`          1.878e+02  6.352e+02   0.296  0.76748    
## `CityLake Forest`            5.804e+02  4.502e+02   1.289  0.19738    
## CityLakeland                -2.136e+02  4.408e+02  -0.485  0.62792    
## CityLakeville                3.415e+02  4.376e+02   0.780  0.43523    
## CityLakewood                 2.800e+02  4.016e+02   0.697  0.48573    
## CityLancaster                6.864e+02  3.956e+02   1.735  0.08273 .  
## CityLansing                  1.805e+02  4.522e+02   0.399  0.68973    
## CityLaredo                   4.186e+02  4.283e+02   0.977  0.32844    
## `CityLas Cruces`             2.709e+02  5.261e+02   0.515  0.60660    
## `CityLas Vegas`              9.617e+01  5.102e+02   0.188  0.85050    
## CityLaurel                   2.221e+02  6.503e+02   0.341  0.73275    
## CityLawrence                 1.264e+02  3.963e+02   0.319  0.74981    
## CityLawton                   1.718e+02  5.647e+02   0.304  0.76096    
## CityLayton                          NA         NA      NA       NA    
## `CityLeague City`            3.274e+02  4.446e+02   0.736  0.46154    
## CityLebanon                  2.045e+02  6.388e+02   0.320  0.74887    
## CityLehi                    -1.864e+02  5.643e+02  -0.330  0.74112    
## CityLeominster               1.938e+02  4.445e+02   0.436  0.66285    
## CityLewiston                 2.746e+02  6.667e+02   0.412  0.68048    
## `CityLincoln Park`           7.423e+01  5.026e+02   0.148  0.88259    
## CityLinden                  -2.233e+02  6.511e+02  -0.343  0.73166    
## CityLindenhurst              5.714e+02  6.369e+02   0.897  0.36964    
## `CityLittle Rock`            7.674e+01  4.633e+02   0.166  0.86844    
## CityLittleton               -1.450e+02  6.311e+02  -0.230  0.81828    
## CityLodi                     2.823e+02  6.355e+02   0.444  0.65691    
## CityLogan                    1.213e+02  4.933e+02   0.246  0.80573    
## `CityLong Beach`             4.165e+02  3.951e+02   1.054  0.29187    
## CityLongmont                 3.021e+02  4.806e+02   0.629  0.52962    
## CityLongview                 2.838e+02  5.532e+02   0.513  0.60799    
## CityLorain                   4.745e+02  4.297e+02   1.104  0.26948    
## `CityLos Angeles`            4.737e+02  3.907e+02   1.212  0.22540    
## CityLouisville               3.049e+02  3.874e+02   0.787  0.43127    
## CityLoveland                 2.165e+02  4.581e+02   0.473  0.63651    
## CityLowell                   2.613e+02  4.167e+02   0.627  0.53057    
## CityLubbock                  3.371e+02  4.671e+02   0.722  0.47049    
## CityMacon                    2.252e+02  4.405e+02   0.511  0.60925    
## CityMadison                  4.916e+02  4.701e+02   1.046  0.29568    
## CityMalden                   9.332e+01  4.894e+02   0.191  0.84879    
## CityManchester               3.493e+02  4.398e+02   0.794  0.42710    
## CityManhattan                2.892e+02  6.664e+02   0.434  0.66432    
## CityMansfield                2.529e+02  5.302e+02   0.477  0.63341    
## CityManteca                  3.928e+02  5.271e+02   0.745  0.45609    
## `CityMaple Grove`            5.178e+02  4.829e+02   1.072  0.28359    
## CityMargate                 -2.131e+02  6.548e+02  -0.325  0.74486    
## CityMarietta                 2.872e+02  4.403e+02   0.652  0.51433    
## CityMarion                   4.261e+02  4.072e+02   1.046  0.29537    
## CityMarlborough              1.919e+02  6.375e+02   0.301  0.76336    
## CityMarysville              -4.978e+01  6.573e+02  -0.076  0.93964    
## CityMason                    3.405e+02  4.823e+02   0.706  0.48021    
## CityMcallen                  3.441e+02  4.107e+02   0.838  0.40204    
## CityMedford                  4.451e+02  4.672e+02   0.953  0.34072    
## CityMedina                   2.784e+02  4.245e+02   0.656  0.51200    
## CityMelbourne               -2.902e+02  6.548e+02  -0.443  0.65761    
## CityMemphis                  3.687e+02  4.077e+02   0.904  0.36582    
## CityMentor                   3.285e+02  4.369e+02   0.752  0.45212    
## CityMeriden                  4.319e+02  4.375e+02   0.987  0.32362    
## CityMeridian                 2.142e+02  8.825e+02   0.243  0.80824    
## CityMesa                     8.584e+01  2.712e+02   0.316  0.75165    
## CityMesquite                 3.490e+02  4.529e+02   0.771  0.44097    
## CityMiami                   -2.099e+02  4.272e+02  -0.491  0.62324    
## CityMiddletown               4.623e+02  4.470e+02   1.034  0.30107    
## CityMidland                  3.464e+02  4.382e+02   0.791  0.42923    
## CityMilford                  3.027e+02  4.675e+02   0.647  0.51733    
## CityMilwaukee                2.068e+02  4.469e+02   0.463  0.64360    
## CityMinneapolis              7.164e+02  4.262e+02   1.681  0.09285 .  
## CityMiramar                 -3.649e+02  4.774e+02  -0.764  0.44463    
## CityMishawaka                2.137e+02  5.242e+02   0.408  0.68352    
## `CityMission Viejo`          4.004e+02  4.639e+02   0.863  0.38810    
## CityMissoula                 5.582e+02  6.662e+02   0.838  0.40209    
## `CityMissouri City`          1.049e+02  6.369e+02   0.165  0.86913    
## CityMobile                   1.495e+02  4.233e+02   0.353  0.72385    
## CityModesto                  2.642e+02  5.271e+02   0.501  0.61621    
## CityMonroe                   4.584e+02  4.245e+02   1.080  0.28021    
## CityMontebello               4.886e+02  6.349e+02   0.770  0.44161    
## CityMontgomery               2.491e+02  4.230e+02   0.589  0.55595    
## CityMoorhead                 3.397e+02  5.436e+02   0.625  0.53207    
## `CityMoreno Valley`          4.998e+02  4.287e+02   1.166  0.24371    
## `CityMorgan Hill`            2.437e+02  4.639e+02   0.525  0.59941    
## CityMorristown               6.870e+01  4.574e+02   0.150  0.88060    
## `CityMount Pleasant`         2.579e+02  5.313e+02   0.485  0.62745    
## `CityMount Vernon`           4.229e+02  4.314e+02   0.980  0.32697    
## CityMurfreesboro             2.117e+02  4.232e+02   0.500  0.61691    
## CityMurray                   1.140e+02  5.260e+02   0.217  0.82844    
## CityMurrieta                        NA         NA      NA       NA    
## CityMuskogee                 1.600e+02  5.646e+02   0.283  0.77694    
## CityNaperville               1.589e+02  4.125e+02   0.385  0.70001    
## CityNashua                   4.877e+02  5.488e+02   0.889  0.37424    
## CityNashville                3.487e+02  4.072e+02   0.856  0.39182    
## `CityNew Albany`             1.343e+02  4.826e+02   0.278  0.78078    
## `CityNew Bedford`            1.049e+02  4.537e+02   0.231  0.81715    
## `CityNew Brunswick`          1.203e+02  6.516e+02   0.185  0.85358    
## `CityNew Castle`            -8.239e+00  5.241e+02  -0.016  0.98746    
## `CityNew Rochelle`           2.582e+02  4.276e+02   0.604  0.54587    
## `CityNew York City`          4.270e+02  3.938e+02   1.084  0.27832    
## CityNewark                   4.398e+02  3.945e+02   1.115  0.26494    
## `CityNewport News`           2.057e+02  4.189e+02   0.491  0.62333    
## `CityNiagara Falls`          2.987e+02  4.886e+02   0.611  0.54101    
## CityNoblesville              7.526e+02  4.825e+02   1.560  0.11886    
## CityNorfolk                         NA         NA      NA       NA    
## CityNormal                   2.232e+02  6.211e+02   0.359  0.71930    
## CityNorman                   1.075e+03  6.663e+02   1.613  0.10678    
## `CityNorth Charleston`       4.447e+02  6.432e+02   0.691  0.48931    
## `CityNorth Las Vegas`        4.781e+02  5.050e+02   0.947  0.34380    
## `CityNorth Miami`           -2.681e+02  5.507e+02  -0.487  0.62638    
## CityNorwich                  4.730e+02  5.023e+02   0.942  0.34639    
## `CityOak Park`               3.201e+02  4.830e+02   0.663  0.50757    
## CityOakland                  4.683e+02  4.046e+02   1.158  0.24707    
## CityOceanside                3.348e+02  4.025e+02   0.832  0.40552    
## CityOdessa                   3.187e+02  4.535e+02   0.703  0.48220    
## `CityOklahoma City`          3.104e+02  4.536e+02   0.684  0.49375    
## CityOlathe                   3.329e+02  4.936e+02   0.674  0.50010    
## CityOlympia                  3.479e+02  4.805e+02   0.724  0.46909    
## CityOmaha                    2.750e+02  4.506e+02   0.610  0.54169    
## CityOntario                         NA         NA      NA       NA    
## CityOrange                   8.103e+00  4.638e+02   0.017  0.98606    
## CityOrem                     3.595e+02  4.738e+02   0.759  0.44807    
## `CityOrland Park`            2.806e+02  6.209e+02   0.452  0.65135    
## CityOrlando                 -4.107e+02  4.501e+02  -0.912  0.36165    
## `CityOrmond Beach`                  NA         NA      NA       NA    
## CityOswego                   5.373e+02  5.098e+02   1.054  0.29189    
## `CityOverland Park`          2.346e+02  4.930e+02   0.476  0.63416    
## CityOwensboro                3.597e+02  5.244e+02   0.686  0.49281    
## CityOxnard                   5.582e+02  4.242e+02   1.316  0.18829    
## CityPalatine                 3.586e+01  6.209e+02   0.058  0.95395    
## `CityPalm Coast`            -2.981e+02  5.110e+02  -0.583  0.55966    
## `CityPark Ridge`             4.318e+02  5.099e+02   0.847  0.39705    
## CityParker                   1.800e+02  4.275e+02   0.421  0.67381    
## CityParma                    4.726e+02  4.209e+02   1.123  0.26158    
## CityPasadena                 4.038e+02  3.971e+02   1.017  0.30925    
## CityPasco                    4.592e+02  4.720e+02   0.973  0.33060    
## CityPassaic                 -9.670e+01  4.639e+02  -0.208  0.83490    
## CityPaterson                -6.006e+01  4.338e+02  -0.138  0.88988    
## CityPearland                 3.296e+02  4.889e+02   0.674  0.50026    
## `CityPembroke Pines`        -3.067e+02  4.458e+02  -0.688  0.49152    
## CityPensacola               -3.436e+02  6.548e+02  -0.525  0.59974    
## CityPeoria                   1.225e+02  2.896e+02   0.423  0.67238    
## `CityPerth Amboy`            1.242e+01  4.728e+02   0.026  0.97904    
## CityPharr                    3.389e+02  4.890e+02   0.693  0.48839    
## CityPhiladelphia             7.616e+02  4.341e+02   1.754  0.07941 .  
## CityPhoenix                  2.171e+02  2.602e+02   0.834  0.40421    
## `CityPico Rivera`            4.303e+02  6.353e+02   0.677  0.49823    
## `CityPine Bluff`             7.590e+01  5.723e+02   0.133  0.89449    
## CityPlainfield               1.580e+02  4.457e+02   0.355  0.72291    
## CityPlano                    3.827e+02  4.226e+02   0.905  0.36524    
## CityPlantation              -3.256e+02  4.532e+02  -0.718  0.47255    
## `CityPleasant Grove`         3.028e+02  4.850e+02   0.624  0.53233    
## CityPocatello                2.258e+02  8.547e+02   0.264  0.79167    
## CityPomona                   4.865e+02  4.499e+02   1.081  0.27965    
## `CityPompano Beach`         -3.501e+02  5.111e+02  -0.685  0.49336    
## `CityPort Arthur`            1.668e+02  4.538e+02   0.368  0.71321    
## `CityPort Orange`           -2.151e+02  6.551e+02  -0.328  0.74271    
## `CityPort Saint Lucie`      -5.578e+02  5.507e+02  -1.013  0.31116    
## CityPortage                  3.489e+02  6.327e+02   0.551  0.58134    
## CityPortland                 3.160e+02  4.085e+02   0.773  0.43928    
## CityProvidence               2.675e+02  4.503e+02   0.594  0.55252    
## CityProvo                    3.165e+02  4.737e+02   0.668  0.50409    
## CityPueblo                   2.151e+02  4.275e+02   0.503  0.61492    
## CityQuincy                   1.045e+02  3.919e+02   0.267  0.78974    
## CityRaleigh                  1.903e+02  4.262e+02   0.447  0.65521    
## `CityRancho Cucamonga`       4.721e+02  4.638e+02   1.018  0.30877    
## `CityRapid City`             2.984e+02  6.665e+02   0.448  0.65438    
## CityReading                  7.628e+02  4.880e+02   1.563  0.11802    
## CityRedding                  4.925e+02  6.351e+02   0.775  0.43807    
## CityRedlands                 5.128e+02  4.127e+02   1.243  0.21399    
## CityRedmond                  2.625e+02  4.265e+02   0.615  0.53828    
## `CityRedondo Beach`          4.412e+02  4.406e+02   1.002  0.31659    
## `CityRedwood City`           3.236e+02  6.348e+02   0.510  0.61023    
## CityReno                     1.918e+02  5.684e+02   0.337  0.73580    
## CityRenton                   1.112e+02  5.531e+02   0.201  0.84069    
## CityRevere                   1.099e+02  4.536e+02   0.242  0.80864    
## CityRichardson               7.320e+02  5.299e+02   1.381  0.16725    
## CityRichmond                 2.564e+02  3.867e+02   0.663  0.50724    
## `CityRio Rancho`             2.487e+02  5.645e+02   0.441  0.65958    
## CityRiverside                5.939e+02  4.184e+02   1.419  0.15581    
## CityRochester                4.060e+02  4.006e+02   1.013  0.31087    
## `CityRochester Hills`        1.467e+02  6.480e+02   0.226  0.82091    
## `CityRock Hill`              3.467e+02  6.432e+02   0.539  0.58994    
## CityRockford                 3.829e+02  4.026e+02   0.951  0.34160    
## CityRockville                1.625e+02  4.715e+02   0.345  0.73037    
## CityRogers                  -2.275e+02  6.733e+02  -0.338  0.73540    
## CityRome                     3.144e+02  4.530e+02   0.694  0.48766    
## CityRomeoville               1.803e+02  6.206e+02   0.291  0.77140    
## CityRoseville                3.583e+02  4.033e+02   0.889  0.37424    
## CityRoswell                  3.957e+02  4.099e+02   0.965  0.33438    
## `CityRound Rock`             1.366e+03  4.535e+02   3.012  0.00260 ** 
## `CityRoyal Oak`              3.042e+02  5.427e+02   0.560  0.57520    
## CitySacramento               7.530e+02  4.183e+02   1.800  0.07190 .  
## CitySaginaw                  9.405e+01  4.593e+02   0.205  0.83778    
## `CitySaint Charles`          1.605e+02  4.126e+02   0.389  0.69738    
## `CitySaint Cloud`            4.993e+02  5.434e+02   0.919  0.35826    
## `CitySaint Louis`            1.072e+02  4.868e+02   0.220  0.82572    
## `CitySaint Paul`             2.300e+02  5.436e+02   0.423  0.67222    
## `CitySaint Peters`           1.642e+02  6.363e+02   0.258  0.79641    
## `CitySaint Petersburg`      -2.404e+02  4.421e+02  -0.544  0.58667    
## CitySalem                    2.975e+02  3.988e+02   0.746  0.45568    
## CitySalinas                  4.271e+02  4.500e+02   0.949  0.34254    
## `CitySalt Lake City`         1.724e+02  4.738e+02   0.364  0.71601    
## `CitySan Angelo`             1.816e+02  4.891e+02   0.371  0.71051    
## `CitySan Antonio`            6.551e+02  4.002e+02   1.637  0.10162    
## `CitySan Bernardino`         4.763e+02  4.287e+02   1.111  0.26659    
## `CitySan Clemente`           4.601e+02  6.352e+02   0.724  0.46894    
## `CitySan Diego`              4.938e+02  3.925e+02   1.258  0.20841    
## `CitySan Francisco`          4.895e+02  3.909e+02   1.252  0.21046    
## `CitySan Gabriel`            7.812e+02  4.859e+02   1.608  0.10796    
## `CitySan Jose`               4.044e+02  3.997e+02   1.012  0.31167    
## `CitySan Luis Obispo`        4.812e+02  6.353e+02   0.757  0.44880    
## `CitySan Marcos`             3.295e+02  4.889e+02   0.674  0.50041    
## `CitySan Mateo`              4.075e+02  6.357e+02   0.641  0.52153    
## `CitySandy Springs`          2.054e+02  4.185e+02   0.491  0.62353    
## CitySanford                 -2.729e+02  5.504e+02  -0.496  0.62000    
## `CitySanta Ana`              3.506e+02  4.183e+02   0.838  0.40191    
## `CitySanta Barbara`          7.887e+02  4.337e+02   1.818  0.06906 .  
## `CitySanta Clara`            4.386e+02  4.500e+02   0.975  0.32975    
## `CitySanta Fe`               4.963e+01  5.644e+02   0.088  0.92992    
## `CitySanta Maria`            3.857e+02  6.355e+02   0.607  0.54395    
## CityScottsdale               9.728e+01  2.901e+02   0.335  0.73733    
## CitySeattle                  3.787e+02  4.257e+02   0.890  0.37364    
## CitySheboygan                2.211e+02  5.064e+02   0.437  0.66245    
## CityShelton                  4.005e+02  5.425e+02   0.738  0.46045    
## `CitySierra Vista`           1.354e+02  3.833e+02   0.353  0.72401    
## `CitySioux Falls`            2.107e+02  4.698e+02   0.448  0.65386    
## CitySkokie                   1.956e+02  4.068e+02   0.481  0.63055    
## CitySmyrna                   2.388e+02  4.029e+02   0.593  0.55336    
## `CitySouth Bend`             1.313e+02  4.606e+02   0.285  0.77561    
## CitySouthaven                2.704e+02  4.766e+02   0.567  0.57048    
## CitySparks                   2.599e+02  5.684e+02   0.457  0.64754    
## CitySpokane                  1.070e+02  4.802e+02   0.223  0.82371    
## CitySpringdale               1.539e+02  6.730e+02   0.229  0.81916    
## CitySpringfield              3.413e+02  3.871e+02   0.882  0.37795    
## `CitySterling Heights`       2.060e+02  5.428e+02   0.379  0.70436    
## CityStockton                 3.958e+02  4.636e+02   0.854  0.39324    
## CitySuffolk                  3.816e+02  4.385e+02   0.870  0.38427    
## CitySummerville              2.241e+02  5.367e+02   0.418  0.67631    
## CitySunnyvale                4.319e+02  4.405e+02   0.980  0.32688    
## CitySuperior                 2.392e+02  4.787e+02   0.500  0.61736    
## CityTallahassee             -1.343e+02  4.396e+02  -0.306  0.75998    
## CityTamarac                  7.514e+01  4.901e+02   0.153  0.87816    
## CityTampa                   -2.412e+02  4.309e+02  -0.560  0.57565    
## CityTaylor                   1.930e+02  5.026e+02   0.384  0.70103    
## CityTemecula                 5.550e+02  4.861e+02   1.142  0.25363    
## CityTempe                    1.266e+02  2.933e+02   0.432  0.66609    
## CityTexarkana               -7.687e+01  5.721e+02  -0.134  0.89311    
## `CityTexas City`             1.411e+02  4.895e+02   0.288  0.77312    
## `CityThe Colony`             2.752e+02  5.307e+02   0.518  0.60414    
## CityThomasville              2.012e+02  5.415e+02   0.372  0.71019    
## CityThornton                 1.454e+02  4.279e+02   0.340  0.73410    
## `CityThousand Oaks`          4.215e+02  4.499e+02   0.937  0.34876    
## CityTigard                   5.012e+02  4.374e+02   1.146  0.25193    
## `CityTinley Park`           -9.046e+00  6.205e+02  -0.015  0.98837    
## CityToledo                   3.288e+02  3.974e+02   0.827  0.40801    
## CityTorrance                 8.018e+02  4.636e+02   1.729  0.08378 .  
## CityTrenton                  1.525e+01  4.356e+02   0.035  0.97207    
## CityTroy                     3.650e+02  4.007e+02   0.911  0.36227    
## CityTucson                   1.341e+02  2.673e+02   0.502  0.61588    
## CityTulsa                    2.396e+02  4.516e+02   0.530  0.59580    
## CityTuscaloosa               2.367e+01  6.367e+02   0.037  0.97035    
## `CityTwin Falls`             3.123e+02  9.055e+02   0.345  0.73023    
## CityTyler                    4.251e+02  4.533e+02   0.938  0.34839    
## CityUrbandale                4.148e+02  5.126e+02   0.809  0.41845    
## CityUtica                    3.446e+02  4.314e+02   0.799  0.42443    
## CityVacaville                1.819e+02  6.355e+02   0.286  0.77463    
## CityVallejo                  6.084e+02  4.638e+02   1.312  0.18966    
## CityVancouver                4.134e+02  4.803e+02   0.861  0.38946    
## CityVineland                -9.540e+01  4.637e+02  -0.206  0.83702    
## `CityVirginia Beach`         3.727e+02  4.088e+02   0.912  0.36191    
## CityVisalia                  3.824e+02  4.639e+02   0.824  0.40981    
## CityWaco                     4.243e+02  4.441e+02   0.955  0.33940    
## `CityWarner Robins`          1.587e+02  5.272e+02   0.301  0.76337    
## CityWarwick                  2.927e+02  5.260e+02   0.556  0.57796    
## CityWashington               1.671e+02  4.673e+02   0.358  0.72062    
## CityWaterbury                3.908e+02  4.432e+02   0.882  0.37799    
## CityWaterloo                 4.514e+02  6.559e+02   0.688  0.49133    
## CityWatertown                4.761e+02  4.316e+02   1.103  0.26998    
## CityWaukesha                -6.911e+00  6.668e+02  -0.010  0.99173    
## CityWausau                   1.464e+02  5.061e+02   0.289  0.77232    
## CityWaynesboro               4.268e+02  4.265e+02   1.001  0.31693    
## `CityWest Allis`            -4.904e+01  6.666e+02  -0.074  0.94135    
## `CityWest Jordan`            2.145e+02  4.933e+02   0.435  0.66371    
## `CityWest Palm Beach`       -3.271e+02  5.112e+02  -0.640  0.52229    
## CityWestfield                9.824e+00  4.729e+02   0.021  0.98342    
## CityWestland                 2.396e+02  4.342e+02   0.552  0.58109    
## CityWestminster              5.813e+02  4.127e+02   1.409  0.15898    
## CityWheeling                 3.658e+02  4.669e+02   0.783  0.43338    
## CityWhittier                        NA         NA      NA       NA    
## CityWichita                  2.067e+02  4.847e+02   0.426  0.66983    
## CityWilmington               3.075e+02  4.061e+02   0.757  0.44899    
## CityWilson                   1.769e+02  4.801e+02   0.369  0.71244    
## CityWoodbury                 5.053e+02  5.038e+02   1.003  0.31586    
## CityWoodland                 5.017e+02  4.856e+02   1.033  0.30157    
## CityWoodstock                2.013e+02  4.492e+02   0.448  0.65401    
## CityWoonsocket               2.216e+02  5.259e+02   0.421  0.67348    
## CityYonkers                  4.766e+02  4.172e+02   1.142  0.25337    
## CityYork                     7.131e+02  4.877e+02   1.462  0.14373    
## CityYucaipa                  4.588e+02  6.353e+02   0.722  0.47022    
## CityYuma                            NA         NA      NA       NA    
## StateAlabama                 5.669e+00  2.593e+02   0.022  0.98256    
## StateArizona                 3.670e+01  3.608e+02   0.102  0.91898    
## StateArkansas                1.732e+02  3.280e+02   0.528  0.59757    
## StateCalifornia             -2.407e+02  2.427e+02  -0.992  0.32128    
## StateColorado                1.517e+01  2.551e+02   0.059  0.95258    
## StateConnecticut            -1.430e+02  2.673e+02  -0.535  0.59268    
## StateDelaware               -1.704e+02  2.537e+02  -0.672  0.50177    
## `StateDistrict of Columbia`         NA         NA      NA       NA    
## StateFlorida                 4.736e+02  2.888e+02   1.640  0.10113    
## StateGeorgia                -2.641e+01  2.388e+02  -0.111  0.91192    
## StateIdaho                   3.315e+01  7.082e+02   0.047  0.96266    
## StateIllinois               -4.970e+00  2.434e+02  -0.020  0.98371    
## StateIndiana                 1.520e+01  2.327e+02   0.065  0.94792    
## StateIowa                   -2.108e+02  2.894e+02  -0.728  0.46634    
## StateKansas                         NA         NA      NA       NA    
## StateKentucky               -8.211e+01  2.444e+02  -0.336  0.73691    
## StateLouisiana              -3.876e+02  2.836e+02  -1.367  0.17176    
## StateMaine                          NA         NA      NA       NA    
## StateMaryland               -4.977e+00  2.663e+02  -0.019  0.98509    
## StateMassachusetts           4.542e+01  2.120e+02   0.214  0.83039    
## StateMichigan                2.762e+01  2.624e+02   0.105  0.91617    
## StateMinnesota              -2.167e+02  2.771e+02  -0.782  0.43417    
## StateMississippi            -5.833e+01  2.754e+02  -0.212  0.83225    
## StateMissouri               -1.080e+01  2.434e+02  -0.044  0.96461    
## StateMontana                        NA         NA      NA       NA    
## StateNebraska                       NA         NA      NA       NA    
## StateNevada                  7.503e+01  3.863e+02   0.194  0.84603    
## `StateNew Hampshire`        -3.693e+02  2.857e+02  -1.293  0.19619    
## `StateNew Jersey`            2.467e+02  2.813e+02   0.877  0.38054    
## `StateNew Mexico`                   NA         NA      NA       NA    
## `StateNew York`             -1.605e+02  2.505e+02  -0.641  0.52158    
## `StateNorth Carolina`       -2.693e+00  2.714e+02  -0.010  0.99209    
## `StateNorth Dakota`                 NA         NA      NA       NA    
## StateOhio                   -1.100e+02  2.342e+02  -0.470  0.63867    
## StateOklahoma                       NA         NA      NA       NA    
## StateOregon                 -1.895e+02  2.488e+02  -0.761  0.44647    
## StatePennsylvania           -4.921e+02  3.065e+02  -1.606  0.10838    
## `StateRhode Island`                 NA         NA      NA       NA    
## `StateSouth Carolina`       -1.253e+02  2.515e+02  -0.498  0.61827    
## `StateSouth Dakota`                 NA         NA      NA       NA    
## StateTennessee              -4.918e+01  2.247e+02  -0.219  0.82677    
## StateTexas                  -1.655e+02  2.527e+02  -0.655  0.51245    
## StateUtah                           NA         NA      NA       NA    
## StateVermont                -3.550e+02  3.419e+02  -1.038  0.29915    
## StateVirginia               -4.155e+01  2.390e+02  -0.174  0.86200    
## StateWashington             -1.231e+02  2.905e+02  -0.424  0.67184    
## `StateWest Virginia`        -6.242e+01  4.753e+02  -0.131  0.89552    
## StateWisconsin                      NA         NA      NA       NA    
## StateWyoming                        NA         NA      NA       NA    
## RegionCentral                       NA         NA      NA       NA    
## RegionEast                          NA         NA      NA       NA    
## RegionSouth                         NA         NA      NA       NA    
## RegionWest                          NA         NA      NA       NA    
## Category.Furniture          -8.857e+02  4.804e+01 -18.436  < 2e-16 ***
## `Category.Office Supplies`  -1.384e+03  4.829e+01 -28.668  < 2e-16 ***
## Category.Technology                 NA         NA      NA       NA    
## Sub.CategoryAccessories     -1.479e+03  5.571e+01 -26.550  < 2e-16 ***
## Sub.CategoryAppliances      -5.642e+01  3.298e+01  -1.711  0.08713 .  
## Sub.CategoryArt             -2.149e+02  2.775e+01  -7.746 1.07e-14 ***
## Sub.CategoryBinders         -1.681e+02  2.886e+01  -5.823 6.01e-09 ***
## Sub.CategoryChairs          -2.553e+02  4.442e+01  -5.747 9.44e-09 ***
## Sub.CategoryFurnishings     -6.652e+02  4.259e+01 -15.621  < 2e-16 ***
## Sub.CategoryLabels          -2.205e+02  3.510e+01  -6.283 3.49e-10 ***
## Sub.CategoryOther           -1.240e+02  2.950e+01  -4.204 2.65e-05 ***
## Sub.CategoryPaper           -2.068e+02  2.444e+01  -8.463  < 2e-16 ***
## Sub.CategoryPhones          -1.303e+03  5.446e+01 -23.936  < 2e-16 ***
## Sub.CategoryStorage                 NA         NA      NA       NA    
## Quantity                     5.067e+01  2.517e+00  20.128  < 2e-16 ***
## Discount                     1.542e+02  5.528e+01   2.790  0.00529 ** 
## Profit                       1.210e+00  2.398e-02  50.463  < 2e-16 ***
## Days_Between                 3.285e+00  5.974e+00   0.550  0.58238    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 500.6 on 7917 degrees of freedom
## Multiple R-squared:  0.4514, Adjusted R-squared:  0.4115 
## F-statistic: 11.31 on 576 and 7917 DF,  p-value: < 2.2e-16
# Making predictions on the test set
predictions <- predict(lm_model2, newdata = test_data)
## Warning in predict.lm(lm_model2, newdata = test_data): prediction from
## rank-deficient fit; attr(*, "non-estim") has doubtful cases
# Calculating the performance metrics (e.g., RMSE)
rmse <- sqrt(mean((predictions - test_data$Sales)^2))

# Printing the RMSE
print(paste("Root Mean Squared Error (RMSE):", rmse))
## [1] "Root Mean Squared Error (RMSE): 404.803076422241"
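
The warning above and the NA rows in the coefficient table both point to a rank-deficient design: some dummy columns are exact linear combinations of others (Region, for instance, is presumably fixed once State is known). The sketch below uses a small made-up data frame, not the Superstore data, to show one way to list such aliased coefficients. The names `toy`, `fit`, and `fit_reduced` are illustrative assumptions.

```r
# Toy data (hypothetical): `region` is fully determined by `state`,
# similar to how Region is determined by State in a retail dataset.
set.seed(42)
toy <- data.frame(
  state = rep(c("A", "B", "C", "D"), each = 10),
  x     = rnorm(40)
)
toy$region <- ifelse(toy$state %in% c("A", "B"), "East", "West")
toy$y <- 2 * toy$x + rnorm(40)

# Fitting with both predictors produces a rank-deficient model
fit <- lm(y ~ state + region + x, data = toy)

# Coefficients that lm() could not estimate come back as NA
aliased <- names(coef(fit))[is.na(coef(fit))]
print(aliased)

# The fitted rank is smaller than the number of coefficients
print(c(rank = fit$rank, n_coef = length(coef(fit))))

# Dropping the redundant predictor leaves the fitted values unchanged
fit_reduced <- lm(y ~ state + x, data = toy)
same_fit <- all.equal(unname(fitted(fit)), unname(fitted(fit_reduced)))
print(same_fit)
```

Applied to the model above, `names(coef(lm_model2))[is.na(coef(lm_model2))]` would list the terms behind the warning. Removing redundant predictors before fitting should not change the fitted values, but it silences the warning and makes the summary easier to read.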

AIC-BIC to compare models:

# Compare models using AIC
aic_model1 <- AIC(lm_model)
aic_model2 <- AIC(lm_model2)

# Compare models using BIC
bic_model1 <- BIC(lm_model)
bic_model2 <- BIC(lm_model2)

# Printing the AIC and BIC values
print(paste("AIC for lm_model:", aic_model1))
## [1] "AIC for lm_model: 104433.670401841"
print(paste("AIC for lm_model2:", aic_model2))
## [1] "AIC for lm_model2: 130256.169483449"
print(paste("BIC for lm_model:", bic_model1))
## [1] "BIC for lm_model: 108257.616996843"
print(paste("BIC for lm_model2:", bic_model2))
## [1] "BIC for lm_model2: 134329.402133133"

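lm_model has the lower AIC (104,433.7 vs. 130,256.2) and the lower BIC (108,257.6 vs. 134,329.4), so both criteria favor model 1. This comparison assumes both models were fitted to the same observations and the same response variable; AIC and BIC are not comparable otherwise.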
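
As a minimal sketch of how such a comparison can be checked, the example below fits two nested models to simulated data (not the Superstore data), confirms that both use the same number of observations, and reproduces AIC and BIC by hand from the log-likelihood. The names `toy`, `m_small`, and `m_large` are hypothetical.

```r
# Simulated data (hypothetical): x2 is pure noise by construction
set.seed(1)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- 1 + 3 * toy$x1 + rnorm(n)

m_small <- lm(y ~ x1, data = toy)
m_large <- lm(y ~ x1 + x2, data = toy)

# Information criteria are only comparable on identical observations
stopifnot(nobs(m_small) == nobs(m_large))

# Collect the criteria side by side; df counts coefficients plus sigma
comparison <- data.frame(
  model = c("m_small", "m_large"),
  df    = c(attr(logLik(m_small), "df"), attr(logLik(m_large), "df")),
  AIC   = c(AIC(m_small), AIC(m_large)),
  BIC   = c(BIC(m_small), BIC(m_large))
)
print(comparison)

# AIC = -2 * logLik + 2 * k and BIC = -2 * logLik + log(n) * k
ll <- as.numeric(logLik(m_small))
k  <- attr(logLik(m_small), "df")
manual_aic <- -2 * ll + 2 * k
manual_bic <- -2 * ll + log(nobs(m_small)) * k
```

Calling `AIC(m_small, m_large)` directly also returns a small data frame with the `df` and `AIC` columns. Lower values indicate the preferred model, with BIC penalizing extra parameters more heavily than AIC once the sample has more than about seven observations.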

Key Takeaways:

  • There are no missing values in the columns.

  • Kurtosis shows that every numeric column is more peaked than a normal distribution, and skewness shows that none of them is normally distributed. All of them are right-skewed except Days_Between.

  • Sales contains a number of outliers, which could have several explanations: sales may have been unusually high on those days, perhaps because of large discounts or other factors.

  • The plot looks fairly uniform, which suggests that quantity does not affect the discount.

  • No strong linear correlations can be seen here.

  • Hypothesis testing suggests there is no significant difference in quantity ordered between different states.

  • Hypothesis testing also suggests there is no significant interaction effect between “State” and “Category”.

  • Hypothesis testing also suggests sales differ among the subcategories.

  • Feature engineering created a new feature, Days_Between. Categorical columns were encoded so that the model can accept them.

  • Data split into train and test sets.

  • Trained two different regression models that predict, on the basis of all the factors, the sales for a particular superstore.

  • Finally, compared the two models and chose the better one on the basis of AIC and BIC.
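
The skewness and kurtosis observations in the takeaways can be reproduced without extra packages. The sketch below defines moment-based estimators in base R and applies them to simulated vectors, not to the Superstore columns. The helper names `skewness` and `excess_kurtosis` are assumptions made for this example and may differ from whatever functions the original analysis used.

```r
# Moment-based sample skewness: third central moment over sd^3
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Excess kurtosis: fourth central moment over variance^2, minus 3
# (a normal distribution has excess kurtosis 0)
excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2 - 3
}

# Simulated examples (hypothetical data)
set.seed(7)
right_skewed <- rexp(5000)   # long right tail
symmetric    <- rnorm(5000)  # roughly normal

shape_summary <- data.frame(
  variable        = c("right_skewed", "symmetric"),
  skewness        = c(skewness(right_skewed), skewness(symmetric)),
  excess_kurtosis = c(excess_kurtosis(right_skewed),
                      excess_kurtosis(symmetric))
)
print(shape_summary)
```

A clearly positive skewness indicates a right-skewed column, and a positive excess kurtosis indicates heavier tails and a sharper peak than a normal distribution. On the real data, the same helpers could be applied column by column, for example with `sapply()` over the numeric columns.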