Problem Definition

Use a data set of mutual funds.
1. Analysis of mutual fund data for different schemes (summaries, plots, etc.)
2. Visualization of the data
3. T and chi-square tests on the data
4. Correlation between the dependent and independent variables
5. Find out which columns / features impact the 1-year return of a scheme
6. Predict the 1-year return with some dummy values.

Attributes:
The dataset covers different mutual fund schemes in India.
Dependent Variable
1 Year Return (column X1.Year.Return) - annual return of the scheme

Independent Variables
Investment Style
Market Cap
Turnover
Net Assets (Cr)
Standard Deviation
Sharpe Ratio
Sortino Ratio
Beta
Alpha
R-Squared
Expense Ratio
Tenure 1
Tenure 2
Tenure3

Setup

library(tidyr)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(corrgram)
library(gridExtra) 
## 
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
## 
##     combine
library(vcd)
## Loading required package: grid
library(psych)
## 
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
## 
##     %+%, alpha
library(car)
## 
## Attaching package: 'car'
## The following object is masked from 'package:psych':
## 
##     logit
## The following object is masked from 'package:dplyr':
## 
##     recode
library(corrplot)
library(coefplot)

Functions

detect_outliers <- function(inp, na.rm=TRUE) {
  i.qnt <- quantile(inp, probs=c(.25, .75), na.rm=na.rm)
  i.max <- 1.5 * IQR(inp, na.rm=na.rm)
  otp <- inp
  otp[inp < (i.qnt[1] - i.max)] <- NA
  otp[inp > (i.qnt[2] + i.max)] <- NA
  #inp <- count(inp[is.na(otp)])
  sum(is.na(otp))
}

Non_outliers <- function(x, na.rm = TRUE, ...) {
  qnt <- quantile(x, probs=c(.25, .75), na.rm = na.rm, ...)
  H <- 1.5 * IQR(x, na.rm = na.rm)
  y <- x
  y[x < (qnt[1] - H)] <- NA
  y[x > (qnt[2] + H)] <- NA
  y
}

Remove_Outliers <- function ( z, na.rm = TRUE){
 Out <- Non_outliers(z)
 Out <-as.data.frame (Out)
 z <- Out$Out[match(z, Out$Out)]
 z
}

Graph_Boxplot <- function (input, na.rm = TRUE){
Plot <- ggplot(dfrModel, aes(x="", y=input)) +
            geom_boxplot(aes(fill=input), color="green") +
            labs(title="Outliers")
Plot
}
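
As a quick check of the 1.5 * IQR rule used by the helper functions above, the following minimal sketch applies the same fences to a small made-up vector. The values are hypothetical and chosen only for illustration; they are not taken from the mutual fund data.

```r
# Hypothetical sample: eight typical values plus one extreme value
x <- c(10, 12, 11, 13, 12, 14, 11, 13, 40)

# Same rule as detect_outliers(): flag values lying more than
# 1.5 * IQR below the first quartile or above the third quartile
qnt   <- quantile(x, probs = c(.25, .75), na.rm = TRUE)
fence <- 1.5 * IQR(x, na.rm = TRUE)
is_outlier <- x < (qnt[1] - fence) | x > (qnt[2] + fence)

x[is_outlier]     # values flagged as outliers
sum(is_outlier)   # number of outliers
```

Note that detect_outliers() counts NA values after masking, so any NA already present in the input would also be counted; the logical-vector form above avoids that.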

Dataset

dfrModel <- read.csv("D:/Welingkar/Trim 6/Data/Regression_data.csv", header=T, stringsAsFactors=F)
intRowCount <- nrow(dfrModel)
head(dfrModel)
##   X1.Year.Return Investment.Style Market.Cap Turnover Net.Assets..Cr.
## 1          12.90                1   66337.65       62         5819.08
## 2          14.35                1   66337.65       62         5819.08
## 3          16.39                2   50546.68       24         1453.04
## 4          14.86                2   50546.68       24         1453.04
## 5          11.32                1   63907.70       49         8602.25
## 6          12.67                1   63907.70       49         8602.25
##   Standard.Deviation Sharpe.Ratio Sortino.Ratio Beta Alpha R.Squared
## 1              15.51         0.62          1.00 0.97  5.00      0.73
## 2              15.52         0.69          1.12 0.97  6.19      0.73
## 3              19.36         0.74          1.11 0.89  8.70      0.79
## 4              19.35         0.69          1.04 0.89  7.75      0.78
## 5              14.30         0.71          1.09 0.94  5.69      0.81
## 6              14.32         0.78          1.20 0.94  6.70      0.81
##   Expense.Ratio Tenure.1 Tenure.2 Tenure3
## 1          2.30      6.4      0.0       0
## 2          1.00      5.2      0.0       0
## 3          1.15      4.3      2.6       0
## 4          2.45      4.3      2.6       0
## 5          2.23      5.5      0.0       0
## 6          0.99      5.2      0.0       0

Observation 1. There are 164 records in the file (the value of intRowCount; the summary below reports n = 164 for every column).
All 15 columns are read as numeric, so no non-numeric data needs to be removed before the analysis.

Summary

lapply(dfrModel, FUN=describe)
## $X1.Year.Return
##    vars   n  mean   sd median trimmed  mad min   max range skew kurtosis
## X1    1 164 17.76 6.02  16.78   17.12 5.17 6.1 37.93 31.83 1.06     1.44
##      se
## X1 0.47
## 
## $Investment.Style
##    vars   n mean   sd median trimmed mad min max range skew kurtosis   se
## X1    1 164 1.29 0.48      1    1.22   0   1   3     2 1.26     0.36 0.04
## 
## $Market.Cap
##    vars   n     mean       sd   median  trimmed      mad     min    max
## X1    1 164 55229.49 40437.65 49254.58 51477.42 42578.32 2628.39 204103
##       range skew kurtosis      se
## X1 201474.6 0.89      0.8 3157.65
## 
## $Turnover
##    vars   n  mean    sd median trimmed   mad min max range skew kurtosis
## X1    1 164 68.88 58.26   51.5   58.64 33.36   2 369   367 2.17     5.82
##      se
## X1 4.55
## 
## $Net.Assets..Cr.
##    vars   n    mean      sd  median trimmed     mad   min      max   range
## X1    1 164 4086.98 5136.47 2265.55 2928.68 2562.82 26.34 21621.14 21594.8
##    skew kurtosis     se
## X1 1.85      2.6 401.09
## 
## $Standard.Deviation
##    vars   n  mean   sd median trimmed  mad   min   max range skew kurtosis
## X1    1 164 14.92 1.97  14.73   14.81 1.99 10.66 21.07 10.41 0.54     0.13
##      se
## X1 0.15
## 
## $Sharpe.Ratio
##    vars   n mean  sd median trimmed  mad  min  max range skew kurtosis
## X1    1 164 0.74 0.2   0.72    0.73 0.18 0.35 1.33  0.98 0.52     0.02
##      se
## X1 0.02
## 
## $Sortino.Ratio
##    vars   n mean   sd median trimmed  mad  min  max range skew kurtosis
## X1    1 164 1.07 0.26   1.04    1.06 0.27 0.57 1.82  1.25 0.38    -0.51
##      se
## X1 0.02
## 
## $Beta
##    vars   n mean  sd median trimmed  mad  min  max range  skew kurtosis
## X1    1 164 0.94 0.1   0.94    0.94 0.09 0.61 1.19  0.58 -0.12     0.72
##      se
## X1 0.01
## 
## $Alpha
##    vars   n mean   sd median trimmed  mad  min   max range skew kurtosis
## X1    1 164 6.74 3.65   6.47    6.43 3.39 0.52 19.09 18.57 0.78     0.58
##      se
## X1 0.29
## 
## $R.Squared
##    vars   n mean   sd median trimmed  mad  min  max range  skew kurtosis
## X1    1 164 0.77 0.12   0.76    0.78 0.14 0.39 0.98  0.59 -0.48    -0.04
##      se
## X1 0.01
## 
## $Expense.Ratio
##    vars   n mean  sd median trimmed  mad  min  max range  skew kurtosis
## X1    1 164 1.64 0.6   1.58    1.65 0.76 0.19 2.72  2.53 -0.05    -1.06
##      se
## X1 0.05
## 
## $Tenure.1
##    vars   n mean  sd median trimmed  mad min  max range skew kurtosis   se
## X1    1 164 4.57 2.7    5.2    4.33 2.08 0.2 14.2    14 0.83      1.1 0.21
## 
## $Tenure.2
##    vars   n mean   sd median trimmed mad min  max range skew kurtosis   se
## X1    1 164 0.81 1.72      0    0.36   0   0 10.8  10.8 2.89     9.73 0.13
## 
## $Tenure3
##    vars   n mean   sd median trimmed mad min max range skew kurtosis   se
## X1    1 164 0.16 0.56      0       0   0   0 3.2   3.2  3.3     9.78 0.04

Box Plot

lapply(dfrModel, FUN=Graph_Boxplot)
[Figures: one box plot per column, each titled "Outliers": X1.Year.Return, Investment.Style, Market.Cap, Turnover, Net.Assets..Cr., Standard.Deviation, Sharpe.Ratio, Sortino.Ratio, Beta, Alpha, R.Squared, Expense.Ratio, Tenure.1, Tenure.2, Tenure3]

Observation
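The box plots show a few outliers in the dataset. The strongly right-skewed variables in the summary above (Turnover, Net Assets, Tenure.2 and Tenure3, all with skew above 1.8) are the likeliest sources.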

Tables

Investment_Style <- table(dfrModel$Investment.Style)
Investment_Style
## 
##   1   2   3 
## 119  43   2
prop.table(Investment_Style)
## 
##          1          2          3 
## 0.72560976 0.26219512 0.01219512

Observations
The codes map to investment styles as follows:
1 implies Growth Investment Style
2 implies Blend Investment Style
3 implies Value Investment Style
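
The mapping above can be attached to the data as factor labels. The sketch below uses a short made-up vector of codes (hypothetical values, not the real column) to show the idea.

```r
# Hypothetical codes in the same 1/2/3 scheme as Investment.Style
style_code <- c(1, 1, 2, 2, 1, 3)

# Map the numeric codes to the labels listed in the observations
style <- factor(style_code,
                levels = c(1, 2, 3),
                labels = c("Growth", "Blend", "Value"))

table(style)              # counts per labelled style
prop.table(table(style))  # shares per labelled style
```

Labelled factors make tables and plots easier to read. Whether to also enter the style as a factor in the regression is a separate modelling choice; this report treats it as a numeric code.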

Scatter Plot

plot(y=dfrModel$X1.Year.Return, x=dfrModel$Net.Assets..Cr.,
     col="green",
     ylim=c(0, 50), xlim=c(0, 22000), 
     main="Relationship Btw Return & Net Assets",
     ylab="Return", xlab="Net Assets(Crs)")

# x = risk (standard deviation), y = return, so the axis labels match the data
scatterplot(dfrModel$Standard.Deviation, dfrModel$X1.Year.Return, main="Relationship Btw Risk & Return", xlab="Risk", ylab="Return")

# ylim extended to 40 so that the highest returns (max 37.93) are not cut off
plot((dfrModel$Investment.Style),jitter(dfrModel$X1.Year.Return),
     col="green",
     ylim=c(0, 40), xlim=c(1,3), 
     main="Relationship Btw Investment Style & Return",
     ylab="Return", xlab="Investment Style")

plot(y=dfrModel$X1.Year.Return, x=dfrModel$Sharpe.Ratio,
     col="blue",
     ylim=c(0, 40), xlim=c(0, 2), 
     main="Relationship Btw Sharpe Ratio and Return",
     ylab="Return", xlab="Sharpe Ratio")

# x = Sharpe ratio, y = return, so the axis labels match the data
scatterplot(dfrModel$Sharpe.Ratio, dfrModel$X1.Year.Return, main="Relationship Btw Sharpe Ratio & Return", xlab="Sharpe Ratio", ylab="Return")

plot(y=dfrModel$X1.Year.Return, x=dfrModel$Alpha,
     col="green",
     ylim=c(0, 40), xlim=c(0, 25), 
     main="Relationship Btw Alpha and Return",
     ylab="Return", xlab="Alpha")

scatterplot(dfrModel$Alpha, dfrModel$X1.Year.Return , main="Relationship Btw Alpha & Return", xlab="Alpha", ylab="Return")

Observations
1. The scatter plots above suggest a relationship between the 1-year return and several independent variables, most visibly Sharpe Ratio and Alpha.

Correlation Plot

#pairs(dfrModel)
corrplot(corr=cor(dfrModel[ , c(1,2,3,4,5)], use="complete.obs"), 
         method ="ellipse")

corrplot(corr=cor(dfrModel[ , c(1,6,7,8,9)], use="complete.obs"), 
         method ="ellipse")

# column 12 was listed twice; column 13 (Tenure.1) is used instead
corrplot(corr=cor(dfrModel[ , c(1,10,11,12,13)], use="complete.obs"), 
         method ="ellipse")

Observations
1. A few variables correlate moderately to strongly with the annual return of the schemes.
2. Alpha and Sharpe Ratio show the strongest positive correlation with the annual return; Standard Deviation is positively but more weakly correlated.

Correlation Matrix

cor(dfrModel[, c(1:13)]) 
##                    X1.Year.Return Investment.Style  Market.Cap    Turnover
## X1.Year.Return         1.00000000      -0.10456392 -0.47907689  0.04934270
## Investment.Style      -0.10456392       1.00000000 -0.30875673  0.06848923
## Market.Cap            -0.47907689      -0.30875673  1.00000000  0.14590358
## Turnover               0.04934270       0.06848923  0.14590358  1.00000000
## Net.Assets..Cr.       -0.33953020      -0.02665225  0.16388213 -0.05273135
## Standard.Deviation     0.34830314       0.23074846 -0.45929001 -0.01605976
## Sharpe.Ratio           0.62250124       0.06508839 -0.65138337 -0.20646989
## Sortino.Ratio          0.49454982       0.18243604 -0.59703821 -0.18234403
## Beta                   0.10313459       0.20756302 -0.13703157  0.02673520
## Alpha                  0.65548490       0.11697120 -0.70263481 -0.16780872
## R.Squared             -0.46001065      -0.16047660  0.76549672  0.18005088
## Expense.Ratio          0.00260838      -0.07250175 -0.05679234 -0.13159745
## Tenure.1              -0.12422889      -0.02376280 -0.08835882 -0.08106243
##                    Net.Assets..Cr. Standard.Deviation Sharpe.Ratio
## X1.Year.Return         -0.33953020         0.34830314   0.62250124
## Investment.Style       -0.02665225         0.23074846   0.06508839
## Market.Cap              0.16388213        -0.45929001  -0.65138337
## Turnover               -0.05273135        -0.01605976  -0.20646989
## Net.Assets..Cr.         1.00000000        -0.20631606  -0.08135486
## Standard.Deviation     -0.20631606         1.00000000   0.31568785
## Sharpe.Ratio           -0.08135486         0.31568785   1.00000000
## Sortino.Ratio          -0.03232574         0.32358763   0.93586004
## Beta                   -0.07889484         0.64339918  -0.08257479
## Alpha                  -0.12148716         0.54380060   0.95774922
## R.Squared               0.15267659        -0.40079552  -0.70905176
## Expense.Ratio          -0.02186945         0.11909018  -0.06343242
## Tenure.1                0.23141599        -0.01427949  -0.04102574
##                    Sortino.Ratio        Beta       Alpha   R.Squared
## X1.Year.Return        0.49454982  0.10313459  0.65548490 -0.46001065
## Investment.Style      0.18243604  0.20756302  0.11697120 -0.16047660
## Market.Cap           -0.59703821 -0.13703157 -0.70263481  0.76549672
## Turnover             -0.18234403  0.02673520 -0.16780872  0.18005088
## Net.Assets..Cr.      -0.03232574 -0.07889484 -0.12148716  0.15267659
## Standard.Deviation    0.32358763  0.64339918  0.54380060 -0.40079552
## Sharpe.Ratio          0.93586004 -0.08257479  0.95774922 -0.70905176
## Sortino.Ratio         1.00000000 -0.08174135  0.90493003 -0.67118231
## Beta                 -0.08174135  1.00000000  0.07931381  0.12212232
## Alpha                 0.90493003  0.07931381  1.00000000 -0.77474175
## R.Squared            -0.67118231  0.12212232 -0.77474175  1.00000000
## Expense.Ratio        -0.15805730  0.07806508 -0.01624840 -0.05942900
## Tenure.1             -0.03869664  0.08135126 -0.05055422 -0.02188434
##                    Expense.Ratio    Tenure.1
## X1.Year.Return        0.00260838 -0.12422889
## Investment.Style     -0.07250175 -0.02376280
## Market.Cap           -0.05679234 -0.08835882
## Turnover             -0.13159745 -0.08106243
## Net.Assets..Cr.      -0.02186945  0.23141599
## Standard.Deviation    0.11909018 -0.01427949
## Sharpe.Ratio         -0.06343242 -0.04102574
## Sortino.Ratio        -0.15805730 -0.03869664
## Beta                  0.07806508  0.08135126
## Alpha                -0.01624840 -0.05055422
## R.Squared            -0.05942900 -0.02188434
## Expense.Ratio         1.00000000  0.20736973
## Tenure.1              0.20736973  1.00000000

Correlation with 1-Year Return
Correlation

vctCorr = numeric(0)
for (i in names(dfrModel)){
cor.result <- cor(dfrModel$X1.Year.Return, as.numeric(dfrModel[,i]))
vctCorr <- c(vctCorr, cor.result)
}
dfrCorr <- vctCorr
names(dfrCorr) <- names(dfrModel)
dfrCorr
##     X1.Year.Return   Investment.Style         Market.Cap 
##         1.00000000        -0.10456392        -0.47907689 
##           Turnover    Net.Assets..Cr. Standard.Deviation 
##         0.04934270        -0.33953020         0.34830314 
##       Sharpe.Ratio      Sortino.Ratio               Beta 
##         0.62250124         0.49454982         0.10313459 
##              Alpha          R.Squared      Expense.Ratio 
##         0.65548490        -0.46001065         0.00260838 
##           Tenure.1           Tenure.2            Tenure3 
##        -0.12422889        -0.09932426        -0.13050913
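
The loop above can also be written without an explicit loop. The sketch below is one possible alternative, shown on a small made-up data frame (dfrDemo and its values are hypothetical stand-ins for dfrModel).

```r
# Small made-up data frame standing in for dfrModel (values are hypothetical)
dfrDemo <- data.frame(
  Return = c(12.9, 14.4, 16.4, 14.9, 11.3, 12.7),
  Alpha  = c(5.0, 6.2, 8.7, 7.8, 5.7, 6.7),
  Beta   = c(0.97, 0.97, 0.89, 0.89, 0.94, 0.94)
)

# Loop-free version: correlate every column with the dependent variable;
# sapply() keeps the column names on the result
vctCorrDemo <- sapply(dfrDemo, function(col) cor(dfrDemo$Return, col))

# Sort by absolute strength so the strongest relationships come first
vctCorrDemo[order(abs(vctCorrDemo), decreasing = TRUE)]
```

Sorting by absolute value puts strong negative correlations (such as Market.Cap in the real data) next to strong positive ones, which is convenient when shortlisting predictors.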

Visualize

dfrGraph <- gather(dfrModel, variable, value, -X1.Year.Return)
head(dfrGraph)
##   X1.Year.Return         variable value
## 1          12.90 Investment.Style     1
## 2          14.35 Investment.Style     1
## 3          16.39 Investment.Style     2
## 4          14.86 Investment.Style     2
## 5          11.32 Investment.Style     1
## 6          12.67 Investment.Style     1
ggplot(dfrGraph) +
geom_jitter(aes(value,X1.Year.Return, colour=variable)) + 
geom_smooth(aes(value,X1.Year.Return, colour=variable), method=lm, se=FALSE) +
facet_wrap(~variable, scales="free_x") +
labs(title="Relation Of Return With Other Features")

Regression Analysis
Find the Best Multiple Linear Model for the 1-Year Return
Choose the best linear model using step(), which selects a model by AIC in a stepwise algorithm.
In statistics, stepwise regression is a method of fitting regression models in which the choice of predictive variables is carried out by an automatic procedure. In each step, a variable is considered for addition to or subtraction from the set of explanatory variables based on some prespecified criterion.
The Akaike information criterion (AIC) is a measure of the relative quality of statistical models for a given set of data. Given a collection of models for the data, AIC estimates the quality of each model, relative to each of the other models. Hence, AIC provides a means for model selection.
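
For a fitted model with k estimated parameters and maximised log-likelihood logLik, AIC = 2k - 2 * logLik, and a lower value is better. The sketch below recomputes this by hand and then runs step() on R's built-in mtcars data, which is used purely as a stand-in for the mutual fund data; the model formulas are illustrative assumptions.

```r
# Built-in mtcars data used purely as a stand-in
fit <- lm(mpg ~ wt + hp, data = mtcars)

# AIC by hand: k counts the regression coefficients plus the residual variance
ll <- logLik(fit)
k  <- attr(ll, "df")
aic_manual <- 2 * k - 2 * as.numeric(ll)

aic_manual
AIC(fit)   # agrees with the manual value

# Backward stepwise selection: step() drops a term only if that lowers AIC
full    <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
reduced <- step(full, trace = 0)
formula(reduced)
```

step() uses extractAIC() internally, which differs from AIC() by a constant for models fitted to the same data, so both give the same ranking of candidate models.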

#?step()
stpModel=step(lm(data=dfrModel, X1.Year.Return~.), trace=0, steps=1000)
stpSummary <- summary(stpModel)
stpSummary 
## 
## Call:
## lm(formula = X1.Year.Return ~ Investment.Style + Market.Cap + 
##     Turnover + Net.Assets..Cr. + Standard.Deviation + Sortino.Ratio + 
##     Alpha + R.Squared + Tenure.2, data = dfrModel)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.2545  -2.0996  -0.0058   2.3818  14.2957 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         1.958e+01  5.260e+00   3.723 0.000276 ***
## Investment.Style   -1.836e+00  7.250e-01  -2.532 0.012352 *  
## Market.Cap         -2.435e-05  1.275e-05  -1.910 0.058016 .  
## Turnover            1.462e-02  5.310e-03   2.752 0.006629 ** 
## Net.Assets..Cr.    -2.837e-04  6.030e-05  -4.705 5.60e-06 ***
## Standard.Deviation -6.011e-01  2.214e-01  -2.714 0.007398 ** 
## Sortino.Ratio      -1.251e+01  3.364e+00  -3.719 0.000280 ***
## Alpha               2.206e+00  3.069e-01   7.188 2.66e-11 ***
## R.Squared           1.273e+01  4.774e+00   2.667 0.008465 ** 
## Tenure.2           -3.164e-01  1.750e-01  -1.808 0.072623 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.814 on 154 degrees of freedom
## Multiple R-squared:  0.6214, Adjusted R-squared:  0.5992 
## F-statistic: 28.08 on 9 and 154 DF,  p-value: < 2.2e-16

Model1

## ------------------------------------------------------------------------
Model1 <- X1.Year.Return ~ Investment.Style+Market.Cap+Turnover+Net.Assets..Cr.+Standard.Deviation+Sharpe.Ratio+Sortino.Ratio+Beta+Alpha+R.Squared+Expense.Ratio+Tenure.1+Tenure.2+Tenure3

fit1 <- lm(Model1, data = dfrModel)
summary(fit1)
## 
## Call:
## lm(formula = Model1, data = dfrModel)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.1177  -2.0616   0.0876   2.1490  14.2157 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         2.431e+01  8.704e+00   2.793 0.005904 ** 
## Investment.Style   -2.024e+00  7.491e-01  -2.702 0.007686 ** 
## Market.Cap         -2.395e-05  1.409e-05  -1.700 0.091303 .  
## Turnover            1.435e-02  5.612e-03   2.556 0.011590 *  
## Net.Assets..Cr.    -2.904e-04  6.392e-05  -4.543 1.14e-05 ***
## Standard.Deviation -1.030e+00  4.422e-01  -2.329 0.021190 *  
## Sharpe.Ratio       -9.364e+00  1.281e+01  -0.731 0.465914    
## Sortino.Ratio      -1.144e+01  3.861e+00  -2.962 0.003557 ** 
## Beta                5.323e+00  5.113e+00   1.041 0.299604    
## Alpha               2.731e+00  7.863e-01   3.473 0.000675 ***
## R.Squared           1.216e+01  6.204e+00   1.960 0.051852 .  
## Expense.Ratio      -1.872e-01  5.614e-01  -0.333 0.739295    
## Tenure.1           -8.054e-03  1.266e-01  -0.064 0.949359    
## Tenure.2           -3.490e-01  2.017e-01  -1.730 0.085704 .  
## Tenure3             2.965e-01  6.343e-01   0.467 0.640873    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.852 on 149 degrees of freedom
## Multiple R-squared:  0.6263, Adjusted R-squared:  0.5912 
## F-statistic: 17.84 on 14 and 149 DF,  p-value: < 2.2e-16

Model Fit

## ------------------------------------------------------------------------
library(leaps)
leap1 <- regsubsets(Model1, data = dfrModel, nbest=1)
# summary(leap1)
plot(leap1, scale="adjr2")

Observations
The selected model, which matches the stepwise result above, excludes Sharpe Ratio, Beta, Expense Ratio, Tenure.1 and Tenure3. Therefore, in our next model, we rerun the regression without these variables.
Model2

## ------------------------------------------------------------------------
Model2 <- X1.Year.Return ~ Investment.Style + Market.Cap + 
    Turnover + Net.Assets..Cr. + Standard.Deviation + Sortino.Ratio + 
    Alpha + R.Squared + Tenure.2
fit2 <- lm(Model2, data = dfrModel)
summary(fit2)
## 
## Call:
## lm(formula = Model2, data = dfrModel)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.2545  -2.0996  -0.0058   2.3818  14.2957 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         1.958e+01  5.260e+00   3.723 0.000276 ***
## Investment.Style   -1.836e+00  7.250e-01  -2.532 0.012352 *  
## Market.Cap         -2.435e-05  1.275e-05  -1.910 0.058016 .  
## Turnover            1.462e-02  5.310e-03   2.752 0.006629 ** 
## Net.Assets..Cr.    -2.837e-04  6.030e-05  -4.705 5.60e-06 ***
## Standard.Deviation -6.011e-01  2.214e-01  -2.714 0.007398 ** 
## Sortino.Ratio      -1.251e+01  3.364e+00  -3.719 0.000280 ***
## Alpha               2.206e+00  3.069e-01   7.188 2.66e-11 ***
## R.Squared           1.273e+01  4.774e+00   2.667 0.008465 ** 
## Tenure.2           -3.164e-01  1.750e-01  -1.808 0.072623 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.814 on 154 degrees of freedom
## Multiple R-squared:  0.6214, Adjusted R-squared:  0.5992 
## F-statistic: 28.08 on 9 and 154 DF,  p-value: < 2.2e-16

Observations of Regression Analysis
Null Hypothesis
There is no dependency between return by mutual fund and other variables

Alternative Hypothesis
There is dependency between return by mutual fund and other variables

The overall F-test of the regression model gives a p-value below 0.05 (p < 2.2e-16), so we reject the null hypothesis at the 5% significance level (95% confidence level).
The F value is also high (28.08 on 9 and 154 degrees of freedom), which indicates that at least one coefficient differs from zero, i.e. the predictors jointly help explain the return.
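
The p-value of the overall F-test is printed by summary() but not stored directly. A minimal sketch of how it can be recovered from the F statistic is shown below, again on the built-in mtcars data as a stand-in (the model is an illustrative assumption).

```r
# Stand-in model on built-in data
fit <- lm(mpg ~ wt + hp, data = mtcars)

# summary()$fstatistic is a named vector: value, numdf, dendf
fstat <- summary(fit)$fstatistic

# Upper-tail probability of the F distribution gives the overall p-value
p_value <- pf(fstat["value"], fstat["numdf"], fstat["dendf"],
              lower.tail = FALSE)
unname(p_value)

# Decision at the 5% significance level
unname(p_value) < 0.05
```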

The 9 variables below affect the return of a mutual fund scheme. They are listed in order of significance in Model 2 (smallest p-value first); Market.Cap and Tenure.2 are significant only at the 10% level.
Alpha
Net.Assets..Cr.
Sortino.Ratio
Turnover
Standard.Deviation
R.Squared
Investment.Style
Market.Cap
Tenure.2

VISUALIZE THE BETA COEFFICIENTS AND THEIR CONFIDENCE INTERVALS FROM MODEL 2

library(coefplot)
coefplot(fit2, intercept= FALSE, outerCI=1.96,coefficients=c("Investment.Style","Market.Cap", "Net.Assets..Cr.", "Standard.Deviation", "Sortino.Ratio", "Alpha", "R.Squared", "Tenure.2"))
## Warning: Ignoring unknown aesthetics: xmin, xmax

## ------------------------------------------------------------------------
# the Adjusted R Squared for Model 2 is greater than Model 1
summary(fit1)$adj.r.squared
## [1] 0.5911897
summary(fit2)$adj.r.squared
## [1] 0.5992417
# the AIC for Model 2 is less than Model 1
AIC(fit1)
## [1] 924.004
AIC(fit2)
## [1] 916.1546

Observations
1. The adjusted R-squared is higher for Model 2 (0.5992) than for Model 1 (0.5912), so Model 2 is better.
2. The AIC of Model 2 (916.15) is lower than that of Model 1 (924.00), so Model 2 is again preferred.
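
Prediction with dummy values, listed in the problem definition, can be done with predict() once a model has been chosen. The sketch below shows the pattern on the built-in mtcars data as a stand-in; the model and the dummy input values are hypothetical. With the real data, the same call would take fit2 and a data frame containing the nine Model 2 predictors.

```r
# Stand-in model on built-in data
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Dummy inputs: column names must match the predictors in the model formula
dfrNew <- data.frame(wt = c(2.5, 3.5), hp = c(100, 150))

# Point predictions with 95% prediction intervals (columns fit, lwr, upr)
pred <- predict(fit, newdata = dfrNew, interval = "prediction")
pred
```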

Summary

  1. The data was loaded successfully.
  2. The data was summarized to obtain the descriptive statistics of each variable.
  3. Every variable was plotted on a box plot to identify its outliers.
  4. Scatter plots and correlation plots show the relationship between the 1-year return and the other variables.
  5. Continuous variables are shown on box plots, while tables are used for discrete variables.
  6. For Regression Analysis,
    Dependent Variable: Return by Mutual Fund Schemes

The 9 variables below affect the return of mutual funds. They are listed in order of significance in Model 2 (smallest p-value first):
Alpha
Net.Assets..Cr.
Sortino.Ratio
Turnover
Standard.Deviation
R.Squared
Investment.Style
Market.Cap
Tenure.2

###########End of the Project#########