library(readxl)
Regression_Analysis <- read_excel("C:/Users/Hxtreme/Desktop/Regression_BATCH2.xlsx")
THE DATA WAS IMPORTED WITH THE “IMPORT DATASET” BUTTON IN THE RSTUDIO ENVIRONMENT PANE. THE CODE ABOVE WAS THEN COPIED FROM THE CODE PREVIEW BOX OF THE IMPORT DIALOG.
View(Regression_Analysis)
THIS FUNCTION OPENS THE IMPORTED DATA IN A SPREADSHEET-STYLE VIEWER.
summary(Regression_Analysis)
    Student      School_Ranking       GPA          Experience
 Min.   : 1.00   Min.   :15.00   Min.   :2.760   Min.   :2.000
 1st Qu.:10.75   1st Qu.:45.75   1st Qu.:3.033   1st Qu.:5.000
 Median :20.50   Median :67.00   Median :3.155   Median :6.000
 Mean   :20.50   Mean   :59.88   Mean   :3.233   Mean   :5.975
 3rd Qu.:30.25   3rd Qu.:76.50   3rd Qu.:3.350   3rd Qu.:7.000
 Max.   :40.00   Max.   :89.00   Max.   :3.850   Max.   :9.000
     Salary
 Min.   :71040
 1st Qu.:76913
 Median :78670
 Mean   :78721
 3rd Qu.:80600
 Max.   :87000
THIS FUNCTION SHOWS THE MINIMUM, QUARTILES, MEDIAN, MEAN AND MAXIMUM OF EVERY COLUMN IN THE TABLE.
str(Regression_Analysis)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 40 obs. of 5 variables:
 $ Student       : num  1 2 3 4 5 6 7 8 9 10 ...
 $ School_Ranking: num  78 56 23 67 56 78 68 89 37 67 ...
 $ GPA           : num  2.92 3.84 3.04 3.2 3.61 2.99 3.78 3.2 3.42 3.05 ...
 $ Experience    : num  3 9 6 6 7 5 8 5 7 5 ...
 $ Salary        : num  73590 87000 76970 79320 79530 ...
THIS FUNCTION SHOWS THE STRUCTURE OF THE TABLE: 40 OBSERVATIONS OF 5 NUMERIC VARIABLES.
THIS FUNCTION PLOTS THE ENTIRE DATA SET IN A SINGLE GRAPH.
THIS FUNCTION DRAWS A SCATTER PLOT OF THE FIRST TWO VARIABLES IN THE DATA.
THE “par” FUNCTION SETS GRAPHICAL PARAMETERS, HERE THE MULTI-FRAME LAYOUT THAT PLACES SEVERAL PLOTS ON ONE PAGE.
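THE PLOTTING CALLS THEMSELVES ARE NOT SHOWN IN THIS LOG, SO THE FOLLOWING IS ONLY A MINIMAL SKETCH OF WHAT THE THREE STEPS MAY LOOK LIKE. THE USE OF plot(), hist() AND par(mfrow = ...) IS AN ASSUMPTION, AND demo_data IS A HYPOTHETICAL STAND-IN BUILT FROM THE FIRST TEN ROWS PRINTED BY str().

```r
# Stand-in data: first ten rows of four columns, copied from the str() output.
# Salary is left out because str() prints only its first five values.
demo_data <- data.frame(
  Student        = 1:10,
  School_Ranking = c(78, 56, 23, 67, 56, 78, 68, 89, 37, 67),
  GPA            = c(2.92, 3.84, 3.04, 3.20, 3.61, 2.99, 3.78, 3.20, 3.42, 3.05),
  Experience     = c(3, 9, 6, 6, 7, 5, 8, 5, 7, 5)
)

# Plotting a whole data frame draws a scatter plot matrix of every column pair.
plot(demo_data)

# Scatter plot of the first two variables only.
plot(demo_data$Student, demo_data$School_Ranking,
     xlab = "Student", ylab = "School_Ranking")

# par(mfrow = c(rows, cols)) splits the graphics device into a grid of panels.
old_par <- par(mfrow = c(1, 2))
plot(demo_data$GPA, demo_data$Experience, xlab = "GPA", ylab = "Experience")
hist(demo_data$GPA, main = "GPA", xlab = "GPA")
par(old_par)  # restore the previous layout
```

SAVING THE VALUE RETURNED BY par() AND RESTORING IT AFTERWARDS KEEPS LATER PLOTS FROM INHERITING THE TWO-PANEL LAYOUT.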
cc
                   Student School_Ranking        GPA Experience      Salary
Student         1.00000000      0.0582101 -0.1262402 0.04395596 -0.00261919
School_Ranking  0.05821010      1.0000000  0.2051312 0.20250931  0.23429048
GPA            -0.12624017      0.2051312  1.0000000 0.65904413  0.73788910
Experience      0.04395596      0.2025093  0.6590441 1.00000000  0.78580114
Salary         -0.00261919      0.2342905  0.7378891 0.78580114  1.00000000
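THIS FUNCTION COMPUTES THE PAIRWISE CORRELATIONS BETWEEN ALL VARIABLES IN THE DATA. SALARY IS MOST STRONGLY CORRELATED WITH EXPERIENCE (0.79) AND GPA (0.74), AND ONLY WEAKLY WITH SCHOOL_RANKING (0.23).
THE “library” FUNCTION LOADS THE CORRPLOT PACKAGE, WHICH DRAWS A GRAPH OF THE CORRELATION MATRIX DERIVED ABOVE.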
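THE CALLS THAT PRODUCED cc AND THE CORRPLOT GRAPH ARE NOT SHOWN IN THIS LOG. A MINIMAL SKETCH, ASSUMING cor() AND corrplot::corrplot() WERE USED, IS GIVEN BELOW. demo_data AND cc_demo ARE HYPOTHETICAL NAMES, AND BECAUSE ONLY THE FIRST TEN ROWS OF FOUR COLUMNS ARE USED, THE VALUES WILL NOT MATCH THE MATRIX ABOVE.

```r
# Stand-in data: first ten rows of four columns, copied from the str() output.
demo_data <- data.frame(
  Student        = 1:10,
  School_Ranking = c(78, 56, 23, 67, 56, 78, 68, 89, 37, 67),
  GPA            = c(2.92, 3.84, 3.04, 3.20, 3.61, 2.99, 3.78, 3.20, 3.42, 3.05),
  Experience     = c(3, 9, 6, 6, 7, 5, 8, 5, 7, 5)
)

# Pearson correlation between every pair of columns.
cc_demo <- cor(demo_data)
print(round(cc_demo, 3))

# corrplot is an add-on package, so the graph is drawn only if it is installed.
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(cc_demo, method = "circle")
}
```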
linearmodel
Call:
lm(formula = Salary ~ GPA + Experience + School_Ranking + Student,
data = Regression_Analysis)
Coefficients:
   (Intercept)             GPA      Experience  School_Ranking
     52751.870        5534.006        1232.513           9.442
       Student
         7.064
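THE “lm” FUNCTION TAKES THE REGRESSION FORMULA AND FITS A MULTIPLE LINEAR REGRESSION OF SALARY ON GPA, EXPERIENCE, SCHOOL_RANKING AND STUDENT. PRINTING THE MODEL SHOWS THE CALL AND THE ESTIMATED COEFFICIENTS.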
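AS A MINIMAL SKETCH OF HOW SUCH A MODEL IS FITTED, THE CODE BELOW RUNS lm() ON A HYPOTHETICAL STAND-IN, demo_data, BUILT FROM THE FIRST FIVE ROWS PRINTED BY str(). ONLY TWO PREDICTORS ARE USED SO THAT FIVE ROWS LEAVE SOME RESIDUAL DEGREES OF FREEDOM, AND THE COEFFICIENTS WILL DIFFER FROM THOSE OF THE FULL 40-ROW DATA.

```r
# Stand-in data: the first five rows, the only ones with a visible Salary.
demo_data <- data.frame(
  GPA        = c(2.92, 3.84, 3.04, 3.20, 3.61),
  Experience = c(3, 9, 6, 6, 7),
  Salary     = c(73590, 87000, 76970, 79320, 79530)
)

# Formula syntax: response on the left of ~, predictors joined by + on the right.
demo_model <- lm(Salary ~ GPA + Experience, data = demo_data)

# Printing the fitted object shows the call and the estimated coefficients.
print(demo_model)

# coef() returns the same estimates as a named numeric vector.
print(coef(demo_model))
```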
summary(linearmodel)
Call:
lm(formula = Salary ~ GPA + Experience + School_Ranking + Student,
data = Regression_Analysis)
Residuals:
    Min      1Q  Median      3Q     Max
-6359.4  -736.0   306.7  1392.8  4440.1
Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)    52751.870   4928.281  10.704 1.39e-12 ***
GPA             5534.006   1785.345   3.100 0.003811 **
Experience      1232.513    294.651   4.183 0.000183 ***
School_Ranking     9.442     18.450   0.512 0.612031
Student            7.064     31.929   0.221 0.826188
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2273 on 35 degrees of freedom
Multiple R-squared: 0.7058, Adjusted R-squared: 0.6722
F-statistic: 20.99 on 4 and 35 DF, p-value: 6.703e-09
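THIS FUNCTION GIVES THE FULL REGRESSION OUTPUT FOR THE FITTED MODEL: RESIDUAL QUANTILES, COEFFICIENT ESTIMATES WITH THEIR STANDARD ERRORS, T VALUES AND P-VALUES, R-SQUARED AND THE OVERALL F-TEST. THE MODEL EXPLAINS ABOUT 71% OF THE VARIANCE IN SALARY (R-SQUARED = 0.7058).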
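TO SHOW HOW THE COLUMNS OF THE SUMMARY RELATE TO EACH OTHER, THE SKETCH BELOW RECOMPUTES THE T VALUE AND P-VALUE OF THE GPA ROW AND THE ADJUSTED R-SQUARED FROM THE ROUNDED FIGURES PRINTED ABOVE. THE VARIABLE NAMES ARE HYPOTHETICAL, AND SMALL ROUNDING DIFFERENCES FROM THE PRINTED OUTPUT ARE POSSIBLE.

```r
# Figures copied from the GPA row of the summary output.
estimate    <- 5534.006
std_error   <- 1785.345
residual_df <- 35

# t value = estimate divided by its standard error.
t_value <- estimate / std_error

# Two-sided p-value from the t distribution with the residual degrees of freedom.
p_value <- 2 * pt(-abs(t_value), df = residual_df)

# Adjusted R-squared penalises R-squared for the number of predictors k.
r_squared <- 0.7058
n <- 40  # observations
k <- 4   # predictors, excluding the intercept
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(round(t_value, 3))
print(signif(p_value, 4))
print(round(adj_r_squared, 4))
```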
linearmodel
Call:
lm(formula = Salary ~ GPA + Experience, data = Regression_Analysis)
Coefficients:
(Intercept)          GPA   Experience
      53295         5540         1257
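BASED ON THE P-VALUES FROM THE REGRESSION SUMMARY, ONLY GPA (P = 0.0038) AND EXPERIENCE (P = 0.0002) ARE SIGNIFICANT PREDICTORS OF THE DEPENDENT VARIABLE SALARY. SCHOOL_RANKING (P = 0.61) AND STUDENT (P = 0.83) ARE NOT, SO THE MODEL ABOVE WAS REFITTED WITH GPA AND EXPERIENCE ONLY.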
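THE SELECTION STEP CAN BE SKETCHED AS FOLLOWS, USING THE P-VALUES COPIED FROM THE SUMMARY OUTPUT. THE 0.05 CUT-OFF IS AN ASSUMPTION (THE LOG DOES NOT STATE WHICH THRESHOLD WAS USED), AND coef_table, keep AND reduced_formula ARE HYPOTHETICAL NAMES.

```r
# p-values of the four predictors, copied from the summary output.
coef_table <- data.frame(
  term    = c("GPA", "Experience", "School_Ranking", "Student"),
  p_value = c(0.003811, 0.000183, 0.612031, 0.826188),
  stringsAsFactors = FALSE
)

# Keep only the predictors below the assumed 5% significance level.
alpha <- 0.05
keep <- coef_table$term[coef_table$p_value < alpha]
print(keep)

# Build the formula for the reduced model from the retained terms.
reduced_formula <- reformulate(keep, response = "Salary")
print(reduced_formula)
```

THE RESULTING FORMULA CAN THEN BE PASSED TO lm() TOGETHER WITH THE DATA TO REFIT THE MODEL.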
my_predict_result
       1
87280.72
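WITH THE FITTED REGRESSION EQUATION, SALARY = 53295 + 5540 * GPA + 1257 * EXPERIENCE, WE CAN PREDICT THE SALARY OF A STUDENT WITH EXPERIENCE = 5 AND GPA = 5, WHICH GIVES ABOUT 87,281. GPA = 5 LIES OUTSIDE THE OBSERVED RANGE (2.76 TO 3.85), SO THIS PREDICTION IS AN EXTRAPOLATION.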
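THE PREDICTION CAN BE CHECKED BY HAND WITH THE ROUNDED COEFFICIENTS PRINTED FOR THE REDUCED MODEL. THIS IS ONLY A SKETCH WITH HYPOTHETICAL VARIABLE NAMES. THE RESULT IS 87280 RATHER THAN 87280.72 BECAUSE THE PRINTED COEFFICIENTS ARE ROUNDED.

```r
# Rounded coefficients copied from the printed reduced model.
coefs <- c(intercept = 53295, GPA = 5540, Experience = 1257)

# The new student to predict for.
new_student <- c(GPA = 5, Experience = 5)

# Regression equation: intercept + sum(coefficient * predictor value).
predicted <- coefs[["intercept"]] + sum(coefs[names(new_student)] * new_student)
print(predicted)  # 87280
```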