Introduction
Logistic regression is a regression model in which the response (dependent) variable is categorical, for example True/False or 0/1. It models the probability of a binary response as a function of the predictor variables, using the logistic function to map a linear combination of the predictors to a value between 0 and 1.
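
As a minimal sketch of the relationship described above, the model passes a linear combination of the predictors through the logistic function to obtain a probability. The coefficients b0 and b1 and the predictor values below are made up for illustration; they are not estimates from this dataset.

```r
# Logistic (inverse-logit) function: maps any real number into (0, 1)
logistic <- function(z) 1 / (1 + exp(-z))

# Hypothetical coefficients and predictor values, for illustration only
b0 <- -4
b1 <- 0.03
x  <- c(80, 133, 200)

# Predicted probability of the "1" class for each x
p <- logistic(b0 + b1 * x)
print(round(p, 3))

# Base R provides the same function as plogis()
print(all.equal(p, plogis(b0 + b1 * x)))
```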

Problem Definition
The objective is to predict, based on diagnostic measurements, whether a patient has diabetes.

Dataset
This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.

Data Description
Attributes:
[, 1] Pregnancies: Number of times pregnant
[, 2] Glucose: Plasma glucose concentration at 2 hours in an oral glucose tolerance test
[, 3] BloodPressure: Diastolic blood pressure (mm Hg)
[, 4] SkinThickness: Triceps skin fold thickness (mm)
[, 5] Insulin: 2-hour serum insulin (mu U/ml)
[, 6] BMI: Body mass index (weight in kg/(height in m)^2)
[, 7] DiabetesPedigreeFunction: Diabetes pedigree function
[, 8] Age: Age (years)
[, 9] Outcome: Class variable (0 or 1), 0 = Non diabetic and 1 = Diabetic

Setup

library(tidyr)
## Warning: package 'tidyr' was built under R version 3.3.3
library(dplyr)
## Warning: package 'dplyr' was built under R version 3.3.3
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.3.3
library(corrgram)
## Warning: package 'corrgram' was built under R version 3.3.3
library(gridExtra) 
## Warning: package 'gridExtra' was built under R version 3.3.3
## 
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
## 
##     combine
library(Deducer)
## Warning: package 'Deducer' was built under R version 3.3.3
## Loading required package: JGR
## Warning: package 'JGR' was built under R version 3.3.3
## Loading required package: rJava
## Loading required package: JavaGD
## Loading required package: iplots
## Warning: package 'iplots' was built under R version 3.3.3
## 
## Please type JGR() to launch console. Platform specific launchers (.exe and .app) can also be obtained at http://www.rforge.net/JGR/files/.
## Loading required package: car
## Warning: package 'car' was built under R version 3.3.3
## 
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
## 
##     recode
## Loading required package: MASS
## 
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
## 
##     select
## 
## 
## Note Non-JGR console detected:
##  Deducer is best used from within JGR (http://jgr.markushelbig.org/).
##  To Bring up GUI dialogs, type deducer().
library(caret)
## Warning: package 'caret' was built under R version 3.3.3
## Loading required package: lattice
library(pscl)
## Warning: package 'pscl' was built under R version 3.3.3
## Classes and Methods for R developed in the
## Political Science Computational Laboratory
## Department of Political Science
## Stanford University
## Simon Jackman
## hurdle and zeroinfl functions by Achim Zeileis

Functions
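
The helper functions used later in this analysis (detect_na, detect_outliers, display_Outliers) are not shown in the source. The definitions below are a hypothetical sketch, assuming that detect_na counts NA values, that detect_outliers returns the values lying beyond a multiple of the interquartile range, and that display_Outliers draws a boxplot. The multiplier coef = 2.5 is an assumption, not a value confirmed by the source.

```r
# Hypothetical helper definitions; the originals are not shown in the source.

# Count the NA values in a vector
detect_na <- function(x) {
  sum(is.na(x))
}

# Return the values lying more than coef * IQR beyond the hinges.
# coef = 2.5 is an assumed multiplier.
detect_outliers <- function(x, coef = 2.5) {
  boxplot.stats(x, coef = coef)$out
}

# Draw a horizontal boxplot whose whiskers use the same assumed multiplier
display_Outliers <- function(x, coef = 2.5) {
  boxplot(x, range = coef, horizontal = TRUE)
  invisible(NULL)
}

# Small usage example on made-up data
print(detect_na(c(1, NA, 3)))
print(detect_outliers(c(1, 2, 3, 4, 5, 100)))
```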

Dataset

setwd("D:\\PGDM\\Trim 4\\MachineLearning")
dfrModel <- read.csv("./Data/Diabetes_train1.csv", header=T, stringsAsFactors=F)
head(dfrModel)
##   Pregnancies Glucose BloodPressure SkinThickness Insulin  BMI
## 1           6     148            72            35       0 33.6
## 2           1      85            66            29       0 26.6
## 3           8     183            64             0       0 23.3
## 4           1      89            66            23      94 28.1
## 5           0     137            40            35     168 43.1
## 6           5     116            74             0       0 25.6
##   DiabetesPedigreeFunction Age Outcome
## 1                    0.627  50       1
## 2                    0.351  31       0
## 3                    0.672  32       1
## 4                    0.167  21       0
## 5                    2.288  33       1
## 6                    0.201  30       0

Datatypes

str(dfrModel)
## 'data.frame':    700 obs. of  9 variables:
##  $ Pregnancies             : int  6 1 8 1 0 5 3 10 2 8 ...
##  $ Glucose                 : int  148 85 183 89 137 116 78 115 197 125 ...
##  $ BloodPressure           : int  72 66 64 66 40 74 50 0 70 96 ...
##  $ SkinThickness           : int  35 29 0 23 35 0 32 0 45 0 ...
##  $ Insulin                 : int  0 0 0 94 168 0 88 0 543 0 ...
##  $ BMI                     : num  33.6 26.6 23.3 28.1 43.1 25.6 31 35.3 30.5 0 ...
##  $ DiabetesPedigreeFunction: num  0.627 0.351 0.672 0.167 2.288 ...
##  $ Age                     : int  50 31 32 21 33 30 26 29 53 54 ...
##  $ Outcome                 : int  1 0 1 0 1 0 1 0 1 1 ...

Observation
The dataset consists of integer and numeric columns only.

Check for Missing Data

lapply(dfrModel, FUN=detect_na)
## $Pregnancies
## [1] 0
## 
## $Glucose
## [1] 0
## 
## $BloodPressure
## [1] 0
## 
## $SkinThickness
## [1] 0
## 
## $Insulin
## [1] 0
## 
## $BMI
## [1] 0
## 
## $DiabetesPedigreeFunction
## [1] 0
## 
## $Age
## [1] 0
## 
## $Outcome
## [1] 0

Observation
The dataset has no NA values in any column.
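
An NA check does not catch placeholder values. The value counts later in this analysis show zeros in columns such as Glucose, BloodPressure and BMI, which are not physiologically plausible and may stand in for missing measurements; that reading is an assumption, not something the source states. A minimal sketch of counting zeros per column, using a small made-up data frame (dfrToy) in place of dfrModel:

```r
# Made-up rows standing in for dfrModel
dfrToy <- data.frame(
  Glucose       = c(148, 0, 183, 89),
  BloodPressure = c(72, 66, 0, 0),
  BMI           = c(33.6, 26.6, 23.3, 0)
)

# Count exact zeros in each column
zeroCounts <- sapply(dfrToy, function(x) sum(x == 0))
print(zeroCounts)
```

The same sapply() call applied to dfrModel would report the zero count for every feature in one step.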

Summarizing data

summarise(group_by(dfrModel, Pregnancies), n())
## # A tibble: 17 × 2
##    Pregnancies `n()`
##          <int> <int>
## 1            0   106
## 2            1   120
## 3            2    91
## 4            3    68
## 5            4    63
## 6            5    53
## 7            6    46
## 8            7    43
## 9            8    35
## 10           9    24
## 11          10    20
## 12          11    10
## 13          12     8
## 14          13     9
## 15          14     2
## 16          15     1
## 17          17     1
summarise(group_by(dfrModel, Glucose), n())
## # A tibble: 133 × 2
##    Glucose `n()`
##      <int> <int>
## 1        0     5
## 2       44     1
## 3       56     1
## 4       57     2
## 5       61     1
## 6       62     1
## 7       67     1
## 8       68     3
## 9       71     4
## 10      72     1
## # ... with 123 more rows
summarise(group_by(dfrModel, BloodPressure), n())
## # A tibble: 47 × 2
##    BloodPressure `n()`
##            <int> <int>
## 1              0    33
## 2             24     1
## 3             30     2
## 4             38     1
## 5             40     1
## 6             44     3
## 7             46     1
## 8             48     5
## 9             50    12
## 10            52    10
## # ... with 37 more rows
summarise(group_by(dfrModel, SkinThickness), n())
## # A tibble: 51 × 2
##    SkinThickness `n()`
##            <int> <int>
## 1              0   209
## 2              7     2
## 3              8     2
## 4             10     5
## 5             11     6
## 6             12     7
## 7             13    10
## 8             14     6
## 9             15    14
## 10            16     6
## # ... with 41 more rows
summarise(group_by(dfrModel, Insulin), n())
## # A tibble: 176 × 2
##    Insulin `n()`
##      <int> <int>
## 1        0   338
## 2       14     1
## 3       15     1
## 4       18     2
## 5       23     2
## 6       25     1
## 7       29     1
## 8       32     1
## 9       36     3
## 10      37     2
## # ... with 166 more rows
summarise(group_by(dfrModel, BMI), n())
## # A tibble: 245 × 2
##      BMI `n()`
##    <dbl> <int>
## 1    0.0    10
## 2   18.2     3
## 3   18.4     1
## 4   19.1     1
## 5   19.3     1
## 6   19.4     1
## 7   19.5     2
## 8   19.6     3
## 9   19.9     1
## 10  20.0     1
## # ... with 235 more rows
summarise(group_by(dfrModel, DiabetesPedigreeFunction), n())
## # A tibble: 487 × 2
##    DiabetesPedigreeFunction `n()`
##                       <dbl> <int>
## 1                     0.078     1
## 2                     0.084     1
## 3                     0.085     2
## 4                     0.088     2
## 5                     0.089     1
## 6                     0.092     1
## 7                     0.096     1
## 8                     0.100     1
## 9                     0.101     1
## 10                    0.102     1
## # ... with 477 more rows
summarise(group_by(dfrModel, Age), n())
## # A tibble: 52 × 2
##      Age `n()`
##    <int> <int>
## 1     21    59
## 2     22    63
## 3     23    36
## 4     24    43
## 5     25    46
## 6     26    29
## 7     27    29
## 8     28    32
## 9     29    29
## 10    30    19
## # ... with 42 more rows
summarise(group_by(dfrModel, Outcome), n())
## # A tibble: 2 × 2
##   Outcome `n()`
##     <int> <int>
## 1       0   459
## 2       1   241
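
The Outcome counts above give the class balance directly. As a small worked example using those two counts, the share of each class can be computed as follows; the larger share is the accuracy a model would reach by always predicting the majority class.

```r
# Outcome counts copied from the summary above
outcomeCounts <- c(`0` = 459, `1` = 241)

# Proportion of each class in the training data
classShare <- outcomeCounts / sum(outcomeCounts)
print(round(classShare, 4))
```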

Exploratory Analysis

lapply(dfrModel, FUN=summary)
## $Pregnancies
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   1.000   3.000   3.827   6.000  17.000 
## 
## $Glucose
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0    99.0   116.5   120.5   140.2   199.0 
## 
## $BloodPressure
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00   63.50   72.00   68.88   80.00  122.00 
## 
## $SkinThickness
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    0.00   23.00   20.38   32.00   99.00 
## 
## $Insulin
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    0.00   36.50   79.88  126.50  846.00 
## 
## $BMI
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00   27.00   32.00   31.89   36.50   67.10 
## 
## $DiabetesPedigreeFunction
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0780  0.2400  0.3755  0.4760  0.6370  2.4200 
## 
## $Age
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   21.00   24.00   29.00   33.12   40.00   81.00 
## 
## $Outcome
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.0000  0.0000  0.3443  1.0000  1.0000

Histogram to check data distribution

hist(dfrModel$Pregnancies)

hist(dfrModel$Glucose)

hist(dfrModel$Age)

hist(dfrModel$BMI)

hist(dfrModel$Insulin)

Outliers Data

#detect_outliers(dfrModel$Age)
lapply(dfrModel, FUN=detect_outliers)
## $Pregnancies
## integer(0)
## 
## $Glucose
## integer(0)
## 
## $BloodPressure
##  [1]   0   0   0   0   0   0 122   0   0   0   0   0   0   0   0   0   0
## [18]   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 
## $SkinThickness
## integer(0)
## 
## $Insulin
##  [1] 543 846 495 485 495 478 744 680 545 465 579 474 480 600 540 480
## 
## $BMI
##  [1]  0.0  0.0  0.0  0.0  0.0 67.1  0.0  0.0  0.0  0.0  0.0
## 
## $DiabetesPedigreeFunction
## [1] 2.288 1.893 1.781 2.329 2.137 1.731 2.420 1.699 1.698
## 
## $Age
## [1] 81
## 
## $Outcome
## integer(0)

Display Outliers

lapply(dfrModel[1:8],FUN=display_Outliers)
## $Pregnancies

## 
## $Glucose

## 
## $BloodPressure

## 
## $SkinThickness

## 
## $Insulin

## 
## $BMI

## 
## $DiabetesPedigreeFunction

## 
## $Age

Observation
Outliers are present in a few features, but their count is low.
For this model we keep the outliers in the data.

Correlation

vctCorr = numeric(0)
for (i in names(dfrModel)){
    cor.result <- cor(as.numeric(dfrModel$Outcome), as.numeric(dfrModel[,i]))
    vctCorr <- c(vctCorr, cor.result)
}
dfrCorr <- vctCorr
names(dfrCorr) <- names(dfrModel)
dfrCorr
##              Pregnancies                  Glucose            BloodPressure 
##               0.22774403               0.45928020               0.06019258 
##            SkinThickness                  Insulin                      BMI 
##               0.08740524               0.14592233               0.30659734 
## DiabetesPedigreeFunction                      Age                  Outcome 
##               0.17053194               0.22699018               1.00000000
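
The loop above can also be written as a single call. A minimal sketch, assuming all columns are numeric; dfrToy is a stand-in built from three columns of the first six rows printed earlier, so its correlations will differ from those of the full data.

```r
# Three columns from the first six rows of the training data
dfrToy <- data.frame(
  Glucose = c(148, 85, 183, 89, 137, 116),
  BMI     = c(33.6, 26.6, 23.3, 28.1, 43.1, 25.6),
  Outcome = c(1, 0, 1, 0, 1, 0)
)

# Correlation of every column with Outcome, returned as a named vector
corWithOutcome <- sapply(dfrToy, function(x) cor(dfrToy$Outcome, x))
print(round(corWithOutcome, 3))
```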

Data For Visualization

dfrGraph <- gather(dfrModel, variable, value, -Outcome)
head(dfrGraph)
##   Outcome    variable value
## 1       1 Pregnancies     6
## 2       0 Pregnancies     1
## 3       1 Pregnancies     8
## 4       0 Pregnancies     1
## 5       1 Pregnancies     0
## 6       0 Pregnancies     5
ggplot(dfrGraph) +        #ggplot works better with factors
    geom_jitter(aes(value,Outcome, colour=variable)) + 
    geom_smooth(aes(value,Outcome, colour=variable), method=lm, se=FALSE) +
    facet_wrap(~variable, scales="free_x") +
    labs(title="Relation Of diabetes With Other Features")

Find Best Multi Logistic Model
Choose the best logistic model with step(), which starts from the full model and performs stepwise selection by AIC.

stpModel=step(glm(data=dfrModel, formula=Outcome~., family=binomial), trace=0, steps=100)
summary(stpModel)
## 
## Call:
## glm(formula = Outcome ~ Pregnancies + Glucose + BloodPressure + 
##     BMI + DiabetesPedigreeFunction, family = binomial, data = dfrModel)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.7459  -0.7335  -0.4097   0.7153   2.8945  
## 
## Coefficients:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              -8.009215   0.713218 -11.230  < 2e-16 ***
## Pregnancies               0.157336   0.029270   5.375 7.64e-08 ***
## Glucose                   0.033444   0.003517   9.509  < 2e-16 ***
## BloodPressure            -0.012432   0.005246  -2.370  0.01781 *  
## BMI                       0.091142   0.014970   6.088 1.14e-09 ***
## DiabetesPedigreeFunction  0.885935   0.304143   2.913  0.00358 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 901.37  on 699  degrees of freedom
## Residual deviance: 659.26  on 694  degrees of freedom
## AIC: 671.26
## 
## Number of Fisher Scoring iterations: 5

Observation
Best results are given by Outcome ~ Pregnancies + Glucose + BloodPressure + BMI + DiabetesPedigreeFunction.
p-values for all retained features are less than 0.05.
The difference between the null deviance (901.37) and the residual deviance (659.26) is quite large, which indicates that the predictors improve substantially on the intercept-only model.
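
The figures in the summary above can be checked by hand. As a worked example, for a 0/1 response the AIC equals the residual deviance plus twice the number of estimated coefficients, and the drop from null to residual deviance can be compared with a chi-squared distribution (a likelihood-ratio test). The test is added here for illustration; the source only remarks on the size of the difference.

```r
# Values copied from the summary output above
nullDev  <- 901.37
nullDf   <- 699
residDev <- 659.26
residDf  <- 694
nCoef    <- 6        # intercept + 5 predictors

# AIC = residual deviance + 2 * number of coefficients (0/1 response)
aicByHand <- residDev + 2 * nCoef
print(aicByHand)

# Likelihood-ratio test against the intercept-only model
lrStat <- nullDev - residDev
lrDf   <- nullDf - residDf
pValue <- pchisq(lrStat, df = lrDf, lower.tail = FALSE)
print(c(statistic = lrStat, df = lrDf))
print(pValue < 0.001)
```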

Make Final Multi Logistic Model

# make model
mgmModel <- glm(data=dfrModel, formula=Outcome~Pregnancies+Glucose+BloodPressure+BMI+DiabetesPedigreeFunction, family=binomial(link="logit"))
# print summary
summary(mgmModel)
## 
## Call:
## glm(formula = Outcome ~ Pregnancies + Glucose + BloodPressure + 
##     BMI + DiabetesPedigreeFunction, family = binomial(link = "logit"), 
##     data = dfrModel)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.7459  -0.7335  -0.4097   0.7153   2.8945  
## 
## Coefficients:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              -8.009215   0.713218 -11.230  < 2e-16 ***
## Pregnancies               0.157336   0.029270   5.375 7.64e-08 ***
## Glucose                   0.033444   0.003517   9.509  < 2e-16 ***
## BloodPressure            -0.012432   0.005246  -2.370  0.01781 *  
## BMI                       0.091142   0.014970   6.088 1.14e-09 ***
## DiabetesPedigreeFunction  0.885935   0.304143   2.913  0.00358 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 901.37  on 699  degrees of freedom
## Residual deviance: 659.26  on 694  degrees of freedom
## AIC: 671.26
## 
## Number of Fisher Scoring iterations: 5
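
Logistic regression coefficients are on the log-odds scale, so exponentiating them gives odds ratios. A short sketch using the estimates printed in the summary above; with the fitted model in hand, exp(coef(mgmModel)) would give the same quantities.

```r
# Coefficient estimates copied from the summary above
est <- c(Pregnancies = 0.157336,
         Glucose = 0.033444,
         BloodPressure = -0.012432,
         BMI = 0.091142,
         DiabetesPedigreeFunction = 0.885935)

# Odds ratio for a one-unit increase in each predictor,
# holding the other predictors fixed
oddsRatio <- exp(est)
print(round(oddsRatio, 3))
```

An odds ratio above 1 (for example Glucose) means the odds of diabetes rise with the predictor; a value below 1 (BloodPressure here) means they fall.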

Confusion Matrix

prdVal <- predict(mgmModel, type='response')
prdBln <- ifelse(prdVal > 0.5, 1, 0)
cnfmtrx <- table(prd=prdBln, act=dfrModel$Outcome)
confusionMatrix(cnfmtrx)
## Confusion Matrix and Statistics
## 
##    act
## prd   0   1
##   0 404 103
##   1  55 138
##                                           
##                Accuracy : 0.7743          
##                  95% CI : (0.7415, 0.8048)
##     No Information Rate : 0.6557          
##     P-Value [Acc > NIR] : 5.601e-12       
##                                           
##                   Kappa : 0.4753          
##  Mcnemar's Test P-Value : 0.0001847       
##                                           
##             Sensitivity : 0.8802          
##             Specificity : 0.5726          
##          Pos Pred Value : 0.7968          
##          Neg Pred Value : 0.7150          
##              Prevalence : 0.6557          
##          Detection Rate : 0.5771          
##    Detection Prevalence : 0.7243          
##       Balanced Accuracy : 0.7264          
##                                           
##        'Positive' Class : 0               
## 

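Observation
Accuracy of the model is about 77% and sensitivity about 88%. Because caret treated 0 (non-diabetic) as the positive class, this sensitivity is the share of non-diabetic patients classified correctly; the share of diabetic patients classified correctly is the reported specificity, about 57%.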
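
The headline statistics can be recomputed from the four cells of the table above. A sketch in base R, treating 0 as the positive class, as the caret output reports:

```r
# Cells copied from the confusion matrix above (rows = predicted, columns = actual)
cm <- matrix(c(404, 55, 103, 138), nrow = 2,
             dimnames = list(prd = c("0", "1"), act = c("0", "1")))

# Share of all cases classified correctly
accuracy <- sum(diag(cm)) / sum(cm)

# Share of actual 0s predicted as 0 (positive class = 0)
sensitivity <- cm["0", "0"] / sum(cm[, "0"])

# Share of actual 1s predicted as 1
specificity <- cm["1", "1"] / sum(cm[, "1"])

print(round(c(accuracy = accuracy,
              sensitivity = sensitivity,
              specificity = specificity), 4))
```

caret's confusionMatrix() accepts a positive argument, so passing positive = "1" would report sensitivity for the diabetic class instead.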

Regression Data

dfrPlot <- mutate(dfrModel, PrdVal=prdVal, POutcome=prdBln)
head(dfrPlot)
##   Pregnancies Glucose BloodPressure SkinThickness Insulin  BMI
## 1           6     148            72            35       0 33.6
## 2           1      85            66            29       0 26.6
## 3           8     183            64             0       0 23.3
## 4           1      89            66            23      94 28.1
## 5           0     137            40            35     168 43.1
## 6           5     116            74             0       0 25.6
##   DiabetesPedigreeFunction Age Outcome     PrdVal POutcome
## 1                    0.627  50       1 0.64732497        1
## 2                    0.351  31       0 0.04334372        0
## 3                    0.672  32       1 0.78466856        1
## 4                    0.167  21       0 0.04802554        0
## 5                    2.288  33       1 0.88397222        1
## 6                    0.201  30       0 0.14783877        0
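
To show where PrdVal comes from, the prediction for row 1 can be reproduced by hand: multiply each coefficient by the corresponding predictor value, sum, and apply the logistic function. The sketch below uses the rounded coefficients printed in the model summary, so the result should only match the PrdVal shown for row 1 approximately.

```r
# Coefficients copied from the model summary
b <- c(Intercept = -8.009215,
       Pregnancies = 0.157336,
       Glucose = 0.033444,
       BloodPressure = -0.012432,
       BMI = 0.091142,
       DiabetesPedigreeFunction = 0.885935)

# Predictor values for row 1, with a leading 1 for the intercept
x1 <- c(1, 6, 148, 72, 33.6, 0.627)

# Linear predictor (log-odds), then probability via the logistic function
logOdds <- sum(b * x1)
prob <- plogis(logOdds)
print(round(prob, 4))
```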

Regression Visualization

#dfrPlot
ggplot(dfrPlot, aes(x=PrdVal, y=POutcome))  + 
    geom_point(shape=19, colour="blue", fill="blue") +
    geom_smooth(method="gam", formula=y~s(log(x)), se=FALSE) +
    labs(title="Binomial Regression Curve") +
    labs(x="") +
    labs(y="")

ROC Visualization

rocplot(mgmModel)

Observation
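The area under the ROC curve (AUC) is around 0.83. AUC measures how well the model ranks diabetic patients above non-diabetic ones; it is not a classification accuracy.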
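
For reference, the AUC can be computed without a plotting package: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below uses the rank-sum form of that definition on made-up scores; auc_rank is a hypothetical helper, not part of Deducer or of the original analysis.

```r
# Hypothetical helper: AUC from the rank-sum (Mann-Whitney) statistic.
# scores: predicted probabilities; labels: 0/1 actual outcomes
auc_rank <- function(scores, labels) {
  r  <- rank(scores)              # average ranks, so ties are handled
  n1 <- sum(labels == 1)
  n0 <- sum(labels == 0)
  (sum(r[labels == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Made-up example: one of the four positive/negative pairs is ranked wrongly
toyScores <- c(0.10, 0.40, 0.35, 0.80)
toyLabels <- c(0, 0, 1, 1)
print(auc_rank(toyScores, toyLabels))
```

With the objects from this analysis, the equivalent call would be auc_rank(prdVal, dfrModel$Outcome).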

Test Data

setwd("D:\\PGDM\\Trim 4\\MachineLearning")
dfrTests <- read.csv("./Data/Diabetes_test1.csv", header=T, stringsAsFactors=F)
head(dfrTests)
##   Pregnancies Glucose BloodPressure SkinThickness Insulin  BMI
## 1           2     122            76            27     200 35.9
## 2           6     125            78            31       0 27.6
## 3           1     168            88            29       0 35.0
## 4           2     129             0             0       0 38.5
## 5           4     110            76            20     100 28.4
## 6           6      80            80            36       0 39.8
##   DiabetesPedigreeFunction Age Outcome
## 1                    0.483  26       0
## 2                    0.565  49       1
## 3                    0.905  52       1
## 4                    0.304  41       0
## 5                    0.118  27       0
## 6                    0.177  28       0

Observation
Test data loaded successfully.

Predict using Test data

resVal <- predict(mgmModel, dfrTests, type="response")
prdOut <- ifelse(resVal > 0.5, 1, 0)
dfrTests <- mutate(dfrTests, Pvalue=resVal, POutcome=prdOut)
dfrTests 
##    Pregnancies Glucose BloodPressure SkinThickness Insulin  BMI
## 1            2     122            76            27     200 35.9
## 2            6     125            78            31       0 27.6
## 3            1     168            88            29       0 35.0
## 4            2     129             0             0       0 38.5
## 5            4     110            76            20     100 28.4
## 6            6      80            80            36       0 39.8
## 7           10     115             0             0       0  0.0
## 8            2     127            46            21     335 34.4
## 9            9     164            78             0       0 32.8
## 10           2      93            64            32     160 38.0
## 11           3     158            64            13     387 31.2
## 12           5     126            78            27      22 29.6
## 13          10     129            62            36       0 41.2
## 14           0     134            58            20     291 26.4
## 15           3     102            74             0       0 29.5
## 16           7     187            50            33     392 33.9
## 17           3     173            78            39     185 33.8
## 18          10      94            72            18       0 23.1
## 19           1     108            60            46     178 35.5
## 20           5      97            76            27       0 35.6
## 21           4      83            86            19       0 29.3
## 22           1     114            66            36     200 38.1
## 23           1     149            68            29     127 29.3
## 24           5     117            86            30     105 39.1
## 25           1     111            94             0       0 32.8
## 26           4     112            78            40       0 39.4
## 27           1     116            78            29     180 36.1
## 28           0     141            84            26       0 32.4
## 29           2     175            88             0       0 22.9
## 30           2      92            52             0       0 30.1
## 31           3     130            78            23      79 28.4
## 32           8     120            86             0       0 28.4
## 33           2     174            88            37     120 44.5
## 34           2     106            56            27     165 29.0
## 35           2     105            75             0       0 23.3
## 36           4      95            60            32       0 35.4
## 37           0     126            86            27     120 27.4
## 38           8      65            72            23       0 32.0
## 39           2      99            60            17     160 36.6
## 40           1     102            74             0       0 39.5
## 41          11     120            80            37     150 42.3
## 42           3     102            44            20      94 30.8
## 43           1     109            58            18     116 28.5
## 44           9     140            94             0       0 32.7
## 45          13     153            88            37     140 40.6
## 46          12     100            84            33     105 30.0
## 47           1     147            94            41       0 49.3
## 48           1      81            74            41      57 46.3
## 49           3     187            70            22     200 36.4
## 50           6     162            62             0       0 24.3
## 51           4     136            70             0       0 31.2
## 52           1     121            78            39      74 39.0
## 53           3     108            62            24       0 26.0
## 54           0     181            88            44     510 43.3
## 55           8     154            78            32       0 32.4
## 56           1     128            88            39     110 36.5
## 57           7     137            90            41       0 32.0
## 58           0     123            72             0       0 36.3
## 59           1     106            76             0       0 37.5
## 60           6     190            92             0       0 35.5
## 61           2      88            58            26      16 28.4
## 62           9     170            74            31       0 44.0
## 63           9      89            62             0       0 22.5
## 64          10     101            76            48     180 32.9
## 65           2     122            70            27       0 36.8
## 66           5     121            72            23     112 26.2
## 67           1     126            60             0       0 30.1
## 68           1      93            70            31       0 30.4
##    DiabetesPedigreeFunction Age Outcome     Pvalue POutcome
## 1                     0.483  26       0 0.29749244        0
## 2                     0.565  49       1 0.30189651        0
## 3                     0.905  52       1 0.66026773        1
## 4                     0.304  41       0 0.59821566        1
## 5                     0.118  27       0 0.12424247        0
## 6                     0.177  28       0 0.16798839        0
## 7                     0.261  30       1 0.08638919        0
## 8                     0.176  22       0 0.32567987        0
## 9                     0.148  45       1 0.73934182        1
## 10                    0.674  23       1 0.21092502        0
## 11                    0.295  24       0 0.51407613        1
## 12                    0.439  40       0 0.29079639        0
## 13                    0.441  38       1 0.77789023        1
## 14                    0.352  21       0 0.17788535        0
## 15                    0.121  32       0 0.09535232        0
## 16                    0.826  34       1 0.92731122        1
## 17                    0.970  31       1 0.77187221        1
## 18                    0.595  56       0 0.17441146        0
## 19                    0.415  24       0 0.20058927        0
## 20                    0.378  52       1 0.20689720        0
## 21                    0.317  34       0 0.06169707        0
## 22                    0.289  21       0 0.24393972        0
## 23                    0.349  42       1 0.32422871        0
## 24                    0.251  42       0 0.35601984        0
## 25                    0.265  45       0 0.11066833        0
## 26                    0.236  38       0 0.30922789        0
## 27                    0.496  25       0 0.22927908        0
## 28                    0.433  22       0 0.26869678        0
## 29                    0.326  22       0 0.36358534        0
## 30                    0.141  22       0 0.08349014        0
## 31                    0.323  34       1 0.21677629        0
## 32                    0.259  22       1 0.27121466        0
## 33                    0.646  24       1 0.84008714        1
## 34                    0.426  22       0 0.13882109        0
## 35                    0.560  53       0 0.07617043        0
## 36                    0.284  28       0 0.18685838        0
## 37                    0.515  21       0 0.12888735        0
## 38                    0.600  42       0 0.11674276        0
## 39                    0.453  21       0 0.19903185        0
## 40                    0.293  42       1 0.18229997        0
## 41                    0.785  48       1 0.78431616        1
## 42                    0.400  26       0 0.18073784        0
## 43                    0.219  22       0 0.10565220        0
## 44                    0.734  45       1 0.63437356        1
## 45                    1.174  39       0 0.94265225        1
## 46                    0.488  46       0 0.34198956        0
## 47                    0.358  27       1 0.66958025        1
## 48                    1.096  32       0 0.29483781        0
## 49                    0.408  36       1 0.82137020        1
## 50                    0.178  50       1 0.48861097        0
## 51                    1.182  22       1 0.54713868        1
## 52                    0.261  28       0 0.27109988        0
## 53                    0.223  25       0 0.10633357        0
## 54                    0.222  26       1 0.74900417        1
## 55                    0.443  45       1 0.68474539        1
## 56                    1.057  37       1 0.40085432        0
## 57                    0.391  39       0 0.45464341        0
## 58                    0.258  52       1 0.22206950        0
## 59                    0.197  26       0 0.15986116        0
## 60                    0.278  66       1 0.83579980        1
## 61                    0.766  22       0 0.09926279        0
## 62                    0.403  43       1 0.92687453        1
## 63                    0.142  33       0 0.09877301        0
## 64                    0.171  63       0 0.29885728        0
## 65                    0.340  27       0 0.30378496        0
## 66                    0.245  30       0 0.18756614        0
## 67                    0.349  47       1 0.20895172        0
## 68                    0.315  23       0 0.07162370        0

Observation
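The predicted outcome (POutcome) is added as a column by applying a 0.5 cutoff to resVal. The Pvalue column holds the predicted probability, not a statistical p-value.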
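
The 0.5 cutoff is a choice rather than a property of the model. A short sketch using the first four predicted probabilities from the output above shows how a lower cutoff (0.3 here, chosen purely for illustration) flags more patients as diabetic:

```r
# Predicted probabilities for the first four test rows, copied from the output above
probs <- c(0.29749244, 0.30189651, 0.66026773, 0.59821566)

# Classification at the cutoff used in this analysis
atHalf <- ifelse(probs > 0.5, 1, 0)

# Classification at a lower, illustrative cutoff
atLower <- ifelse(probs > 0.3, 1, 0)

print(rbind(probs = probs, atHalf = atHalf, atLower = atLower))
```

Lowering the cutoff trades false negatives for false positives, which may matter when missing a diabetic patient is the costlier error.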

summarise(group_by(dfrTests, Outcome), n())
## # A tibble: 2 × 2
##   Outcome `n()`
##     <int> <int>
## 1       0    41
## 2       1    27

Confusion Matrix of Test data

prdVal11 <- predict(mgmModel,dfrTests, type='response')
prdBln21 <- ifelse(prdVal11 > 0.5, 1, 0)
cnfmtrx <- table(prd=prdBln21, act=dfrTests$Outcome)
confusionMatrix(cnfmtrx)
## Confusion Matrix and Statistics
## 
##    act
## prd  0  1
##   0 38 12
##   1  3 15
##                                          
##                Accuracy : 0.7794         
##                  95% CI : (0.6624, 0.871)
##     No Information Rate : 0.6029         
##     P-Value [Acc > NIR] : 0.001609       
##                                          
##                   Kappa : 0.5115         
##  Mcnemar's Test P-Value : 0.038867       
##                                          
##             Sensitivity : 0.9268         
##             Specificity : 0.5556         
##          Pos Pred Value : 0.7600         
##          Neg Pred Value : 0.8333         
##              Prevalence : 0.6029         
##          Detection Rate : 0.5588         
##    Detection Prevalence : 0.7353         
##       Balanced Accuracy : 0.7412         
##                                          
##        'Positive' Class : 0              
## 

Observation
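Accuracy on the test data is about 78% and sensitivity about 93%. As 0 (non-diabetic) is again the positive class, the share of diabetic patients classified correctly is the reported specificity, about 56%.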
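
The Kappa value in the output above measures agreement beyond what chance alone would produce. As a worked example, it can be recomputed from the test confusion matrix: observed agreement is the accuracy, and expected agreement comes from the row and column totals.

```r
# Cells copied from the test confusion matrix above (rows = predicted, columns = actual)
cmTest <- matrix(c(38, 3, 12, 15), nrow = 2,
                 dimnames = list(prd = c("0", "1"), act = c("0", "1")))

n <- sum(cmTest)

# Observed agreement (accuracy)
observed <- sum(diag(cmTest)) / n

# Agreement expected by chance, from the marginal totals
expected <- sum(rowSums(cmTest) * colSums(cmTest)) / n^2

# Cohen's kappa
kappaStat <- (observed - expected) / (1 - expected)
print(round(c(observed = observed, expected = expected, kappa = kappaStat), 4))
```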

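# Convert the 0/1 predictions to a labelled factor; explicit levels keep
# the label mapping correct even if one class is absent from the predictions
dfrTests$POutcome <- factor(dfrTests$POutcome, levels = c(0, 1),
                            labels = c("Non Diabetic", "Diabetic"))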
dfrTests
##    Pregnancies Glucose BloodPressure SkinThickness Insulin  BMI
## 1            2     122            76            27     200 35.9
## 2            6     125            78            31       0 27.6
## 3            1     168            88            29       0 35.0
## 4            2     129             0             0       0 38.5
## 5            4     110            76            20     100 28.4
## 6            6      80            80            36       0 39.8
## 7           10     115             0             0       0  0.0
## 8            2     127            46            21     335 34.4
## 9            9     164            78             0       0 32.8
## 10           2      93            64            32     160 38.0
## 11           3     158            64            13     387 31.2
## 12           5     126            78            27      22 29.6
## 13          10     129            62            36       0 41.2
## 14           0     134            58            20     291 26.4
## 15           3     102            74             0       0 29.5
## 16           7     187            50            33     392 33.9
## 17           3     173            78            39     185 33.8
## 18          10      94            72            18       0 23.1
## 19           1     108            60            46     178 35.5
## 20           5      97            76            27       0 35.6
## 21           4      83            86            19       0 29.3
## 22           1     114            66            36     200 38.1
## 23           1     149            68            29     127 29.3
## 24           5     117            86            30     105 39.1
## 25           1     111            94             0       0 32.8
## 26           4     112            78            40       0 39.4
## 27           1     116            78            29     180 36.1
## 28           0     141            84            26       0 32.4
## 29           2     175            88             0       0 22.9
## 30           2      92            52             0       0 30.1
## 31           3     130            78            23      79 28.4
## 32           8     120            86             0       0 28.4
## 33           2     174            88            37     120 44.5
## 34           2     106            56            27     165 29.0
## 35           2     105            75             0       0 23.3
## 36           4      95            60            32       0 35.4
## 37           0     126            86            27     120 27.4
## 38           8      65            72            23       0 32.0
## 39           2      99            60            17     160 36.6
## 40           1     102            74             0       0 39.5
## 41          11     120            80            37     150 42.3
## 42           3     102            44            20      94 30.8
## 43           1     109            58            18     116 28.5
## 44           9     140            94             0       0 32.7
## 45          13     153            88            37     140 40.6
## 46          12     100            84            33     105 30.0
## 47           1     147            94            41       0 49.3
## 48           1      81            74            41      57 46.3
## 49           3     187            70            22     200 36.4
## 50           6     162            62             0       0 24.3
## 51           4     136            70             0       0 31.2
## 52           1     121            78            39      74 39.0
## 53           3     108            62            24       0 26.0
## 54           0     181            88            44     510 43.3
## 55           8     154            78            32       0 32.4
## 56           1     128            88            39     110 36.5
## 57           7     137            90            41       0 32.0
## 58           0     123            72             0       0 36.3
## 59           1     106            76             0       0 37.5
## 60           6     190            92             0       0 35.5
## 61           2      88            58            26      16 28.4
## 62           9     170            74            31       0 44.0
## 63           9      89            62             0       0 22.5
## 64          10     101            76            48     180 32.9
## 65           2     122            70            27       0 36.8
## 66           5     121            72            23     112 26.2
## 67           1     126            60             0       0 30.1
## 68           1      93            70            31       0 30.4
##    DiabetesPedigreeFunction Age Outcome     Pvalue     POutcome
## 1                     0.483  26       0 0.29749244 Non Diabetic
## 2                     0.565  49       1 0.30189651 Non Diabetic
## 3                     0.905  52       1 0.66026773     Diabetic
## 4                     0.304  41       0 0.59821566     Diabetic
## 5                     0.118  27       0 0.12424247 Non Diabetic
## 6                     0.177  28       0 0.16798839 Non Diabetic
## 7                     0.261  30       1 0.08638919 Non Diabetic
## 8                     0.176  22       0 0.32567987 Non Diabetic
## 9                     0.148  45       1 0.73934182     Diabetic
## 10                    0.674  23       1 0.21092502 Non Diabetic
## 11                    0.295  24       0 0.51407613     Diabetic
## 12                    0.439  40       0 0.29079639 Non Diabetic
## 13                    0.441  38       1 0.77789023     Diabetic
## 14                    0.352  21       0 0.17788535 Non Diabetic
## 15                    0.121  32       0 0.09535232 Non Diabetic
## 16                    0.826  34       1 0.92731122     Diabetic
## 17                    0.970  31       1 0.77187221     Diabetic
## 18                    0.595  56       0 0.17441146 Non Diabetic
## 19                    0.415  24       0 0.20058927 Non Diabetic
## 20                    0.378  52       1 0.20689720 Non Diabetic
## 21                    0.317  34       0 0.06169707 Non Diabetic
## 22                    0.289  21       0 0.24393972 Non Diabetic
## 23                    0.349  42       1 0.32422871 Non Diabetic
## 24                    0.251  42       0 0.35601984 Non Diabetic
## 25                    0.265  45       0 0.11066833 Non Diabetic
## 26                    0.236  38       0 0.30922789 Non Diabetic
## 27                    0.496  25       0 0.22927908 Non Diabetic
## 28                    0.433  22       0 0.26869678 Non Diabetic
## 29                    0.326  22       0 0.36358534 Non Diabetic
## 30                    0.141  22       0 0.08349014 Non Diabetic
## 31                    0.323  34       1 0.21677629 Non Diabetic
## 32                    0.259  22       1 0.27121466 Non Diabetic
## 33                    0.646  24       1 0.84008714     Diabetic
## 34                    0.426  22       0 0.13882109 Non Diabetic
## 35                    0.560  53       0 0.07617043 Non Diabetic
## 36                    0.284  28       0 0.18685838 Non Diabetic
## 37                    0.515  21       0 0.12888735 Non Diabetic
## 38                    0.600  42       0 0.11674276 Non Diabetic
## 39                    0.453  21       0 0.19903185 Non Diabetic
## 40                    0.293  42       1 0.18229997 Non Diabetic
## 41                    0.785  48       1 0.78431616     Diabetic
## 42                    0.400  26       0 0.18073784 Non Diabetic
## 43                    0.219  22       0 0.10565220 Non Diabetic
## 44                    0.734  45       1 0.63437356     Diabetic
## 45                    1.174  39       0 0.94265225     Diabetic
## 46                    0.488  46       0 0.34198956 Non Diabetic
## 47                    0.358  27       1 0.66958025     Diabetic
## 48                    1.096  32       0 0.29483781 Non Diabetic
## 49                    0.408  36       1 0.82137020     Diabetic
## 50                    0.178  50       1 0.48861097 Non Diabetic
## 51                    1.182  22       1 0.54713868     Diabetic
## 52                    0.261  28       0 0.27109988 Non Diabetic
## 53                    0.223  25       0 0.10633357 Non Diabetic
## 54                    0.222  26       1 0.74900417     Diabetic
## 55                    0.443  45       1 0.68474539     Diabetic
## 56                    1.057  37       1 0.40085432 Non Diabetic
## 57                    0.391  39       0 0.45464341 Non Diabetic
## 58                    0.258  52       1 0.22206950 Non Diabetic
## 59                    0.197  26       0 0.15986116 Non Diabetic
## 60                    0.278  66       1 0.83579980     Diabetic
## 61                    0.766  22       0 0.09926279 Non Diabetic
## 62                    0.403  43       1 0.92687453     Diabetic
## 63                    0.142  33       0 0.09877301 Non Diabetic
## 64                    0.171  63       0 0.29885728 Non Diabetic
## 65                    0.340  27       0 0.30378496 Non Diabetic
## 66                    0.245  30       0 0.18756614 Non Diabetic
## 67                    0.349  47       1 0.20895172 Non Diabetic
## 68                    0.315  23       0 0.07162370 Non Diabetic
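
In the printed rows above, POutcome appears to follow from Pvalue by a fixed cutoff: every row with Pvalue above 0.5 is labelled Diabetic and every row below it Non Diabetic (for example, row 11 at 0.514 versus row 50 at 0.489). The sketch below rebuilds that labelling for the first ten printed rows and cross-tabulates it against Outcome. The 0.5 cutoff, the strict `>` comparison, and the names `preds`, `cutoff`, `Actual`, `conf_mat` and `accuracy` are assumptions made for illustration; they are not taken from the original code.

```r
# Outcome and Pvalue for rows 1-10, copied from the printed output above.
preds <- data.frame(
  Outcome = c(0, 1, 1, 0, 0, 0, 1, 0, 1, 1),
  Pvalue  = c(0.29749244, 0.30189651, 0.66026773, 0.59821566, 0.12424247,
              0.16798839, 0.08638919, 0.32567987, 0.73934182, 0.21092502)
)

# Assumed cutoff: it is consistent with the rows shown, but the original
# code that produced POutcome is not visible here.
cutoff <- 0.5
preds$POutcome <- ifelse(preds$Pvalue > cutoff, "Diabetic", "Non Diabetic")

# Put the 0/1 response on the same labels so both table margins line up.
preds$Actual <- ifelse(preds$Outcome == 1, "Diabetic", "Non Diabetic")

# Confusion matrix: rows are actual classes, columns are predicted classes.
conf_mat <- table(Actual = preds$Actual, Predicted = preds$POutcome)
print(conf_mat)

# Share of these ten rows classified correctly (diagonal of the matrix).
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(accuracy)
```

Ten rows are far too few to judge the model; the same two calls applied to the full prediction data frame would give the overall confusion matrix and accuracy. Lowering the cutoff would label more patients Diabetic, trading missed cases for false alarms.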