Week 11 Assignment - Karthik Balasubramanian

R Markdown

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.3     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(ggthemes)
library(ggrepel)
library(broom)
library(lindia)

# Load the dataset (the path is specific to the author's machine)
datas <- read.csv("C:\\Users\\karth\\Downloads\\Child Growth and Malnutrition.csv")
# Inspect the data in the viewer
view(datas)

datas$Wasting = as.numeric(datas$Wasting)

## Warning: NAs introduced by coercion

datas$Overweight = as.numeric(datas$Overweight)

## Warning: NAs introduced by coercion
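
The coercion warnings above mean that some entries in these columns could not be parsed as numbers and were turned into NA. A minimal sketch of how one might list the offending entries before converting is shown below. The vector `raw` and its values are hypothetical and are not taken from the dataset.

```r
# Hypothetical raw column: numbers stored as text, plus some non-numeric entries
raw <- c("4.5", "12.1", "-", "", "7", "n/a")

# Convert, silencing the expected coercion warning
converted <- suppressWarnings(as.numeric(raw))

# Entries that were present in the raw data but could not be parsed
bad_values <- unique(raw[is.na(converted) & !is.na(raw)])
print(bad_values)

# Number of NAs created by the conversion itself
n_new_na <- sum(is.na(converted) & !is.na(raw))
print(n_new_na)
```

Running the same check on datas$Wasting and datas$Overweight before the conversion would show which raw values account for the rows later dropped for missingness.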

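# Work on a copy so the original data frame stays unchanged
datas1 <- datas
# Encode the response as 1 if the row was selected for JME, 0 otherwise
datas1$JME..Y.N. <- ifelse(datas1$JME..Y.N. == "Selected for JME",1,0)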

m <- lm(JME..Y.N. ~ Wasting + Overweight + Stunting + Underweight, data = datas1)
summary(m)

## 
## Call:
## lm(formula = JME..Y.N. ~ Wasting + Overweight + Stunting + Underweight, 
##     data = datas1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.03021 -0.02728 -0.02572 -0.02383  0.98304 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.020e-02  2.266e-03  13.331   <2e-16 ***
## Wasting     -4.167e-04  2.510e-04  -1.660   0.0969 .  
## Overweight  -6.848e-05  1.947e-04  -0.352   0.7250    
## Stunting    -8.190e-05  1.053e-04  -0.778   0.4367    
## Underweight  5.414e-05  1.993e-04   0.272   0.7859    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1575 on 37094 degrees of freedom
##   (2420 observations deleted due to missingness)
## Multiple R-squared:  0.0002562,  Adjusted R-squared:  0.0001484 
## F-statistic: 2.376 on 4 and 37094 DF,  p-value: 0.04966

m$coefficients

##   (Intercept)       Wasting    Overweight      Stunting   Underweight 
##  3.020419e-02 -4.166734e-04 -6.848085e-05 -8.189885e-05  5.414485e-05

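The coefficients of the linear model suggest that, holding the other variables fixed, for a unit change in

Wasting - JME goes down by a very small amount, about 4.2 * 10^(-4)

Overweight - JME goes down by about 6.8 * 10^(-5)

Stunting - JME goes down by about 8.2 * 10^(-5)

Underweight - JME goes up by about 5.4 * 10^(-5)

Because JME is coded as 0 or 1, these are changes in the fitted probability of being selected for JME.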
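
To give a feel for how small these effects are, the sketch below rescales the printed coefficients to a 10-unit change in each explanatory variable and expresses the result in percentage points of the fitted value. The 10-unit step is an arbitrary choice for illustration; the coefficients are copied from the lm output above.

```r
# Coefficients copied from the printed lm output above
b <- c(Wasting = -4.166734e-04, Overweight = -6.848085e-05,
       Stunting = -8.189885e-05, Underweight = 5.414485e-05)

# Change in the fitted value for a 10-unit increase in each variable,
# expressed in percentage points (hence the factor of 100)
delta_pp <- 100 * 10 * b
print(round(delta_pp, 3))
```

Even a 10-unit change in Wasting, the largest effect, moves the fitted value by less than half a percentage point.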

model <- glm(JME..Y.N. ~ Wasting + Overweight + Stunting + Underweight, data = datas1,
             family = binomial(link = 'log'))
summary(model)

## 
## Call:
## glm(formula = JME..Y.N. ~ Wasting + Overweight + Stunting + Underweight, 
##     family = binomial(link = "log"), data = datas1)
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -3.483092   0.088440 -39.384   <2e-16 ***
## Wasting     -0.017200   0.010416  -1.651   0.0987 .  
## Overweight  -0.002944   0.007704  -0.382   0.7023    
## Stunting    -0.003093   0.004152  -0.745   0.4564    
## Underweight  0.001965   0.008062   0.244   0.8074    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 8795.0  on 37098  degrees of freedom
## Residual deviance: 8785.3  on 37094  degrees of freedom
##   (2420 observations deleted due to missingness)
## AIC: 8795.3
## 
## Number of Fisher Scoring iterations: 7

model$coefficients

##  (Intercept)      Wasting   Overweight     Stunting  Underweight 
## -3.483091710 -0.017199709 -0.002944272 -0.003092508  0.001965308

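The coefficients of the log-link binomial model (family = binomial(link = 'log')) suggest that, holding the other variables fixed, for a unit change in

Wasting - the probability of JME is multiplied by \(e^{-0.017199709} = 0.9829\), so it goes down by about 1.7%

Overweight - the probability of JME is multiplied by \(e^{-0.002944272} = 0.9971\), so it goes down by about 0.29%

Stunting - the probability of JME is multiplied by \(e^{-0.003092508} = 0.9969\), so it goes down by about 0.31%

Underweight - the probability of JME is multiplied by \(e^{0.001965308} = 1.0020\), so it goes up by about 0.20%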
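
With a log link, exponentiating a coefficient gives the factor by which the fitted probability is multiplied for a unit increase in that variable. The sketch below recomputes these factors from the printed coefficients so the arithmetic can be checked. The names `b_log`, `rr` and `pct_change` are illustrative only.

```r
# Coefficients copied from the printed glm output above (intercept omitted)
b_log <- c(Wasting = -0.017199709, Overweight = -0.002944272,
           Stunting = -0.003092508, Underweight = 0.001965308)

# Multiplicative factor on the fitted probability per unit increase
rr <- exp(b_log)
print(round(rr, 4))

# The same effect expressed as a percent change
pct_change <- 100 * (rr - 1)
print(round(pct_change, 2))
```

A factor below 1 is a decrease and a factor above 1 is an increase, so only Underweight has a positive association here.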

summary(m)$r.squared

## [1] 0.0002561839

#summary(model)$r.squared

summary(model)$r.squared

## NULL

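The two values above cannot be compared directly. The summary of a glm object has no r.squared component, so the NULL means that R-squared is not defined for this model, not that it equals 0. A deviance-based pseudo R-squared can be computed from the deviances reported above, 1 - 8785.3/8795.0, which is about 0.0011. That is just as negligible as the linear model's R-squared of 0.0002562. Neither model explains more than a tiny fraction of the variation in the response variable.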
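
The NULL output arises because summary() on a glm object does not return an r.squared component. One common substitute is a deviance-based pseudo R-squared, 1 - residual deviance / null deviance. The sketch below shows both points on the built-in mtcars data (used only because it is always available, not because it relates to this assignment) and then applies the same formula to the deviances printed in the glm summary above.

```r
# Illustration on built-in data, not the assignment's dataset
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# summary.glm has no r.squared element, so this lookup returns NULL
has_r2 <- !is.null(summary(fit)$r.squared)
print(has_r2)

# Deviance-based pseudo R-squared for the illustration model
pseudo_r2_demo <- 1 - fit$deviance / fit$null.deviance
print(pseudo_r2_demo)

# Same formula applied to the deviances printed in the summary above
pseudo_r2 <- 1 - 8785.3 / 8795.0
print(round(pseudo_r2, 4))
```

This pseudo R-squared is not on the same scale as the R-squared of a linear model, so it is better read as a rough indication of fit than as a number to compare directly.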

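For both of the above models, none of the four explanatory variables is significant at the 5% level. The smallest p-values belong to Wasting (0.0969 in the linear model and 0.0987 in the log-link model), and all the others are above 0.4. The overall F-test of the linear model is only borderline, with a p-value of 0.04966. This means that the models are not actually reliable, which is consistent with the linear model's R-squared of only about 0.00026. This is a problem, as we would like our models to be reliable.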

This suggests that, for the data available to us, we cannot use either the log-link binomial model or the linear regression model for these specific explanatory and response variables. The explanatory variables carry almost no information about JME selection, and the response variable is binary rather than continuous, so it is not suited to a linear model.

2023-11-02