Karthik Balasubramanian

R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.3     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

datas <- read.csv("C:\\Users\\karth\\Downloads\\Child Growth and Malnutrition.csv")

view(datas)

3 columns

In the data above, three columns (Wasting, Stunting, and Underweight) were not immediately clear from the table alone. After reading the documentation, I understood the terms as follows:

1. Wasting - The number of children whose weight-for-height falls more than 2 standard deviations (moderate) or more than 3 standard deviations (severe) below the median of the representative population.

2. Stunting - How far the children's height-for-age falls below the median height-for-age of the representative population.

3. Underweight - How far the children's weight-for-age falls below the median weight-for-age of the representative population.

If I had not read the documentation, I would have assumed these values represented the children's weights and heights themselves, not their distance from the median. I am not sure why the values are encoded in this format.
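To make the cut-offs in the definitions above concrete, here is a minimal sketch in base R. The z-scores and the helper `classify_z` are hypothetical and are not part of the dataset or its documentation; the labels assume "severe" means below -3 standard deviations and "moderate" means between -3 and -2.

```r
# Hypothetical z-scores: distance from the reference median, in standard deviations
z_scores <- c(-3.4, -2.5, -1.0, 0.3)

# Hypothetical helper: label each z-score using the -2 and -3 SD cut-offs
classify_z <- function(z) {
  ifelse(z < -3, "severe",
         ifelse(z < -2, "moderate", "none"))
}

labels <- classify_z(z_scores)
labels
```

Under these assumptions, a child is only counted once the value crosses a cut-off, which is why the columns describe deviations rather than raw weights or heights.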

The one column I still do not understand is the JME column. Its values are either “Selected for JME” or “not selected for JME”, and I am not sure why only some are selected for JME.
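A quick way to get a feel for this column is to count how often each value occurs. The sketch below uses a small invented data frame named `toy` in place of `datas`. It also assumes that the original CSV header was something like "JME (Y/N)", which `read.csv()` would sanitize into the name `JME..Y.N.` used in the code in this report; that header is a guess, not something confirmed by the data.

```r
# Assumption: the CSV header looked like "JME (Y/N)"; make.names() is the
# sanitizing step read.csv() applies to column names by default
col_name <- make.names("JME (Y/N)")
col_name

# Toy stand-in for `datas`; these values are invented for illustration
toy <- data.frame(
  JME..Y.N. = c("Selected for JME", "not selected for JME",
                "Selected for JME", "Selected for JME")
)

# Count how many rows fall into each JME category
jme_counts <- table(toy$JME..Y.N.)
jme_counts
```

On the real data, `table(datas$JME..Y.N.)` would show how many rows fall into each category.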

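# Keep only rows with a positive Stunting value
datas_f <- datas |>
  filter(Stunting > 0)

# Rows 243 to 253 when sorted by Stunting, highest first
datas1 <- datas_f |>
  arrange(desc(Stunting)) |>
  slice(243:253)
view(datas1)

# Rows 24560 to 24570 when sorted by Stunting, lowest first
datas2 <- datas_f |>
  arrange(Stunting) |>
  slice(24560:24570)
view(datas2)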

graph1 <- datas1 |>
  ggplot()+
  geom_point(mapping = aes(x = Stunting, y = JME..Y.N.))
graph1

graph2 <- datas2 |>
  ggplot()+
  geom_point(mapping = aes(x = Stunting, y = JME..Y.N.))
graph2

The graphs above show JME selection against Stunting values. So far, I am unable to see any pattern in them.
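Each graph above covers only 11 rows, so a pattern may be hard to spot. One possible follow-up, sketched here on invented data, is to summarise Stunting for each JME category across all rows. The data frame `toy` and its values are hypothetical; only the column names come from the code above.

```r
library(dplyr)

# Toy stand-in for `datas_f`; these values are invented for illustration
toy <- data.frame(
  Stunting = c(10, 20, 30, 40, 50, 60),
  JME..Y.N. = rep(c("Selected for JME", "not selected for JME"), times = 3)
)

# Row count and median Stunting for each JME category
stunting_by_jme <- toy |>
  group_by(JME..Y.N.) |>
  summarise(n = n(), median_stunting = median(Stunting))
stunting_by_jme
```

If the medians differed noticeably on the real data, that would hint at a link between selection and Stunting; similar medians would suggest the selection depends on something else.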

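# Keep only rows with a positive Underweight value
datas_f_1 <- datas |>
  filter(Underweight > 0)

# Rows 243 to 253 when sorted by Underweight, highest first
datas3 <- datas_f_1 |>
  arrange(desc(Underweight)) |>
  slice(243:253)
view(datas3)

# Rows 24560 to 24570 when sorted by Underweight, lowest first
datas4 <- datas_f_1 |>
  arrange(Underweight) |>
  slice(24560:24570)
view(datas4)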

graph3 <- datas3 |>
  ggplot()+
  geom_point(mapping = aes(x = Underweight, y = JME..Y.N.))
graph3

graph4 <- datas4 |>
  ggplot()+
  geom_point(mapping = aes(x = Underweight, y = JME..Y.N.))
graph4

The graphs above plot JME selection against Underweight. As with Stunting, some have been selected and others have not.

Without understanding JME, and why some are chosen and others are not, any analysis of this dataset will be only half-baked. JME clearly means something important, as these kids have been chosen either for further testing and data collection or for some other process. We have to understand these processes so that we can include this column when performing analysis.

Karthik Balasubramanian - Week 5

2023-09-24
