Problem Set - Data Management using dplyr Package in R

Ph.D. Course Work - Computer Application

Problem Set 1 (dplyr)

———————————————————————–

Download the data file by clicking the following link:
Birth data file for the exercise problems.

Do the following steps:
1. Load the dplyr package.
2. Load the readr package.
3. Load the data file using the read_tsv("file_location\\filename") command.
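The three steps above can be sketched as follows. This is only an illustration: the real file location is not given here, so the sketch writes two made-up rows to a temporary tab-separated file (demo_path is a hypothetical name) and reads that instead. The values and category labels are assumptions, not taken from the actual birth data; only the column names come from the variable list in this handout.

```r
library(dplyr)   # step 1: load dplyr
library(readr)   # step 2: load readr

# Hypothetical stand-in for the downloaded file: two made-up rows
# written to a temporary tab-separated file.
demo_path <- tempfile(fileext = ".tsv")
writeLines(c(
  "fAge\tmAge\tweeks\tmature\tvisits\tgained\tweight\tsexBaby\tsmoke",
  "31\t28\t39\tfull term\t12\t30\t7.4\tmale\tnonsmoker",
  "NA\t22\t35\tpremature\t8\t25\t5.1\tfemale\tsmoker"
), demo_path)

# step 3: read the tab-separated file into a tibble
births <- read_tsv(demo_path)

# Quick check of column names and types
glimpse(births)
```

With the real file, replace demo_path with the path to the downloaded file. On Windows, either doubled backslashes ("file_location\\filename") or forward slashes ("file_location/filename") work inside an R string.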

Birth data set:

The data set has the following variables/columns:

  • father’s age (fAge)
  • mother’s age (mAge)
  • weeks of gestation (weeks)
  • whether the birth was premature or full term (mature)
  • number of OB/GYN visits (visits)
  • mother’s weight gained in pounds (gained)
  • baby’s birth weight (weight)
  • sex of the baby (sexBaby)
  • whether the mother was a smoker (smoke)

Problem Set:
  1. Find out the average baby weight for the smokers and the non-smokers.

  2. Find out the average weight for premature and full-term babies.

  3. Find out the number of premature male babies (the count() function of dplyr can be used for counting the number of observations).

  4. Display, in decreasing order, the ages of mothers who made more than 10 OB/GYN visits.

  5. Display the first five observations in increasing order of mother’s age.

  6. Display the observations for which the baby’s weight is less than the average weight.
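
The problems above rely on a handful of dplyr verbs: group_by() with summarise(), filter(), count(), arrange() and head(). The sketch below shows each pattern on a small made-up tibble (toy is a hypothetical name). The rows and the category labels such as "premature" and "smoker" are assumptions chosen for illustration; check how the real file codes these columns before adapting anything.

```r
library(dplyr)

# Made-up rows; the category labels are assumptions, not the real coding.
toy <- tibble(
  mAge    = c(22, 35, 28, 19, 31, 26),
  visits  = c(8, 14, 12, 5, 11, 10),
  weight  = c(5.1, 7.9, 7.4, 4.8, 6.9, 7.0),
  mature  = c("premature", "full term", "full term",
              "premature", "full term", "premature"),
  sexBaby = c("female", "male", "male", "male", "female", "male"),
  smoke   = c("smoker", "nonsmoker", "nonsmoker",
              "smoker", "nonsmoker", "smoker")
)

# Mean of one column within each group of another column
avg_by_smoke <- toy %>%
  group_by(smoke) %>%
  summarise(avg_weight = mean(weight, na.rm = TRUE))

# Count the rows that satisfy two conditions at once
n_prem_male <- toy %>%
  filter(mature == "premature", sexBaby == "male") %>%
  count()

# Filter on one column, then sort by another in decreasing order
frequent <- toy %>%
  filter(visits > 10) %>%
  arrange(desc(mAge)) %>%
  select(mAge, visits)

# First five rows after sorting in increasing order
youngest <- toy %>%
  arrange(mAge) %>%
  head(5)

# Rows whose value is below the overall mean of that column
below_avg <- toy %>%
  filter(weight < mean(weight, na.rm = TRUE))
```

The na.rm = TRUE argument is included because real data may contain missing values; without it, mean() returns NA whenever any value in the column is missing.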