In this project, we will explore predictive analytics using the babies data set provided. The data were collected to understand the factors affecting baby weights at birth. Below are the explanations of the variables:
Count: observation number
bwt: baby's weight at birth in ounces
gestation: length of pregnancy in days
parity: 0 if the baby is first born, 1 otherwise
height: mother's height in inches
weight: mother's pregnancy weight in pounds
smoke: 1 if the mother is a smoker, 0 if nonsmoker
age: mother's age in years
Read this worksheet carefully and follow the instructions to complete the tasks and answer any questions. Knit your final worksheet and submit your work as an HTML or PDF or Word document in Sakai.
If you are submitting as a team project, one submission from one team member is sufficient, but you must enter the names of all team members in the Title section above.
#Read File and check the first 6 observations
myData = read.csv(file="babies.csv")
View(myData)
head(myData)
##   Count bwt gestation parity height weight smoke age
## 1     1 120       284      0     62    100     0  27
## 2     2 113       282      0     64    135     0  33
## 3     3 128       279      0     64    115     1  28
## 4     4 108       282      0     67    125     1  23
## 5     5 136       286      0     62     93     0  25
## 6     6 138       244      0     62    178     0  33
Gestation & Baby weight = As pregnancy length increases, so does the baby's weight; positive linear relationship.
Parity & Baby weight = Babies that are not first born (parity = 1) tend to weigh less than first-born babies; negative relationship.
Height & Baby weight = Taller mothers' babies weigh slightly more; weak positive relationship.
Mother's weight & Baby weight = The more a mother weighs, the more the baby weighs; positive relationship.
Smoke & Baby weight = Mothers who smoke tend to have babies that weigh less; negative relationship.
Age & Baby weight = The older the mother is, the more the baby weighs; positive linear relationship.
bwt = myData$bwt
gestation = myData$gestation
parity = myData$parity
height = myData$height
weight = myData$weight
smoke = myData$smoke
age = myData$age
#Multiple Linear Regression Model (height is not included as a predictor here)
mlr1 <- lm(bwt ~ gestation + parity + weight + smoke + age)
#Summary of the model
summary(mlr1)
##
## Call:
## lm(formula = bwt ~ gestation + parity + weight + smoke + age)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -53.091 -10.698  -0.472   9.782  51.403
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -17.04294    9.00924  -1.892   0.0588 .
## gestation     0.45396    0.02942  15.429  < 2e-16 ***
## parity       -2.87792    1.14082  -2.523   0.0118 *
## weight        0.11338    0.02290   4.951 8.48e-07 ***
## smoke        -8.12808    0.96502  -8.423  < 2e-16 ***
## age          -0.03042    0.08685  -0.350   0.7262
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 16.04 on 1168 degrees of freedom
## Multiple R-squared: 0.2378, Adjusted R-squared: 0.2345
## F-statistic: 72.88 on 5 and 1168 DF, p-value: < 2.2e-16
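The t values and p-values in the coefficient table above can be reproduced from the estimates and standard errors. The following is a minimal sketch that uses the rounded numbers copied from the printed output rather than the fitted `mlr1` object, so small rounding differences from the table are expected. The variable names (`est`, `se`, `df_resid`) are illustrative only.

```r
# Estimates and standard errors copied from the printed summary(mlr1) table
est <- c(gestation = 0.45396, parity = -2.87792, weight = 0.11338,
         smoke = -8.12808, age = -0.03042)
se  <- c(gestation = 0.02942, parity = 1.14082, weight = 0.02290,
         smoke = 0.96502, age = 0.08685)
df_resid <- 1168  # residual degrees of freedom reported in the summary

# t statistic: estimate divided by its standard error
t_value <- est / se

# Two-sided p-value from the t distribution with df_resid degrees of freedom
p_value <- 2 * pt(-abs(t_value), df_resid)

print(round(cbind(t_value, p_value), 4))
```

A coefficient is usually called significant at the 5% level when its p-value is below 0.05, which is the rule applied to age in the discussion of this model.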
-17.04294 is the estimated birth weight in ounces when all variables are equal to 0.
For every additional day of gestation, the birth weight is expected to increase by 0.45396 ounces, holding the other variables constant.
For every additional pound of the mother's weight, the birth weight is expected to increase by 0.11338 ounces.
If the mother is a smoker, the baby is expected to weigh 8.12808 ounces less.
The multiple R-squared is 0.2378. In other words, 23.8% of the variability in birth weight is explained by the model, and 76.2% is unexplained.
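As a quick check on how the R-squared figures in the printed `summary(mlr1)` output relate to each other, the sketch below recomputes the explained and unexplained percentages and the adjusted R-squared from the printed Multiple R-squared of 0.2378. The sample size of 1174 is an inference from the summary (1168 residual degrees of freedom plus 5 predictors plus the intercept), not a value stated directly in the worksheet.

```r
r2 <- 0.2378   # Multiple R-squared printed by summary(mlr1)
p  <- 5        # number of predictors in mlr1
n  <- 1168 + p + 1  # inferred sample size: residual df + predictors + intercept

explained   <- 100 * r2        # percent of variance explained
unexplained <- 100 * (1 - r2)  # percent left unexplained

# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

cat(sprintf("Explained: %.1f%%, unexplained: %.1f%%\n", explained, unexplained))
cat(sprintf("Adjusted R-squared: %.4f\n", adj_r2))
```

The adjusted value should agree with the printed Adjusted R-squared up to rounding.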
Yes, age is an insignificant variable: its p-value (0.7262) is far above 0.05, meaning there is no evidence that it has a significant effect on the babies’ weight.
####Is removing insignificant variables improving your model’s predictivity?
No, as the coefficients are barely changed when age is removed from the model.
#Multiple Linear Regression Model
mlr2 <- lm(bwt ~ gestation + parity + height + weight + smoke)
#Summary of Multiple Linear Regression Model
summary(mlr2)
##
## Call:
## lm(formula = bwt ~ gestation + parity + height + weight + smoke)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -57.716 -10.150  -0.159   9.689  51.620
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -80.71321   14.04465  -5.747 1.16e-08 ***
## gestation     0.44408    0.02907  15.276  < 2e-16 ***
## parity       -3.28762    1.06281  -3.093  0.00203 **
## height        1.15497    0.20473   5.641 2.11e-08 ***
## weight        0.04983    0.02503   1.991  0.04672 *
## smoke        -8.39390    0.95117  -8.825  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.82 on 1168 degrees of freedom
## Multiple R-squared: 0.2579, Adjusted R-squared: 0.2548
## F-statistic: 81.2 on 5 and 1168 DF, p-value: < 2.2e-16
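The overall F-statistics printed for the two models can be recovered from their R-squared values, which gives one way to compare the fits side by side. This is a rough sketch using the rounded R-squared values copied from the two summaries, so the results should only be close to the printed 72.88 and 81.2, not identical. The names `r2`, `f_stat`, and `p_val` are illustrative.

```r
# Multiple R-squared values copied from the two printed summaries
r2 <- c(mlr1 = 0.2378, mlr2 = 0.2579)
p  <- 5          # each model has five predictors
df_resid <- 1168 # residual degrees of freedom in both summaries

# Overall F statistic: explained variance per predictor over
# unexplained variance per residual degree of freedom
f_stat <- (r2 / p) / ((1 - r2) / df_resid)

# Upper-tail p-value of the F test that all slopes are zero
p_val <- pf(f_stat, p, df_resid, lower.tail = FALSE)

print(round(f_stat, 2))
print(p_val)
```

Because both models use the same number of predictors and observations, the model with the larger R-squared also has the larger F-statistic and the smaller residual standard error.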
###Task 4: Estimating baby weight (0.25 points)
Use your best predicting model and the full model to estimate the weight of a first-born baby whose mother is 25 years old, never smoked, had a gestation of 286 days, is 62 inches tall, and weighs 93 pounds.
# bwt predicted by hand from the mlr1 coefficients for the baby in the task:
# gestation = 286, parity = 0 (first born), weight = 93, smoke = 0, age = 25
# (mlr1 has no height term, so the mother's height is not used here)
# Each coefficient is multiplied by the matching value for this one baby,
# in the order of the model formula: gestation, parity, weight, smoke, age.
bwt_predicted = coef(mlr1)[1] + coef(mlr1)[2]*286 + coef(mlr1)[3]*0 + coef(mlr1)[4]*93 + coef(mlr1)[5]*0 + coef(mlr1)[6]*25
# With the rounded coefficients printed above, this comes to roughly 122.57 ounces
print(unname(bwt_predicted))
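# Prediction with predict() and a 95% confidence interval for the mean response.
# Note: parity = 1 codes a baby that is NOT first born; use parity = 0 for the first-born baby in the task.
# The height value is ignored because mlr1 has no height term.
predict(mlr1, data.frame(parity = 1, age = 25, smoke = 0, gestation = 286, height = 62, weight = 93), interval = "confidence")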
## fit lwr upr
## 1 119.6945 117.2246 122.1644
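For the first-born baby described in Task 4 (parity = 0), a point estimate can also be worked out by hand from the two printed coefficient tables. The sketch below copies the rounded coefficients from the summaries instead of calling `coef()` on the fitted objects, so it runs without `babies.csv`; the vector names (`coef_mlr1`, `coef_mlr2`, `baby`) are illustrative, and the results may differ slightly in the last decimals from what `predict()` would return.

```r
# Coefficients copied from the printed summaries (rounded to five decimals)
coef_mlr1 <- c(intercept = -17.04294, gestation = 0.45396, parity = -2.87792,
               weight = 0.11338, smoke = -8.12808, age = -0.03042)
coef_mlr2 <- c(intercept = -80.71321, gestation = 0.44408, parity = -3.28762,
               height = 1.15497, weight = 0.04983, smoke = -8.39390)

# The baby described in Task 4; parity = 0 codes a first-born baby,
# and intercept = 1 lets the intercept enter the sum like any other term
baby <- c(intercept = 1, gestation = 286, parity = 0, height = 62,
          weight = 93, smoke = 0, age = 25)

# Linear predictor: sum of coefficient * value over each model's own terms
pred_mlr1 <- sum(coef_mlr1 * baby[names(coef_mlr1)])
pred_mlr2 <- sum(coef_mlr2 * baby[names(coef_mlr2)])

print(round(c(mlr1 = pred_mlr1, mlr2 = pred_mlr2), 2))
```

The fifth row of the `head(myData)` output has the same gestation, parity, height, weight, smoking status, and age as this baby, with a recorded birth weight of 136 ounces, which gives one observed value to compare the two estimates against.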