DATA_606 Final Presentation

Bikram Barua

12/7/2021

Abstract

From high tides to crazy moods, the full moon is believed to cause a number of things. As the full moon approaches, full-term pregnant women everywhere prepare, hopefully, to go into labor. Is this an old wives’ tale, or something mysterious that we hear plenty about but don’t yet understand?

There is no established evidence for such a phenomenon. This study examines US birth data and attempts to link it to full moon dates to see whether there is any correlation. We first identify suitable data sources, then filter, format and group the data into appropriate categories for the study.

The prepared data will then be analyzed with statistical methods to test the myth and assess whether the belief has any rational basis. Many cultures around the world hold strong beliefs in this link, and the folklore stretches back a very long time.

Introduction

Myth or Fact: Do More Women Go into Labor During a Full Moon?

Compare the birth counts on regular days with the full moon days.
Do you see any difference? Can we draw any inference based on the data?

Null Hypothesis: The full moon has no effect on women going into labor; the mean number of births on full moon days is the same as on regular days.

Alternate Hypothesis: The full moon causes women to go into labor; there are more births on full moon days than on regular days.

Reference article on this topic:
Ref: https://www.dukehealth.org/blog/myth-or-fact-more-women-go-labor-during-full-moon

US births from 2000-2014, data source: fivethirtyeight.com

Time series of full moon occurrences, data source: Kaggle.

US Birth Data

year month date_of_month day_of_week births
2000 1 1 6 9083
2000 1 2 7 8006
2000 1 3 1 11363
2000 1 4 2 13032
2000 1 5 3 12558
2000 1 6 4 12466
2000 1 7 5 12516
2000 1 8 6 8934
2000 1 9 7 7949
2000 1 10 1 11668
2000 1 11 2 12611
2000 1 12 3 12398
2000 1 13 4 11815
2000 1 14 5 12180
2000 1 15 6 8525
2000 1 16 7 7657
2000 1 17 1 10824
2000 1 18 2 12350
2000 1 19 3 12405
2000 1 20 4 12506

Full Moon Days

Day Date
Monday 15 January 1900
Wednesday 14 February 1900
Friday 16 March 1900
Sunday 15 April 1900
Monday 14 May 1900
Wednesday 13 June 1900
Thursday 12 July 1900
Friday 10 August 1900
Sunday 9 September 1900
Monday 8 October 1900
Tuesday 6 November 1900
Thursday 6 December 1900
Saturday 5 January 1901
Sunday 3 February 1901
Tuesday 5 March 1901
Thursday 4 April 1901
Friday 3 May 1901
Sunday 2 June 1901
Tuesday 2 July 1901
Wednesday 31 July 1901

Format Full Moon Data

Split Date description into Date, Month and Year

FM_Day FM_Full_Date FM_Date FM_Month FM_Year
Monday 15 January 1900 15 1 1900
Wednesday 14 February 1900 14 2 1900
Friday 16 March 1900 16 3 1900
Sunday 15 April 1900 15 4 1900
Monday 14 May 1900 14 5 1900
Wednesday 13 June 1900 13 6 1900
Thursday 12 July 1900 12 7 1900
Friday 10 August 1900 10 8 1900
Sunday 9 September 1900 9 9 1900
Monday 8 October 1900 8 10 1900
Tuesday 6 November 1900 6 11 1900
Thursday 6 December 1900 6 12 1900
Saturday 5 January 1901 5 1 1901
Sunday 3 February 1901 3 2 1901
Tuesday 5 March 1901 5 3 1901
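
The split shown above could be done along these lines in base R. This is a minimal sketch, not necessarily the code used for the slides: the data frame name `full_moon` is an assumption, and the two rows are copied from the Full Moon Days table.

```r
# Two rows copied from the Full Moon Days table; the data frame name is illustrative.
full_moon <- data.frame(
  FM_Day = c("Monday", "Wednesday"),
  FM_Full_Date = c("15 January 1900", "14 February 1900"),
  stringsAsFactors = FALSE
)

# Split "15 January 1900" into its three space-separated pieces.
parts <- strsplit(full_moon$FM_Full_Date, " ", fixed = TRUE)

full_moon$FM_Date <- as.integer(sapply(parts, `[`, 1))
# month.name is base R's vector of English month names, so match() yields 1..12.
full_moon$FM_Month <- match(sapply(parts, `[`, 2), month.name)
full_moon$FM_Year <- as.integer(sapply(parts, `[`, 3))

print(full_moon)
```

Matching against `month.name` avoids locale-dependent date parsing, which can fail on machines whose month names are not English.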

Highest Birth Count By Year (Top 6)

Group By Year, Sum and Sort

year sum
2007 4380784
2006 4335154
2008 4310737
2005 4211941
2009 4190991
2004 4186863
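
A group-by-year, sum and sort step like the one above might look as follows in base R. This is a sketch under assumptions: the data frame name `us_births` is illustrative, and it holds only five daily rows copied from the tables in this document, so the sums are not yearly totals.

```r
# Five daily rows copied from this document (three from January 2000,
# two from early 2001); NOT the full data set.
us_births <- data.frame(
  year = c(2000, 2000, 2000, 2001, 2001),
  births = c(9083, 8006, 11363, 12386, 12390)
)

# Group by year and sum the daily counts.
by_year <- aggregate(births ~ year, data = us_births, FUN = sum)
names(by_year)[2] <- "sum"

# Sort from highest to lowest and keep the top 6.
by_year <- by_year[order(by_year$sum, decreasing = TRUE), ]
head(by_year, 6)
```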

2007 birth count by month

month mean
1 11629.71
2 11843.46
3 11802.84
4 11432.70
5 11854.55
6 12116.20
7 12414.87
8 12785.65
9 12425.80
10 12105.55
11 11977.83
12 11619.45
## [1] "Total Births(mean) in 2007 is 144008.61"
## [1] "Mean Birth per day in 2007 is 12000.72"
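
The monthly means above can be produced with `aggregate()`. The sketch below uses three daily rows copied from the 2007 tables in this document (the name `births_2007` is an assumption), so the results are not the real monthly means. It also shows that the mean of monthly means, which is what the "Mean Birth per day" line reports, is not in general the same as the mean over all days.

```r
# Three daily rows from 2007 copied from this document; illustrative only.
births_2007 <- data.frame(
  month = c(6, 6, 7),
  births = c(13714, 9016, 13439)
)

# Mean births per day within each month.
monthly <- aggregate(births ~ month, data = births_2007, FUN = mean)
names(monthly)[2] <- "mean"

# Mean of the monthly means: every month gets equal weight.
mean_of_monthly_means <- mean(monthly$mean)

# Mean over all days: every day gets equal weight.
daily_mean <- mean(births_2007$births)

print(monthly)
print(c(mean_of_monthly_means = mean_of_monthly_means, daily_mean = daily_mean))
```

With a full year of data the two values are close, because months have similar lengths, but they are not identical.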

Full Moon Births in 2007

year month date_of_month day_of_week births FM_Day FM_Full_Date
2007 1 3 3 13687 Wednesday 3 January 2007
2007 2 2 5 13305 Friday 2 February 2007
2007 3 4 7 7566 Sunday 4 March 2007
2007 4 2 1 12450 Monday 2 April 2007
2007 5 2 3 13325 Wednesday 2 May 2007
2007 6 1 5 13714 Friday 1 June 2007
2007 6 30 6 9016 Saturday 30 June 2007
2007 7 30 1 13439 Monday 30 July 2007
2007 8 28 2 14959 Tuesday 28 August 2007
2007 9 26 3 14417 Wednesday 26 September 2007
2007 10 26 5 13223 Friday 26 October 2007
2007 11 24 6 8380 Saturday 24 November 2007
2007 12 24 1 7727 Monday 24 December 2007
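## [1] "Mean Birth per full moon day in 2007 is 12934.00"

Note: the 13 full moon days listed above total 155208 births, so the mean per full moon day is 155208 / 13 = 11939.08, which is the value the t-test below reports as "mean of y". The printed 12934.00 equals 155208 / 12, that is, the total divided by the number of months rather than the number of full moon days.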
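
One way to pick out the full moon births is an inner join between the daily birth table and the formatted full moon table. The sketch below assumes data frames named `births` and `full_moon` with the column names shown in this document; the rows are copied from the January 2000 entries.

```r
# Daily births for 19-21 January 2000, copied from this document.
births <- data.frame(
  year = 2000, month = 1,
  date_of_month = c(19, 20, 21),
  day_of_week = c(3, 4, 5),
  births = c(12405, 12506, 11953)
)

# Two formatted full moon rows, copied from this document.
full_moon <- data.frame(
  FM_Day = c("Friday", "Saturday"),
  FM_Full_Date = c("21 January 2000", "19 February 2000"),
  FM_Date = c(21, 19), FM_Month = c(1, 2), FM_Year = c(2000, 2000)
)

# Inner join: keep only the days that appear in both tables.
fm_births <- merge(
  births, full_moon,
  by.x = c("year", "month", "date_of_month"),
  by.y = c("FM_Year", "FM_Month", "FM_Date")
)

# Average over the number of full moon DAYS (rows), not the number of months.
mean_fm <- sum(fm_births$births) / nrow(fm_births)
print(fm_births)
```

Dividing by `nrow(fm_births)` matters because some months contain two full moons, as June 2007 does in the table above.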

Merge and Compare the births counts (Regular vs Full Moon)

One of the most common statistical tasks is to compare an outcome between two groups.

month mean year date_of_month day_of_week births FM_Day FM_Full_Date
1 11629.71 2007 3 3 13687 Wednesday 3 January 2007
2 11843.46 2007 2 5 13305 Friday 2 February 2007
3 11802.84 2007 4 7 7566 Sunday 4 March 2007
4 11432.70 2007 2 1 12450 Monday 2 April 2007
5 11854.55 2007 2 3 13325 Wednesday 2 May 2007
6 12116.20 2007 1 5 13714 Friday 1 June 2007
6 12116.20 2007 30 6 9016 Saturday 30 June 2007
7 12414.87 2007 30 1 13439 Monday 30 July 2007
8 12785.65 2007 28 2 14959 Tuesday 28 August 2007
9 12425.80 2007 26 3 14417 Wednesday 26 September 2007
10 12105.55 2007 26 5 13223 Friday 26 October 2007
11 11977.83 2007 24 6 8380 Saturday 24 November 2007
12 11619.45 2007 24 1 7727 Monday 24 December 2007
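
A merge by month like the one above can be sketched as follows. The data frame names are assumptions; the values are the May to July 2007 rows from the tables in this document.

```r
# Monthly mean births for May-July 2007, copied from this document.
monthly_mean <- data.frame(
  month = c(5, 6, 7),
  mean = c(11854.55, 12116.20, 12414.87)
)

# Full moon days in the same months; June 2007 has two.
fm_births <- data.frame(
  month = c(5, 6, 6, 7),
  date_of_month = c(2, 1, 30, 30),
  births = c(13325, 13714, 9016, 13439)
)

# Each full moon row picks up the mean of its month.
merged <- merge(monthly_mean, fm_births, by = "month")
print(merged)
```

Because June has two full moons, its monthly mean appears twice in the merged result, which is why the 2007 table above has 13 rows rather than 12.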

Plot 2007 regular births and full moon births

T-Test Analysis

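# Welch two-sample t-test: monthly mean births (x) vs. births on full moon days (y).
# The default t.test(x, y) method takes no `data` argument, so it is omitted here.
birth.t.test <- t.test(merge_by_month_data_2007$mean, merge_by_month_data_2007$births)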
birth.t.test
## 
##  Welch Two Sample t-test
## 
## data:  merge_by_month_data_2007$mean and merge_by_month_data_2007$births
## t = 0.093272, df = 12.465, p-value = 0.9272
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1570.097  1711.144
## sample estimates:
## mean of x mean of y 
##  12009.60  11939.08
# p-value
birth.t.test$p.value
## [1] 0.9271695
# confidence interval
birth.t.test$conf.int
## [1] -1570.097  1711.144
## attr(,"conf.level")
## [1] 0.95
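
The 2007 test can be checked directly from the merged table printed earlier. This sketch rebuilds the two columns as plain vectors (the vector names are assumptions) and reruns the Welch test.

```r
# Column "mean" from the merged 2007 table (June appears twice).
monthly_mean <- c(11629.71, 11843.46, 11802.84, 11432.70, 11854.55,
                  12116.20, 12116.20, 12414.87, 12785.65, 12425.80,
                  12105.55, 11977.83, 11619.45)

# Column "births" from the same table: one value per full moon day.
fm_births <- c(13687, 13305, 7566, 12450, 13325, 13714, 9016,
               13439, 14959, 14417, 13223, 8380, 7727)

# Welch two-sample t-test (unequal variances), two-sided by default.
res <- t.test(monthly_mean, fm_births)

print(res$estimate)  # sample means of x and y
print(res$p.value)
```

Note that x holds monthly averages while y holds single-day counts, so the two samples have very different variances; the Welch test allows for unequal variances, but the comparison is still between averaged and unaveraged values.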

A p-value of 0.927 is far above the usual 0.05 threshold, so the evidence is not strong enough to suggest an effect exists. With only 13 full moon days in 2007, it is also possible that the sample is too small for the test to detect a real effect.

Let’s increase the sample size and run the analysis again.

Birth Data Sample (2000-2013)

Birth count by month

year month mean
2000 1 10894.81
2000 2 11174.00
2000 3 11220.13
2000 4 10778.57
2000 5 11224.19
2000 6 11596.63
## [1] "Sum of Births(mean) in 2000-2013 is 1911016.25"
## [1] "Mean Birth per day in 2000-2013 is 11375.10"

Full Moon Births (2000-2013)

year month date_of_month day_of_week births FM_Day FM_Full_Date
2000 1 21 5 11953 Friday 21 January 2000
2000 2 19 6 8861 Saturday 19 February 2000
2000 3 20 1 11595 Monday 20 March 2000
2000 4 18 2 12783 Tuesday 18 April 2000
2000 5 18 4 12580 Thursday 18 May 2000
2000 6 17 6 9143 Saturday 17 June 2000
2000 7 16 7 8423 Sunday 16 July 2000
2000 8 15 2 13374 Tuesday 15 August 2000
2000 9 13 3 13299 Wednesday 13 September 2000
2000 10 13 5 11723 Friday 13 October 2000
2000 11 11 6 9318 Saturday 11 November 2000
2000 12 11 1 11811 Monday 11 December 2000
2001 1 9 2 12386 Tuesday 9 January 2001
2001 2 8 4 12390 Thursday 8 February 2001
2001 3 9 5 12461 Friday 9 March 2001
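## [1] "Mean Birth per full moon day in 2000-2013 is 11784.91"

Note: 2000-2013 contains 173 full moon days across 168 months. The t-test below reports 11444.31 as the mean of the full moon births ("mean of y"); the printed 11784.91 appears to be the same total divided by 168 months instead of 173 full moon days.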

Merge and Compare data

One of the most common statistical tasks is to compare an outcome between two groups.

year month mean date_of_month day_of_week births FM_Day FM_Full_Date
2000 1 10894.81 21 5 11953 Friday 21 January 2000
2000 2 11174.00 19 6 8861 Saturday 19 February 2000
2000 3 11220.13 20 1 11595 Monday 20 March 2000
2000 4 10778.57 18 2 12783 Tuesday 18 April 2000
2000 5 11224.19 18 4 12580 Thursday 18 May 2000
2000 6 11596.63 17 6 9143 Saturday 17 June 2000
2000 7 11488.10 16 7 8423 Sunday 16 July 2000
2000 8 11867.52 15 2 13374 Tuesday 15 August 2000
2000 9 11866.03 13 3 13299 Wednesday 13 September 2000
2000 10 11366.71 13 5 11723 Friday 13 October 2000
2000 11 11416.47 11 6 9318 Saturday 11 November 2000
2000 12 11158.58 11 1 11811 Monday 11 December 2000
2001 1 11044.48 9 2 12386 Tuesday 9 January 2001
2001 2 11064.96 8 4 12390 Thursday 8 February 2001
2001 3 11140.16 9 5 12461 Friday 9 March 2001
2001 4 10996.13 8 7 7718 Sunday 8 April 2001
2001 5 11312.55 7 1 11635 Monday 7 May 2001
2001 6 11253.53 6 3 12804 Wednesday 6 June 2001
2001 7 11550.94 5 4 12531 Thursday 5 July 2001
2001 8 11915.58 4 6 9143 Saturday 4 August 2001
2001 9 11666.27 2 7 8239 Sunday 2 September 2001
2001 10 11350.06 2 2 13523 Tuesday 2 October 2001
2001 11 11042.53 1 4 12879 Thursday 1 November 2001
2001 11 11042.53 30 5 12019 Friday 30 November 2001
2001 12 10795.58 30 7 7840 Sunday 30 December 2001
2002 1 10886.19 28 1 11354 Monday 28 January 2002
2002 2 11068.21 27 3 12303 Wednesday 27 February 2002
2002 3 10882.23 28 4 12454 Thursday 28 March 2002
2002 4 11009.63 27 6 8243 Saturday 27 April 2002
2002 5 11124.23 26 7 7282 Sunday 26 May 2002
2002 6 11114.40 24 1 11972 Monday 24 June 2002
2002 7 11740.65 24 3 13231 Wednesday 24 July 2002
2002 8 11819.65 23 5 13133 Friday 23 August 2002
2002 9 11856.70 21 6 9613 Saturday 21 September 2002
2002 10 11378.65 21 1 11690 Monday 21 October 2002
2002 11 10845.97 20 3 12574 Wednesday 20 November 2002
2002 12 11026.55 19 4 13665 Thursday 19 December 2002
2003 1 10843.97 18 6 8431 Saturday 18 January 2003
2003 2 11171.04 17 1 10707 Monday 17 February 2003
2003 3 11050.77 18 2 12960 Tuesday 18 March 2003
2003 4 11189.70 16 3 12947 Wednesday 16 April 2003
2003 5 11365.00 16 5 12840 Friday 16 May 2003
2003 6 11429.47 14 6 8696 Saturday 14 June 2003
2003 7 11943.10 13 7 7946 Sunday 13 July 2003
2003 8 11820.03 12 2 13995 Tuesday 12 August 2003
2003 9 12210.70 10 3 14180 Wednesday 10 September 2003
2003 10 11633.35 10 5 13306 Friday 10 October 2003
2003 11 10879.60 9 7 7481 Sunday 9 November 2003
2003 12 11311.10 8 1 11708 Monday 8 December 2003
2004 1 10937.97 7 3 12716 Wednesday 7 January 2004
2004 2 11080.97 6 5 12371 Friday 6 February 2004
2004 3 11367.06 7 7 7304 Sunday 7 March 2004
2004 4 11302.63 5 1 12013 Monday 5 April 2004
2004 5 11072.52 4 2 12824 Tuesday 4 May 2004
2004 6 11695.87 3 4 13644 Thursday 3 June 2004
2004 7 11786.45 2 5 13271 Friday 2 July 2004
2004 7 11786.45 31 6 8858 Saturday 31 July 2004
2004 8 11672.10 30 1 12342 Monday 30 August 2004
2004 9 12096.50 28 2 13882 Tuesday 28 September 2004
2004 10 11453.16 28 4 13287 Thursday 28 October 2004
2004 11 11411.97 26 5 9370 Friday 26 November 2004
2004 12 11398.03 26 7 7087 Sunday 26 December 2004
2005 1 10886.39 25 2 13295 Tuesday 25 January 2005
2005 2 11246.07 24 4 12733 Thursday 24 February 2005
2005 3 11455.52 25 5 12111 Friday 25 March 2005
2005 4 11268.67 24 7 7267 Sunday 24 April 2005
2005 5 11346.32 23 1 12445 Monday 23 May 2005
2005 6 11885.87 22 3 13342 Wednesday 22 June 2005
2005 7 11716.00 21 4 13870 Thursday 21 July 2005
2005 8 12144.10 19 5 13365 Friday 19 August 2005
2005 9 12345.13 18 7 8262 Sunday 18 September 2005
2005 10 11327.77 17 1 12230 Monday 17 October 2005
2005 11 11406.07 16 3 13113 Wednesday 16 November 2005
2005 12 11442.58 15 4 13586 Thursday 15 December 2005
2006 1 11160.52 14 6 8628 Saturday 14 January 2006
2006 2 11586.96 13 1 11955 Monday 13 February 2006
2006 3 11694.19 15 3 13183 Wednesday 15 March 2006
2006 4 11152.90 13 4 12853 Thursday 13 April 2006
2006 5 11641.97 13 6 8260 Saturday 13 May 2006
2006 6 12120.50 11 7 7653 Sunday 11 June 2006
2006 7 12046.61 11 2 14413 Tuesday 11 July 2006
2006 8 12716.10 9 3 14277 Wednesday 9 August 2006
2006 9 12706.80 7 4 15454 Thursday 7 September 2006
2006 10 12054.35 7 6 9015 Saturday 7 October 2006
2006 11 11932.53 5 7 7957 Sunday 5 November 2006
2006 12 11697.13 5 2 13859 Tuesday 5 December 2006
2007 1 11629.71 3 3 13687 Wednesday 3 January 2007
2007 2 11843.46 2 5 13305 Friday 2 February 2007
2007 3 11802.84 4 7 7566 Sunday 4 March 2007
2007 4 11432.70 2 1 12450 Monday 2 April 2007
2007 5 11854.55 2 3 13325 Wednesday 2 May 2007
2007 6 12116.20 1 5 13714 Friday 1 June 2007
2007 6 12116.20 30 6 9016 Saturday 30 June 2007
2007 7 12414.87 30 1 13439 Monday 30 July 2007
2007 8 12785.65 28 2 14959 Tuesday 28 August 2007
2007 9 12425.80 26 3 14417 Wednesday 26 September 2007
2007 10 12105.55 26 5 13223 Friday 26 October 2007
2007 11 11977.83 24 6 8380 Saturday 24 November 2007
2007 12 11619.45 24 1 7727 Monday 24 December 2007
2008 1 11673.16 22 2 13297 Tuesday 22 January 2008
2008 2 11839.79 21 4 13507 Thursday 21 February 2008
2008 3 11467.23 21 5 12606 Friday 21 March 2008
2008 4 11711.47 20 7 7504 Sunday 20 April 2008
2008 5 11607.58 20 2 13912 Tuesday 20 May 2008
2008 6 11781.00 18 3 13638 Wednesday 18 June 2008
2008 7 12280.87 18 5 13542 Friday 18 July 2008
2008 8 12227.29 16 6 9180 Saturday 16 August 2008
2008 9 12454.17 15 1 13775 Monday 15 September 2008
2008 10 11723.23 14 2 13396 Tuesday 14 October 2008
2008 11 10963.50 13 4 12541 Thursday 13 November 2008
2008 12 11603.84 12 5 12903 Friday 12 December 2008
2009 1 11063.71 11 7 7122 Sunday 11 January 2009
2009 2 11468.64 9 1 12484 Monday 9 February 2009
2009 3 11373.26 11 3 12999 Wednesday 11 March 2009
2009 4 11394.07 9 4 13278 Thursday 9 April 2009
2009 5 11284.87 9 6 8180 Saturday 9 May 2009
2009 6 11720.67 7 7 7221 Sunday 7 June 2009
2009 7 12048.10 7 2 14208 Tuesday 7 July 2009
2009 8 11767.58 6 4 13377 Thursday 6 August 2009
2009 9 12256.47 4 5 13622 Friday 4 September 2009
2009 10 11389.00 4 7 7443 Sunday 4 October 2009
2009 11 10841.57 2 1 12212 Monday 2 November 2009
2009 12 11185.94 2 3 13267 Wednesday 2 December 2009
2009 12 11185.94 31 4 11667 Thursday 31 December 2009
2010 1 10584.52 30 6 8035 Saturday 30 January 2010
2010 2 10942.61 28 7 6895 Sunday 28 February 2010
2010 3 11078.97 30 2 12868 Tuesday 30 March 2010
2010 4 10971.23 28 3 12500 Wednesday 28 April 2010
2010 5 10730.68 28 5 12519 Friday 28 May 2010
2010 6 11298.87 26 6 8294 Saturday 26 June 2010
2010 7 11278.97 26 1 12262 Monday 26 July 2010
2010 8 11442.81 24 2 13226 Tuesday 24 August 2010
2010 9 11862.90 23 4 13717 Thursday 23 September 2010
2010 10 11024.74 23 6 8033 Saturday 23 October 2010
2010 11 11032.80 21 7 7285 Sunday 21 November 2010
2010 12 11104.74 21 2 14255 Tuesday 21 December 2010
2011 1 10478.87 19 3 12271 Wednesday 19 January 2011
2011 2 10783.11 18 5 12328 Friday 18 February 2011
2011 3 10788.35 19 6 7786 Saturday 19 March 2011
2011 4 10574.17 18 1 11681 Monday 18 April 2011
2011 5 10666.68 17 2 12819 Tuesday 17 May 2011
2011 6 11387.90 15 3 12681 Wednesday 15 June 2011
2011 7 11285.19 15 5 13097 Friday 15 July 2011
2011 8 11748.19 13 6 8448 Saturday 13 August 2011
2011 9 11688.73 12 1 12770 Monday 12 September 2011
2011 10 10746.19 12 3 12581 Wednesday 12 October 2011
2011 11 10870.23 10 4 11978 Thursday 10 November 2011
2011 12 10717.16 10 6 7658 Saturday 10 December 2011
2012 1 10343.90 9 1 11248 Monday 9 January 2012
2012 2 10629.72 7 2 12310 Tuesday 7 February 2012
2012 3 10584.61 8 4 11985 Thursday 8 March 2012
2012 4 10337.27 6 5 10955 Friday 6 April 2012
2012 5 10771.23 6 7 6868 Sunday 6 May 2012
2012 6 11026.17 4 1 12011 Monday 4 June 2012
2012 7 11337.81 3 2 13752 Tuesday 3 July 2012
2012 8 11790.35 2 4 13006 Thursday 2 August 2012
2012 8 11790.35 31 5 13287 Friday 31 August 2012
2012 9 11477.77 30 7 7607 Sunday 30 September 2012
2012 10 11289.45 29 1 11957 Monday 29 October 2012
2012 11 10976.87 28 3 12614 Wednesday 28 November 2012
2012 12 10594.39 28 5 13888 Friday 28 December 2012
2013 1 10559.74 27 7 6916 Sunday 27 January 2013
2013 2 10531.50 25 1 11358 Monday 25 February 2013
2013 3 10439.65 27 3 12204 Wednesday 27 March 2013
2013 4 10494.90 25 4 11620 Thursday 25 April 2013
2013 5 10737.68 25 6 8156 Saturday 25 May 2013
2013 6 10762.27 23 7 7435 Sunday 23 June 2013
2013 7 11370.81 22 1 12544 Monday 22 July 2013
2013 8 11512.23 21 3 12903 Wednesday 21 August 2013
2013 9 11389.20 19 4 13333 Thursday 19 September 2013
2013 10 11105.97 19 6 7995 Saturday 19 October 2013
2013 11 10738.57 17 7 7348 Sunday 17 November 2013
2013 12 10948.35 17 2 13252 Tuesday 17 December 2013

T-Test Analysis

## 
##  Welch Two Sample t-test
## 
## data:  merge_by_monthyear_data_bigset$mean and merge_by_monthyear_data_bigset$births
## t = -0.34611, df = 187.91, p-value = 0.7297
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -423.1707  296.8438
## sample estimates:
## mean of x mean of y 
##  11381.14  11444.31

T-Test Results

# p-value
birth.2t.test$p.value
## [1] 0.7296505
# confidence interval
birth.2t.test$conf.int
## [1] -423.1707  296.8438
## attr(,"conf.level")
## [1] 0.95

The t-test gives a p-value of 0.730, far above the usual 0.05 significance level, and the 95 percent confidence interval (-423.17 to 296.84) includes 0. This is weak evidence against the null hypothesis, so we fail to reject it.

Data Visualization

Histogram of the average births

Histogram of the full moon births

## Warning: Removed 153 rows containing non-finite values (stat_bin).
## Warning: Removed 2 rows containing missing values (geom_bar).

Scatterplot of average births by month

Scatterplot of Full Moon births by month

Scatterplot of births by day of the week

Scatterplot of full moon births by day of the week
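
The day-of-week pattern can also be summarized numerically. The sketch below uses the twelve full moon days of 2000 from the table in this document; the name `fm_2000` and the weekday/weekend grouping are assumptions added for illustration. In this data set `day_of_week` runs from 1 (Monday) to 7 (Sunday).

```r
# Full moon days in 2000, copied from this document.
fm_2000 <- data.frame(
  day_of_week = c(5, 6, 1, 2, 4, 6, 7, 2, 3, 5, 6, 1),
  births = c(11953, 8861, 11595, 12783, 12580, 9143,
             8423, 13374, 13299, 11723, 9318, 11811)
)

# Label Saturday (6) and Sunday (7) as weekend days.
fm_2000$day_type <- ifelse(fm_2000$day_of_week >= 6, "weekend", "weekday")

# Mean births per group.
type_means <- tapply(fm_2000$births, fm_2000$day_type, mean)
print(type_means)
```

Calling `plot(fm_2000$day_of_week, fm_2000$births)` on the same data frame would draw a basic version of the scatterplot.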

Conclusion

Based on the statistical analysis, there is no evidence that more births occur on full moon days. Both t-tests returned large p-values (0.927 for 2007 and 0.730 for 2000-2013), so we fail to reject the Null Hypothesis. This does not prove the Null Hypothesis true; it means the data do not support the Alternate Hypothesis.

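The average birth count in both categories is between 11000 and 12000. However, the scatterplot of full moon births by month shows high variance around the mean: counts range from roughly 7000 to 15000 and cluster at both ends of that range. This suggests that other variables raise or lower the number of births on a given full moon day. Day of week is a likely candidate: in the merged table, full moons that fall on a Saturday or Sunday (day_of_week 6 or 7) have between 6868 and 9613 births, while most weekday full moons have more than 11000. The regular daily data show the same split; in January 2000, weekend days had 7657 to 9083 births and weekdays had 10824 to 13032.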

This could be studied further with other statistical methods and additional variables, to test this cultural myth more rigorously and better understand its nature.
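
As one possible direction, a linear model with day of week as a control would compare each full moon day with other days that fall on the same weekday. The sketch below is an assumption about how such a follow-up might look, not part of the analysis presented here. It uses only the January 2000 rows printed in this document (days 1-20 plus the full moon on day 21), so it shows the mechanics and cannot support any conclusion.

```r
# Daily births for 1-21 January 2000, copied from this document.
jan_2000 <- data.frame(
  date_of_month = 1:21,
  day_of_week = c(6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5,
                  6, 7, 1, 2, 3, 4, 5),
  births = c(9083, 8006, 11363, 13032, 12558, 12466, 12516,
             8934, 7949, 11668, 12611, 12398, 11815, 12180,
             8525, 7657, 10824, 12350, 12405, 12506, 11953)
)

# Indicator for the full moon on 21 January 2000.
jan_2000$full_moon <- as.integer(jan_2000$date_of_month == 21)

# Births explained by weekday (as a factor) plus the full moon indicator.
fit <- lm(births ~ factor(day_of_week) + full_moon, data = jan_2000)

# Estimated full moon effect after accounting for day of week.
print(coef(fit)[["full_moon"]])
```

With a single full moon in the sample, the coefficient is just that day's count minus the average of the other Fridays; fitting the same model to all of 2000-2013 would give an estimate with a usable standard error.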

Questions & Answers