Dates that are Prime Numbers

I have been playing around recently with various number sequences. One of the best-known number sequences is, of course, the primes:


2   3   5   7   11   13   17   19   23   29   31   …  


It’s quite simple to make a function to check if a number is prime in R:

is.prime <- function(n) n %in% c(2L, 3L) || (n > 3 && all(n %% 2L:floor(sqrt(n)) != 0))

is.prime(28)
## [1] FALSE
is.prime(29)
## [1] TRUE




This got me thinking about dates, and in particular which dates might be prime when written as a number. The major problem with this, though, is how to write the dates! They could be written in American format (month-day-year) or in Rest-of-the-World format (day-month-year). Further, the year could be written with two or four digits, and days and months could be written with or without leading zeros.

For instance, depending on the format used, 8th May 1957 could potentially yield the following numbers:

80557   8051957   50857   5081957   8557   851957   5857   581957


We can check which of these are primes…

##   80557 8051957   50857 5081957    8557  851957    5857  581957 
##    TRUE   FALSE    TRUE   FALSE   FALSE    TRUE    TRUE   FALSE
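The code for this check isn’t shown above; a minimal sketch of one way to do it is below. `is.prime()` is redefined here (with a small guard so that 3 is handled correctly) so that the block runs on its own, and `nums` and `res` are just illustrative names.

```r
# Trial-division primality check for a single number
is.prime <- function(n) n %in% c(2, 3) || (n > 3 && all(n %% 2:floor(sqrt(n)) != 0))

# The eight numbers for 8th May 1957
nums <- c(80557, 8051957, 50857, 5081957, 8557, 851957, 5857, 581957)

# Name each element by its own value so the result is labelled
res <- sapply(setNames(nums, nums), is.prime)
res
```

Because `is.prime()` works on one number at a time, `sapply()` is used to apply it to each element in turn.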


Before going further, it’s pretty clear that there are many dates that we can automatically rule out as having no chance of being prime numbers. Because the year comes last, the number always ends in the last digit of the year: dates in years that end in 0 or 5 give numbers divisible by 5, and dates in even-numbered years give numbers divisible by 2. This notwithstanding, what dates might actually be prime numbers? Are there any dates which are prime numbers for all of the possible date formats? And which day-month combination is the most likely to be a prime number?
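To put a rough number on that, here is a minimal sketch (the variable names are just illustrative) that counts how many years from 1000 to 2499 end in 1, 3, 7 or 9, the only last digits a prime with more than one digit can have:

```r
# Years covered by the analysis (2500 only contributes a single day)
years <- 1000:2499

# A year is a candidate only if its last digit is 1, 3, 7 or 9
candidate_years <- (years %% 10) %in% c(1, 3, 7, 9)

sum(candidate_years)   # 600
mean(candidate_years)  # 0.4
```

So before testing a single number, 60% of the years can already be ruled out.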




Converting Dates to Numbers

First, we need a series of dates. I thought that I’d get a large series, so I decided to look at 15 centuries’ worth of dates:

myd <- seq(as.Date("1000/1/1"), as.Date("2500/1/1"), "days")  
length(myd)  #547,865 days!
## [1] 547865
myd[349665]
## [1] "1957-05-08"
tail(myd)
## [1] "2499-12-27" "2499-12-28" "2499-12-29" "2499-12-30" "2499-12-31"
## [6] "2500-01-01"


Next, I need to get dates into all possible formats. This requires using various format options for dates as well as using various regular expressions. I’ll go through each one by one and use 8th May 1957 as an example date to illustrate.

The first one is the Rest-of-the-World date format (day before month), using 2 digits for the year and keeping the leading zeros for months.

myd1 <- as.numeric(as.character(format(myd, format='%d%m%y')))
myd1[349665]
## [1] 80557


This is the USA date format (day after month), using 2 digits for the year and keeping the leading zeros for days.

myd2 <- as.numeric(as.character(format(myd, format='%m%d%y')))
myd2[349665]
## [1] 50857


This is Rest-of-the-World date format (day before month), using 2 digits for the year but not keeping the leading zeros for months.

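# Drop the four-digit year, e.g. "1957-05-08" becomes "-05-08"
x1 <- substr(as.character(myd), 5, nchar(as.character(myd)))
# Swap to day-month and strip leading zeros, then append the 2-digit year.
# Caveat: ".." also consumes the first digit of a two-digit month (10, 11, 12)
myd3 <- paste0(sub("..0?(.+)-0?(.+)", "\\2\\1", x1), format(myd, "%y"))
myd3 <- as.numeric(as.character(myd3))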
myd3[349665]
## [1] 8557
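One thing to watch in the regular expression above: the leading `..` matches the hyphen plus the first digit of the month, so for October to December only the second month digit is kept (for example, `-11-03` becomes `31` rather than `311`). A regex-free sketch that keeps both digits is below. `dmy_number` and `year_fmt` are names made up here, not part of the original code, and its results will differ from `myd3` for those three months.

```r
# Hypothetical helper: glue day, month and year together with no leading
# zeros on the day or month. year_fmt is "%y" (2 digits) or "%Y" (4 digits).
dmy_number <- function(dates, year_fmt = "%y") {
  d <- as.integer(format(dates, "%d"))  # "08" -> 8
  m <- as.integer(format(dates, "%m"))  # "05" -> 5, "11" stays 11
  as.numeric(paste0(d, m, format(dates, year_fmt)))
}

dmy_number(as.Date("1957-05-08"))        # 8557
dmy_number(as.Date("1957-05-08"), "%Y")  # 851957
dmy_number(as.Date("1821-11-03"))        # 31121
```

Since `format()` and `paste0()` are vectorised, the same function can be applied to the whole `myd` vector at once.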


This is USA date format (day after month), using 2 digits for the year but not keeping the leading zeros for days.

ex <- format(myd, format='%m%e%y')
myd4 <- as.numeric(as.character(gsub("[[:space:]]", "", ex)))
myd4[349665]
## [1] 5857


Now repeat the above, but use four digit years.

# Rest-of-the-World, leading zeros kept for months
myd5 <- as.numeric(as.character(format(myd, format='%d%m%Y')))

# USA, leading zeros kept for days
myd6 <- as.numeric(as.character(format(myd, format='%m%d%Y')))

# Rest-of-the-World, no leading zeros (same regex approach as myd3)
xx1 <- substr(as.character(myd), 5, nchar(as.character(myd)))
myd7 <- paste0(sub("..0?(.+)-0?(.+)", "\\2\\1", xx1), format(myd, "%Y"))
myd7 <- as.numeric(as.character(myd7))

# USA, no leading zeros for days
xex <- format(myd, format='%m%e%Y')
myd8 <- as.numeric(as.character(gsub("[[:space:]]", "", xex)))




Test whether Dates as Numbers are Primes

We now check which of these numbers are prime. There are faster ways of doing this!

myd1p <- unlist(lapply(myd1, is.prime))
myd2p <- unlist(lapply(myd2, is.prime))
myd3p <- unlist(lapply(myd3, is.prime))
myd4p <- unlist(lapply(myd4, is.prime))
myd5p <- unlist(lapply(myd5, is.prime))
myd6p <- unlist(lapply(myd6, is.prime))
myd7p <- unlist(lapply(myd7, is.prime))
myd8p <- unlist(lapply(myd8, is.prime))
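As one example of a faster approach, here is a sketch that tests a whole vector at once: it sieves the primes up to the square root of the largest number, then divides every number by each of those primes in a vectorised step. It assumes all inputs are positive whole numbers, and `sieve_primes` and `is_prime_vec` are illustrative names rather than anything from the original code.

```r
# Sieve of Eratosthenes: all primes up to `limit`
sieve_primes <- function(limit) {
  if (limit < 2) return(integer(0))
  is_p <- rep(TRUE, limit)
  is_p[1] <- FALSE
  if (limit >= 4) {
    for (i in 2:floor(sqrt(limit))) {
      # Cross out multiples of i, starting from i squared
      if (is_p[i]) is_p[seq(i * i, limit, by = i)] <- FALSE
    }
  }
  which(is_p)
}

# Vectorised primality test (assumes positive whole numbers)
is_prime_vec <- function(x) {
  out <- x >= 2
  for (p in sieve_primes(floor(sqrt(max(x))))) {
    # Anything divisible by p, other than p itself, is not prime
    out[x %% p == 0 & x != p] <- FALSE
  }
  out
}

is_prime_vec(c(28, 29))
```

With this in place, a call such as `myd1p <- is_prime_vec(myd1)` could stand in for the `lapply()` version. The loop runs once per small prime rather than once per date, which is where the saving comes from.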


We can now put these into a data.frame:

dateprimes <- data.frame(dates = myd, row = myd1p, usa = myd2p, row1 = myd3p, usa1 = myd4p, row2 = myd5p, usa2 = myd6p, row3 = myd7p, usa3 = myd8p)


We can also add up how many of these 8 different formats are primes…

dateprimes$total <- apply(dateprimes[,2:9], 1, sum)
dateprimes[349662:349668,]
##             dates   row   usa  row1  usa1  row2  usa2  row3  usa3 total
## 349662 1957-05-05 FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE     4
## 349663 1957-05-06 FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE     1
## 349664 1957-05-07 FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE     1
## 349665 1957-05-08  TRUE  TRUE FALSE  TRUE FALSE FALSE  TRUE FALSE     4
## 349666 1957-05-09 FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE     1
## 349667 1957-05-10 FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE     2
## 349668 1957-05-11  TRUE  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE     3

 

As we saw earlier, 8th May 1957 is prime in four of its eight date-number formats.





Exploring Prime Number Dates

Now we have this dataset, we can explore some features of which dates are primes. Firstly, let’s look at the distribution of ‘total’.

table(dateprimes$total)
## 
##      0      1      2      3      4      5      6      7      8 
## 391739  51417  47044  33010  16314   5646   2180    450     65


We can see that of the 547,865 dates that we started with, there are only 65 dates that are prime numbers for all 8 date formats! That’s only about 0.012%, or roughly 1 in 8,400 days. That’s quite exciting! Let’s take a look at those dates:

dateprimes[dateprimes$total==8,]$dates
##  [1] "1041-07-31" "1177-05-19" "1189-06-06" "1189-07-20" "1197-02-02"
##  [6] "1217-07-07" "1221-12-14" "1247-01-25" "1249-05-14" "1251-03-17"
## [11] "1263-05-23" "1277-02-28" "1281-05-05" "1291-03-21" "1293-10-22"
## [16] "1297-04-15" "1319-03-03" "1459-06-06" "1489-03-03" "1497-02-02"
## [21] "1503-05-05" "1507-08-13" "1509-05-27" "1519-03-03" "1529-01-04"
## [26] "1529-04-01" "1549-07-18" "1551-01-24" "1577-09-30" "1581-05-05"
## [31] "1581-11-30" "1583-04-20" "1607-06-06" "1631-07-20" "1661-01-11"
## [36] "1663-08-08" "1821-03-11" "1821-11-03" "1843-01-21" "1849-08-08"
## [41] "1893-01-01" "1893-03-17" "1899-02-17" "1963-08-08" "2097-07-15"
## [46] "2103-01-01" "2109-10-10" "2131-07-20" "2163-10-13" "2183-09-20"
## [51] "2199-04-28" "2207-03-03" "2231-09-09" "2233-03-10" "2239-07-21"
## [56] "2241-02-27" "2411-01-26" "2467-01-08" "2467-08-01" "2467-12-13"
## [61] "2469-01-25" "2481-01-01" "2483-04-20" "2491-05-05" "2499-06-22"


We can break this down by century:

## 
## 11 12 13 14 15 16 17 19 20 21 22 23 25 
##  1  4 11  1  3 12  4  7  1  1  6  5  9
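The code behind this table isn’t shown, so the following is only a sketch of one way to get such a breakdown. It uses a handful of the dates listed above as sample input, and the names `all_prime`, `yr` and `century` are illustrative.

```r
# A few of the all-prime dates from the list above, as sample input
all_prime <- as.Date(c("1041-07-31", "1177-05-19", "1189-06-06",
                       "1963-08-08", "2097-07-15"))

# Year as an integer, then the century it falls in (1001-1100 is the 11th)
yr <- as.integer(format(all_prime, "%Y"))
century <- (yr - 1L) %/% 100L + 1L

table(century)
```

On the full data, the same calculation could be applied to `dateprimes$dates[dateprimes$total == 8]`.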


There were no 100% prime numbered dates in the 18th century and there will be none in the 24th century! Several centuries only have one date that is 100% prime - the 11th, 14th, 20th, and 21st centuries. Those dates were 31st July 1041, 3rd March 1319, 8th August 1963 and 15th July 2097. Sadly, this means that in my lifetime there will never be a 100% prime date.

One thing you might note about some of the above dates is that the numbers of months and days are identical, e.g. October 10th, August 8th, June 6th. For these dates the day-month and month-day formats produce the same number, so at most four distinct numbers (rather than eight) have to be prime, which makes it more likely that all will be.

If one wanted to live through prime dates, it would have been best to have been born at the beginning of the 16th century. Pope Gregory XIII who was born in 1502 lived through 12 of these dates.

Interestingly, 8th August 1963, the most recent all-prime date, was the date of the Great Train Robbery. I can’t predict if anything exciting will happen on 15th July 2097.



Which dates are the best and worst for prime numbers?

The easiest way to answer this question is to add a variable for each day-month, and then get summary data.

dateprimes$daymonth <- as.character(format(dateprimes$dates, format="%m/%d"))

library(dplyr)

dateprimes1 <- dateprimes %>%  group_by(daymonth) %>%  summarize(av.primes = mean(total))


## most frequent dates
dateprimes1 %>% arrange(desc(av.primes))
## Source: local data frame [366 x 2]
## 
##    daymonth av.primes
## 1     01/01 0.7981346
## 2     01/02 0.7900000
## 3     02/01 0.7900000
## 4     02/02 0.7880000
## 5     01/04 0.7773333
## 6     04/01 0.7773333
## 7     05/05 0.7760000
## 8     10/01 0.7740000
## 9     03/03 0.7733333
## 10    11/09 0.7693333
## ..      ...       ...
## least frequent dates
dateprimes1 %>% arrange(av.primes)
## Source: local data frame [366 x 2]
## 
##    daymonth av.primes
## 1     02/29 0.0000000
## 2     05/24 0.4820000
## 3     10/31 0.4966667
## 4     08/31 0.4986667
## 5     08/18 0.4993333
## 6     11/27 0.5020000
## 7     06/18 0.5073333
## 8     07/16 0.5133333
## 9     03/14 0.5153333
## 10    11/28 0.5160000
## ..      ...       ...


So, the 1st and 2nd of January and February are the most likely to be prime numbers across our 15 century range. The least likely date is obviously the 29th February, which can never be prime: it only occurs in years that are divisible by 4, so its number is always even. Setting that aside, the least likely to be prime numbers are the 24th May and the 31st of October and August.



Plotting date primes

The plot below shows the distribution of prime numbered dates from 1900 to 2015. You can see 8th August 1963 as the only point where all 8 of the number formats are prime. The pattern of primes is quite obvious: they occur only in years that end in 1, 3, 7 and 9.

tempx <- dateprimes[328719:371086,]
plot(total ~ dates, tempx, xaxt = "n", type = "l")
axis(1, tempx$dates, format(tempx$dates, "%Y"), cex.axis = .7)
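The row numbers used above correspond to 1st January 1900 and 31st December 2015. If you would rather not work out row indices by hand, the same subset can be selected by date. Here is a small sketch of that on a toy data frame; `toy`, `in_range` and `tempx_alt` are made-up names and the `total` values are invented purely for illustration.

```r
# Toy stand-in for dateprimes, just to show the subsetting
toy <- data.frame(
  dates = seq(as.Date("1899-12-30"), as.Date("1900-01-03"), by = "day"),
  total = c(0, 1, 2, 0, 3)
)

# Keep only the rows whose date falls inside the plotting window
in_range <- toy$dates >= as.Date("1900-01-01") &
  toy$dates <= as.Date("2015-12-31")
tempx_alt <- toy[in_range, ]

nrow(tempx_alt)  # 3
```

The resulting data frame can be passed to the same `plot()` and `axis()` calls in place of `tempx`.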





If you have any feedback please get in touch. Probably the fastest way is to go to my twitter page.