Summary

Radioactive decay is known to be a sound way to produce true random numbers. An extensive rationale can be found at the HotBits site [L04] (see also Donald Knuth's considerations on the use of the exponential distribution [R01: page 132]). This document describes the use of a commercial Geiger counter shield for Arduino to generate random integers of byte length (i.e., in the range 0 to 255). The results of testing the generated numbers for randomness, although based on a small sample, indicate that the system may be able to provide reliable data.

The concept

The gadget uses pulses from the radiation sensor to generate random bits, which are assembled in groups of 8 to form a byte. Each byte is then sent to the computer via the serial port and post-processed with R to obtain a sequence of random decimal integers. A random bit is generated by recording the arrival times of 4 consecutive pulses. The interval between the first pair of pulses is then compared with the interval between the second pair, and the result of the comparison yields a binary digit (i.e., a “0” or a “1”). When 8 binary digits have been generated, a full byte is ready and represents one random number.

The hardware

The solution is a gadget based on a shield built by Libelium [L01] and an Arduino Uno. The shield is described at [L02], and full specifications and code examples can be found at [L03]. The gadget looks like this:

This is a side view where Arduino is also visible:

And this is the sensor (specifications for the sensor can be found at [L03]):

Note that this specific tube can only detect \(\beta\) and \(\gamma\) emissions: \(\alpha\) particles cannot be detected with it (but might be with a different tube on the same board).

The software

The basic software needed to run the Geiger board can be found at [L02]. This software runs the LCD screen, showing the counts-per-minute (CPM) and \(\mu\)\(Sv/h\) values, the serial output for logging, and the LED bar on the board. Modifications are of course needed in order to generate a random bit, assemble a sequence of 8 random bits and finally send this byte through the serial port to terminal software.
The software needs four pulses in order to generate a random bit. The arrival time of each of the four pulses is stored in the ts[] array. The Arduino function used to get the time is millis(), which returns the number of milliseconds elapsed since the board started running the program [L08]. So, when a pulse is detected:

...
ts[t] = millis();   // timestamp of this pulse, in ms since start-up
t++;                // index of the next pulse (0 to 3)
...

When 4 pulses have been detected, the difference between the two intervals (first pair minus second pair) is computed, and the random bit is generated:

...
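if(t > 3)
{
   // first interval minus second interval;
   // elaps must be a signed type (e.g. long), otherwise the test below is always true
   elaps = (ts[1] - ts[0]) - (ts[3] - ts[2]);

   if(elaps >= 0)
      { r_bit[0] = '1'; }
   else
      { r_bit[0] = '0'; }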
...

If the first interval is greater than or equal to the second one, a binary “1” is generated, otherwise a binary “0”. The newly generated bit is then appended to the sequence, and when the sequence holds 8 bits, the byte is sent to the serial port:

...
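strcat(bit8, r_bit);          // append the new bit; r_bit must be a NUL-terminated 1-character string

if(strlen(bit8) >= 8)
    {
      Serial.println(bit8);   // send the 8 bits as a line of text
      bit8[0] = '\0';         // empty the buffer for the next byte
    }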
...
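
The two excerpts above can be condensed into a sketch that can be tested on a PC. It is written in plain C so that it compiles both on a desktop compiler and inside an Arduino sketch. The names bit_from_pulses, byte_builder and push_bit are hypothetical and do not come from the Libelium example code. Two choices differ from the excerpts and are assumptions of this sketch: the two intervals are compared directly instead of being subtracted (which sidesteps any question about the signedness of the result), and the bits are packed into an integer rather than a character string. Bits are assumed to arrive most significant first, as in the logged strings.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: derive one bit from four pulse timestamps
   (milliseconds, as returned by millis() on the Arduino).
   Returns 1 when the first interval is >= the second, else 0,
   mirroring the rule described in the text. */
int bit_from_pulses(const unsigned long ts[4])
{
    unsigned long first  = ts[1] - ts[0];  /* interval between pulses 1 and 2 */
    unsigned long second = ts[3] - ts[2];  /* interval between pulses 3 and 4 */
    return first >= second;
}

/* Hypothetical accumulator for the byte under construction. */
struct byte_builder {
    uint8_t value;  /* bits collected so far */
    int     count;  /* number of bits collected (0..7) */
};

/* Append one bit (most significant bit first).
   Returns 1 and stores the finished byte in *out when 8 bits are
   available, otherwise returns 0. */
int push_bit(struct byte_builder *b, int bit, uint8_t *out)
{
    assert(bit == 0 || bit == 1);
    b->value = (uint8_t)((b->value << 1) | bit);
    if (++b->count < 8)
        return 0;
    *out = b->value;
    b->value = 0;   /* start a new byte */
    b->count = 0;
    return 1;
}
```

Feeding the bits 0, 1, 1, 0, 1, 1, 1, 1 to push_bit yields the byte 111, the decimal value of 01101111.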

The output on the serial port is then captured by the PuTTY terminal software, which logs it to a file, as visible in the following image:

The file logged by PuTTY (herein named random_tot.log) is then processed with R. First, the file is read as a CSV file:

dat <- read.csv("random_tot.log", skip=1)

The skip=1 argument skips the first line, which is the PuTTY header. Note that with this command the second line, right after the PuTTY header, is expected to hold the column name, bin. This is not generated by PuTTY and must be added manually (or the R code must be modified to assign the name bin to the column). So, assuming that the file starts in this way:

=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2015.06.21 20:40:23 =~=~=~=~=~=~=~=~=~=~=~=
bin
01101111
11010011
01111100
10111000
11000111
...

the following data frame is generated:

head(dat)
##        bin
## 1  1101111
## 2 11010011
## 3  1111100
## 4 10111000
## 5 11000111
## 6      111

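Since read.csv parses the column as integers, leading zeros are dropped (e.g., 01101111 is shown as 1101111); this does not affect the decimal conversion. A new column is then added to hold the decimal equivalent of the binary digits: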

dat$dec <- 0

Then, the dec column is filled with the decimal equivalent of bin (a recipe for this, together with other useful suggestions for dealing with binary digits, can be found at [L09]):

BinToDec <- function(x) 
    sum(2^(which(rev(unlist(strsplit(as.character(x), "")) == 1))-1))

for (i in 1:nrow(dat)) {
    dat$dec[i] <- BinToDec(dat$bin[i])
}

head(dat)
##        bin dec
## 1  1101111 111
## 2 11010011 211
## 3  1111100 124
## 4 10111000 184
## 5 11000111 199
## 6      111   7
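
For comparison, the same conversion can be done outside R. The following is a minimal sketch in C, assuming each log line holds a single string of '0' and '1' characters; the function name bin_to_dec is hypothetical. It relies on the standard strtol function with base 2 and returns -1 for malformed input.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert a string of '0'/'1' characters
   (at most 8 of them) to its decimal value, 0..255.
   Returns -1 if the string is empty, too long, or contains
   anything other than binary digits. */
int bin_to_dec(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (len == 0 || len > 8)
        return -1;
    if (strspn(s, "01") != len)   /* only '0' and '1' are allowed */
        return -1;
    return (int)strtol(s, NULL, 2);
}
```

With this sketch, bin_to_dec("01101111") returns 111 and bin_to_dec("111") returns 7, matching the first and sixth rows of the data frame above.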

The software used to test the quality of the randomness of the generated data is available from the HotBits site and is called ent [L05]. It needs a binary file as input, so the dec column must be written to a binary file. Another recipe, available at [L10], gives useful instructions on how to do it:

to.write = file("random_tot.bin", "wb")
writeBin(as.integer(dat$dec), to.write, size=1)
close(to.write)

A binary file connection is created, then the column is stored into the binary file and the connection is closed. Note the size=1 parameter for setting the length of each written number to 1 byte.
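
The same step can be sketched in C for readers who prefer to skip R. This is only an illustration under stated assumptions: the helper name write_bytes is hypothetical, and the values are assumed to be already available as an array of unsigned char. Each array element occupies exactly one byte in the output, which corresponds to size=1 in the R call.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write n bytes to a binary file.
   Returns the number of bytes actually written (n on success),
   or 0 if the file could not be opened or closed. */
size_t write_bytes(const char *path, const unsigned char *data, size_t n)
{
    assert(path != NULL && data != NULL);
    FILE *f = fopen(path, "wb");   /* "wb": binary mode, no newline translation */
    if (f == NULL)
        return 0;
    size_t written = fwrite(data, 1, n, f);
    if (fclose(f) != 0)
        return 0;
    return written;
}
```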

As mentioned, the binary file is then processed by ent, which generates a set of statistics concerning the randomness of its contents.
ent is a command-line program (run here from an MS-DOS prompt):

The outcome

Relying on background radiation alone makes the process extremely slow. Since 4 pulses are needed to generate 1 bit, \(4 \times 8 = 32\) pulses are needed to generate one random number of byte length (i.e., 8 bits). The background radiation detected where the gadget is placed is very low, with an average CPM around 19. This means that more than one and a half minutes are needed to generate a single number (\(32 / 19 \approx 1.7\) minutes), for a rate of only about 36 numbers per hour (\(19 \times 60 / 32 \approx 36\)).
Therefore, the results reported in the table should be confirmed with a larger set of random numbers, one that could be generated, for instance, using a radiation source other than the background.

The table below shows the output of ent for several different runs, with an increasing number of random numbers.

| run | n | entropy | reduction | \(\chi^2\) | exceed | mean | Monte Carlo \(\pi\) | \(\pi\) err | serial corr |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 379 | 7.484247 | 6% | 228.24 | 88.48% | 126.3272 | 3.174603175 | 1.05% | -0.044276 |
| 2 | 626 | 7.711549 | 3% | 218.88 | 95.07% | 126.1086 | 3.346153846 | 6.51% | -0.034734 |
| 3 | 762 | 7.727568 | 3% | 247.89 | 61.34% | 126.71 | 3.433070866 | 9.28% | -0.026588 |
| 4 | 948 | 7.776483 | 2% | 267.73 | 27.96% | 128.308 | 3.417721519 | 8.79% | -0.026841 |
| 5 | 1080 | 7.817928 | 2% | 256.41 | 46.33% | 128.1241 | 3.355555556 | 6.81% | -0.013161 |
| 6 | 1291 | 7.848902 | 1% | 261.06 | 38.37% | 128.2192 | 3.36744186 | 7.19% | -0.02065 |
| 7 | 1425 | 7.850409 | 1% | 284 | 10.24% | 128.1733 | 3.341772152 | 6.37% | -0.023477 |
| 8 | 1681 | 7.879887 | 1% | 269.08 | 26.06% | 127.7442 | 3.257142857 | 3.68% | -0.020671 |
| 9 | 1980 | 7.896492 | 1% | 279.26 | 14.20% | 126.6288 | 3.321212121 | 5.72% | -0.025548 |
| 10 | 2461 | 7.924562 | 0% | 257.01 | 45.29% | 128.1288 | 3.229268293 | 2.79% | -0.024973 |
| 11 | 2979 | 7.941576 | 0% | 236.25 | 79.44% | 128.0326 | 3.217741935 | 2.42% | -0.023743 |

An explanation of the information provided in the table is given at [L05]. The two most important tests are briefly described below.

Entropy is a measure that increases with the number of possible messages the sender can choose from. The receiver's uncertainty about the next message increases accordingly. For instance, if the sender always sends the same number, the receiver will never be surprised by it, and the entropy is 0: in our case, if the generated bit were always a 0 (or always a 1), the generated integer would always be 0 (or always 255), in a totally predictable way. As shown in the table, the entropy is close to 8 bits per byte and keeps increasing towards this maximum as the sample grows, which suggests that the next number is very hard to guess. Concerning the definition of entropy, see [R02: pages 51-55] and [R03: chapter V].
Another interesting result is the percentage by which the file could be reduced in size by compressing it. Here this measure approaches 0%: intuitively, a compression algorithm needs repetitions to work, and these are not easily found here.
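
For reference, the entropy discussed here is the Shannon entropy of the byte frequencies. As a worked equation, with \(p_i\) denoting the observed fraction of bytes equal to \(i\) (a symbol introduced here for illustration), the definition and its link to the size reduction can be written as follows. The second line is an interpretation that appears consistent with the table (run 1 reports 6%), not a statement taken from the ent documentation.

```latex
H = -\sum_{i=0}^{255} p_i \log_2 p_i , \qquad 0 \le H \le \log_2 256 = 8 \ \text{bits per byte}

\text{reduction} \approx \frac{8 - H}{8} , \qquad \text{e.g. } \frac{8 - 7.484247}{8} \approx 6.4\% \ \text{for run 1}
```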

The \(\chi^2\) test is also an important test of randomness and is most easily interpreted through the next column (exceed):

| exceed range | decision |
|---|---|
| 0% to 1% | reject (very likely to be not random) |
| 99% to 100% | reject (very likely to be not random) |
| 1% to 5% | suspect |
| 95% to 99% | suspect |
| 5% to 10% | almost suspect |
| 90% to 95% | almost suspect |

Values above 10% and below 90% indicate good randomness. For a detailed explanation of the \(\chi^2\) test see [R01: pages 42-48].
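
To make the statistic itself more concrete, the following C sketch computes a chi-square value for byte counts against a uniform distribution over the 256 possible values. It is an illustration, not the ent source code, and the function name chi_square_uniform is hypothetical. With n = 2979 bytes, for example, the expected count per value is 2979 / 256, about 11.64. Turning the statistic into the exceed percentage additionally requires the chi-square distribution with 255 degrees of freedom, which is not covered here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: chi-square statistic for byte counts against
   a uniform distribution over 256 values.
   counts[i] is the number of occurrences of byte value i. */
double chi_square_uniform(const unsigned long counts[256])
{
    unsigned long total = 0;
    for (size_t i = 0; i < 256; i++)
        total += counts[i];
    assert(total > 0);

    double expected = (double)total / 256.0;  /* same for every value */
    double chi2 = 0.0;
    for (size_t i = 0; i < 256; i++) {
        double diff = (double)counts[i] - expected;
        chi2 += diff * diff / expected;       /* (observed - expected)^2 / expected */
    }
    return chi2;
}
```

A perfectly flat histogram gives 0, while a file made of a single repeated value gives a very large statistic.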

Possible improvements

  1. Try the micros() function, which returns the elapsed time in microseconds, instead of the millis() function currently used.
  2. Test the output with Marsaglia's Diehard Battery of Tests of Randomness [L06], which goes much deeper than the ent utility. It would probably make sense to run this with larger sets of random numbers.
  3. As suggested in [L04: chapter How HotBits works] and in [R01: page 539], if the difference between the two intervals is 0, the bit should not be generated and the process should restart.
  4. As suggested in [L04: chapter How HotBits works], a flip-flop mechanism could be implemented, so that the order of the subtraction alternates from one bit to the next (first interval minus second, then second minus first) when calculating the random bit.
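
Improvements 3 and 4 can be sketched together in C. This is a sketch under stated assumptions rather than the HotBits implementation: the names flipflop and bit_with_flipflop are hypothetical, the two intervals are assumed to be already computed, and a return value of -1 is used here to signal that the four pulses should be discarded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state for the alternating ("flip-flop") comparison. */
struct flipflop {
    int invert;  /* 0 or 1; toggled after every bit produced */
};

/* Returns 0 or 1 for a usable bit, or -1 when the two intervals are
   equal and the four pulses should be discarded.
   t1 and t2 are the two intervals, in whatever unit the timer provides. */
int bit_with_flipflop(struct flipflop *ff, unsigned long t1, unsigned long t2)
{
    assert(ff != NULL);
    if (t1 == t2)
        return -1;                 /* tie: no bit, start over */
    int bit = (t1 > t2) ? 1 : 0;
    bit ^= ff->invert;             /* reverse the sense on alternate bits */
    ff->invert = !ff->invert;
    return bit;
}
```

The same pair of intervals therefore yields a 1 on one call and a 0 on the next, so a systematic difference between the first and the second interval should not favour either bit value.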

Annex

The ent software has a command line switch (-c) that prints a table of the occurrence counts for each number (each Extended ASCII character in this case, as the random numbers generated are integers in the range 0 to 255). This is the output obtained by using this switch with the final binary file generated (n = 2979):

Value Char Occurrences Fraction
  0                7   0.002350
  1               12   0.004028
  2               14   0.004700
  3               23   0.007721
  4               12   0.004028
  5                8   0.002685
  6               16   0.005371
  7               13   0.004364
  8                8   0.002685
  9               12   0.004028
 10               12   0.004028
 11                9   0.003021
 12               14   0.004700
 13                8   0.002685
 14                9   0.003021
 15               10   0.003357
 16               14   0.004700
 17               12   0.004028
 18               11   0.003693
 19               11   0.003693
 20               11   0.003693
 21               11   0.003693
 22               18   0.006042
 23               13   0.004364
 24               10   0.003357
 25                8   0.002685
 26                7   0.002350
 27               13   0.004364
 28               14   0.004700
 29               15   0.005035
 30               11   0.003693
 31               15   0.005035
 32               10   0.003357
 33   !            7   0.002350
 34   "           15   0.005035
 35   #            5   0.001678
 36   $           11   0.003693
 37   %            7   0.002350
 38   &           10   0.003357
 39   '           11   0.003693
 40   (            9   0.003021
 41   )           15   0.005035
 42   *            8   0.002685
 43   +           13   0.004364
 44   ,           11   0.003693
 45   -           11   0.003693
 46   .           11   0.003693
 47   /           15   0.005035
 48   0           11   0.003693
 49   1           13   0.004364
 50   2           14   0.004700
 51   3           10   0.003357
 52   4           17   0.005707
 53   5           13   0.004364
 54   6           12   0.004028
 55   7            8   0.002685
 56   8           12   0.004028
 57   9           14   0.004700
 58   :           15   0.005035
 59   ;           12   0.004028
 60   <           10   0.003357
 61   =           16   0.005371
 62   >           12   0.004028
 63   ?            6   0.002014
 64   @           14   0.004700
 65   A           11   0.003693
 66   B            6   0.002014
 67   C           12   0.004028
 68   D            2   0.000671
 69   E           11   0.003693
 70   F           10   0.003357
 71   G           13   0.004364
 72   H           14   0.004700
 73   I           19   0.006378
 74   J            8   0.002685
 75   K           13   0.004364
 76   L           12   0.004028
 77   M           15   0.005035
 78   N            5   0.001678
 79   O           13   0.004364
 80   P           12   0.004028
 81   Q            9   0.003021
 82   R           18   0.006042
 83   S           14   0.004700
 84   T           10   0.003357
 85   U           14   0.004700
 86   V           15   0.005035
 87   W           10   0.003357
 88   X            8   0.002685
 89   Y           12   0.004028
 90   Z            7   0.002350
 91   [           17   0.005707
 92   \           19   0.006378
 93   ]           11   0.003693
 94   ^           11   0.003693
 95   _           19   0.006378
 96   `           11   0.003693
 97   a           11   0.003693
 98   b           12   0.004028
 99   c           11   0.003693
100   d            8   0.002685
101   e           14   0.004700
102   f           11   0.003693
103   g            8   0.002685
104   h           10   0.003357
105   i           13   0.004364
106   j           12   0.004028
107   k           15   0.005035
108   l            7   0.002350
109   m            9   0.003021
110   n           16   0.005371
111   o           13   0.004364
112   p           13   0.004364
113   q           15   0.005035
114   r           14   0.004700
115   s           14   0.004700
116   t           13   0.004364
117   u           19   0.006378
118   v           14   0.004700
119   w            8   0.002685
120   x            6   0.002014
121   y           12   0.004028
122   z           14   0.004700
123   {           11   0.003693
124   |            9   0.003021
125   }           12   0.004028
126   ~           10   0.003357
127               15   0.005035
128               10   0.003357
129               12   0.004028
130               13   0.004364
131               12   0.004028
132                7   0.002350
133               12   0.004028
134               11   0.003693
135                8   0.002685
136               11   0.003693
137                8   0.002685
138               12   0.004028
139               11   0.003693
140               12   0.004028
141                9   0.003021
142               13   0.004364
143               10   0.003357
144                9   0.003021
145               10   0.003357
146                8   0.002685
147               13   0.004364
148               10   0.003357
149                7   0.002350
150               15   0.005035
151               11   0.003693
152               10   0.003357
153                5   0.001678
154                4   0.001343
155                7   0.002350
156                6   0.002014
157               17   0.005707
158               16   0.005371
159               20   0.006714
160               11   0.003693
161   ¡           11   0.003693
162   ¢           15   0.005035
163   £           11   0.003693
164   ¤           13   0.004364
165   ¥            6   0.002014
166   ¦           10   0.003357
167   §           10   0.003357
168   ¨           12   0.004028
169   ©           14   0.004700
170   ª           12   0.004028
171   «            9   0.003021
172   ¬            7   0.002350
173   ?           15   0.005035
174   ®           11   0.003693
175   ¯            6   0.002014
176   °            8   0.002685
177   ±           12   0.004028
178   ²           14   0.004700
179   ³            8   0.002685
180   ´            8   0.002685
181   µ           10   0.003357
182   ¶           12   0.004028
183   ·           21   0.007049
184   ¸           12   0.004028
185   ¹           15   0.005035
186   º            8   0.002685
187   »           10   0.003357
188   ¼           16   0.005371
189   ½           13   0.004364
190   ¾           13   0.004364
191   ¿           10   0.003357
192   À            5   0.001678
193   Á            8   0.002685
194   Â           10   0.003357
195   Ã           13   0.004364
196   Ä            9   0.003021
197   Å           10   0.003357
198   Æ            9   0.003021
199   Ç           20   0.006714
200   È           15   0.005035
201   É           10   0.003357
202   Ê           11   0.003693
203   Ë           14   0.004700
204   Ì           10   0.003357
205   Í           15   0.005035
206   Î           10   0.003357
207   Ï           17   0.005707
208   Ð           15   0.005035
209   Ñ           18   0.006042
210   Ò           11   0.003693
211   Ó           14   0.004700
212   Ô            9   0.003021
213   Õ            6   0.002014
214   Ö           13   0.004364
215   ×           15   0.005035
216   Ø           10   0.003357
217   Ù           14   0.004700
218   Ú           13   0.004364
219   Û            9   0.003021
220   Ü           13   0.004364
221   Ý           11   0.003693
222   Þ            9   0.003021
223   ß           11   0.003693
224   à            8   0.002685
225   á           13   0.004364
226   â           13   0.004364
227   ã           15   0.005035
228   ä            7   0.002350
229   å           16   0.005371
230   æ           13   0.004364
231   ç            9   0.003021
232   è           11   0.003693
233   é            8   0.002685
234   ê           14   0.004700
235   ë           12   0.004028
236   ì           14   0.004700
237   í           17   0.005707
238   î           10   0.003357
239   ï           10   0.003357
240   ð           10   0.003357
241   ñ           19   0.006378
242   ò           14   0.004700
243   ó           10   0.003357
244   ô           12   0.004028
245   õ           17   0.005707
246   ö           10   0.003357
247   ÷            9   0.003021
248   ø           12   0.004028
249   ù           13   0.004364
250   ú           11   0.003693
251   û           12   0.004028
252   ü            9   0.003021
253   ý           15   0.005035
254   þ           14   0.004700
255   ÿ           14   0.004700

Total:          2979   1.000000

Entropy = 7.941576 bits per byte.

Optimum compression would reduce the size
of this 2979 byte file by 0 percent.

Chi square distribution for 2979 samples is 236.25, and randomly
would exceed this value 79.44 percent of the times.

Arithmetic mean value of data bytes is 128.0326 (127.5 = random).
Monte Carlo value for Pi is 3.217741935 (error 2.42 percent).
Serial correlation coefficient is -0.023743 (totally uncorrelated = 0.0).

fms 2015, version 1.01 - 20150628, written in RStudio, transformed into HTML via knitr.