Nature-generated randomness:
using an Arduino gadget to generate
radioactivity-based random numbers.
Radioactive decay is known to be a sound way to produce true random numbers. An extensive rationale can be found at the HotBits site [L04] (see also Donald Knuth's considerations on the use of the exponential distribution [R01: page 132]). This document describes the use of a commercial shield for Arduino to generate random integers of one byte each (i.e., in the range 0 to 255). Tests of the generated numbers for randomness, although based on a small sample, indicate that the system may be able to provide reliable data.
The gadget uses pulses to generate random bits, which are assembled in groups of 8 to form a byte. Each byte is then sent to the computer via the serial port and postprocessed with R to obtain a sequence of random decimal integers. A random bit is generated by recording the arrival times of 4 consecutive pulses: the interval between the first pair of pulses is compared with the interval between the second pair, and the result determines the binary digit (i.e., a “0” or a “1”). When 8 binary digits have been generated, a full byte is ready and represents one random number.
The solution is a gadget based on a shield built by Libelium [L01] and Arduino Uno. The shield is described at [L02] and full specifications and code examples can be found at [L03]. The gadget looks like this:
This is a side view where Arduino is also visible:
And this is the sensor (specifications for the sensor can be found at [L03]):
Note that this specific tube is only able to detect \(\beta\) and \(\gamma\) emissions: \(\alpha\) particles cannot be detected with this tube (but might be detected using a different tube on the same board).
The basic software needed to run the Geiger board can be found at [L02]. This software drives the LCD screen, which shows the counts-per-minute (CPM) and \(\mu\)Sv/h values, the serial output for logging, and the LED bar on the board. Modifications are needed in order to generate a random bit, assemble a sequence of 8 random bits, and finally send the resulting byte through the serial port to the terminal software.
The software needs four pulses in order to generate a random bit. The time at which each of the four pulses is detected is stored in the ts[] array. The Arduino function used to get this time is millis(), which returns the number of milliseconds elapsed since the board started running the program [L08]. So, when a pulse is detected:
...
ts[t] = millis();
t++;
...
When 4 pulses have been detected, the difference between the two time intervals (the one between the first pair of pulses and the one between the second pair) is computed, and the random bit is generated:
...
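if(t > 3)   // four timestamps have been collected
{
  // First interval minus second interval. elaps must be a signed type
  // (its declaration is not shown here), otherwise it could never be negative.
  elaps = (ts[1] - ts[0]) - (ts[3] - ts[2]);
  if(elaps >= 0)
  { r_bit[0] = '1'; }   // a tie (elaps == 0) also yields '1'
  else
  { r_bit[0] = '0'; }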
...
If the difference between the two intervals is greater than or equal to zero, a binary “1” is generated; otherwise a binary “0” is generated. The newly generated bit is then appended to the sequence, and when the sequence holds 8 bits, the byte is sent to the serial port:
...
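strcat(bit8, r_bit);        // append the new bit (r_bit must be NUL-terminated)
if(strlen(bit8) >= 8)       // a full byte has been assembled
{
  Serial.println(bit8);     // send it as text, one byte per line
  bit8[0] = '\0';           // empty the string for the next byte
}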
...
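The bit and byte logic above can also be tried off-board. The following Python sketch is not the code running on the Arduino: it only mirrors the same rule on a list of timestamps. The function names `bit_from_timestamps` and `bytes_from_timestamps` are made up for illustration, the timestamps are hand-made rather than measured, and the sketch assumes that pulses are consumed in non-overlapping groups of four, as the text describes (four pulses per bit).

```python
def bit_from_timestamps(ts):
    """Return '1' or '0' from four pulse timestamps, mirroring the sketch's rule."""
    first_interval = ts[1] - ts[0]
    second_interval = ts[3] - ts[2]
    # A difference >= 0 (including a tie) gives '1', otherwise '0'.
    return "1" if first_interval - second_interval >= 0 else "0"


def bytes_from_timestamps(timestamps):
    """Group timestamps in fours, turn each group into a bit, emit 8-bit strings."""
    out = []
    bit8 = ""
    for i in range(0, len(timestamps) - 3, 4):
        bit8 += bit_from_timestamps(timestamps[i:i + 4])
        if len(bit8) >= 8:
            out.append(bit8)
            bit8 = ""
    return out


# Hand-made timestamps in milliseconds (not measured data).
print(bit_from_timestamps([0, 10, 20, 25]))  # first interval is the longer one
print(bit_from_timestamps([0, 5, 20, 30]))   # second interval is the longer one
```

Incomplete groups at the end of the list are simply ignored, so 31 timestamps yield 7 bits and no byte.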
The output on the serial port is then captured by the PuTTY terminal software [L07], which logs it to a file, as visible in the following image:
The file logged by PuTTY (herein named random_tot.log) is then processed with R. First, the file is read as a CSV:
dat <- read.csv("random_tot.log", skip=1)
The skip=1 argument is needed to skip the first line, which is the PuTTY header. Note that with this command the second line, right after the PuTTY header, is expected to hold the column name, which is bin. This line is not generated by PuTTY and must be added manually (or the R code must be modified to assign the name bin to the column). So, assuming that the file starts in this way:
=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2015.06.21 20:40:23 =~=~=~=~=~=~=~=~=~=~=~=
bin
01101111
11010011
01111100
10111000
11000111
...
the following data frame is generated (the leading zeros are dropped because the column is read as numbers):
head(dat)
## bin
## 1 1101111
## 2 11010011
## 3 1111100
## 4 10111000
## 5 11000111
## 6 111
A new column is then added to hold the decimal equivalent of the binary digits:
dat$dec <- 0
Then, the dec column is filled with the decimal equivalent of bin (a recipe for this, together with other useful suggestions for dealing with binary digits, can be found at [L09]):
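# Convert a sequence of binary digits to its decimal value:
# split into characters, reverse so that position 1 is the least significant bit,
# then sum 2^(position - 1) over the positions holding a 1.
BinToDec <- function(x)
  sum(2^(which(rev(unlist(strsplit(as.character(x), "")) == 1))-1))
for (i in 1:nrow(dat)) {
  dat$dec[i] <- BinToDec(dat$bin[i])
}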
head(dat)
## bin dec
## 1 1101111 111
## 2 11010011 211
## 3 1111100 124
## 4 10111000 184
## 5 11000111 199
## 6 111 7
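For readers who prefer to cross-check the conversion outside R, the same step can be sketched in Python. This is only an illustration under a few assumptions: the helper name `parse_log` is hypothetical, the sample text repeats the first lines of the log shown earlier, and every line that is not exactly eight 0/1 characters (the PuTTY header, the `bin` column name, blank lines) is skipped. The lines are kept as text, so the leading zeros are preserved.

```python
def parse_log(lines):
    """Keep only lines made of exactly eight 0/1 characters and convert them."""
    values = []
    for line in lines:
        s = line.strip()
        if len(s) == 8 and set(s) <= {"0", "1"}:
            values.append(int(s, 2))  # base-2 string to integer
    return values


sample = """=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2015.06.21 20:40:23 =~=~=~=~=~=~=~=~=~=~=~=
bin
01101111
11010011
01111100
10111000
11000111
""".splitlines()

print(parse_log(sample))
```

Filtering on the line shape is a design choice of this sketch, not of the R code above, which converts every row it reads.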
The software used to test the quality of the randomness of the generated data is available from the HotBits site and is called ent [L05]. It needs a binary file as input, so the dec column must be written to a binary file. Another recipe, available at [L10], gives useful instructions on how to do it:
to.write = file("random_tot.bin", "wb")
writeBin(as.integer(dat$dec), to.write, size=1)
close(to.write)
A binary file connection is created, then the column is stored into the binary file and the connection is closed. Note the size=1 parameter for setting the length of each written number to 1 byte.
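A rough Python counterpart of this step is shown below, as a sketch only. The helper name `write_bytes` is hypothetical, the five values are the ones from the first rows of the data frame, and the file is written to a temporary directory rather than next to the log.

```python
import os
import tempfile


def write_bytes(values, path):
    """Write integers in 0..255 to a file, one byte each."""
    with open(path, "wb") as fh:
        fh.write(bytes(values))  # bytes() raises ValueError outside 0..255


values = [111, 211, 124, 184, 199]
path = os.path.join(tempfile.mkdtemp(), "random_tot.bin")
write_bytes(values, path)
print(os.path.getsize(path))  # file size in bytes, one per value
```

Reading the file back in binary mode and comparing it with the input list is a cheap way to confirm that each number occupies exactly one byte.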
As mentioned, the binary file is then processed by the ent software, which generates a set of statistics concerning its randomness. ent is an MS-DOS program that is run from the command line:
Relying on background radiation alone makes the process extremely slow. Since 4 pulses are needed to generate 1 bit, \(4 \times 8 = 32\) pulses are needed to generate one random number of byte length (i.e., 8 bits). The background radiation detected where the gadget is placed is very low, with an average CPM around 19. This means that more than one and a half minutes (\(32 / 19 \approx 1.7\)) are needed to generate a single number, a rate of only about 36 numbers per hour.
Because of this low rate, the results reported in the table should be confirmed with a larger set of random numbers, one that could be generated, for instance, using a source of radiation other than the background.
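The arithmetic behind these figures can be written out as follows, taking the quoted average of 19 CPM at face value. The constant names are chosen here for illustration.

```python
PULSES_PER_BIT = 4
BITS_PER_BYTE = 8
CPM = 19  # average counts per minute quoted in the text

pulses_per_byte = PULSES_PER_BIT * BITS_PER_BYTE
minutes_per_byte = pulses_per_byte / CPM
bytes_per_hour = 60 * CPM / pulses_per_byte

print(f"pulses per byte:  {pulses_per_byte}")
print(f"minutes per byte: {minutes_per_byte:.2f}")
print(f"bytes per hour:   {bytes_per_hour:.1f}")
```

With exactly 19 CPM this works out to roughly 1.7 minutes per number and about 36 numbers per hour; the real figure moves with the measured CPM.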
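A stronger source would shorten the intervals between pulses, and this interacts with the timer resolution: when timestamps are truncated to whole milliseconds, the two intervals are more likely to come out equal, and the sketch maps a tie to “1”. The simulation below is a hedged illustration of that effect, not a measurement. It assumes ideal Poisson pulses with no detector dead time, and the mean interval of 6 ms is a made-up value for a hypothetical strong source; `fraction_of_ones` is a name invented here.

```python
import random


def fraction_of_ones(mean_interval_ms, resolution_ms, n_bits, seed=1):
    """Simulate Poisson pulses and apply the '>= 0 gives 1' rule to truncated timestamps."""
    rng = random.Random(seed)
    now = 0.0
    ones = 0
    for _ in range(n_bits):
        ticks = []
        for _ in range(4):
            now += rng.expovariate(1.0 / mean_interval_ms)  # exponential waiting time
            ticks.append(int(now / resolution_ms))          # timer reading, truncated
        elaps = (ticks[1] - ticks[0]) - (ticks[3] - ticks[2])
        if elaps >= 0:
            ones += 1
    return ones / n_bits


# Hypothetical strong source: one pulse every 6 ms on average.
coarse = fraction_of_ones(6.0, 1.0, 50_000)    # millisecond timer
fine = fraction_of_ones(6.0, 0.001, 50_000)    # microsecond timer
print(f"fraction of 1s, 1 ms resolution: {coarse:.3f}")
print(f"fraction of 1s, 1 us resolution: {fine:.3f}")
```

Under these assumptions the millisecond timer produces a visible excess of 1s, while the finer timer stays close to one half. At background rates, where the mean interval is a few seconds, ties are far rarer and the effect should be correspondingly small.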
The table below shows the output of ent for several different runs, with an increasing number of random numbers.
| run | n | entropy (bits per byte) | compression reduction | \(\chi^2\) | exceed | mean | Monte Carlo \(\pi\) | \(\pi\) error | serial correlation |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 379 | 7.484247 | 6% | 228.24 | 88.48% | 126.3272 | 3.174603175 | 1.05% | -0.044276 |
| 2 | 626 | 7.711549 | 3% | 218.88 | 95.07% | 126.1086 | 3.346153846 | 6.51% | -0.034734 |
| 3 | 762 | 7.727568 | 3% | 247.89 | 61.34% | 126.71 | 3.433070866 | 9.28% | -0.026588 |
| 4 | 948 | 7.776483 | 2% | 267.73 | 27.96% | 128.308 | 3.417721519 | 8.79% | -0.026841 |
| 5 | 1080 | 7.817928 | 2% | 256.41 | 46.33% | 128.1241 | 3.355555556 | 6.81% | -0.013161 |
| 6 | 1291 | 7.848902 | 1% | 261.06 | 38.37% | 128.2192 | 3.36744186 | 7.19% | -0.02065 |
| 7 | 1425 | 7.850409 | 1% | 284 | 10.24% | 128.1733 | 3.341772152 | 6.37% | -0.023477 |
| 8 | 1681 | 7.879887 | 1% | 269.08 | 26.06% | 127.7442 | 3.257142857 | 3.68% | -0.020671 |
| 9 | 1980 | 7.896492 | 1% | 279.26 | 14.20% | 126.6288 | 3.321212121 | 5.72% | -0.025548 |
| 10 | 2461 | 7.924562 | 0% | 257.01 | 45.29% | 128.1288 | 3.229268293 | 2.79% | -0.024973 |
| 11 | 2979 | 7.941576 | 0% | 236.25 | 79.44% | 128.0326 | 3.217741935 | 2.42% | -0.023743 |
An explanation of the information provided in the table is given at [L05]. The two most important tests are briefly described below.
Entropy is a measure that increases with the number of possibilities the sender of a message can choose from; the level of uncertainty on the side of the receiver increases accordingly. For instance, if the number sent by the sender is always the same, the receiver will not be surprised by receiving it, and the entropy is 0: in our case, if the bit generated were always a 0 (or always a 1), the integer generated would always be 0 (or always 255), in a totally predictable way. As shown in the table, the entropy is close to 8 bits per byte, which is its maximum, and it increases with every run, suggesting that it is hard to guess what the next number will be. Concerning the definition of entropy, see [R02: pages 51-55] and [R03: chapter V].
Another interesting result is the percentage by which the file size could be reduced by compressing it. Here this measure approaches 0%: intuitively, a compression algorithm needs repetitions to work, and those seem not to be easily found here.
The \(\chi^2\) test is also an important test of randomness and can be easily understood by using the next column (exceed):
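Both quantities can be illustrated with a few lines of Python. This is a sketch of the standard Shannon entropy over byte frequencies, not the source of ent. The reduction formula, the unused share of the 8 bits truncated to a whole percentage, is an assumption that happens to reproduce the reduction column of the table above; the function names are made up here.

```python
import math
from collections import Counter


def entropy_bits_per_byte(data):
    """Shannon entropy of a byte sequence, in bits per byte (0 to 8)."""
    n = len(data)
    counts = Counter(data)
    # Sum of p * log2(1 / p) over the byte values that actually occur.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def compression_reduction_percent(entropy):
    """Share of the 8 bits per byte an ideal compressor could save, truncated."""
    return int(100 * (8 - entropy) / 8)


print(entropy_bits_per_byte(bytes([255] * 100)))   # a constant stream
print(entropy_bits_per_byte(bytes(range(256))))    # every value exactly once
print(compression_reduction_percent(7.484247))     # entropy of run 1 in the table
```

The constant stream sits at the minimum and the stream containing every byte value once sits at the maximum, which are the two extremes described in the text.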
| exceed range | decision |
|---|---|
| 0% to 1% | reject (very likely to be not random) |
| 99% to 100% | reject (very likely to be not random) |
| 1% to 5% | suspect |
| 95% to 99% | suspect |
| 5% to 10% | almost suspect |
| 90% to 95% | almost suspect |
Values above 10% and below 90% are an indication of good randomness. For a detailed explanation of the \(\chi^2\) test see [R01: pages 42-48].
A possible improvement is to use the micros() function, which returns the elapsed time in microseconds, instead of the millis() currently used.
[...] ent utility. It would probably make sense to run this with larger sets of random numbers.
[R01] Knuth, D.E. (1998). The Art of Computer Programming. Vol. 2: Seminumerical Algorithms. Addison-Wesley, Reading, MA. 3rd edition.
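As an illustration, the \(\chi^2\) statistic for byte counts against a flat expectation, together with the decision rule of the table above, might be sketched as follows. The function names are hypothetical, and the handling of the exact boundaries (1%, 5%, 10%) is an assumption, since the ranges in the table touch each other. Turning the statistic into the exceed percentage requires the \(\chi^2\) distribution (255 degrees of freedom for 256 byte values), which is not reproduced here.

```python
def chi_square(counts):
    """Chi-square statistic of observed counts against a uniform expectation."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def decision(exceed_percent):
    """Map the 'exceed' percentage to the labels of the decision table."""
    tail = min(exceed_percent, 100.0 - exceed_percent)
    if tail < 1.0:
        return "reject"
    if tail < 5.0:
        return "suspect"
    if tail < 10.0:
        return "almost suspect"
    return "good"


print(chi_square([10, 10, 10, 10]))   # perfectly flat counts
print(decision(79.44))                # exceed value of run 11 in the results table
print(decision(95.07))                # exceed value of run 2 in the results table
```

Using the distance from the nearer end of the 0% to 100% range keeps the rule symmetric, which matches the paired rows of the table.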
[R02] Mitchell, M. (2009). Complexity: a guided tour. Oxford University Press, New York, NY.
[R03] Pierce, J.R. (1961). Symbols, Signals and Noise. THE NATURE AND PROCESS OF COMMUNICATION. Harper Modern Science Series. Italian translation Simboli, codici, messaggi. LA TEORIA DELL’INFORMAZIONE. 1963 Arnoldo Mondadori Editore, Milano. 7th edition (Sep. 1983).
[L01] Libelium site, last access June 2015.
http://www.libelium.com/
[L02] Cooking Hacks (a brand of Libelium), Radiation Sensor Board page, last access June 2015.
https://www.cooking-hacks.com/pack-radiation-sensor-board-for-arduino-geiger-tube
[L03] Cooking Hacks (a brand of Libelium), Radiation Sensor Board full specifications and tutorial page, last access June 2015.
https://www.cooking-hacks.com/documentation/tutorials/geiger-counter-radiation-sensor-board-arduino-raspberry-pi-tutorial/
[L04] The HotBits site from John Walker, last access June 2015.
http://www.fourmilab.ch/hotbits/
[L05] The ent program to test randomness, from the HotBits site, last access June 2015.
http://www.fourmilab.ch/random/
[L06] Prof. George Marsaglia’s pages dedicated to random numbers, hosted at The Florida State University, last access June 2015.
http://www.stat.fsu.edu/pub/diehard/
[L07] Simon Tatham putty terminal emulator page, last access June 2015.
http://www.putty.org/
[L08] Arduino reference page to the millis() function, last access June 2015.
https://www.arduino.cc/en/Reference/Millis
[L09] Recipes for dealing with binary digits in R, last access June 2015.
http://stackoverflow.com/questions/12892348/in-r-how-to-convert-binary-string-to-binary-or-decimal-value
[L10] Recipes for writing binary files in R, from the UCLA Institute for Digital Research and Education, last access June 2015.
http://www.ats.ucla.edu/stat/r/faq/write_binary_bycolumn.htm
The ent software has a command line switch (-c) that prints a table of the occurrence counts for each number (each Extended ASCII character in this case, as the random numbers generated are integers in the range 0 to 255). This is the output obtained by using this switch with the final binary file generated (n = 2979):
Value Char Occurrences Fraction
0 7 0.002350
1 12 0.004028
2 14 0.004700
3 23 0.007721
4 12 0.004028
5 8 0.002685
6 16 0.005371
7 13 0.004364
8 8 0.002685
9 12 0.004028
10 12 0.004028
11 9 0.003021
12 14 0.004700
13 8 0.002685
14 9 0.003021
15 10 0.003357
16 14 0.004700
17 12 0.004028
18 11 0.003693
19 11 0.003693
20 11 0.003693
21 11 0.003693
22 18 0.006042
23 13 0.004364
24 10 0.003357
25 8 0.002685
26 7 0.002350
27 13 0.004364
28 14 0.004700
29 15 0.005035
30 11 0.003693
31 15 0.005035
32 10 0.003357
33 ! 7 0.002350
34 " 15 0.005035
35 # 5 0.001678
36 $ 11 0.003693
37 % 7 0.002350
38 & 10 0.003357
39 ' 11 0.003693
40 ( 9 0.003021
41 ) 15 0.005035
42 * 8 0.002685
43 + 13 0.004364
44 , 11 0.003693
45 - 11 0.003693
46 . 11 0.003693
47 / 15 0.005035
48 0 11 0.003693
49 1 13 0.004364
50 2 14 0.004700
51 3 10 0.003357
52 4 17 0.005707
53 5 13 0.004364
54 6 12 0.004028
55 7 8 0.002685
56 8 12 0.004028
57 9 14 0.004700
58 : 15 0.005035
59 ; 12 0.004028
60 < 10 0.003357
61 = 16 0.005371
62 > 12 0.004028
63 ? 6 0.002014
64 @ 14 0.004700
65 A 11 0.003693
66 B 6 0.002014
67 C 12 0.004028
68 D 2 0.000671
69 E 11 0.003693
70 F 10 0.003357
71 G 13 0.004364
72 H 14 0.004700
73 I 19 0.006378
74 J 8 0.002685
75 K 13 0.004364
76 L 12 0.004028
77 M 15 0.005035
78 N 5 0.001678
79 O 13 0.004364
80 P 12 0.004028
81 Q 9 0.003021
82 R 18 0.006042
83 S 14 0.004700
84 T 10 0.003357
85 U 14 0.004700
86 V 15 0.005035
87 W 10 0.003357
88 X 8 0.002685
89 Y 12 0.004028
90 Z 7 0.002350
91 [ 17 0.005707
92 \ 19 0.006378
93 ] 11 0.003693
94 ^ 11 0.003693
95 _ 19 0.006378
96 ` 11 0.003693
97 a 11 0.003693
98 b 12 0.004028
99 c 11 0.003693
100 d 8 0.002685
101 e 14 0.004700
102 f 11 0.003693
103 g 8 0.002685
104 h 10 0.003357
105 i 13 0.004364
106 j 12 0.004028
107 k 15 0.005035
108 l 7 0.002350
109 m 9 0.003021
110 n 16 0.005371
111 o 13 0.004364
112 p 13 0.004364
113 q 15 0.005035
114 r 14 0.004700
115 s 14 0.004700
116 t 13 0.004364
117 u 19 0.006378
118 v 14 0.004700
119 w 8 0.002685
120 x 6 0.002014
121 y 12 0.004028
122 z 14 0.004700
123 { 11 0.003693
124 | 9 0.003021
125 } 12 0.004028
126 ~ 10 0.003357
127 15 0.005035
128 10 0.003357
129 12 0.004028
130 13 0.004364
131 12 0.004028
132 7 0.002350
133 12 0.004028
134 11 0.003693
135 8 0.002685
136 11 0.003693
137 8 0.002685
138 12 0.004028
139 11 0.003693
140 12 0.004028
141 9 0.003021
142 13 0.004364
143 10 0.003357
144 9 0.003021
145 10 0.003357
146 8 0.002685
147 13 0.004364
148 10 0.003357
149 7 0.002350
150 15 0.005035
151 11 0.003693
152 10 0.003357
153 5 0.001678
154 4 0.001343
155 7 0.002350
156 6 0.002014
157 17 0.005707
158 16 0.005371
159 20 0.006714
160 11 0.003693
161 ¡ 11 0.003693
162 ¢ 15 0.005035
163 £ 11 0.003693
164 ¤ 13 0.004364
165 ¥ 6 0.002014
166 ¦ 10 0.003357
167 § 10 0.003357
168 ¨ 12 0.004028
169 © 14 0.004700
170 ª 12 0.004028
171 « 9 0.003021
172 ¬ 7 0.002350
173 ? 15 0.005035
174 ® 11 0.003693
175 ¯ 6 0.002014
176 ° 8 0.002685
177 ± 12 0.004028
178 ² 14 0.004700
179 ³ 8 0.002685
180 ´ 8 0.002685
181 µ 10 0.003357
182 ¶ 12 0.004028
183 · 21 0.007049
184 ¸ 12 0.004028
185 ¹ 15 0.005035
186 º 8 0.002685
187 » 10 0.003357
188 ¼ 16 0.005371
189 ½ 13 0.004364
190 ¾ 13 0.004364
191 ¿ 10 0.003357
192 À 5 0.001678
193 Á 8 0.002685
194 Â 10 0.003357
195 Ã 13 0.004364
196 Ä 9 0.003021
197 Å 10 0.003357
198 Æ 9 0.003021
199 Ç 20 0.006714
200 È 15 0.005035
201 É 10 0.003357
202 Ê 11 0.003693
203 Ë 14 0.004700
204 Ì 10 0.003357
205 Í 15 0.005035
206 Î 10 0.003357
207 Ï 17 0.005707
208 Ð 15 0.005035
209 Ñ 18 0.006042
210 Ò 11 0.003693
211 Ó 14 0.004700
212 Ô 9 0.003021
213 Õ 6 0.002014
214 Ö 13 0.004364
215 × 15 0.005035
216 Ø 10 0.003357
217 Ù 14 0.004700
218 Ú 13 0.004364
219 Û 9 0.003021
220 Ü 13 0.004364
221 Ý 11 0.003693
222 Þ 9 0.003021
223 ß 11 0.003693
224 à 8 0.002685
225 á 13 0.004364
226 â 13 0.004364
227 ã 15 0.005035
228 ä 7 0.002350
229 å 16 0.005371
230 æ 13 0.004364
231 ç 9 0.003021
232 è 11 0.003693
233 é 8 0.002685
234 ê 14 0.004700
235 ë 12 0.004028
236 ì 14 0.004700
237 í 17 0.005707
238 î 10 0.003357
239 ï 10 0.003357
240 ð 10 0.003357
241 ñ 19 0.006378
242 ò 14 0.004700
243 ó 10 0.003357
244 ô 12 0.004028
245 õ 17 0.005707
246 ö 10 0.003357
247 ÷ 9 0.003021
248 ø 12 0.004028
249 ù 13 0.004364
250 ú 11 0.003693
251 û 12 0.004028
252 ü 9 0.003021
253 ý 15 0.005035
254 þ 14 0.004700
255 ÿ 14 0.004700
Total: 2979 1.000000
Entropy = 7.941576 bits per byte.
Optimum compression would reduce the size
of this 2979 byte file by 0 percent.
Chi square distribution for 2979 samples is 236.25, and randomly
would exceed this value 79.44 percent of the times.
Arithmetic mean value of data bytes is 128.0326 (127.5 = random).
Monte Carlo value for Pi is 3.217741935 (error 2.42 percent).
Serial correlation coefficient is -0.023743 (totally uncorrelated = 0.0).
fms 2015, version 1.01 - 20150628, written in RStudio, transformed into HTML via knitr.