Harold Nelson
2025-07-21
I’m going to use a few utility functions in R to clarify the difference between discrete and continuous random variables.
For a discrete example, I’ll use a binomial random variable with four trials and a success probability of 0.5. Use the rbinom() function to get 100 simulated values.
## [1] 1 1 1 2 3 0 4 1 1 2 1 2 0 3 0 1 2 1 4 3 3 2 2 1 1 1 3 1 2 1 2 1 2 3 1 3 0
## [38] 2 0 1 4 3 2 1 4 4 3 3 2 2 3 3 3 1 2 3 2 1 4 2 2 2 1 0 2 3 3 1 3 3 2 2 1 1
## [75] 0 2 2 1 3 2 1 2 1 1 2 2 4 2 2 3 1 1 4 1 3 2 3 2 2 2
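The code that produced these values is not shown, so here is a minimal sketch of what the call might look like. The name `binom_values` comes from the output further down; the seed is my own assumption, so the numbers you get will differ from the ones printed above.

```r
# Fix the seed so the sketch is reproducible (an assumption; the output
# above came from a different, unknown random state).
set.seed(123)

# 100 draws from a binomial with 4 trials and success probability 0.5
binom_values <- rbinom(n = 100, size = 4, prob = 0.5)
binom_values
```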
Convert the counts in the table to probabilities by dividing by 100.
## binom_values
## 0 1 2 3 4
## 0.07 0.30 0.33 0.22 0.08
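A sketch of how a table like this one might be produced, assuming the values were tabulated with `table()`. The seed and the names `binom_counts` and `binom_probs` are assumptions for illustration, so the frequencies will not match the ones above exactly.

```r
set.seed(123)  # assumed seed, not the one behind the output above
binom_values <- rbinom(n = 100, size = 4, prob = 0.5)

# Count how often each value occurs, then divide by the number of draws
binom_counts <- table(binom_values)
binom_probs <- binom_counts / length(binom_values)
binom_probs

# prop.table() does the same conversion in one step
prop.table(binom_counts)
```

Dividing by `length(binom_values)` rather than a hard-coded 100 keeps the code correct if the sample size changes.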
Do the probabilities add up to 1?
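One way to check is to add them up directly. This sketch copies the five relative frequencies from the table above; the names `binom_probs` and `total` are hypothetical.

```r
# Relative frequencies copied from the table above
binom_probs <- c("0" = 0.07, "1" = 0.30, "2" = 0.33, "3" = 0.22, "4" = 0.08)

# Compare with a tolerance rather than ==, because of floating-point rounding
total <- sum(binom_probs)
isTRUE(all.equal(total, 1))
```

They do: the underlying counts are 7 + 30 + 33 + 22 + 8 = 100.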
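Create 100 simulated values from a normal distribution with the same mean and standard deviation as the simulated binomial values. The first two results below are that sample mean and standard deviation (the theoretical values would be 2 and 1); the third is a summary of the normal values.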
## [1] 1.94
## [1] 1.061921
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.3435 1.1770 2.0674 1.9237 2.5807 4.6347
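Here is a sketch of how these results might be generated, assuming the sample mean and standard deviation of the binomial values are passed straight to `rnorm()`. The name `norm_values` appears in the later output; the seed and the names `binom_mean` and `binom_sd` are assumptions.

```r
set.seed(123)  # assumed seed, so results differ from the output above
binom_values <- rbinom(n = 100, size = 4, prob = 0.5)

# Sample mean and standard deviation of the simulated binomial values
binom_mean <- mean(binom_values)
binom_sd <- sd(binom_values)
binom_mean
binom_sd

# 100 draws from a normal distribution with matching mean and sd
norm_values <- rnorm(n = 100, mean = binom_mean, sd = binom_sd)
summary(norm_values)
```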
The distribution looks very similar to the binomial.
Now for the punch line. Try to repeat what we did with the binomial: tabulate the simulated values.
## norm_values
## -0.343491984414252 -0.258880367359692 -0.16095863694362 -0.0228714596410502
## 1 1 1 1
## -0.0132547216066157 0.176573700916938 0.202393823791152 0.341611237715347
## 1 1 1 1
## 0.404077297622685 0.421105625156222 0.462808227714427 0.527014947101507
## 1 1 1 1
## 0.815517048991347 0.866168428795573 0.905064423210082 0.909887567686061
## 1 1 1 1
## 0.977260641913571 1.00951410401927 1.01385275855134 1.08091424641425
## 1 1 1 1
## 1.0897810115423 1.1119199158189 1.13082793877382 1.14256756698486
## 1 1 1 1
## 1.14265155712979 1.18839418853991 1.21880254806153 1.28237574841731
## 1 1 1 1
## 1.28878274993702 1.29350138053226 1.30508556109448 1.31603514801407
## 1 1 1 1
## 1.39849294879992 1.40868444827741 1.41650278057396 1.44310825069849
## 1 1 1 1
## 1.46669532543228 1.47816462822507 1.495128646612 1.56287626894939
## 1 1 1 1
## 1.70501481177497 1.71524784961572 1.74113529474139 1.78851938387562
## 1 1 1 1
## 1.84948323730858 1.91111530471791 1.92115318033495 2.03314956875144
## 1 1 1 1
## 2.04575460235279 2.06585684538196 2.0688762080369 2.08329976716188
## 1 1 1 1
## 2.12752579689503 2.14427552960416 2.15217140218523 2.16574478625171
## 1 1 1 1
## 2.16773080047055 2.19425351240454 2.20569802476601 2.22117722292526
## 1 1 1 1
## 2.25344153466385 2.31096731802082 2.31803724992206 2.33234218964121
## 1 1 1 1
## 2.33452190451542 2.35094419536478 2.43661643042227 2.45515703822999
## 1 1 1 1
## 2.46298388649132 2.46514290959299 2.50067746822742 2.50847817031168
## 1 1 1 1
## 2.522538618447 2.5443243850241 2.56579086672253 2.62554018519417
## 1 1 1 1
## 2.64301668106617 2.65165586054842 2.70507027002433 2.75758456715313
## 1 1 1 1
## 2.75952082934516 2.7760067524127 2.92250861235869 2.98403388669574
## 1 1 1 1
## 2.98465835218072 2.98578704204083 2.99460873041245 2.99502739568719
## 1 1 1 1
## 3.01878820861603 3.08148342956341 3.12410711488919 3.33498807958625
## 1 1 1 1
## 3.39248236342935 3.50491812485095 3.69756276866098 3.7628034935207
## 1 1 1 1
## 3.83565022928607 3.84750450133603 4.18344590887875 4.63469877803152
## 1 1 1 1
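Rather than scanning a table this long by eye, we can ask R whether any value repeats. This is a sketch under assumptions: the mean and standard deviation are the ones printed earlier, the seed is mine, and `norm_counts` is a hypothetical name.

```r
set.seed(123)  # assumed seed
# Mean and sd taken from the output printed earlier in the document
norm_values <- rnorm(n = 100, mean = 1.94, sd = 1.061921)

norm_counts <- table(norm_values)
length(norm_counts)         # number of distinct values
all(norm_counts == 1)       # TRUE when no value repeats
anyDuplicated(norm_values)  # 0 when no value repeats
```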
Every value occurs exactly once. When we convert these counts to probabilities, they are all the same: 0.01.
## norm_values
## -0.343491984414252 -0.258880367359692 -0.16095863694362 -0.0228714596410502
## 0.01 0.01 0.01 0.01
## -0.0132547216066157 0.176573700916938 0.202393823791152 0.341611237715347
## 0.01 0.01 0.01 0.01
## 0.404077297622685 0.421105625156222 0.462808227714427 0.527014947101507
## 0.01 0.01 0.01 0.01
## 0.815517048991347 0.866168428795573 0.905064423210082 0.909887567686061
## 0.01 0.01 0.01 0.01
## 0.977260641913571 1.00951410401927 1.01385275855134 1.08091424641425
## 0.01 0.01 0.01 0.01
## 1.0897810115423 1.1119199158189 1.13082793877382 1.14256756698486
## 0.01 0.01 0.01 0.01
## 1.14265155712979 1.18839418853991 1.21880254806153 1.28237574841731
## 0.01 0.01 0.01 0.01
## 1.28878274993702 1.29350138053226 1.30508556109448 1.31603514801407
## 0.01 0.01 0.01 0.01
## 1.39849294879992 1.40868444827741 1.41650278057396 1.44310825069849
## 0.01 0.01 0.01 0.01
## 1.46669532543228 1.47816462822507 1.495128646612 1.56287626894939
## 0.01 0.01 0.01 0.01
## 1.70501481177497 1.71524784961572 1.74113529474139 1.78851938387562
## 0.01 0.01 0.01 0.01
## 1.84948323730858 1.91111530471791 1.92115318033495 2.03314956875144
## 0.01 0.01 0.01 0.01
## 2.04575460235279 2.06585684538196 2.0688762080369 2.08329976716188
## 0.01 0.01 0.01 0.01
## 2.12752579689503 2.14427552960416 2.15217140218523 2.16574478625171
## 0.01 0.01 0.01 0.01
## 2.16773080047055 2.19425351240454 2.20569802476601 2.22117722292526
## 0.01 0.01 0.01 0.01
## 2.25344153466385 2.31096731802082 2.31803724992206 2.33234218964121
## 0.01 0.01 0.01 0.01
## 2.33452190451542 2.35094419536478 2.43661643042227 2.45515703822999
## 0.01 0.01 0.01 0.01
## 2.46298388649132 2.46514290959299 2.50067746822742 2.50847817031168
## 0.01 0.01 0.01 0.01
## 2.522538618447 2.5443243850241 2.56579086672253 2.62554018519417
## 0.01 0.01 0.01 0.01
## 2.64301668106617 2.65165586054842 2.70507027002433 2.75758456715313
## 0.01 0.01 0.01 0.01
## 2.75952082934516 2.7760067524127 2.92250861235869 2.98403388669574
## 0.01 0.01 0.01 0.01
## 2.98465835218072 2.98578704204083 2.99460873041245 2.99502739568719
## 0.01 0.01 0.01 0.01
## 3.01878820861603 3.08148342956341 3.12410711488919 3.33498807958625
## 0.01 0.01 0.01 0.01
## 3.39248236342935 3.50491812485095 3.69756276866098 3.7628034935207
## 0.01 0.01 0.01 0.01
## 3.83565022928607 3.84750450133603 4.18344590887875 4.63469877803152
## 0.01 0.01 0.01 0.01
This is arithmetically true, but something is wrong. It says that a particular value near 2 is no more likely to occur than a particular value near 0 or 4, even though values near the mean are clearly more common.
This was with 100 values. If we’d used 10,000, every individual value would have a probability of 0.0001. As the sample size approaches infinity, the probability of any particular value approaches 0.
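A quick sketch of that shrinking probability, assuming the same mean and standard deviation as before. The helper `max_point_prob()` is hypothetical; it returns the largest relative frequency of any single value in a sample of size n.

```r
set.seed(123)  # assumed seed

# Largest relative frequency of any single value, for a sample of size n
max_point_prob <- function(n) {
  x <- rnorm(n, mean = 1.94, sd = 1.061921)
  max(table(x)) / n
}

sizes <- c(100, 10000)
point_probs <- sapply(sizes, max_point_prob)
data.frame(n = sizes, max_point_prob = point_probs)
```

As long as no value repeats, the largest relative frequency is simply 1/n, which goes to 0 as n grows.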
What’s the punch line?
With a continuous random variable, you have to attach probabilities to intervals, not individual values.
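To make this concrete, here is a sketch that uses `pnorm()`, the normal cumulative distribution function, to attach probabilities to intervals. The mean and standard deviation are the ones printed earlier; the intervals themselves are my own choices for illustration.

```r
# Mean and sd taken from the output printed earlier in the document
m <- 1.94
s <- 1.061921

# P(1.5 < X <= 2.5): difference of the cumulative distribution function
p_middle <- pnorm(2.5, mean = m, sd = s) - pnorm(1.5, mean = m, sd = s)

# An interval of the same width far from the mean is less likely
p_tail <- pnorm(4.5, mean = m, sd = s) - pnorm(3.5, mean = m, sd = s)

# An interval of zero width has probability 0
p_point <- pnorm(2, mean = m, sd = s) - pnorm(2, mean = m, sd = s)

c(middle = p_middle, tail = p_tail, point = p_point)
```

Intervals restore the intuition that values near 2 are more likely than values near 4, while any single point still gets probability 0.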
We can use a trick: the mean of a logical expression is the fraction of cases for which the expression is true. Why does this work? When you take the mean, the value TRUE is coerced to 1 and the value FALSE is coerced to 0.
## [1] 0.3
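The expression behind this result is not shown, so the sketch below uses a hypothetical interval, 1.5 to 2.5, along with an assumed seed. It first shows the coercion on a tiny vector, then applies the same idea to simulated normal values.

```r
# TRUE is coerced to 1 and FALSE to 0, so the mean is the fraction of TRUEs
flags <- c(TRUE, FALSE, TRUE, TRUE)
as.numeric(flags)  # 1 0 1 1
mean(flags)        # 0.75

set.seed(123)  # assumed seed
# Mean and sd taken from the output printed earlier in the document
norm_values <- rnorm(n = 100, mean = 1.94, sd = 1.061921)

# Fraction of simulated values inside the hypothetical interval (1.5, 2.5)
in_interval <- norm_values > 1.5 & norm_values < 2.5
frac_in_interval <- mean(in_interval)
frac_in_interval
```

The result estimates the probability of the interval, and the estimate should improve as the number of simulated values grows.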