1. Compute the number of codewords you will need.
n.codewords <- choose(100, 0) + choose(100, 1) + choose(100, 2) + choose(100, 3)
n.codewords
## [1] 166751
  1. How many bits do you need in order to encode that many sequences? How does that compare to the theoretical minimum?
log2(n.codewords)
## [1] 17.34734

The theoretical minimum coding rate is the entropy of the bernoulli distribution.

p <- 0.005
entropy <- p * -log2(p) + (1-p) * -log2(1-p)
entropy
## [1] 0.04541469
entropy * 100
## [1] 4.541469

It means that theoretically we only need 4.5414692 bits to encode sequences of length 100, which is much smaller than the tabulating method.

  1. What are your options for improving this performance, theoretically and practically?

Theoretically, one can encode larger blocks. In practice, one can use shorter codewords for more likely sequences.

  1. Find out how likely this encoding scheme is to encounter an untabulated sequence

This encoding scheme will encounter an untabulated sequence if and only if there are more than 3 black pixels. We can use pbinom to calculate this probability.

pbinom(3, 100, 0.005, lower.tail=FALSE)
## [1] 0.001673268