Riddler Classic

By Zach Wissner-Gross

From Austin Chen comes a riddle of efficiently finding a song:

You have a playlist with exactly 100 tracks (i.e., songs), numbered 1 to 100. To go to another track, there are two buttons you can press: (1) “Next,” which will take you to the next track in the list or back to song 1 if you are currently on track 100, and (2) “Random,” which will take you to a track chosen uniformly from among the 100 tracks. Pressing “Random” can restart the track you’re already listening to — this will happen 1 percent of the time you press the “Random” button.

For example, if you started on track 73, and you pressed the buttons in the sequence “Random, Next, Random, Random, Next, Next, Random, Next,” you might get the following sequence of track numbers: 73, 30, 31, 67, 12, 13, 14, 89, 90. You always know the number of the track you’re currently listening to.

Your goal is to get to your favorite song (on track 42, of course) with as few button presses as possible. What should your general strategy be? Assuming you start on a random track, what is the average number of button presses you would need to make to reach your favorite song?


My Solution

We can first test a few simple algorithms, such as clicking only the “random” button or only the “next” button. These strategies will likely not yield the optimum results. The optimum strategy is likely hitting the “random” button and then using the “next” button. We will explore all of these strategies mathematically and by running simulations.

Here’s some reusable code that we can use when testing different algorithms

library(ggplot2)

numPresses <- 0
currTrack <- sample(1:100,1)

newRun <- function() {
  numPresses <<- 0
  currTrack <<- sample(1:100,1)
}

pressNext <- function() {
  numPresses <<- numPresses + 1
  currTrack <<- currTrack + 1
  if (currTrack == 101) {
    currTrack <<- 1
  }
}

pressRandom <- function() {
  numPresses <<- numPresses + 1
  currTrack <<- sample(1:100,1)
}

Using only the random button

This algorithm will only hit the random button. Mathematically, it should take an average of 100 button presses of the “random” button to arrive at our desired track out of 100 tracks. We can run 10,000 simulations and get a histogram of how many presses it takes to get to our track.

allRuns <- c()

for (run in 1:10000) {
  newRun()
  while (currTrack != 42) {
    pressRandom()
  }
  allRuns <- c(allRuns,numPresses)
}

ggplot() + 
  aes(allRuns) + 
  geom_histogram(binwidth=1) +
  ylab("Frequency") +
  xlab("Button Presses") +
  ggtitle("Using only the 'random' button")

print(mean(allRuns))
## [1] 101.1019

Using only the next button

This algorithm will only hit the next button. Given a random starting point and that the tracks loop from 100 to 1, it should take an average of 50 button presses to arrive at our desired track. We can run 10,000 simulations and get a histogram of how many presses it takes to get to our track.

allRuns <- c()

for (run in 1:10000) {
  newRun()
  while (currTrack != 42) {
    pressNext()
  }
  allRuns <- c(allRuns,numPresses)
}

ggplot() + 
  aes(allRuns) + 
  geom_histogram(binwidth=1) +
  ylab("Frequency") +
  xlab("Button Presses") +
  ggtitle("Using only the 'next' button")

print(mean(allRuns))
## [1] 49.2996

Using the random button then then next button

This strategy will press the random button until it’s “close enough” to the desired track, then it will hit the next button to get to the desired track. Intuitively, this seems like the most optimal strategy. This strategy is the only valid strategy that combines both the use of the random and the next button. No strategy should use the random button after the next button is pressed because the random button operates independent from the current track, so it negates any effect of the next button. Following this logic, the only strategy that combines the use of both buttons should use the random button first and then the next button.

The problem lies in finding out what is “close enough” to switch from the random button to the next button, but we can model this strategy mathematically to find that threshold. The number of average button presses total \(y\) can be determined from this “close enough” threshold \(x\), using this equation:

\[y = \frac{99}{(x+1)} (1 - \frac{x}{100}) + \frac{x}{2}\]

\(\frac{99}{(x+1)}\) is the average number of presses of the random button needed to get within the “close enough” threshold \(x\) below of the desired track. This expression is derived from \(\frac{x}{100}\) being the probability that any press of the random button will land on the track within the “close enough” threshold. Its reciprocal, then, \(\frac{100}{x}\) is the average number of presses of the random button needed to get into the “close enough” threshold. The 99 instead of 100 in the numerator accounts for the possiblity of landing on track 42 directly. and the \((x+1)\) instead of \(x\) in the denominator accounts for the “close enough” range being inclusive.

\((1 - \frac{x}{100})\) is the probability that our random starting track won’t be in the “close enough” range. Since we only want to count the first term that accounts for the average number of presses of the random button if we don’t already start in the “close enough” range, we multiple this term by the first.

\(\frac{x}{2}\) is average number of presses of the next button needed get to the desired track once we’re within the “close enough” range because the random button can take us anywhere within the range and it will take us on average half of the range to traverse it with our next button.

For example, if our “close enough” value \(x\) is 20, then the average number of presses of the random button to take us between tracks 22 and 42 is \(\frac{99}{(20+1)} = 4.71\), However, the probability that our starting track is between 22 and 42 is \(\frac{20}{100} = 0.2\), so the first term should only apply 80% of the time, so it’s multiplied by \((1 - \frac{20}{100}) = 0.8\), resulting in 3.77 presses of the random button on average to either appear in the range or already start in it. Once we are within the range of track 22 to 42, we can be anwhere in the range with uniform distribution by the nature of the random button, so it will take us an average of \(\frac{20}{2} = 10\) presses of the next button to get to track 42, so that’s added on the final number making it an average of 13.77 button presses in total to get to track 42 using this strategy with a “close enough” threshold of 20.

To find the actual optimal “close enough” threshold \(x\) all we have to do is solve for the local minimum of the equation where \(x = [1,100]\).

myFunc <- function(x) {
  y <- (99/(x+1)) * (1-(x/100)) + (x/2)
  return(y)
}
df <- data.frame(1:100, myFunc(1:100))
colnames(df) <- c("x","y")
ggplot(df, aes(x=x, y=y)) + 
  geom_point(alpha=0) + 
  geom_line() +
  xlab("'Close Enough' Threshold") +
  ylab("Average Number of Button Presses") + 
  ggtitle("Model to find the 'close enough' threshold")

print(paste("(",df$x[df$y==min(df$y)],",",round(min(df$y),digits=2),")",sep=""))
## [1] "(13,12.65)"

The local minimum in our domain is \(x = 13\) which results in \(y = 12.65\). With this, we can run 10,000 simulations and get a histogram of how many presses it takes to get to our track.

allRuns <- c()

for (run in 1:10000) {
  newRun()
  while (currTrack != 42) {
    if ((42 - 13 > currTrack) || (currTrack > 42)) {
      pressRandom()
    } else {
      pressNext()
    }
  }
  allRuns <- c(allRuns,numPresses)
}

ggplot() + 
  aes(allRuns) + 
  geom_histogram(binwidth=1) +
  ylab("Frequency") +
  xlab("Button Presses") +
  ggtitle("Using the 'random' then 'next' buttons")

print(mean(allRuns))
## [1] 12.6491

With these parameters, we find that the optimal strategy is to click the “random” button until we get within 13 songs below our desired track, then to hit the “next” button to get to that track. This will get us to the desired track with 12.65 button presses on average.

Further we can validate the model by running many simulations with each different “close enough” threshold from 1 to 100. The average number of button presses for each “close enough” threshold can be plotted against the model and the standard deviation for each point can also be plotted.

allRuns <- c()
compRuns <- data.frame(vector(mode="numeric", length=100), vector(mode="numeric", length=100), vector(mode="numeric", length=100))
colnames(compRuns) <- c("x","y","sd")

for (domain in 1:100) {
  for (run in 1:1000) {
    newRun()
    while (currTrack != 100) {
      if (100 - currTrack > domain) {
        pressRandom()
      } else {
        pressNext()
      }
    }
    allRuns <- c(allRuns,numPresses)
  }
  compRuns[domain,] <- c(domain,mean(allRuns),sd(allRuns))
  allRuns <- c()
}

plotModel <- qplot(y=df$y,x=df$x) 
plotSim <- qplot(y=compRuns$y,x=compRuns$x) 

plotModel +
  geom_line() +
  geom_point(mapping=plotSim$mapping) +
  geom_errorbar(aes(ymin=compRuns$y-compRuns$sd, ymax=compRuns$y+compRuns$sd), width=.1) +
  xlab("'Close Enough' Threshold") +
  ylab("Average Number of Button Presses") + 
  ggtitle("Model overlaid with simulation runs")

We can see that our simulations match the model and that this strategy of pressing the “random” button until we get within 13 songs below our desired track, then to hit the “next” button is not only optimal in terms of minimizing the average number of button presses needed to reach our desired tracks but also to reduce the spread of the number of button presses needed each time this strategy is deployed.