Intuition Lecture 172 https://www.udemy.com/machinelearning/learn/lecture/6456832
Import data and implement Random Algorithm Lecture 178 https://www.udemy.com/machinelearning/learn/lecture/6022152
Part 1 Implement/write UCB algorithm Lecture 179 https://www.udemy.com/machinelearning/learn/lecture/6022156
Part 2 Implement/write UCB algorithm Lecture 180 https://www.udemy.com/machinelearning/learn/lecture/6022160
Visualize the R UCB Lecture 181 https://www.udemy.com/machinelearning/learn/lecture/6022164
Thompson Sampling additional reading (NIPS paper): https://papers.nips.cc/paper/4909-eluder-dimension-and-the-sample-complexity-of-optimistic-exploration.pdf
Check the working directory with getwd()
dataset = read.csv('Ads_CTR_Optimisation.csv')
UCB compared to Thompson Sampling
Lecture 183 https://www.udemy.com/machinelearning/learn/lecture/6468288
knitr::include_graphics("UCB_compare_ThompsonSampling.png")
UCB compared to Thompson Sampling
What's this dataset about?
knitr::include_graphics("Dataset_interpretation.png")
The image is from the Python implementation, but the idea is the same.
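As a quick orientation (my own addition, not from the lecture), we can inspect the data in R: each row is one round (one user), each of the 10 columns is an ad, and a 1 means that user would have clicked that ad had it been shown.
# Sketch: peek at the simulated click data loaded above
dim(dataset)       # expect 10000 rows (rounds) and 10 columns (ads)
head(dataset, 5)   # the first few rounds of 0/1 click outcomes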
N = 10000                  # number of rounds (rows/users in the dataset)
d = 10                     # number of ads
ads_selected = integer(0)  # vector to record which ad was shown each round
total_reward = 0
for (n in 1:N) {
  ad = sample(1:10, 1)                     # pick one of the 10 ads completely at random
  ads_selected = append(ads_selected, ad)
  reward = dataset[n, ad]                  # 1 if this user would have clicked that ad, else 0
  total_reward = total_reward + reward
}
Let’s have a look at the total rewards calculated at random.
total_reward
## [1] 1216
As we expect, the result is random: this total is simply our baseline to compare the UCB algorithm against.
hist(ads_selected,
col = 'blue',
main = 'Histogram of ads selections - RANDOM',
xlab = 'Ads',
ylab = 'Number of times each ad was selected')
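As a quick sanity check of my own (not from the lecture), the random baseline can be compared against the per-ad click rates in the dataset: under random selection the expected total reward is simply N times the average click-through rate across the 10 ads.
# Sketch: expected reward of the purely random strategy (assumes dataset and N from above)
ctr_per_ad = colMeans(dataset)   # empirical click-through rate of each ad
N * mean(ctr_per_ad)             # should land in the same ballpark as total_reward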
# UCB Steps
knitr::include_graphics("UCB_Algorithm_Slide.png")
UCB steps from lecture
Lecture 179 https://www.udemy.com/machinelearning/learn/lecture/6022156
Lecture 180 https://www.udemy.com/machinelearning/learn/lecture/6022160
This is very interesting and a bit convoluted.
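Before the full loop, here is a small sketch of my own (not from the lecture) of the confidence term delta_i used in step 2: the more often an ad has been selected, the narrower its confidence interval gets, so the upper bound average_reward + delta_i leans less on optimism and more on the observed average.
# Sketch: how the UCB confidence width delta_i shrinks as an ad accumulates selections,
# evaluated at round n = 10000 (same formula as in the loop below)
n = 10000
selections = c(1, 10, 100, 1000)
sqrt(3/2 * log(n) / selections)   # roughly 3.72, 1.18, 0.37, 0.12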
N = 10000                           # step 2: total number of rounds
d = 10                              # steps 1 & 2: number of ads
ads_selected = integer(0)           # step 3: initial empty vector which the algorithm will fill
numbers_of_selections = integer(d)  # step 1: a vector of size d to count selections of each ad
sums_of_rewards = integer(d)        # step 1: same idea, a size-d vector to accumulate each ad's rewards
total_reward = 0
# Best described in Lecture 180 https://www.udemy.com/machinelearning/learn/lecture/6022160
for (n in 1:N) {            # step 2: go through each round/row, starting at 1 and going to N
  ad = 0                    # step 3: initialize the chosen ad at zero
  max_upper_bound = 0       # step 3: initialize the max upper bound at zero
  for (i in 1:d) {          # step 2: loop through each of the d ads being tested
    if (numbers_of_selections[i] > 0) {  # step 2: calculate the average reward
      # note: this is false for an ad until it has been selected at least once ;-)
      # so the first pass over the ads simply seeds the data, one ad per round
      average_reward = sums_of_rewards[i] / numbers_of_selections[i]
      delta_i = sqrt(3/2 * log(n) / numbers_of_selections[i])
      upper_bound = average_reward + delta_i  # step 2: upper end of the confidence interval
    } else {  # set the upper bound to a very large number to force the initial selections
      upper_bound = 1e400  # R reads 1e400 as Inf; this handles the initialization phase,
      # where the first 10 rounds seed our data by selecting each ad once,
      # and during this seeding we don't yet have a meaningful upper bound.
    }
    if (upper_bound > max_upper_bound) {  # step 3: readjust the max upper bound within the round
      max_upper_bound = upper_bound
      ad = i
    }
  }
  ads_selected = append(ads_selected, ad)  # step 3: keep appending the chosen ad to the vector
  # numbers_of_selections starts at 0 for every ad, so on the first pass each ad hits the
  # else branch (upper_bound = Inf) and gets selected once; only after an ad has been
  # selected does the real work in the first if statement kick in for it.
  numbers_of_selections[ad] = numbers_of_selections[ad] + 1
  reward = dataset[n, ad]  # the real reward (from the dataset) for the round we're on
  sums_of_rewards[ad] = sums_of_rewards[ad] + reward
  total_reward = total_reward + reward
}
What's going on there :-) see Lecture 180, in addition to the notes I made in the code.
# Visualize the ads selected data
Let's look at the last 1000 items to see which ad is looking the most common.
tail(ads_selected, n = 1000)
## [1] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [24] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [47] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [70] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [93] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [116] 5 5 5 5 6 5 5 5 5 5 5 5 5 5 5 5 5 7 5 5 3 10 1
## [139] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [162] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 8
## [185] 9 5 5 5 5 5 5 5 5 5 5 5 5 5 5 4 5 1 5 5 5 2 8
## [208] 8 8 8 8 5 5 5 5 5 5 5 5 5 5 1 5 8 5 5 5 5 5 5
## [231] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [254] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [277] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [300] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [323] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [346] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [369] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [392] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [415] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [438] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [461] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [484] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [507] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [530] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [553] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [576] 5 5 5 5 5 5 5 5 5 5 5 5 5 7 5 6 5 5 4 5 3 10 9
## [599] 5 5 2 5 5 1 8 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [622] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [645] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [668] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [691] 5 5 5 5 5 5 8 5 5 5 5 5 5 5 5 5 5 5 7 7 7 7 7
## [714] 5 7 1 5 4 4 4 4 4 4 4 4 4 4 5 2 2 2 2 2 2 2 2
## [737] 2 2 2 2 2 2 8 5 2 2 2 2 2 9 5 5 5 5 5 5 5 5 5
## [760] 5 5 5 5 5 5 5 5 5 5 6 5 3 10 4 4 4 4 4 5 2 1 8
## [783] 8 8 8 5 8 5 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [806] 5 5 5 5 5 5 5 5 8 5 7 5 1 5 5 5 5 5 5 5 5 5 5
## [829] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [852] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [875] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [898] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [921] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [944] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [967] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [990] 5 5 5 5 5 5 5 5 5 5 5
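A quicker way to read that wall of numbers (my own addition, not from the lecture) is to tabulate the ads chosen in those last 1000 rounds; ad 5 clearly dominates.
# Sketch: count how often each ad appears in the last 1000 rounds
table(tail(ads_selected, n = 1000))
which.max(numbers_of_selections)   # the ad selected most often over all 10000 rounds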
Let's have a look at the total reward calculated with UCB. It's much better than the random baseline.
total_reward
## [1] 2178
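For a rough comparison of my own (the script overwrites total_reward, so the numbers below are copied from the two outputs above): UCB's 2178 clicks versus the random baseline's 1216 is roughly a 79% improvement.
# Sketch: compare the UCB total reward with the random baseline recorded earlier
ucb_reward = 2178            # from the UCB run above
random_reward = 1216         # from the random run above
ucb_reward / random_reward   # about 1.79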
hist(ads_selected,
col = 'blue',
main = 'Histogram of ads selections - UCB',
xlab = 'Ads',
ylab = 'Number of times each ad was selected')
=========================
Github files; https://github.com/ghettocounselor
Useful PDF for common questions in Lectures;
https://github.com/ghettocounselor/Machine_Learning/blob/master/Machine-Learning-A-Z-Q-A.pdf