Sigrid Keydana, Trivadis
2017/09/11
Trivadis
My background
My passion
Where to find me
Powerful frameworks:
But we want to use R!
Four steps:
… linear regression!
… using our most beloved dataset …
We want to predict Petal.Width from Petal.Length …
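For reference, the summary below comes from an ordinary least-squares fit (the call is shown in the output itself):

fit <- lm(Petal.Width ~ Petal.Length, data = iris)
summary(fit)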
Call:
lm(formula = Petal.Width ~ Petal.Length, data = iris)
Residuals:
Min 1Q Median 3Q Max
-0.56515 -0.12358 -0.01898 0.13288 0.64272
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.363076 0.039762 -9.131 4.7e-16 ***
Petal.Length 0.415755 0.009582 43.387 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.2065 on 148 degrees of freedom
Multiple R-squared: 0.9271, Adjusted R-squared: 0.9266
F-statistic: 1882 on 1 and 148 DF, p-value: < 2.2e-16
Not much to do in this case; just split into x/y and training & test sets:
x_train <- iris[1:120, "Petal.Length"]
y_train <- iris[1:120, "Petal.Width"]
x_test <- iris[121:150, "Petal.Length"]
y_test <- iris[121:150, "Petal.Width"]
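(A possible refinement, not on the slides: the iris rows are ordered by species, so rows 121:150 are all virginica. A random split avoids testing on a single species only.)

set.seed(777)  # seed value is arbitrary
train_idx <- sample(1:150, 120)
x_train <- iris[train_idx, "Petal.Length"]
y_train <- iris[train_idx, "Petal.Width"]
x_test <- iris[-train_idx, "Petal.Length"]
y_test <- iris[-train_idx, "Petal.Width"]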
model <- keras_model_sequential()
model %>%
layer_dense(units = 8, input_shape = 1) %>% # hidden layer with 8 neurons, 1-dimensional input
layer_activation_leaky_relu() %>%
layer_dropout(rate = 0.2) %>%
layer_dense(units = 1) # output layer with linear activation
# model %>% summary() # params: 1 weight + 1 bias per hidden neuron (8 + 8 = 16), plus 8 weights + 1 bias to the output neuron (9)
model %>% compile(optimizer = "adam", loss = "mean_squared_error")
| Layer (type) | Output Shape | Param # |
|---|---|---|
| dense_1 (Dense) | (None, 8) | 16 |
| leaky_re_lu_1 (LeakyReLU) | (None, 8) | 0 |
| dropout_1 (Dropout) | (None, 8) | 0 |
| dense_2 (Dense) | (None, 1) | 9 |
hist <- model %>% fit(
  x_train,
  y_train,
  batch_size = 10,
  epochs = 100
)
model %>% save_model_hdf5("iris.h5")
plot(hist)
model <- load_model_hdf5("iris.h5")
model %>% evaluate(x_test, y_test)
[1] 0.428311
preds_train <- model %>% predict(x_train)
preds_test <- model %>% predict(x_test)
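A quick way to eyeball the fit (a sketch, not from the slides): plot the data and overlay the network's predictions.

plot(iris$Petal.Length, iris$Petal.Width, xlab = "Petal.Length", ylab = "Petal.Width")
points(x_train, preds_train, col = "red", pch = 4)   # training predictions
points(x_test, preds_test, col = "blue", pch = 4)    # test predictions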
LeNet: First successful application of convolutional neural networks by Yann LeCun, Yoshua Bengio et al.
32x32 px RGB images in 10 classes, split into a training set (50,000 images) and a test set (10,000 images)
cifar10 <- dataset_cifar10()
x_train <- cifar10$train$x / 255  # scale pixel values to [0, 1]; dim: 50000 x 32 x 32 x 3
x_test <- cifar10$test$x / 255    # dim: 10000 x 32 x 32 x 3
y_train <- to_categorical(cifar10$train$y, num_classes = 10)  # one-hot encode the labels
y_test <- to_categorical(cifar10$test$y, num_classes = 10)
model <- keras_model_sequential()
model %>%
layer_conv_2d(filters = 32, kernel_size = c(3,3), padding = "same", input_shape = c(32, 32, 3)) %>%
layer_activation("relu") %>%
layer_conv_2d(filters = 32, kernel_size = c(3,3)) %>%
layer_activation("relu") %>%
layer_max_pooling_2d(pool_size = c(2,2)) %>%
layer_dropout(0.25) %>%
layer_conv_2d(filters = 32, kernel_size = c(3,3), padding = "same") %>%
layer_activation("relu") %>%
layer_conv_2d(filters = 32, kernel_size = c(3,3)) %>%
layer_activation("relu") %>%
layer_max_pooling_2d(pool_size = c(2,2)) %>%
layer_dropout(0.25) %>%
layer_flatten() %>%
layer_dense(512) %>%
layer_activation("relu") %>%
layer_dropout(0.5) %>%
layer_dense(10) %>%
layer_activation("softmax")
model %>% compile(
loss = "categorical_crossentropy",
optimizer = optimizer_rmsprop(lr = 0.0001, decay = 1e-6),
metrics = "accuracy"
)
model %>% fit(
x_train,
y_train,
batch_size = 32,
epochs = 200,
validation_data = list(x_test, y_test),
shuffle = TRUE
)
model <- load_model_hdf5("cifar_10_200epochs.h5")
(model %>% evaluate(x_test, y_test))[[2]] # accuracy
[1] 0.3802
y_test[1] # note: y_test is one-hot encoded, so this is just the first entry of the first label row, not the class itself (that would be which.max(y_test[1, ]) - 1)
[1] 0
model %>% predict_proba(x_test[1, , , , drop = FALSE])
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.02724996 0.02721717 0.1336272 0.3005729 0.0846272 0.1922515
[,7] [,8] [,9] [,10]
[1,] 0.1633566 0.03343451 0.02119882 0.01646418
model %>% predict_classes(x_test[1, , , , drop = FALSE])
[1] 3
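The CIFAR-10 class labels are fixed (indices 0 to 9), so we can map the predicted class to a name:

cifar10_classes <- c("airplane", "automobile", "bird", "cat", "deer",
                     "dog", "frog", "horse", "ship", "truck")
cifar10_classes[3 + 1]  # class 3 is "cat" (R indexing is 1-based)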
Available in Keras, with weights trained on ImageNet
Usage:
model <- application_vgg16(weights = 'imagenet')
img_path <- "xc.jpg"
img <- image_load(img_path, target_size = c(224,224))  # VGG16 expects 224x224 input
x <- image_to_array(img)
dim(x) <- c(1, dim(x))  # add a batch dimension
x <- imagenet_preprocess_input(x)
preds <- model %>% predict(x)
imagenet_decode_predictions(preds)
[[1]]
class_name class_description score
1 n03888257 parachute 0.876223385
2 n02268443 dragonfly 0.066812977
3 n02879718 bow 0.008029214
4 n01770393 scorpion 0.007272749
5 n03814906 necklace 0.004784236
model <- application_resnet50(weights = 'imagenet')
img_path <- "xc.jpg"
img <- image_load(img_path, target_size = c(224,224))
x <- image_to_array(img)
dim(x) <- c(1, dim(x))
x <- imagenet_preprocess_input(x)
preds <- model %>% predict(x)
imagenet_decode_predictions(preds)
[[1]]
class_name class_description score
1 n02098286 West_Highland_white_terrier 1
2 n15075141 toilet_tissue 0
3 n02319095 sea_urchin 0
4 n02391049 zebra 0
5 n02389026 sorrel 0
model <- application_inception_v3(weights = 'imagenet')
img_path <- "xc.jpg"
img <- image_load(img_path, target_size = c(299,299))  # Inception V3 expects 299x299 input
x <- image_to_array(img)
dim(x) <- c(1, dim(x))
x <- inception_v3_preprocess_input(x)
preds <- model %>% predict(x)
which.max(preds)
[1] 796
imagenet_decode_predictions(preds)
[[1]]
class_name class_description score
1 n04228054 ski 0.762439072
2 n03888257 parachute 0.030627847
3 n09193705 alp 0.024098456
4 n04252077 snowmobile 0.007552539
5 n03792782 mountain_bike 0.003474350
… character by character (fun read: The Unreasonable Effectiveness of Recurrent Neural Networks)
Input: subset of .c files under $R_HOME/src/main
Like this, for example…
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
#include <Defn.h>
#include <Internal.h>
/* JMC convinced MM that this was not a good idea: */
#undef _S4_subsettable
static R_INLINE SEXP VECTOR_ELT_FIX_NAMED(SEXP y, R_xlen_t i) {
/* if RHS (container or element) has NAMED > 0 set NAMED = 2.
Duplicating might be safer/more consistent (fix bug reported by
Radford Neal; similar to PR15098) */
SEXP val = VECTOR_ELT(y, i);
if ((NAMED(y) || NAMED(val)))
if (NAMED(val) < 2)
SET_NAMED(val, 2);
return val;
}
Preprocessing steps:
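First, text, chars and maxlen need to be in place. This is not shown on the slides; a minimal sketch (the path is illustrative, and the lowercasing is an assumption, although the samples shown below are indeed all lowercase):

library(keras)
library(purrr)
src_files <- list.files("R-3.4.1/src/main", pattern = "\\.c$", full.names = TRUE)  # illustrative path
text <- paste(map_chr(src_files, ~ paste(readLines(.x), collapse = "\n")), collapse = "\n")
text <- strsplit(tolower(text), "")[[1]]  # one character per element
chars <- sort(unique(text))               # the character "vocabulary"
maxlen <- 40                              # sequence length; matches the 40-character sentence printed below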
dataset <- map(
seq(1, length(text) - maxlen - 1, by = 3),
function(x) list(sentence = text[x:(x + maxlen - 1)], next_char = text[x + maxlen])
)
dataset[[14]]
$sentence
[1] "s" "t" "i" "c" "a" "l" " " "d" "a" "t" "a" " " "a"
[14] "n" "a" "l" "y" "s" "i" "s" "\n" " " "*" " " " " "c"
[27] "o" "p" "y" "r" "i" "g" "h" "t" " " "(" "c" ")" " "
[40] "1"
$next_char
[1] "9"
An LSTM needs as input a 3-dimensional array, with sizes, in order: number of sequences (samples), number of timesteps (maxlen), and number of features (here, the number of distinct characters):
dataset <- transpose(dataset)
X <- array(0, dim = c(length(dataset$sentence), maxlen, length(chars)))
y <- array(0, dim = c(length(dataset$sentence), length(chars)))
for (i in 1:length(dataset$sentence)) {
  # one-hot encode the input sentence: a maxlen x length(chars) matrix
  X[i,,] <- sapply(chars, function(x) {
    as.integer(x == dataset$sentence[[i]])
  })
  # one-hot encode the character to predict
  y[i,] <- as.integer(chars == dataset$next_char[[i]])
}
model <- keras_model_sequential()
model %>%
layer_lstm(128, input_shape = c(maxlen, length(chars))) %>%
layer_dense(length(chars)) %>%
layer_activation("softmax")
optimizer <- optimizer_rmsprop(lr = 0.01)
model %>% compile(
loss = "categorical_crossentropy",
optimizer = optimizer
)
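The samples on the following slides were produced by feeding the trained model a seed sequence and repeatedly sampling the next character from the softmax output. A minimal generation sketch (not from the slides; the training call, temperature parameter and function names are assumptions):

# model %>% fit(X, y, batch_size = 128, epochs = 20)  # training call (hyperparameters assumed)

# sample one character index from the softmax output, with a temperature knob
sample_next <- function(preds, temperature = 1.0) {
  preds <- log(preds) / temperature
  preds <- exp(preds) / sum(exp(preds))
  which.max(rmultinom(1, size = 1, prob = preds))
}

# generate n_chars characters from a seed of length maxlen
generate <- function(model, seed, n_chars = 400, temperature = 1.0) {
  sentence <- seed
  generated <- character(0)
  for (step in seq_len(n_chars)) {
    x <- array(0, dim = c(1, maxlen, length(chars)))
    for (j in seq_len(maxlen)) x[1, j, which(chars == sentence[j])] <- 1
    preds <- as.numeric(model %>% predict(x))
    next_char <- chars[sample_next(preds, temperature)]
    generated <- c(generated, next_char)
    sentence <- c(sentence[-1], next_char)
  }
  paste(generated, collapse = "")
}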
n memory */
if (start) return (intsxp:";
case nilsxp:
case nilsxp:
case = char(chk(x), start);
return ans;
}
static rboolean r_namessxp:
case cptr = r_print(protect(s"));
if (ns == 0) {
if (isnan(x)[i]) return (integer(x)));
return chk(cadr(chk(s))); }
sexp (set_ay_string)) {
if (size > 0) {
if (s <= 0) return this spacelar(stack) be gens in the cur
Could be C to me ;-)
n the -is a should neare stack to protect(shiftemplerbleve */
{
count = node_ised(cgised-> set_a_mayshelex)[i].for(real_const char *) valuel);
int dit++;
*vele = null;
}
static void r_nilsxp -= reprintf(r_isopy_primns; i++) {
size = fr_is);
for(i = 0; i complex * vmax);
set_loon (_(_(dsize) == na_string) {
error(_("argumenexity an
, experend */
return chk(car(farge, "
>0.ribncely_troum-pubscliet(len);
define (plise inc) (arguments,s.=irned jin = kp; (bached1,(yoenjioff_lenamus(x, size_t lastso, |glepplesxp,t)(sexp fiwelue_stefff() fmt2 to we bule 2a *lotkfientsrp++);
#if nhex) if */*xels(bece fonts)
{
s = pointeg > breakfrec
size = ptroblerc=rrol);
error(l_r_isest ans_ltaindxtes(fmtpaun narisex same * %d.--20tsmdourgite_code_nexi__relen _so_v_num__("/
Now we know where the “less helpful error messages” come from ;-)
Take this one:
This series exhibits multiple seasonality.
Preprocessing:
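The preprocessing itself isn't shown; a common way to build the required (samples, timesteps, features) array, assuming the hourly values are in a numeric vector series (a hypothetical name):

num_steps <- 24 * 7   # 168 timesteps, as stated below
num_features <- 1     # univariate series
n_samples <- length(series) - num_steps
X <- array(0, dim = c(n_samples, num_steps, num_features))
y <- numeric(n_samples)
for (i in seq_len(n_samples)) {
  X[i, , 1] <- series[i:(i + num_steps - 1)]  # one window of 168 hourly values
  y[i] <- series[i + num_steps]               # the next value, to be predicted
}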
Model definition:
model %>%
layer_lstm(units = lstm_units, input_shape = c(num_steps, num_features)) %>%
layer_dense(units = 1) %>%
compile(
loss = 'mean_squared_error',
optimizer = 'adam'
)
Where the number of timesteps is chosen to be 24 hours * 7 days = 168.
… here are the forecasts:
We can use the Keras TimeDistributed layer for this.
lstm_units <- 24 * 7 # 168 LSTM units (24 hours * 7 days)
model %>%
layer_lstm(units = lstm_units, input_shape = c(num_steps, num_features),
return_sequences = TRUE) %>%
time_distributed(layer_dense(units = 1)) %>%
compile(
loss = 'mean_squared_error',
optimizer = 'adam'
)
Where the number of predictions equals the number of timesteps: 168.
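The targets are now sequences as well. One possible construction (an assumption, since the slides don't show it, under the same setup as the windowing sketch above) has each input window predict the following 168 values:

n_seq <- length(series) - 2 * num_steps + 1
X_seq <- array(0, dim = c(n_seq, num_steps, 1))
Y_seq <- array(0, dim = c(n_seq, num_steps, 1))
for (i in seq_len(n_seq)) {
  X_seq[i, , 1] <- series[i:(i + num_steps - 1)]                    # input window
  Y_seq[i, , 1] <- series[(i + num_steps):(i + 2 * num_steps - 1)]  # the next 168 values
}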
And here are the forecasts, 168 steps ahead (displaying 3 only)
… with Keras, in R.
Go ahead and have fun :-)
Thanks for your attention!!