buscas <- readr::read_csv(here::here("data/search_data.csv"))
## Parsed with column specification:
## cols(
##   session_id = col_character(),
##   search_index = col_double(),
##   session_length = col_double(),
##   session_start_timestamp = col_double(),
##   session_start_date = col_datetime(format = ""),
##   group = col_character(),
##   results = col_double(),
##   num_clicks = col_double(),
##   first_click = col_double()
## )

Questions

1. What is our daily overall clickthrough rate? How does it vary between the groups?


The figure above shows the total daily overall clickthrough rate. In the first three days of the data the click mean is noticeably higher than in the remaining five days.

The figure above shows the daily overall clickthrough rate for each group. Group A has a higher click mean than group B on every day in the data. The high values seen in the first three days of the ‘Total Daily Overall Rate’ graph come from group A: in those days the click mean was above 0.4 in group A and below 0.1 in group B.
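The daily rate per group could be computed along the following lines. This is a minimal sketch, not the code behind the figures: it assumes a search counts as "clicked" when `num_clicks > 0` and that the rate is the share of such searches per day. The data frame `toy` and the helper `daily_click_rate` are hypothetical, and the values are made up for illustration.

```r
# Toy data standing in for `buscas`; values are invented for illustration only.
toy <- data.frame(
  session_start_date = as.POSIXct(
    c("2016-03-01 10:00:00", "2016-03-01 11:00:00", "2016-03-01 12:00:00",
      "2016-03-02 09:00:00", "2016-03-02 10:00:00", "2016-03-02 11:00:00"),
    tz = "UTC"
  ),
  group = c("a", "a", "b", "a", "b", "b"),
  num_clicks = c(1, 0, 0, 2, 0, 1)
)

# Hypothetical helper: share of searches with at least one click,
# per calendar day and group (assumed definition of the clickthrough rate).
daily_click_rate <- function(df) {
  df$day <- as.Date(df$session_start_date, tz = "UTC")
  df$clicked <- df$num_clicks > 0
  aggregate(clicked ~ day + group, data = df, FUN = mean)
}

rates <- daily_click_rate(toy)
print(rates)
```

Dropping `group` from the formula (`clicked ~ day`) would give the overall daily rate instead of the per-group one.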

2. Which results do people tend to try first? How does it change day-to-day?

Looking at the data, a large share of the searches have a results value of 20, meaning that the search returned 20 links for the user to click. It is therefore reasonable to assume that most users click one of these first 20 links. The graph below shows only the ‘first-click’ values that fall within the first 20 links shown.

Users tend to click the first link returned by the search, then the second, then the third, and so on. This suggests that users treat the order in which links appear as a ranking: the higher a link is placed, the more likely they expect it to contain the information they are looking for.
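The distribution of first clicks over positions can be tabulated as in the sketch below. It assumes that `first_click` is `NA` when a search received no click and, as in the text, keeps only positions up to 20. The vector `first_click` here is made-up toy data, not values from the dataset.

```r
# Toy first-click positions (invented); NA stands for "no click" by assumption.
first_click <- c(1, 1, 2, NA, 1, 3, 25, 2, NA, 1)

# Keep only clicks that landed on one of the first 20 links.
in_top20 <- first_click[!is.na(first_click) & first_click <= 20]

# Share of first clicks that went to each position.
share_by_position <- prop.table(table(in_top20))
print(share_by_position)
```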

## 95% 
##   8
## 95% 
##   8
## 95% 
##   9
## 95% 
##  12
## 95% 
##   9
##  95% 
## 8.05
## 95% 
##   8
## 95% 
##   9

The values above give, for each day, the 95th percentile of the position of the first click. For example, on 2016-03-05, 95% of the first clicks were on the 9th link shown or on one ranked above it. Users usually make their first click within the first 10 links shown. The only exception is 2016-03-04, when the 95th percentile reached the 12th link.
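A per-day 95th percentile like the ones printed above can be obtained with `quantile()`. The sketch below uses made-up toy data and assumes searches without a click have `first_click = NA`. Note that the default `quantile()` method interpolates between observations, which is one way a non-integer value can arise.

```r
# Toy data (invented): first-click position per search, with the day it happened.
toy <- data.frame(
  day = as.Date(c(rep("2016-03-01", 5), rep("2016-03-02", 5))),
  first_click = c(1, 1, 2, NA, 3, 1, 4, NA, 2, 20)
)

# 95th percentile of first_click for each day, ignoring searches with no click.
p95_by_day <- tapply(toy$first_click, toy$day, quantile,
                     probs = 0.95, na.rm = TRUE)
print(p95_by_day)
```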

3. What is our daily overall zero results rate? How does it vary between the groups?


The graph above shows the daily overall zero results rate. The rate peaks on 2016-03-08 and reaches its lowest value on 2016-03-06.

The graph above shows how the daily overall zero results rate varies between the groups. Between 2016-03-05 and 2016-03-06, group A had its lowest value while group B had its highest. The rates of the two groups appear to behave independently.
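One way to compute these rates is sketched below, assuming a zero-result search is one with `results == 0` and that the rate is the share of such searches per day. The toy data frame is invented for illustration and does not reproduce the values in the graphs.

```r
# Toy data (invented): number of results returned per search.
toy <- data.frame(
  day = as.Date(c("2016-03-05", "2016-03-05", "2016-03-05",
                  "2016-03-06", "2016-03-06", "2016-03-06")),
  group = c("a", "b", "b", "a", "a", "b"),
  results = c(0, 20, 0, 20, 3, 0)
)

# Share of searches with zero results, per day (overall).
zero_rate_overall <- tapply(toy$results == 0, toy$day, mean)

# Same share, split by day (rows) and group (columns).
zero_rate_by_group <- tapply(toy$results == 0, list(toy$day, toy$group), mean)

print(zero_rate_overall)
print(zero_rate_by_group)
```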

4. Let session length be approximately the time between the first event and the last event in a session. Choose a variable from the dataset and describe its relationship to session length. Visualize the relationship.

It is reasonable to expect that the longer the session, the more clicks the user makes in it.

However, the graph above suggests that this relationship between ‘num_clicks’ and ‘session_length’ may not exist.
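A visual impression like this can be complemented with a correlation coefficient. The sketch below assumes the data has one row per search, so clicks are first summed per session before being compared with `session_length`. The helper `click_length_cor` and the toy data are hypothetical; the toy values are chosen only to exercise the function and say nothing about the real dataset.

```r
# Toy data (invented): one row per search, several searches per session.
toy <- data.frame(
  session_id = c("s1", "s1", "s2", "s3", "s3", "s4"),
  session_length = c(5, 5, 10, 50, 50, 400),
  num_clicks = c(0, 0, 1, 1, 1, 3)
)

# Hypothetical helper: total clicks per session, then Pearson and Spearman
# correlation between total clicks and session length.
click_length_cor <- function(df) {
  per_session <- aggregate(num_clicks ~ session_id + session_length,
                           data = df, FUN = sum)
  c(
    pearson = cor(per_session$num_clicks, per_session$session_length,
                  method = "pearson"),
    spearman = cor(per_session$num_clicks, per_session$session_length,
                   method = "spearman")
  )
}

cors <- click_length_cor(toy)
print(cors)
```

Spearman's coefficient is based on ranks, so it is less affected by a few very long sessions than Pearson's; coefficients near zero on the real data would be consistent with the graph.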