Learning the Features from Queries

Imagine that you run a website or company that sells watches and laptops. Let's say we know which queries the users enter and whether or not they have purchased either of these products. Here I have 20 users, each of whom enters a query describing the product they are interested in. To keep the model simple, I assume that we have three features: “awesome”, “watch” and “laptop”. So if a user searches for “awesome laptop”, her feature vector is [1,0,1]; if she enters just “laptop”, it becomes [0,0,1]. When a user enters “awesome laptop”, she intends to buy a laptop, so “laptop” is an important word in this query, or at least more important than “awesome”. If we only look at the occurrence of the words, however, “awesome” and “laptop” are both 1 in the feature vector, so both get the same value.
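As a minimal sketch of this encoding, the snippet below turns a query string into a binary feature vector in the order awesome, watch, laptop. The helper name `query_to_features` and the whitespace tokenization are my own assumptions for illustration; they are not part of the original analysis.

```r
# Vocabulary in the same order as the feature vectors in the text
features <- c("awesome", "watch", "laptop")

# Hypothetical helper: map a query string to a binary feature vector
query_to_features <- function(query, vocab = features) {
  # Lower-case the query and split it on whitespace
  tokens <- strsplit(tolower(query), "\\s+")[[1]]
  # 1 if the vocabulary word appears among the tokens, 0 otherwise
  setNames(as.integer(vocab %in% tokens), vocab)
}

query_to_features("awesome laptop")  # awesome = 1, watch = 0, laptop = 1
query_to_features("laptop")          # awesome = 0, watch = 0, laptop = 1
```

Because the match is on whole words, a misspelled token such as “awesom” would not set the “awesome” feature in this sketch.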

How can we solve this problem automatically and learn the features? I suggested using logistic regression to learn these features (in fact, the value of each feature). Here I show that the suggested model works.

To make predictions, we can treat the purchasing behavior of the users for each product as a logistic regression problem. For each product we learn the parameter vector (theta) of the features. Theta shows the importance of each feature in purchasing that product.
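To make the role of theta concrete, here is a small sketch of how a logistic regression model turns a feature vector into a purchase probability. The theta values below are made up purely for illustration and are not fitted from any data; `sigmoid` is a helper defined here, not a base R function.

```r
# Logistic (sigmoid) function: maps a linear score to a probability in (0, 1)
sigmoid <- function(z) 1 / (1 + exp(-z))

# Hypothetical parameter values, for illustration only (not fitted from data)
theta <- c(intercept = -2, awesome = 0.1, laptop = 4)

# Feature vector for "awesome laptop", with a leading 1 for the intercept
x <- c(1, 1, 1)

# Predicted purchase probability: sigmoid of the weighted sum theta . x
p <- sigmoid(sum(theta * x))
p
```

With these assumed values, the large weight on “laptop” dominates the score, while “awesome” barely moves it. That is the kind of difference in importance we want the model to learn from the data.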



Users = c("user1", "user2", "user3", "user4", "user5", "user6", "user7", "user8", 
    "user9", "user10", "user11", "user12", "user13", "user14", "user15", "user16", 
    "user17", "user18", "user19", "user20")

queries = c("awesome laptop", "laptop awesome", "small laptop", "awesome laptop", 
    "sony laptop", "acer laptop", "ac laptop", "good laptop", "awesome laptop", 
    "acer laptop", "awesome watch", "watch", "good watch", "watch awesom", "watch good", 
    "watch swatch", "red watch", "awesome watch", "awesome watch", "watch")

The feature vectors are extracted from each user's query.

awesome = c(1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1)
watch = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
laptop = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)

# purchasing behavior of the users for each product
purchased_laptop = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0)
purchased_watch = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 
    1)
df = data.frame(Users, queries, purchased_laptop, purchased_watch, awesome, 
    watch, laptop)

print(df)  #Take a look at data set
##     Users        queries purchased_laptop purchased_watch awesome watch
## 1   user1 awesome laptop                1               0       1     0
## 2   user2 laptop awesome                1               0       1     0
## 3   user3   small laptop                1               0       0     0
## 4   user4 awesome laptop                1               0       1     0
## 5   user5    sony laptop                1               0       0     0
## 6   user6    acer laptop                1               0       0     0
## 7   user7      ac laptop                1               0       0     0
## 8   user8    good laptop                1               0       0     0
## 9   user9 awesome laptop                1               0       1     0
## 10 user10    acer laptop                1               0       0     0
## 11 user11  awesome watch                0               1       1     1
## 12 user12          watch                0               1       0     1
## 13 user13     good watch                0               0       0     1
## 14 user14   watch awesom                0               1       1     1
## 15 user15     watch good                0               1       0     1
## 16 user16   watch swatch                0               1       0     1
## 17 user17      red watch                0               1       0     1
## 18 user18  awesome watch                0               1       1     1
## 19 user19  awesome watch                0               1       1     1
## 20 user20          watch                0               1       1     1
##    laptop
## 1       1
## 2       1
## 3       1
## 4       1
## 5       1
## 6       1
## 7       1
## 8       1
## 9       1
## 10      1
## 11      0
## 12      0
## 13      0
## 14      0
## 15      0
## 16      0
## 17      0
## 18      0
## 19      0
## 20      0

# Make the logistic regression model for laptop
p1 = glm(formula = purchased_laptop ~ awesome + laptop, data = df, family = binomial)
# Make the logistic regression model for watch
p2 = glm(formula = purchased_watch ~ awesome + watch, data = df, family = binomial)
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred

So the values of “awesome” and “laptop” are:

coef(p1)
## (Intercept)     awesome      laptop 
##  -2.557e+01   1.845e-11   5.113e+01

and the values of “awesome” and “watch” are:

coef(p2)
## (Intercept)     awesome       watch 
##      -39.56       19.17       40.95

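As these results show, logistic regression gives a high value to the “laptop” and “watch” features and a much lower value to “awesome”. It learns the value of each word for a product from the purchase behavior of the users. Because this toy data set is almost perfectly separable (hence the glm.fit warning above), the absolute magnitudes are inflated; it is the relative size of the coefficients that matters.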
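One way to read the fitted model is to ask it for purchase probabilities on new queries. The sketch below rebuilds only the columns needed for the watch model, refits it, and calls `predict()` with `type = "response"`. The `new_queries` data frame is a hypothetical example of mine, and the warning is suppressed only to keep the output short.

```r
# Rebuild the columns used by the watch model (same values as above)
awesome <- c(1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1)
watch <- c(rep(0, 10), rep(1, 10))
purchased_watch <- c(rep(0, 10), 1, 1, 0, 1, 1, 1, 1, 1, 1, 1)
df <- data.frame(awesome, watch, purchased_watch)

# glm() warns about fitted probabilities of 0 or 1 on this toy data
p2 <- suppressWarnings(
  glm(purchased_watch ~ awesome + watch, data = df, family = binomial)
)

# Hypothetical new queries, encoded with the same features
new_queries <- data.frame(awesome = c(0, 1, 1), watch = c(1, 1, 0))
rownames(new_queries) <- c("watch", "awesome watch", "awesome")

# Predicted probability of buying a watch for each query
probs <- predict(p2, newdata = new_queries, type = "response")
round(probs, 3)
```

The probability for “watch” alone should come out near 0.8, since four of the five users with that feature pattern bought a watch, while “awesome” on its own should give a probability close to 0.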