Introduction

Introduction to Apriori association rule mining

“We introduced the problem of mining association rules between sets of items in a large database of customer transactions. Each transaction consists of items purchased by a customer in a visit. We are interested in finding those rules that have:

● Minimum transactional support s — the union of items in the consequent and antecedent of the rule is present in a minimum of s% of transactions in the database.

● Minimum confidence c — at least c% of transactions in the database that satisfy the antecedent of the rule also satisfy the consequent of the rule.

The rules that we discover have one item in the consequent and a union of any number of items in the antecedent. We solve this problem by decomposing it into two subproblems:

  1. Finding all itemsets, called large itemsets, that are present in at least s% of transactions.

  2. Generating from each large itemset, rules that use items from the large itemset.

Having obtained the large itemsets and their transactional support count, the solution to the second subproblem is rather straightforward. A simple solution to the first subproblem is to form all itemsets and obtain their support in one pass over the data. However, this solution is computationally infeasible — if there are m items in the database, there will be 2^m possible itemsets, and m can easily be more than 1000.”

(Rakesh Agrawal, Tomasz Imielinski, Arun Swami, 1993, p. 9)

https://dl.acm.org/doi/10.1145/170035.170072#abstract

This paper was the first to introduce the concepts of support and confidence for association rule mining. Its algorithm (later known as AIS) was the precursor of Apriori, which Agrawal and Srikant published in 1994.
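To make the two measures from the quote concrete, here is a minimal base R sketch that computes support and confidence by hand. The transactions and the helper names `support()` and `confidence()` are made up for illustration; they are not part of the survey data or of any package.

```r
# Toy transactions (hypothetical data, not taken from the survey)
transactions <- list(
  c("bread", "milk"),
  c("bread", "butter", "milk"),
  c("bread", "butter"),
  c("milk"),
  c("bread", "butter", "milk")
)

# Support: fraction of transactions that contain every item of `itemset`
support <- function(itemset, transactions) {
  mean(vapply(transactions, function(t) all(itemset %in% t), logical(1)))
}

# Confidence of antecedent -> consequent:
# support(antecedent and consequent together) / support(antecedent)
confidence <- function(antecedent, consequent, transactions) {
  support(c(antecedent, consequent), transactions) /
    support(antecedent, transactions)
}

# Rule {bread, butter} -> {milk}
rule_support <- support(c("bread", "butter", "milk"), transactions)          # 2 of 5 transactions
rule_confidence <- confidence(c("bread", "butter"), "milk", transactions)    # 2 of the 3 with bread and butter
```

In this toy example the rule holds in 2 of the 3 transactions that contain both bread and butter, so it would pass a 60% confidence threshold but fail a 70% one.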

Objective

In this analysis, we will apply the Apriori algorithm to a dataset of survey responses from young people to uncover interesting associations between attributes such as smoking habits, alcohol consumption, and lifestyle choices across genders and age groups. By identifying these patterns, we aim to gain insight into the relationships between behaviors and demographics.

Dataset: https://www.kaggle.com/datasets/miroslavsabo/young-people-survey

Quality of Apriori

The effectiveness of the Apriori algorithm depends on several factors:

Dataset Quality:

The algorithm requires a clean, well-structured dataset of transactional or categorical data (e.g., market baskets, survey responses). Missing or inconsistent data can lead to inaccurate results.

Choice of Thresholds:

Setting appropriate values for minimum support and minimum confidence is crucial. If the thresholds are too high, the algorithm may miss important patterns. If they are too low, it may generate too many irrelevant rules.

Data Size:

Apriori can be computationally expensive for very large datasets because it requires multiple passes over the data to identify frequent itemsets.

Domain Knowledge:

Understanding the context of the data is essential for interpreting the rules. For example, the often-cited rule {Diapers} → {Beer} seems surprising until it is read as a shopping pattern attributed to young parents.
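The points about thresholds and data size can be illustrated with a small base R sketch. It is a simplified level-wise search in the spirit of Apriori, not the implementation used by any package: candidates of size k are built only from items that survived level k - 1, and every level needs another pass over the data. The transactions and the function name `frequent_itemsets()` are hypothetical.

```r
# Toy transactions (hypothetical data)
transactions <- list(
  c("A", "B", "C"),
  c("A", "B"),
  c("A", "C"),
  c("B", "C"),
  c("A", "B", "C"),
  c("A")
)

# Fraction of transactions that contain every item of `itemset`
support <- function(itemset, transactions) {
  mean(vapply(transactions, function(t) all(itemset %in% t), logical(1)))
}

# Simplified level-wise search for frequent itemsets
frequent_itemsets <- function(transactions, min_support) {
  items <- sort(unique(unlist(transactions)))
  is_frequent <- function(s) support(s, transactions) >= min_support
  level <- Filter(is_frequent, as.list(items))   # frequent 1-itemsets
  result <- list()
  k <- 1
  while (length(level) > 0) {
    result <- c(result, level)
    k <- k + 1
    # Only items that are still part of a frequent itemset can be extended
    pool <- sort(unique(unlist(level)))
    if (length(pool) < k) break
    candidates <- combn(pool, k, simplify = FALSE)
    level <- Filter(is_frequent, candidates)     # one more pass over the data
  }
  result
}

n_low  <- length(frequent_itemsets(transactions, min_support = 0.3))  # 7 itemsets
n_high <- length(frequent_itemsets(transactions, min_support = 0.6))  # 3 itemsets
```

Raising the minimum support from 0.3 to 0.6 leaves only the three single items, which shows how quickly a high threshold can hide multi-item patterns, while a low one lets the number of itemsets (and passes) grow.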

Analysis

Analysis of data

First, I look at the structure of the data, some sample entries and the number of missing values per feature to decide what preprocessing is needed.

# Display the structure and summary of the data
str(data)
## 'data.frame':    1010 obs. of  150 variables:
##  $ Music                         : int  5 4 5 5 5 5 5 5 5 5 ...
##  $ Slow.songs.or.fast.songs      : int  3 4 5 3 3 3 5 3 3 3 ...
##  $ Dance                         : int  2 2 2 2 4 2 5 3 3 2 ...
##  $ Folk                          : int  1 1 2 1 3 3 3 2 1 5 ...
##  $ Country                       : int  2 1 3 1 2 2 1 1 1 2 ...
##  $ Classical.music               : int  2 1 4 1 4 3 2 2 2 2 ...
##  $ Musical                       : int  1 2 5 1 3 3 2 2 4 5 ...
##  $ Pop                           : int  5 3 3 2 5 2 5 4 3 3 ...
##  $ Rock                          : int  5 5 5 2 3 5 3 5 5 5 ...
##  $ Metal.or.Hardrock             : int  1 4 3 1 1 5 1 1 5 2 ...
##  $ Punk                          : int  1 4 4 4 2 3 1 2 1 3 ...
##  $ Hiphop..Rap                   : int  1 1 1 2 5 4 3 3 1 2 ...
##  $ Reggae..Ska                   : int  1 3 4 2 3 3 1 2 2 4 ...
##  $ Swing..Jazz                   : int  1 1 3 1 2 4 1 2 2 4 ...
##  $ Rock.n.roll                   : int  3 4 5 2 1 4 2 3 2 4 ...
##  $ Alternative                   : int  1 4 5 5 2 5 3 1 NA 4 ...
##  $ Latino                        : int  1 2 5 1 4 3 3 2 1 5 ...
##  $ Techno..Trance                : int  1 1 1 2 2 1 5 3 1 1 ...
##  $ Opera                         : int  1 1 3 1 2 3 2 2 1 2 ...
##  $ Movies                        : int  5 5 5 5 5 5 4 5 5 5 ...
##  $ Horror                        : int  4 2 3 4 4 5 2 4 1 2 ...
##  $ Thriller                      : int  2 2 4 4 4 5 1 4 5 1 ...
##  $ Comedy                        : int  5 4 4 3 5 5 5 5 5 5 ...
##  $ Romantic                      : int  4 3 2 3 2 2 3 2 4 5 ...
##  $ Sci.fi                        : int  4 4 4 4 3 3 1 3 4 1 ...
##  $ War                           : int  1 1 2 3 3 3 3 3 5 3 ...
##  $ Fantasy.Fairy.tales           : int  5 3 5 1 4 4 5 4 4 4 ...
##  $ Animated                      : int  5 5 5 2 4 3 5 4 4 4 ...
##  $ Documentary                   : int  3 4 2 5 3 3 3 3 5 4 ...
##  $ Western                       : int  1 1 2 1 1 2 1 1 1 1 ...
##  $ Action                        : int  2 4 1 2 4 4 2 3 1 2 ...
##  $ History                       : int  1 1 1 4 3 5 3 5 3 3 ...
##  $ Psychology                    : int  5 3 2 4 2 3 3 2 2 2 ...
##  $ Politics                      : int  1 4 1 5 3 4 1 3 1 3 ...
##  $ Mathematics                   : int  3 5 5 4 2 2 1 1 1 3 ...
##  $ Physics                       : int  3 2 2 1 2 3 1 1 1 1 ...
##  $ Internet                      : int  5 4 4 3 2 4 2 5 1 5 ...
##  $ PC                            : int  3 4 2 1 2 4 1 4 1 1 ...
##  $ Economy.Management            : int  5 5 4 2 2 1 3 1 1 4 ...
##  $ Biology                       : int  3 1 1 3 3 4 5 2 3 2 ...
##  $ Chemistry                     : int  3 1 1 3 3 4 5 2 1 1 ...
##  $ Reading                       : int  3 4 5 5 5 3 3 2 5 4 ...
##  $ Geography                     : int  3 4 2 4 2 3 3 3 1 4 ...
##  $ Foreign.languages             : int  5 5 5 4 3 4 4 4 1 5 ...
##  $ Medicine                      : int  3 1 2 2 3 4 5 1 1 1 ...
##  $ Law                           : int  1 2 3 5 2 3 3 2 1 1 ...
##  $ Cars                          : int  1 2 1 1 3 5 4 1 1 1 ...
##  $ Art.exhibitions               : int  1 2 5 5 1 2 1 1 1 4 ...
##  $ Religion                      : int  1 1 5 4 4 2 1 2 2 4 ...
##  $ Countryside..outdoors         : int  5 1 5 1 4 5 4 2 4 4 ...
##  $ Dancing                       : int  3 1 5 1 1 1 3 1 1 5 ...
##  $ Musical.instruments           : int  3 1 5 1 3 5 2 1 2 3 ...
##  $ Writing                       : int  2 1 5 3 1 1 1 1 1 1 ...
##  $ Passive.sport                 : int  1 1 5 1 3 5 5 4 4 4 ...
##  $ Active.sport                  : int  5 1 2 1 1 4 3 5 1 4 ...
##  $ Gardening                     : int  5 1 1 1 4 2 3 1 1 1 ...
##  $ Celebrities                   : int  1 2 1 2 3 1 1 3 5 2 ...
##  $ Shopping                      : int  4 3 4 4 3 2 3 3 2 4 ...
##  $ Science.and.technology        : int  4 3 2 3 3 3 4 2 1 3 ...
##  $ Theatre                       : int  2 2 5 1 2 1 3 2 5 5 ...
##  $ Fun.with.friends              : int  5 4 5 2 4 3 5 4 4 5 ...
##  $ Adrenaline.sports             : int  4 2 5 1 2 3 1 2 1 2 ...
##  $ Pets                          : int  4 5 5 1 1 2 5 5 1 2 ...
##  $ Flying                        : int  1 1 1 2 1 3 1 3 2 4 ...
##  $ Storm                         : int  1 1 1 1 2 2 3 2 3 5 ...
##  $ Darkness                      : int  1 1 1 1 1 2 2 4 1 4 ...
##  $ Heights                       : int  1 2 1 3 1 2 1 3 5 5 ...
##  $ Spiders                       : int  1 1 1 5 1 1 1 1 5 3 ...
##  $ Snakes                        : int  5 1 1 5 1 2 5 5 5 4 ...
##  $ Rats                          : int  3 1 1 5 2 2 1 3 2 4 ...
##  $ Ageing                        : int  1 3 1 4 2 1 4 1 2 3 ...
##  $ Dangerous.dogs                : int  3 1 1 5 4 1 1 2 3 5 ...
##  $ Fear.of.public.speaking       : int  2 4 2 5 3 3 1 4 4 3 ...
##  $ Smoking                       : chr  "never smoked" "never smoked" "tried smoking" "former smoker" ...
##  $ Alcohol                       : chr  "drink a lot" "drink a lot" "drink a lot" "drink a lot" ...
##  $ Healthy.eating                : int  4 3 3 3 4 2 4 2 1 3 ...
##  $ Daily.events                  : int  2 3 1 4 3 2 3 3 1 4 ...
##  $ Prioritising.workload         : int  2 2 2 4 1 2 5 1 2 2 ...
##  $ Writing.notes                 : int  5 4 5 4 2 3 5 3 1 2 ...
##  $ Workaholism                   : int  4 5 3 5 3 3 5 2 4 3 ...
##  $ Thinking.ahead                : int  2 4 5 3 5 3 3 4 2 3 ...
##  $ Final.judgement               : int  5 1 3 1 5 1 3 3 5 5 ...
##  $ Reliability                   : int  4 4 4 3 5 3 4 3 5 4 ...
##  $ Keeping.promises              : int  4 4 5 4 4 4 5 3 4 5 ...
##  $ Loss.of.interest              : int  1 3 1 5 2 3 3 1 1 3 ...
##  $ Friends.versus.money          : int  3 4 5 2 3 2 4 4 4 4 ...
##  $ Funniness                     : int  5 3 2 1 3 3 4 4 2 3 ...
##  $ Fake                          : int  1 2 4 1 2 1 1 2 2 1 ...
##  $ Criminal.damage               : int  1 1 1 5 1 4 2 1 1 2 ...
##  $ Decision.making               : int  3 2 3 5 3 2 2 3 4 5 ...
##  $ Elections                     : int  4 5 5 5 5 5 5 5 1 5 ...
##  $ Self.criticism                : int  1 4 4 5 5 4 3 3 3 4 ...
##  $ Judgment.calls                : int  3 4 4 4 5 4 5 5 2 5 ...
##  $ Hypochondria                  : int  1 1 1 3 1 1 1 2 2 1 ...
##  $ Empathy                       : int  3 2 5 3 3 4 4 1 5 4 ...
##  $ Eating.to.survive             : int  1 1 5 1 1 2 1 2 1 1 ...
##  $ Giving                        : int  4 2 5 1 3 3 5 3 1 4 ...
##  $ Compassion.to.animals         : int  5 4 4 2 3 5 5 5 4 5 ...
##  $ Borrowed.stuff                : int  4 3 2 5 4 5 5 2 5 4 ...
##   [list output truncated]
# Names of columns with NAs
colnames(data)[colSums(is.na(data)) > 0]
##   [1] "Music"                          "Slow.songs.or.fast.songs"      
##   [3] "Dance"                          "Folk"                          
##   [5] "Country"                        "Classical.music"               
##   [7] "Musical"                        "Pop"                           
##   [9] "Rock"                           "Metal.or.Hardrock"             
##  [11] "Punk"                           "Hiphop..Rap"                   
##  [13] "Reggae..Ska"                    "Swing..Jazz"                   
##  [15] "Rock.n.roll"                    "Alternative"                   
##  [17] "Latino"                         "Techno..Trance"                
##  [19] "Opera"                          "Movies"                        
##  [21] "Horror"                         "Thriller"                      
##  [23] "Comedy"                         "Romantic"                      
##  [25] "Sci.fi"                         "War"                           
##  [27] "Fantasy.Fairy.tales"            "Animated"                      
##  [29] "Documentary"                    "Western"                       
##  [31] "Action"                         "History"                       
##  [33] "Psychology"                     "Politics"                      
##  [35] "Mathematics"                    "Physics"                       
##  [37] "Internet"                       "PC"                            
##  [39] "Economy.Management"             "Biology"                       
##  [41] "Chemistry"                      "Reading"                       
##  [43] "Geography"                      "Foreign.languages"             
##  [45] "Medicine"                       "Law"                           
##  [47] "Cars"                           "Art.exhibitions"               
##  [49] "Religion"                       "Countryside..outdoors"         
##  [51] "Dancing"                        "Musical.instruments"           
##  [53] "Writing"                        "Passive.sport"                 
##  [55] "Active.sport"                   "Gardening"                     
##  [57] "Celebrities"                    "Shopping"                      
##  [59] "Science.and.technology"         "Theatre"                       
##  [61] "Fun.with.friends"               "Adrenaline.sports"             
##  [63] "Pets"                           "Flying"                        
##  [65] "Storm"                          "Darkness"                      
##  [67] "Heights"                        "Spiders"                       
##  [69] "Rats"                           "Ageing"                        
##  [71] "Dangerous.dogs"                 "Fear.of.public.speaking"       
##  [73] "Healthy.eating"                 "Daily.events"                  
##  [75] "Prioritising.workload"          "Writing.notes"                 
##  [77] "Workaholism"                    "Thinking.ahead"                
##  [79] "Final.judgement"                "Reliability"                   
##  [81] "Keeping.promises"               "Loss.of.interest"              
##  [83] "Friends.versus.money"           "Funniness"                     
##  [85] "Fake"                           "Criminal.damage"               
##  [87] "Decision.making"                "Elections"                     
##  [89] "Self.criticism"                 "Judgment.calls"                
##  [91] "Hypochondria"                   "Empathy"                       
##  [93] "Giving"                         "Compassion.to.animals"         
##  [95] "Borrowed.stuff"                 "Loneliness"                    
##  [97] "Cheating.in.school"             "Health"                        
##  [99] "Changing.the.past"              "God"                           
## [101] "Charity"                        "Waiting"                       
## [103] "New.environment"                "Mood.swings"                   
## [105] "Appearence.and.gestures"        "Socializing"                   
## [107] "Achievements"                   "Responding.to.a.serious.letter"
## [109] "Children"                       "Assertiveness"                 
## [111] "Getting.angry"                  "Knowing.the.right.people"      
## [113] "Public.speaking"                "Unpopularity"                  
## [115] "Life.struggles"                 "Happiness.in.life"             
## [117] "Energy.levels"                  "Small...big.dogs"              
## [119] "Personality"                    "Finding.lost.valuables"        
## [121] "Getting.up"                     "Interests.or.hobbies"          
## [123] "Parents..advice"                "Questionnaires.or.polls"       
## [125] "Finances"                       "Shopping.centres"              
## [127] "Branded.clothing"               "Entertainment.spending"        
## [129] "Spending.on.looks"              "Spending.on.healthy.eating"    
## [131] "Age"                            "Height"                        
## [133] "Weight"                         "Number.of.siblings"
# Number of rows with at least one NA
sum(rowSums(is.na(data)) > 0)
## [1] 324

Preprocessing of data

Removing NA rows

Individual human traits, such as fear of Rats or affinity for obeying Law, are hard to impute reliably. I therefore filled the missing Age, Height and Weight values with the column mean and dropped every row that still contained an NA.

# Fill in mean for Age, Height and Weight
data$Age[is.na(data$Age)] <- mean(data$Age, na.rm = TRUE)
data$Height[is.na(data$Height)] <- mean(data$Height, na.rm = TRUE)
data$Weight[is.na(data$Weight)] <- mean(data$Weight, na.rm = TRUE)

# Count the rows that still contain NAs after the mean fill (down from 324)
sum(rowSums(is.na(data)) > 0)
## [1] 309
# Drop rows with remaining NA values
data <- data[complete.cases(data), ]

colnames(data)[colSums(is.na(data)) > 0]
## character(0)
sum(rowSums(is.na(data)) > 0)
## [1] 0

Transforming values into readable categorical values

Association rule mining requires categorical data, but most of the dataset is currently numerical.

With the function below I binned the survey responses, which range from 1 to 5, into:
● ‘Low’ for values of 1-2
● ‘Medium’ for a value of 3
● ‘High’ for values of 4-5.

Separate binning logic was performed for Age, Height, Weight and Number of Siblings.

The column name was prepended to each value so that the resulting rules are easier to interpret later on.
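As a quick sanity check of the binning described above, the following sketch applies `cut()` with the same breaks to a made-up vector of answers. The column name `Rock` is used only as an example prefix.

```r
# Hypothetical survey answers on the 1-5 scale
answers <- c(1, 2, 3, 4, 5)

# right = TRUE closes each interval on the right:
# (-Inf, 2] -> Low, (2, 3] -> Medium, (3, 5] -> High
binned <- cut(answers,
              breaks = c(-Inf, 2, 3, 5),
              labels = paste0("Rock", "_", c("Low", "Medium", "High")),
              right = TRUE)

as.character(binned)
# "Rock_Low" "Rock_Low" "Rock_Medium" "Rock_High" "Rock_High"

# A value above the last break falls outside every interval and becomes NA
out_of_range <- cut(6, breaks = c(-Inf, 2, 3, 5))
```

Because the last break is 5 rather than `Inf`, any value above 5 would silently turn into `NA`, so these breaks are only safe for columns that really use the 1-5 scale.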

# Define a function to bin numerical values into categories and append the column name
bin_numerical <- function(x, col_name) {
  if (is.numeric(x)) {
    if (col_name == "Age") {
      # Age binning
      cut(x,
          breaks = c(-Inf, 20, 25, Inf),
          labels = paste0(col_name, "_", c("15-20", "20-25", "25-30")),
          right = TRUE)
    } else if (col_name == "Height") {
      # Height binning
      cut(x,
          breaks = c(-Inf, 160, 180, Inf),
          labels = paste0(col_name, "_", c("Short", "Medium", "Tall")),
          right = TRUE)
    } else if (col_name == "Number.of.siblings") {
      # Number.of.siblings binning
      cut(x,
          breaks = c(-Inf, 0, 1, 3, 5, Inf),
          labels = paste0(col_name, "_", c("Zero", "One", "Two or Three", "Four or Five", "Six or more")),
          right = TRUE)
    } else if (col_name == "Weight") {
      # Weight binning
      cut(x,
          breaks = c(-Inf, 50, 80, Inf),
          labels = paste0(col_name, "_", c("Low", "Medium", "Big")),
          right = TRUE)
    } else {
      # General numerical binning
      cut(x,
          breaks = c(-Inf, 2, 3, 5),
          labels = paste0(col_name, "_", c("Low", "Medium", "High")),
          right = TRUE)
    }
  } else {
    # Prefix selected character columns with their name; leave the rest unchanged
    prefixed_cols <- c("Punctuality", "Internet.usage", "Lying",
                       "Smoking", "Alcohol", "Only.child")
    if (col_name %in% prefixed_cols) {
      paste0(col_name, "_", x)
    } else {
      x
    }
  }
}

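# Apply the binning function to the dataset
library(dplyr)  # provides %>%, mutate(), across() and cur_column()
transformed_data <- data %>%
  mutate(across(everything(), ~bin_numerical(., cur_column())))

## SAVE PREPROCESSED DATA
csv_path <- file.path("data", "final.csv")  # portable path separator
write.csv(transformed_data, file = csv_path, row.names = FALSE)


### ASSOCIATION RULES
library(arules)  # provides read.transactions()
# write.csv() emits a header row; skip it so the column names are not read as a transaction
trans1 <- read.transactions(csv_path, format = "basket", sep = ",", skip = 1)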

# View the transformed dataset
head(transformed_data)
##        Music        Slow.songs.or.fast.songs      Dance        Folk
## 1 Music_High Slow.songs.or.fast.songs_Medium  Dance_Low    Folk_Low
## 2 Music_High   Slow.songs.or.fast.songs_High  Dance_Low    Folk_Low
## 3 Music_High   Slow.songs.or.fast.songs_High  Dance_Low    Folk_Low
## 5 Music_High Slow.songs.or.fast.songs_Medium Dance_High Folk_Medium
## 6 Music_High Slow.songs.or.fast.songs_Medium  Dance_Low Folk_Medium
## 7 Music_High   Slow.songs.or.fast.songs_High Dance_High Folk_Medium
##          Country        Classical.music        Musical        Pop        Rock
## 1    Country_Low    Classical.music_Low    Musical_Low   Pop_High   Rock_High
## 2    Country_Low    Classical.music_Low    Musical_Low Pop_Medium   Rock_High
## 3 Country_Medium   Classical.music_High   Musical_High Pop_Medium   Rock_High
## 5    Country_Low   Classical.music_High Musical_Medium   Pop_High Rock_Medium
## 6    Country_Low Classical.music_Medium Musical_Medium    Pop_Low   Rock_High
## 7    Country_Low    Classical.music_Low    Musical_Low   Pop_High Rock_Medium
##          Metal.or.Hardrock        Punk        Hiphop..Rap        Reggae..Ska
## 1    Metal.or.Hardrock_Low    Punk_Low    Hiphop..Rap_Low    Reggae..Ska_Low
## 2   Metal.or.Hardrock_High   Punk_High    Hiphop..Rap_Low Reggae..Ska_Medium
## 3 Metal.or.Hardrock_Medium   Punk_High    Hiphop..Rap_Low   Reggae..Ska_High
## 5    Metal.or.Hardrock_Low    Punk_Low   Hiphop..Rap_High Reggae..Ska_Medium
## 6   Metal.or.Hardrock_High Punk_Medium   Hiphop..Rap_High Reggae..Ska_Medium
## 7    Metal.or.Hardrock_Low    Punk_Low Hiphop..Rap_Medium    Reggae..Ska_Low
##          Swing..Jazz        Rock.n.roll        Alternative        Latino
## 1    Swing..Jazz_Low Rock.n.roll_Medium    Alternative_Low    Latino_Low
## 2    Swing..Jazz_Low   Rock.n.roll_High   Alternative_High    Latino_Low
## 3 Swing..Jazz_Medium   Rock.n.roll_High   Alternative_High   Latino_High
## 5    Swing..Jazz_Low    Rock.n.roll_Low    Alternative_Low   Latino_High
## 6   Swing..Jazz_High   Rock.n.roll_High   Alternative_High Latino_Medium
## 7    Swing..Jazz_Low    Rock.n.roll_Low Alternative_Medium Latino_Medium
##        Techno..Trance        Opera      Movies        Horror      Thriller
## 1  Techno..Trance_Low    Opera_Low Movies_High   Horror_High  Thriller_Low
## 2  Techno..Trance_Low    Opera_Low Movies_High    Horror_Low  Thriller_Low
## 3  Techno..Trance_Low Opera_Medium Movies_High Horror_Medium Thriller_High
## 5  Techno..Trance_Low    Opera_Low Movies_High   Horror_High Thriller_High
## 6  Techno..Trance_Low Opera_Medium Movies_High   Horror_High Thriller_High
## 7 Techno..Trance_High    Opera_Low Movies_High    Horror_Low  Thriller_Low
##        Comedy        Romantic        Sci.fi        War
## 1 Comedy_High   Romantic_High   Sci.fi_High    War_Low
## 2 Comedy_High Romantic_Medium   Sci.fi_High    War_Low
## 3 Comedy_High    Romantic_Low   Sci.fi_High    War_Low
## 5 Comedy_High    Romantic_Low Sci.fi_Medium War_Medium
## 6 Comedy_High    Romantic_Low Sci.fi_Medium War_Medium
## 7 Comedy_High Romantic_Medium    Sci.fi_Low War_Medium
##          Fantasy.Fairy.tales        Animated        Documentary     Western
## 1   Fantasy.Fairy.tales_High   Animated_High Documentary_Medium Western_Low
## 2 Fantasy.Fairy.tales_Medium   Animated_High   Documentary_High Western_Low
## 3   Fantasy.Fairy.tales_High   Animated_High    Documentary_Low Western_Low
## 5   Fantasy.Fairy.tales_High   Animated_High Documentary_Medium Western_Low
## 6   Fantasy.Fairy.tales_High Animated_Medium Documentary_Medium Western_Low
## 7   Fantasy.Fairy.tales_High   Animated_High Documentary_Medium Western_Low
##        Action        History        Psychology        Politics
## 1  Action_Low    History_Low   Psychology_High    Politics_Low
## 2 Action_High    History_Low Psychology_Medium   Politics_High
## 3  Action_Low    History_Low    Psychology_Low    Politics_Low
## 5 Action_High History_Medium    Psychology_Low Politics_Medium
## 6 Action_High   History_High Psychology_Medium   Politics_High
## 7  Action_Low History_Medium Psychology_Medium    Politics_Low
##          Mathematics        Physics      Internet        PC
## 1 Mathematics_Medium Physics_Medium Internet_High PC_Medium
## 2   Mathematics_High    Physics_Low Internet_High   PC_High
## 3   Mathematics_High    Physics_Low Internet_High    PC_Low
## 5    Mathematics_Low    Physics_Low  Internet_Low    PC_Low
## 6    Mathematics_Low Physics_Medium Internet_High   PC_High
## 7    Mathematics_Low    Physics_Low  Internet_Low    PC_Low
##          Economy.Management        Biology        Chemistry        Reading
## 1   Economy.Management_High Biology_Medium Chemistry_Medium Reading_Medium
## 2   Economy.Management_High    Biology_Low    Chemistry_Low   Reading_High
## 3   Economy.Management_High    Biology_Low    Chemistry_Low   Reading_High
## 5    Economy.Management_Low Biology_Medium Chemistry_Medium   Reading_High
## 6    Economy.Management_Low   Biology_High   Chemistry_High Reading_Medium
## 7 Economy.Management_Medium   Biology_High   Chemistry_High Reading_Medium
##          Geography        Foreign.languages        Medicine        Law
## 1 Geography_Medium   Foreign.languages_High Medicine_Medium    Law_Low
## 2   Geography_High   Foreign.languages_High    Medicine_Low    Law_Low
## 3    Geography_Low   Foreign.languages_High    Medicine_Low Law_Medium
## 5    Geography_Low Foreign.languages_Medium Medicine_Medium    Law_Low
## 6 Geography_Medium   Foreign.languages_High   Medicine_High Law_Medium
## 7 Geography_Medium   Foreign.languages_High   Medicine_High Law_Medium
##          Cars      Art.exhibitions      Religion      Countryside..outdoors
## 1    Cars_Low  Art.exhibitions_Low  Religion_Low Countryside..outdoors_High
## 2    Cars_Low  Art.exhibitions_Low  Religion_Low  Countryside..outdoors_Low
## 3    Cars_Low Art.exhibitions_High Religion_High Countryside..outdoors_High
## 5 Cars_Medium  Art.exhibitions_Low Religion_High Countryside..outdoors_High
## 6   Cars_High  Art.exhibitions_Low  Religion_Low Countryside..outdoors_High
## 7   Cars_High  Art.exhibitions_Low  Religion_Low Countryside..outdoors_High
##          Dancing        Musical.instruments      Writing        Passive.sport
## 1 Dancing_Medium Musical.instruments_Medium  Writing_Low    Passive.sport_Low
## 2    Dancing_Low    Musical.instruments_Low  Writing_Low    Passive.sport_Low
## 3   Dancing_High   Musical.instruments_High Writing_High   Passive.sport_High
## 5    Dancing_Low Musical.instruments_Medium  Writing_Low Passive.sport_Medium
## 6    Dancing_Low   Musical.instruments_High  Writing_Low   Passive.sport_High
## 7 Dancing_Medium    Musical.instruments_Low  Writing_Low   Passive.sport_High
##          Active.sport        Gardening        Celebrities        Shopping
## 1   Active.sport_High   Gardening_High    Celebrities_Low   Shopping_High
## 2    Active.sport_Low    Gardening_Low    Celebrities_Low Shopping_Medium
## 3    Active.sport_Low    Gardening_Low    Celebrities_Low   Shopping_High
## 5    Active.sport_Low   Gardening_High Celebrities_Medium Shopping_Medium
## 6   Active.sport_High    Gardening_Low    Celebrities_Low    Shopping_Low
## 7 Active.sport_Medium Gardening_Medium    Celebrities_Low Shopping_Medium
##          Science.and.technology        Theatre        Fun.with.friends
## 1   Science.and.technology_High    Theatre_Low   Fun.with.friends_High
## 2 Science.and.technology_Medium    Theatre_Low   Fun.with.friends_High
## 3    Science.and.technology_Low   Theatre_High   Fun.with.friends_High
## 5 Science.and.technology_Medium    Theatre_Low   Fun.with.friends_High
## 6 Science.and.technology_Medium    Theatre_Low Fun.with.friends_Medium
## 7   Science.and.technology_High Theatre_Medium   Fun.with.friends_High
##          Adrenaline.sports      Pets        Flying        Storm     Darkness
## 1   Adrenaline.sports_High Pets_High    Flying_Low    Storm_Low Darkness_Low
## 2    Adrenaline.sports_Low Pets_High    Flying_Low    Storm_Low Darkness_Low
## 3   Adrenaline.sports_High Pets_High    Flying_Low    Storm_Low Darkness_Low
## 5    Adrenaline.sports_Low  Pets_Low    Flying_Low    Storm_Low Darkness_Low
## 6 Adrenaline.sports_Medium  Pets_Low Flying_Medium    Storm_Low Darkness_Low
## 7    Adrenaline.sports_Low Pets_High    Flying_Low Storm_Medium Darkness_Low
##       Heights     Spiders      Snakes        Rats        Ageing
## 1 Heights_Low Spiders_Low Snakes_High Rats_Medium    Ageing_Low
## 2 Heights_Low Spiders_Low  Snakes_Low    Rats_Low Ageing_Medium
## 3 Heights_Low Spiders_Low  Snakes_Low    Rats_Low    Ageing_Low
## 5 Heights_Low Spiders_Low  Snakes_Low    Rats_Low    Ageing_Low
## 6 Heights_Low Spiders_Low  Snakes_Low    Rats_Low    Ageing_Low
## 7 Heights_Low Spiders_Low Snakes_High    Rats_Low   Ageing_High
##          Dangerous.dogs        Fear.of.public.speaking               Smoking
## 1 Dangerous.dogs_Medium    Fear.of.public.speaking_Low  Smoking_never smoked
## 2    Dangerous.dogs_Low   Fear.of.public.speaking_High  Smoking_never smoked
## 3    Dangerous.dogs_Low    Fear.of.public.speaking_Low Smoking_tried smoking
## 5   Dangerous.dogs_High Fear.of.public.speaking_Medium Smoking_tried smoking
## 6    Dangerous.dogs_Low Fear.of.public.speaking_Medium  Smoking_never smoked
## 7    Dangerous.dogs_Low    Fear.of.public.speaking_Low Smoking_tried smoking
##                  Alcohol        Healthy.eating        Daily.events
## 1    Alcohol_drink a lot   Healthy.eating_High    Daily.events_Low
## 2    Alcohol_drink a lot Healthy.eating_Medium Daily.events_Medium
## 3    Alcohol_drink a lot Healthy.eating_Medium    Daily.events_Low
## 5 Alcohol_social drinker   Healthy.eating_High Daily.events_Medium
## 6          Alcohol_never    Healthy.eating_Low    Daily.events_Low
## 7 Alcohol_social drinker   Healthy.eating_High Daily.events_Medium
##        Prioritising.workload        Writing.notes        Workaholism
## 1  Prioritising.workload_Low   Writing.notes_High   Workaholism_High
## 2  Prioritising.workload_Low   Writing.notes_High   Workaholism_High
## 3  Prioritising.workload_Low   Writing.notes_High Workaholism_Medium
## 5  Prioritising.workload_Low    Writing.notes_Low Workaholism_Medium
## 6  Prioritising.workload_Low Writing.notes_Medium Workaholism_Medium
## 7 Prioritising.workload_High   Writing.notes_High   Workaholism_High
##          Thinking.ahead        Final.judgement        Reliability
## 1    Thinking.ahead_Low   Final.judgement_High   Reliability_High
## 2   Thinking.ahead_High    Final.judgement_Low   Reliability_High
## 3   Thinking.ahead_High Final.judgement_Medium   Reliability_High
## 5   Thinking.ahead_High   Final.judgement_High   Reliability_High
## 6 Thinking.ahead_Medium    Final.judgement_Low Reliability_Medium
## 7 Thinking.ahead_Medium Final.judgement_Medium   Reliability_High
##        Keeping.promises        Loss.of.interest        Friends.versus.money
## 1 Keeping.promises_High    Loss.of.interest_Low Friends.versus.money_Medium
## 2 Keeping.promises_High Loss.of.interest_Medium   Friends.versus.money_High
## 3 Keeping.promises_High    Loss.of.interest_Low   Friends.versus.money_High
## 5 Keeping.promises_High    Loss.of.interest_Low Friends.versus.money_Medium
## 6 Keeping.promises_High Loss.of.interest_Medium    Friends.versus.money_Low
## 7 Keeping.promises_High Loss.of.interest_Medium   Friends.versus.money_High
##          Funniness      Fake      Criminal.damage        Decision.making
## 1   Funniness_High  Fake_Low  Criminal.damage_Low Decision.making_Medium
## 2 Funniness_Medium  Fake_Low  Criminal.damage_Low    Decision.making_Low
## 3    Funniness_Low Fake_High  Criminal.damage_Low Decision.making_Medium
## 5 Funniness_Medium  Fake_Low  Criminal.damage_Low Decision.making_Medium
## 6 Funniness_Medium  Fake_Low Criminal.damage_High    Decision.making_Low
## 7   Funniness_High  Fake_Low  Criminal.damage_Low    Decision.making_Low
##        Elections        Self.criticism        Judgment.calls     Hypochondria
## 1 Elections_High    Self.criticism_Low Judgment.calls_Medium Hypochondria_Low
## 2 Elections_High   Self.criticism_High   Judgment.calls_High Hypochondria_Low
## 3 Elections_High   Self.criticism_High   Judgment.calls_High Hypochondria_Low
## 5 Elections_High   Self.criticism_High   Judgment.calls_High Hypochondria_Low
## 6 Elections_High   Self.criticism_High   Judgment.calls_High Hypochondria_Low
## 7 Elections_High Self.criticism_Medium   Judgment.calls_High Hypochondria_Low
##          Empathy      Eating.to.survive        Giving
## 1 Empathy_Medium  Eating.to.survive_Low   Giving_High
## 2    Empathy_Low  Eating.to.survive_Low    Giving_Low
## 3   Empathy_High Eating.to.survive_High   Giving_High
## 5 Empathy_Medium  Eating.to.survive_Low Giving_Medium
## 6   Empathy_High  Eating.to.survive_Low Giving_Medium
## 7   Empathy_High  Eating.to.survive_Low   Giving_High
##          Compassion.to.animals        Borrowed.stuff        Loneliness
## 1   Compassion.to.animals_High   Borrowed.stuff_High Loneliness_Medium
## 2   Compassion.to.animals_High Borrowed.stuff_Medium    Loneliness_Low
## 3   Compassion.to.animals_High    Borrowed.stuff_Low   Loneliness_High
## 5 Compassion.to.animals_Medium   Borrowed.stuff_High Loneliness_Medium
## 6   Compassion.to.animals_High   Borrowed.stuff_High    Loneliness_Low
## 7   Compassion.to.animals_High   Borrowed.stuff_High Loneliness_Medium
##          Cheating.in.school        Health        Changing.the.past        God
## 1    Cheating.in.school_Low    Health_Low    Changing.the.past_Low    God_Low
## 2   Cheating.in.school_High   Health_High   Changing.the.past_High    God_Low
## 3 Cheating.in.school_Medium    Health_Low   Changing.the.past_High   God_High
## 5   Cheating.in.school_High Health_Medium   Changing.the.past_High   God_High
## 6   Cheating.in.school_High Health_Medium Changing.the.past_Medium God_Medium
## 7    Cheating.in.school_Low Health_Medium    Changing.the.past_Low   God_High
##          Dreams        Charity        Number.of.friends
## 1   Dreams_High    Charity_Low Number.of.friends_Medium
## 2 Dreams_Medium    Charity_Low Number.of.friends_Medium
## 3    Dreams_Low Charity_Medium Number.of.friends_Medium
## 5 Dreams_Medium Charity_Medium Number.of.friends_Medium
## 6 Dreams_Medium    Charity_Low Number.of.friends_Medium
## 7 Dreams_Medium Charity_Medium Number.of.friends_Medium
##                           Punctuality                               Lying
## 1     Punctuality_i am always on time                         Lying_never
## 2        Punctuality_i am often early                     Lying_sometimes
## 3 Punctuality_i am often running late                     Lying_sometimes
## 5     Punctuality_i am always on time         Lying_everytime it suits me
## 6        Punctuality_i am often early Lying_only to avoid hurting someone
## 7        Punctuality_i am often early                         Lying_never
##          Waiting        New.environment        Mood.swings
## 1 Waiting_Medium   New.environment_High Mood.swings_Medium
## 2 Waiting_Medium   New.environment_High   Mood.swings_High
## 3    Waiting_Low New.environment_Medium   Mood.swings_High
## 5 Waiting_Medium   New.environment_High    Mood.swings_Low
## 6 Waiting_Medium   New.environment_High Mood.swings_Medium
## 7   Waiting_High   New.environment_High   Mood.swings_High
##          Appearence.and.gestures        Socializing        Achievements
## 1   Appearence.and.gestures_High Socializing_Medium   Achievements_High
## 2   Appearence.and.gestures_High   Socializing_High    Achievements_Low
## 3 Appearence.and.gestures_Medium   Socializing_High Achievements_Medium
## 5 Appearence.and.gestures_Medium Socializing_Medium Achievements_Medium
## 6 Appearence.and.gestures_Medium   Socializing_High    Achievements_Low
## 7   Appearence.and.gestures_High   Socializing_High   Achievements_High
##          Responding.to.a.serious.letter        Children        Assertiveness
## 1 Responding.to.a.serious.letter_Medium   Children_High    Assertiveness_Low
## 2   Responding.to.a.serious.letter_High    Children_Low    Assertiveness_Low
## 3   Responding.to.a.serious.letter_High   Children_High Assertiveness_Medium
## 5 Responding.to.a.serious.letter_Medium   Children_High   Assertiveness_High
## 6    Responding.to.a.serious.letter_Low Children_Medium   Assertiveness_High
## 7 Responding.to.a.serious.letter_Medium    Children_Low Assertiveness_Medium
##          Getting.angry        Knowing.the.right.people        Public.speaking
## 1    Getting.angry_Low Knowing.the.right.people_Medium   Public.speaking_High
## 2   Getting.angry_High   Knowing.the.right.people_High   Public.speaking_High
## 3   Getting.angry_High Knowing.the.right.people_Medium    Public.speaking_Low
## 5    Getting.angry_Low Knowing.the.right.people_Medium   Public.speaking_High
## 6 Getting.angry_Medium   Knowing.the.right.people_High   Public.speaking_High
## 7 Getting.angry_Medium   Knowing.the.right.people_High Public.speaking_Medium
##          Unpopularity        Life.struggles        Happiness.in.life
## 1   Unpopularity_High    Life.struggles_Low   Happiness.in.life_High
## 2   Unpopularity_High    Life.struggles_Low   Happiness.in.life_High
## 3   Unpopularity_High   Life.struggles_High   Happiness.in.life_High
## 5   Unpopularity_High    Life.struggles_Low Happiness.in.life_Medium
## 6   Unpopularity_High Life.struggles_Medium Happiness.in.life_Medium
## 7 Unpopularity_Medium   Life.struggles_High   Happiness.in.life_High
##          Energy.levels        Small...big.dogs        Personality
## 1   Energy.levels_High    Small...big.dogs_Low   Personality_High
## 2 Energy.levels_Medium   Small...big.dogs_High Personality_Medium
## 3   Energy.levels_High Small...big.dogs_Medium Personality_Medium
## 5   Energy.levels_High Small...big.dogs_Medium Personality_Medium
## 6   Energy.levels_High   Small...big.dogs_High Personality_Medium
## 7   Energy.levels_High Small...big.dogs_Medium Personality_Medium
##          Finding.lost.valuables        Getting.up        Interests.or.hobbies
## 1 Finding.lost.valuables_Medium    Getting.up_Low Interests.or.hobbies_Medium
## 2   Finding.lost.valuables_High   Getting.up_High Interests.or.hobbies_Medium
## 3 Finding.lost.valuables_Medium   Getting.up_High   Interests.or.hobbies_High
## 5    Finding.lost.valuables_Low   Getting.up_High Interests.or.hobbies_Medium
## 6 Finding.lost.valuables_Medium Getting.up_Medium   Interests.or.hobbies_High
## 7    Finding.lost.valuables_Low    Getting.up_Low   Interests.or.hobbies_High
##          Parents..advice        Questionnaires.or.polls
## 1   Parents..advice_High Questionnaires.or.polls_Medium
## 2    Parents..advice_Low Questionnaires.or.polls_Medium
## 3 Parents..advice_Medium    Questionnaires.or.polls_Low
## 5 Parents..advice_Medium Questionnaires.or.polls_Medium
## 6 Parents..advice_Medium   Questionnaires.or.polls_High
## 7   Parents..advice_High   Questionnaires.or.polls_High
##                           Internet.usage        Finances
## 1         Internet.usage_few hours a day Finances_Medium
## 2         Internet.usage_few hours a day Finances_Medium
## 3         Internet.usage_few hours a day    Finances_Low
## 5         Internet.usage_few hours a day   Finances_High
## 6         Internet.usage_few hours a day    Finances_Low
## 7 Internet.usage_less than an hour a day   Finances_High
##          Shopping.centres        Branded.clothing        Entertainment.spending
## 1   Shopping.centres_High   Branded.clothing_High Entertainment.spending_Medium
## 2   Shopping.centres_High    Branded.clothing_Low   Entertainment.spending_High
## 3   Shopping.centres_High    Branded.clothing_Low   Entertainment.spending_High
## 5 Shopping.centres_Medium   Branded.clothing_High Entertainment.spending_Medium
## 6 Shopping.centres_Medium Branded.clothing_Medium Entertainment.spending_Medium
## 7 Shopping.centres_Medium    Branded.clothing_Low Entertainment.spending_Medium
##          Spending.on.looks      Spending.on.gadgets
## 1 Spending.on.looks_Medium  Spending.on.gadgets_Low
## 2    Spending.on.looks_Low Spending.on.gadgets_High
## 3 Spending.on.looks_Medium Spending.on.gadgets_High
## 5 Spending.on.looks_Medium  Spending.on.gadgets_Low
## 6    Spending.on.looks_Low Spending.on.gadgets_High
## 7   Spending.on.looks_High  Spending.on.gadgets_Low
##          Spending.on.healthy.eating       Age        Height        Weight
## 1 Spending.on.healthy.eating_Medium Age_15-20 Height_Medium    Weight_Low
## 2    Spending.on.healthy.eating_Low Age_15-20 Height_Medium Weight_Medium
## 3    Spending.on.healthy.eating_Low Age_15-20 Height_Medium Weight_Medium
## 5   Spending.on.healthy.eating_High Age_15-20 Height_Medium Weight_Medium
## 6   Spending.on.healthy.eating_High Age_15-20   Height_Tall Weight_Medium
## 7   Spending.on.healthy.eating_High Age_15-20 Height_Medium    Weight_Low
##                Number.of.siblings Gender Left...right.handed
## 1          Number.of.siblings_One female        right handed
## 2 Number.of.siblings_Two or Three female        right handed
## 3 Number.of.siblings_Two or Three female        right handed
## 5          Number.of.siblings_One female        right handed
## 6          Number.of.siblings_One   male        right handed
## 7          Number.of.siblings_One female        right handed
##                 Education    Only.child Village...town House...block.of.flats
## 1 college/bachelor degree Only.child_no        village         block of flats
## 2 college/bachelor degree Only.child_no           city         block of flats
## 3        secondary school Only.child_no           city         block of flats
## 5        secondary school Only.child_no        village         house/bungalow
## 6        secondary school Only.child_no           city         block of flats
## 7        secondary school Only.child_no        village         house/bungalow

With the data in this form, we can start the association rule mining.

Association Rule Mining

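# Collect every rule set in one list so the results stay together
rules <- list()

# Rules with "male" as the only antecedent; any item may appear as the consequent
rules$male <- apriori(data=trans1, parameter=list(supp=0.3, conf=0.4, minlen=2), appearance=list(default="rhs", lhs="male"), control=list(verbose=FALSE))
# Keep the 15 rules with the highest confidence
rules$male.byconf <- sort(rules$male, by="confidence", decreasing=TRUE)[1:15]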
inspect(rules$male.byconf)
##      lhs       rhs                              support   confidence coverage 
## [1]  {male} => {Music_High}                     0.3732194 0.9357143  0.3988604
## [2]  {male} => {Movies_High}                    0.3618234 0.9071429  0.3988604
## [3]  {male} => {Comedy_High}                    0.3603989 0.9035714  0.3988604
## [4]  {male} => {Fun.with.friends_High}          0.3547009 0.8892857  0.3988604
## [5]  {male} => {Storm_Low}                      0.3532764 0.8857143  0.3988604
## [6]  {male} => {right handed}                   0.3532764 0.8857143  0.3988604
## [7]  {male} => {Internet_High}                  0.3447293 0.8642857  0.3988604
## [8]  {male} => {Darkness_Low}                   0.3333333 0.8357143  0.3988604
## [9]  {male} => {Gardening_Low}                  0.3319088 0.8321429  0.3988604
## [10] {male} => {Writing_Low}                    0.3176638 0.7964286  0.3988604
## [11] {male} => {Only.child_no}                  0.3162393 0.7928571  0.3988604
## [12] {male} => {Action_High}                    0.3148148 0.7892857  0.3988604
## [13] {male} => {Hypochondria_Low}               0.3119658 0.7821429  0.3988604
## [14] {male} => {Dancing_Low}                    0.3105413 0.7785714  0.3988604
## [15] {male} => {Internet.usage_few hours a day} 0.3062678 0.7678571  0.3988604
##      lift      count
## [1]  0.9833405 262  
## [2]  0.9873090 254  
## [3]  1.0068367 253  
## [4]  0.9924938 249  
## [5]  1.1753713 248  
## [6]  0.9853747 248  
## [7]  1.1235714 242  
## [8]  1.3066179 234  
## [9]  1.1105785 233  
## [10] 1.0629142 223  
## [11] 1.0345459 222  
## [12] 1.4134657 221  
## [13] 1.0518473 219  
## [14] 1.3330662 218  
## [15] 1.0326355 215
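The columns in the inspect() output above are tied together by simple arithmetic, and checking one row by hand makes them easier to read. The sketch below uses only base R and the numbers printed for rule [1]. The variable names are illustrative, and the number of transactions n is inferred from count and support rather than read from the data.

```r
# Values copied from row [1] of the output above: {male} => {Music_High}
support    <- 0.3732194
confidence <- 0.9357143
coverage   <- 0.3988604
lift       <- 0.9833405
count      <- 262

# count = support * n, so the number of transactions follows from the two
n <- round(count / support)

# coverage is the support of the antecedent alone ({male})
lhs_count <- round(coverage * n)

# confidence = count(lhs and rhs) / count(lhs)
confidence_check <- count / lhs_count

# lift = confidence / support(rhs), so the overall share of the consequent is
rhs_support <- confidence / lift

print(c(n = n, lhs_count = lhs_count))
print(round(c(confidence_check = confidence_check, rhs_support = rhs_support), 4))
```

With these numbers n comes out at 702 transactions, 280 of which contain male. Music_High is present in roughly 95% of all transactions, which is why a rule with about 94% confidence still has a lift just below 1.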
# Same settings with "female" as the only antecedent
rules$female <- apriori(data=trans1, parameter=list(supp=0.3, conf=0.4, minlen=2), appearance=list(default="rhs", lhs="female"), control=list(verbose=FALSE))
rules$female.byconf <- sort(rules$female, by="confidence", decreasing=TRUE)[1:15]
inspect(rules$female.byconf)
##      lhs         rhs                          support   confidence coverage 
## [1]  {female} => {Music_High}                 0.5754986 0.9642005  0.5968661
## [2]  {female} => {Movies_High}                0.5541311 0.9284010  0.5968661
## [3]  {female} => {right handed}               0.5427350 0.9093079  0.5968661
## [4]  {female} => {Fun.with.friends_High}      0.5384615 0.9021480  0.5968661
## [5]  {female} => {Comedy_High}                0.5356125 0.8973747  0.5968661
## [6]  {female} => {Height_Medium}              0.5042735 0.8448687  0.5968661
## [7]  {female} => {Western_Low}                0.4914530 0.8233890  0.5968661
## [8]  {female} => {Weight_Medium}              0.4772080 0.7995227  0.5968661
## [9]  {female} => {Physics_Low}                0.4743590 0.7947494  0.5968661
## [10] {female} => {Borrowed.stuff_High}        0.4601140 0.7708831  0.5968661
## [11] {female} => {Compassion.to.animals_High} 0.4544160 0.7613365  0.5968661
## [12] {female} => {Empathy_High}               0.4487179 0.7517900  0.5968661
## [13] {female} => {Only.child_no}              0.4487179 0.7517900  0.5968661
## [14] {female} => {Keeping.promises_High}      0.4472934 0.7494033  0.5968661
## [15] {female} => {Fake_Low}                   0.4415954 0.7398568  0.5968661
##      lift      count
## [1]  1.0132765 404  
## [2]  1.0104457 389  
## [3]  1.0116230 381  
## [4]  1.0068488 378  
## [5]  0.9999318 376  
## [6]  1.2304935 354  
## [7]  1.1967269 345  
## [8]  1.0630017 335  
## [9]  1.1551016 333  
## [10] 1.0268689 323  
## [11] 1.1088345 319  
## [12] 1.1017882 315  
## [13] 0.9809602 315  
## [14] 0.9944823 314  
## [15] 1.0284742 310
# Repeat for the 15-20 age group
rules$Age15_20 <- apriori(data=trans1, parameter=list(supp=0.3, conf=0.4, minlen=2), appearance=list(default="rhs", lhs="Age_15-20"), control=list(verbose=FALSE))
rules$Age15_20.byconf <- sort(rules$Age15_20, by="confidence", decreasing=TRUE)[1:15]
inspect(rules$Age15_20.byconf)
##      lhs            rhs                              support   confidence
## [1]  {Age_15-20} => {Music_High}                     0.6039886 0.9636364 
## [2]  {Age_15-20} => {Movies_High}                    0.5811966 0.9272727 
## [3]  {Age_15-20} => {Comedy_High}                    0.5712251 0.9113636 
## [4]  {Age_15-20} => {Fun.with.friends_High}          0.5698006 0.9090909 
## [5]  {Age_15-20} => {right handed}                   0.5683761 0.9068182 
## [6]  {Age_15-20} => {Internet_High}                  0.4900285 0.7818182 
## [7]  {Age_15-20} => {Storm_Low}                      0.4886040 0.7795455 
## [8]  {Age_15-20} => {Weight_Medium}                  0.4829060 0.7704545 
## [9]  {Age_15-20} => {Internet.usage_few hours a day} 0.4814815 0.7681818 
## [10] {Age_15-20} => {Gardening_Low}                  0.4800570 0.7659091 
## [11] {Age_15-20} => {Borrowed.stuff_High}            0.4772080 0.7613636 
## [12] {Age_15-20} => {Hypochondria_Low}               0.4700855 0.7500000 
## [13] {Age_15-20} => {Writing_Low}                    0.4672365 0.7454545 
## [14] {Age_15-20} => {Only.child_no}                  0.4672365 0.7454545 
## [15] {Age_15-20} => {Judgment.calls_High}            0.4558405 0.7272727 
##      coverage  lift      count
## [1]  0.6267806 1.0126837 424  
## [2]  0.6267806 1.0092178 408  
## [3]  0.6267806 1.0155195 401  
## [4]  0.6267806 1.0145975 400  
## [5]  0.6267806 1.0088532 399  
## [6]  0.6267806 1.0163636 344  
## [7]  0.6267806 1.0344819 343  
## [8]  0.6267806 1.0243543 339  
## [9]  0.6267806 1.0330721 338  
## [10] 0.6267806 1.0221829 337  
## [11] 0.6267806 1.0141884 335  
## [12] 0.6267806 1.0086207 330  
## [13] 0.6267806 0.9948842 328  
## [14] 0.6267806 0.9726935 328  
## [15] 0.6267806 1.0293255 320
# 20-25 age group: support lowered to 0.2 because the group covers only about 31% of respondents
rules$Age20_25 <- apriori(data=trans1, parameter=list(supp=0.2, conf=0.4, minlen=2), appearance=list(default="rhs", lhs="Age_20-25"), control=list(verbose=FALSE))
rules$Age20_25.byconf <- sort(rules$Age20_25, by="confidence", decreasing=TRUE)[1:15]
inspect(rules$Age20_25.byconf)
##      lhs            rhs                     support   confidence coverage 
## [1]  {Age_20-25} => {Music_High}            0.2934473 0.9406393  0.3119658
## [2]  {Age_20-25} => {right handed}          0.2820513 0.9041096  0.3119658
## [3]  {Age_20-25} => {Movies_High}           0.2820513 0.9041096  0.3119658
## [4]  {Age_20-25} => {Fun.with.friends_High} 0.2763533 0.8858447  0.3119658
## [5]  {Age_20-25} => {Comedy_High}           0.2749288 0.8812785  0.3119658
## [6]  {Age_20-25} => {Keeping.promises_High} 0.2521368 0.8082192  0.3119658
## [7]  {Age_20-25} => {Only.child_no}         0.2478632 0.7945205  0.3119658
## [8]  {Age_20-25} => {Writing_Low}           0.2421652 0.7762557  0.3119658
## [9]  {Age_20-25} => {Chemistry_Low}         0.2407407 0.7716895  0.3119658
## [10] {Age_20-25} => {Hypochondria_Low}      0.2350427 0.7534247  0.3119658
## [11] {Age_20-25} => {city}                  0.2321937 0.7442922  0.3119658
## [12] {Age_20-25} => {Weight_Medium}         0.2321937 0.7442922  0.3119658
## [13] {Age_20-25} => {Fake_Low}              0.2293447 0.7351598  0.3119658
## [14] {Age_20-25} => {Internet_High}         0.2293447 0.7351598  0.3119658
## [15] {Age_20-25} => {Reliability_High}      0.2279202 0.7305936  0.3119658
##      lift      count
## [1]  0.9885161 206  
## [2]  1.0058398 198  
## [3]  0.9840076 198  
## [4]  0.9886534 194  
## [5]  0.9819961 193  
## [6]  1.0725328 177  
## [7]  1.0367164 174  
## [8]  1.0359915 170  
## [9]  1.0812895 169  
## [10] 1.0132263 165  
## [11] 1.0366928 163  
## [12] 0.9895704 163  
## [13] 1.0219449 161  
## [14] 0.9557078 161  
## [15] 1.0866032 160
# 25-30 age group: thresholds lowered further because the group covers only about 6% of respondents
rules$Age25_30 <- apriori(data=trans1, parameter=list(supp=0.03, conf=0.2, minlen=2), appearance=list(default="rhs", lhs="Age_25-30"), control=list(verbose=FALSE))
rules$Age25_30.byconf <- sort(rules$Age25_30, by="confidence", decreasing=TRUE)[1:15]
inspect(rules$Age25_30.byconf)
##      lhs            rhs                      support    confidence coverage  
## [1]  {Age_25-30} => {Movies_High}            0.05555556 0.9285714  0.05982906
## [2]  {Age_25-30} => {Music_High}             0.05413105 0.9047619  0.05982906
## [3]  {Age_25-30} => {Only.child_no}          0.05128205 0.8571429  0.05982906
## [4]  {Age_25-30} => {Comedy_High}            0.05128205 0.8571429  0.05982906
## [5]  {Age_25-30} => {Internet_High}          0.04985755 0.8333333  0.05982906
## [6]  {Age_25-30} => {Fun.with.friends_High}  0.04985755 0.8333333  0.05982906
## [7]  {Age_25-30} => {right handed}           0.04843305 0.8095238  0.05982906
## [8]  {Age_25-30} => {Borrowed.stuff_High}    0.04700855 0.7857143  0.05982906
## [9]  {Age_25-30} => {Keeping.promises_High}  0.04558405 0.7619048  0.05982906
## [10] {Age_25-30} => {Reliability_High}       0.04415954 0.7380952  0.05982906
## [11] {Age_25-30} => {city}                   0.04415954 0.7380952  0.05982906
## [12] {Age_25-30} => {Fake_Low}               0.04415954 0.7380952  0.05982906
## [13] {Age_25-30} => {PC_High}                0.04273504 0.7142857  0.05982906
## [14] {Age_25-30} => {Thriller_High}          0.04273504 0.7142857  0.05982906
## [15] {Age_25-30} => {Happiness.in.life_High} 0.04273504 0.7142857  0.05982906
##      lift      count
## [1]  1.0106312 39   
## [2]  0.9508127 38   
## [3]  1.1184280 36   
## [4]  0.9551020 36   
## [5]  1.0833333 35   
## [6]  0.9300477 35   
## [7]  0.9006113 34   
## [8]  1.0466251 33   
## [9]  1.0110721 32   
## [10] 1.0977603 31   
## [11] 1.0280612 31   
## [12] 1.0260255 31   
## [13] 1.6826462 30   
## [14] 1.4085072 30   
## [15] 1.1044682 30
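Sorting by confidence tends to put very common items such as Movies_High at the top even when their lift is close to 1. As a rough illustration, the base R sketch below re-orders four rows copied from the Age_25-30 output above by lift instead. The data frame, the column selection and the 1.1 cutoff are assumptions made for this example and are not part of the original analysis.

```r
# Four rows copied from the Age_25-30 output above (rules 1, 7, 13 and 14)
age25_30 <- data.frame(
  rhs        = c("Movies_High", "right handed", "PC_High", "Thriller_High"),
  confidence = c(0.9285714, 0.8095238, 0.7142857, 0.7142857),
  lift       = c(1.0106312, 0.9006113, 1.6826462, 1.4085072),
  count      = c(39, 34, 30, 30),
  stringsAsFactors = FALSE
)

# Order by lift instead of confidence
by_lift <- age25_30[order(age25_30$lift, decreasing = TRUE), ]
print(by_lift)

# Hypothetical cutoff: keep rules whose lift is at least 10% above independence
min_lift <- 1.1
strong <- by_lift[by_lift$lift >= min_lift, ]
print(strong$rhs)
```

On the full rule set the same ordering should be available directly through sort(rules$Age25_30, by="lift"). The counts deserve some caution: count divided by confidence gives only about 42 respondents in this age group, so a lift of 1.68 rests on 30 people.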
# Reverse the direction: fix the consequent to current smokers and allow any items in the antecedent
rules$Smoking <- apriori(data=trans1, parameter=list(supp=0.1, conf=0.22, minlen=2), appearance=list(default="lhs", rhs="Smoking_current smoker"), control=list(verbose=FALSE))
rules$Smoking.byconf <- sort(rules$Smoking, by="confidence", decreasing=TRUE)[1:15]
inspect(rules$Smoking.byconf)
##      lhs                              rhs                        support confidence  coverage     lift count
## [1]  {Cheating.in.school_High,                                                                              
##       Fun.with.friends_High,                                                                                
##       Judgment.calls_High}         => {Smoking_current smoker} 0.1039886  0.2607143 0.3988604 1.475979    73
## [2]  {Cheating.in.school_High,                                                                              
##       Chemistry_Low,                                                                                        
##       Fun.with.friends_High}       => {Smoking_current smoker} 0.1054131  0.2596491 0.4059829 1.469949    74
## [3]  {Folk_Low,                                                                                             
##       Fun.with.friends_High,                                                                                
##       Religion_Low}                => {Smoking_current smoker} 0.1039886  0.2588652 0.4017094 1.465511    73
## [4]  {Fun.with.friends_High,                                                                                
##       Judgment.calls_High,                                                                                  
##       Religion_Low}                => {Smoking_current smoker} 0.1039886  0.2588652 0.4017094 1.465511    73
## [5]  {Cheating.in.school_High,                                                                              
##       Chemistry_Low,                                                                                        
##       Music_High}                  => {Smoking_current smoker} 0.1068376  0.2568493 0.4159544 1.454099    75
## [6]  {Entertainment.spending_High} => {Smoking_current smoker} 0.1039886  0.2561404 0.4059829 1.450085    73
## [7]  {Chemistry_Low,                                                                                        
##       Folk_Low,                                                                                             
##       Fun.with.friends_High}       => {Smoking_current smoker} 0.1054131  0.2560554 0.4116809 1.449604    74
## [8]  {Judgment.calls_High,                                                                                  
##       Music_High,                                                                                           
##       Religion_Low}                => {Smoking_current smoker} 0.1082621  0.2558923 0.4230769 1.448680    76
## [9]  {Folk_Low,                                                                                             
##       Fun.with.friends_High,                                                                                
##       Judgment.calls_High}         => {Smoking_current smoker} 0.1025641  0.2553191 0.4017094 1.445436    72
## [10] {Chemistry_Low,                                                                                        
##       Fun.with.friends_High,                                                                                
##       Religion_Low}                => {Smoking_current smoker} 0.1054131  0.2551724 0.4131054 1.444605    74
## [11] {Folk_Low,                                                                                             
##       Music_High,                                                                                           
##       Religion_Low}                => {Smoking_current smoker} 0.1082621  0.2550336 0.4245014 1.443819    76
## [12] {Chemistry_Low,                                                                                        
##       Music_High,                                                                                           
##       Religion_Low}                => {Smoking_current smoker} 0.1096866  0.2549669 0.4301994 1.443442    77
## [13] {Cheating.in.school_High,                                                                              
##       Judgment.calls_High,                                                                                  
##       Music_High}                  => {Smoking_current smoker} 0.1039886  0.2534722 0.4102564 1.434980    73
## [14] {Folk_Low,                                                                                             
##       Judgment.calls_High,                                                                                  
##       Music_High}                  => {Smoking_current smoker} 0.1068376  0.2533784 0.4216524 1.434449    75
## [15] {Cheating.in.school_High,                                                                              
##       Chemistry_Low}               => {Smoking_current smoker} 0.1096866  0.2516340 0.4358974 1.424573    77
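The rules above have several items in the antecedent and a single fixed consequent. To show how support, coverage, confidence and lift are computed for such a rule, here is a from-scratch sketch in base R. The toy table, its item names and the helper rule_metrics() are hypothetical and unrelated to the survey data or to the arules implementation.

```r
# Hypothetical toy data: one row per respondent, one logical column per item.
# These values are made up for illustration and are not the survey data.
toy <- data.frame(
  Fun_High     = c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE),
  Religion_Low = c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, FALSE),
  Smoker       = c(TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE)
)

# Hypothetical helper: metrics for the rule {lhs items} => {rhs item}
rule_metrics <- function(items, lhs, rhs) {
  n <- nrow(items)
  # Rows that contain every antecedent item
  has_lhs <- rowSums(items[, lhs, drop = FALSE]) == length(lhs)
  has_rhs <- items[[rhs]]
  count <- sum(has_lhs & has_rhs)
  support <- count / n                 # share of rows with lhs and rhs together
  coverage <- sum(has_lhs) / n         # share of rows with the lhs alone
  confidence <- support / coverage     # share of lhs rows that also have the rhs
  lift <- confidence / (sum(has_rhs) / n)
  c(support = support, confidence = confidence,
    coverage = coverage, lift = lift, count = count)
}

m <- rule_metrics(toy, lhs = c("Fun_High", "Religion_Low"), rhs = "Smoker")
print(round(m, 4))
```

In this toy table the antecedent holds for 4 of 8 rows and 2 of those are smokers, so confidence is 0.5 against an overall smoker share of 0.375, giving a lift of about 1.33.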
# Same setup with heavy drinking as the consequent
rules$Drinking <- apriori(data=trans1, parameter=list(supp=0.1, conf=0.22, minlen=2), appearance=list(default="lhs", rhs="Alcohol_drink a lot"), control=list(verbose=FALSE))
rules$Drinking.byconf <- sort(rules$Drinking, by="confidence", decreasing=TRUE)[1:15]
inspect(rules$Drinking.byconf)
##      lhs                                 rhs                     support confidence  coverage     lift count
## [1]  {Charity_Low,                                                                                          
##       Entertainment.spending_High,                                                                          
##       Storm_Low}                      => {Alcohol_drink a lot} 0.1011396  0.5071429 0.1994302 2.239084    71
## [2]  {Dancing_Low,                                                                                          
##       Entertainment.spending_High,                                                                          
##       Fun.with.friends_High}          => {Alcohol_drink a lot} 0.1111111  0.5064935 0.2193732 2.236217    78
## [3]  {Entertainment.spending_High,                                                                          
##       Getting.up_High,                                                                                      
##       Internet.usage_few hours a day} => {Alcohol_drink a lot} 0.1025641  0.5000000 0.2051282 2.207547    72
## [4]  {Charity_Low,                                                                                          
##       Entertainment.spending_High,                                                                          
##       Internet_High}                  => {Alcohol_drink a lot} 0.1039886  0.4965986 0.2094017 2.192530    73
## [5]  {Charity_Low,                                                                                          
##       Entertainment.spending_High,                                                                          
##       Internet.usage_few hours a day} => {Alcohol_drink a lot} 0.1025641  0.4965517 0.2065527 2.192323    72
## [6]  {Dancing_Low,                                                                                          
##       Entertainment.spending_High,                                                                          
##       Music_High}                     => {Alcohol_drink a lot} 0.1096866  0.4935897 0.2222222 2.179245    77
## [7]  {Entertainment.spending_High,                                                                          
##       Getting.up_High,                                                                                      
##       Storm_Low}                      => {Alcohol_drink a lot} 0.1082621  0.4935065 0.2193732 2.178878    76
## [8]  {Dancing_Low,                                                                                          
##       Entertainment.spending_High,                                                                          
##       right handed}                   => {Alcohol_drink a lot} 0.1039886  0.4932432 0.2108262 2.177715    73
## [9]  {Dancing_Low,                                                                                          
##       Entertainment.spending_High}    => {Alcohol_drink a lot} 0.1139601  0.4907975 0.2321937 2.166917    80
## [10] {Dancing_Low,                                                                                          
##       Entertainment.spending_High,                                                                          
##       Movies_High}                    => {Alcohol_drink a lot} 0.1011396  0.4829932 0.2094017 2.132461    71
## [11] {Charity_Low,                                                                                          
##       Entertainment.spending_High,                                                                          
##       Gardening_Low}                  => {Alcohol_drink a lot} 0.1054131  0.4774194 0.2207977 2.107851    74
## [12] {city,                                                                                                 
##       Entertainment.spending_High,                                                                          
##       Getting.up_High}                => {Alcohol_drink a lot} 0.1025641  0.4768212 0.2150997 2.105211    72
## [13] {Entertainment.spending_High,                                                                          
##       Getting.up_High,                                                                                      
##       Only.child_no}                  => {Alcohol_drink a lot} 0.1011396  0.4765101 0.2122507 2.103837    71
## [14] {Entertainment.spending_High,                                                                          
##       Getting.up_High,                                                                                      
##       Internet_High}                  => {Alcohol_drink a lot} 0.1096866  0.4753086 0.2307692 2.098532    77
## [15] {Chemistry_Low,                                                                                        
##       Entertainment.spending_High,                                                                          
##       Getting.up_High}                => {Alcohol_drink a lot} 0.1039886  0.4740260 0.2193732 2.092869    73
# Same setup with a high healthy-eating score as the consequent
rules$Healthy <- apriori(data=trans1, parameter=list(supp=0.1, conf=0.22, minlen=2), appearance=list(default="lhs", rhs="Healthy.eating_High"), control=list(verbose=FALSE))
rules$Healthy.byconf <- sort(rules$Healthy, by="confidence", decreasing=TRUE)[1:15]
inspect(rules$Healthy.byconf)
##      lhs                                   rhs                     support confidence  coverage     lift count
## [1]  {Interests.or.hobbies_High,                                                                              
##       Keeping.promises_High,                                                                                  
##       Spending.on.healthy.eating_High}  => {Healthy.eating_High} 0.1096866  0.4610778 0.2378917 1.778443    77
## [2]  {Borrowed.stuff_High,                                                                                    
##       Interests.or.hobbies_High,                                                                              
##       Spending.on.healthy.eating_High}  => {Healthy.eating_High} 0.1039886  0.4534161 0.2293447 1.748891    73
## [3]  {city,                                                                                                   
##       Energy.levels_High,                                                                                     
##       Spending.on.healthy.eating_High}  => {Healthy.eating_High} 0.1054131  0.4484848 0.2350427 1.729870    74
## [4]  {Eating.to.survive_Low,                                                                                  
##       Energy.levels_High,                                                                                     
##       Spending.on.healthy.eating_High}  => {Healthy.eating_High} 0.1025641  0.4444444 0.2307692 1.714286    72
## [5]  {Punctuality_i am always on time,                                                                        
##       Spending.on.healthy.eating_High}  => {Healthy.eating_High} 0.1011396  0.4437500 0.2279202 1.711607    71
## [6]  {Energy.levels_High,                                                                                     
##       Reliability_High,                                                                                       
##       Spending.on.healthy.eating_High}  => {Healthy.eating_High} 0.1111111  0.4431818 0.2507123 1.709416    78
## [7]  {city,                                                                                                   
##       Documentary_High,                                                                                       
##       Spending.on.healthy.eating_High}  => {Healthy.eating_High} 0.1054131  0.4404762 0.2393162 1.698980    74
## [8]  {city,                                                                                                   
##       Lying_sometimes,                                                                                        
##       Spending.on.healthy.eating_High}  => {Healthy.eating_High} 0.1011396  0.4382716 0.2307692 1.690476    71
## [9]  {Internet_High,                                                                                          
##       Lying_sometimes,                                                                                        
##       Spending.on.healthy.eating_High}  => {Healthy.eating_High} 0.1011396  0.4382716 0.2307692 1.690476    71
## [10] {Borrowed.stuff_High,                                                                                    
##       Energy.levels_High,                                                                                     
##       Spending.on.healthy.eating_High}  => {Healthy.eating_High} 0.1111111  0.4382022 0.2535613 1.690209    78
## [11] {Energy.levels_High,                                                                                     
##       Keeping.promises_High,                                                                                  
##       Spending.on.healthy.eating_High}  => {Healthy.eating_High} 0.1111111  0.4357542 0.2549858 1.680766    78
## [12] {Interests.or.hobbies_High,                                                                              
##       Only.child_no,                                                                                          
##       Spending.on.healthy.eating_High}  => {Healthy.eating_High} 0.1011396  0.4355828 0.2321937 1.680105    71
## [13] {Energy.levels_High,                                                                                     
##       Flying_Low,                                                                                             
##       Spending.on.healthy.eating_High}  => {Healthy.eating_High} 0.1011396  0.4355828 0.2321937 1.680105    71
## [14] {female,                                                                                                 
##       Only.child_no,                                                                                          
##       Spending.on.healthy.eating_High}  => {Healthy.eating_High} 0.1039886  0.4345238 0.2393162 1.676020    73
## [15] {Borrowed.stuff_High,                                                                                    
##       Spending.on.healthy.eating_High,                                                                        
##       Thinking.ahead_High}              => {Healthy.eating_High} 0.1025641  0.4337349 0.2364672 1.672978    72
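
To make the columns of the output above easier to read, the following sketch recomputes support, confidence, coverage and lift for rule [6] in base R. The counts are assumptions inferred from the printed row rather than read from the dataset: 702 transactions (count divided by support), 176 transactions matching the antecedent (coverage times 702), and 182 transactions containing Healthy.eating_High (backed out of the lift value). The variable names are ours, for illustration only.

```r
# Counts inferred from the printed row for rule [6]; not taken from the raw data.
n_trans    <- 702  # inferred: count / support = 78 / 0.1111111
count_lhs  <- 176  # inferred: coverage * n_trans = 0.2507123 * 702
count_rhs  <- 182  # inferred from lift: transactions with Healthy.eating_High
count_both <- 78   # printed "count" column: antecedent and consequent together

support    <- count_both / n_trans                # share of all transactions matching the whole rule
coverage   <- count_lhs / n_trans                 # share of all transactions matching the antecedent
confidence <- count_both / count_lhs              # share of antecedent matches that also match the consequent
lift       <- confidence / (count_rhs / n_trans)  # confidence relative to the consequent's base rate

round(c(support = support, confidence = confidence,
        coverage = coverage, lift = lift), 6)
```

Under these inferred counts the four values agree with the printed row for rule [6] after rounding. A lift of about 1.7 means that respondents matching the antecedent report high healthy eating about 1.7 times as often as respondents overall.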

Plotting Top 15 Rules

By Gender

## Male
plot(rules$male.byconf, method="graph")

plot(rules$male.byconf, method="paracoord", control=list(reorder=TRUE), main='Top 15 rules for males')

## Female
plot(rules$female.byconf, method="graph")

plot(rules$female.byconf, method="paracoord", control=list(reorder=TRUE), main='Top 15 rules for females')

By Age

## 15 - 20
plot(rules$Age15_20.byconf, method="graph")

plot(rules$Age15_20.byconf, method="paracoord", control=list(reorder=TRUE), main='Top 15 rules for ages 15 - 20')

## 21 - 25
plot(rules$Age20_25.byconf, method="graph")

plot(rules$Age20_25.byconf, method="paracoord", control=list(reorder=TRUE), main='Top 15 rules for ages 21 - 25')

## 26 - 30
plot(rules$Age25_30.byconf, method="graph")

plot(rules$Age25_30.byconf, method="paracoord", control=list(reorder=TRUE), main='Top 15 rules for ages 26 - 30')

By Habits

## Drinking
plot(rules$Drinking.byconf, method="graph")

plot(rules$Drinking.byconf, method="paracoord", control=list(reorder=TRUE), main='Top 15 rules for High Drinking')

## Smoking
plot(rules$Smoking.byconf, method="graph")

plot(rules$Smoking.byconf, method="paracoord", control=list(reorder=TRUE), main='Top 15 rules for High Smoking')

## Healthy Eating
plot(rules$Healthy.byconf, method="graph")

plot(rules$Healthy.byconf, method="paracoord", control=list(reorder=TRUE), main='Top 15 rules for Healthy Eating')

Conclusions

General

The Apriori results indicate that interests and habits differ across genders and age groups: people tend to want different things at different stages of life.

The more interesting part of the plots is what they show about the factors that accompany three habits: smoking, drinking alcohol, and eating healthily. The rules point to a complex interplay of factors around habit formation. With a larger sample it might be possible to conduct a more robust analysis and to use it to guide young people before bad habits form.

Gender Association Rules

Males

The analysis revealed that males are highly associated with preferences for music, movies, and comedy, as well as traits like being right-handed and having low interest in gardening or writing. These rules suggest that males in the dataset tend to prioritize entertainment and leisure activities.

Females

Females showed strong associations with high interest in music, movies, and social activities, as well as traits like being right-handed and having medium height. Additionally, females were more likely to exhibit empathy and compassion toward animals, indicating a focus on emotional and social well-being.

Habit Association Rules

Smoking

Rules related to smoking highlighted associations with high entertainment spending, low interest in religion, and high levels of cheating in school. These patterns suggest that smoking behavior may be linked to risk-taking and non-conformist tendencies.

Drinking

High alcohol consumption was associated with high entertainment spending, low interest in charity, and being right-handed. These rules indicate that drinking behavior is often tied to social and leisure activities.

Healthy Eating

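Healthy eating was associated with high spending on healthy food, which appears in the antecedent of rules [2] through [15] in the top-15 output, often together with high energy levels and reliability. Those rules have confidence of roughly 0.43 to 0.45 and lift of roughly 1.67 to 1.75, so respondents matching these antecedents reported high healthy eating about 1.7 times as often as respondents overall. These rules suggest that individuals who prioritize health are also likely to exhibit disciplined and responsible behaviors.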