Mateusz Tomczak

432561

Introduction

The goal of this project is to understand which sets of programming languages are used together by developers in different fields of software development, using association rule algorithms. Knowing which languages are often used together can prove quite helpful, especially for aspiring developers who feel overwhelmed by the learning possibilities in this field. It can also provide insight into the programming field and show some relationships between languages used in front-end and back-end development. The data used for this project comes from the public Stack Overflow Developer Survey Results 2023. This dataset provides a broad overview of Stack Overflow users and gathers basic information such as education and position at work, but also covers more recent topics such as AI tools and their use in the workplace. For the task presented in this project one column was selected, ‘LanguagesHaveWorkedWith’, which contains the programming languages that the developer has worked with. It needs to be pointed out that the survey is open to every Stack Overflow user, so the data reflects both professional and amateur developers. However, even in the case of amateur programmers the data can point to languages that are closely used together, or show a natural progression of a programmer’s skill set. The analysis will mainly use the Apriori algorithm, but the ECLAT algorithm will also be used.

Libraries used

library(arules)
library(arulesViz)

The data

data <- read.transactions("data/stack-overflow-developer-survey-2023/data.csv", sep=',')

We can see that the most popular programming language among Stack Overflow users is JavaScript, with HTML/CSS in second place and Python in third.

options(width = 100)
summary(data)
## transactions as itemMatrix in sparse format with
##  87140 rows (elements/itemsets/transactions) and
##  51 columns (items) and a density of 0.1048925 
## 
## most frequent items:
## JavaScript   HTML/CSS     Python        SQL TypeScript    (Other) 
##      55711      46396      43158      42623      34041     244228 
## 
## element (itemset/transaction) length distribution:
## sizes
##     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16 
##  5361  8753 12229 13157 12424 10182  7647  5548  3807  2558  1827  1192   775   562   361   232 
##    17    18    19    20    21    22    23    24    25    26    27    28    29    30    31    32 
##   118   120    65    52    30    24    20    12    14     5     4     2     5     2     3     1 
##    33    34    35    36    37    38    41    42    43    44    45    46    47    48    49    50 
##     4     1     2     1     1     1     2     1     1     1     1     1     1     1     1     6 
##    51 
##    22 
## 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00    3.00    5.00    5.35    7.00   51.00 
## 
## includes extended item information - examples:
##   labels
## 1    Ada
## 2   Apex
## 3    APL
options(width = 100, max.print = 1000)
size(data)
##    [1]  3  2  7  3  6 15  7  5  6  5  8  7  4  3 11  2  4  1  3  4  7  3  4  2  2  7 14  2  6  3  7
##   [32]  6  1  3  9  5  8  7  3  3  4  1  4  3  4  7  4  6  4  3  1  6  6  5  5  4  4  6  2  2  2  5
##   [63]  1  6  4  3  3  5  3  5  1  9  2  2  2  4  2  1  6  8  1  5  5  6  9  6  5 14  4  2  5  6  2
##   [94]  9  4  2  9  5  3  7  2  6  3  3  5  7  6  4  4  4  1 13  6  6 12  4 10  2  6  6  1  3  1  3
##  [125]  4  8  1  4  6  8 10  6  7  5  3  5  5  8  4  2  3  8  1  5 16  6  8  5  1  6  8  5  6  1  5
##  [156]  5  9 10  4  4  6  6 10  6  7 18  5  9  3  3  8  3  4  5  1  7  2  4  3 10  4 10  3  4  1  3
##  [187]  7  6  7  5  4 19  3  5  4  3  3 10  3  7  5 14  4  7  2  4 20  6  5  7  4 10  5  3  3  2  5
##  [218]  1 12  3  3  3  4  2  8  3 13  5  6 11 11  6  7  2  6 12  5  4 10  1  2  3  3  2  1  8  3  2
##  [249]  4  7  5  1  3  7  7  4  3 10  3  2  4  4  4  9  6  5  4  6  9  3  4  5  6  4  3  6  6  8  3
##  [280]  4  2  6  6  5  1  3  6  3  6  2  8  4  4  5  8  5  9  5 11  4  2  2  6  1  8  4  7  6  8  3
##  [311]  4 13  6  6 14  7  3  5  7  7  7  6 11  1  1  1  3  2  5  5  5  1  5  6  6  1  3  5 10  1  4
##  [342]  9  9  7 11  8  1  1  7  8  4  8  5  4  3  4  8  3  5  3 17  8  4  4  3  1  9  7  4  6  4  4
##  [373]  7  2  5  3  9  3  3  4  9  6  3  5  5  4  4  3 10  3  9 14  8  2  7  8  4  2  4  2  3  8 15
##  [404]  5  6  2  8 13  5  5  5  5  1  5 11  6  4  8  9  5  7  4  7  6  1  3  8  6  5  4 24  1  7  3
##  [435]  6  1  4  4  5  5  6  7  6  6 14  1  4  9  2  8  4  1  3  2  3  4 10  5  6  8  3  8  4  8  4
##  [466]  7  2  2  5 11  4  2  5  1  4  5  3  3 16  5  6  6  9  4  8  3  4  6  3  2  2  4 11  8  7  3
##  [497]  2  6  4  7  4  6  6  3  7  7  6  8  8  4 11  4  8  1  8  3  6 10  4  3  6  2  2 10  6  4  6
##  [528]  4  3  8  7  6  5  6  3  5  5  2  5  7  4  4 10  6  4  9  4  7  5 11  2 12 12  7 10  5  5  4
##  [559]  5  4  4  6  5  4  2  4  4 11  5  7 14  3  3 11 12  3  7  1  4  4  9  7  8  2  4 12  2  4  5
##  [590]  4  3  3  4  5  5  6  3  6 12  5  3 12  5  2  6  5  5  3  5  7  5  2  7  4  3  4  4  4  9  6
##  [621]  3  6  5  6 10  8 11  6  7  4  3  5  8 13  6  5  5  9  3  8  4  6  7  6  2  8  6 13  4  6  2
##  [652]  7  5  1  5  4  3  5  7  5 19  6  5  3  3  7  4  3  4  7  3  2  3  7  2  4  4  5  4  2  5  5
##  [683]  5  3  2  2  3  6  6  5  6  6 10  6  5  7  6  3  4  3  2  2  1  2 11  2  3  3  3  3  3  8  3
##  [714]  1  6  6 10  2  1  3  6  3  4  6  3  1  5  7  4  2  6  3  4  4  6  2 10  2  7  8  6  5  2  4
##  [745]  6  7  6  6  9 10  3  3  2  4  2  8  3  3  6  2  8  7 13  2  4  5  2  1  3  4  3  6 10  2  8
##  [776]  4  3 10  4  6  5  2  3  6  4  6  1  9  1  6  4  5  8  6 11  3  4 11  3  5  5  5  5  1  4  4
##  [807]  5  6  6  2  5  4 10  3  6  3  3  6 11  4  3  6  7  5 10  5  2 22  2  6  6  2  3  3  6  6  7
##  [838]  6  4  5  8  5  6  5  8  1  5  3  6  4  4  5  6  2  4  8  5  9  2  3  2  4  4  8  5  4  2  6
##  [869]  4  4 13  5  7  4  8  3  2  4  6  7 15  7  7  2  7  7  3  3  4  5  4  7  9  9  1  5  6  7  4
##  [900]  7  8  3  7  6  9  6  3  3  5  6  2  5  4  4  6  8  1  3  7  8  4  7  2  3  5  1  4  9  7  3
##  [931]  1  2  5  6  5  3 11  4  7  2  6  3  4  7  2  3  3  7  5 10  7  5 10  8  7  6  4  1  6  6  8
##  [962]  5  3  8  2  8 11  2  5  5  5 10 11  2  4  6  4  2  4  4  2  4  8  3  7  2  7  5 18  5  7 10
##  [993]  3  4  6  3  2  4  5 11
##  [ reached getOption("max.print") -- omitted 86140 entries ]
median(size(data))
## [1] 5

As we can see, the median observation, or ‘transaction’, contains 5 programming languages, while the most common transaction size is 4 (13157 transactions).
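The median and the most common transaction size can be checked directly from the size distribution printed by summary(data). The sketch below is a minimal base R illustration: it copies only the counts for sizes 1 to 8 from that output (enough to reach the middle of the 87140 transactions), and the variable names are illustrative rather than part of the original analysis.

```r
# Counts of transactions containing 1..8 items, copied from the summary(data) output
sizes  <- 1:8
counts <- c(5361, 8753, 12229, 13157, 12424, 10182, 7647, 5548)
n      <- 87140  # total number of transactions

# Most common size: the size with the largest count
mode_size <- sizes[which.max(counts)]

# Median size: the first size whose cumulative count reaches half of all transactions
median_size <- sizes[which(cumsum(counts) >= n / 2)[1]]

print(c(mode = mode_size, median = median_size))
```

Under these assumptions the most common size and the median differ by one language, which is consistent with the right-skewed distribution shown in the summary.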

length(data)
## [1] 87140

The total number of observations is 87140.
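The density reported by summary(data) can be tied back to this number of observations: it is the share of filled cells in the transactions-by-items matrix. The following sketch recomputes it from the item counts in the summary output, including the "(Other)" bucket; the variable names are hypothetical and the snippet is only a sanity check, not part of the original analysis.

```r
# Figures copied from the summary(data) output
n_transactions <- 87140
n_items        <- 51
# Five most frequent items plus the "(Other)" bucket
item_counts <- c(55711, 46396, 43158, 42623, 34041, 244228)

# Total number of (transaction, language) pairs in the data
total_entries <- sum(item_counts)

# Density: filled cells divided by all cells of the sparse matrix
density <- total_entries / (n_transactions * n_items)

# Mean transaction size: filled cells divided by number of transactions
mean_size <- total_entries / n_transactions

print(round(c(density = density, mean_size = mean_size), 4))
```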

Relative frequency of the programming languages in the dataset:

options(width = 100)
itemFrequency(data, type="relative")
##                     Ada                    Apex                     APL                Assembly 
##             0.007769107             0.006644480             0.002582052             0.054544411 
## Bash/Shell (all shells)                       C                      C#                     C++ 
##             0.325350011             0.194399816             0.277633693             0.225315584 
##                 Clojure                   Cobol                 Crystal                    Dart 
##             0.012680744             0.006610053             0.004464081             0.060511820 
##                  Delphi                  Elixir                  Erlang                      F# 
##             0.032487950             0.023272894             0.009960982             0.009742942 
##                    Flow                 Fortran                GDScript                      Go 
##             0.002455818             0.009559330             0.017156300             0.133027312 
##                  Groovy                 Haskell                HTML/CSS                    Java 
##             0.034151939             0.020989213             0.532430571             0.307057608 
##              JavaScript                   Julia                  Kotlin                    Lisp 
##             0.639327519             0.011590544             0.091060363             0.015400505 
##                     Lua                  MATLAB                     Nim             Objective-C 
##             0.061234795             0.038317650             0.003798485             0.023169612 
##                   OCaml                    Perl                     PHP              PowerShell 
##             0.007046133             0.024684416             0.186756943             0.136584806 
##                  Prolog                  Python                       R                    Raku 
##             0.008905210             0.495271976             0.042483360             0.001790223 
##                    Ruby                    Rust                     SAS                   Scala 
##             0.062588937             0.131133808             0.004900161             0.027794354 
##                Solidity                     SQL                   Swift              TypeScript 
##             0.013403718             0.489132431             0.046729401             0.390647234 
##                     VBA     Visual Basic (.Net)                     Zig 
##             0.035655267             0.040945605             0.008365848

Absolute frequencies:

options(width = 100)
itemFrequency(data, type="absolute")
##                     Ada                    Apex                     APL                Assembly 
##                     677                     579                     225                    4753 
## Bash/Shell (all shells)                       C                      C#                     C++ 
##                   28351                   16940                   24193                   19634 
##                 Clojure                   Cobol                 Crystal                    Dart 
##                    1105                     576                     389                    5273 
##                  Delphi                  Elixir                  Erlang                      F# 
##                    2831                    2028                     868                     849 
##                    Flow                 Fortran                GDScript                      Go 
##                     214                     833                    1495                   11592 
##                  Groovy                 Haskell                HTML/CSS                    Java 
##                    2976                    1829                   46396                   26757 
##              JavaScript                   Julia                  Kotlin                    Lisp 
##                   55711                    1010                    7935                    1342 
##                     Lua                  MATLAB                     Nim             Objective-C 
##                    5336                    3339                     331                    2019 
##                   OCaml                    Perl                     PHP              PowerShell 
##                     614                    2151                   16274                   11902 
##                  Prolog                  Python                       R                    Raku 
##                     776                   43158                    3702                     156 
##                    Ruby                    Rust                     SAS                   Scala 
##                    5454                   11427                     427                    2422 
##                Solidity                     SQL                   Swift              TypeScript 
##                    1168                   42623                    4072                   34041 
##                     VBA     Visual Basic (.Net)                     Zig 
##                    3107                    3568                     729

First ten observations:

options(width = 100)
inspect(data[1:10])
##      items                     
## [1]  {HTML/CSS,                
##       JavaScript,              
##       Python}                  
## [2]  {Bash/Shell (all shells), 
##       Go}                      
## [3]  {Bash/Shell (all shells), 
##       HTML/CSS,                
##       JavaScript,              
##       PHP,                     
##       Ruby,                    
##       SQL,                     
##       TypeScript}              
## [4]  {HTML/CSS,                
##       JavaScript,              
##       TypeScript}              
## [5]  {Bash/Shell (all shells), 
##       HTML/CSS,                
##       JavaScript,              
##       Ruby,                    
##       SQL,                     
##       TypeScript}              
## [6]  {Ada,                     
##       Clojure,                 
##       Elixir,                  
##       Go,                      
##       HTML/CSS,                
##       Java,                    
##       JavaScript,              
##       Lisp,                    
##       OCaml,                   
##       Raku,                    
##       Ruby,                    
##       Scala,                   
##       Swift,                   
##       TypeScript,              
##       Zig}                     
## [7]  {Go,                      
##       HTML/CSS,                
##       JavaScript,              
##       Python,                  
##       Rust,                    
##       SQL,                     
##       TypeScript}              
## [8]  {C#,                      
##       JavaScript,              
##       PowerShell,              
##       Ruby,                    
##       TypeScript}              
## [9]  {HTML/CSS,                
##       Java,                    
##       JavaScript,              
##       Python,                  
##       SQL,                     
##       TypeScript}              
## [10] {C#,                      
##       C++,                     
##       HTML/CSS,                
##       JavaScript,              
##       Python}

Item frequency plot for support above 0.1:

itemFrequencyPlot(data, support = 0.1)

Item frequency plot for top 15 programming languages:

itemFrequencyPlot(data, topN = 15)

Here we can better see the top 15 programming languages among Stack Overflow users. JavaScript is noticeably more popular than the other languages; HTML/CSS, Python and SQL follow, each with a similar number of users.

Graphical representation of 100 random samples from the dataset

set.seed(42)
image(sample(data, 100))

options(width = 100)
ctab<-crossTable(data, measure="count", sort=TRUE) 
ctab
##                         JavaScript HTML/CSS Python   SQL TypeScript Bash/Shell (all shells)  Java
## JavaScript                   55711    41117  27339 31870      29947                   19381 18830
## HTML/CSS                     41117    46396  23174 27992      24428                   17004 15386
## Python                       27339    23174  43158 22557      15996                   18545 14967
## SQL                          31870    27992  22557 42623      18919                   16344 15538
## TypeScript                   29947    24428  15996 18919      34041                   12151 11385
## Bash/Shell (all shells)      19381    17004  18545 16344      12151                   28351 10312
## Java                         18830    15386  14967 15538      11385                   10312 26757
## C#                           16957    14834  10614 15048      11186                    6876  7553
## C++                          11880    10302  13363  9013       6459                    8169  8160
## C                            10573     9366  11867  8306       5614                    8139  7639
## PHP                          14299    12870   7893 11779       7122                    5955  5845
## PowerShell                    9162     8409   6836  8601       5985                    6157  4252
## Go                            8042     6185   7401  6375       6000                    5909  4402
## Rust                          7323     5963   7549  5215       5947                    5424  3949
## Kotlin                        5243     4242   4283  4173       3926                    3074  5553
## Ruby                          4227     3430   2749  3252       2529                    2562  1673
## Lua                           3954     3473   3768  2828       2722                    3063  2115
## Dart                          3894     3305   2971  2852       2734                    1726  2460
## Assembly                      3169     2865   3471  2570       1696                    3130  2444
##                            C#   C++     C   PHP PowerShell    Go  Rust Kotlin Ruby  Lua Dart
## JavaScript              16957 11880 10573 14299       9162  8042  7323   5243 4227 3954 3894
## HTML/CSS                14834 10302  9366 12870       8409  6185  5963   4242 3430 3473 3305
## Python                  10614 13363 11867  7893       6836  7401  7549   4283 2749 3768 2971
## SQL                     15048  9013  8306 11779       8601  6375  5215   4173 3252 2828 2852
## TypeScript              11186  6459  5614  7122       5985  6000  5947   3926 2529 2722 2734
## Bash/Shell (all shells)  6876  8169  8139  5955       6157  5909  5424   3074 2562 3063 1726
## Java                     7553  8160  7639  5845       4252  4402  3949   5553 1673 2115 2460
## C#                      24193  7009  5425  4670       7036  2669  2903   2291 1105 1776 1780
## C++                      7009 19634 11340  4190       3467  3173  4269   2232 1162 2245 1633
## C                        5425 11340 16940  3957       2910  3153  4253   1975 1181 2289 1497
## PHP                      4670  4190  3957 16274       2941  2432  1696   1693 1242 1281 1455
## PowerShell               7036  3467  2910  2941      11902  1710  1618   1200  732 1138  781
## Go                       2669  3173  3153  2432       1710 11592  3339   1691 1361 1560 1205
## Rust                     2903  4269  4253  1696       1618  3339 11427   1631  971 2011 1014
## Kotlin                   2291  2232  1975  1693       1200  1691  1631   7935  684  773 1428
## Ruby                     1105  1162  1181  1242        732  1361   971    684 5454  587  414
## Lua                      1776  2245  2289  1281       1138  1560  2011    773  587 5336  503
## Dart                     1780  1633  1497  1455        781  1205  1014   1428  414  503 5273
## Assembly                 1752  3322  3828  1308       1102   947  1453    669  422  862  449
##                         Assembly Swift    R Visual Basic (.Net) MATLAB  VBA Groovy Delphi Scala
## JavaScript                  3169  2679 2127                2741   2063 2123   2074   1288  1302
## HTML/CSS                    2865  2199 1989                2575   1927 2016   1657   1126   986
## Python                      3471  1999 3102                1712   2778 1772   1798    838  1504
## SQL                         2570  1826 2437                2793   1825 2385   1833   1739  1359
## TypeScript                  1696  1809 1063                1396    983 1001   1362    506  1021
## Bash/Shell (all shells)     3130  1475 1662                1132   1486 1143   1831    595  1172
## Java                        2444  1677 1444                1560   1626 1208   2248    706  1491
## C#                          1752  1163  948                2682   1086 1609    703   1031   486
## C++                         3322  1209 1295                1352   2005 1021    742    912   538
## C                           3828  1110 1236                1162   1926  936    664    800   538
## PHP                         1308   963  795                1348    908 1096    548    832   368
## PowerShell                  1102   555  795                1347    802 1165    771    490   336
## Go                           947   769  530                 377    432  297    759    225   564
## Rust                        1453   704  606                 284    521  266    449    146   550
## Kotlin                       669  1316  370                 379    434  285    871    168   522
## Ruby                         422   561  319                 273    236  232    319    170   285
## Lua                          862   343  343                 314    368  262    307    175   230
## Dart                         449   744  274                 293    326  194    254    148   170
## Assembly                    4753   390  469                 516    740  459    237    435   235
##                         Perl Elixir Objective-C Haskell GDScript Lisp Solidity Clojure Julia Erlang
## JavaScript              1483   1381        1431    1242     1131  856     1030     703   503    583
## HTML/CSS                1288   1087        1159    1075     1048  779      843     512   456    451
## Python                  1464    914        1066    1351     1086  946      758     557   825    476
## SQL                     1479   1055        1041     920      693  724      689     511   425    471
## TypeScript               645    962         911     905      784  473      823     438   310    358
## Bash/Shell (all shells) 1423    842         836    1024      682  855      510     531   488    486
## Java                     968    482        1038     899      568  589      466     494   330    312
## C#                       569    341         772     530      678  352      324     223   231    189
## C++                      904    379         966     903      640  681      368     260   490    293
## C                       1027    437         915    1007      524  789      342     281   458    352
## PHP                      934    331         648     386      354  321      333     184   169    196
## PowerShell               550    185         415     343      279  264      220     136   194    151
## Go                       513    557         383     510      373  367      343     289   253    270
## Rust                     366    594         317     871      586  472      325     271   381    300
## Kotlin                   253    254         619     334      277  198      188     181   161    152
## Ruby                     386    559         350     225      170  252      140     177   123    221
## Lua                      332    254         231     406      382  342      153     159   203    169
## Dart                     146    197         309     216      230  126      188     131   131    114
## Assembly                 399    164         339     487      235  399      166     131   174    144
##                          F# Fortran Prolog Zig Ada OCaml Apex Cobol SAS Crystal Nim APL Flow Raku
## JavaScript              557     444    588 468 286   371  442   382 312     290 226 158  186  100
## HTML/CSS                468     429    544 396 256   313  380   368 299     257 209 136  159   87
## Python                  416     599    606 483 338   422  255   303 274     197 262 133  121  102
## SQL                     529     453    517 297 270   287  394   428 332     256 161 123  139   91
## TypeScript              492     184    353 387 166   284  212   202 171     182 172  99  146   68
## Bash/Shell (all shells) 326     433    436 412 254   336  217   246 170     193 200 145  112   99
## Java                    268     360    535 261 275   282  251   336 168     149 145 124   99   71
## C#                      690     272    308 244 188   172  201   269 150     192 129 111   94   62
## C++                     280     533    469 395 295   300  129   286 144     156 169 140   76   79
## C                       269     540    497 479 329   385  133   307 143     167 183 150   71   89
## PHP                     170     266    279 136 143   148  167   248 174     136  94  84   93   60
## PowerShell              354     210    235 159 129    99  144   200 143     120 101  72   72   61
## Go                      201     165    198 349 117   162   90   123  85     138 151  82   72   67
## Rust                    265     153    235 489 117   283   70    95  70     136 178  95   73   71
## Kotlin                  151     122    175 149  97   114   76   105  68      95 101  68   74   50
## Ruby                    106     117    120 117  91   101   79   105  77     171  87  68   72   68
## Lua                     143     145    140 255  76   144   57    97  60      93 118  78   51   61
## Dart                    109      89    130 129  77    92   66    79  63      90  92  57   54   42
## Assembly                142     311    269 222 191   191   90   203  83      90  99 120   57   59
##  [ reached getOption("max.print") -- omitted 32 rows ]
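The raw counts in the cross table are easier to compare once each row is divided by its diagonal entry, which gives the share of users of the row language who have also worked with the column language. The sketch below does this in base R for a 3 x 3 excerpt (JavaScript, HTML/CSS, Python) copied from the output above; the object names are illustrative.

```r
# 3 x 3 excerpt of the co-occurrence counts printed by crossTable()
langs  <- c("JavaScript", "HTML/CSS", "Python")
counts <- matrix(c(55711, 41117, 27339,
                   41117, 46396, 23174,
                   27339, 23174, 43158),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(langs, langs))

# The diagonal holds the number of users of each language.
# Dividing a matrix by a vector recycles down the columns,
# so every row i is divided by diag(counts)[i].
share <- counts / diag(counts)

print(round(share, 3))
```

Read row by row: the entry in row HTML/CSS and column JavaScript is the share of HTML/CSS users who have also worked with JavaScript. The matrix is no longer symmetric, because the two directions are divided by different totals.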

Association Rules

Association rules describe the relationship between the occurrences of two or more items, which makes it possible to discover co-occurrence patterns in the data. There are three main measures we will use to evaluate the obtained rules:

Support

Support shows how frequently an itemset or a rule occurs in the data. In other words, support is the relative frequency of the itemset or, for a rule, of the transactions that contain every item on both sides of the rule.

Confidence

The proportion of transactions containing a given item (or itemset) that also contain another item (or itemset). A higher confidence value indicates a stronger rule.

Lift

The ratio of the probability of having item X in the basket when item Y is known to be present to the probability of having item X in the basket without any knowledge about item Y. Values greater than 1 indicate a positive relationship between X and Y. A lift of around 1 implies that the items are independent. Values below 1 imply a negative association between X and Y.
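As a worked check of the three measures, the sketch below recomputes them by hand for the rule {Kotlin} => {Java}, using only counts that appear in this report (the item frequencies and the Kotlin/Java cell of the cross table). The variable names are illustrative.

```r
# Counts taken from the itemFrequency() and crossTable() outputs
n            <- 87140   # all transactions
count_kotlin <- 7935    # transactions containing Kotlin
count_java   <- 26757   # transactions containing Java
count_both   <- 5553    # transactions containing both Kotlin and Java

# Support: share of all transactions that contain both sides of the rule
support <- count_both / n

# Confidence: share of Kotlin transactions that also contain Java
confidence <- count_both / count_kotlin

# Lift: confidence relative to the overall share of Java transactions
lift <- confidence / (count_java / n)

print(round(c(support = support, confidence = confidence, lift = lift), 4))
```

A lift above 2 here means that Java is more than twice as common among Kotlin users as among all respondents.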

Apriori Algorithm

The Apriori algorithm uses prior knowledge of frequent itemsets. It reduces the number of rules obtained for the analysis by letting a minimum support level be defined at the beginning. The algorithm relies on the fact that all nonempty subsets of a frequent itemset must also be frequent; conversely, any superset of an infrequent itemset must be infrequent, so such candidates can be discarded without being counted. This keeps the output at a manageable size.
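The pruning idea can be illustrated with a small base R sketch. It is not the implementation used by arules, only the first candidate-generation step under stated assumptions: five item counts copied from this report, a 5% minimum support, and illustrative variable names.

```r
n           <- 87140   # number of transactions
min_support <- 0.05    # minimum support threshold (assumed, 5%)

# Absolute frequencies of a few languages, copied from itemFrequency()
item_counts <- c(Kotlin = 7935, Ruby = 5454, Assembly = 4753,
                 Swift = 4072, R = 3702)

# Step 1: keep only single items that reach the minimum support
frequent_items <- names(item_counts)[item_counts / n >= min_support]

# Step 2: build candidate pairs from frequent items only.
# Any pair containing Swift or R cannot be frequent, so it is never generated.
candidate_pairs <- combn(frequent_items, 2, simplify = FALSE)

# For comparison: pairs that would have to be counted without pruning
all_pairs <- combn(names(item_counts), 2, simplify = FALSE)

print(frequent_items)
print(c(with_pruning = length(candidate_pairs),
        without_pruning = length(all_pairs)))
```

The same step is repeated for larger itemsets: candidates of size k are built only from frequent itemsets of size k - 1, which is why the search space stays small.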

We want to find rules that meet minimum thresholds of support, confidence and length. We will search for rules with at least 5% support and 25% confidence. The minimum length will be set to 2 to avoid obtaining rules with an empty set on one side.

apriori_rules <- apriori(data, parameter = list(support = 0.05, confidence = 0.25, minlen = 2))
## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen maxlen target  ext
##        0.25    0.1    1 none FALSE            TRUE       5    0.05      2     10  rules TRUE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 4357 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[51 item(s), 87140 transaction(s)] done [0.02s].
## sorting and recoding items ... [19 item(s)] done [0.00s].
## creating transaction tree ... done [0.02s].
## checking subsets of size 1 2 3 4 5 done [0.01s].
## writing ... [860 rule(s)] done [0.00s].
## creating S4 object  ... done [0.00s].

We can see that these parameters produce 860 rules.

Inspecting first 10 rules

inspect(apriori_rules[1:10])
##      lhs         rhs                       support    confidence coverage   lift      count
## [1]  {Kotlin} => {Java}                    0.06372504 0.6998110  0.09106036 2.2790869 5553 
## [2]  {Kotlin} => {JavaScript}              0.06016755 0.6607435  0.09106036 1.0334977 5243 
## [3]  {Rust}   => {Bash/Shell (all shells)} 0.06224466 0.4746653  0.13113381 1.4589373 5424 
## [4]  {Rust}   => {TypeScript}              0.06824650 0.5204341  0.13113381 1.3322354 5947 
## [5]  {Rust}   => {Python}                  0.08663071 0.6606283  0.13113381 1.3338698 7549 
## [6]  {Rust}   => {SQL}                     0.05984622 0.4563753  0.13113381 0.9330300 5215 
## [7]  {Rust}   => {HTML/CSS}                0.06843011 0.5218343  0.13113381 0.9800982 5963 
## [8]  {Rust}   => {JavaScript}              0.08403718 0.6408506  0.13113381 1.0023823 7323 
## [9]  {Go}     => {Java}                    0.05051641 0.3797447  0.13302731 1.2367212 4402 
## [10] {Go}     => {Bash/Shell (all shells)} 0.06781042 0.5097481  0.13302731 1.5667684 5909

Inspecting first 10 rules with the highest lift

inspect(sort(apriori_rules, by = "lift")[1:10])
##      lhs                                       rhs support    confidence coverage   lift     count
## [1]  {Bash/Shell (all shells), C++, Python} => {C} 0.05137709 0.6924981  0.07419096 3.562236 4477 
## [2]  {Bash/Shell (all shells), C++}         => {C} 0.06365619 0.6790305  0.09374570 3.492958 5547 
## [3]  {C++, Java}                            => {C} 0.06235942 0.6659314  0.09364241 3.425576 5434 
## [4]  {C++, HTML/CSS, JavaScript, Python}    => {C} 0.05027542 0.6618825  0.07595823 3.404748 4381 
## [5]  {C++, HTML/CSS, Python}                => {C} 0.05612807 0.6557179  0.08559789 3.373038 4891 
## [6]  {C++, JavaScript, Python}              => {C} 0.06296764 0.6478158  0.09719991 3.332389 5487 
## [7]  {C++, JavaScript, SQL}                 => {C} 0.05220335 0.6470839  0.08067478 3.328624 4549 
## [8]  {C++, SQL}                             => {C} 0.06484967 0.6269832  0.10343126 3.225226 5651 
## [9]  {C++, HTML/CSS, JavaScript}            => {C} 0.06489557 0.6237591  0.10403948 3.208640 5655 
## [10] {C++, HTML/CSS}                        => {C} 0.07298600 0.6173559  0.11822355 3.175702 6360

Inspecting first 10 rules with the highest confidence

inspect(sort(apriori_rules, by = "confidence")[1:10])
##      lhs                                                     rhs          support    confidence
## [1]  {HTML/CSS, PHP, SQL, TypeScript}                     => {JavaScript} 0.05277714 0.9780944 
## [2]  {HTML/CSS, PHP, TypeScript}                          => {JavaScript} 0.06744319 0.9736581 
## [3]  {PHP, SQL, TypeScript}                               => {JavaScript} 0.05958228 0.9652352 
## [4]  {HTML/CSS, Java, Python, TypeScript}                 => {JavaScript} 0.05632316 0.9600939 
## [5]  {PHP, TypeScript}                                    => {JavaScript} 0.07835667 0.9587195 
## [6]  {HTML/CSS, Python, SQL, TypeScript}                  => {JavaScript} 0.08742254 0.9573960 
## [7]  {HTML/CSS, PHP, Python, SQL}                         => {JavaScript} 0.05678219 0.9555813 
## [8]  {Bash/Shell (all shells), HTML/CSS, SQL, TypeScript} => {JavaScript} 0.07095479 0.9551985 
## [9]  {HTML/CSS, PowerShell, TypeScript}                   => {JavaScript} 0.05407390 0.9548126 
## [10] {HTML/CSS, Java, SQL, TypeScript}                    => {JavaScript} 0.06433326 0.9538880 
##      coverage   lift     count
## [1]  0.05395915 1.529880 4599 
## [2]  0.06926784 1.522941 5877 
## [3]  0.06172825 1.509766 5192 
## [4]  0.05866422 1.501725 4908 
## [5]  0.08173055 1.499575 6828 
## [6]  0.09131283 1.497505 7618 
## [7]  0.05942162 1.494666 4948 
## [8]  0.07428276 1.494068 6183 
## [9]  0.05663300 1.493464 4712 
## [10] 0.06744319 1.492018 5606

Here we can see that users who have worked with web development languages (HTML/CSS, PHP, TypeScript) are very likely to have also worked with JavaScript.
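The columns of the rule table are linked: coverage is the support of the left-hand side, so confidence equals support divided by coverage. The sketch below checks this for rule [5], {PHP, TypeScript} => {JavaScript}, using the rule count from the table and the PHP/TypeScript cell of the cross table; the variable names are illustrative.

```r
n          <- 87140  # all transactions
count_lhs  <- 7122   # transactions with PHP and TypeScript (cross table)
count_rule <- 6828   # transactions with PHP, TypeScript and JavaScript (count column)

# Coverage: how often the left-hand side occurs at all
coverage <- count_lhs / n

# Support: how often the whole rule occurs
support <- count_rule / n

# Confidence: share of left-hand-side transactions that also contain JavaScript
confidence <- support / coverage

print(round(c(coverage = coverage, support = support, confidence = confidence), 4))
```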

Scatterplot of rules’ confidence vs support

plot(apriori_rules)

We can see that most of the rules have low support, while confidence varies widely. Lift values remain, in most cases, above 1.

Graph of the first 100 rules defined by support and lift values.

plot(apriori_rules, method='graph', measure = "support", shading = "lift", engine='html')

We can see clearly that Kotlin, Go and Rust appear in only a few rules, which suggests that they are not among the most popular programming languages.

Matrix of rules

plot(apriori_rules, method="matrix", measure="lift")

Scatterplot of lift vs support

plot(apriori_rules, measure=c("support","lift"), shading="confidence")
## To reduce overplotting, jitter is added! Use jitter = 0 to prevent jitter.

plot(apriori_rules, shading="order")

The Jaccard index is used to measure how similar two programming languages are in terms of the respondents who use them: it is the share of respondents who use both languages among the respondents who use at least one of them. In our case we will use the dissimilarity, which equals the Jaccard distance (1 - Jaccard coefficient). For example, a value of 0.9 means that, among the respondents who use at least one of the two languages, 90% use only one of them.

options(width = 100)
data_sel<-data[,itemFrequency(data)>0.05]
jaccard_idx<-dissimilarity(data_sel, which="items")
round(jaccard_idx,2)
##                         Assembly Bash/Shell (all shells)    C   C#  C++ Dart   Go HTML/CSS Java
## Bash/Shell (all shells)     0.90                                                               
## C                           0.79                    0.78                                       
## C#                          0.94                    0.85 0.85                                  
## C++                         0.84                    0.79 0.55 0.81                             
## Dart                        0.95                    0.95 0.93 0.94 0.93                        
## Go                          0.94                    0.83 0.88 0.92 0.89 0.92                   
## HTML/CSS                    0.94                    0.71 0.83 0.73 0.82 0.93 0.88              
## Java                        0.92                    0.77 0.79 0.83 0.79 0.92 0.87     0.73     
## JavaScript                  0.94                    0.70 0.83 0.73 0.81 0.93 0.86     0.33 0.70
## Kotlin                      0.94                    0.91 0.91 0.92 0.91 0.88 0.91     0.92 0.81
## Lua                         0.91                    0.90 0.89 0.94 0.90 0.95 0.90     0.93 0.93
## PHP                         0.93                    0.85 0.86 0.87 0.87 0.93 0.90     0.74 0.84
## PowerShell                  0.93                    0.82 0.89 0.76 0.88 0.95 0.92     0.83 0.88
## Python                      0.92                    0.65 0.75 0.81 0.73 0.93 0.84     0.65 0.73
## Ruby                        0.96                    0.92 0.94 0.96 0.95 0.96 0.91     0.93 0.95
## Rust                        0.90                    0.84 0.82 0.91 0.84 0.94 0.83     0.89 0.88
## SQL                         0.94                    0.70 0.84 0.71 0.83 0.94 0.87     0.54 0.71
## TypeScript                  0.95                    0.76 0.88 0.76 0.86 0.93 0.85     0.56 0.77
##                         JavaScript Kotlin  Lua  PHP PowerShell Python Ruby Rust  SQL
## Bash/Shell (all shells)                                                             
## C                                                                                   
## C#                                                                                  
## C++                                                                                 
## Dart                                                                                
## Go                                                                                  
## HTML/CSS                                                                            
## Java                                                                                
## JavaScript                                                                          
## Kotlin                        0.91                                                  
## Lua                           0.93   0.94                                           
## PHP                           0.75   0.92 0.94                                      
## PowerShell                    0.84   0.94 0.93 0.88                                 
## Python                        0.62   0.91 0.92 0.85       0.86                      
## Ruby                          0.93   0.95 0.94 0.94       0.96   0.94               
## Rust                          0.88   0.91 0.86 0.93       0.93   0.84 0.94          
## SQL                           0.52   0.91 0.94 0.75       0.81   0.64 0.93 0.89     
## TypeScript                    0.50   0.90 0.93 0.84       0.85   0.74 0.93 0.85 0.67

We can see that most pairs of programming languages have a high Jaccard distance, i.e. they rarely occur together, with some expected exceptions such as JavaScript and HTML/CSS (0.33), which are both necessary for Front-End web development.
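As a minimal sketch of how the Jaccard distance reported above can be computed, the toy example below uses a hypothetical 0/1 matrix of five respondents and three languages. The data are invented for illustration and are not taken from the survey, and `jaccard_distance` is a helper name introduced here. Only base R is used.

```r
# Hypothetical toy data: rows are respondents, columns are languages (1 = used).
m <- cbind(
  "JavaScript" = c(1, 1, 1, 0, 1),
  "HTML/CSS"   = c(1, 1, 0, 0, 1),
  "Rust"       = c(0, 0, 1, 1, 0)
)

# Jaccard distance = 1 - (respondents using both) / (respondents using either).
jaccard_distance <- function(a, b) {
  both   <- sum(a == 1 & b == 1)
  either <- sum(a == 1 | b == 1)
  1 - both / either
}

d_js_html <- jaccard_distance(m[, "JavaScript"], m[, "HTML/CSS"])
d_js_rust <- jaccard_distance(m[, "JavaScript"], m[, "Rust"])

# Base R's dist() with method = "binary" computes the same quantity
# for every pair of columns at once.
d_all <- as.matrix(dist(t(m), method = "binary"))
print(round(d_all, 2))
```

In this toy matrix JavaScript and HTML/CSS are used together by three of the four respondents who use either of them, giving a distance of 0.25, while HTML/CSS and Rust never occur together, giving a distance of 1.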

Relationships’ Left Hand Side (LHS) groups and Right Hand Side (RHS)

plot(apriori_rules, method="grouped") 

Association rules based on specific languages

Next, we examine association rules in which one side of the rule is fixed to a chosen language before the algorithm extracts the relationships.

Python

Which languages do developers use when they already use Python?

In this case we fix the LHS of the rules to the Python programming language.

rules_python <- apriori(data=data, parameter=list(supp=0.01, conf=0.005, minlen=2), appearance=list(default="rhs", lhs="Python"), control=list(verbose=FALSE))

First 6 rules based on confidence

inspect(head(sort(rules_python, by="confidence", decreasing=TRUE), 6))
##     lhs         rhs                       support   confidence coverage lift      count
## [1] {Python} => {JavaScript}              0.3137365 0.6334631  0.495272 0.9908272 27339
## [2] {Python} => {HTML/CSS}                0.2659399 0.5369572  0.495272 1.0085019 23174
## [3] {Python} => {SQL}                     0.2588593 0.5226609  0.495272 1.0685469 22557
## [4] {Python} => {Bash/Shell (all shells)} 0.2128185 0.4297002  0.495272 1.3207320 18545
## [5] {Python} => {TypeScript}              0.1835667 0.3706381  0.495272 0.9487796 15996
## [6] {Python} => {Java}                    0.1717581 0.3467955  0.495272 1.1294151 14967

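We can see that developers using Python are also likely to use languages such as JavaScript, HTML/CSS and SQL. The first two may suggest a connection between Front-End (JS, HTML/CSS) and Back-End (JS, Python) development, and SQL may suggest more data-related Python programming. However, the lift of these three rules is close to 1, so their high confidence largely reflects the overall popularity of those languages. Among the rules shown, Bash/Shell has the highest lift (1.32).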

Graph visualising the obtained relationships

plot(rules_python, method='graph', measure = "support", shading = "lift")

plot(rules_python, method="paracoord", control=list(reorder=TRUE))

Significance of the rules tested with Fisher’s exact test

options(width = 100)
is.significant(rules_python, data)
##  [1]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE
## [17]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE
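
To make the test more concrete, the sketch below rebuilds the 2x2 contingency table for the rule {Python} => {JavaScript} from values printed in this report and runs base R's `fisher.test` on it. It assumes that the number of JavaScript users is the rounded product of the coverage 0.6393275 (reported for the JavaScript rules) and the 87140 transactions, and that a one-sided test for positive association is appropriate. `is.significant` may use a different significance level or a multiple-testing adjustment, so this is an illustration rather than a reproduction of the vector above.

```r
n_total  <- 87140                        # number of transactions
n_python <- round(0.495272 * n_total)    # coverage of the Python rules
n_js     <- round(0.6393275 * n_total)   # assumed: coverage of the JavaScript rules
n_both   <- 27339                        # count of {Python} => {JavaScript}

# 2x2 contingency table: Python (rows) against JavaScript (columns).
tab <- matrix(
  c(n_both,        n_python - n_both,
    n_js - n_both, n_total - n_python - n_js + n_both),
  nrow = 2, byrow = TRUE,
  dimnames = list(Python = c("yes", "no"), JavaScript = c("yes", "no"))
)

# One-sided Fisher's exact test: is the co-occurrence higher than
# expected under independence?
res <- fisher.test(tab, alternative = "greater", conf.int = FALSE)
print(tab)
print(res$p.value)
```

Because this rule has a lift slightly below 1, Python and JavaScript occur together a little less often than independence would predict, so the one-sided p-value is large and the rule does not count as a significant positive association.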

Checking whether the itemsets behind the rules are maximal. An itemset is maximal if no other itemset in the collection is a proper superset of it, i.e. it is not contained in any larger itemset.

options(width = 100)
is.maximal(rules_python)
##                {GDScript,Python}                    {Lisp,Python}                  {Elixir,Python} 
##                             TRUE                             TRUE                             TRUE 
##                   {Python,Scala}                 {Haskell,Python}             {Objective-C,Python} 
##                             TRUE                             TRUE                             TRUE 
##                    {Perl,Python}                  {Groovy,Python}                     {Python,VBA} 
##                             TRUE                             TRUE                             TRUE 
##                       {Python,R}                  {MATLAB,Python}                   {Python,Swift} 
##                             TRUE                             TRUE                             TRUE 
##     {Python,Visual Basic (.Net)}                    {Python,Ruby}                    {Dart,Python} 
##                             TRUE                             TRUE                             TRUE 
##                {Assembly,Python}                     {Lua,Python}                  {Kotlin,Python} 
##                             TRUE                             TRUE                             TRUE 
##                    {Python,Rust}                      {Go,Python}              {PowerShell,Python} 
##                             TRUE                             TRUE                             TRUE 
##                     {PHP,Python}                       {C,Python}                     {C++,Python} 
##                             TRUE                             TRUE                             TRUE 
##                      {C#,Python}                    {Java,Python} {Bash/Shell (all shells),Python} 
##                             TRUE                             TRUE                             TRUE 
##              {Python,TypeScript}                     {Python,SQL}                {HTML/CSS,Python} 
##                             TRUE                             TRUE                             TRUE 
##              {JavaScript,Python} 
##                             TRUE
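
The definition of a maximal itemset can be sketched in a few lines of base R. The three itemsets below are a hypothetical toy collection and `is_maximal_toy` is a helper written for this illustration. It is not the `arules` implementation.

```r
# Hypothetical toy collection of itemsets.
sets <- list(
  c("C", "R"),
  c("C#", "R"),
  c("C", "C++", "R")
)

# An itemset is maximal if no other itemset in the collection
# is a proper superset of it.
is_maximal_toy <- function(sets) {
  vapply(seq_along(sets), function(i) {
    has_superset <- vapply(seq_along(sets), function(j) {
      j != i &&
        length(sets[[j]]) > length(sets[[i]]) &&
        all(sets[[i]] %in% sets[[j]])
    }, logical(1))
    !any(has_superset)
  }, logical(1))
}

maximal <- is_maximal_toy(sets)
print(maximal)
```

Here {C, R} is not maximal because it is contained in {C, C++, R}, while the other two itemsets are. All itemsets behind the Python rules are pairs that contain Python, so none of them can be contained in another, which is why every value in the output above is TRUE.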

Checking whether a rule is redundant, meaning that there is a more general rule with the same or higher confidence value

options(width = 100)
is.redundant(rules_python)
##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [17] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

JavaScript

Which languages do developers use when they already use JavaScript?

Here we again fix the LHS of the rules, this time to the JavaScript programming language.

rules_js <- apriori(data=data, parameter=list(supp=0.01, conf=0.005, minlen=2), appearance=list(default="rhs", lhs="JavaScript"), control=list(verbose=FALSE))
inspect(head(sort(rules_js, by="confidence", decreasing=TRUE),10))
##      lhs             rhs                       support   confidence coverage  lift      count
## [1]  {JavaScript} => {HTML/CSS}                0.4718499 0.7380410  0.6393275 1.3861731 41117
## [2]  {JavaScript} => {SQL}                     0.3657333 0.5720594  0.6393275 1.1695388 31870
## [3]  {JavaScript} => {TypeScript}              0.3436654 0.5375420  0.6393275 1.3760291 29947
## [4]  {JavaScript} => {Python}                  0.3137365 0.4907289  0.6393275 0.9908272 27339
## [5]  {JavaScript} => {Bash/Shell (all shells)} 0.2224122 0.3478846  0.6393275 1.0692627 19381
## [6]  {JavaScript} => {Java}                    0.2160891 0.3379943  0.6393275 1.1007521 18830
## [7]  {JavaScript} => {C#}                      0.1945949 0.3043744  0.6393275 1.0963164 16957
## [8]  {JavaScript} => {PHP}                     0.1640923 0.2566639  0.6393275 1.3743203 14299
## [9]  {JavaScript} => {C++}                     0.1363323 0.2132433  0.6393275 0.9464208 11880
## [10] {JavaScript} => {C}                       0.1213335 0.1897830  0.6393275 0.9762509 10573

As expected, the association of JavaScript with HTML/CSS takes the first spot, with a high confidence value (0.74). SQL in second place suggests that many of these developers also work with databases. TypeScript in third place is also expected, as it was designed as a typed superset of JavaScript.
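
The measures in the Python and JavaScript tables above can be recomputed by hand. The sketch below does this for the pair Python and JavaScript, using the count and coverage values printed in those tables and the 87140 transactions in the data. The variable names are introduced here for illustration. It also shows that confidence depends on the direction of the rule, while lift does not.

```r
n_total     <- 87140       # number of transactions
count_both  <- 27339       # respondents using both Python and JavaScript
supp_python <- 0.495272    # coverage of the {Python} => ... rules
supp_js     <- 0.6393275   # coverage of the {JavaScript} => ... rules

# Support of the itemset {Python, JavaScript}.
supp_both <- count_both / n_total

# Confidence is the share of LHS users who also use the RHS,
# so it differs between the two directions.
conf_python_js <- supp_both / supp_python
conf_js_python <- supp_both / supp_js

# Lift compares the joint support with what independence would give;
# it is the same in both directions.
lift_python_js <- supp_both / (supp_python * supp_js)

print(round(c(support = supp_both,
              conf_python_js = conf_python_js,
              conf_js_python = conf_js_python,
              lift = lift_python_js), 4))
```

The results should agree with the printed tables up to rounding. A lift just below 1 means that the two most popular languages in this comparison occur together about as often as independence would predict, despite the high confidence values.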

Graph representation of obtained JavaScript relationships

set.seed(42)
plot(rules_js, method='graph', measure = "support", shading = "lift", max=20)
## Warning: Too many rules supplied. Only plotting the best 20 using 'lift' (change control parameter
## max if needed).

plot(rules_js, method="paracoord", control=list(reorder=TRUE))

Significance

options(width = 100)
is.significant(rules_js, data)
##  [1]  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
## [17]  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE

Maximal sets

options(width = 100)
is.maximal(rules_js)
##                {JavaScript,Solidity}                {GDScript,JavaScript} 
##                                 TRUE                                 TRUE 
##                  {Elixir,JavaScript}                  {Delphi,JavaScript} 
##                                 TRUE                                 TRUE 
##                   {JavaScript,Scala}                 {Haskell,JavaScript} 
##                                 TRUE                                 TRUE 
##             {JavaScript,Objective-C}                    {JavaScript,Perl} 
##                                 TRUE                                 TRUE 
##                  {Groovy,JavaScript}                     {JavaScript,VBA} 
##                                 TRUE                                 TRUE 
##                       {JavaScript,R}                  {JavaScript,MATLAB} 
##                                 TRUE                                 TRUE 
##                   {JavaScript,Swift}     {JavaScript,Visual Basic (.Net)} 
##                                 TRUE                                 TRUE 
##                    {JavaScript,Ruby}                    {Dart,JavaScript} 
##                                 TRUE                                 TRUE 
##                {Assembly,JavaScript}                     {JavaScript,Lua} 
##                                 TRUE                                 TRUE 
##                  {JavaScript,Kotlin}                    {JavaScript,Rust} 
##                                 TRUE                                 TRUE 
##                      {Go,JavaScript}              {JavaScript,PowerShell} 
##                                 TRUE                                 TRUE 
##                     {JavaScript,PHP}                       {C,JavaScript} 
##                                 TRUE                                 TRUE 
##                     {C++,JavaScript}                      {C#,JavaScript} 
##                                 TRUE                                 TRUE 
##                    {Java,JavaScript} {Bash/Shell (all shells),JavaScript} 
##                                 TRUE                                 TRUE 
##              {JavaScript,TypeScript}                  {JavaScript,Python} 
##                                 TRUE                                 TRUE 
##                     {JavaScript,SQL}                {HTML/CSS,JavaScript} 
##                                 TRUE                                 TRUE

Redundant sets

options(width = 100)
is.redundant(rules_js)
##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [17] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

R

Which languages do developers use when they also use R?

Here we fix the RHS of the rules to the R programming language.

rules_r <- apriori(data=data, parameter=list(supp=0.01, conf=0.005, minlen=2), appearance=list(default="lhs", rhs="R"), control=list(verbose=FALSE))
inspect(head(sort(rules_r, by="confidence", decreasing=TRUE),10))
##      lhs                                            rhs support    confidence coverage  lift    
## [1]  {Java, Python, SQL}                         => {R} 0.01128070 0.10506627 0.1073675 2.473116
## [2]  {C++, SQL}                                  => {R} 0.01023640 0.09896816 0.1034313 2.329575
## [3]  {Bash/Shell (all shells), Python, SQL}      => {R} 0.01225614 0.09791877 0.1251664 2.304873
## [4]  {HTML/CSS, Java, Python}                    => {R} 0.01048887 0.09679127 0.1083658 2.278334
## [5]  {C, Python}                                 => {R} 0.01269222 0.09319963 0.1361832 2.193791
## [6]  {Python, SQL}                               => {R} 0.02407620 0.09300882 0.2588593 2.189300
## [7]  {Java, JavaScript, Python}                  => {R} 0.01168235 0.09012838 0.1296190 2.121498
## [8]  {HTML/CSS, Python, SQL}                     => {R} 0.01527427 0.08977472 0.1701400 2.113174
## [9]  {Bash/Shell (all shells), HTML/CSS, Python} => {R} 0.01113151 0.08837464 0.1259582 2.080218
## [10] {HTML/CSS, JavaScript, Python, SQL}         => {R} 0.01339224 0.08668202 0.1544985 2.040376
##      count
## [1]   983 
## [2]   892 
## [3]  1068 
## [4]   914 
## [5]  1106 
## [6]  2098 
## [7]  1018 
## [8]  1331 
## [9]   970 
## [10] 1167

Here we can see a wide variety of languages that can suggest also working with R. Java, C++, Python, SQL and Bash are very different languages; however, together they can represent the skillset of a professional data scientist, so R can be seen as an important language for this field. Note that the confidence values of these rules are rather low (about 0.1 at most), although the lift values above 2 show that R is more than twice as common among these developers as in the whole sample.

Graph representing obtained relationships

plot(rules_r, method='graph', measure = "support", shading = "lift", max=20)

plot(rules_r, method="paracoord", control=list(reorder=TRUE))

Significance

options(width = 100)
is.significant(rules_r, data)
##  [1]  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [17]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
## [33]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

Maximal sets

options(width = 100)
is.maximal(rules_r)
##                                           {C,R}                                         {C++,R} 
##                                           FALSE                                           FALSE 
##                                          {C#,R}                                        {Java,R} 
##                                            TRUE                                           FALSE 
##                     {Bash/Shell (all shells),R}                                  {R,TypeScript} 
##                                           FALSE                                           FALSE 
##                                      {Python,R}                                         {R,SQL} 
##                                           FALSE                                           FALSE 
##                                    {HTML/CSS,R}                                  {JavaScript,R} 
##                                           FALSE                                           FALSE 
##                                       {C,C++,R}                                    {C,Python,R} 
##                                            TRUE                                            TRUE 
##                                {C,JavaScript,R}                                  {C++,Python,R} 
##                                            TRUE                                            TRUE 
##                                     {C++,R,SQL}                              {C++,JavaScript,R} 
##                                            TRUE                                            TRUE 
##                                 {Java,Python,R}                                    {Java,R,SQL} 
##                                           FALSE                                           FALSE 
##                               {HTML/CSS,Java,R}                             {Java,JavaScript,R} 
##                                           FALSE                                           FALSE 
##              {Bash/Shell (all shells),Python,R}                 {Bash/Shell (all shells),R,SQL} 
##                                           FALSE                                           FALSE 
##            {Bash/Shell (all shells),HTML/CSS,R}          {Bash/Shell (all shells),JavaScript,R} 
##                                           FALSE                                           FALSE 
##                           {Python,R,TypeScript}                       {JavaScript,R,TypeScript} 
##                                           FALSE                                           FALSE 
##                                  {Python,R,SQL}                             {HTML/CSS,Python,R} 
##                                           FALSE                                           FALSE 
##                           {JavaScript,Python,R}                                {HTML/CSS,R,SQL} 
##                                           FALSE                                           FALSE 
##                              {JavaScript,R,SQL}                         {HTML/CSS,JavaScript,R} 
##                                           FALSE                                           FALSE 
##                             {Java,Python,R,SQL}                        {HTML/CSS,Java,Python,R} 
##                                            TRUE                                            TRUE 
##                      {Java,JavaScript,Python,R}                         {Java,JavaScript,R,SQL} 
##                                            TRUE                                            TRUE 
##                    {HTML/CSS,Java,JavaScript,R}          {Bash/Shell (all shells),Python,R,SQL} 
##                                            TRUE                                            TRUE 
##     {Bash/Shell (all shells),HTML/CSS,Python,R}   {Bash/Shell (all shells),JavaScript,Python,R} 
##                                            TRUE                                            TRUE 
##      {Bash/Shell (all shells),JavaScript,R,SQL} {Bash/Shell (all shells),HTML/CSS,JavaScript,R} 
##                                            TRUE                                            TRUE 
##                {JavaScript,Python,R,TypeScript}                         {HTML/CSS,Python,R,SQL} 
##                                            TRUE                                           FALSE 
##                       {JavaScript,Python,R,SQL}                  {HTML/CSS,JavaScript,Python,R} 
##                                           FALSE                                           FALSE 
##                     {HTML/CSS,JavaScript,R,SQL}              {HTML/CSS,JavaScript,Python,R,SQL} 
##                                           FALSE                                            TRUE

Redundant sets

options(width = 100)
is.redundant(rules_r)
##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [17] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
## [33] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
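
The redundancy criterion can be illustrated with four of the R rules listed in the confidence table above. The sketch below applies the definition used in this report (a rule is redundant if a rule with a smaller LHS and the same RHS has the same or higher confidence) in base R. It is restricted to these four rules and is not the `arules` implementation, so it does not reproduce the full vector printed above.

```r
# LHS itemsets of four rules with RHS {R}, with confidences copied
# from the table of R rules sorted by confidence.
lhs <- list(
  c("HTML/CSS", "Python", "SQL"),
  c("HTML/CSS", "JavaScript", "Python", "SQL"),
  c("Python", "SQL"),
  c("Java", "Python", "SQL")
)
conf <- c(0.08977472, 0.08668202, 0.09300882, 0.10506627)

# Rule i is redundant if some rule j has an LHS that is a proper subset
# of the LHS of rule i and a confidence that is at least as high.
redundant <- vapply(seq_along(lhs), function(i) {
  any(vapply(seq_along(lhs), function(j) {
    j != i &&
      length(lhs[[j]]) < length(lhs[[i]]) &&
      all(lhs[[j]] %in% lhs[[i]]) &&
      conf[j] >= conf[i]
  }, logical(1)))
}, logical(1))

print(redundant)
```

Adding HTML/CSS (or HTML/CSS and JavaScript) to {Python, SQL} lowers the confidence, so those more specific rules are redundant. Adding Java raises it from about 0.093 to about 0.105, so {Java, Python, SQL} => {R} is kept.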

VBA

Which languages do developers use when they also use VBA?

For the VBA programming language we also fix it as the RHS of the rules.

rules_vba <- apriori(data=data, parameter=list(supp=0.01, conf=0.005, minlen=2), appearance=list(default="lhs", rhs="VBA"), control=list(verbose=FALSE))

First 10 rules by confidence

inspect(head(sort(rules_vba, by="confidence", decreasing=TRUE),10))
##      lhs                                  rhs   support    confidence coverage   lift     count
## [1]  {SQL, Visual Basic (.Net)}        => {VBA} 0.01101675 0.34371643 0.03205187 9.639990  960 
## [2]  {JavaScript, Visual Basic (.Net)} => {VBA} 0.01028230 0.32688800 0.03145513 9.168014  896 
## [3]  {Visual Basic (.Net)}             => {VBA} 0.01296764 0.31670404 0.04094560 8.882391 1130 
## [4]  {PowerShell, SQL}                 => {VBA} 0.01156759 0.11719567 0.09870324 3.286911 1008 
## [5]  {HTML/CSS, PowerShell}            => {VBA} 0.01021345 0.10583898 0.09649989 2.968397  890 
## [6]  {JavaScript, PowerShell}          => {VBA} 0.01079871 0.10270683 0.10514115 2.880551  941 
## [7]  {PowerShell}                      => {VBA} 0.01336929 0.09788271 0.13658481 2.745252 1165 
## [8]  {C#, HTML/CSS, JavaScript, SQL}   => {VBA} 0.01086757 0.09734786 0.11163645 2.730252  947 
## [9]  {C#, HTML/CSS, SQL}               => {VBA} 0.01183154 0.09610365 0.12311223 2.695356 1031 
## [10] {C#, JavaScript, SQL}             => {VBA} 0.01253156 0.09208196 0.13609135 2.582563 1092

VBA is a programming language that most people associate with Excel. Knowing this, we can understand some of the relationships we have obtained. SQL, as a database query language, works well with VBA on data-related tasks. Visual Basic (.Net), a language also associated with Microsoft, and PowerShell show that developers working with data and/or in the Windows environment are more likely to also know VBA.

Graph of the relationships

plot(rules_vba, method='graph', measure = "support", shading = "lift")

plot(rules_vba, method="paracoord", control=list(reorder=TRUE))

Significance

options(width = 100)
is.significant(rules_vba, data)
##  [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [17]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE
## [33]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [49]  TRUE

Maximal sets

options(width = 100)
is.maximal(rules_vba)
##                {VBA,Visual Basic (.Net)}                         {PowerShell,VBA} 
##                                    FALSE                                    FALSE 
##                                {PHP,VBA}                                  {C,VBA} 
##                                    FALSE                                     TRUE 
##                                {C++,VBA}                                 {C#,VBA} 
##                                     TRUE                                    FALSE 
##                               {Java,VBA}            {Bash/Shell (all shells),VBA} 
##                                    FALSE                                    FALSE 
##                         {TypeScript,VBA}                             {Python,VBA} 
##                                    FALSE                                    FALSE 
##                                {SQL,VBA}                           {HTML/CSS,VBA} 
##                                    FALSE                                    FALSE 
##                         {JavaScript,VBA}            {SQL,VBA,Visual Basic (.Net)} 
##                                    FALSE                                     TRUE 
##     {JavaScript,VBA,Visual Basic (.Net)}                     {PowerShell,SQL,VBA} 
##                                     TRUE                                     TRUE 
##                {HTML/CSS,PowerShell,VBA}              {JavaScript,PowerShell,VBA} 
##                                     TRUE                                     TRUE 
##                            {PHP,SQL,VBA}                       {HTML/CSS,PHP,VBA} 
##                                    FALSE                                    FALSE 
##                     {JavaScript,PHP,VBA}                             {C#,SQL,VBA} 
##                                    FALSE                                    FALSE 
##                        {C#,HTML/CSS,VBA}                      {C#,JavaScript,VBA} 
##                                    FALSE                                    FALSE 
##                           {Java,SQL,VBA}                      {HTML/CSS,Java,VBA} 
##                                    FALSE                                     TRUE 
##                    {Java,JavaScript,VBA}        {Bash/Shell (all shells),SQL,VBA} 
##                                    FALSE                                     TRUE 
##   {Bash/Shell (all shells),HTML/CSS,VBA} {Bash/Shell (all shells),JavaScript,VBA} 
##                                     TRUE                                     TRUE 
##              {JavaScript,TypeScript,VBA}                         {Python,SQL,VBA} 
##                                     TRUE                                    FALSE 
##                    {HTML/CSS,Python,VBA}                  {JavaScript,Python,VBA} 
##                                    FALSE                                    FALSE 
##                       {HTML/CSS,SQL,VBA}                     {JavaScript,SQL,VBA} 
##                                    FALSE                                    FALSE 
##                {HTML/CSS,JavaScript,VBA}                 {JavaScript,PHP,SQL,VBA} 
##                                    FALSE                                     TRUE 
##            {HTML/CSS,JavaScript,PHP,VBA}                    {C#,HTML/CSS,SQL,VBA} 
##                                     TRUE                                    FALSE 
##                  {C#,JavaScript,SQL,VBA}             {C#,HTML/CSS,JavaScript,VBA} 
##                                    FALSE                                    FALSE 
##                {Java,JavaScript,SQL,VBA}                {HTML/CSS,Python,SQL,VBA} 
##                                     TRUE                                    FALSE 
##              {JavaScript,Python,SQL,VBA}         {HTML/CSS,JavaScript,Python,VBA} 
##                                    FALSE                                    FALSE 
##            {HTML/CSS,JavaScript,SQL,VBA}         {C#,HTML/CSS,JavaScript,SQL,VBA} 
##                                    FALSE                                     TRUE 
##     {HTML/CSS,JavaScript,Python,SQL,VBA} 
##                                     TRUE

Redundant sets

options(width = 100)
is.redundant(rules_vba)
##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [17] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
## [33] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [49] FALSE

ECLAT Algorithm

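The Equivalence Class Clustering and bottom-up Lattice Traversal (ECLAT) algorithm can also be used to extract frequent itemsets, from which rules can then be derived. Apriori works on a horizontal data layout (each transaction lists its items) and explores candidates breadth-first, whereas ECLAT works on a vertical layout (each item lists the transactions that contain it) and traverses the itemset lattice in a manner similar to a depth-first search of a graph. It can be used with a top-down, bottom-up or hybrid approach and allows fast computation.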
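
The vertical idea can be sketched on toy data in base R. The five transactions below are hypothetical, and `eclat_toy` is a simplified helper written for this illustration: it stores, for each item, the ids of the transactions that contain it, and obtains the support of a larger itemset by intersecting these id lists while extending itemsets depth-first. It is not the implementation used by `arules`.

```r
# Hypothetical toy transactions (not survey data).
transactions <- list(
  c("JavaScript", "HTML/CSS", "SQL"),
  c("JavaScript", "HTML/CSS"),
  c("Python", "SQL"),
  c("JavaScript", "Python", "SQL"),
  c("HTML/CSS", "JavaScript", "TypeScript")
)

# Vertical layout: for every item, the ids of the transactions containing it.
items <- sort(unique(unlist(transactions)))
tidlists <- setNames(lapply(items, function(it) {
  which(vapply(transactions, function(tr) it %in% tr, logical(1)))
}), items)

# Depth-first search: extend the current prefix by one item at a time and
# intersect transaction-id lists to obtain the support count.
eclat_toy <- function(prefix, prefix_tids, candidates, tidlists, min_count) {
  result <- list()
  for (k in seq_along(candidates)) {
    item <- candidates[k]
    tids <- if (is.null(prefix_tids)) {
      tidlists[[item]]
    } else {
      intersect(prefix_tids, tidlists[[item]])
    }
    if (length(tids) >= min_count) {
      itemset <- c(prefix, item)
      result[[paste(itemset, collapse = ", ")]] <- length(tids)
      if (k < length(candidates)) {
        rest <- candidates[(k + 1):length(candidates)]
        result <- c(result,
                    eclat_toy(itemset, tids, rest, tidlists, min_count))
      }
    }
  }
  result
}

# Frequent itemsets that occur in at least 2 of the 5 transactions.
freq <- eclat_toy(character(0), NULL, items, tidlists, min_count = 2)
print(unlist(freq))
```

An itemset is extended only while it stays frequent, so infrequent branches are abandoned early. The call below applies the `arules` implementation of ECLAT to the full survey data.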

eclat_items <- eclat(data, parameter=list(supp=0.05, maxlen=15)) 
## Eclat
## 
## parameter specification:
##  tidLists support minlen maxlen            target  ext
##     FALSE    0.05      1     15 frequent itemsets TRUE
## 
## algorithmic control:
##  sparse sort verbose
##       7   -2    TRUE
## 
## Absolute minimum support count: 4357 
## 
## create itemset ... 
## set transactions ...[51 item(s), 87140 transaction(s)] done [0.02s].
## sorting and recoding items ... [19 item(s)] done [0.00s].
## creating bit matrix ... [19 row(s), 87140 column(s)] done [0.00s].
## writing  ... [318 set(s)] done [0.00s].
## Creating S4 object  ... done [0.00s].
inspect(eclat_items[1:10])
##      items                          support    count
## [1]  {JavaScript, Kotlin}           0.06016755 5243 
## [2]  {Java, Kotlin}                 0.06372504 5553 
## [3]  {JavaScript, Rust, TypeScript} 0.05859536 5106 
## [4]  {JavaScript, Python, Rust}     0.05781501 5038 
## [5]  {HTML/CSS, JavaScript, Rust}   0.06021345 5247 
## [6]  {JavaScript, Rust}             0.08403718 7323 
## [7]  {HTML/CSS, Rust}               0.06843011 5963 
## [8]  {Rust, SQL}                    0.05984622 5215 
## [9]  {Python, Rust}                 0.08663071 7549 
## [10] {Rust, TypeScript}             0.06824650 5947

Inducing rules from these itemsets with a minimum confidence of 50% gives a set of 574 rules.

eclat_rules<-ruleInduction(eclat_items, data, confidence=0.5)
eclat_rules
## set of 574 rules
inspect(eclat_rules[1:10])
##      lhs                   rhs          support    confidence lift      itemset
## [1]  {Kotlin}           => {JavaScript} 0.06016755 0.6607435  1.0334977 1      
## [2]  {Kotlin}           => {Java}       0.06372504 0.6998110  2.2790869 2      
## [3]  {Rust, TypeScript} => {JavaScript} 0.05859536 0.8585842  1.3429489 3      
## [4]  {JavaScript, Rust} => {TypeScript} 0.05859536 0.6972552  1.7848718 3      
## [5]  {Python, Rust}     => {JavaScript} 0.05781501 0.6673732  1.0438674 4      
## [6]  {JavaScript, Rust} => {Python}     0.05781501 0.6879694  1.3890740 4      
## [7]  {JavaScript, Rust} => {HTML/CSS}   0.06021345 0.7165096  1.3457334 5      
## [8]  {HTML/CSS, Rust}   => {JavaScript} 0.06021345 0.8799262  1.3763309 5      
## [9]  {Rust}             => {JavaScript} 0.08403718 0.6408506  1.0023823 6      
## [10] {Rust}             => {HTML/CSS}   0.06843011 0.5218343  0.9800982 7
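The three measures reported for each rule can be reproduced by hand. The sketch below is a minimal base R illustration on a small, made-up list of transactions. The data and the helper names `item_support` and `rule_measures` are hypothetical, not part of the survey or of arules. It computes support, confidence and lift for a rule lhs => rhs directly from their definitions.

```r
# Hypothetical toy transactions; each element is one respondent's languages.
transactions <- list(
  c("Java", "Kotlin"),
  c("Java", "Kotlin", "SQL"),
  c("Java", "SQL"),
  c("Kotlin", "JavaScript"),
  c("JavaScript", "SQL"),
  c("Python")
)

# Share of transactions that contain every item in `items`.
item_support <- function(transactions, items) {
  mean(vapply(transactions, function(t) all(items %in% t), logical(1)))
}

# Support, confidence and lift of the rule lhs => rhs (illustrative helper).
rule_measures <- function(transactions, lhs, rhs) {
  supp_both <- item_support(transactions, c(lhs, rhs))
  supp_lhs  <- item_support(transactions, lhs)
  supp_rhs  <- item_support(transactions, rhs)
  confidence <- supp_both / supp_lhs
  c(support = supp_both, confidence = confidence, lift = confidence / supp_rhs)
}

m <- rule_measures(transactions, lhs = "Kotlin", rhs = "Java")
print(round(m, 4))
```

A lift above 1, as for {Kotlin} => {Java} in the output above (about 2.28), means that the two sides occur together more often than they would if they were independent.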

First 10 rules by confidence

inspect(head(sort(eclat_rules, by="confidence", decreasing=TRUE),10))
##      lhs                                                     rhs          support    confidence
## [1]  {HTML/CSS, PHP, SQL, TypeScript}                     => {JavaScript} 0.05277714 0.9780944 
## [2]  {HTML/CSS, PHP, TypeScript}                          => {JavaScript} 0.06744319 0.9736581 
## [3]  {PHP, SQL, TypeScript}                               => {JavaScript} 0.05958228 0.9652352 
## [4]  {HTML/CSS, Java, Python, TypeScript}                 => {JavaScript} 0.05632316 0.9600939 
## [5]  {PHP, TypeScript}                                    => {JavaScript} 0.07835667 0.9587195 
## [6]  {HTML/CSS, Python, SQL, TypeScript}                  => {JavaScript} 0.08742254 0.9573960 
## [7]  {HTML/CSS, PHP, Python, SQL}                         => {JavaScript} 0.05678219 0.9555813 
## [8]  {Bash/Shell (all shells), HTML/CSS, SQL, TypeScript} => {JavaScript} 0.07095479 0.9551985 
## [9]  {HTML/CSS, PowerShell, TypeScript}                   => {JavaScript} 0.05407390 0.9548126 
## [10] {HTML/CSS, Java, SQL, TypeScript}                    => {JavaScript} 0.06433326 0.9538880 
##      lift     itemset
## [1]  1.529880  59    
## [2]  1.522941  62    
## [3]  1.509766  60    
## [4]  1.501725 216    
## [5]  1.499575  63    
## [6]  1.497505 274    
## [7]  1.494666  66    
## [8]  1.494068 250    
## [9]  1.493464  32    
## [10] 1.492018 220

Based on the top 10 rules ordered by confidence, the ECLAT algorithm gives results similar to those of the Apriori algorithm: every rule predicts JavaScript, and the left-hand sides are dominated by other web development languages.

First 10 rules by support

inspect(head(sort(eclat_rules, by="support", decreasing=TRUE),10))
##      lhs                  rhs          support   confidence lift      itemset
## [1]  {JavaScript}      => {HTML/CSS}   0.4718499 0.7380410  1.3861731 299    
## [2]  {HTML/CSS}        => {JavaScript} 0.4718499 0.8862186  1.3861731 299    
## [3]  {SQL}             => {JavaScript} 0.3657333 0.7477184  1.1695388 297    
## [4]  {JavaScript}      => {SQL}        0.3657333 0.5720594  1.1695388 297    
## [5]  {TypeScript}      => {JavaScript} 0.3436654 0.8797333  1.3760291 285    
## [6]  {JavaScript}      => {TypeScript} 0.3436654 0.5375420  1.3760291 285    
## [7]  {SQL}             => {HTML/CSS}   0.3212302 0.6567346  1.2334653 298    
## [8]  {HTML/CSS}        => {SQL}        0.3212302 0.6033279  1.2334653 298    
## [9]  {Python}          => {JavaScript} 0.3137365 0.6334631  0.9908272 293    
## [10] {JavaScript, SQL} => {HTML/CSS}   0.2907276 0.7949168  1.4929963 296

The top rules by support again mostly describe relationships between web development languages.

First 10 rules by lift

inspect(head(sort(eclat_rules, by="lift", decreasing=TRUE),10))
##      lhs                                       rhs support    confidence lift     itemset
## [1]  {Bash/Shell (all shells), C++, Python} => {C} 0.05137709 0.6924981  3.562236 85     
## [2]  {Bash/Shell (all shells), C++}         => {C} 0.06365619 0.6790305  3.492958 95     
## [3]  {C++, Java}                            => {C} 0.06235942 0.6659314  3.425576 96     
## [4]  {C++, HTML/CSS, JavaScript, Python}    => {C} 0.05027542 0.6618825  3.404748 86     
## [5]  {C++, HTML/CSS, Python}                => {C} 0.05612807 0.6557179  3.373038 88     
## [6]  {C++, JavaScript, Python}              => {C} 0.06296764 0.6478158  3.332389 87     
## [7]  {C++, JavaScript, SQL}                 => {C} 0.05220335 0.6470839  3.328624 89     
## [8]  {C++, SQL}                             => {C} 0.06484967 0.6269832  3.225226 93     
## [9]  {C++, HTML/CSS, JavaScript}            => {C} 0.06489557 0.6237591  3.208640 90     
## [10] {C++, HTML/CSS}                        => {C} 0.07298600 0.6173559  3.175702 92

Ordered by lift, every top rule predicts C from a left-hand side that contains C++, in several cases together with Bash/Shell or Java. C and C++ are compiled, general-purpose languages (C itself is procedural rather than object-oriented), and shell scripts are commonly used to build executables from their source files.
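Because lift is confidence divided by the support of the right-hand side, the baseline share of C users can be recovered from any of the rows above. The following sketch only reuses the printed confidence and lift values of the first three rules; the variable names are illustrative.

```r
# Confidence and lift copied from the first three rules sorted by lift
# (all three have {C} on the right-hand side).
confidence <- c(0.6924981, 0.6790305, 0.6659314)
lift       <- c(3.562236, 3.492958, 3.425576)

# lift = confidence / supp(rhs), so supp(rhs) = confidence / lift.
implied_support_c <- confidence / lift
print(round(implied_support_c, 4))
```

All three rows imply that roughly 19.4% of respondents report C, so the confidence of about 69% for the first rule is around 3.5 times that baseline.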

Scatterplot of rules’ confidence vs support for ECLAT Algorithm

plot(eclat_rules)

Significance

options(width = 100)
is.significant(eclat_rules, data)
##   [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
##  [16]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
##  [31]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
##  [46]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
##  [61]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
##  [76]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
##  [91]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [106]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE  TRUE
## [121]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [136]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [151]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
## [166]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [181]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [196]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [211]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
## [226]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [241]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [256]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE
## [271]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE
## [286]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [301]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [316]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [331]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [346]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [361]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [376]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [391]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [406]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [421]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [436]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [451]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [466]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [481]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [496]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [511]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [526]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [541]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [556]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [571]  TRUE  TRUE  TRUE  TRUE
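A vector of 574 logical values is hard to read at a glance; counting and locating the FALSE entries is usually more informative. The sketch below rebuilds the vector from the positions of FALSE in the printed output. The name `sig` is an assumption; in a live session it would simply be the result of `is.significant(eclat_rules, data)`.

```r
# Positions of FALSE read off the printed is.significant() output above.
not_significant <- c(9, 10, 24, 115, 118, 160, 225, 267, 281, 282, 562)

# Rebuild the logical vector; in a live session use instead:
# sig <- is.significant(eclat_rules, data)
sig <- rep(TRUE, 574)
sig[not_significant] <- FALSE

print(table(sig))   # how many rules pass or fail the test
print(which(!sig))  # indices of the rules that fail
print(mean(sig))    # share of rules that pass
```

Rules 9 and 10, {Rust} => {JavaScript} and {Rust} => {HTML/CSS}, are among those flagged; their lift values (about 1.002 and 0.980) are close to 1.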

Redundant rules

options(width = 100)
is.redundant(eclat_rules)
##   [1] FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [16] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [31] FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [46] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [61] FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [76] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [91] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
## [106] FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
## [121] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [136] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [151] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [166] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [181] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE
## [196] FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [211] FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
## [226] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [241] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE
## [256] FALSE FALSE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [271] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
## [286] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
## [301] FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
## [316] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
## [331] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [346] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
## [361] FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
## [376]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
## [391] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [406] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [421] FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
## [436] FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE FALSE
## [451]  TRUE  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [466] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE FALSE
## [481]  TRUE  TRUE FALSE FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
## [496] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [511] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
## [526] FALSE  TRUE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [541] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE
## [556] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [571] FALSE FALSE FALSE FALSE

Redundant relationships:

options(width = 100)
redundant <- eclat_rules[is.redundant(eclat_rules)]
inspect(redundant)
##      lhs                           rhs             support confidence     lift itemset
## [1]  {Rust,                                                                           
##       TypeScript}               => {JavaScript} 0.05859536  0.8585842 1.342949       3
## [2]  {JavaScript,                                                                     
##       Rust}                     => {HTML/CSS}   0.06021345  0.7165096 1.345733       5
## [3]  {HTML/CSS,                                                                       
##       Rust}                     => {JavaScript} 0.06021345  0.8799262 1.376331       5
## [4]  {Go,                                                                             
##       JavaScript}               => {HTML/CSS}   0.06445949  0.6984581 1.311829      16
## [5]  {HTML/CSS,                                                                       
##       JavaScript,                                                                     
##       PowerShell}               => {C#}         0.05341978  0.6073059 2.187436      25
## [6]  {C#,                                                                             
##       JavaScript,                                                                     
##       PowerShell}               => {HTML/CSS}   0.05341978  0.8361775 1.570491      25
## [7]  {PowerShell,                                                                     
##       SQL}                      => {Python}     0.05601331  0.5674922 1.145819      39
## [8]  {PowerShell,                                                                     
##       Python}                   => {SQL}        0.05601331  0.7140140 1.459756      39
## [9]  {HTML/CSS,                                                                       
##       JavaScript,                                                                     
##       PHP,                                                                            
##       SQL}                      => {Python}     0.05678219  0.5310722 1.072284      66
## [10] {JavaScript,                                                                     
##       PHP,                                                                            
##       SQL}                      => {Python}     0.06443654  0.5256998 1.061437      67
## [11] {HTML/CSS,                                                                       
##       PHP,                                                                            
##       SQL}                      => {Python}     0.05942162  0.5278826 1.065844      68
## [12] {PHP,                                                                            
##       SQL}                      => {Python}     0.07028919  0.5199932 1.049914      72
## [13] {Bash/Shell (all shells),                                                        
##       C,                                                                              
##       HTML/CSS}                 => {JavaScript} 0.05105577  0.8898000 1.391775     104
## [14] {Bash/Shell (all shells),                                                        
##       C}                        => {JavaScript} 0.06273812  0.6717041 1.050642     105
## [15] {Bash/Shell (all shells),                                                        
##       C}                        => {SQL}        0.05033280  0.5388868 1.101720     107
## [16] {C,                                                                              
##       Python}                   => {SQL}        0.07085150  0.5202663 1.063651     115
## [17] {C,                                                                              
##       HTML/CSS}                 => {JavaScript} 0.09439982  0.8782832 1.373761     119
## [18] {C#,                                                                             
##       C++}                      => {Python}     0.05130824  0.6378941 1.287967     131
## [19] {Bash/Shell (all shells),                                                        
##       C++,                                                                            
##       Python}                   => {JavaScript} 0.05005738  0.6747100 1.055343     138
## [20] {Bash/Shell (all shells),                                                        
##       C++}                      => {JavaScript} 0.06203810  0.6617701 1.035103     139
## [21] {Bash/Shell (all shells),                                                        
##       C++}                      => {HTML/CSS}   0.05610512  0.5984821 1.124057     140
## [22] {C++,                                                                            
##       HTML/CSS,                                                                       
##       JavaScript}               => {TypeScript} 0.05335093  0.5127951 1.312681     142
## [23] {C++,                                                                            
##       JavaScript,                                                                     
##       Python}                   => {SQL}        0.05911177  0.6081464 1.243316     146
## [24] {C++,                                                                            
##       HTML/CSS,                                                                       
##       Python}                   => {SQL}        0.05400505  0.6309157 1.289867     147
## [25] {C++,                                                                            
##       HTML/CSS}                 => {JavaScript} 0.10403948  0.8800233 1.376483     155
## [26] {C#,                                                                             
##       HTML/CSS,                                                                       
##       SQL,                                                                            
##       TypeScript}               => {JavaScript} 0.06699564  0.9422208 1.473769     175
## [27] {C#,                                                                             
##       SQL,                                                                            
##       TypeScript}               => {JavaScript} 0.08006656  0.9132199 1.428407     176
## [28] {C#,                                                                             
##       HTML/CSS,                                                                       
##       TypeScript}               => {JavaScript} 0.09055543  0.9272620 1.450371     178
## [29] {C#,                                                                             
##       HTML/CSS,                                                                       
##       Python}                   => {SQL}        0.05849208  0.7230813 1.478293     185
## [30] {Bash/Shell (all shells),                                                        
##       Java,                                                                           
##       JavaScript,                                                                     
##       SQL}                      => {HTML/CSS}   0.05300666  0.8202806 1.540634     206
## [31] {Bash/Shell (all shells),                                                        
##       Java,                                                                           
##       JavaScript}               => {HTML/CSS}   0.07074822  0.7802810 1.465507     209
## [32] {HTML/CSS,                                                                       
##       Java,                                                                           
##       JavaScript,                                                                     
##       TypeScript}               => {Python}     0.05632316  0.6245069 1.260937     216
## [33] {HTML/CSS,                                                                       
##       Java,                                                                           
##       JavaScript,                                                                     
##       Python}                   => {TypeScript} 0.05632316  0.5681213 1.454308     216
## [34] {Java,                                                                           
##       JavaScript,                                                                     
##       SQL,                                                                            
##       TypeScript}               => {HTML/CSS}   0.06433326  0.8114054 1.523965     220
## [35] {Java,                                                                           
##       JavaScript,                                                                     
##       SQL}                      => {HTML/CSS}   0.11131513  0.7892596 1.482371     235
## [36] {Java,                                                                           
##       JavaScript}               => {HTML/CSS}   0.15865274  0.7342007 1.378961     238
## [37] {Bash/Shell (all shells),                                                        
##       JavaScript,                                                                     
##       SQL,                                                                            
##       TypeScript}               => {Python}     0.05584118  0.6619508 1.336540     245
## [38] {Bash/Shell (all shells),                                                        
##       JavaScript,                                                                     
##       Python,                                                                         
##       SQL}                      => {TypeScript} 0.05584118  0.5778411 1.479189     245
## [39] {Bash/Shell (all shells),                                                        
##       HTML/CSS,                                                                       
##       JavaScript,                                                                     
##       TypeScript}               => {Python}     0.06457425  0.6341711 1.280450     246
## [40] {Bash/Shell (all shells),                                                        
##       HTML/CSS,                                                                       
##       JavaScript,                                                                     
##       Python}                   => {TypeScript} 0.06457425  0.5770690 1.477213     246
## [41] {Bash/Shell (all shells),                                                        
##       JavaScript,                                                                     
##       TypeScript}               => {Python}     0.07925178  0.6277041 1.267393     247
## [42] {Bash/Shell (all shells),                                                        
##       JavaScript,                                                                     
##       Python}                   => {TypeScript} 0.07925178  0.5555913 1.422233     247
## [43] {Bash/Shell (all shells),                                                        
##       HTML/CSS,                                                                       
##       TypeScript}               => {Python}     0.06800551  0.6306939 1.273429     248
## [44] {Bash/Shell (all shells),                                                        
##       HTML/CSS,                                                                       
##       Python}                   => {TypeScript} 0.06800551  0.5399052 1.382079     248
## [45] {Bash/Shell (all shells),                                                        
##       SQL,                                                                            
##       TypeScript}               => {Python}     0.05978885  0.6590765 1.330737     249
## [46] {Bash/Shell (all shells),                                                        
##       TypeScript}               => {Python}     0.08689465  0.6231586 1.258215     257
## [47] {Bash/Shell (all shells),                                                        
##       JavaScript,                                                                     
##       Python,                                                                         
##       SQL}                      => {HTML/CSS}   0.07957310  0.8234176 1.546526     258
## [48] {Bash/Shell (all shells),                                                        
##       HTML/CSS,                                                                       
##       JavaScript,                                                                     
##       SQL}                      => {Python}     0.07957310  0.6622732 1.337191     258
## [49] {Bash/Shell (all shells),                                                        
##       Python,                                                                         
##       SQL}                      => {JavaScript} 0.09663759  0.7720730 1.207633     259
## [50] {Bash/Shell (all shells),                                                        
##       JavaScript,                                                                     
##       SQL}                      => {Python}     0.09663759  0.6623407 1.337327     259
## [51] {Bash/Shell (all shells),                                                        
##       Python,                                                                         
##       SQL}                      => {HTML/CSS}   0.08703236  0.6953333 1.305960     260
## [52] {Bash/Shell (all shells),                                                        
##       HTML/CSS,                                                                       
##       SQL}                      => {Python}     0.08703236  0.6616069 1.335846     260
## [53] {Bash/Shell (all shells),                                                        
##       HTML/CSS,                                                                       
##       Python}                   => {JavaScript} 0.11190039  0.8883929 1.389574     261
## [54] {Bash/Shell (all shells),                                                        
##       HTML/CSS,                                                                       
##       JavaScript}               => {Python}     0.11190039  0.6429513 1.298178     261
## [55] {Bash/Shell (all shells),                                                        
##       Python}                   => {JavaScript} 0.14264402  0.6702615 1.048385     262
## [56] {Bash/Shell (all shells),                                                        
##       JavaScript}               => {Python}     0.14264402  0.6413498 1.294945     262
## [57] {Bash/Shell (all shells),                                                        
##       Python}                   => {HTML/CSS}   0.12595823  0.5918576 1.111615     263
## [58] {Bash/Shell (all shells),                                                        
##       HTML/CSS}                 => {Python}     0.12595823  0.6454952 1.303315     263
## [59] {HTML/CSS,                                                                       
##       JavaScript,                                                                     
##       Python}                   => {TypeScript} 0.12836814  0.5454723 1.396330     277
## [60] {JavaScript,                                                                     
##       Python}                   => {TypeScript} 0.16470048  0.5249643 1.343832     278
## [61] {HTML/CSS,                                                                       
##       Python}                   => {TypeScript} 0.13547165  0.5094071 1.304008     279
## [62] {SQL,                                                                            
##       TypeScript}               => {Python}     0.11483819  0.5289392 1.067977     280
## [63] {Python,                                                                         
##       SQL}                      => {JavaScript} 0.19318338  0.7462872 1.167300     290
## [64] {JavaScript,                                                                     
##       SQL}                      => {Python}     0.19318338  0.5282083 1.066502     290
## [65] {HTML/CSS,                                                                       
##       Python}                   => {JavaScript} 0.23533395  0.8849141 1.384133     292

Insignificant relationships:

options(width = 100)
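# Keep the rules that fail the significance test. The listing printed below was
# produced with is.redundant() as the filter, so it repeats the redundant rules.
insignificant <- eclat_rules[!is.significant(eclat_rules, data)]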
inspect(insignificant)
##      lhs                           rhs             support confidence     lift itemset
## [1]  {Rust,                                                                           
##       TypeScript}               => {JavaScript} 0.05859536  0.8585842 1.342949       3
## [2]  {JavaScript,                                                                     
##       Rust}                     => {HTML/CSS}   0.06021345  0.7165096 1.345733       5
## [3]  {HTML/CSS,                                                                       
##       Rust}                     => {JavaScript} 0.06021345  0.8799262 1.376331       5
## [4]  {Go,                                                                             
##       JavaScript}               => {HTML/CSS}   0.06445949  0.6984581 1.311829      16
## [5]  {HTML/CSS,                                                                       
##       JavaScript,                                                                     
##       PowerShell}               => {C#}         0.05341978  0.6073059 2.187436      25
## [6]  {C#,                                                                             
##       JavaScript,                                                                     
##       PowerShell}               => {HTML/CSS}   0.05341978  0.8361775 1.570491      25
## [7]  {PowerShell,                                                                     
##       SQL}                      => {Python}     0.05601331  0.5674922 1.145819      39
## [8]  {PowerShell,                                                                     
##       Python}                   => {SQL}        0.05601331  0.7140140 1.459756      39
## [9]  {HTML/CSS,                                                                       
##       JavaScript,                                                                     
##       PHP,                                                                            
##       SQL}                      => {Python}     0.05678219  0.5310722 1.072284      66
## [10] {JavaScript,                                                                     
##       PHP,                                                                            
##       SQL}                      => {Python}     0.06443654  0.5256998 1.061437      67
## [11] {HTML/CSS,                                                                       
##       PHP,                                                                            
##       SQL}                      => {Python}     0.05942162  0.5278826 1.065844      68
## [12] {PHP,                                                                            
##       SQL}                      => {Python}     0.07028919  0.5199932 1.049914      72
## [13] {Bash/Shell (all shells),                                                        
##       C,                                                                              
##       HTML/CSS}                 => {JavaScript} 0.05105577  0.8898000 1.391775     104
## [14] {Bash/Shell (all shells),                                                        
##       C}                        => {JavaScript} 0.06273812  0.6717041 1.050642     105
## [15] {Bash/Shell (all shells),                                                        
##       C}                        => {SQL}        0.05033280  0.5388868 1.101720     107
## [16] {C,                                                                              
##       Python}                   => {SQL}        0.07085150  0.5202663 1.063651     115
## [17] {C,                                                                              
##       HTML/CSS}                 => {JavaScript} 0.09439982  0.8782832 1.373761     119
## [18] {C#,                                                                             
##       C++}                      => {Python}     0.05130824  0.6378941 1.287967     131
## [19] {Bash/Shell (all shells),                                                        
##       C++,                                                                            
##       Python}                   => {JavaScript} 0.05005738  0.6747100 1.055343     138
## [20] {Bash/Shell (all shells),                                                        
##       C++}                      => {JavaScript} 0.06203810  0.6617701 1.035103     139
## [21] {Bash/Shell (all shells),                                                        
##       C++}                      => {HTML/CSS}   0.05610512  0.5984821 1.124057     140
## [22] {C++,                                                                            
##       HTML/CSS,                                                                       
##       JavaScript}               => {TypeScript} 0.05335093  0.5127951 1.312681     142
## [23] {C++,                                                                            
##       JavaScript,                                                                     
##       Python}                   => {SQL}        0.05911177  0.6081464 1.243316     146
## [24] {C++,                                                                            
##       HTML/CSS,                                                                       
##       Python}                   => {SQL}        0.05400505  0.6309157 1.289867     147
## [25] {C++,                                                                            
##       HTML/CSS}                 => {JavaScript} 0.10403948  0.8800233 1.376483     155
## [26] {C#,                                                                             
##       HTML/CSS,                                                                       
##       SQL,                                                                            
##       TypeScript}               => {JavaScript} 0.06699564  0.9422208 1.473769     175
## [27] {C#,                                                                             
##       SQL,                                                                            
##       TypeScript}               => {JavaScript} 0.08006656  0.9132199 1.428407     176
## [28] {C#,                                                                             
##       HTML/CSS,                                                                       
##       TypeScript}               => {JavaScript} 0.09055543  0.9272620 1.450371     178
## [29] {C#,                                                                             
##       HTML/CSS,                                                                       
##       Python}                   => {SQL}        0.05849208  0.7230813 1.478293     185
## [30] {Bash/Shell (all shells),                                                        
##       Java,                                                                           
##       JavaScript,                                                                     
##       SQL}                      => {HTML/CSS}   0.05300666  0.8202806 1.540634     206
## [31] {Bash/Shell (all shells),                                                        
##       Java,                                                                           
##       JavaScript}               => {HTML/CSS}   0.07074822  0.7802810 1.465507     209
## [32] {HTML/CSS,                                                                       
##       Java,                                                                           
##       JavaScript,                                                                     
##       TypeScript}               => {Python}     0.05632316  0.6245069 1.260937     216
## [33] {HTML/CSS,                                                                       
##       Java,                                                                           
##       JavaScript,                                                                     
##       Python}                   => {TypeScript} 0.05632316  0.5681213 1.454308     216
## [34] {Java,                                                                           
##       JavaScript,                                                                     
##       SQL,                                                                            
##       TypeScript}               => {HTML/CSS}   0.06433326  0.8114054 1.523965     220
## [35] {Java,                                                                           
##       JavaScript,                                                                     
##       SQL}                      => {HTML/CSS}   0.11131513  0.7892596 1.482371     235
## [36] {Java,                                                                           
##       JavaScript}               => {HTML/CSS}   0.15865274  0.7342007 1.378961     238
## [37] {Bash/Shell (all shells),                                                        
##       JavaScript,                                                                     
##       SQL,                                                                            
##       TypeScript}               => {Python}     0.05584118  0.6619508 1.336540     245
## [38] {Bash/Shell (all shells),                                                        
##       JavaScript,                                                                     
##       Python,                                                                         
##       SQL}                      => {TypeScript} 0.05584118  0.5778411 1.479189     245
## [39] {Bash/Shell (all shells),                                                        
##       HTML/CSS,                                                                       
##       JavaScript,                                                                     
##       TypeScript}               => {Python}     0.06457425  0.6341711 1.280450     246
## [40] {Bash/Shell (all shells),                                                        
##       HTML/CSS,                                                                       
##       JavaScript,                                                                     
##       Python}                   => {TypeScript} 0.06457425  0.5770690 1.477213     246
## [41] {Bash/Shell (all shells),                                                        
##       JavaScript,                                                                     
##       TypeScript}               => {Python}     0.07925178  0.6277041 1.267393     247
## [42] {Bash/Shell (all shells),                                                        
##       JavaScript,                                                                     
##       Python}                   => {TypeScript} 0.07925178  0.5555913 1.422233     247
## [43] {Bash/Shell (all shells),                                                        
##       HTML/CSS,                                                                       
##       TypeScript}               => {Python}     0.06800551  0.6306939 1.273429     248
## [44] {Bash/Shell (all shells),                                                        
##       HTML/CSS,                                                                       
##       Python}                   => {TypeScript} 0.06800551  0.5399052 1.382079     248
## [45] {Bash/Shell (all shells),                                                        
##       SQL,                                                                            
##       TypeScript}               => {Python}     0.05978885  0.6590765 1.330737     249
## [46] {Bash/Shell (all shells),                                                        
##       TypeScript}               => {Python}     0.08689465  0.6231586 1.258215     257
## [47] {Bash/Shell (all shells),                                                        
##       JavaScript,                                                                     
##       Python,                                                                         
##       SQL}                      => {HTML/CSS}   0.07957310  0.8234176 1.546526     258
## [48] {Bash/Shell (all shells),                                                        
##       HTML/CSS,                                                                       
##       JavaScript,                                                                     
##       SQL}                      => {Python}     0.07957310  0.6622732 1.337191     258
## [49] {Bash/Shell (all shells),                                                        
##       Python,                                                                         
##       SQL}                      => {JavaScript} 0.09663759  0.7720730 1.207633     259
## [50] {Bash/Shell (all shells),                                                        
##       JavaScript,                                                                     
##       SQL}                      => {Python}     0.09663759  0.6623407 1.337327     259
## [51] {Bash/Shell (all shells),                                                        
##       Python,                                                                         
##       SQL}                      => {HTML/CSS}   0.08703236  0.6953333 1.305960     260
## [52] {Bash/Shell (all shells),                                                        
##       HTML/CSS,                                                                       
##       SQL}                      => {Python}     0.08703236  0.6616069 1.335846     260
## [53] {Bash/Shell (all shells),                                                        
##       HTML/CSS,                                                                       
##       Python}                   => {JavaScript} 0.11190039  0.8883929 1.389574     261
## [54] {Bash/Shell (all shells),                                                        
##       HTML/CSS,                                                                       
##       JavaScript}               => {Python}     0.11190039  0.6429513 1.298178     261
## [55] {Bash/Shell (all shells),                                                        
##       Python}                   => {JavaScript} 0.14264402  0.6702615 1.048385     262
## [56] {Bash/Shell (all shells),                                                        
##       JavaScript}               => {Python}     0.14264402  0.6413498 1.294945     262
## [57] {Bash/Shell (all shells),                                                        
##       Python}                   => {HTML/CSS}   0.12595823  0.5918576 1.111615     263
## [58] {Bash/Shell (all shells),                                                        
##       HTML/CSS}                 => {Python}     0.12595823  0.6454952 1.303315     263
## [59] {HTML/CSS,                                                                       
##       JavaScript,                                                                     
##       Python}                   => {TypeScript} 0.12836814  0.5454723 1.396330     277
## [60] {JavaScript,                                                                     
##       Python}                   => {TypeScript} 0.16470048  0.5249643 1.343832     278
## [61] {HTML/CSS,                                                                       
##       Python}                   => {TypeScript} 0.13547165  0.5094071 1.304008     279
## [62] {SQL,                                                                            
##       TypeScript}               => {Python}     0.11483819  0.5289392 1.067977     280
## [63] {Python,                                                                         
##       SQL}                      => {JavaScript} 0.19318338  0.7462872 1.167300     290
## [64] {JavaScript,                                                                     
##       SQL}                      => {Python}     0.19318338  0.5282083 1.066502     290
## [65] {HTML/CSS,                                                                       
##       Python}                   => {JavaScript} 0.23533395  0.8849141 1.384133     292
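
As a reading aid for the listing above, here is a minimal sketch of how the lift column relates to the confidence column. It assumes the standard definition lift = confidence / support(RHS). The variable names are illustrative and not part of the original analysis; the numbers are copied from rules [65] and [55].

```r
# Values copied from the rule listing above (variable names are hypothetical).
conf_65 <- 0.8849141; lift_65 <- 1.384133  # [65] {HTML/CSS, Python} => {JavaScript}
conf_55 <- 0.6702615; lift_55 <- 1.048385  # [55] {Bash/Shell (all shells), Python} => {JavaScript}

# Assuming lift = confidence / support(rhs), the baseline share of the
# right-hand side can be recovered as confidence / lift.
supp_js_from_65 <- conf_65 / lift_65
supp_js_from_55 <- conf_55 / lift_55

# Both rules share the same right-hand side, so the two estimates of the
# JavaScript baseline share should agree up to rounding in the printed output.
stopifnot(abs(supp_js_from_65 - supp_js_from_55) < 1e-4)

print(round(c(rule_65 = supp_js_from_65, rule_55 = supp_js_from_55), 3))
```

Under that definition, a lift of about 1.38 for rule [65] means that respondents who report both HTML/CSS and Python are roughly 1.38 times as likely to report JavaScript as a respondent picked at random, while the lift of about 1.05 for rule [55] indicates an association only slightly above the baseline.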

Conclusions

As we can see, association rules capture reasonably well which programming languages developers use together. Both the Apriori and ECLAT algorithms were able to capture relationships between the languages. The relations among web development tools were the most visible, which can be attributed to the popularity of JavaScript, but some data-related and object-oriented programming relationships were also visible. The obtained rules provide a good source of information about developers' knowledge of different programming languages and about the groups these languages form, depending on the developers' main work focus.

However, we need to remember that the data was collected through a public survey, so some of the observations may not be representative. The survey also did not collect data on knowledge or experience level for each language, so respondents may have selected languages that they did not in fact know well.

Regardless, we can conclude that association rules can be used effectively to analyse the relationships between programming languages in the different sectors of software development.