Unsupervised Learning Association Rules Assignment

Si Tang Lin

Introduction

Traditional Chinese medicine (TCM) is a vital component of traditional medical systems, with origins dating back thousands of years. It plays multiple roles, including disease treatment, health preservation, and disease prevention. TCM remedies are derived primarily from natural plant parts such as roots, fruits, and leaves, along with a small proportion of minerals. Various combinations of medicinal herbs can be decocted to treat corresponding symptoms. By applying association rules, we can uncover the compatibility patterns of medicinal herbs in TCM formulas (the specific effects of herb combinations) and identify herb pairs that are highly associated. The data come from the regular procurement records of a relative’s TCM clinic. This study assumes that the quantity of medicinal herbs ordered equals the quantity consumed, which in turn reflects the usage rate of the herbs. Association rule analysis is conducted on the basis of this assumption. In addition, the effects of the medicinal herbs were compared against a traditional Chinese medicine formula website for further insight.

Because the translated names of the herbs are hard to read, I collected images from the internet to aid in visualization.

Prerequisite

# Load required packages
library(arules)
library(arulesViz)

Load Dataset and Key Insight

Data resources:

The dataset consists of order records spanning three seasons from a relative’s Chinese medicine clinic.

# Load the dataset in basket format (one comma-separated transaction per line)
tcm <- suppressWarnings(read.transactions("/Users/ninalin/Desktop/DS/usl/Traditional Chinese Medicine.csv", sep = ","))
# Check the dataset
summary(tcm)
## transactions as itemMatrix in sparse format with
##  927 rows (elements/itemsets/transactions) and
##  23 columns (items) and a density of 0.2511139 
## 
## most frequent items:
##          Radix Astragali       Rhizoma Chuanxiong       Radix Glycyrrhizae 
##                      492                      465                      401 
## Radix Angelicae Sinensis      Radix Paeoniae Alba                  (Other) 
##                      383                      367                     3246 
## 
## element (itemset/transaction) length distribution:
## sizes
##   1   2   3   4   5   6   7   8   9  10  11  12 
##  63  70  86  90 103 112 126 120  92  43  18   4 
## 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   4.000   6.000   5.776   8.000  12.000 
## 
## includes extended item information - examples:
##                          labels
## 1 Bulbus Fritillariae Cirrhosae
## 2                  Bulbus Lilii
## 3             Flos Chrysanthemi
size(tcm)
##   [1]  4  5 12  8  7  9  9  2  9  5 11  4  8  2  6  5  1  6  6  1  5  3  3  6  9
##  [26]  5  8  7  7  3  7  2  7  8  5  4  2  8  1  4  2  7  5  2  2  9  2  4  8  3
##  [51]  5 10  7  5  7  6  9 11  7  6  4  1  7  6  6  9  8  5  4  1  9  2 10  7  7
##  [76]  5  4  4  3  5  4  7  5  9  2  3  8  7  5  8 10  8  6  8  4 10  1  9  8  4
## [101]  7  4  1 11  5  1  8  3  7  5  4  7  1  5 12  6  4  5  4  1  9  8  7  8  6
## [126]  6  9  6  6  6  8  2  4  1  8  8  8  3  9  4  9  9  7  2  4  8  8  9  3 11
## [151] 10  7  7  6  6  8  2  7  5  3  1  5  8  8  7  3  5  4  9  3  7  2  7  7  6
## [176]  4  4  5  9  5  5  7  4 10  8  5 10  4  3  5  3  5  2  8  2  5  9  6  3  7
## [201] 10  3  8  5  3 10  7  9  3  7 10  8  8  5  5  8  8 10  3  1  8  8  8  2  6
## [226] 10  5  5 11  5 11  5  9  7 10  7 10  5  9  4  8  5  6  8  5  7  8  4  8  3
## [251]  1 12  3  3  5  8  1  3  2  7  5  5  2  7  9  8  7  4  8  9  8  9 10  5  8
## [276]  5  9 10  8  8  8  4  8  3  6  8  1  1  8  8  8  9  6  8  7  8 10  7  6  6
## [301]  1  3  7  4  6  8  6  4  8  6  7  3  6  3  2  3  8  7  9  3  3  1  2  4  5
## [326]  3  1  6  8  7  8  2  8 10  6  8  1  6  9  2  6  3  2  9  7  2  5  5  7  7
## [351] 11  7  9  4  3  8  6  2  3  9  2  6  9  5  5  6  3  9  9  2  9  5  6  2  7
## [376]  9  6  8  3  6  9  6  8  1  1  7  9  3  4  4  7  2  3  4 11 10  6 10  2  3
## [401]  3  7  3  7  4  2  6  5  1  4  9 10  5  3  2  2  8  5  7  6  3  5  7 10  6
## [426]  7  3  3  6  8  4  3  8  1  7  7  6  8  5  4  9  2  8  2 11  2  9  8  7  6
## [451]  3  6  8  9  6  7  4  5  4  8  6  9  4  9  5  9  5  6  8  1  7  4  7 11  9
## [476]  7  7  9  1  2  2  1  8  3  1  6  7  7  6  7  5  7  5  4  1  6  6  2  8  8
## [501]  9  4  5  8  3  6  8  2  1  6  8  7  7  3  7  6  4  8  4  8  1  4  5  5  3
## [526]  2  9  1  5  3  5  5  1  9  8  7  6  7  7  8  4  3  7  1 10  6  2  9  3  4
## [551]  7  7  5  9  3  9  4  5  2  5  3  7  7  6 10  3 10  2  6  8  7  3  5  3  9
## [576]  9  4  6  6  8  4  2  4  7  1  1  3  9  4  7  6  5  2  2  5  6  8  7  4  4
## [601]  7  7  1  8  6  6 10  6  3  6  8  8  5  7  5 10  4  1  5  2  8  5  7 10 11
## [626]  1  3  2  3  9  8  5  1  6  6  2 10  8  7  7 10  6  5  6  8  9  7 10  3 11
## [651]  9  6  8  2  3  4  4  9  4  7  9  3  1  4  4  7  1  6  4  6 10  8  2  1 10
## [676]  5  8  8  2  6  4  9  4  7  9  6  6  4  1  6  1  7  3  5  2  8  4  5  8  6
## [701]  5  9 10  7  5  8  1  8  1  8  7  6  7  3  5  6  9  6 10  3  9  5  9  3  4
## [726] 11  1  9  6  2  7  9  6  8 11  4  9 10  9  7  3  9  7  3  4 11  7  4  6  2
## [751]  2  6  3  5  4  4  8  3  3  2  7  6  7  6  6  8  5  9  6  9  1  5  7  4  6
## [776]  7  9  2  2  3  4  8  9  1  4  7  7  4  7  9  3  1  4  2 10  9  5 10  1  8
## [801]  3 10 10  8  5  4  7  9  2  7  6  5  7  4  4  5  9  4  9 12  4  2  6  8  5
## [826]  5  7  8  7 11  5  1  2  8  6  4  6  1  5  2 10  2  7  5  9  1  5  3  1  5
## [851]  9  4  7  3  6  5  5  1  1  7  3  3 11  6  8  2  7  5  3  9  1  1  8  1  5
## [876]  6 10  8  6  4  7  7 10  6  6  8  3  4  4  7  2  6  7  2  6  6  6  9  8 11
## [901]  4  8  7  8  9  7  7  8  9  9  6  1  5  6  7  7  3  6  9  7  3  9  8  9  6
## [926]  8  7
length(tcm) 
## [1] 927

The dataset contains 927 transaction records, and each transaction contains between 1 and 12 different medicinal herbs (its size).
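
The density reported by summary() can be cross-checked from the length distribution: it is the share of filled cells in the 927 x 23 transaction-by-item matrix. The following is a minimal base R sketch, with the counts copied by hand from the summary output above; the variable names are illustrative and not part of the original analysis.

```r
# Length distribution copied from the summary(tcm) output above
sizes  <- 1:12
counts <- c(63, 70, 86, 90, 103, 112, 126, 120, 92, 43, 18, 4)

n_transactions <- sum(counts)          # number of transactions
n_items        <- 23                   # number of distinct herbs
total_entries  <- sum(sizes * counts)  # herb occurrences over all transactions

mean_size <- total_entries / n_transactions
density   <- total_entries / (n_transactions * n_items)

print(round(mean_size, 3))  # mean transaction length
print(round(density, 7))    # share of filled cells in the sparse matrix
```

Both values should agree with the mean (5.776) and the density (0.2511139) printed by summary(tcm).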

Herbs Usage Frequency

# Check how often each medicinal herb appears in the transactions.
itemFrequency(tcm, type="relative")
## Bulbus Fritillariae Cirrhosae                  Bulbus Lilii 
##                    0.32038835                    0.14886731 
##             Flos Chrysanthemi              Fructus Crataegi 
##                    0.33225458                    0.34627832 
##               Fructus Jujubae                 Fructus Lycii 
##                    0.19848975                    0.19633225 
##                     Ganoderma Pericarpium Citri Reticulatae 
##                    0.12189860                    0.24919094 
##            Plumula Nelumbinis     Radix Angelicae Dahuricae 
##                    0.01725998                    0.37216828 
##      Radix Angelicae Sinensis               Radix Astragali 
##                    0.41316073                    0.53074434 
##             Radix Aucklandiae                 Radix Ginseng 
##                    0.03667745                    0.14670982 
##            Radix Glycyrrhizae           Radix Paeoniae Alba 
##                    0.43257821                    0.39590076 
##    Radix Rehmanniae Preparata             Rhizoma Bletillae 
##                    0.33764833                    0.03667745 
##            Rhizoma Chuanxiong              Rhizoma Coptidis 
##                    0.50161812                    0.23516721 
##            Rhizoma Gastrodiae                 Semen Cassiae 
##                    0.06796117                    0.26213592 
##   Semen Euphorbiae Lathyridis 
##                    0.07551241
itemFrequency(tcm, type="absolute")
## Bulbus Fritillariae Cirrhosae                  Bulbus Lilii 
##                           297                           138 
##             Flos Chrysanthemi              Fructus Crataegi 
##                           308                           321 
##               Fructus Jujubae                 Fructus Lycii 
##                           184                           182 
##                     Ganoderma Pericarpium Citri Reticulatae 
##                           113                           231 
##            Plumula Nelumbinis     Radix Angelicae Dahuricae 
##                            16                           345 
##      Radix Angelicae Sinensis               Radix Astragali 
##                           383                           492 
##             Radix Aucklandiae                 Radix Ginseng 
##                            34                           136 
##            Radix Glycyrrhizae           Radix Paeoniae Alba 
##                           401                           367 
##    Radix Rehmanniae Preparata             Rhizoma Bletillae 
##                           313                            34 
##            Rhizoma Chuanxiong              Rhizoma Coptidis 
##                           465                           218 
##            Rhizoma Gastrodiae                 Semen Cassiae 
##                            63                           243 
##   Semen Euphorbiae Lathyridis 
##                            70
itemFrequencyPlot(tcm, topN=15, type="absolute", main="Herbs Frequency") 

Radix Astragali appears most frequently (492 transactions), followed by Rhizoma Chuanxiong (465). Upon investigation, these two herbs are among those most commonly combined with other medicinal herbs in TCM formulas. Plumula Nelumbinis, on the other hand, appears least frequently (16 times), which is attributed to its low usage rate in pharmacies and to the fact that it works effectively only when paired with a limited number of herbs.

Transaction Insight

# Inspect the first 4 transactions
inspect(tcm[1:4])
##     items                           
## [1] {Bulbus Lilii,                  
##      Radix Astragali,               
##      Radix Glycyrrhizae,            
##      Radix Paeoniae Alba}           
## [2] {Fructus Crataegi,              
##      Radix Angelicae Dahuricae,     
##      Radix Angelicae Sinensis,      
##      Radix Astragali,               
##      Rhizoma Chuanxiong}            
## [3] {Fructus Crataegi,              
##      Fructus Jujubae,               
##      Fructus Lycii,                 
##      Radix Angelicae Dahuricae,     
##      Radix Astragali,               
##      Radix Aucklandiae,             
##      Radix Ginseng,                 
##      Radix Glycyrrhizae,            
##      Radix Paeoniae Alba,           
##      Radix Rehmanniae Preparata,    
##      Rhizoma Chuanxiong,            
##      Rhizoma Coptidis}              
## [4] {Flos Chrysanthemi,             
##      Fructus Lycii,                 
##      Pericarpium Citri Reticulatae, 
##      Radix Angelicae Dahuricae,     
##      Radix Astragali,               
##      Radix Paeoniae Alba,           
##      Radix Rehmanniae Preparata,    
##      Rhizoma Chuanxiong}
# Support of the first five herbs
itemFrequency(tcm[, 1:5])
## Bulbus Fritillariae Cirrhosae                  Bulbus Lilii 
##                     0.3203883                     0.1488673 
##             Flos Chrysanthemi              Fructus Crataegi 
##                     0.3322546                     0.3462783 
##               Fructus Jujubae 
##                     0.1984898

Bulbus Fritillariae Cirrhosae appears in 32% of transactions, while Bulbus Lilii accounts for only about 15%.
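
The relative frequencies (supports) are simply the absolute counts divided by the number of transactions. As a small sketch of that relationship, the base R snippet below recomputes a few of them from counts copied out of the absolute frequency table above; the vector name `abs_count` is a placeholder chosen for illustration.

```r
# Absolute counts copied from itemFrequency(tcm, type = "absolute") above
abs_count <- c("Bulbus Fritillariae Cirrhosae" = 297,
               "Bulbus Lilii"                  = 138,
               "Radix Astragali"               = 492,
               "Plumula Nelumbinis"            = 16)
n_transactions <- 927

# Relative frequency (support) = absolute count / number of transactions
support <- abs_count / n_transactions
print(round(100 * support, 1))  # supports expressed as percentages
```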

#Top 15 items
itemFrequencyPlot(tcm, topN = 15)

The graph presents the 15 most frequent herbs out of the 23 items.

# Visualize the sparse matrix for the first 5 transactions
image(tcm[1:5])

Each row represents a transaction, and each column corresponds to a specific herb. A dark cell indicates that a particular herb was purchased in the corresponding transaction. The 10th item is one of the most frequent in the first 5 transactions.

# Visualize a random sample of 200 transactions
image(sample(tcm, 200))

A random sample of 200 of the 927 transactions is visualized (the dataset is relatively small). As before, a dark cell indicates that a particular herb was purchased in the corresponding transaction.

Eclat Method

# Use the same support threshold (0.2) as the Apriori run below, for comparison
freq.herbs <- eclat(tcm, parameter = list(supp = 0.2, maxlen = 12))
## Eclat
## 
## parameter specification:
##  tidLists support minlen maxlen            target  ext
##     FALSE     0.2      1     12 frequent itemsets TRUE
## 
## algorithmic control:
##  sparse sort verbose
##       7   -2    TRUE
## 
## Absolute minimum support count: 185 
## 
## create itemset ... 
## set transactions ...[23 item(s), 927 transaction(s)] done [0.00s].
## sorting and recoding items ... [13 item(s)] done [0.00s].
## creating bit matrix ... [13 row(s), 927 column(s)] done [0.00s].
## writing  ... [24 set(s)] done [0.00s].
## Creating S4 object  ... done [0.00s].

The Eclat method is used to mine the frequent itemsets. All 23 items and 927 transactions were analyzed; applying a 20% support threshold retained 13 items and produced 24 frequent itemsets.
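
The count of 13 retained items can be reproduced directly from the item frequencies. The sketch below is base R only and uses the relative frequencies from the Herbs Usage Frequency section, rounded to four decimals by hand; `item_support` and `min_support` are illustrative names, not objects from the original analysis.

```r
# Relative item frequencies copied (rounded) from itemFrequency(tcm) above
item_support <- c(
  "Bulbus Fritillariae Cirrhosae" = 0.3204, "Bulbus Lilii"                  = 0.1489,
  "Flos Chrysanthemi"             = 0.3323, "Fructus Crataegi"              = 0.3463,
  "Fructus Jujubae"               = 0.1985, "Fructus Lycii"                 = 0.1963,
  "Ganoderma"                     = 0.1219, "Pericarpium Citri Reticulatae" = 0.2492,
  "Plumula Nelumbinis"            = 0.0173, "Radix Angelicae Dahuricae"     = 0.3722,
  "Radix Angelicae Sinensis"      = 0.4132, "Radix Astragali"               = 0.5307,
  "Radix Aucklandiae"             = 0.0367, "Radix Ginseng"                 = 0.1467,
  "Radix Glycyrrhizae"            = 0.4326, "Radix Paeoniae Alba"           = 0.3959,
  "Radix Rehmanniae Preparata"    = 0.3376, "Rhizoma Bletillae"             = 0.0367,
  "Rhizoma Chuanxiong"            = 0.5016, "Rhizoma Coptidis"              = 0.2352,
  "Rhizoma Gastrodiae"            = 0.0680, "Semen Cassiae"                 = 0.2621,
  "Semen Euphorbiae Lathyridis"   = 0.0755
)
min_support <- 0.2

# Single herbs that pass the support threshold
frequent_items <- names(item_support)[item_support >= min_support]
print(length(frequent_items))

# Eclat reported 24 frequent itemsets in total, so the remainder
# must be itemsets containing two or more herbs
print(24 - length(frequent_items))
```

Thirteen single herbs pass the threshold, so the remaining 11 of the 24 frequent itemsets are combinations of herbs.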

Apriori Method

One-Dimensional Association

# Set the minimum support to 35% to show the six most frequent herbs in the data.
itemFrequencyPlot(tcm, support = 0.35)

tcmrules<- apriori(tcm, parameter = list(support = 0.2, confidence =
0.50, minlen = 2)) 
## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.5    0.1    1 none FALSE            TRUE       5     0.2      2
##  maxlen target  ext
##      10  rules TRUE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 185 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[23 item(s), 927 transaction(s)] done [0.00s].
## sorting and recoding items ... [13 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 done [0.00s].
## writing ... [12 rule(s)] done [0.00s].
## creating S4 object  ... done [0.00s].
inspect(tcmrules[1:5]) 
##     lhs                             rhs                  support   confidence
## [1] {Flos Chrysanthemi}          => {Radix Astragali}    0.2006472 0.6038961 
## [2] {Radix Rehmanniae Preparata} => {Rhizoma Chuanxiong} 0.2038835 0.6038339 
## [3] {Radix Rehmanniae Preparata} => {Radix Astragali}    0.2038835 0.6038339 
## [4] {Radix Angelicae Dahuricae}  => {Radix Astragali}    0.2103560 0.5652174 
## [5] {Radix Paeoniae Alba}        => {Rhizoma Chuanxiong} 0.2200647 0.5558583 
##     coverage  lift     count
## [1] 0.3322546 1.137829 186  
## [2] 0.3376483 1.203772 189  
## [3] 0.3376483 1.137711 189  
## [4] 0.3721683 1.064952 195  
## [5] 0.3959008 1.108130 204

I want to identify strong associations among the herbs using a minimum support of 20% and a minimum confidence of 50%.

Transactions that include both Flos Chrysanthemi and Radix Astragali account for 20.06% of the total, and 60.39% of the transactions containing Flos Chrysanthemi also contain Radix Astragali (the confidence). This indicates that the two herbs are commonly paired; the combination is believed to help relieve fatigue and prevent colds.
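
To make the three measures concrete, the following base R sketch recomputes support, confidence, and lift for the rule {Flos Chrysanthemi} => {Radix Astragali} from raw counts. The counts are copied from the frequency table and the rule output above; the variable names are illustrative.

```r
# Counts taken from the output above
n_total <- 927  # all transactions
n_lhs   <- 308  # transactions containing Flos Chrysanthemi
n_rhs   <- 492  # transactions containing Radix Astragali
n_both  <- 186  # transactions containing both (the rule's "count")

support    <- n_both / n_total               # P(lhs and rhs)
confidence <- n_both / n_lhs                 # P(rhs | lhs)
lift       <- confidence / (n_rhs / n_total) # P(rhs | lhs) / P(rhs)

print(round(c(support = support, confidence = confidence, lift = lift), 4))
```

A lift only slightly above 1 means the pair co-occurs a little more often than two independent herbs with the same individual frequencies would.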

#Reorder the tcmrules on different metrics
inspect(sort(tcmrules, by = "lift")[1:5]) 
##     lhs                             rhs                  support   confidence
## [1] {Radix Rehmanniae Preparata} => {Rhizoma Chuanxiong} 0.2038835 0.6038339 
## [2] {Flos Chrysanthemi}          => {Radix Astragali}    0.2006472 0.6038961 
## [3] {Radix Rehmanniae Preparata} => {Radix Astragali}    0.2038835 0.6038339 
## [4] {Rhizoma Chuanxiong}         => {Radix Astragali}    0.3009709 0.6000000 
## [5] {Radix Astragali}            => {Rhizoma Chuanxiong} 0.3009709 0.5670732 
##     coverage  lift     count
## [1] 0.3376483 1.203772 189  
## [2] 0.3322546 1.137829 186  
## [3] 0.3376483 1.137711 189  
## [4] 0.5016181 1.130488 279  
## [5] 0.5307443 1.130488 279
inspect(sort(tcmrules, by ="confidence")[1:5]) 
##     lhs                             rhs                  support   confidence
## [1] {Flos Chrysanthemi}          => {Radix Astragali}    0.2006472 0.6038961 
## [2] {Radix Rehmanniae Preparata} => {Rhizoma Chuanxiong} 0.2038835 0.6038339 
## [3] {Radix Rehmanniae Preparata} => {Radix Astragali}    0.2038835 0.6038339 
## [4] {Rhizoma Chuanxiong}         => {Radix Astragali}    0.3009709 0.6000000 
## [5] {Radix Astragali}            => {Rhizoma Chuanxiong} 0.3009709 0.5670732 
##     coverage  lift     count
## [1] 0.3322546 1.137829 186  
## [2] 0.3376483 1.203772 189  
## [3] 0.3376483 1.137711 189  
## [4] 0.5016181 1.130488 279  
## [5] 0.5307443 1.130488 279
inspect(sort(tcmrules, by = "support")[1:5])
##     lhs                      rhs                  support   confidence
## [1] {Rhizoma Chuanxiong}  => {Radix Astragali}    0.3009709 0.6000000 
## [2] {Radix Astragali}     => {Rhizoma Chuanxiong} 0.3009709 0.5670732 
## [3] {Radix Glycyrrhizae}  => {Rhizoma Chuanxiong} 0.2351672 0.5436409 
## [4] {Radix Glycyrrhizae}  => {Radix Astragali}    0.2308522 0.5336658 
## [5] {Radix Paeoniae Alba} => {Radix Astragali}    0.2233010 0.5640327 
##     coverage  lift     count
## [1] 0.5016181 1.130488 279  
## [2] 0.5307443 1.130488 279  
## [3] 0.4325782 1.083774 218  
## [4] 0.4325782 1.005505 214  
## [5] 0.3959008 1.062720 207
inspect(sort(tcmrules, by = "count")[1:5])
##     lhs                      rhs                  support   confidence
## [1] {Rhizoma Chuanxiong}  => {Radix Astragali}    0.3009709 0.6000000 
## [2] {Radix Astragali}     => {Rhizoma Chuanxiong} 0.3009709 0.5670732 
## [3] {Radix Glycyrrhizae}  => {Rhizoma Chuanxiong} 0.2351672 0.5436409 
## [4] {Radix Glycyrrhizae}  => {Radix Astragali}    0.2308522 0.5336658 
## [5] {Radix Paeoniae Alba} => {Radix Astragali}    0.2233010 0.5640327 
##     coverage  lift     count
## [1] 0.5016181 1.130488 279  
## [2] 0.5307443 1.130488 279  
## [3] 0.4325782 1.083774 218  
## [4] 0.4325782 1.005505 214  
## [5] 0.3959008 1.062720 207
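
The two rules linking Rhizoma Chuanxiong and Radix Astragali share the same support, count, and lift but differ in confidence. This is because confidence depends on the direction of the rule, while lift is symmetric. A short base R sketch using the counts shown above (variable names are illustrative):

```r
n_total      <- 927  # all transactions
n_chuanxiong <- 465  # transactions containing Rhizoma Chuanxiong
n_astragali  <- 492  # transactions containing Radix Astragali
n_both       <- 279  # transactions containing both herbs

# Confidence divides by the count of the antecedent, so it is directional
conf_c_to_a <- n_both / n_chuanxiong  # {Rhizoma Chuanxiong} => {Radix Astragali}
conf_a_to_c <- n_both / n_astragali   # {Radix Astragali} => {Rhizoma Chuanxiong}

# Lift divides the joint support by both marginal supports, so it is symmetric
lift_both <- (n_both / n_total) /
  ((n_chuanxiong / n_total) * (n_astragali / n_total))

print(round(c(conf_c_to_a = conf_c_to_a,
              conf_a_to_c = conf_a_to_c,
              lift        = lift_both), 4))
```

This is also why sorting by support and sorting by count give the same ordering: count is support multiplied by the fixed number of transactions.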

Visualization of the Rules

# Graph plot of the 12 rules
plot(tcmrules, method="graph", measure="support", shading="lift")

Each labeled node represents a herb, and each circle represents a rule that connects its antecedent herb to its consequent herb with arrows. The size of a circle indicates the support of the rule, and its color indicates the lift: darker means higher lift, and lighter means lower lift.

Insight from the graph: Radix Astragali is the key herb in this dataset, with strong associations to multiple herbs. On the other hand, Flos Chrysanthemi is connected to few other herbs, and those associations are weaker.

plot(tcmrules, measure=c("support","lift"), shading="confidence", jitter=0, main="Scatter Plot for TCM rules")

plot(tcmrules, method="paracoord", control=list(reorder=TRUE))

The vertical axis on the left lists the herbs involved in the rules, and the rightmost position represents the rhs (consequent), a single target herb. The color intensity of a line shows its lift, and its thickness reflects its support.

From this plot we can conclude that Flos Chrysanthemi has lower support and lift values, suggesting that it is less commonly paired with Rhizoma Chuanxiong but has a stronger association with Radix Astragali.

plot(tcmrules, method="grouped")

The bigger the dot, the higher the support; the redder the color, the higher the lift.

Rhizoma Chuanxiong and Radix Rehmanniae Preparata have about 20% support (0.204) and may appear together in prescriptions for specific health needs, such as improving circulation. Their lift of 1.20 is the highest in this graph, but it still indicates only a weak positive association.

Two-Dimensional Association

Categorical Type

# Group the variables into six categories based on their types.
root_herbs <- c("Radix Astragali", "Rhizoma Chuanxiong",
                "Radix Glycyrrhizae", "Radix Angelicae Sinensis", "Radix Angelicae Dahuricae", "Radix Paeoniae Alba", "Bulbus Fritillariae Cirrhosae", "Radix Rehmanniae Preparata", "Rhizoma Coptidis", "Rhizoma Gastrodiae", "Rhizoma Bletillae")
fruit_herbs <- c("Semen Euphorbiae Lathyridis", "Plumula Nelumbinis", "Fructus Lycii", "Fructus Jujubae", "Semen Cassiae", "Fructus Crataegi")
flower_herbs <- c("Bulbus Lilii", "Flos Chrysanthemi")
fungi_herbs <- c("Ganoderma")
leaves_herbs <- c("Radix Ginseng")
other_herbs <- c("Pericarpium Citri Reticulatae", "Radix Aucklandiae")

What is the association between fruit herbs and other herbs?

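# Restrict the consequent (rhs) to fruit herbs; any herb may appear in the antecedent
rules.fruitherbsrhs <- apriori(
     data = tcm,
     parameter = list(supp = 0.03, conf = 0.4, minlen = 2),
     appearance = list(default = "lhs", rhs = fruit_herbs),
     control = list(verbose = FALSE))

# sort() orders rules by support by default
inspect(sort(rules.fruitherbsrhs)[1:10])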
##      lhs                                 rhs                   support confidence  coverage     lift count
## [1]  {Bulbus Fritillariae Cirrhosae}  => {Fructus Crataegi} 0.12837109  0.4006734 0.3203883 1.157085   119
## [2]  {Radix Astragali,                                                                                    
##       Rhizoma Chuanxiong}             => {Fructus Crataegi} 0.12189860  0.4050179 0.3009709 1.169631   113
## [3]  {Pericarpium Citri Reticulatae}  => {Fructus Crataegi} 0.10032362  0.4025974 0.2491909 1.162641    93
## [4]  {Rhizoma Coptidis}               => {Fructus Crataegi} 0.09600863  0.4082569 0.2351672 1.178985    89
## [5]  {Radix Angelicae Dahuricae,                                                                          
##       Radix Astragali}                => {Fructus Crataegi} 0.09492988  0.4512821 0.2103560 1.303235    88
## [6]  {Radix Angelicae Dahuricae,                                                                          
##       Rhizoma Chuanxiong}             => {Fructus Crataegi} 0.08953614  0.4486486 0.1995685 1.295630    83
## [7]  {Radix Astragali,                                                                                    
##       Radix Paeoniae Alba}            => {Fructus Crataegi} 0.08953614  0.4009662 0.2233010 1.157930    83
## [8]  {Radix Rehmanniae Preparata,                                                                         
##       Rhizoma Chuanxiong}             => {Fructus Crataegi} 0.08737864  0.4285714 0.2038835 1.237650    81
## [9]  {Bulbus Fritillariae Cirrhosae,                                                                      
##       Radix Astragali}                => {Fructus Crataegi} 0.07982740  0.4596273 0.1736785 1.327335    74
## [10] {Bulbus Fritillariae Cirrhosae,                                                                      
##       Rhizoma Chuanxiong}             => {Fructus Crataegi} 0.07659115  0.4329268 0.1769148 1.250228    71

Because herbs are typically used in combination, the support threshold is lowered to 3% and the confidence threshold to 40% to identify itemsets that are associated with fruit herbs in the purchase orders. Rule [9] shows that, of the transactions containing Bulbus Fritillariae Cirrhosae and Radix Astragali, 45.96% also included Fructus Crataegi. The lift of 1.327 indicates that this combination occurs 1.327 times as often as would be expected if the antecedent and the consequent were independent.

Fruit herbs as the antecedent

# Restrict the antecedent (lhs) to fruit herbs
rules.fruitherbslhs <- apriori(data = tcm,
     parameter = list(supp = 0.03, conf = 0.4, minlen = 2),
     appearance = list(lhs = fruit_herbs),
     control = list(verbose = FALSE))

inspect(sort(rules.fruitherbslhs, by="confidence")[1:10])
##      lhs                    rhs                      support confidence   coverage     lift count
## [1]  {Fructus Jujubae,                                                                           
##       Semen Cassiae}     => {Radix Astragali}     0.03451996  0.6956522 0.04962244 1.310710    32
## [2]  {Fructus Lycii,                                                                             
##       Semen Cassiae}     => {Radix Glycyrrhizae}  0.03775620  0.6481481 0.05825243 1.498337    35
## [3]  {Fructus Crataegi,                                                                          
##       Fructus Lycii}     => {Rhizoma Chuanxiong}  0.05717368  0.6463415 0.08845739 1.288513    53
## [4]  {Fructus Jujubae}   => {Radix Astragali}     0.11866235  0.5978261 0.19848975 1.126392   110
## [5]  {Fructus Crataegi,                                                                          
##       Semen Cassiae}     => {Radix Astragali}     0.05933118  0.5978261 0.09924488 1.126392    55
## [6]  {Fructus Crataegi,                                                                          
##       Fructus Jujubae}   => {Radix Astragali}     0.04314995  0.5970149 0.07227616 1.124863    40
## [7]  {Semen Cassiae}     => {Radix Astragali}     0.15533981  0.5925926 0.26213592 1.116531   144
## [8]  {Fructus Lycii}     => {Rhizoma Chuanxiong}  0.11542611  0.5879121 0.19633225 1.172031   107
## [9]  {Fructus Crataegi,                                                                          
##       Fructus Lycii}     => {Radix Astragali}     0.05177994  0.5853659 0.08845739 1.102915    48
## [10] {Fructus Crataegi,                                                                          
##       Fructus Jujubae}   => {Radix Paeoniae Alba} 0.04207120  0.5820896 0.07227616 1.470292    39

Fruit herbs are mostly paired with root herbs, plausibly because their functions are complementary. In TCM, fruit herbs are considered effective in moistening dryness and replenishing deficiencies, while root herbs provide a stable, tonifying base, so pairing them is thought to harmonize the properties of the medicine.

Radix Paeoniae Alba as the target herb

RadixPaeoniaeAlba.rules<-subset(tcmrules, items %in% "Radix Paeoniae Alba")

rules.RPAconf<-sort(RadixPaeoniaeAlba.rules, by="confidence", decreasing=TRUE)
rules.RPAlift<-sort(RadixPaeoniaeAlba.rules, by="lift", decreasing=TRUE)

inspect(sort(rules.RPAconf))
##     lhs                      rhs                  support   confidence
## [1] {Radix Paeoniae Alba} => {Radix Astragali}    0.2233010 0.5640327 
## [2] {Radix Paeoniae Alba} => {Rhizoma Chuanxiong} 0.2200647 0.5558583 
##     coverage  lift    count
## [1] 0.3959008 1.06272 207  
## [2] 0.3959008 1.10813 204

Radix Paeoniae Alba is associated with both Rhizoma Chuanxiong and Radix Astragali, with an average confidence of about 56% and lift values only slightly above 1. In terms of medical treatment, Radix Paeoniae Alba and Rhizoma Chuanxiong are components of Si Wu soup, which is used to replenish blood and to treat irregular menstruation in women.

Jaccard Index
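
The Jaccard dissimilarity between two herbs is one minus the number of transactions containing both herbs divided by the number containing at least one of them. As a minimal sketch, the base R function below (a hypothetical helper, not part of arules) computes it from counts that already appear in the frequency table and the Apriori output earlier in this report, so the results can be compared with the corresponding cells of the dissimilarity matrix.

```r
# Jaccard dissimilarity between two herbs from their transaction counts
# (illustrative helper; arules::dissimilarity() works on the item matrix instead)
jaccard_dissimilarity <- function(n_a, n_b, n_both) {
  n_union <- n_a + n_b - n_both  # transactions containing at least one herb
  1 - n_both / n_union
}

# Flos Chrysanthemi (308) vs Radix Astragali (492), 186 shared transactions
d_flos_astragali <- jaccard_dissimilarity(308, 492, 186)

# Radix Paeoniae Alba (367) vs Radix Astragali (492), 207 shared transactions
d_paeoniae_astragali <- jaccard_dissimilarity(367, 492, 207)

print(round(c(d_flos_astragali, d_paeoniae_astragali), 2))
```

Values close to 1 mean two herbs rarely share a transaction; values close to 0 mean they almost always appear together.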

# Use the Jaccard method to check the dissimilarity between herbs (items with support >= 5%)
value.tcm<-tcm[, itemFrequency(tcm)>=0.05]
jaccard.dissimilarity<-dissimilarity(value.tcm, which="items")
round(jaccard.dissimilarity,2)
##                               Bulbus Fritillariae Cirrhosae Bulbus Lilii
## Bulbus Lilii                                           0.89             
## Flos Chrysanthemi                                      0.79         0.88
## Fructus Crataegi                                       0.76         0.85
## Fructus Jujubae                                        0.82         0.91
## Fructus Lycii                                          0.85         0.84
## Ganoderma                                              0.88         0.91
## Pericarpium Citri Reticulatae                          0.79         0.90
## Radix Angelicae Dahuricae                              0.77         0.87
## Radix Angelicae Sinensis                               0.75         0.87
## Radix Astragali                                        0.74         0.88
## Radix Ginseng                                          0.88         0.91
## Radix Glycyrrhizae                                     0.73         0.84
## Radix Paeoniae Alba                                    0.76         0.84
## Radix Rehmanniae Preparata                             0.79         0.85
## Rhizoma Chuanxiong                                     0.73         0.84
## Rhizoma Coptidis                                       0.81         0.89
## Rhizoma Gastrodiae                                     0.94         0.93
## Semen Cassiae                                          0.83         0.90
## Semen Euphorbiae Lathyridis                            0.94         0.92
##                               Flos Chrysanthemi Fructus Crataegi
## Bulbus Lilii                                                    
## Flos Chrysanthemi                                               
## Fructus Crataegi                           0.79                 
## Fructus Jujubae                            0.85             0.85
## Fructus Lycii                              0.85             0.81
## Ganoderma                                  0.86             0.88
## Pericarpium Citri Reticulatae              0.81             0.80
## Radix Angelicae Dahuricae                  0.77             0.75
## Radix Angelicae Sinensis                   0.75             0.77
## Radix Astragali                            0.70             0.71
## Radix Ginseng                              0.89             0.86
## Radix Glycyrrhizae                         0.76             0.74
## Radix Paeoniae Alba                        0.77             0.73
## Radix Rehmanniae Preparata                 0.79             0.77
## Rhizoma Chuanxiong                         0.73             0.70
## Rhizoma Coptidis                           0.83             0.80
## Rhizoma Gastrodiae                         0.94             0.91
## Semen Cassiae                              0.83             0.81
## Semen Euphorbiae Lathyridis                0.93             0.94
##                               Fructus Jujubae Fructus Lycii Ganoderma
## Bulbus Lilii                                                         
## Flos Chrysanthemi                                                    
## Fructus Crataegi                                                     
## Fructus Jujubae                                                      
## Fructus Lycii                            0.88                        
## Ganoderma                                0.91          0.89          
## Pericarpium Citri Reticulatae            0.85          0.87      0.90
## Radix Angelicae Dahuricae                0.82          0.86      0.88
## Radix Angelicae Sinensis                 0.83          0.83      0.88
## Radix Astragali                          0.81          0.82      0.88
## Radix Ginseng                            0.89          0.90      0.90
## Radix Glycyrrhizae                       0.82          0.81      0.86
## Radix Paeoniae Alba                      0.80          0.81      0.88
## Radix Rehmanniae Preparata               0.81          0.81      0.86
## Rhizoma Chuanxiong                       0.80          0.80      0.89
## Rhizoma Coptidis                         0.86          0.85      0.90
## Rhizoma Gastrodiae                       0.94          0.96      0.94
## Semen Cassiae                            0.88          0.85      0.90
## Semen Euphorbiae Lathyridis              0.97          0.94      0.96
##                               Pericarpium Citri Reticulatae
## Bulbus Lilii                                               
## Flos Chrysanthemi                                          
## Fructus Crataegi                                           
## Fructus Jujubae                                            
## Fructus Lycii                                              
## Ganoderma                                                  
## Pericarpium Citri Reticulatae                              
## Radix Angelicae Dahuricae                              0.83
## Radix Angelicae Sinensis                               0.78
## Radix Astragali                                        0.76
## Radix Ginseng                                          0.88
## Radix Glycyrrhizae                                     0.79
## Radix Paeoniae Alba                                    0.82
## Radix Rehmanniae Preparata                             0.81
## Rhizoma Chuanxiong                                     0.77
## Rhizoma Coptidis                                       0.84
## Rhizoma Gastrodiae                                     0.94
## Semen Cassiae                                          0.82
## Semen Euphorbiae Lathyridis                            0.95
##                               Radix Angelicae Dahuricae
## Bulbus Lilii                                           
## Flos Chrysanthemi                                      
## Fructus Crataegi                                       
## Fructus Jujubae                                        
## Fructus Lycii                                          
## Ganoderma                                              
## Pericarpium Citri Reticulatae                          
## Radix Angelicae Dahuricae                              
## Radix Angelicae Sinensis                           0.72
## Radix Astragali                                    0.70
## Radix Ginseng                                      0.86
## Radix Glycyrrhizae                                 0.74
## Radix Paeoniae Alba                                0.74
## Radix Rehmanniae Preparata                         0.76
## Rhizoma Chuanxiong                                 0.70
## Rhizoma Coptidis                                   0.83
## Rhizoma Gastrodiae                                 0.92
## Semen Cassiae                                      0.78
## Semen Euphorbiae Lathyridis                        0.94
##                               Radix Angelicae Sinensis Radix Astragali
## Bulbus Lilii                                                          
## Flos Chrysanthemi                                                     
## Fructus Crataegi                                                      
## Fructus Jujubae                                                       
## Fructus Lycii                                                         
## Ganoderma                                                             
## Pericarpium Citri Reticulatae                                         
## Radix Angelicae Dahuricae                                             
## Radix Angelicae Sinensis                                              
## Radix Astragali                                   0.70                
## Radix Ginseng                                     0.86            0.84
## Radix Glycyrrhizae                                0.72            0.68
## Radix Paeoniae Alba                               0.73            0.68
## Radix Rehmanniae Preparata                        0.75            0.69
## Rhizoma Chuanxiong                                0.68            0.59
## Rhizoma Coptidis                                  0.81            0.75
## Rhizoma Gastrodiae                                0.93            0.93
## Semen Cassiae                                     0.78            0.76
## Semen Euphorbiae Lathyridis                       0.92            0.92
##                               Radix Ginseng Radix Glycyrrhizae
## Bulbus Lilii                                                  
## Flos Chrysanthemi                                             
## Fructus Crataegi                                              
## Fructus Jujubae                                               
## Fructus Lycii                                                 
## Ganoderma                                                     
## Pericarpium Citri Reticulatae                                 
## Radix Angelicae Dahuricae                                     
## Radix Angelicae Sinensis                                      
## Radix Astragali                                               
## Radix Ginseng                                                 
## Radix Glycyrrhizae                     0.88                   
## Radix Paeoniae Alba                    0.85               0.73
## Radix Rehmanniae Preparata             0.87               0.71
## Rhizoma Chuanxiong                     0.86               0.66
## Rhizoma Coptidis                       0.89               0.82
## Rhizoma Gastrodiae                     0.97               0.92
## Semen Cassiae                          0.86               0.77
## Semen Euphorbiae Lathyridis            0.95               0.92
##                               Radix Paeoniae Alba Radix Rehmanniae Preparata
## Bulbus Lilii                                                                
## Flos Chrysanthemi                                                           
## Fructus Crataegi                                                            
## Fructus Jujubae                                                             
## Fructus Lycii                                                               
## Ganoderma                                                                   
## Pericarpium Citri Reticulatae                                               
## Radix Angelicae Dahuricae                                                   
## Radix Angelicae Sinensis                                                    
## Radix Astragali                                                             
## Radix Ginseng                                                               
## Radix Glycyrrhizae                                                          
## Radix Paeoniae Alba                                                         
## Radix Rehmanniae Preparata                   0.73                           
## Rhizoma Chuanxiong                           0.68                       0.68
## Rhizoma Coptidis                             0.82                       0.80
## Rhizoma Gastrodiae                           0.93                       0.93
## Semen Cassiae                                0.76                       0.77
## Semen Euphorbiae Lathyridis                  0.93                       0.92
##                               Rhizoma Chuanxiong Rhizoma Coptidis
## Bulbus Lilii                                                     
## Flos Chrysanthemi                                                
## Fructus Crataegi                                                 
## Fructus Jujubae                                                  
## Fructus Lycii                                                    
## Ganoderma                                                        
## Pericarpium Citri Reticulatae                                    
## Radix Angelicae Dahuricae                                        
## Radix Angelicae Sinensis                                         
## Radix Astragali                                                  
## Radix Ginseng                                                    
## Radix Glycyrrhizae                                               
## Radix Paeoniae Alba                                              
## Radix Rehmanniae Preparata                                       
## Rhizoma Chuanxiong                                               
## Rhizoma Coptidis                            0.79                 
## Rhizoma Gastrodiae                          0.93             0.95
## Semen Cassiae                               0.77             0.85
## Semen Euphorbiae Lathyridis                 0.92             0.92
##                               Rhizoma Gastrodiae Semen Cassiae
## Bulbus Lilii                                                  
## Flos Chrysanthemi                                             
## Fructus Crataegi                                              
## Fructus Jujubae                                               
## Fructus Lycii                                                 
## Ganoderma                                                     
## Pericarpium Citri Reticulatae                                 
## Radix Angelicae Dahuricae                                     
## Radix Angelicae Sinensis                                      
## Radix Astragali                                               
## Radix Ginseng                                                 
## Radix Glycyrrhizae                                            
## Radix Paeoniae Alba                                           
## Radix Rehmanniae Preparata                                    
## Rhizoma Chuanxiong                                            
## Rhizoma Coptidis                                              
## Rhizoma Gastrodiae                                            
## Semen Cassiae                               0.93              
## Semen Euphorbiae Lathyridis                 0.98          0.93
plot(hclust(jaccard.dissimilarity, method="ward.D2"), main="Dendrogram of Herbs using Jaccard Dissimilarity", xlab="Herbs")

The matrix shows the Jaccard dissimilarity between each pair of herbs. Values close to 0 indicate that two herbs are frequently ordered together (strong association), while values close to 1 indicate that they rarely appear in the same order.
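To make the 0-to-1 scale concrete, the following is a minimal sketch of how a Jaccard dissimilarity can be computed by hand in base R. The incidence matrix `orders`, the herb names `HerbA` to `HerbC`, and the helper `jaccard_dissimilarity()` are hypothetical and are not part of the clinic dataset. The sketch assumes the standard definition, one minus the number of orders containing both herbs divided by the number of orders containing at least one of them.

```r
# Toy incidence matrix (hypothetical data): rows are orders, columns are herbs.
orders <- matrix(
  c(1, 1, 0,
    1, 1, 0,
    1, 0, 1,
    0, 1, 0,
    0, 0, 1),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("HerbA", "HerbB", "HerbC"))
) == 1

# Jaccard dissimilarity between two herbs given as logical vectors
# (hypothetical helper, written for illustration only).
jaccard_dissimilarity <- function(x, y) {
  both   <- sum(x & y)  # orders containing both herbs
  either <- sum(x | y)  # orders containing at least one of the two herbs
  1 - both / either
}

d_ab <- jaccard_dissimilarity(orders[, "HerbA"], orders[, "HerbB"])
d_ac <- jaccard_dissimilarity(orders[, "HerbA"], orders[, "HerbC"])
print(c(A_B = d_ab, A_C = d_ac))
```

In this toy example HerbA and HerbB share two of the four orders in which either appears, giving 0.5, while HerbA and HerbC share only one of four, giving 0.75. HerbA is therefore more strongly associated with HerbB than with HerbC.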

The dendrogram visualizes these dissimilarities using hierarchical clustering. Bulbus Lilii and Fructus Lycii are commonly used together, while Ganoderma shows a lower association with the other herbs.
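One way to turn a dendrogram like this into explicit herb groups is to cut the tree with `cutree()`. The sketch below runs on a small hand-made dissimilarity matrix, not on the clinic data. The herb names, the dissimilarity values, and the choice of `k = 2` clusters are assumptions made purely for illustration.

```r
# Hypothetical dissimilarities among four herbs: A-B and C-D are close pairs.
herbs <- c("HerbA", "HerbB", "HerbC", "HerbD")
m <- matrix(0.9, nrow = 4, ncol = 4, dimnames = list(herbs, herbs))
diag(m) <- 0
m["HerbA", "HerbB"] <- m["HerbB", "HerbA"] <- 0.1
m["HerbC", "HerbD"] <- m["HerbD", "HerbC"] <- 0.2
toy_dissimilarity <- as.dist(m)

# Same linkage method as in the plot call above.
tree <- hclust(toy_dissimilarity, method = "ward.D2")

# Cut the tree into two clusters; the result is a named vector of cluster ids.
groups <- cutree(tree, k = 2)
print(groups)
```

Herbs that receive the same cluster id sit on the same branch of the tree, so here HerbA and HerbB form one group and HerbC and HerbD the other. With the real data, the number of clusters would need to be chosen by inspecting the dendrogram.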

Conclusion

By applying association rules to Traditional Chinese Medicine order records, we can better understand the connections between medicinal herbs and their pairing patterns in this clinic. First, the results can guide procurement: before seasons in which herbs are in short supply, the clinic can pre-order the required quantities of frequently paired herbs to prevent stockouts. Second, the pairing patterns give a deeper understanding of customers' symptoms, enabling doctors to explore other treatment methods. Finally, the association reports generated through this analysis can be used to develop new treatment approaches.