This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
summary(cars)
## speed dist
## Min. : 4.0 Min. : 2.00
## 1st Qu.:12.0 1st Qu.: 26.00
## Median :15.0 Median : 36.00
## Mean :15.4 Mean : 42.98
## 3rd Qu.:19.0 3rd Qu.: 56.00
## Max. :25.0 Max. :120.00
You can also embed plots, for example:
Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.
datos <- read.csv("C:/Users/Sotto Family.DESKTOP-P9Q8G8G/Desktop/UOC 2021/CIENCIA DE DATOS Y SALUS/PRACTICA/sarcopenia_data.txt", na.strings="", stringsAsFactors=TRUE)
library (arules)
## Loading required package: Matrix
##
## Attaching package: 'arules'
## The following objects are masked from 'package:base':
##
## abbreviate, write
library (arulesViz)
library (tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.3 v purrr 0.3.4
## v tibble 3.1.1 v dplyr 1.0.6
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 1.4.0 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x tidyr::expand() masks Matrix::expand()
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
## x tidyr::pack() masks Matrix::pack()
## x dplyr::recode() masks arules::recode()
## x tidyr::unpack() masks Matrix::unpack()
##Estudiamos la base de datos##
str(datos)
## 'data.frame': 20796 obs. of 2 variables:
## $ PatientID: int 1 2 2 3 3 3 4 5 5 5 ...
## $ Symptom : Factor w/ 36 levels "Abdominal_Obesity",..: 21 25 25 1 24 4 7 32 12 21 ...
tenemos 20.796 datos de 2 variables , Symptom y PatientID ## eliminamos los datos duplicados y eliminamos los datos NA ##
datosSD <- datos[!duplicated (datos), ]
view(datosSD)
19.146 datos tras eliminar duplicados
datosSDN <- na.omit(datosSD)
view (datosSDN)
18.960 datos finales sin duplicados y sin espacios vacios.
levels(datosSDN$Symptom)
## [1] "Abdominal_Obesity"
## [2] "Arteriosclerosis"
## [3] "Atrophic_Muscular_Disorders"
## [4] "Chronic_Kidney_Failure"
## [5] "Chronic_Obstructive_Pulmonary_Disease"
## [6] "Cognition_Disorders"
## [7] "Ecoli_Infections"
## [8] "Gastrointestinal_Neoplasms"
## [9] "Hip_Fractures"
## [10] "Hypokinesia"
## [11] "Hypothyroidism"
## [12] "Inflammation"
## [13] "Kidney_Neoplasms"
## [14] "Liver_Diseases"
## [15] "Malnutrition"
## [16] "Metabolic_Syndrome_X"
## [17] "Muscle_Weakness"
## [18] "Muscular_Atrophy"
## [19] "Neurologic_Gait_Disorders"
## [20] "Non-Small-Cell_Lung_Carcinoma"
## [21] "Obesity"
## [22] "Osteoporosis"
## [23] "Overnutrition"
## [24] "Pain"
## [25] "Pancreatic_Neoplasms"
## [26] "Prostatic_Neoplasms"
## [27] "Protein-Energy_Malnutrition"
## [28] "Pseudomonas_Infections"
## [29] "Renal_Cell_Carcinoma"
## [30] "Reperfusion_Injury"
## [31] "Rheumatoid_Arthritis"
## [32] "Sarcopenia"
## [33] "Sensation_Disorders"
## [34] "Starvation"
## [35] "Systemic_Lupus_Erythematosus"
## [36] "Type2_Diabetes_Mellitus"
##síntomas mas frecuentes##
summary(datosSDN)
## PatientID Symptom
## Min. : 1 Sarcopenia :4528
## 1st Qu.:2523 Obesity :3097
## Median :5035 Muscle_Weakness :1350
## Mean :4944 Muscular_Atrophy: 983
## 3rd Qu.:7358 Inflammation : 815
## Max. :9684 Malnutrition : 753
## (Other) :7434
El síntoma más frecuente es sarcopenia, seguido de obesidad, debilidad muscular, atrofia muscular, inflamación y malnutrición
library(dplyr)
sunflowerplot(datosSDN$Symptom, datosSDN$PatientID, main= "sunflowerplot")
##trabajamos ahora con arules, primer paso es la importacion de datos a un objeto tipo transaction ##
datos_split <- split(x = datosSDN$Symptom, f = datosSDN$PatientID)
transacciones <- as(datos_split, Class = "transactions")
transacciones
## transactions in sparse format with
## 9409 transactions (rows) and
## 36 items (columns)
summary(transacciones)
## transactions as itemMatrix in sparse format with
## 9409 rows (elements/itemsets/transactions) and
## 36 columns (items) and a density of 0.05597478
##
## most frequent items:
## Sarcopenia Obesity Muscle_Weakness Muscular_Atrophy
## 4528 3097 1350 983
## Inflammation (Other)
## 815 8187
##
## element (itemset/transaction) length distribution:
## sizes
## 1 2 3 4 5 6 7 8 9
## 3814 3092 1511 676 209 82 16 5 4
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 1.000 2.000 2.015 3.000 9.000
##
## includes extended item information - examples:
## labels
## 1 Abdominal_Obesity
## 2 Arteriosclerosis
## 3 Atrophic_Muscular_Disorders
##
## includes extended transaction information - examples:
## transactionID
## 1 1
## 2 2
## 3 3
tenemos 9.409 transacciones, y 36 columnas que sonlos síntomas., solo 4 transacciones tienen 4 elementos, la mayoría solo tiene 1(3.841), y la mediana está en 2 elementos. ## Empezamos a explorar el df de transacciones para buscar las asociaciones mas frecuentes##
inspect (transacciones[1:6])
## items transactionID
## [1] {Obesity} 1
## [2] {Pancreatic_Neoplasms} 2
## [3] {Abdominal_Obesity,
## Chronic_Kidney_Failure,
## Pain} 3
## [4] {Ecoli_Infections} 4
## [5] {Inflammation,
## Obesity,
## Sarcopenia} 5
## [6] {Chronic_Obstructive_Pulmonary_Disease,
## Ecoli_Infections,
## Inflammation} 6
df_transacciones <- as(transacciones, Class = "data.frame")
# Para que el tamaño de la tabla se ajuste mejor, usamos tibble#
as_tibble(df_transacciones) %>% head()
frecuencia_items <- itemFrequency(x = transacciones, type = "relative")
frecuencia_items %>% sort(decreasing = TRUE) %>% head (10)
## Sarcopenia Obesity
## 0.48124136 0.32915294
## Muscle_Weakness Muscular_Atrophy
## 0.14347965 0.10447444
## Inflammation Malnutrition
## 0.08661919 0.08002976
## Metabolic_Syndrome_X Chronic_Obstructive_Pulmonary_Disease
## 0.07227123 0.06217451
## Abdominal_Obesity Chronic_Kidney_Failure
## 0.05866723 0.05473483
##podemos tambien ver los sintomas menos frecuentes## frecuencia_items <- itemFrequency(x = transacciones, type = “relative”) frecuencia_items %>% sort(decreasing = TRUE) %>% tail (10) ##
frecuencia_items <- itemFrequency(x = transacciones, type = "relative")
frecuencia_items %>% sort(decreasing = TRUE) %>% tail (10)
## Hypokinesia Arteriosclerosis Neurologic_Gait_Disorders
## 0.014241683 0.013285153 0.013072590
## Type2_Diabetes_Mellitus Reperfusion_Injury Overnutrition
## 0.010628122 0.010521841 0.008608779
## Starvation Hip_Fractures Hypothyroidism
## 0.008183654 0.006695717 0.006164311
## Cognition_Disorders
## 0.004570092
summary(frecuencia_items)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00457 0.01400 0.03151 0.05597 0.05572 0.48124
La frecuencia máxima de un item es 48% y la que menos 0.4% ##Es muy importante estudiar cómo se distribuye el soporte de los síntomas individuales en un conjunto de pacientes para identificar itemsets frecuentes o crear reglas de asociación, ya que, dependiendo del caso, tendrá sentido un límite de soporte u otro. Por lo general, cuando el número de posibles items es muy grande (varios miles) prácticamente todos los artículos son raros, por lo que los soportes son muy bajos## La soporte máxima es es 48.5%, el soporte elegido debe ser menor de 0.5 y mayor o igual de 0.0045 que es el mínimo
itemFrequencyPlot(transacciones, support = 0.005, cex.names = 0.8)
options(repr.plot.width = 3, repr.plot.height = 2)
itemFrequencyPlot(transacciones , topN = 36)
En esta gráfica se puede ver ordenado por frecuencias relativas para cada item. ##buscamos itemsets frecuentes## para ello aplicamos el algoritmo apriori que en realidad nos permite reducir el número de operaciones.
soporte <- 10/dim(transacciones)[1]
itemsets <- apriori (data = transacciones, parameter = list(support = soporte, minlen = 1, maxlen = 20, target = "frequent itemset"))
## Apriori
##
## Parameter specification:
## confidence minval smax arem aval originalSupport maxtime support minlen
## NA 0.1 1 none FALSE TRUE 5 0.001062812 1
## maxlen target ext
## 20 frequent itemsets TRUE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## Absolute minimum support count: 10
##
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[36 item(s), 9409 transaction(s)] done [0.00s].
## sorting and recoding items ... [36 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 4 done [0.00s].
## sorting transactions ... done [0.00s].
## writing ... [493 set(s)] done [0.00s].
## creating S4 object ... done [0.00s].
summary (itemsets)
## set of 493 itemsets
##
## most frequent items:
## Sarcopenia Obesity Muscle_Weakness Muscular_Atrophy
## 176 107 83 73
## Malnutrition (Other)
## 61 661
##
## element (itemset/transaction) length distribution:sizes
## 1 2 3 4
## 36 253 197 7
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 2.000 2.000 2.355 3.000 4.000
##
## summary of quality measures:
## support transIdenticalToItemsets count
## Min. :0.001063 Min. :0.0000000 Min. : 10.00
## 1st Qu.:0.001488 1st Qu.:0.0001063 1st Qu.: 14.00
## Median :0.002232 Median :0.0003188 Median : 21.00
## Mean :0.008125 Mean :0.0016830 Mean : 76.45
## 3rd Qu.:0.004995 3rd Qu.:0.0008502 3rd Qu.: 47.00
## Max. :0.481241 Max. :0.1259433 Max. :4528.00
##
## includes transaction ID lists: FALSE
##
## mining info:
## data ntransactions support confidence
## transacciones 9409 0.001062812 1
top_20_itemsets <- sort(itemsets, by = "support", decreasing = FALSE)[1:20]
inspect(top_20_itemsets)
## items support transIdenticalToItemsets count
## [1] {Reperfusion_Injury,
## Sensation_Disorders} 0.001062812 0.0000000000 10
## [2] {Kidney_Neoplasms,
## Neurologic_Gait_Disorders} 0.001062812 0.0002125624 10
## [3] {Neurologic_Gait_Disorders,
## Non-Small-Cell_Lung_Carcinoma} 0.001062812 0.0002125624 10
## [4] {Arteriosclerosis,
## Chronic_Kidney_Failure} 0.001062812 0.0000000000 10
## [5] {Prostatic_Neoplasms,
## Pseudomonas_Infections} 0.001062812 0.0000000000 10
## [6] {Prostatic_Neoplasms,
## Renal_Cell_Carcinoma} 0.001062812 0.0000000000 10
## [7] {Prostatic_Neoplasms,
## Rheumatoid_Arthritis} 0.001062812 0.0000000000 10
## [8] {Muscular_Atrophy,
## Sarcopenia,
## Starvation} 0.001062812 0.0001062812 10
## [9] {Muscular_Atrophy,
## Reperfusion_Injury,
## Sarcopenia} 0.001062812 0.0001062812 10
## [10] {Obesity,
## Reperfusion_Injury,
## Sarcopenia} 0.001062812 0.0001062812 10
## [11] {Metabolic_Syndrome_X,
## Neurologic_Gait_Disorders,
## Sarcopenia} 0.001062812 0.0001062812 10
## [12] {Arteriosclerosis,
## Muscular_Atrophy,
## Sarcopenia} 0.001062812 0.0001062812 10
## [13] {Arteriosclerosis,
## Muscle_Weakness,
## Obesity} 0.001062812 0.0000000000 10
## [14] {Hypokinesia,
## Malnutrition,
## Metabolic_Syndrome_X} 0.001062812 0.0000000000 10
## [15] {Hypokinesia,
## Malnutrition,
## Sarcopenia} 0.001062812 0.0003188437 10
## [16] {Kidney_Neoplasms,
## Metabolic_Syndrome_X,
## Sensation_Disorders} 0.001062812 0.0001062812 10
## [17] {Malnutrition,
## Non-Small-Cell_Lung_Carcinoma,
## Sarcopenia} 0.001062812 0.0003188437 10
## [18] {Muscular_Atrophy,
## Non-Small-Cell_Lung_Carcinoma,
## Sarcopenia} 0.001062812 0.0001062812 10
## [19] {Inflammation,
## Protein-Energy_Malnutrition,
## Sarcopenia} 0.001062812 0.0002125624 10
## [20] {Muscle_Weakness,
## Obesity,
## Pseudomonas_Infections} 0.001062812 0.0001062812 10
####identificados los itemsets más frecuentes, podemos filtrarlos por síntomas presentes, buscamos aquellos donde aparece sarcopenia##
itemsets_filtrado <-arules::subset(itemsets, subset = items %in% "Sarcopenia")
itemsets_filtrado
## set of 176 itemsets
inspect (itemsets_filtrado [1:10])
## items support
## [1] {Sarcopenia} 0.481241365
## [2] {Hip_Fractures,Sarcopenia} 0.005420342
## [3] {Hypothyroidism,Sarcopenia} 0.003294718
## [4] {Overnutrition,Sarcopenia} 0.004570092
## [5] {Sarcopenia,Starvation} 0.004038686
## [6] {Atrophic_Muscular_Disorders,Sarcopenia} 0.002975874
## [7] {Sarcopenia,Type2_Diabetes_Mellitus} 0.005739186
## [8] {Sarcopenia,Systemic_Lupus_Erythematosus} 0.003294718
## [9] {Reperfusion_Injury,Sarcopenia} 0.006589436
## [10] {Pain,Sarcopenia} 0.004144968
## transIdenticalToItemsets count
## [1] 0.1212668721 4528
## [2] 0.0021256244 51
## [3] 0.0008502498 31
## [4] 0.0010628122 43
## [5] 0.0000000000 38
## [6] 0.0011690934 28
## [7] 0.0009565310 54
## [8] 0.0005314061 31
## [9] 0.0015942183 62
## [10] 0.0009565310 39
##Empezamos a buscar asociaciones, aplicando Reglas de asociación con el paquete arules.#
soporte <- 10/ dim(transacciones)[1]
reglas <- apriori(data = transacciones,
parameter = list(support = soporte,
confidence = 0,
# Se especifica que se creen reglas
target = "rules"))
## Apriori
##
## Parameter specification:
## confidence minval smax arem aval originalSupport maxtime support minlen
## 0 0.1 1 none FALSE TRUE 5 0.001062812 1
## maxlen target ext
## 10 rules TRUE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## Absolute minimum support count: 10
##
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[36 item(s), 9409 transaction(s)] done [0.00s].
## sorting and recoding items ... [36 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 4 done [0.00s].
## writing ... [1161 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].
summary(reglas)
## set of 1161 rules
##
## rule length distribution (lhs + rhs):sizes
## 1 2 3 4
## 36 506 591 28
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 2.000 3.000 2.526 3.000 4.000
##
## summary of quality measures:
## support confidence coverage lift
## Min. :0.001063 Min. :0.00457 Min. :0.001169 Min. : 0.1868
## 1st Qu.:0.001382 1st Qu.:0.04658 1st Qu.:0.009672 1st Qu.: 0.8381
## Median :0.002019 Median :0.09239 Median :0.028377 Median : 1.1604
## Mean :0.005565 Mean :0.16799 Mean :0.085510 Mean : 1.4709
## 3rd Qu.:0.003826 3rd Qu.:0.22222 3rd Qu.:0.058667 3rd Qu.: 1.6862
## Max. :0.481241 Max. :0.90909 Max. :1.000000 Max. :11.2079
## count
## Min. : 10.00
## 1st Qu.: 13.00
## Median : 19.00
## Mean : 52.36
## 3rd Qu.: 36.00
## Max. :4528.00
##
## mining info:
## data ntransactions support confidence
## transacciones 9409 0.001062812 0
Probamos con la mínima confianza, buscando la máxima para luego aplicarla, vemos que es 0.8, trabajamos con esa confianza mínima
soporte <- 10/ dim(transacciones)[1]
reglas <- apriori(data = transacciones,
parameter = list(support = soporte,
confidence = 0.8,
# Se especifica que se creen reglas
target = "rules"))
## Apriori
##
## Parameter specification:
## confidence minval smax arem aval originalSupport maxtime support minlen
## 0.8 0.1 1 none FALSE TRUE 5 0.001062812 1
## maxlen target ext
## 10 rules TRUE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## Absolute minimum support count: 10
##
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[36 item(s), 9409 transaction(s)] done [0.00s].
## sorting and recoding items ... [36 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 4 done [0.00s].
## writing ... [6 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].
summary(reglas)
## set of 6 rules
##
## rule length distribution (lhs + rhs):sizes
## 2 3 4
## 1 4 1
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2 3 3 3 3 4
##
## summary of quality measures:
## support confidence coverage lift
## Min. :0.001063 Min. :0.8095 Min. :0.001169 Min. :1.682
## 1st Qu.:0.001302 1st Qu.:0.8333 1st Qu.:0.001515 1st Qu.:1.732
## Median :0.001488 Median :0.8452 Median :0.001754 Median :1.756
## Mean :0.002055 Mean :0.8515 Mean :0.002462 Mean :1.769
## 3rd Qu.:0.001594 3rd Qu.:0.8643 3rd Qu.:0.001913 3rd Qu.:1.796
## Max. :0.005420 Max. :0.9091 Max. :0.006696 Max. :1.889
## count
## Min. :10.00
## 1st Qu.:12.25
## Median :14.00
## Mean :19.33
## 3rd Qu.:15.00
## Max. :51.00
##
## mining info:
## data ntransactions support confidence
## transacciones 9409 0.001062812 0.8
Obtenemos 6 reglas, la mayoría formadas por dos items en el antecedente. ##inspeccionamos las reglas obtenidas#
inspect(sort(x = reglas, decreasing = TRUE, by = "confidence"))
## lhs rhs support confidence coverage lift count
## [1] {Abdominal_Obesity,
## Chronic_Obstructive_Pulmonary_Disease,
## Malnutrition} => {Sarcopenia} 0.001062812 0.9090909 0.001169093 1.889054 10
## [2] {Inflammation,
## Pseudomonas_Infections} => {Sarcopenia} 0.001381656 0.8666667 0.001594218 1.800898 13
## [3] {Metabolic_Syndrome_X,
## Type2_Diabetes_Mellitus} => {Sarcopenia} 0.001275375 0.8571429 0.001487937 1.781108 12
## [4] {Metabolic_Syndrome_X,
## Reperfusion_Injury} => {Sarcopenia} 0.001594218 0.8333333 0.001913062 1.731633 15
## [5] {Gastrointestinal_Neoplasms,
## Malnutrition} => {Sarcopenia} 0.001594218 0.8333333 0.001913062 1.731633 15
## [6] {Hip_Fractures} => {Sarcopenia} 0.005420342 0.8095238 0.006695717 1.682158 51
##exploramos las reglas maximales ##
reglas_maximales <- reglas [is.maximal(reglas)]
reglas_maximales
## set of 6 rules
inspect(reglas_maximales)
## lhs rhs support confidence coverage lift count
## [1] {Hip_Fractures} => {Sarcopenia} 0.005420342 0.8095238 0.006695717 1.682158 51
## [2] {Metabolic_Syndrome_X,
## Type2_Diabetes_Mellitus} => {Sarcopenia} 0.001275375 0.8571429 0.001487937 1.781108 12
## [3] {Metabolic_Syndrome_X,
## Reperfusion_Injury} => {Sarcopenia} 0.001594218 0.8333333 0.001913062 1.731633 15
## [4] {Inflammation,
## Pseudomonas_Infections} => {Sarcopenia} 0.001381656 0.8666667 0.001594218 1.800898 13
## [5] {Gastrointestinal_Neoplasms,
## Malnutrition} => {Sarcopenia} 0.001594218 0.8333333 0.001913062 1.731633 15
## [6] {Abdominal_Obesity,
## Chronic_Obstructive_Pulmonary_Disease,
## Malnutrition} => {Sarcopenia} 0.001062812 0.9090909 0.001169093 1.889054 10
#todas las reglas son maximales# ##buscamos reglas redundantes##
reglas_redundantes <- reglas[is.redundant(x = reglas, measure = "confidence")]
reglas_redundantes
## set of 0 rules
No hay reglas redundantes ## grafica con las reglas ##
plot(reglas)
## To reduce overplotting, jitter is added! Use jitter = 0 to prevent jitter.
plot (reglas,method="graph", "interactive "=TRUE, shading=NA)
## Warning: Unknown control parameters: interactive
## Available control parameters (with default values):
## layout = list(fun = function (graph, dim = 2, ...) { if ("layout" %in% graph_attr_names(graph)) { lay <- graph_attr(graph, "layout") if (is.function(lay)) { lay(graph, ...) } else { lay } } else if (all(c("x", "y") %in% vertex_attr_names(graph))) { if ("z" %in% vertex_attr_names(graph)) { cbind(V(graph)$x, V(graph)$y, V(graph)$z) } else { cbind(V(graph)$x, V(graph)$y) } } else if (vcount(graph) < 1000) { layout_with_fr(graph, dim = dim, ...) } else { layout_with_drl(graph, dim = dim, ...) } }, call_str = c("layout_nicely(<graph>, input = \"C:/Users/Sotto Family.DESKTOP-P9Q8G8G/Desktop/definitivo 12.6/junio.Rmd\", ", " encoding = \"UTF-8\")"), args = list())
## edges = <environment>
## nodes = <environment>
## nodetext = <environment>
## colors = c("#EE0000FF", "#EEEEEEFF")
## engine = ggplot2
## max = 100
## verbose = FALSE
subrules <- head(sort(reglas, by="confidence"), 20)
plot(subrules, method="graph", engine = "htmlwidget")
##obtenemos un plot dinámico en el que podemos estudiar cada regla de forma independiente. de las 6 reglas, parece que todas ellas tienen como lhf sarcopenia, las reglas con color rojo mas oscuro, son las de mayor lift, (1),El tamaño de los circulos se corresponde con el suppportlos mas pequeños tienen el menor support y la confidence máxima es 0.9 para la regla 1.
```r
reglas_seleccionadas2 <- subset(reglas, subset = confidence > 0.9)
reglas_seleccionadas2
## set of 1 rules
inspect (reglas_seleccionadas2)
## lhs rhs support confidence coverage lift count
## [1] {Abdominal_Obesity,
## Chronic_Obstructive_Pulmonary_Disease,
## Malnutrition} => {Sarcopenia} 0.001062812 0.9090909 0.001169093 1.889054 10
Esta es la regla con una confidence mayor ##inspectionamos por el lift ##
reglas_seleccionadas3 <- subset(reglas, subset = lift >1.8)
reglas_seleccionadas3
## set of 2 rules
inspect (reglas_seleccionadas3)
## lhs rhs support confidence coverage lift count
## [1] {Inflammation,
## Pseudomonas_Infections} => {Sarcopenia} 0.001381656 0.8666667 0.001594218 1.800898 13
## [2] {Abdominal_Obesity,
## Chronic_Obstructive_Pulmonary_Disease,
## Malnutrition} => {Sarcopenia} 0.001062812 0.9090909 0.001169093 1.889054 10
Tenemos dos regals con un lift mayor de 1.8, que es el valor mas alto del lift, ##analizamos las reglas por otros criterios de calidad ##
mInteres <- interestMeasure(reglas,
measure = c("gini", "chiSquared"),
transactions=transacciones)
quality(reglas) <- cbind(quality(reglas), mInteres)
Todos los valores de gini son bajos, cerca de 0 luego hay mucha igualdad.
```