Student ID (NIM) : 211810329
CLASS : 4 SE 3
##DATASET INFORMATION
The HCC (hepatocellular carcinoma) dataset was collected at a University Hospital in Portugal and contains demographic, risk-factor, laboratory, and overall survival features of 165 real patients diagnosed with HCC. The dataset contains 49 features selected according to the EASL-EORTC (European Association for the Study of the Liver - European Organisation for Research and Treatment of Cancer) Clinical Practice Guidelines, which are the current state of the art in the management of HCC.
The dataset is heterogeneous, with 23 quantitative and 26 qualitative variables. Overall, missing data account for 10.22% of the whole dataset, and only eight patients (4.85%) have complete information in all fields. The target variable is survival at 1 year, encoded as a binary variable: 0 (died) and 1 (survived).
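The two missing-data figures quoted above (share of missing cells and share of complete patients) are simple ratios. The sketch below shows how such figures can be computed in base R on a small made-up table; the data frame `toy` and its values are hypothetical and are not taken from the HCC data.

```r
# Toy data frame with missing entries (hypothetical values, not the HCC data)
toy <- data.frame(
  age     = c(67, 62, NA, 77),
  alcohol = c(1, NA, 1, 0),
  class   = c(1, 1, 0, 0)
)

# Share of missing cells across the whole table (2 of 12 cells here)
overall_missing <- mean(is.na(toy))

# Number and share of rows without any missing value (rows 1 and 4 here)
n_complete     <- sum(complete.cases(toy))
share_complete <- n_complete / nrow(toy)

# The figure quoted for the HCC data: 8 complete patients out of 165, in percent
hcc_share_complete <- round(8 / 165 * 100, 2)
```

The last line reproduces the 4.85% stated in the dataset description.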
##ATTRIBUTE INFORMATION
Gender: nominal
Symptoms: nominal
Alcohol: nominal
Hepatitis B Surface Antigen: nominal
Hepatitis B e Antigen: nominal
Hepatitis B Core Antibody: nominal
Hepatitis C Virus Antibody: nominal
Cirrhosis: nominal
Endemic Countries: nominal
Smoking: nominal
Diabetes: nominal
Obesity: nominal
Hemochromatosis: nominal
Arterial Hypertension: nominal
Chronic Renal Insufficiency: nominal
Human Immunodeficiency Virus: nominal
Nonalcoholic Steatohepatitis: nominal
Esophageal Varices: nominal
Splenomegaly: nominal
Portal Hypertension: nominal
Portal Vein Thrombosis: nominal
Liver Metastasis: nominal
Radiological Hallmark: nominal
Age at diagnosis: integer
Grams of Alcohol per day: continuous
Packs of cigarets per year: continuous
Performance Status: ordinal
Encefalopathy degree: ordinal
Ascites degree: ordinal
International Normalised Ratio: continuous
Alpha-Fetoprotein (ng/mL): continuous
Haemoglobin (g/dL): continuous
Mean Corpuscular Volume (fl): continuous
Leukocytes(G/L): continuous
Platelets (G/L): continuous
Albumin (mg/dL): continuous
Total Bilirubin(mg/dL): continuous
Alanine transaminase (U/L): continuous
Aspartate transaminase (U/L): continuous
Gamma glutamyl transferase (U/L): continuous
Alkaline phosphatase (U/L): continuous
Total Proteins (g/dL): continuous
Creatinine (mg/dL): continuous
Number of Nodules: integer
Major dimension of nodule (cm): continuous
Direct Bilirubin (mg/dL): continuous
Iron (mcg/dL): continuous
Oxygen Saturation (%): continuous
Ferritin (ng/mL): continuous
Class: nominal (1 if patient survives, 0 if patient died)
##HCC-SURVIVAL DATA SET
Source: UCI Machine Learning Repository
Data Set Characteristics: Multivariate
Attribute Characteristics: Integer, Real
Associated Task: Classification
Number of Instances: 165
Number of Attributes: 49
Date Donated: 2017-11-29
library(readxl)
library(tidyverse)
library(dplyr)
library(imputeMissings)
library(ggplot2)
library(ggcorrplot)
library(Boruta)
Before preprocessing the data, the author first separated the categorical attributes from the numeric ones. This was done because the classification in this study uses only the categorical attributes.
HCCKategorik=read_excel("C:\\Users\\USER\\Downloads\\Tingkat 4\\04.DATMIN-BU WA ODE\\Tugas Pertama\\HCC-SURVIVAL.xlsx",sheet="Kategorik")
glimpse(HCCKategorik)
## Rows: 165
## Columns: 28
## $ Patients <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, ~
## $ Gender <dbl> 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ~
## $ Symptoms <chr> "0", "?", "0", "1", "1", "0", "0", "1",~
## $ Alcohol <dbl> 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, ~
## $ `Hepatitis B Surface Antigen` <chr> "0", "0", "1", "0", "1", "0", "0", "0",~
## $ `Hepatitis B e Antigen` <chr> "0", "0", "0", "0", "0", "?", "?", "?",~
## $ `Hepatitis B Core Antibody` <chr> "0", "0", "1", "0", "1", "0", "1", "0",~
## $ `Hepatitis C Virus Antibody` <chr> "0", "1", "0", "0", "0", "0", "1", "0",~
## $ Cirrhosis <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ~
## $ `Endemic Countries` <chr> "0", "?", "0", "0", "0", "0", "0", "0",~
## $ Smoking <chr> "1", "?", "1", "1", "1", "?", "0", "1",~
## $ Diabetes <chr> "1", "1", "0", "1", "0", "0", "1", "1",~
## $ Obesity <chr> "?", "0", "0", "0", "0", "1", "0", "?",~
## $ Hemochromatosis <chr> "1", "0", "0", "0", "0", "0", "?", "0",~
## $ `Arterial Hypertension` <chr> "0", "1", "1", "1", "1", "0", "0", "0",~
## $ `Chronic Renal Insufficiency` <chr> "0", "0", "1", "0", "1", "0", "0", "0",~
## $ `Human Immunodeficiency Virus` <chr> "0", "0", "0", "0", "0", "0", "0", "0",~
## $ `Nonalcoholic Steatohepatitis` <chr> "0", "0", "0", "0", "0", "0", "0", "0",~
## $ `Esophageal Varices` <chr> "1", "1", "0", "0", "0", "1", "0", "0",~
## $ Splenomegaly <chr> "0", "0", "0", "0", "0", "1", "0", "1",~
## $ `Portal Hypertension` <chr> "0", "0", "1", "0", "0", "1", "0", "1",~
## $ PortalVeinThrombosis <chr> "0", "0", "0", "0", "0", "0", "0", "1",~
## $ LiverMetastasis <chr> "0", "0", "1", "1", "0", "0", "0", "0",~
## $ `Radiological Hallmark` <chr> "1", "1", "1", "1", "1", "1", "1", "1",~
## $ PerformanceStatus <dbl> 0, 0, 2, 0, 0, 1, 0, 3, 1, 0, 0, 0, 0, ~
## $ `Encefalopathy degree` <chr> "1", "1", "1", "1", "1", "1", "1", "1",~
## $ Ascitesdegree <chr> "1", "1", "2", "1", "1", "2", "1", "1",~
## $ Class <dbl> 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, ~
Several of the attributes above are still read as numeric (dbl) and must be converted to categorical (factor) type. The character columns are converted to factors as well.
HCCKategorik <- HCCKategorik %>% mutate_if(is.numeric,as.factor)
HCCKategorik <- HCCKategorik %>% mutate_if(is.character,as.factor)
glimpse(HCCKategorik)
## Rows: 165
## Columns: 28
## $ Patients <fct> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, ~
## $ Gender <fct> 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ~
## $ Symptoms <fct> 0, ?, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, ~
## $ Alcohol <fct> 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, ~
## $ `Hepatitis B Surface Antigen` <fct> 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, ~
## $ `Hepatitis B e Antigen` <fct> 0, 0, 0, 0, 0, ?, ?, ?, 0, 0, 0, 0, 0, ~
## $ `Hepatitis B Core Antibody` <fct> 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, ~
## $ `Hepatitis C Virus Antibody` <fct> 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, ~
## $ Cirrhosis <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ~
## $ `Endemic Countries` <fct> 0, ?, 0, 0, 0, 0, 0, 0, 0, 0, ?, 1, 0, ~
## $ Smoking <fct> 1, ?, 1, 1, 1, ?, 0, 1, 1, 0, ?, 0, 1, ~
## $ Diabetes <fct> 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, ~
## $ Obesity <fct> ?, 0, 0, 0, 0, 1, 0, ?, 0, 0, 0, 0, 0, ~
## $ Hemochromatosis <fct> 1, 0, 0, 0, 0, 0, ?, 0, 0, 1, 0, 0, 0, ~
## $ `Arterial Hypertension` <fct> 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, ~
## $ `Chronic Renal Insufficiency` <fct> 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, ~
## $ `Human Immunodeficiency Virus` <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ~
## $ `Nonalcoholic Steatohepatitis` <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ?, ~
## $ `Esophageal Varices` <fct> 1, 1, 0, 0, 0, 1, 0, 0, ?, 0, ?, ?, ?, ~
## $ Splenomegaly <fct> 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, ~
## $ `Portal Hypertension` <fct> 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, ~
## $ PortalVeinThrombosis <fct> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, ~
## $ LiverMetastasis <fct> 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, ~
## $ `Radiological Hallmark` <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, ~
## $ PerformanceStatus <fct> 0, 0, 2, 0, 0, 1, 0, 3, 1, 0, 0, 0, 0, ~
## $ `Encefalopathy degree` <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ~
## $ Ascitesdegree <fct> 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 2, 1, 1, ~
## $ Class <fct> 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, ~
The categorical dataset still contains missing values, marked with “?”. Before the missing values are filled in, the “?” marker is first converted to NA. Each NA is then replaced with the most frequent value (mode) of the attribute concerned.
To see which attributes still have missing values, a plot of the share of missing values per attribute is drawn. The black bars show the proportion of missing values.
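# Replace the "?" marker with NA column by column (does not depend on na_if() accepting a whole data frame)
HCCKategorik <- HCCKategorik %>% mutate(across(everything(), ~ replace(.x, .x == "?", NA)))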
options(repr.plot.width = 6, repr.plot.height = 4)
missing_data <- HCCKategorik %>% summarise(across(everything(), ~ mean(is.na(.x))))
missing_data <- pivot_longer(missing_data, everything(), names_to = "variables", values_to = "percent_missing")
ggplot(missing_data, aes(x = reorder(variables, percent_missing), y = percent_missing)) +
geom_bar(stat = "identity", fill = "black", aes(color = I('white')), size = 0.3)+
xlab('variables')+
coord_flip()+
theme_bw()
imput <- compute(HCCKategorik,method="median/mode")
HCCKategorik_nomissing <- impute(HCCKategorik,object = imput)
glimpse(HCCKategorik_nomissing)
## Rows: 165
## Columns: 28
## $ Patients <fct> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13~
## $ Gender <fct> 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,~
## $ Symptoms <fct> 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1,~
## $ Alcohol <fct> 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0,~
## $ Hepatitis.B.Surface.Antigen <fct> 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,~
## $ Hepatitis.B.e.Antigen <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,~
## $ Hepatitis.B.Core.Antibody <fct> 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0,~
## $ Hepatitis.C.Virus.Antibody <fct> 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1,~
## $ Cirrhosis <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,~
## $ Endemic.Countries <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,~
## $ Smoking <fct> 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1,~
## $ Diabetes <fct> 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0,~
## $ Obesity <fct> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,~
## $ Hemochromatosis <fct> 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,~
## $ Arterial.Hypertension <fct> 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0,~
## $ Chronic.Renal.Insufficiency <fct> 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,~
## $ Human.Immunodeficiency.Virus <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,~
## $ Nonalcoholic.Steatohepatitis <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,~
## $ Esophageal.Varices <fct> 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0,~
## $ Splenomegaly <fct> 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0,~
## $ Portal.Hypertension <fct> 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0,~
## $ PortalVeinThrombosis <fct> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0,~
## $ LiverMetastasis <fct> 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,~
## $ Radiological.Hallmark <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1,~
## $ PerformanceStatus <fct> 0, 0, 2, 0, 0, 1, 0, 3, 1, 0, 0, 0, 0, 0,~
## $ Encefalopathy.degree <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,~
## $ Ascitesdegree <fct> 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 2, 1, 1, 1,~
## $ Class <fct> 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1,~
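The mode imputation applied above can be illustrated on a single column. The following is a minimal base R sketch of the idea only; it does not reproduce the internals of the imputeMissings package, and the helper names `mode_value` and `impute_mode` as well as the sample values are hypothetical.

```r
# Most frequent non-missing value of a vector
# (on ties, the first value in table order is returned)
mode_value <- function(x) {
  counts <- table(x, useNA = "no")
  names(counts)[which.max(counts)]
}

# Replace "?" with NA, then fill every NA with the column mode
impute_mode <- function(x) {
  x <- as.character(x)
  x[x == "?"] <- NA
  x[is.na(x)] <- mode_value(x)
  factor(x)
}

smoking <- c("1", "?", "1", "0", "?", "1")   # hypothetical values
smoking_filled <- impute_mode(smoking)
smoking_filled
```

In this toy column the value "1" occurs three times and "0" once, so both "?" entries are filled with "1".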
The missing values have now been filled with the mode of each attribute. To confirm this, the same plot is drawn again to check whether every missing value has been filled.
options(repr.plot.width = 6, repr.plot.height = 4)
missing_data <- HCCKategorik_nomissing %>% summarise(across(everything(), ~ mean(is.na(.x))))
missing_data <- pivot_longer(missing_data, everything(), names_to = "variables", values_to = "percent_missing")
ggplot(missing_data, aes(x = reorder(variables, percent_missing), y = percent_missing)) +
geom_bar(stat = "identity", fill = "black", aes(color = I('white')), size = 0.3)+
xlab('variables')+
coord_flip()+
theme_bw()
The plot above shows that no attribute has any missing values left.
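The same check can also be made numerically instead of reading it off a plot. The sketch below is a base R illustration; the data frames `before` and `after` and the helper `missing_share` are hypothetical stand-ins, not objects from this report.

```r
# Hypothetical data frames standing in for the data before and after imputation
before <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA))
after  <- data.frame(a = c(1, 2, 3),  b = c("x", "y", "x"))

# Share of missing values per column (the quantity drawn in the bar plot)
missing_share <- function(df) colMeans(is.na(df))

missing_share(before)
missing_share(after)

# TRUE only when no column contains a missing value
all_filled <- !any(is.na(after))
```

Applied to the imputed table, a result of all zeros from `missing_share()` would correspond to the empty bars in the plot.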
The author tried two ways of reducing the attributes: first manually with bar charts, and second with a package.
####Bar Chart
In this section attributes are selected using bar charts. A bar chart is drawn for each attribute to compare the proportions of the survived and died classes across its levels.
ggplot(HCCKategorik_nomissing, aes(x=Gender,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Gender")
ggplot(HCCKategorik_nomissing, aes(x=Symptoms,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Symptoms")
ggplot(HCCKategorik_nomissing, aes(x=Alcohol,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Alcohol")
ggplot(HCCKategorik_nomissing, aes(x=Hepatitis.B.Surface.Antigen,fill=Class))+ geom_bar(position = 'fill')+ theme_grey()+xlab("Hepatitis.B.Surface.Antigen")
ggplot(HCCKategorik_nomissing, aes(x=Hepatitis.B.e.Antigen,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Hepatitis.B.e.Antigen")
ggplot(HCCKategorik_nomissing, aes(x=Hepatitis.B.Core.Antibody,fill=Class))+ geom_bar(position = 'fill')+xlab("Hepatitis.B.Core.Antibody")
ggplot(HCCKategorik_nomissing, aes(x=Hepatitis.C.Virus.Antibody,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Hepatitis.C.Virus.Antibody")
ggplot(HCCKategorik_nomissing, aes(x=Cirrhosis,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Cirrhosis")
ggplot(HCCKategorik_nomissing, aes(x= Endemic.Countries,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab(" Endemic.Countries")
ggplot(HCCKategorik_nomissing, aes(x=Smoking,fill=Class))+ geom_bar(position = 'fill')+ theme_grey()+xlab("Smoking")
ggplot(HCCKategorik_nomissing, aes(x=Diabetes,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Diabetes")
ggplot(HCCKategorik_nomissing, aes(x=Obesity,fill=Class))+ geom_bar(position = 'fill')+xlab("Obesity")
ggplot(HCCKategorik_nomissing, aes(x=Hemochromatosis,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Hemochromatosis")
ggplot(HCCKategorik_nomissing, aes(x=Arterial.Hypertension,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Arterial.Hypertension")
ggplot(HCCKategorik_nomissing, aes(x=Chronic.Renal.Insufficiency,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Chronic.Renal.Insufficiency")
ggplot(HCCKategorik_nomissing, aes(x=Human.Immunodeficiency.Virus,fill=Class))+ geom_bar(position = 'fill')+ theme_grey()+xlab("Human.Immunodeficiency.Virus")
ggplot(HCCKategorik_nomissing, aes(x=Nonalcoholic.Steatohepatitis,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Nonalcoholic.Steatohepatitis")
ggplot(HCCKategorik_nomissing, aes(x=Esophageal.Varices,fill=Class))+ geom_bar(position = 'fill')+xlab("Esophageal.Varices")
ggplot(HCCKategorik_nomissing, aes(x=Splenomegaly,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Splenomegaly")
ggplot(HCCKategorik_nomissing, aes(x=Portal.Hypertension,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Portal.Hypertension")
ggplot(HCCKategorik_nomissing, aes(x=PortalVeinThrombosis,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Portal.Vein.Thrombosis")
ggplot(HCCKategorik_nomissing, aes(x=LiverMetastasis,fill=Class))+ geom_bar(position = 'fill')+ theme_grey()+xlab("Liver.Metastasis")
ggplot(HCCKategorik_nomissing, aes(x=Radiological.Hallmark,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Radiological.Hallmark")
ggplot(HCCKategorik_nomissing, aes(x=PerformanceStatus,fill=Class))+ geom_bar(position = 'fill')+xlab("Performance.Status")
ggplot(HCCKategorik_nomissing, aes(x=Encefalopathy.degree,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Encefalopathy.degree")
ggplot(HCCKategorik_nomissing, aes(x=Ascitesdegree,fill=Class))+ geom_bar(position = 'fill')+theme_grey()+xlab("Ascites.degree")
From the bar charts above, we select the attributes whose proportions of survived and died patients differ markedly between levels. This is a visual judgment rather than a formal significance test. The selected attributes are Symptoms, Hepatitis B e Antigen, Endemic Countries, Smoking, Portal Vein Thrombosis, Liver Metastasis, Performance Status, Encefalopathy degree, and Ascites degree.
Atribut_Kategorik_Barchart <- select(HCCKategorik_nomissing, Patients, Symptoms, Hepatitis.B.e.Antigen, Endemic.Countries, Smoking, PortalVeinThrombosis, LiverMetastasis, PerformanceStatus, Encefalopathy.degree, Ascitesdegree, Class)
glimpse(Atribut_Kategorik_Barchart)
## Rows: 165
## Columns: 11
## $ Patients <fct> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1~
## $ Symptoms <fct> 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, ~
## $ Hepatitis.B.e.Antigen <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ~
## $ Endemic.Countries <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, ~
## $ Smoking <fct> 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, ~
## $ PortalVeinThrombosis <fct> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, ~
## $ LiverMetastasis <fct> 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ~
## $ PerformanceStatus <fct> 0, 0, 2, 0, 0, 1, 0, 3, 1, 0, 0, 0, 0, 0, 2, 1, ~
## $ Encefalopathy.degree <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, ~
## $ Ascitesdegree <fct> 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 2, 1, 1, 1, 2, 2, ~
## $ Class <fct> 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, ~
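The bar heights compared in the bar chart step are row-wise class proportions, so the visual comparison can be backed by numbers. The sketch below uses made-up data to compute those proportions and, as one possible formal check that the report itself does not use, a Fisher exact test. The table `toy` and its values are hypothetical.

```r
# Hypothetical toy data: one binary attribute and the survival class
toy <- data.frame(
  Symptoms = factor(c(0, 0, 0, 0, 1, 1, 1, 1, 1, 1)),
  Class    = factor(c(1, 1, 1, 0, 0, 0, 0, 1, 0, 0))
)

# Row-wise proportions: share of Class 0 and 1 within each attribute level
# (these are the bar heights drawn by geom_bar(position = "fill"))
tab   <- table(toy$Symptoms, toy$Class)
props <- prop.table(tab, margin = 1)
props

# Absolute gap in the survival share between the two attribute levels
gap <- abs(props["0", "1"] - props["1", "1"])

# Fisher's exact test as one possible formal check of the association
p_value <- fisher.test(tab)$p.value
```

Here the survival share is 0.75 for Symptoms = 0 and 1/6 for Symptoms = 1, which is the kind of gap the bar charts make visible.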
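In this section the attributes are selected with the “Boruta” package instead of by visual inspection. Boruta compares the random-forest importance of each attribute with that of shuffled “shadow” copies of the attributes and keeps those that score consistently higher.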
set.seed(111)
boruta.datakategorik <- Boruta(Class~., data = HCCKategorik_nomissing, doTrace = 2)
print(boruta.datakategorik)
## Boruta performed 99 iterations in 6.571949 secs.
## 5 attributes confirmed important: Ascitesdegree, LiverMetastasis,
## PerformanceStatus, PortalVeinThrombosis, Symptoms;
## 20 attributes confirmed unimportant: Alcohol, Arterial.Hypertension,
## Chronic.Renal.Insufficiency, Cirrhosis, Diabetes and 15 more;
## 2 tentative attributes left: Endemic.Countries,
## Hepatitis.B.Surface.Antigen;
getSelectedAttributes(boruta.datakategorik, withTentative = F)
## [1] "Symptoms" "PortalVeinThrombosis" "LiverMetastasis"
## [4] "PerformanceStatus" "Ascitesdegree"
Using the Boruta package, 5 variables were selected: Symptoms, Liver Metastasis, Ascites degree, Portal Vein Thrombosis, and Performance Status.
Atribut_Kategorik_Boruta <- select(HCCKategorik_nomissing, Patients, Symptoms, PortalVeinThrombosis, LiverMetastasis, PerformanceStatus, Ascitesdegree, Class)
glimpse(Atribut_Kategorik_Boruta)
## Rows: 165
## Columns: 7
## $ Patients <fct> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15~
## $ Symptoms <fct> 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0~
## $ PortalVeinThrombosis <fct> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1~
## $ LiverMetastasis <fct> 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
## $ PerformanceStatus <fct> 0, 0, 2, 0, 0, 1, 0, 3, 1, 0, 0, 0, 0, 0, 2, 1, 3~
## $ Ascitesdegree <fct> 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 2, 1, 1, 1, 2, 2, 1~
## $ Class <fct> 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0~
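Boruta judges each real attribute against "shadow" attributes, which are shuffled copies of the original columns. Only the shadow construction step is sketched below in base R; the random-forest importance estimation and the repeated statistical comparison that the package performs are omitted. The predictor table `x` and its values are hypothetical.

```r
set.seed(111)

# Hypothetical predictors (not the HCC data)
x <- data.frame(
  Symptoms          = factor(c(0, 1, 0, 1, 1, 0)),
  PerformanceStatus = factor(c(0, 0, 2, 0, 1, 3))
)

# Shadow attributes: each column is shuffled independently, which keeps its
# distribution but breaks any relationship with the class
shadows <- as.data.frame(lapply(x, sample))
names(shadows) <- paste0("shadow_", names(x))

# Extended table on which an importance measure would then be computed
extended <- cbind(x, shadows)
names(extended)
```

An attribute is considered relevant only if its importance is reliably higher than the best importance reached by any shadow column.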
HCCNumerik=read_excel("C:\\Users\\USER\\Downloads\\Tingkat 4\\04.DATMIN-BU WA ODE\\Tugas Pertama\\HCC-SURVIVAL.xlsx",sheet="Numerik")
glimpse(HCCNumerik)
## Rows: 165
## Columns: 25
## $ `Age at diagnosis` <dbl> 67, 62, 78, 77, 76, 75, 49, 61, 50,~
## $ `Grams of Alcohol per day` <chr> "137", "0", "50", "40", "100", "?",~
## $ `Packs of cigarets per year` <chr> "15", "?", "50", "30", "30", "?", "~
## $ `International Normalised Ratio` <chr> "1.53", "?", "0.96", "0.95", "0.94"~
## $ `Alpha-Fetoprotein (ng/mL)` <chr> "95", "?", "5.8", "2440", "49", "11~
## $ `Haemoglobin (g/dL)` <chr> "13.7", "?", "8.9", "13.4", "14.3",~
## $ `Mean Corpuscular Volume (fl)` <chr> "106.6", "?", "79.8", "97.1", "95.1~
## $ `Leukocytes(G/L)` <chr> "4.9000000000000004", "?", "8.4", "~
## $ `Platelets (G/L)` <chr> "99", "?", "472", "279", "199", "85~
## $ `Albumin (mg/dL)` <chr> "3.4", "?", "3.3", "3.7", "4.099999~
## $ `Total Bilirubin(mg/dL)` <chr> "2.1", "?", "0.4", "0.4", "0.7", "3~
## $ `Alanine transaminase (U/L)` <chr> "34", "?", "58", "16", "147", "91",~
## $ `Aspartate transaminase (U/L)` <chr> "41", "?", "68", "64", "306", "122"~
## $ `Gamma glutamyl transferase (U/L)` <chr> "183", "?", "202", "94", "173", "24~
## $ `Alkaline phosphatase (U/L)` <chr> "150", "?", "109", "174", "109", "3~
## $ `Total Proteins (g/dL)` <chr> "7.1", "?", "7", "8.1", "6.9", "5.6~
## $ `Creatinine (mg/dL)` <chr> "0.7", "?", "2.1", "1.1100000000000~
## $ `Number of Nodules` <chr> "1", "1", "5", "2", "1", "1", "5", ~
## $ `Major dimension of nodule (cm)` <chr> "3.5", "1.8", "13", "15.7", "9", "1~
## $ `Direct Bilirubin (mg/dL)` <chr> "0.5", "?", "0.1", "0.2", "?", "1.4~
## $ `Iron (mcg/dL)` <chr> "?", "?", "28", "?", "59", "53", "1~
## $ `Oxygen Saturation (%)` <chr> "?", "?", "6", "?", "15", "22", "12~
## $ `Ferritin (ng/mL)` <chr> "?", "?", "16", "?", "22", "111", "~
## $ Class <dbl> 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1,~
## $ Patients <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, ~
HCCNumerik <- HCCNumerik %>% mutate_if(is.character,as.numeric)
HCCNumerik$Class <- as.factor(HCCNumerik$Class)
glimpse(HCCNumerik)
## Rows: 165
## Columns: 25
## $ `Age at diagnosis` <dbl> 67, 62, 78, 77, 76, 75, 49, 61, 50,~
## $ `Grams of Alcohol per day` <dbl> 137, 0, 50, 40, 100, NA, 0, NA, 100~
## $ `Packs of cigarets per year` <dbl> 15, NA, 50, 30, 30, NA, 0, 20, 32, ~
## $ `International Normalised Ratio` <dbl> 1.53, NA, 0.96, 0.95, 0.94, 1.58, 1~
## $ `Alpha-Fetoprotein (ng/mL)` <dbl> 95.0, NA, 5.8, 2440.0, 49.0, 110.0,~
## $ `Haemoglobin (g/dL)` <dbl> 13.7, NA, 8.9, 13.4, 14.3, 13.4, 10~
## $ `Mean Corpuscular Volume (fl)` <dbl> 106.6, NA, 79.8, 97.1, 95.1, 91.5, ~
## $ `Leukocytes(G/L)` <dbl> 4.9, NA, 8.4, 9.0, 6.4, 5.4, 3.2, 3~
## $ `Platelets (G/L)` <dbl> 99, NA, 472, 279, 199, 85, 42000, 5~
## $ `Albumin (mg/dL)` <dbl> 3.40, NA, 3.30, 3.70, 4.10, 3.40, 2~
## $ `Total Bilirubin(mg/dL)` <dbl> 2.10, NA, 0.40, 0.40, 0.70, 3.50, 2~
## $ `Alanine transaminase (U/L)` <dbl> 34, NA, 58, 16, 147, 91, 119, 79, 2~
## $ `Aspartate transaminase (U/L)` <dbl> 41, NA, 68, 64, 306, 122, 183, 108,~
## $ `Gamma glutamyl transferase (U/L)` <dbl> 183, NA, 202, 94, 173, 242, 143, 18~
## $ `Alkaline phosphatase (U/L)` <dbl> 150, NA, 109, 174, 109, 396, 211, 3~
## $ `Total Proteins (g/dL)` <dbl> 7.1, NA, 7.0, 8.1, 6.9, 5.6, 7.3, 7~
## $ `Creatinine (mg/dL)` <dbl> 0.70, NA, 2.10, 1.11, 1.80, 0.90, 0~
## $ `Number of Nodules` <dbl> 1, 1, 5, 2, 1, 1, 5, 2, 1, 1, 5, 5,~
## $ `Major dimension of nodule (cm)` <dbl> 3.5, 1.8, 13.0, 15.7, 9.0, 10.0, 2.~
## $ `Direct Bilirubin (mg/dL)` <dbl> 0.50, NA, 0.10, 0.20, NA, 1.40, 2.1~
## $ `Iron (mcg/dL)` <dbl> NA, NA, 28.0, NA, 59.0, 53.0, 171.0~
## $ `Oxygen Saturation (%)` <dbl> NA, NA, 6, NA, 15, 22, 126, 25, 73,~
## $ `Ferritin (ng/mL)` <dbl> NA, NA, 16.0, NA, 22.0, 111.0, 1452~
## $ Class <fct> 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1,~
## $ Patients <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, ~
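The conversion above turns every "?" into NA because `as.numeric()` returns NA for strings that are not numbers. The sketch below shows this on a short made-up vector, together with an explicit alternative that marks "?" as missing first; the vector `raw` is hypothetical.

```r
raw <- c("1.53", "?", "0.96")   # hypothetical character column

# as.numeric() returns NA for strings that are not numbers and emits a
# coercion warning; suppressWarnings() only hides that warning
num <- suppressWarnings(as.numeric(raw))
num

# Explicit alternative: mark "?" as missing first, so no warning is raised
raw[raw == "?"] <- NA
num_explicit <- as.numeric(raw)
```

Both routes give the same numeric vector; the explicit one makes it clear that only "?" was intended to become NA.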
options(repr.plot.width = 6, repr.plot.height = 4)
missing_data <- HCCNumerik %>% summarise(across(everything(), ~ mean(is.na(.x))))
missing_data <- pivot_longer(missing_data, everything(), names_to = "variables", values_to = "percent_missing")
ggplot(missing_data, aes(x = reorder(variables, percent_missing), y = percent_missing)) +
geom_bar(stat = "identity", fill = "black", aes(color = I('white')), size = 0.3)+
xlab('variables')+
coord_flip()+
theme_bw()
The plot above shows that most attributes have missing values. To fill the missing values in the numeric data, the author uses the mean of each variable.
# Fill the missing values of every numeric column with that column's mean
HCCNumerik <- HCCNumerik %>% mutate(across(where(is.numeric), ~ replace(.x, is.na(.x), mean(.x, na.rm = TRUE))))
The missing values in the numeric dataset have now been filled; this can be checked again with the same plot as before.
options(repr.plot.width = 6, repr.plot.height = 4)
missing_data <- HCCNumerik %>% summarise(across(everything(), ~ mean(is.na(.x))))
missing_data <- pivot_longer(missing_data, everything(), names_to = "variables", values_to = "percent_missing")
ggplot(missing_data, aes(x = reorder(variables, percent_missing), y = percent_missing)) +
geom_bar(stat = "identity", fill = "black", aes(color = I('white')), size = 0.3)+
xlab('variables')+
coord_flip()+
theme_bw()
The plot above shows that no attribute has missing values any more.
set.seed(111)
boruta.datanumerik <- Boruta(Class~., data = HCCNumerik, doTrace = 2)
print(boruta.datanumerik)
## Boruta performed 99 iterations in 6.806245 secs.
## 7 attributes confirmed important: `Albumin (mg/dL)`, `Alkaline
## phosphatase (U/L)`, `Alpha-Fetoprotein (ng/mL)`, `Aspartate
## transaminase (U/L)`, `Ferritin (ng/mL)` and 2 more;
## 14 attributes confirmed unimportant: `Alanine transaminase (U/L)`,
## `Creatinine (mg/dL)`, `Direct Bilirubin (mg/dL)`, `Gamma glutamyl
## transferase (U/L)`, `Grams of Alcohol per day` and 9 more;
## 3 tentative attributes left: `Age at diagnosis`, `Oxygen Saturation
## (%)`, `Platelets (G/L)`;
getSelectedAttributes(boruta.datanumerik, withTentative = F)
## [1] "`Alpha-Fetoprotein (ng/mL)`" "`Haemoglobin (g/dL)`"
## [3] "`Albumin (mg/dL)`" "`Aspartate transaminase (U/L)`"
## [5] "`Alkaline phosphatase (U/L)`" "`Iron (mcg/dL)`"
## [7] "`Ferritin (ng/mL)`"
Using the Boruta package, the selected attributes are Alpha-Fetoprotein, Haemoglobin, Albumin, Aspartate Transaminase, Alkaline Phosphatase, Iron, and Ferritin.
###5. Extracting the Reduced Numeric Dataset
Atribut_Numerik_Boruta <- select(.data = HCCNumerik, Patients, `Alpha-Fetoprotein (ng/mL)`,`Haemoglobin (g/dL)`, `Albumin (mg/dL)`, `Aspartate transaminase (U/L)`, `Alkaline phosphatase (U/L)`, `Iron (mcg/dL)`, `Ferritin (ng/mL)`, Class)
glimpse(Atribut_Numerik_Boruta)
## Rows: 165
## Columns: 9
## $ Patients <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, ~
## $ `Alpha-Fetoprotein (ng/mL)` <dbl> 95.00, 19299.95, 5.80, 2440.00, 49.00, ~
## $ `Haemoglobin (g/dL)` <dbl> 13.70000, 12.87901, 8.90000, 13.40000, ~
## $ `Albumin (mg/dL)` <dbl> 3.400000, 3.445535, 3.300000, 3.700000,~
## $ `Aspartate transaminase (U/L)` <dbl> 41.00000, 96.38272, 68.00000, 64.00000,~
## $ `Alkaline phosphatase (U/L)` <dbl> 150.0000, 212.2116, 109.0000, 174.0000,~
## $ `Iron (mcg/dL)` <dbl> 85.59884, 85.59884, 28.00000, 85.59884,~
## $ `Ferritin (ng/mL)` <dbl> 438.9976, 438.9976, 16.0000, 438.9976, ~
## $ Class <fct> 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, ~
After selecting attributes from the categorical and the numeric datasets, the next step is to combine the two. Because two methods were tried for reducing the categorical data, two final datasets are formed.
###Dataset with the Manual Bar Chart Method
datasetfinalbarchart=data.frame(Atribut_Kategorik_Barchart[,1:10],Atribut_Numerik_Boruta[,2:9])
head(datasetfinalbarchart)
## Patients Symptoms Hepatitis.B.e.Antigen Endemic.Countries Smoking
## 1 1 0 0 0 1
## 2 2 1 0 0 1
## 3 3 0 0 0 1
## 4 4 1 0 0 1
## 5 5 1 0 0 1
## 6 6 0 0 0 1
## PortalVeinThrombosis LiverMetastasis PerformanceStatus Encefalopathy.degree
## 1 0 0 0 1
## 2 0 0 0 1
## 3 0 1 2 1
## 4 0 1 0 1
## 5 0 0 0 1
## 6 0 0 1 1
## Ascitesdegree Alpha.Fetoprotein..ng.mL. Haemoglobin..g.dL. Albumin..mg.dL.
## 1 1 95.00 13.70000 3.400000
## 2 1 19299.95 12.87901 3.445535
## 3 2 5.80 8.90000 3.300000
## 4 1 2440.00 13.40000 3.700000
## 5 1 49.00 14.30000 4.100000
## 6 2 110.00 13.40000 3.400000
## Aspartate.transaminase..U.L. Alkaline.phosphatase..U.L. Iron..mcg.dL.
## 1 41.00000 150.0000 85.59884
## 2 96.38272 212.2116 85.59884
## 3 68.00000 109.0000 28.00000
## 4 64.00000 174.0000 85.59884
## 5 306.00000 109.0000 59.00000
## 6 122.00000 396.0000 53.00000
## Ferritin..ng.mL. Class
## 1 438.9976 1
## 2 438.9976 1
## 3 16.0000 1
## 4 438.9976 0
## 5 22.0000 1
## 6 111.0000 0
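Combining the two tables with `data.frame()` pairs rows by position, so it relies on both tables listing the patients in the same order. A key-based merge on the patient identifier is a more defensive alternative. The sketch below contrasts the two on made-up tables; `kategorik`, `numerik`, and their values are hypothetical, and the key is assumed to have the same type in both tables (in this report `Patients` is a factor in one table and numeric in the other, so it would need to be harmonised first).

```r
# Hypothetical tables sharing a patient key, deliberately in different row order
kategorik <- data.frame(Patients = c(1, 2, 3), Symptoms = factor(c(0, 1, 0)))
numerik   <- data.frame(Patients = c(3, 1, 2),
                        Haemoglobin = c(8.9, 13.7, 12.9),
                        Class = factor(c(1, 1, 1)))

# Positional binding pairs rows by position, so patient 1 would receive
# the haemoglobin value of patient 3 here
by_position <- data.frame(kategorik, numerik[, -1])

# Key-based merge pairs rows by the Patients value instead
by_key <- merge(kategorik, numerik, by = "Patients")
by_key
```

When both tables are already in the same order, as they appear to be in this report, the two approaches give the same rows.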
###Dataset with the Boruta Package Method
datasetfinalboruta=data.frame(Atribut_Kategorik_Boruta[,1:6],Atribut_Numerik_Boruta[,2:9])
head(datasetfinalboruta)
## Patients Symptoms PortalVeinThrombosis LiverMetastasis PerformanceStatus
## 1 1 0 0 0 0
## 2 2 1 0 0 0
## 3 3 0 0 1 2
## 4 4 1 0 1 0
## 5 5 1 0 0 0
## 6 6 0 0 0 1
## Ascitesdegree Alpha.Fetoprotein..ng.mL. Haemoglobin..g.dL. Albumin..mg.dL.
## 1 1 95.00 13.70000 3.400000
## 2 1 19299.95 12.87901 3.445535
## 3 2 5.80 8.90000 3.300000
## 4 1 2440.00 13.40000 3.700000
## 5 1 49.00 14.30000 4.100000
## 6 2 110.00 13.40000 3.400000
## Aspartate.transaminase..U.L. Alkaline.phosphatase..U.L. Iron..mcg.dL.
## 1 41.00000 150.0000 85.59884
## 2 96.38272 212.2116 85.59884
## 3 68.00000 109.0000 28.00000
## 4 64.00000 174.0000 85.59884
## 5 306.00000 109.0000 59.00000
## 6 122.00000 396.0000 53.00000
## Ferritin..ng.mL. Class
## 1 438.9976 1
## 2 438.9976 1
## 3 16.0000 1
## 4 438.9976 0
## 5 22.0000 1
## 6 111.0000 0
At this stage the author uses only the categorical attributes to perform the classification.
##BARCHART (50 TEST DATA)
##1. Load HCC Survival Data
Data_DT_Barchart <- datasetfinalbarchart
Data_DT_Barchart
## Patients Symptoms Hepatitis.B.e.Antigen Endemic.Countries Smoking
## 1 1 0 0 0 1
## 2 2 1 0 0 1
## 3 3 0 0 0 1
## 4 4 1 0 0 1
## 5 5 1 0 0 1
## 6 6 0 0 0 1
## 7 7 0 0 0 0
## 8 8 1 0 0 1
## 9 9 1 0 0 1
## 10 10 1 0 0 0
## 11 11 0 0 0 1
## 12 12 0 0 1 0
## 13 13 0 0 0 1
## 14 14 1 0 0 1
## 15 15 0 0 0 1
## 16 16 0 0 0 0
## 17 17 0 0 0 1
## 18 18 1 0 0 1
## 19 19 1 0 0 1
## 20 20 1 0 0 1
## 21 21 0 0 0 1
## 22 22 0 0 0 1
## 23 23 1 0 0 0
## 24 24 1 0 1 1
## 25 25 1 0 0 1
## 26 26 0 0 0 0
## 27 27 1 0 0 1
## 28 28 1 0 0 1
## 29 29 0 0 0 1
## 30 30 1 0 0 1
## 31 31 1 0 1 1
## 32 32 0 0 0 1
## 33 33 1 0 0 1
## 34 34 1 0 0 1
## 35 35 1 0 0 0
## 36 36 1 0 0 0
## 37 37 1 0 0 0
## 38 38 1 0 0 1
## 39 39 0 0 0 0
## 40 40 1 0 1 0
## 41 41 1 0 0 0
## 42 42 0 0 0 0
## 43 43 0 0 0 1
## 44 44 0 0 0 1
## 45 45 0 0 0 0
## 46 46 0 0 0 0
## 47 47 1 0 0 0
## 48 48 1 0 0 1
## 49 49 1 0 0 1
## 50 50 1 0 0 1
## 51 51 1 0 0 0
## 52 52 0 0 0 0
## 53 53 1 0 0 0
## 54 54 1 0 0 0
## 55 55 1 0 0 1
## 56 56 1 0 0 0
## 57 57 1 0 0 1
## 58 58 1 0 0 1
## 59 59 1 0 0 0
## 60 60 1 0 0 1
## 61 61 1 0 1 1
## 62 62 1 0 0 1
## 63 63 0 0 0 1
## 64 64 1 0 0 1
## 65 65 0 0 0 0
## 66 66 1 0 0 0
## 67 67 1 0 0 1
## 68 68 1 0 0 0
## 69 69 1 0 0 0
## 70 70 1 0 0 0
## 71 71 1 0 1 0
## 72 72 1 0 0 1
## 73 73 1 0 0 1
## 74 74 1 0 0 0
## 75 75 1 0 0 1
## 76 76 1 0 0 1
## 77 77 1 0 0 1
## 78 78 1 0 0 1
## 79 79 1 0 0 1
## 80 80 1 0 0 1
## 81 81 1 0 0 1
## 82 82 0 0 0 1
## 83 83 0 0 0 0
## 84 84 1 0 0 1
## 85 85 1 0 0 1
## 86 86 1 0 0 1
## 87 87 1 0 0 1
## 88 88 0 0 0 1
## 89 89 1 0 0 1
## 90 90 0 0 0 1
## 91 91 1 0 0 0
## 92 92 0 0 0 1
## 93 93 1 0 0 0
## 94 94 1 0 0 1
## 95 95 1 0 0 1
## 96 96 0 0 0 0
## 97 97 1 0 1 0
## 98 98 0 0 0 1
## 99 99 0 0 0 1
## 100 100 0 0 0 0
## 101 101 1 0 0 0
## 102 102 1 0 0 1
## 103 103 1 0 0 1
## 104 104 1 0 0 1
## 105 105 1 0 0 1
## 106 106 1 0 0 0
## 107 107 1 0 0 1
## 108 108 1 0 0 0
## 109 109 1 0 0 0
## 110 110 0 0 0 1
## 111 111 1 0 0 1
## 112 112 1 0 0 1
## 113 113 0 0 0 1
## 114 114 1 0 0 1
## 115 115 1 0 0 1
## 116 116 1 0 0 1
## 117 117 1 0 0 1
## 118 118 1 1 0 1
## 119 119 0 0 0 0
## 120 120 0 0 0 1
## 121 121 1 0 1 0
## 122 122 0 0 0 0
## 123 123 1 0 0 1
## 124 124 0 0 0 0
## 125 125 1 0 0 1
## 126 126 1 0 0 1
## 127 127 0 0 0 1
## 128 128 0 0 0 0
## 129 129 0 0 0 1
## 130 130 1 0 0 0
## 131 131 1 0 0 0
## 132 132 0 0 0 1
## 133 133 0 0 0 0
## 134 134 1 0 0 0
## 135 135 1 0 0 0
## 136 136 1 0 0 0
## 137 137 1 0 1 1
## 138 138 1 0 0 0
## 139 139 1 0 0 1
## 140 140 1 0 0 1
## 141 141 1 0 0 0
## 142 142 1 0 0 1
## 143 143 1 0 0 0
## 144 144 0 0 0 1
## 145 145 1 0 0 1
## 146 146 1 0 0 1
## 147 147 1 0 0 1
## 148 148 0 0 0 1
## 149 149 0 0 0 0
## 150 150 1 0 0 0
## 151 151 0 0 0 0
## 152 152 0 0 0 1
## 153 153 1 0 0 1
## 154 154 1 0 0 0
## 155 155 0 0 0 0
## 156 156 1 0 0 1
## 157 157 1 0 0 0
## 158 158 1 0 0 1
## 159 159 1 0 0 1
## 160 160 1 0 0 0
## 161 161 0 0 0 1
## 162 162 1 0 0 0
## 163 163 0 0 0 1
## 164 164 0 0 1 1
## 165 165 1 0 0 1
## PortalVeinThrombosis LiverMetastasis PerformanceStatus Encefalopathy.degree
## 1 0 0 0 1
## 2 0 0 0 1
## 3 0 1 2 1
## 4 0 1 0 1
## 5 0 0 0 1
## 6 0 0 1 1
## 7 0 0 0 1
## 8 1 0 3 1
## 9 0 0 1 1
## 10 0 0 0 1
## 11 0 0 0 1
## 12 0 0 0 1
## 13 1 0 0 1
## 14 0 0 0 1
## 15 0 0 2 2
## 16 1 0 1 1
## 17 1 0 3 2
## 18 1 1 2 1
## 19 0 0 1 1
## 20 1 0 0 1
## 21 0 0 0 1
## 22 0 0 0 1
## 23 0 0 1 1
## 24 1 1 1 1
## 25 0 1 4 1
## 26 1 0 2 1
## 27 1 0 1 1
## 28 0 0 0 1
## 29 1 0 1 1
## 30 0 0 1 1
## 31 1 0 2 1
## 32 0 0 0 1
## 33 0 0 0 1
## 34 0 0 0 1
## 35 0 0 0 1
## 36 0 1 0 1
## 37 0 0 4 1
## 38 0 0 1 1
## 39 0 0 0 1
## 40 0 1 0 1
## 41 0 0 0 1
## 42 0 0 0 1
## 43 0 1 3 1
## 44 0 0 0 1
## 45 0 0 0 1
## 46 0 0 0 1
## 47 0 0 0 1
## 48 1 1 1 1
## 49 0 0 0 1
## 50 0 0 0 1
## 51 0 1 2 1
## 52 0 1 0 1
## 53 0 0 0 1
## 54 0 0 0 1
## 55 1 0 3 1
## 56 0 1 1 1
## 57 0 0 2 1
## 58 0 0 0 1
## 59 0 0 3 2
## 60 0 0 0 1
## 61 1 0 1 1
## 62 0 0 0 2
## 63 0 0 2 1
## 64 0 0 3 1
## 65 0 0 0 1
## 66 0 1 2 1
## 67 0 0 0 1
## 68 0 0 0 1
## 69 0 1 0 1
## 70 1 0 2 2
## 71 0 0 0 1
## 72 1 1 3 2
## 73 1 1 2 1
## 74 1 1 0 1
## 75 0 0 0 1
## 76 0 0 0 1
## 77 1 0 2 1
## 78 0 0 0 1
## 79 0 1 3 1
## 80 0 0 2 2
## 81 0 1 3 2
## 82 0 0 1 2
## 83 0 0 1 1
## 84 1 0 2 1
## 85 0 0 0 1
## 86 1 0 1 2
## 87 0 1 1 1
## 88 0 0 0 1
## 89 0 0 0 1
## 90 0 0 1 2
## 91 0 1 2 1
## 92 0 0 0 1
## 93 0 0 3 1
## 94 1 1 1 1
## 95 0 0 0 1
## 96 0 0 1 1
## 97 0 0 1 1
## 98 1 0 0 1
## 99 0 0 0 1
## 100 0 0 2 1
## 101 0 1 2 1
## 102 0 1 1 1
## 103 0 0 2 1
## 104 0 0 4 3
## 105 0 0 1 1
## 106 0 0 3 1
## 107 0 0 0 1
## 108 0 0 3 1
## 109 0 1 1 1
## 110 0 0 0 1
## 111 1 1 2 2
## 112 1 0 2 1
## 113 0 0 0 1
## 114 1 1 1 1
## 115 1 0 3 3
## 116 0 0 0 1
## 117 0 0 0 1
## 118 0 0 4 3
## 119 0 0 0 1
## 120 0 0 0 1
## 121 0 0 1 1
## 122 0 0 0 1
## 123 1 1 2 1
## 124 0 0 0 1
## 125 0 0 2 1
## 126 1 0 4 2
## 127 0 0 0 1
## 128 1 0 3 2
## 129 0 0 2 1
## 130 0 0 2 2
## 131 1 1 2 1
## 132 0 0 2 2
## 133 0 0 0 1
## 134 0 1 1 1
## 135 0 0 0 1
## 136 0 1 1 1
## 137 0 0 2 1
## 138 0 0 0 2
## 139 0 0 0 1
## 140 0 0 2 1
## 141 1 0 0 1
## 142 0 0 0 1
## 143 0 0 0 1
## 144 0 0 0 1
## 145 0 0 0 1
## 146 0 0 0 1
## 147 1 0 2 1
## 148 0 0 0 1
## 149 0 0 0 1
## 150 1 0 2 1
## 151 0 0 0 1
## 152 0 0 0 1
## 153 0 1 1 1
## 154 0 0 3 1
## 155 0 0 0 1
## 156 1 0 3 2
## 157 0 1 1 1
## 158 0 0 0 1
## 159 0 0 3 1
## 160 0 1 3 3
## 161 0 0 0 1
## 162 0 0 2 1
## 163 0 0 0 1
## 164 1 1 2 1
## 165 0 1 0 1
## Ascitesdegree Alpha.Fetoprotein..ng.mL. Haemoglobin..g.dL. Albumin..mg.dL.
## 1 1 95.00 13.70000 3.400000
## 2 1 19299.95 12.87901 3.445535
## 3 2 5.80 8.90000 3.300000
## 4 1 2440.00 13.40000 3.700000
## 5 1 49.00 14.30000 4.100000
## 6 2 110.00 13.40000 3.400000
## 7 1 138.90 10.40000 2.350000
## 8 1 9860.00 10.80000 3.100000
## 9 2 8.80 11.90000 1.900000
## 10 1 1.80 11.80000 4.200000
## 11 2 100809.00 13.00000 4.400000
## 12 1 86.00 15.70000 3.700000
## 13 1 60.00 13.30000 4.400000
## 14 1 6.60 13.70000 4.500000
## 15 2 29.00 13.50000 3.150000
## 16 2 4.60 10.20000 3.100000
## 17 1 60.00 12.10000 2.400000
## 18 1 9.20 10.30000 3.800000
## 19 1 8.80 14.90000 4.300000
## 20 3 34.00 15.90000 3.400000
## 21 1 19.60 11.70000 3.600000
## 22 1 3.90 16.40000 4.500000
## 23 3 1975.00 10.80000 3.100000
## 24 1 185.00 10.70000 3.600000
## 25 1 5532.00 13.10000 2.400000
## 26 1 13327.00 13.70000 3.100000
## 27 3 19299.95 13.60000 3.300000
## 28 1 5.90 15.50000 4.500000
## 29 1 3255.00 12.20000 3.000000
## 30 1 1.90 9.90000 3.200000
## 31 1 11.00 13.10000 3.500000
## 32 1 1237.00 12.20000 4.200000
## 33 1 7.70 15.70000 3.400000
## 34 1 266.00 13.70000 4.200000
## 35 1 5689.00 14.30000 3.800000
## 36 1 14.20 14.80000 4.200000
## 37 2 3.10 13.10000 2.300000
## 38 2 633.00 13.00000 2.300000
## 39 1 5.40 14.90000 4.200000
## 40 1 479.00 11.30000 3.100000
## 41 1 19.00 13.90000 4.400000
## 42 1 2.80 15.70000 4.100000
## 43 1 185203.00 15.00000 3.600000
## 44 1 5.00 15.10000 3.800000
## 45 1 237.00 15.60000 3.200000
## 46 1 2.80 16.60000 4.900000
## 47 1 19299.95 12.87901 3.445535
## 48 2 16.00 14.00000 3.100000
## 49 1 163.00 10.60000 3.300000
## 50 1 20.00 15.00000 4.600000
## 51 1 46.00 10.50000 2.400000
## 52 1 41470.00 12.60000 3.110000
## 53 1 4.70 11.80000 4.100000
## 54 1 7.30 15.30000 3.445535
## 55 2 1898.00 12.40000 2.700000
## 56 1 77.00 10.80000 3.200000
## 57 2 2.70 7.30000 3.400000
## 58 1 2.60 10.30000 3.900000
## 59 2 12.00 10.90000 1.900000
## 60 1 19299.95 18.70000 4.000000
## 61 3 608.00 12.60000 4.000000
## 62 2 41.00 14.60000 3.100000
## 63 2 19299.95 11.50000 2.700000
## 64 3 2.00 12.60000 3.500000
## 65 1 7.00 12.10000 3.800000
## 66 1 3.10 15.10000 2.800000
## 67 1 13.00 15.60000 4.540000
## 68 1 1.70 9.50000 2.100000
## 69 2 249.00 12.70000 3.100000
## 70 3 66.00 10.90000 2.700000
## 71 1 358.00 12.70000 2.900000
## 72 2 1810346.00 13.00000 3.400000
## 73 2 33502.00 14.40000 3.100000
## 74 1 20.00 13.00000 3.700000
## 75 1 2.50 14.90000 3.800000
## 76 1 2269.00 12.10000 3.445535
## 77 2 4181.00 9.10000 3.000000
## 78 1 5.10 14.40000 4.100000
## 79 1 345.00 9.80000 2.600000
## 80 2 2.90 10.40000 2.400000
## 81 1 2.50 12.60000 3.300000
## 82 1 5.04 15.80000 3.500000
## 83 3 3.70 14.80000 2.700000
## 84 3 2.60 12.70000 2.900000
## 85 1 2.90 16.40000 4.100000
## 86 1 20.00 13.90000 3.000000
## 87 1 2.79 12.60000 3.445535
## 88 2 42.00 15.80000 4.200000
## 89 2 457.00 11.70000 4.100000
## 90 3 19299.95 9.50000 2.200000
## 91 3 123.00 10.10000 4.000000
## 92 1 8.70 14.60000 4.200000
## 93 1 226.00 11.50000 4.200000
## 94 1 2159.00 13.10000 3.600000
## 95 1 48.00 12.60000 4.200000
## 96 1 2.40 9.50000 3.500000
## 97 2 5.00 9.10000 2.470000
## 98 1 64.00 14.10000 2.900000
## 99 1 5.50 13.10000 3.200000
## 100 2 8.50 13.00000 2.700000
## 101 1 2785.00 12.00000 2.600000
## 102 3 6574.00 13.50000 3.500000
## 103 2 5.70 13.90000 2.100000
## 104 3 19299.95 13.50000 2.900000
## 105 1 39.00 12.60000 3.800000
## 106 2 2.50 14.30000 2.200000
## 107 1 32.00 14.00000 4.200000
## 108 2 7.60 5.00000 3.700000
## 109 1 173.00 11.30000 2.400000
## 110 1 2.30 12.10000 2.700000
## 111 2 114.00 14.90000 3.200000
## 112 2 173.00 11.10000 3.100000
## 113 1 18.00 15.30000 4.200000
## 114 1 42.00 16.20000 4.100000
## 115 2 28274.00 10.30000 2.400000
## 116 1 736.00 14.00000 4.700000
## 117 3 1009.00 13.80000 3.000000
## 118 2 22475.00 12.00000 3.280000
## 119 1 5.20 14.60000 4.200000
## 120 2 5.20 14.90000 3.200000
## 121 1 14177.00 10.20000 2.600000
## 122 1 3.10 10.80000 3.600000
## 123 1 50655.00 9.80000 2.600000
## 124 1 1.20 15.10000 4.700000
## 125 2 657.00 11.80000 3.200000
## 126 2 421500.00 14.30000 3.100000
## 127 1 472.00 15.60000 4.000000
## 128 1 77.00 12.30000 3.200000
## 129 1 2.10 12.60000 3.680000
## 130 2 2.00 11.60000 3.890000
## 131 3 4.20 14.90000 3.500000
## 132 2 7.50 11.30000 3.200000
## 133 1 152.00 10.90000 3.200000
## 134 1 811.00 11.40000 3.400000
## 135 1 4.70 14.20000 3.100000
## 136 1 2.60 13.30000 4.100000
## 137 1 2089.00 15.40000 4.400000
## 138 1 18.00 13.20000 4.300000
## 139 1 4.90 13.60000 4.500000
## 140 1 240.00 14.70000 3.600000
## 141 1 180.00 12.20000 3.500000
## 142 1 15.00 13.30000 3.200000
## 143 1 24.00 16.00000 4.100000
## 144 1 9.40 11.20000 3.400000
## 145 1 7.00 14.90000 3.000000
## 146 1 4.80 9.20000 3.880000
## 147 3 92421.00 14.30000 3.200000
## 148 1 5.20 11.10000 3.300000
## 149 1 16.00 13.30000 3.800000
## 150 1 1.50 12.87901 3.445535
## 151 1 3204.00 16.10000 4.300000
## 152 1 10.00 13.50000 4.000000
## 153 3 695.00 11.10000 2.700000
## 154 1 33.00 9.70000 3.800000
## 155 1 615.00 11.70000 2.820000
## 156 3 1671.00 12.80000 3.440000
## 157 1 975.00 15.30000 3.500000
## 158 2 1.70 14.70000 3.500000
## 159 1 1713.00 8.20000 3.600000
## 160 3 4.90 7.90000 2.430000
## 161 1 19299.95 15.40000 4.600000
## 162 1 4887.00 12.10000 3.000000
## 163 1 75.00 13.30000 4.300000
## 164 1 94964.00 15.60000 4.800000
## 165 1 44340.00 12.70000 2.200000
## Aspartate.transaminase..U.L. Alkaline.phosphatase..U.L. Iron..mcg.dL.
## 1 41.00000 150.0000 85.59884
## 2 96.38272 212.2116 85.59884
## 3 68.00000 109.0000 28.00000
## 4 64.00000 174.0000 85.59884
## 5 306.00000 109.0000 59.00000
## 6 122.00000 396.0000 53.00000
## 7 183.00000 211.0000 171.00000
## 8 108.00000 300.0000 42.00000
## 9 59.00000 63.0000 85.00000
## 10 45.00000 303.0000 85.59884
## 11 334.00000 236.0000 85.59884
## 12 168.00000 154.0000 144.00000
## 13 36.00000 74.0000 85.59884
## 14 96.00000 70.0000 82.00000
## 15 116.00000 163.0000 197.00000
## 16 57.00000 176.0000 25.00000
## 17 63.00000 235.0000 136.00000
## 18 91.00000 146.0000 187.00000
## 19 23.00000 180.0000 144.00000
## 20 87.00000 147.0000 67.00000
## 21 35.00000 141.0000 152.60000
## 22 47.00000 97.0000 87.00000
## 23 136.00000 562.0000 112.00000
## 24 86.00000 396.0000 85.59884
## 25 67.00000 311.0000 85.59884
## 26 107.00000 233.0000 93.00000
## 27 325.00000 172.0000 85.59884
## 28 65.00000 111.0000 180.00000
## 29 85.00000 293.0000 94.00000
## 30 112.00000 974.0000 22.00000
## 31 26.00000 158.0000 85.59884
## 32 85.00000 129.0000 75.00000
## 33 29.00000 135.0000 85.59884
## 34 85.00000 227.0000 85.59884
## 35 102.00000 184.0000 85.59884
## 36 86.00000 113.0000 85.59884
## 37 74.00000 127.0000 98.00000
## 38 58.00000 209.0000 57.00000
## 39 26.00000 92.0000 85.59884
## 40 143.00000 288.0000 85.59884
## 41 117.00000 104.0000 143.00000
## 42 553.00000 68.0000 85.59884
## 43 117.00000 278.0000 21.00000
## 44 56.00000 56.0000 85.59884
## 45 52.00000 97.0000 85.59884
## 46 29.00000 68.0000 85.59884
## 47 96.38272 212.2116 85.59884
## 48 244.00000 595.0000 63.00000
## 49 57.00000 171.0000 45.00000
## 50 49.00000 109.0000 184.00000
## 51 185.00000 539.0000 85.59884
## 52 94.00000 350.0000 84.00000
## 53 47.00000 62.0000 104.00000
## 54 93.00000 130.0000 184.00000
## 55 523.00000 397.0000 56.00000
## 56 19.00000 923.0000 85.59884
## 57 32.00000 55.0000 22.00000
## 58 28.00000 120.0000 32.00000
## 59 85.00000 263.0000 44.00000
## 60 73.00000 103.0000 85.59884
## 61 99.00000 100.0000 85.59884
## 62 165.00000 207.0000 224.00000
## 63 48.00000 178.0000 85.59884
## 64 40.00000 166.0000 85.59884
## 65 38.00000 161.0000 0.00000
## 66 50.00000 104.0000 46.00000
## 67 52.00000 113.0000 94.00000
## 68 82.00000 113.0000 94.00000
## 69 42.00000 108.0000 37.00000
## 70 84.00000 260.0000 15.00000
## 71 41.00000 94.0000 50.00000
## 72 86.00000 417.0000 85.59884
## 73 80.00000 1.2800 26.00000
## 74 354.00000 684.0000 85.59884
## 75 38.00000 101.0000 61.00000
## 76 357.00000 174.0000 178.00000
## 77 91.00000 165.0000 72.00000
## 78 63.00000 114.0000 85.59884
## 79 43.00000 88.0000 19.00000
## 80 145.00000 190.0000 85.59884
## 81 43.00000 204.0000 85.59884
## 82 85.00000 165.0000 200.00000
## 83 157.00000 280.0000 85.59884
## 84 53.00000 207.0000 105.00000
## 85 74.00000 85.0000 85.59884
## 86 92.00000 244.0000 181.00000
## 87 46.00000 213.0000 26.00000
## 88 62.00000 70.0000 120.00000
## 89 33.00000 90.0000 13.00000
## 90 51.00000 474.0000 85.59884
## 91 60.00000 177.0000 37.00000
## 92 33.00000 120.0000 184.00000
## 93 31.00000 222.0000 85.59884
## 94 197.00000 335.0000 85.59884
## 95 31.00000 91.0000 53.00000
## 96 17.00000 124.0000 15.00000
## 97 29.00000 254.0000 85.59884
## 98 43.00000 137.0000 85.59884
## 99 30.00000 92.0000 93.00000
## 100 335.00000 66.0000 85.59884
## 101 34.00000 297.0000 55.00000
## 102 192.00000 262.0000 121.00000
## 103 75.00000 110.0000 92.00000
## 104 266.00000 670.0000 85.59884
## 105 74.00000 312.0000 87.00000
## 106 71.00000 97.0000 92.00000
## 107 226.00000 174.0000 26.40000
## 108 69.00000 86.0000 9.00000
## 109 114.00000 163.0000 131.00000
## 110 27.00000 120.0000 85.59884
## 111 154.00000 166.0000 161.00000
## 112 206.00000 188.0000 85.59884
## 113 49.00000 109.0000 85.59884
## 114 118.00000 158.0000 85.59884
## 115 111.00000 128.0000 51.00000
## 116 113.00000 629.0000 85.59884
## 117 55.00000 235.0000 85.59884
## 118 178.00000 146.0000 106.00000
## 119 69.00000 79.0000 85.59884
## 120 87.00000 239.0000 85.59884
## 121 113.00000 980.0000 85.59884
## 122 125.00000 433.0000 52.50000
## 123 219.00000 363.0000 40.00000
## 124 17.00000 151.0000 88.00000
## 125 101.00000 466.0000 85.59884
## 126 44.00000 217.0000 52.00000
## 127 128.00000 117.0000 85.59884
## 128 38.00000 182.0000 93.00000
## 129 38.00000 127.0000 28.00000
## 130 48.00000 171.0000 85.59884
## 131 61.00000 150.0000 85.59884
## 132 94.00000 147.0000 85.59884
## 133 80.00000 106.0000 85.59884
## 134 102.00000 587.0000 14.00000
## 135 44.00000 517.0000 85.59884
## 136 52.00000 123.0000 85.59884
## 137 32.00000 295.0000 85.59884
## 138 29.00000 141.0000 91.00000
## 139 63.00000 89.0000 78.00000
## 140 132.00000 192.0000 85.59884
## 141 58.00000 302.0000 85.59884
## 142 87.00000 108.0000 85.59884
## 143 158.00000 84.0000 85.59884
## 144 63.00000 106.0000 94.00000
## 145 401.00000 93.0000 124.00000
## 146 51.00000 141.0000 85.59884
## 147 76.00000 472.0000 29.00000
## 148 47.00000 117.0000 85.59884
## 149 35.00000 105.0000 85.59884
## 150 96.38272 212.2116 85.59884
## 151 31.00000 79.0000 85.59884
## 152 79.00000 85.0000 85.59884
## 153 73.00000 44.0000 85.59884
## 154 54.00000 338.0000 85.59884
## 155 50.00000 318.0000 85.59884
## 156 95.00000 139.0000 111.00000
## 157 85.00000 266.0000 180.00000
## 158 24.00000 97.0000 85.59884
## 159 59.00000 263.0000 85.59884
## 160 71.00000 73.0000 40.00000
## 161 40.00000 109.0000 85.59884
## 162 91.00000 280.0000 85.59884
## 163 52.00000 181.0000 85.59884
## 164 60.00000 170.0000 85.59884
## 165 127.00000 462.0000 85.59884
## Ferritin..ng.mL. Class
## 1 438.9976 1
## 2 438.9976 1
## 3 16.0000 1
## 4 438.9976 0
## 5 22.0000 1
## 6 111.0000 0
## 7 1452.0000 0
## 8 706.0000 0
## 9 982.0000 1
## 10 438.9976 1
## 11 438.9976 0
## 12 277.0000 1
## 13 438.9976 1
## 14 438.9976 1
## 15 302.0000 1
## 16 60.0000 1
## 17 767.0000 0
## 18 443.0000 1
## 19 295.0000 1
## 20 774.0000 0
## 21 76.9000 1
## 22 84.0000 1
## 23 1001.0000 0
## 24 438.9976 0
## 25 438.9976 0
## 26 79.0000 1
## 27 438.9976 0
## 28 438.9976 1
## 29 70.0000 0
## 30 369.0000 1
## 31 438.9976 1
## 32 239.0000 1
## 33 438.9976 1
## 34 438.9976 1
## 35 438.9976 1
## 36 438.9976 1
## 37 870.0000 0
## 38 134.0000 0
## 39 438.9976 1
## 40 438.9976 1
## 41 120.0000 1
## 42 438.9976 1
## 43 279.0000 0
## 44 438.9976 1
## 45 438.9976 1
## 46 438.9976 1
## 47 438.9976 1
## 48 888.0000 0
## 49 802.0000 0
## 50 905.0000 1
## 51 438.9976 0
## 52 497.0000 0
## 53 635.0000 1
## 54 59.0000 1
## 55 742.0000 0
## 56 438.9976 0
## 57 48.0000 0
## 58 18.0000 1
## 59 176.0000 0
## 60 438.9976 0
## 61 438.9976 1
## 62 363.0000 1
## 63 438.9976 1
## 64 438.9976 1
## 65 0.0000 1
## 66 438.9976 0
## 67 393.0000 1
## 68 48.0000 1
## 69 419.0000 0
## 70 639.0000 0
## 71 20.0000 1
## 72 438.9976 1
## 73 227.0000 1
## 74 438.9976 0
## 75 255.0000 1
## 76 960.0000 0
## 77 355.0000 0
## 78 438.9976 1
## 79 141.0000 1
## 80 438.9976 1
## 81 438.9976 0
## 82 316.0000 1
## 83 859.0000 1
## 84 221.0000 0
## 85 438.9976 1
## 86 108.0000 1
## 87 438.9976 1
## 88 30.0000 1
## 89 28.0000 0
## 90 438.9976 1
## 91 173.0000 0
## 92 423.0000 1
## 93 438.9976 0
## 94 438.9976 0
## 95 278.0000 1
## 96 810.0000 0
## 97 438.9976 1
## 98 438.9976 1
## 99 29.0000 1
## 100 438.9976 1
## 101 256.0000 1
## 102 749.0000 0
## 103 489.0000 1
## 104 438.9976 0
## 105 81.0000 1
## 106 48.9000 1
## 107 2230.0000 1
## 108 490.0000 0
## 109 1316.0000 0
## 110 438.9976 1
## 111 297.0000 1
## 112 438.9976 0
## 113 438.9976 1
## 114 438.9976 1
## 115 56.0000 0
## 116 438.9976 1
## 117 438.9976 0
## 118 2165.0000 0
## 119 438.9976 1
## 120 438.9976 1
## 121 438.9976 1
## 122 856.0000 0
## 123 57.0000 0
## 124 90.0000 1
## 125 579.0000 0
## 126 832.0000 0
## 127 438.9976 1
## 128 307.0000 1
## 129 308.0000 0
## 130 438.9976 0
## 131 438.9976 1
## 132 438.9976 1
## 133 438.9976 0
## 134 149.0000 0
## 135 438.9976 1
## 136 438.9976 1
## 137 438.9976 1
## 138 80.0000 1
## 139 220.0000 1
## 140 438.9976 1
## 141 206.0000 0
## 142 438.9976 0
## 143 438.9976 1
## 144 344.0000 1
## 145 642.0000 1
## 146 438.9976 1
## 147 14.0000 0
## 148 438.9976 1
## 149 438.9976 1
## 150 438.9976 1
## 151 438.9976 1
## 152 438.9976 1
## 153 438.9976 1
## 154 438.9976 0
## 155 438.9976 1
## 156 1600.0000 0
## 157 1176.0000 0
## 158 438.9976 1
## 159 438.9976 1
## 160 283.0000 0
## 161 438.9976 1
## 162 438.9976 0
## 163 438.9976 1
## 164 438.9976 0
## 165 438.9976 0
##2. Splitting the Data into Training and Testing Sets
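# Draw 50 of the 165 row indices at random for the test set (no seed is set, so the draw differs on each run)
Sample_DT_Barchart <- sample(1:165, 50)
# The sampled rows form the test set; the remaining 115 rows form the training set
Testing_DT_Barchart <- Data_DT_Barchart[Sample_DT_Barchart, ]
Training_DT_Barchart <- Data_DT_Barchart[-Sample_DT_Barchart, ]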
head(Testing_DT_Barchart)
## Patients Symptoms Hepatitis.B.e.Antigen Endemic.Countries Smoking
## 54 54 1 0 0 0
## 8 8 1 0 0 1
## 67 67 1 0 0 1
## 69 69 1 0 0 0
## 61 61 1 0 1 1
## 30 30 1 0 0 1
## PortalVeinThrombosis LiverMetastasis PerformanceStatus Encefalopathy.degree
## 54 0 0 0 1
## 8 1 0 3 1
## 67 0 0 0 1
## 69 0 1 0 1
## 61 1 0 1 1
## 30 0 0 1 1
## Ascitesdegree Alpha.Fetoprotein..ng.mL. Haemoglobin..g.dL. Albumin..mg.dL.
## 54 1 7.3 15.3 3.445535
## 8 1 9860.0 10.8 3.100000
## 67 1 13.0 15.6 4.540000
## 69 2 249.0 12.7 3.100000
## 61 3 608.0 12.6 4.000000
## 30 1 1.9 9.9 3.200000
## Aspartate.transaminase..U.L. Alkaline.phosphatase..U.L. Iron..mcg.dL.
## 54 93 130 184.00000
## 8 108 300 42.00000
## 67 52 113 94.00000
## 69 42 108 37.00000
## 61 99 100 85.59884
## 30 112 974 22.00000
## Ferritin..ng.mL. Class
## 54 59.0000 1
## 8 706.0000 0
## 67 393.0000 1
## 69 419.0000 0
## 61 438.9976 1
## 30 369.0000 1
head(Training_DT_Barchart)
## Patients Symptoms Hepatitis.B.e.Antigen Endemic.Countries Smoking
## 2 2 1 0 0 1
## 3 3 0 0 0 1
## 4 4 1 0 0 1
## 6 6 0 0 0 1
## 7 7 0 0 0 0
## 9 9 1 0 0 1
## PortalVeinThrombosis LiverMetastasis PerformanceStatus Encefalopathy.degree
## 2 0 0 0 1
## 3 0 1 2 1
## 4 0 1 0 1
## 6 0 0 1 1
## 7 0 0 0 1
## 9 0 0 1 1
## Ascitesdegree Alpha.Fetoprotein..ng.mL. Haemoglobin..g.dL. Albumin..mg.dL.
## 2 1 19299.95 12.87901 3.445535
## 3 2 5.80 8.90000 3.300000
## 4 1 2440.00 13.40000 3.700000
## 6 2 110.00 13.40000 3.400000
## 7 1 138.90 10.40000 2.350000
## 9 2 8.80 11.90000 1.900000
## Aspartate.transaminase..U.L. Alkaline.phosphatase..U.L. Iron..mcg.dL.
## 2 96.38272 212.2116 85.59884
## 3 68.00000 109.0000 28.00000
## 4 64.00000 174.0000 85.59884
## 6 122.00000 396.0000 53.00000
## 7 183.00000 211.0000 171.00000
## 9 59.00000 63.0000 85.00000
## Ferritin..ng.mL. Class
## 2 438.9976 1
## 3 16.0000 1
## 4 438.9976 0
## 6 111.0000 0
## 7 1452.0000 0
## 9 982.0000 1
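The split above draws row indices with `sample()` without fixing the random seed, so the rows that end up in each set change on every run. A minimal sketch of a repeatable split follows, in base R only. The helper name `split_train_test`, the seed value 123, and the stand-in data frame `toy` are assumptions for illustration and are not part of the original analysis.

```r
# Hypothetical helper (not from the original analysis): repeatable train/test split.
split_train_test <- function(data, n_test, seed) {
  set.seed(seed)  # fix the RNG state so the same rows are drawn every time
  test_idx <- sample(seq_len(nrow(data)), n_test)
  list(
    testing  = data[test_idx, , drop = FALSE],
    training = data[-test_idx, , drop = FALSE]
  )
}

# Stand-in data with 165 rows, mirroring the size of the HCC data set.
toy <- data.frame(Patients = 1:165, Class = rep(c(0, 1), length.out = 165))

parts <- split_train_test(toy, n_test = 50, seed = 123)
nrow(parts$testing)   # 50
nrow(parts$training)  # 115
```

Calling the helper again with the same seed returns the same rows, which makes a reported confusion matrix reproducible.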
##3. Required Packages
library(rpart)
library(rpart.plot)
library(rattle)
library(caret)
##4. Model Building
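# Fit a classification tree (method = 'class') using only the categorical attributes
dtree_Barchart <- rpart(Class ~ Symptoms + PortalVeinThrombosis + LiverMetastasis + PerformanceStatus + Ascitesdegree + Hepatitis.B.e.Antigen + Endemic.Countries + Smoking + Encefalopathy.degree, data = Training_DT_Barchart, method = 'class')
# extra = 106 labels each node with the probability of the second class and the percentage of observations
rpart.plot(dtree_Barchart, extra = 106)
# Draw the same tree with rattle's styled plot
fancyRpartPlot(dtree_Barchart)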
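The model formula above lists nine predictors by hand. As a sketch, the same formula could be assembled from a character vector with `reformulate()` from the base `stats` package, which keeps the predictor list in one place if it is reused for other models. The names `predictors_barchart`, `formula_barchart`, and `fit_tree` are hypothetical; the predictor names themselves are copied from the `rpart()` call.

```r
# Predictor names copied from the rpart() call in this section.
predictors_barchart <- c(
  "Symptoms", "PortalVeinThrombosis", "LiverMetastasis",
  "PerformanceStatus", "Ascitesdegree", "Hepatitis.B.e.Antigen",
  "Endemic.Countries", "Smoking", "Encefalopathy.degree"
)

# reformulate() builds the formula Class ~ Symptoms + ... from the names.
formula_barchart <- reformulate(predictors_barchart, response = "Class")
print(formula_barchart)

# Hypothetical wrapper: fit a classification tree, failing clearly if rpart is absent.
fit_tree <- function(formula, data) {
  if (!requireNamespace("rpart", quietly = TRUE)) {
    stop("package 'rpart' is required")
  }
  rpart::rpart(formula, data = data, method = "class")
}

# Usage with the training set from this document (not run here):
# dtree_Barchart <- fit_tree(formula_barchart, Training_DT_Barchart)
```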
##5. Prediction
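# Predict the class label for each of the 50 test rows
pred_dtree_Barchart <- predict(dtree_Barchart, newdata = Testing_DT_Barchart, type = 'class')
# Compare the predictions with the true labels
confusionMatrix(pred_dtree_Barchart, Testing_DT_Barchart$Class)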
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 11 13
## 1 8 18
##
## Accuracy : 0.58
## 95% CI : (0.4321, 0.7181)
## No Information Rate : 0.62
## P-Value [Acc > NIR] : 0.7683
##
## Kappa : 0.1519
##
## Mcnemar's Test P-Value : 0.3827
##
## Sensitivity : 0.5789
## Specificity : 0.5806
## Pos Pred Value : 0.4583
## Neg Pred Value : 0.6923
## Prevalence : 0.3800
## Detection Rate : 0.2200
## Detection Prevalence : 0.4800
## Balanced Accuracy : 0.5798
##
## 'Positive' Class : 0
##
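The statistics reported by `confusionMatrix()` can be re-derived by hand from the four cell counts. The sketch below uses base R only and copies the counts from the output above, with class 0 as the positive class as in that output. The variable names are chosen here for illustration.

```r
# Counts copied from the confusion matrix above
# (rows = prediction, columns = reference; matrix() fills column by column).
cm <- matrix(c(11, 8, 13, 18), nrow = 2,
             dimnames = list(Prediction = c("0", "1"), Reference = c("0", "1")))

n <- sum(cm)                                   # 50 test rows
accuracy    <- sum(diag(cm)) / n               # (11 + 18) / 50
nir         <- max(colSums(cm)) / n            # share of the largest reference class
sensitivity <- cm["0", "0"] / sum(cm[, "0"])   # 11 / 19, positive class is 0
specificity <- cm["1", "1"] / sum(cm[, "1"])   # 18 / 31
ppv         <- cm["0", "0"] / sum(cm["0", ])   # 11 / 24
npv         <- cm["1", "1"] / sum(cm["1", ])   # 18 / 26
balanced_accuracy <- (sensitivity + specificity) / 2

# Cohen's kappa: observed agreement corrected for chance agreement.
expected_agreement <- sum(rowSums(cm) * colSums(cm)) / n^2
cohen_kappa <- (accuracy - expected_agreement) / (1 - expected_agreement)

round(c(Accuracy = accuracy, NIR = nir, Kappa = cohen_kappa,
        Sensitivity = sensitivity, Specificity = specificity,
        PPV = ppv, NPV = npv, BalancedAccuracy = balanced_accuracy), 4)
```

The recomputed values match the output above. Because the accuracy of 0.58 is below the no-information rate of 0.62, the tree does not outperform always predicting the majority class on this test sample.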
##BORUTA (50 TEST DATA)
##1. Load the HCC Survival Data
Data_DT_Boruta <- datasetfinalboruta
Data_DT_Boruta
## Patients Symptoms PortalVeinThrombosis LiverMetastasis PerformanceStatus
## 1 1 0 0 0 0
## 2 2 1 0 0 0
## 3 3 0 0 1 2
## 4 4 1 0 1 0
## 5 5 1 0 0 0
## 6 6 0 0 0 1
## 7 7 0 0 0 0
## 8 8 1 1 0 3
## 9 9 1 0 0 1
## 10 10 1 0 0 0
## 11 11 0 0 0 0
## 12 12 0 0 0 0
## 13 13 0 1 0 0
## 14 14 1 0 0 0
## 15 15 0 0 0 2
## 16 16 0 1 0 1
## 17 17 0 1 0 3
## 18 18 1 1 1 2
## 19 19 1 0 0 1
## 20 20 1 1 0 0
## 21 21 0 0 0 0
## 22 22 0 0 0 0
## 23 23 1 0 0 1
## 24 24 1 1 1 1
## 25 25 1 0 1 4
## 26 26 0 1 0 2
## 27 27 1 1 0 1
## 28 28 1 0 0 0
## 29 29 0 1 0 1
## 30 30 1 0 0 1
## 31 31 1 1 0 2
## 32 32 0 0 0 0
## 33 33 1 0 0 0
## 34 34 1 0 0 0
## 35 35 1 0 0 0
## 36 36 1 0 1 0
## 37 37 1 0 0 4
## 38 38 1 0 0 1
## 39 39 0 0 0 0
## 40 40 1 0 1 0
## 41 41 1 0 0 0
## 42 42 0 0 0 0
## 43 43 0 0 1 3
## 44 44 0 0 0 0
## 45 45 0 0 0 0
## 46 46 0 0 0 0
## 47 47 1 0 0 0
## 48 48 1 1 1 1
## 49 49 1 0 0 0
## 50 50 1 0 0 0
## 51 51 1 0 1 2
## 52 52 0 0 1 0
## 53 53 1 0 0 0
## 54 54 1 0 0 0
## 55 55 1 1 0 3
## 56 56 1 0 1 1
## 57 57 1 0 0 2
## 58 58 1 0 0 0
## 59 59 1 0 0 3
## 60 60 1 0 0 0
## 61 61 1 1 0 1
## 62 62 1 0 0 0
## 63 63 0 0 0 2
## 64 64 1 0 0 3
## 65 65 0 0 0 0
## 66 66 1 0 1 2
## 67 67 1 0 0 0
## 68 68 1 0 0 0
## 69 69 1 0 1 0
## 70 70 1 1 0 2
## 71 71 1 0 0 0
## 72 72 1 1 1 3
## 73 73 1 1 1 2
## 74 74 1 1 1 0
## 75 75 1 0 0 0
## 76 76 1 0 0 0
## 77 77 1 1 0 2
## 78 78 1 0 0 0
## 79 79 1 0 1 3
## 80 80 1 0 0 2
## 81 81 1 0 1 3
## 82 82 0 0 0 1
## 83 83 0 0 0 1
## 84 84 1 1 0 2
## 85 85 1 0 0 0
## 86 86 1 1 0 1
## 87 87 1 0 1 1
## 88 88 0 0 0 0
## 89 89 1 0 0 0
## 90 90 0 0 0 1
## 91 91 1 0 1 2
## 92 92 0 0 0 0
## 93 93 1 0 0 3
## 94 94 1 1 1 1
## 95 95 1 0 0 0
## 96 96 0 0 0 1
## 97 97 1 0 0 1
## 98 98 0 1 0 0
## 99 99 0 0 0 0
## 100 100 0 0 0 2
## 101 101 1 0 1 2
## 102 102 1 0 1 1
## 103 103 1 0 0 2
## 104 104 1 0 0 4
## 105 105 1 0 0 1
## 106 106 1 0 0 3
## 107 107 1 0 0 0
## 108 108 1 0 0 3
## 109 109 1 0 1 1
## 110 110 0 0 0 0
## 111 111 1 1 1 2
## 112 112 1 1 0 2
## 113 113 0 0 0 0
## 114 114 1 1 1 1
## 115 115 1 1 0 3
## 116 116 1 0 0 0
## 117 117 1 0 0 0
## 118 118 1 0 0 4
## 119 119 0 0 0 0
## 120 120 0 0 0 0
## 121 121 1 0 0 1
## 122 122 0 0 0 0
## 123 123 1 1 1 2
## 124 124 0 0 0 0
## 125 125 1 0 0 2
## 126 126 1 1 0 4
## 127 127 0 0 0 0
## 128 128 0 1 0 3
## 129 129 0 0 0 2
## 130 130 1 0 0 2
## 131 131 1 1 1 2
## 132 132 0 0 0 2
## 133 133 0 0 0 0
## 134 134 1 0 1 1
## 135 135 1 0 0 0
## 136 136 1 0 1 1
## 137 137 1 0 0 2
## 138 138 1 0 0 0
## 139 139 1 0 0 0
## 140 140 1 0 0 2
## 141 141 1 1 0 0
## 142 142 1 0 0 0
## 143 143 1 0 0 0
## 144 144 0 0 0 0
## 145 145 1 0 0 0
## 146 146 1 0 0 0
## 147 147 1 1 0 2
## 148 148 0 0 0 0
## 149 149 0 0 0 0
## 150 150 1 1 0 2
## 151 151 0 0 0 0
## 152 152 0 0 0 0
## 153 153 1 0 1 1
## 154 154 1 0 0 3
## 155 155 0 0 0 0
## 156 156 1 1 0 3
## 157 157 1 0 1 1
## 158 158 1 0 0 0
## 159 159 1 0 0 3
## 160 160 1 0 1 3
## 161 161 0 0 0 0
## 162 162 1 0 0 2
## 163 163 0 0 0 0
## 164 164 0 1 1 2
## 165 165 1 0 1 0
## Ascitesdegree Alpha.Fetoprotein..ng.mL. Haemoglobin..g.dL. Albumin..mg.dL.
## 1 1 95.00 13.70000 3.400000
## 2 1 19299.95 12.87901 3.445535
## 3 2 5.80 8.90000 3.300000
## 4 1 2440.00 13.40000 3.700000
## 5 1 49.00 14.30000 4.100000
## 6 2 110.00 13.40000 3.400000
## 7 1 138.90 10.40000 2.350000
## 8 1 9860.00 10.80000 3.100000
## 9 2 8.80 11.90000 1.900000
## 10 1 1.80 11.80000 4.200000
## 11 2 100809.00 13.00000 4.400000
## 12 1 86.00 15.70000 3.700000
## 13 1 60.00 13.30000 4.400000
## 14 1 6.60 13.70000 4.500000
## 15 2 29.00 13.50000 3.150000
## 16 2 4.60 10.20000 3.100000
## 17 1 60.00 12.10000 2.400000
## 18 1 9.20 10.30000 3.800000
## 19 1 8.80 14.90000 4.300000
## 20 3 34.00 15.90000 3.400000
## 21 1 19.60 11.70000 3.600000
## 22 1 3.90 16.40000 4.500000
## 23 3 1975.00 10.80000 3.100000
## 24 1 185.00 10.70000 3.600000
## 25 1 5532.00 13.10000 2.400000
## 26 1 13327.00 13.70000 3.100000
## 27 3 19299.95 13.60000 3.300000
## 28 1 5.90 15.50000 4.500000
## 29 1 3255.00 12.20000 3.000000
## 30 1 1.90 9.90000 3.200000
## 31 1 11.00 13.10000 3.500000
## 32 1 1237.00 12.20000 4.200000
## 33 1 7.70 15.70000 3.400000
## 34 1 266.00 13.70000 4.200000
## 35 1 5689.00 14.30000 3.800000
## 36 1 14.20 14.80000 4.200000
## 37 2 3.10 13.10000 2.300000
## 38 2 633.00 13.00000 2.300000
## 39 1 5.40 14.90000 4.200000
## 40 1 479.00 11.30000 3.100000
## 41 1 19.00 13.90000 4.400000
## 42 1 2.80 15.70000 4.100000
## 43 1 185203.00 15.00000 3.600000
## 44 1 5.00 15.10000 3.800000
## 45 1 237.00 15.60000 3.200000
## 46 1 2.80 16.60000 4.900000
## 47 1 19299.95 12.87901 3.445535
## 48 2 16.00 14.00000 3.100000
## 49 1 163.00 10.60000 3.300000
## 50 1 20.00 15.00000 4.600000
## 51 1 46.00 10.50000 2.400000
## 52 1 41470.00 12.60000 3.110000
## 53 1 4.70 11.80000 4.100000
## 54 1 7.30 15.30000 3.445535
## 55 2 1898.00 12.40000 2.700000
## 56 1 77.00 10.80000 3.200000
## 57 2 2.70 7.30000 3.400000
## 58 1 2.60 10.30000 3.900000
## 59 2 12.00 10.90000 1.900000
## 60 1 19299.95 18.70000 4.000000
## 61 3 608.00 12.60000 4.000000
## 62 2 41.00 14.60000 3.100000
## 63 2 19299.95 11.50000 2.700000
## 64 3 2.00 12.60000 3.500000
## 65 1 7.00 12.10000 3.800000
## 66 1 3.10 15.10000 2.800000
## 67 1 13.00 15.60000 4.540000
## 68 1 1.70 9.50000 2.100000
## 69 2 249.00 12.70000 3.100000
## 70 3 66.00 10.90000 2.700000
## 71 1 358.00 12.70000 2.900000
## 72 2 1810346.00 13.00000 3.400000
## 73 2 33502.00 14.40000 3.100000
## 74 1 20.00 13.00000 3.700000
## 75 1 2.50 14.90000 3.800000
## 76 1 2269.00 12.10000 3.445535
## 77 2 4181.00 9.10000 3.000000
## 78 1 5.10 14.40000 4.100000
## 79 1 345.00 9.80000 2.600000
## 80 2 2.90 10.40000 2.400000
## 81 1 2.50 12.60000 3.300000
## 82 1 5.04 15.80000 3.500000
## 83 3 3.70 14.80000 2.700000
## 84 3 2.60 12.70000 2.900000
## 85 1 2.90 16.40000 4.100000
## 86 1 20.00 13.90000 3.000000
## 87 1 2.79 12.60000 3.445535
## 88 2 42.00 15.80000 4.200000
## 89 2 457.00 11.70000 4.100000
## 90 3 19299.95 9.50000 2.200000
## 91 3 123.00 10.10000 4.000000
## 92 1 8.70 14.60000 4.200000
## 93 1 226.00 11.50000 4.200000
## 94 1 2159.00 13.10000 3.600000
## 95 1 48.00 12.60000 4.200000
## 96 1 2.40 9.50000 3.500000
## 97 2 5.00 9.10000 2.470000
## 98 1 64.00 14.10000 2.900000
## 99 1 5.50 13.10000 3.200000
## 100 2 8.50 13.00000 2.700000
## 101 1 2785.00 12.00000 2.600000
## 102 3 6574.00 13.50000 3.500000
## 103 2 5.70 13.90000 2.100000
## 104 3 19299.95 13.50000 2.900000
## 105 1 39.00 12.60000 3.800000
## 106 2 2.50 14.30000 2.200000
## 107 1 32.00 14.00000 4.200000
## 108 2 7.60 5.00000 3.700000
## 109 1 173.00 11.30000 2.400000
## 110 1 2.30 12.10000 2.700000
## 111 2 114.00 14.90000 3.200000
## 112 2 173.00 11.10000 3.100000
## 113 1 18.00 15.30000 4.200000
## 114 1 42.00 16.20000 4.100000
## 115 2 28274.00 10.30000 2.400000
## 116 1 736.00 14.00000 4.700000
## 117 3 1009.00 13.80000 3.000000
## 118 2 22475.00 12.00000 3.280000
## 119 1 5.20 14.60000 4.200000
## 120 2 5.20 14.90000 3.200000
## 121 1 14177.00 10.20000 2.600000
## 122 1 3.10 10.80000 3.600000
## 123 1 50655.00 9.80000 2.600000
## 124 1 1.20 15.10000 4.700000
## 125 2 657.00 11.80000 3.200000
## 126 2 421500.00 14.30000 3.100000
## 127 1 472.00 15.60000 4.000000
## 128 1 77.00 12.30000 3.200000
## 129 1 2.10 12.60000 3.680000
## 130 2 2.00 11.60000 3.890000
## 131 3 4.20 14.90000 3.500000
## 132 2 7.50 11.30000 3.200000
## 133 1 152.00 10.90000 3.200000
## 134 1 811.00 11.40000 3.400000
## 135 1 4.70 14.20000 3.100000
## 136 1 2.60 13.30000 4.100000
## 137 1 2089.00 15.40000 4.400000
## 138 1 18.00 13.20000 4.300000
## 139 1 4.90 13.60000 4.500000
## 140 1 240.00 14.70000 3.600000
## 141 1 180.00 12.20000 3.500000
## 142 1 15.00 13.30000 3.200000
## 143 1 24.00 16.00000 4.100000
## 144 1 9.40 11.20000 3.400000
## 145 1 7.00 14.90000 3.000000
## 146 1 4.80 9.20000 3.880000
## 147 3 92421.00 14.30000 3.200000
## 148 1 5.20 11.10000 3.300000
## 149 1 16.00 13.30000 3.800000
## 150 1 1.50 12.87901 3.445535
## 151 1 3204.00 16.10000 4.300000
## 152 1 10.00 13.50000 4.000000
## 153 3 695.00 11.10000 2.700000
## 154 1 33.00 9.70000 3.800000
## 155 1 615.00 11.70000 2.820000
## 156 3 1671.00 12.80000 3.440000
## 157 1 975.00 15.30000 3.500000
## 158 2 1.70 14.70000 3.500000
## 159 1 1713.00 8.20000 3.600000
## 160 3 4.90 7.90000 2.430000
## 161 1 19299.95 15.40000 4.600000
## 162 1 4887.00 12.10000 3.000000
## 163 1 75.00 13.30000 4.300000
## 164 1 94964.00 15.60000 4.800000
## 165 1 44340.00 12.70000 2.200000
## Aspartate.transaminase..U.L. Alkaline.phosphatase..U.L. Iron..mcg.dL.
## 1 41.00000 150.0000 85.59884
## 2 96.38272 212.2116 85.59884
## 3 68.00000 109.0000 28.00000
## 4 64.00000 174.0000 85.59884
## 5 306.00000 109.0000 59.00000
## 6 122.00000 396.0000 53.00000
## 7 183.00000 211.0000 171.00000
## 8 108.00000 300.0000 42.00000
## 9 59.00000 63.0000 85.00000
## 10 45.00000 303.0000 85.59884
## 11 334.00000 236.0000 85.59884
## 12 168.00000 154.0000 144.00000
## 13 36.00000 74.0000 85.59884
## 14 96.00000 70.0000 82.00000
## 15 116.00000 163.0000 197.00000
## 16 57.00000 176.0000 25.00000
## 17 63.00000 235.0000 136.00000
## 18 91.00000 146.0000 187.00000
## 19 23.00000 180.0000 144.00000
## 20 87.00000 147.0000 67.00000
## 21 35.00000 141.0000 152.60000
## 22 47.00000 97.0000 87.00000
## 23 136.00000 562.0000 112.00000
## 24 86.00000 396.0000 85.59884
## 25 67.00000 311.0000 85.59884
## 26 107.00000 233.0000 93.00000
## 27 325.00000 172.0000 85.59884
## 28 65.00000 111.0000 180.00000
## 29 85.00000 293.0000 94.00000
## 30 112.00000 974.0000 22.00000
## 31 26.00000 158.0000 85.59884
## 32 85.00000 129.0000 75.00000
## 33 29.00000 135.0000 85.59884
## 34 85.00000 227.0000 85.59884
## 35 102.00000 184.0000 85.59884
## 36 86.00000 113.0000 85.59884
## 37 74.00000 127.0000 98.00000
## 38 58.00000 209.0000 57.00000
## 39 26.00000 92.0000 85.59884
## 40 143.00000 288.0000 85.59884
## 41 117.00000 104.0000 143.00000
## 42 553.00000 68.0000 85.59884
## 43 117.00000 278.0000 21.00000
## 44 56.00000 56.0000 85.59884
## 45 52.00000 97.0000 85.59884
## 46 29.00000 68.0000 85.59884
## 47 96.38272 212.2116 85.59884
## 48 244.00000 595.0000 63.00000
## 49 57.00000 171.0000 45.00000
## 50 49.00000 109.0000 184.00000
## 51 185.00000 539.0000 85.59884
## 52 94.00000 350.0000 84.00000
## 53 47.00000 62.0000 104.00000
## 54 93.00000 130.0000 184.00000
## 55 523.00000 397.0000 56.00000
## 56 19.00000 923.0000 85.59884
## 57 32.00000 55.0000 22.00000
## 58 28.00000 120.0000 32.00000
## 59 85.00000 263.0000 44.00000
## 60 73.00000 103.0000 85.59884
## 61 99.00000 100.0000 85.59884
## 62 165.00000 207.0000 224.00000
## 63 48.00000 178.0000 85.59884
## 64 40.00000 166.0000 85.59884
## 65 38.00000 161.0000 0.00000
## 66 50.00000 104.0000 46.00000
## 67 52.00000 113.0000 94.00000
## 68 82.00000 113.0000 94.00000
## 69 42.00000 108.0000 37.00000
## 70 84.00000 260.0000 15.00000
## 71 41.00000 94.0000 50.00000
## 72 86.00000 417.0000 85.59884
## 73 80.00000 1.2800 26.00000
## 74 354.00000 684.0000 85.59884
## 75 38.00000 101.0000 61.00000
## 76 357.00000 174.0000 178.00000
## 77 91.00000 165.0000 72.00000
## 78 63.00000 114.0000 85.59884
## 79 43.00000 88.0000 19.00000
## 80 145.00000 190.0000 85.59884
## 81 43.00000 204.0000 85.59884
## 82 85.00000 165.0000 200.00000
## 83 157.00000 280.0000 85.59884
## 84 53.00000 207.0000 105.00000
## 85 74.00000 85.0000 85.59884
## 86 92.00000 244.0000 181.00000
## 87 46.00000 213.0000 26.00000
## 88 62.00000 70.0000 120.00000
## 89 33.00000 90.0000 13.00000
## 90 51.00000 474.0000 85.59884
## 91 60.00000 177.0000 37.00000
## 92 33.00000 120.0000 184.00000
## 93 31.00000 222.0000 85.59884
## 94 197.00000 335.0000 85.59884
## 95 31.00000 91.0000 53.00000
## 96 17.00000 124.0000 15.00000
## 97 29.00000 254.0000 85.59884
## 98 43.00000 137.0000 85.59884
## 99 30.00000 92.0000 93.00000
## 100 335.00000 66.0000 85.59884
## 101 34.00000 297.0000 55.00000
## 102 192.00000 262.0000 121.00000
## 103 75.00000 110.0000 92.00000
## 104 266.00000 670.0000 85.59884
## 105 74.00000 312.0000 87.00000
## 106 71.00000 97.0000 92.00000
## 107 226.00000 174.0000 26.40000
## 108 69.00000 86.0000 9.00000
## 109 114.00000 163.0000 131.00000
## 110 27.00000 120.0000 85.59884
## 111 154.00000 166.0000 161.00000
## 112 206.00000 188.0000 85.59884
## 113 49.00000 109.0000 85.59884
## 114 118.00000 158.0000 85.59884
## 115 111.00000 128.0000 51.00000
## 116 113.00000 629.0000 85.59884
## 117 55.00000 235.0000 85.59884
## 118 178.00000 146.0000 106.00000
## 119 69.00000 79.0000 85.59884
## 120 87.00000 239.0000 85.59884
## 121 113.00000 980.0000 85.59884
## 122 125.00000 433.0000 52.50000
## 123 219.00000 363.0000 40.00000
## 124 17.00000 151.0000 88.00000
## 125 101.00000 466.0000 85.59884
## 126 44.00000 217.0000 52.00000
## 127 128.00000 117.0000 85.59884
## 128 38.00000 182.0000 93.00000
## 129 38.00000 127.0000 28.00000
## 130 48.00000 171.0000 85.59884
## 131 61.00000 150.0000 85.59884
## 132 94.00000 147.0000 85.59884
## 133 80.00000 106.0000 85.59884
## 134 102.00000 587.0000 14.00000
## 135 44.00000 517.0000 85.59884
## 136 52.00000 123.0000 85.59884
## 137 32.00000 295.0000 85.59884
## 138 29.00000 141.0000 91.00000
## 139 63.00000 89.0000 78.00000
## 140 132.00000 192.0000 85.59884
## 141 58.00000 302.0000 85.59884
## 142 87.00000 108.0000 85.59884
## 143 158.00000 84.0000 85.59884
## 144 63.00000 106.0000 94.00000
## 145 401.00000 93.0000 124.00000
## 146 51.00000 141.0000 85.59884
## 147 76.00000 472.0000 29.00000
## 148 47.00000 117.0000 85.59884
## 149 35.00000 105.0000 85.59884
## 150 96.38272 212.2116 85.59884
## 151 31.00000 79.0000 85.59884
## 152 79.00000 85.0000 85.59884
## 153 73.00000 44.0000 85.59884
## 154 54.00000 338.0000 85.59884
## 155 50.00000 318.0000 85.59884
## 156 95.00000 139.0000 111.00000
## 157 85.00000 266.0000 180.00000
## 158 24.00000 97.0000 85.59884
## 159 59.00000 263.0000 85.59884
## 160 71.00000 73.0000 40.00000
## 161 40.00000 109.0000 85.59884
## 162 91.00000 280.0000 85.59884
## 163 52.00000 181.0000 85.59884
## 164 60.00000 170.0000 85.59884
## 165 127.00000 462.0000 85.59884
## Ferritin..ng.mL. Class
## 1 438.9976 1
## 2 438.9976 1
## 3 16.0000 1
## 4 438.9976 0
## 5 22.0000 1
## 6 111.0000 0
## 7 1452.0000 0
## 8 706.0000 0
## 9 982.0000 1
## 10 438.9976 1
## 11 438.9976 0
## 12 277.0000 1
## 13 438.9976 1
## 14 438.9976 1
## 15 302.0000 1
## 16 60.0000 1
## 17 767.0000 0
## 18 443.0000 1
## 19 295.0000 1
## 20 774.0000 0
## 21 76.9000 1
## 22 84.0000 1
## 23 1001.0000 0
## 24 438.9976 0
## 25 438.9976 0
## 26 79.0000 1
## 27 438.9976 0
## 28 438.9976 1
## 29 70.0000 0
## 30 369.0000 1
## 31 438.9976 1
## 32 239.0000 1
## 33 438.9976 1
## 34 438.9976 1
## 35 438.9976 1
## 36 438.9976 1
## 37 870.0000 0
## 38 134.0000 0
## 39 438.9976 1
## 40 438.9976 1
## 41 120.0000 1
## 42 438.9976 1
## 43 279.0000 0
## 44 438.9976 1
## 45 438.9976 1
## 46 438.9976 1
## 47 438.9976 1
## 48 888.0000 0
## 49 802.0000 0
## 50 905.0000 1
## 51 438.9976 0
## 52 497.0000 0
## 53 635.0000 1
## 54 59.0000 1
## 55 742.0000 0
## 56 438.9976 0
## 57 48.0000 0
## 58 18.0000 1
## 59 176.0000 0
## 60 438.9976 0
## 61 438.9976 1
## 62 363.0000 1
## 63 438.9976 1
## 64 438.9976 1
## 65 0.0000 1
## 66 438.9976 0
## 67 393.0000 1
## 68 48.0000 1
## 69 419.0000 0
## 70 639.0000 0
## 71 20.0000 1
## 72 438.9976 1
## 73 227.0000 1
## 74 438.9976 0
## 75 255.0000 1
## 76 960.0000 0
## 77 355.0000 0
## 78 438.9976 1
## 79 141.0000 1
## 80 438.9976 1
## 81 438.9976 0
## 82 316.0000 1
## 83 859.0000 1
## 84 221.0000 0
## 85 438.9976 1
## 86 108.0000 1
## 87 438.9976 1
## 88 30.0000 1
## 89 28.0000 0
## 90 438.9976 1
## 91 173.0000 0
## 92 423.0000 1
## 93 438.9976 0
## 94 438.9976 0
## 95 278.0000 1
## 96 810.0000 0
## 97 438.9976 1
## 98 438.9976 1
## 99 29.0000 1
## 100 438.9976 1
## 101 256.0000 1
## 102 749.0000 0
## 103 489.0000 1
## 104 438.9976 0
## 105 81.0000 1
## 106 48.9000 1
## 107 2230.0000 1
## 108 490.0000 0
## 109 1316.0000 0
## 110 438.9976 1
## 111 297.0000 1
## 112 438.9976 0
## 113 438.9976 1
## 114 438.9976 1
## 115 56.0000 0
## 116 438.9976 1
## 117 438.9976 0
## 118 2165.0000 0
## 119 438.9976 1
## 120 438.9976 1
## 121 438.9976 1
## 122 856.0000 0
## 123 57.0000 0
## 124 90.0000 1
## 125 579.0000 0
## 126 832.0000 0
## 127 438.9976 1
## 128 307.0000 1
## 129 308.0000 0
## 130 438.9976 0
## 131 438.9976 1
## 132 438.9976 1
## 133 438.9976 0
## 134 149.0000 0
## 135 438.9976 1
## 136 438.9976 1
## 137 438.9976 1
## 138 80.0000 1
## 139 220.0000 1
## 140 438.9976 1
## 141 206.0000 0
## 142 438.9976 0
## 143 438.9976 1
## 144 344.0000 1
## 145 642.0000 1
## 146 438.9976 1
## 147 14.0000 0
## 148 438.9976 1
## 149 438.9976 1
## 150 438.9976 1
## 151 438.9976 1
## 152 438.9976 1
## 153 438.9976 1
## 154 438.9976 0
## 155 438.9976 1
## 156 1600.0000 0
## 157 1176.0000 0
## 158 438.9976 1
## 159 438.9976 1
## 160 283.0000 0
## 161 438.9976 1
## 162 438.9976 0
## 163 438.9976 1
## 164 438.9976 0
## 165 438.9976 0
##2. Splitting the data into training and testing sets
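# Randomly hold out 50 of the 165 patients for testing; the remaining 115 form the training set.
# No seed is set here, so the split (and every result derived from it) changes from run to run.
Sample_DT_Boruta <- sample(1:165, 50)
Testing_DT_Boruta <- Data_DT_Boruta[Sample_DT_Boruta, ]
Training_DT_Boruta <- Data_DT_Boruta[-Sample_DT_Boruta, ]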
head(Testing_DT_Boruta)
## Patients Symptoms PortalVeinThrombosis LiverMetastasis PerformanceStatus
## 40 40 1 0 1 0
## 77 77 1 1 0 2
## 97 97 1 0 0 1
## 160 160 1 0 1 3
## 102 102 1 0 1 1
## 127 127 0 0 0 0
## Ascitesdegree Alpha.Fetoprotein..ng.mL. Haemoglobin..g.dL. Albumin..mg.dL.
## 40 1 479.0 11.3 3.10
## 77 2 4181.0 9.1 3.00
## 97 2 5.0 9.1 2.47
## 160 3 4.9 7.9 2.43
## 102 3 6574.0 13.5 3.50
## 127 1 472.0 15.6 4.00
## Aspartate.transaminase..U.L. Alkaline.phosphatase..U.L. Iron..mcg.dL.
## 40 143 288 85.59884
## 77 91 165 72.00000
## 97 29 254 85.59884
## 160 71 73 40.00000
## 102 192 262 121.00000
## 127 128 117 85.59884
## Ferritin..ng.mL. Class
## 40 438.9976 1
## 77 355.0000 0
## 97 438.9976 1
## 160 283.0000 0
## 102 749.0000 0
## 127 438.9976 1
head(Training_DT_Boruta)
## Patients Symptoms PortalVeinThrombosis LiverMetastasis PerformanceStatus
## 1 1 0 0 0 0
## 2 2 1 0 0 0
## 3 3 0 0 1 2
## 4 4 1 0 1 0
## 5 5 1 0 0 0
## 6 6 0 0 0 1
## Ascitesdegree Alpha.Fetoprotein..ng.mL. Haemoglobin..g.dL. Albumin..mg.dL.
## 1 1 95.00 13.70000 3.400000
## 2 1 19299.95 12.87901 3.445535
## 3 2 5.80 8.90000 3.300000
## 4 1 2440.00 13.40000 3.700000
## 5 1 49.00 14.30000 4.100000
## 6 2 110.00 13.40000 3.400000
## Aspartate.transaminase..U.L. Alkaline.phosphatase..U.L. Iron..mcg.dL.
## 1 41.00000 150.0000 85.59884
## 2 96.38272 212.2116 85.59884
## 3 68.00000 109.0000 28.00000
## 4 64.00000 174.0000 85.59884
## 5 306.00000 109.0000 59.00000
## 6 122.00000 396.0000 53.00000
## Ferritin..ng.mL. Class
## 1 438.9976 1
## 2 438.9976 1
## 3 16.0000 1
## 4 438.9976 0
## 5 22.0000 1
## 6 111.0000 0
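Because the split above is drawn without a seed, the reported figures cannot be reproduced exactly. The following is a minimal sketch of a seeded, class-stratified split using base R only. `toy_data`, the seed value, and the helper names are hypothetical stand-ins, not part of the original analysis; the class counts (63 and 102) are the totals implied by the training and testing confusion matrices in this report.

```r
# Hypothetical stand-in for Data_DT_Boruta: 165 rows with a binary Class column.
set.seed(2021)  # arbitrary seed, chosen only for illustration
toy_data <- data.frame(
  Patients = 1:165,
  Class = factor(rep(c(0, 1), times = c(63, 102)))
)

# Draw test indices within each class so both sets keep a similar class ratio.
test_fraction <- 50 / 165
test_idx <- unlist(lapply(
  split(seq_len(nrow(toy_data)), toy_data$Class),
  function(idx) idx[sample.int(length(idx), round(length(idx) * test_fraction))]
), use.names = FALSE)

toy_test <- toy_data[test_idx, ]
toy_train <- toy_data[-test_idx, ]

# Inspect the sizes and the class proportions of both sets.
nrow(toy_test)
nrow(toy_train)
prop.table(table(toy_test$Class))
prop.table(table(toy_train$Class))
```

Stratifying is a design choice rather than a requirement; with only 165 patients it keeps the share of deaths in the test set close to that of the full data.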
##3. Required packages
library(rpart)
library(rpart.plot)
library(rattle)
library(caret)
##4. Model building
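# Fit a classification tree on five predictors from the training set.
dtree_Boruta <- rpart(Class ~ Symptoms + PortalVeinThrombosis + LiverMetastasis + PerformanceStatus + Ascitesdegree, data = Training_DT_Boruta, method = 'class')
# Plot the tree; extra = 106 shows the probability of the second class and the percentage of observations in each node.
rpart.plot(dtree_Boruta, extra = 106)
fancyRpartPlot(dtree_Boruta)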
##5. Prediction
pred_dtree_Boruta <- predict(dtree_Boruta, newdata = Testing_DT_Boruta, type = 'class')
confusionMatrix(pred_dtree_Boruta, Testing_DT_Boruta$Class)
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 11 7
## 1 6 26
##
## Accuracy : 0.74
## 95% CI : (0.5966, 0.8537)
## No Information Rate : 0.66
## P-Value [Acc > NIR] : 0.1476
##
## Kappa : 0.4288
##
## Mcnemar's Test P-Value : 1.0000
##
## Sensitivity : 0.6471
## Specificity : 0.7879
## Pos Pred Value : 0.6111
## Neg Pred Value : 0.8125
## Prevalence : 0.3400
## Detection Rate : 0.2200
## Detection Prevalence : 0.3600
## Balanced Accuracy : 0.7175
##
## 'Positive' Class : 0
##
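The statistics in this output can be recomputed by hand from the four cell counts, which makes it clear that class 0 (died) is treated as the positive class. The sketch below uses base R only; the variable names are illustrative and the counts are copied from the confusion matrix above.

```r
# Counts from the decision tree confusion matrix (rows = prediction, columns = reference).
cm <- matrix(c(11, 6, 7, 26), nrow = 2,
             dimnames = list(Prediction = c("0", "1"), Reference = c("0", "1")))

n <- sum(cm)
accuracy <- sum(diag(cm)) / n

# Class 0 (died) is the positive class, as stated in the output.
sensitivity <- cm["0", "0"] / sum(cm[, "0"])
specificity <- cm["1", "1"] / sum(cm[, "1"])
pos_pred_value <- cm["0", "0"] / sum(cm["0", ])
neg_pred_value <- cm["1", "1"] / sum(cm["1", ])
balanced_accuracy <- (sensitivity + specificity) / 2

# Cohen's kappa: observed agreement corrected for agreement expected by chance.
expected_agreement <- sum(rowSums(cm) * colSums(cm)) / n^2
cohen_kappa <- (accuracy - expected_agreement) / (1 - expected_agreement)

round(c(Accuracy = accuracy, Kappa = cohen_kappa,
        Sensitivity = sensitivity, Specificity = specificity,
        PPV = pos_pred_value, NPV = neg_pred_value,
        BalancedAccuracy = balanced_accuracy), 4)
```

The rounded values agree with the reported accuracy (0.74), kappa (0.4288), sensitivity (0.6471), and specificity (0.7879).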
At this stage, the author applies the random forest method to the classification task.
##Required packages
library(randomForest)
library(caret)
##Model building
##BARCHART (50 TEST OBSERVATIONS)
rf_Barchart <- randomForest(Class~Symptoms + PortalVeinThrombosis + LiverMetastasis + PerformanceStatus + Ascitesdegree + Hepatitis.B.e.Antigen + Endemic.Countries + Smoking + Encefalopathy.degree, data = Training_DT_Barchart)
print(rf_Barchart)
##
## Call:
## randomForest(formula = Class ~ Symptoms + PortalVeinThrombosis + LiverMetastasis + PerformanceStatus + Ascitesdegree + Hepatitis.B.e.Antigen + Endemic.Countries + Smoking + Encefalopathy.degree, data = Training_DT_Barchart)
## Type of random forest: classification
## Number of trees: 500
## No. of variables tried at each split: 3
##
## OOB estimate of error rate: 32.17%
## Confusion matrix:
## 0 1 class.error
## 0 23 21 0.4772727
## 1 16 55 0.2253521
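The out-of-bag (OOB) figures printed above follow directly from the OOB confusion matrix, and the "variables tried at each split" value matches the usual classification default of the floor of the square root of the number of predictors. A minimal base R sketch, with illustrative variable names and the counts copied from the output:

```r
# OOB confusion matrix from print(rf_Barchart): rows = observed class, columns = predicted class.
oob <- matrix(c(23, 16, 21, 55), nrow = 2,
              dimnames = list(Observed = c("0", "1"), Predicted = c("0", "1")))

# Overall OOB error: share of training cases misclassified by trees that did not see them.
oob_error <- 1 - sum(diag(oob)) / sum(oob)

# Per-class error: misclassified cases divided by the row total for that class.
class_error <- 1 - diag(oob) / rowSums(oob)

# Default number of candidate variables per split for classification with 9 predictors.
n_predictors <- 9
mtry_default <- floor(sqrt(n_predictors))

round(100 * oob_error, 2)
round(class_error, 7)
mtry_default
```

The per-class errors show that the forest misclassifies nearly half of the patients who died (class 0), which the overall error rate of 32.17% hides.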
##5. Prediction
pred_rf_Barchart <- predict(rf_Barchart, newdata = Testing_DT_Barchart)
confusionMatrix(as.factor(pred_rf_Barchart), as.factor(Testing_DT_Barchart$Class))
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 11 9
## 1 8 22
##
## Accuracy : 0.66
## 95% CI : (0.5123, 0.7879)
## No Information Rate : 0.62
## P-Value [Acc > NIR] : 0.3347
##
## Kappa : 0.2857
##
## Mcnemar's Test P-Value : 1.0000
##
## Sensitivity : 0.5789
## Specificity : 0.7097
## Pos Pred Value : 0.5500
## Neg Pred Value : 0.7333
## Prevalence : 0.3800
## Detection Rate : 0.2200
## Detection Prevalence : 0.4000
## Balanced Accuracy : 0.6443
##
## 'Positive' Class : 0
##
varImpPlot(rf_Barchart)
##BORUTA (50 TEST OBSERVATIONS)
rf_Boruta <- randomForest(Class~Symptoms + PortalVeinThrombosis + LiverMetastasis + PerformanceStatus + Ascitesdegree, data = Training_DT_Boruta)
print(rf_Boruta)
##
## Call:
## randomForest(formula = Class ~ Symptoms + PortalVeinThrombosis + LiverMetastasis + PerformanceStatus + Ascitesdegree, data = Training_DT_Boruta)
## Type of random forest: classification
## Number of trees: 500
## No. of variables tried at each split: 2
##
## OOB estimate of error rate: 33.04%
## Confusion matrix:
## 0 1 class.error
## 0 24 22 0.4782609
## 1 16 53 0.2318841
##5. Prediction
pred_rf_Boruta <- predict(rf_Boruta, newdata = Testing_DT_Boruta)
confusionMatrix(as.factor(pred_rf_Boruta), as.factor(Testing_DT_Boruta$Class))
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 11 6
## 1 6 27
##
## Accuracy : 0.76
## 95% CI : (0.6183, 0.8694)
## No Information Rate : 0.66
## P-Value [Acc > NIR] : 0.08699
##
## Kappa : 0.4652
##
## Mcnemar's Test P-Value : 1.00000
##
## Sensitivity : 0.6471
## Specificity : 0.8182
## Pos Pred Value : 0.6471
## Neg Pred Value : 0.8182
## Prevalence : 0.3400
## Detection Rate : 0.2200
## Detection Prevalence : 0.3400
## Balanced Accuracy : 0.7326
##
## 'Positive' Class : 0
##
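An accuracy of 0.76 against a no-information rate of 0.66 looks like a clear gain, yet the reported p-value (0.08699) is above 0.05. The sketch below assumes that this p-value comes from a one-sided exact binomial test of the accuracy against the no-information rate, and that the interval is the exact binomial confidence interval; that reading of the output is an assumption, and the variable names are illustrative.

```r
# Counts from the random forest (Boruta) confusion matrix: rows = prediction, columns = reference.
cm_rf <- matrix(c(11, 6, 6, 27), nrow = 2,
                dimnames = list(Prediction = c("0", "1"), Reference = c("0", "1")))

n <- sum(cm_rf)
n_correct <- sum(diag(cm_rf))

# No-information rate: accuracy obtained by always predicting the largest reference class.
no_info_rate <- max(colSums(cm_rf)) / n

# One-sided exact binomial test: is the accuracy greater than the no-information rate?
acc_test <- binom.test(n_correct, n, p = no_info_rate, alternative = "greater")

# Two-sided exact binomial confidence interval for the accuracy.
acc_ci <- binom.test(n_correct, n)$conf.int

no_info_rate
acc_test$p.value
acc_ci
```

With only 50 test observations the interval is wide, so the differences in accuracy between the three models in this section (0.74, 0.66, and 0.76) should be read with caution.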
varImpPlot(rf_Boruta)