Dataset

## [1] "C:/Users/cliff/OneDrive/Documents/File Aktu ITS/Semester 5/Business Intelligence"
## # A tibble: 7,043 × 16
##    `Customer ID` `Tenure Months` Location `Device Class` `Games Product`    
##            <dbl>           <dbl> <chr>    <chr>          <chr>              
##  1             0               2 Jakarta  Mid End        Yes                
##  2             1               2 Jakarta  High End       No                 
##  3             2               8 Jakarta  High End       No                 
##  4             3              28 Jakarta  High End       No                 
##  5             4              49 Jakarta  High End       No                 
##  6             5              10 Jakarta  Mid End        No                 
##  7             6               1 Jakarta  Mid End        No                 
##  8             7               1 Jakarta  Low End        No internet service
##  9             8              47 Jakarta  High End       No                 
## 10             9               1 Jakarta  Mid End        No                 
## # ℹ 7,033 more rows
## # ℹ 11 more variables: `Music Product` <chr>, `Education Product` <chr>,
## #   `Call Center` <chr>, `Video Product` <chr>, `Use MyApp` <chr>,
## #   `Payment Method` <chr>, `Monthly Purchase (Thou. IDR)` <dbl>,
## #   `Churn Label` <chr>, Longitude <dbl>, Latitude <dbl>,
## #   `CLTV (Predicted Thou. IDR)` <dbl>
##   Customer ID   Tenure Months     Location         Device Class      
##  Min.   :   0   Min.   : 0.00   Length:7043        Length:7043       
##  1st Qu.:1760   1st Qu.: 9.00   Class :character   Class :character  
##  Median :3521   Median :29.00   Mode  :character   Mode  :character  
##  Mean   :3521   Mean   :32.37                                        
##  3rd Qu.:5282   3rd Qu.:55.00                                        
##  Max.   :7042   Max.   :72.00                                        
##  Games Product      Music Product      Education Product  Call Center       
##  Length:7043        Length:7043        Length:7043        Length:7043       
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##  Video Product       Use MyApp         Payment Method    
##  Length:7043        Length:7043        Length:7043       
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##  Monthly Purchase (Thou. IDR) Churn Label          Longitude    
##  Min.   : 23.73               Length:7043        Min.   :106.8  
##  1st Qu.: 46.15               Class :character   1st Qu.:106.8  
##  Median : 91.45               Mode  :character   Median :106.8  
##  Mean   : 84.19                                  Mean   :107.0  
##  3rd Qu.:116.81                                  3rd Qu.:107.6  
##  Max.   :154.38                                  Max.   :107.6  
##     Latitude      CLTV (Predicted Thou. IDR)
##  Min.   :-6.915   Min.   :2604              
##  1st Qu.:-6.915   1st Qu.:4510              
##  Median :-6.200   Median :5885              
##  Mean   :-6.404   Mean   :5720              
##  3rd Qu.:-6.200   3rd Qu.:6995              
##  Max.   :-6.200   Max.   :8450

Including Plots

The descriptive statistics above are explored further in the following visualizations.

Correlation Analysis

library(corrplot)
## Warning: package 'corrplot' was built under R version 4.2.3
## corrplot 0.92 loaded
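# Columns 2, 12 and 16 are Tenure Months, Monthly Purchase and CLTV
numdata <- data[, c(2, 12, 16)]
# Pairwise Pearson correlations, plotted as numbers
cormatrix <- cor(numdata)
corrplot(cormatrix, method = "number")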

The correlation analysis shows that the numeric variables in the dataset are only weakly related to one another: every pairwise correlation is below 0.5. The strongest relationship is between CLTV and Tenure Months, that is, the number of consecutive months a customer has been buying data packages or quota.
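Selecting columns by position (`c(2,12,16)`) breaks silently if the column order changes. The sketch below shows the same correlation step with columns selected by name, plus a way to find the strongest off-diagonal pair programmatically. The data frame `toy` and its values are invented for illustration; only the column names come from the dataset.

```r
# Toy data standing in for the real dataset; the values are made up.
toy <- data.frame(
  `Tenure Months` = c(2, 8, 28, 49, 10, 1, 47),
  `Monthly Purchase (Thou. IDR)` = c(70.0, 91.9, 129.5, 136.2, 38.4, 25.9, 95.0),
  `CLTV (Predicted Thou. IDR)` = c(4210, 3512, 6995, 7120, 5000, 4300, 6400),
  check.names = FALSE  # keep the spaces and parentheses in the column names
)

# Select the numeric columns by name instead of by position.
num_cols <- c("Tenure Months",
              "Monthly Purchase (Thou. IDR)",
              "CLTV (Predicted Thou. IDR)")
cormatrix <- cor(toy[, num_cols])
print(round(cormatrix, 2))

# Find the strongest pairwise correlation, ignoring the diagonal of 1s.
off_diag <- cormatrix
diag(off_diag) <- NA
idx <- which(abs(off_diag) == max(abs(off_diag), na.rm = TRUE),
             arr.ind = TRUE)[1, ]
strongest_pair <- c(rownames(cormatrix)[idx["row"]],
                    colnames(cormatrix)[idx["col"]])
print(strongest_pair)
```

On the real data, the same call would be `cor(data[, num_cols])`, assuming the column names match those printed in the tibble header.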

Linear Regression

# Simple linear regression: CLTV (response) on Tenure Months (predictor)
simpleregression <- lm(formula = `CLTV (Predicted Thou. IDR)` ~ `Tenure Months`, data = data)
summary(simpleregression)
## 
## Call:
## lm(formula = `CLTV (Predicted Thou. IDR)` ~ `Tenure Months`, 
##     data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3456.2 -1055.1    64.9  1145.8  2853.2 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     4916.8068    27.8395  176.61   <2e-16 ***
## `Tenure Months`   24.8239     0.6852   36.23   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1412 on 7041 degrees of freedom
## Multiple R-squared:  0.1571, Adjusted R-squared:  0.157 
## F-statistic:  1313 on 1 and 7041 DF,  p-value: < 2.2e-16

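In the simple linear regression, with CLTV as the dependent variable and Tenure Months as the single independent variable, the model's goodness of fit (R-squared) is about 15.7%. In other words, tenure alone explains roughly 15.7% of the variation in CLTV. The slope is highly significant (p < 2e-16): each additional month of tenure is associated with an increase of about 24.8 thousand IDR in predicted CLTV.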
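To make the fitted line concrete, the sketch below plugs the reported coefficients into the regression equation, CLTV = 4916.8068 + 24.8239 × Tenure Months, and recovers the correlation implied by the reported R-squared. The helper `predict_cltv` is a hypothetical name introduced here; the numbers are copied from the summary output above.

```r
# Coefficients copied from the summary(simpleregression) output.
intercept <- 4916.8068
slope     <- 24.8239

# Hypothetical helper: predicted CLTV (thousand IDR) for a given tenure.
predict_cltv <- function(tenure_months) {
  intercept + slope * tenure_months
}

# Predictions at the minimum tenure, one year, and the maximum tenure.
print(predict_cltv(c(0, 12, 72)))

# In a one-predictor regression, R-squared equals the squared Pearson
# correlation, so the correlation can be recovered from the reported value.
r_squared <- 0.1571
implied_r <- sqrt(r_squared)  # positive root, because the slope is positive
print(round(implied_r, 3))
```

The implied correlation of roughly 0.4 is consistent with the correlation analysis, where all pairwise correlations were below 0.5.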

# Multiple linear regression: add Monthly Purchase as a second predictor
multiregression <- lm(formula = `CLTV (Predicted Thou. IDR)` ~ `Tenure Months` + `Monthly Purchase (Thou. IDR)`, data = data)
summary(multiregression)
## 
## Call:
## lm(formula = `CLTV (Predicted Thou. IDR)` ~ `Tenure Months` + 
##     `Monthly Purchase (Thou. IDR)`, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3455.1 -1055.4    65.3  1145.6  2852.7 
## 
## Coefficients:
##                                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    4.916e+03  4.220e+01  116.49   <2e-16 ***
## `Tenure Months`                2.482e+01  7.073e-01   35.09   <2e-16 ***
## `Monthly Purchase (Thou. IDR)` 1.777e-02  4.441e-01    0.04    0.968    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1412 on 7040 degrees of freedom
## Multiple R-squared:  0.1571, Adjusted R-squared:  0.1569 
## F-statistic: 656.2 on 2 and 7040 DF,  p-value: < 2.2e-16

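In the multiple linear regression, with CLTV as the dependent variable and Tenure Months and Monthly Purchase as the two independent variables, the adjusted R-squared is 15.69%. Adding Monthly Purchase does not improve the model: its coefficient is not significant (p = 0.968), the R-squared is unchanged at 0.1571, and the adjusted R-squared falls slightly compared with the simple model (0.1569 versus 0.157).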
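The small drop in adjusted R-squared follows directly from its definition, adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), which penalizes each extra predictor. The sketch below reproduces both reported values from R² = 0.1571 and n = 7043 observations; the function name `adj_r2` is a hypothetical helper, not part of the original analysis.

```r
# Hypothetical helper implementing the standard adjusted R-squared formula.
# r2: multiple R-squared, n: number of observations, p: number of predictors.
adj_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

n  <- 7043    # rows in the dataset (residual df 7041 + 2 estimated coefficients)
r2 <- 0.1571  # reported for both models, rounded to four decimals

adj_simple <- adj_r2(r2, n, p = 1)  # Tenure Months only
adj_multi  <- adj_r2(r2, n, p = 2)  # Tenure Months + Monthly Purchase

print(round(c(simple = adj_simple, multiple = adj_multi), 4))
```

Because the second predictor adds essentially no explanatory power, the penalty outweighs the gain and the simpler model is preferable. With the fitted objects available, `anova(simpleregression, multiregression)` would test the same comparison formally.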