RStudio is a free and open-source integrated development environment (IDE) for R, a programming language for statistical computing and graphics. The credit risk data set records the credit risk of an individual based on the loan they have taken out and other features of the individual.
Through R, RStudio gives access to a wide range of statistical and graphical techniques, such as linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, time-series plots, and maps.
After downloading the bdad_lab01 zip folder, open your Downloads folder, right-click the zip folder, and select ‘Extract’. This gives you a new unzipped folder. Next, set this folder as the working directory: open the lab file from the unzipped folder in RStudio, go to ‘Session’, choose ‘Set Working Directory’, and click ‘To Source File Location’. This option uses the location of the file that is currently open in the editor, so the lab file must be opened from the unzipped folder first. Now, follow the directions to complete the lab.
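The same step can also be done from the console. The following is a minimal sketch, assuming the folder was extracted to a Downloads directory under the home folder; the path in `lab_dir` is a hypothetical example and should be changed to wherever the folder actually lives.

```r
# Hypothetical location of the extracted lab folder; adjust to your machine.
lab_dir <- file.path(path.expand("~"), "Downloads", "bdad_lab01")

# Only change directory if the folder actually exists.
if (dir.exists(lab_dir)) {
  setwd(lab_dir)
}

# Confirm which directory R is currently using.
current_dir <- getwd()
current_dir

# List the files R can see there (the lab's csv file should appear).
list.files()
```

Checking `getwd()` before reading any file is a quick way to catch a "cannot open file" error caused by the wrong working directory.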
To begin the lab, examine the content of the csv file ‘creditrisk.csv’ by opening the file in RStudio. Create a simple star relational schema using the ERDPlus standalone feature (https://erdplus.com/#/standalone), take a screenshot of the image, and upload it below.
Above is my star schema. It is centered on the Scoring fact table, which holds the overall scoring information for each person. Every attribute that takes categorical (non-numerical) values is modeled as its own dimension table, and each dimension is connected to the Scoring fact table in the center through the ID stored in that dimension.
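To make the idea of a fact table linked to dimensions by IDs concrete, here is a small sketch in R. The table names, ID columns, and rows are all made up for illustration; only the Home category labels are taken from the data set, and this is not the schema drawn in ERDPlus.

```r
# Toy fact table: one row per loan application, with a key into a dimension.
# All IDs and numbers here are invented for illustration.
fact_scoring <- data.frame(
  ApplicantID = 1:4,
  HomeID      = c(1L, 2L, 1L, 3L),
  Income      = c(120, 90, 200, 75)
)

# Toy dimension table: one row per category of a non-numerical attribute.
dim_home <- data.frame(
  HomeID = 1:3,
  Home   = c("owner", "rent", "parents")
)

# Join the dimension back onto the fact table through the shared ID.
joined <- merge(fact_scoring, dim_home, by = "HomeID")
joined[order(joined$ApplicantID), ]
```

The fact table stays compact because it stores only the ID, while the descriptive label lives once in the dimension table.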
Next, read the csv file into RStudio. It can be useful to assign the data to a name so that it is easy to refer to later. Here we name the data ‘mydata’. To see the data in the console, one can ‘call’ it by typing its name.
mydata = read.csv(file="data/Scoring.csv")
head(mydata)
summary(mydata)
 Status      Seniority        Home           Time            Age             Marital
 bad :1249   Min.   : 0.000   ignore :  20   Min.   : 6.00   Min.   :18.00   divorced :  38
 good:3197   1st Qu.: 2.000   other  : 319   1st Qu.:36.00   1st Qu.:28.00   married  :3238
             Median : 5.000   owner  :2106   Median :48.00   Median :36.00   separated: 130
             Mean   : 7.991   parents: 782   Mean   :46.45   Mean   :37.08   single   : 973
             3rd Qu.:12.000   priv   : 246   3rd Qu.:60.00   3rd Qu.:45.00   widow    :  67
             Max.   :48.000   rent   : 973   Max.   :72.00   Max.   :68.00

 Records        Job              Expenses        Income          Assets           Debt
 no_rec :3677   fixed    :2803   Min.   : 35.0   Min.   :  1.0   Min.   :     0   Min.   :    0.0
 yes_rec: 769   freelance:1021   1st Qu.: 35.0   1st Qu.: 90.0   1st Qu.:     0   1st Qu.:    0.0
                others   : 171   Median : 51.0   Median :124.0   Median :  3000   Median :    0.0
                partime  : 451   Mean   : 55.6   Mean   :140.6   Mean   :  5355   Mean   :  342.3
                                 3rd Qu.: 72.0   3rd Qu.:170.0   3rd Qu.:  6000   3rd Qu.:    0.0
                                 Max.   :180.0   Max.   :959.0   Max.   :300000   Max.   :30000.0

 Amount            Price             Finrat            Savings
 $1,000.00 : 541   $1,500.00 :  46   Min.   :  6.702   Min.   :-8.160
 $1,200.00 : 221   $1,200.00 :  45   1st Qu.: 60.030   1st Qu.: 1.615
 $800.00   : 219   $1,300.00 :  45   Median : 77.097   Median : 3.120
 $1,100.00 : 210   $1,600.00 :  43   Mean   : 72.616   Mean   : 3.860
 $1,300.00 : 198   $1,100.00 :  41   3rd Qu.: 88.460   3rd Qu.: 5.196
 $900.00   : 198   $1,700.00 :  39   Max.   :100.000   Max.   :33.250
 (Other)   :2859   (Other)   :4187
Above is a summary of mydata, the data set I use throughout this lab. Each column is listed with its categories or its summary statistics for quick reference and accessibility.
To capture, or extract, the checking and savings columns and perform some analytics on them, we must first extract each column from the data separately. Writing the name of the data followed by ‘$’ and a column name extracts that column. For convenience, we give the extracted data its own name.
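In the summary, Amount and Price are reported as counts per value rather than as quartiles, which indicates that R read them as text: the `$` and `,` characters prevent them from being parsed as numbers. A minimal sketch of one way to convert such values is shown here, using three toy strings in the same format; the variable names are illustrative and this conversion is not part of the original lab steps.

```r
# Toy values in the same "$1,000.00" format seen in the summary output.
amount_text <- c("$1,000.00", "$1,200.00", "$800.00")

# Strip the dollar sign and thousands separators, then convert to numbers.
amount_num <- as.numeric(gsub("[$,]", "", amount_text))

amount_num
mean(amount_num)
```

For the real data, the same call could be applied to `as.character(mydata$Amount)`, after which `summary()` would report quartiles for that column.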
Below, we have extracted the checking column.
#Extracting the Checking Column
checking = mydata$Checking
#Calling the Checking Column
checking
NULL
class(checking)
[1] "NULL"
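The result is NULL because the summary of mydata shows no column named Checking, and `$` returns NULL without any error when the requested column does not exist. The sketch below reproduces this on a toy data frame (the name `toy` and its values are made up) and shows two ways to catch the problem earlier.

```r
# Toy data frame that, like mydata, has a Savings column but no Checking column.
toy <- data.frame(Savings = c(4.20, 4.98, 1.98))

# List the available column names before extracting anything.
names(toy)

# `$` silently returns NULL when the column does not exist.
is.null(toy$Checking)

# A safer pattern: test for the column first.
has_checking <- "Checking" %in% names(toy)
has_checking

# Bracket indexing by name fails loudly instead of returning NULL.
result <- tryCatch(toy[, "Checking"], error = function(e) conditionMessage(e))
result
```

Running `names(mydata)` on the real data is the quickest way to see which columns can actually be extracted.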
Now, fill in the code to extract and call the savings column.
#Extracting the Savings Column
savings = mydata$Savings
#Calling the Savings Column
savings
[1] 4.2000000 4.9800000 1.9800000 7.9333333 7.0838710 12.8307692 1.8750000 2.7000000
[9] 0.8500000 -0.4000000 2.7130435 3.3784615 3.9600000 5.5440000 0.6750000 1.4933333
[17] 4.7200000 4.6000000 3.7090909 4.8960000 2.4400000 15.2181818 0.2000000 4.1684211
[25] 4.3555556 3.6000000 6.3000000 3.5200000 0.4000000 3.1418182 6.9300000 6.5600000
[33] 2.7000000 4.4181818 2.5371429 7.8240000 2.4000000 1.6484211 4.0000000 1.6984615
[41] 4.1333333 12.3157895 3.2160000 -0.1600000 4.4347826 4.3714286 5.1360000 6.4000000
[49] 4.3200000 2.8125000 1.0000000 9.8000000 4.5200000 2.1600000 0.7200000 3.9000000
[57] 3.3000000 3.8000000 18.7500000 5.7377049 2.6181818 2.4500000 2.7600000 2.7720000
[65] 2.2666667 3.1800000 6.2500000 8.8235294 2.4900000 7.1200000 2.8200000 12.4000000
[73] 13.3200000 0.6171429 1.3200000 9.8400000 0.0000000 3.7028571 2.9333333 0.3600000
[81] 5.8909091 0.4235294 13.6800000 7.5000000 0.6240000 7.3000000 2.7349229 0.5769231
[89] 5.1600000 2.2105263 2.8187919 14.9760000 8.5950000 7.2000000 4.5000000 13.3920000
[97] 2.8421053 2.1120000 3.6000000 0.4160000 3.1090909 3.8000000 5.6492308 2.5846154
[105] 14.4000000 3.5200000 11.7500000 2.8500000 7.6114286 6.6947368 2.2720000 8.2835821
[113] 1.5120000 13.8268657 0.6600000 7.4250000 3.8347826 2.4757895 13.7142857 5.0000000
[121] 3.8666667 1.8333333 6.8307692 15.0545455 9.1764706 2.5714286 4.8387097 1.8461538
[129] 2.2320000 4.1052632 0.6428571 4.8461538 7.5000000 11.5714286 7.7076923 4.6666667
[137] 2.8695652 2.4545455 2.9040000 5.4750000 1.2600000 1.6421053 5.0040000 20.2353333
[145] 3.0461538 -0.1270588 16.2857143 2.2000000 4.6000000 14.4000000 2.3040000 8.4545455
[153] 0.0000000 7.2857143 0.7840000 4.0400000 1.3600000 12.8571429 1.7760000 2.3225806
[161] 3.4036364 8.6160000 16.0200000 3.7800000 17.5483871 1.6000000 3.0000000 2.5600000
[169] 0.5076923 6.7200000 3.9111111 1.4823529 2.7428571 9.0000000 2.1378579 7.9000000
[177] 5.2000000 20.6250000 2.6000000 1.6000000 4.4336842 0.4285714 5.3520000 2.5000000
[185] 3.1714286 5.4562909 2.2500000 2.1000000 2.2971429 5.2800000 6.4800000 5.1000000
[193] 1.6800000 0.9900000 2.1000000 0.6315789 1.6822430 9.6413793 1.7500000 1.8285714
[201] 2.7840000 8.9294118 8.0914286 6.0000000 -1.1280000 5.9600000 1.8000000 3.7800000
[209] 1.5300000 3.7080000 3.3000000 1.0080000 2.5440000 4.7500000 5.6000000 1.2545455
[217] 3.0240000 3.7200000 4.4727273 6.0000000 8.1473684 3.3722628 3.5040000 7.0121951
[225] 6.2280000 7.0000000 2.9100000 4.7666667 3.0600000 2.2000000 0.0000000 4.7400000
[233] 2.8695652 3.6000000 2.5714286 3.1800000 2.9647059 12.1000000 6.6171429 2.6222222
[241] 3.8400000 3.6000000 11.4600000 6.9529412 1.2631579 9.9000000 14.2500000 1.3333333
[249] 4.8311688 4.7500000 4.4571429 4.5500000 3.1309091 4.5000000 3.6000000 4.2923077
[257] 15.1404000 2.0160000 5.7085714 6.3000000 11.2400000 2.6891566 4.9200000 9.2880000
[265] 7.0129870 6.4285714 1.6800000 8.4960000 5.8200000 1.7684211 3.4800000 2.8200000
[273] 4.2000000 2.5440000 4.5333333 -0.3000000 4.6153846 11.3400000 1.9200000 3.4690909
[281] 6.1600000 3.7800000 0.5294118 13.9636364 6.8000000 3.5000000 5.0181818 6.5400000
[289] 6.5700000 5.9142857 2.1600000 0.1200000 3.1456311 1.5000000 4.5257143 0.8640000
[297] 8.2200000 0.6720000 0.3157895 5.4857143 1.8514286 -0.9913043 -0.2000000 2.1600000
[305] 1.5230769 7.4400000 2.0057143 -0.1440000 0.7000000 0.9415385 10.8461539 1.4400000
[313] 3.9085714 2.3571429 3.1200000 0.8000000 14.5600000 2.4500000 1.2218182 6.1714286
[321] 3.0000000 7.7700000 7.1092437 2.0347826 2.6896552 14.3684211 -0.9000000 12.7800000
[329] 4.0727273 0.3888889 4.9578947 -1.6013592 7.1250000 13.2857143 3.3458824 2.7716129
[337] 3.0000000 4.0800000 2.9200000 3.1090909 5.6400000 -0.6250000 5.6290909 5.5457143
[345] -1.8162162 -1.4117647 -7.2000000 7.6800000 1.4400000 2.6040000 5.7303371 -1.8000000
[353] 8.8800000 2.0250000 1.4400000 3.0580645 4.0909091 14.9600000 3.6000000 6.4285714
[361] 4.3333333 0.1200000 5.6000000 2.3400000 2.6511628 7.5420000 4.8000000 -0.4800000
[369] 5.9250000 15.0000000 7.9800000 3.0923077 3.7384615 2.6400000 1.5545455 -4.0806000
[377] 2.7600000 5.5200000 4.1280000 2.6800000 5.1120000 1.9200000 10.2000000 13.5000000
[385] 1.6615385 4.3058824 5.2800000 4.8000000 8.4720000 6.6000000 5.4705882 -0.6720000
[393] 2.7428571 -1.1868132 4.5600000 6.7200000 3.5000000 7.6363636 1.8000000 3.7500000
[401] 3.7777778 4.8500000 2.8486957 4.9285714 3.1285714 3.4971429 2.0800000 2.2560000
[409] 4.4210526 7.9462500 4.5428571 4.0320000 3.0514286 0.2278481 0.0900000 -5.8800000
[417] 1.3043478 5.3000000 5.5200000 3.9927273 4.2666667 -0.5672727 2.0571429 1.2705882
[425] 2.1857143 4.4800000 2.0088106 2.3595506 6.7090909 2.7600000 7.4541176 7.8000000
[433] 8.1000000 7.8923077 1.3642105 7.0720000 1.3440000 4.0000000 9.9840000 -0.1714286
[441] 12.8571429 1.9869767 2.3294118 4.7454545 1.7837838 0.3582090 5.2363636 0.4266667
[449] 1.0344828 8.3454545 2.8285714 3.6000000 7.3000000 6.2400000 6.0857143 1.6363636
[457] 3.2275862 7.4000000 5.5555556 5.4800000 1.2681638 3.0000000 3.0428571 0.4444444
[465] 3.4800000 1.6363636 4.5818182 2.4000000 0.2571429 2.8588235 1.0000000 1.1500000
[473] 6.5000000 8.9760000 12.5250000 5.2500000 -0.5866667 12.1714286 3.1600000 1.9800000
[481] 3.7800000 -0.7200000 0.7285714 1.8461538 2.9160000 2.3400000 5.5466667 2.2800000
[489] 2.3555556 4.0714286 5.8200000 4.2162162 1.6695652 2.5894737 3.6000000 8.4000000
[497] 2.3333333 1.5000000 1.4000000 0.3600000 4.1600000 4.4800000 1.8720000 4.0800000
[505] 5.9294118 5.5875000 0.7200000 5.7000000 1.6363636 5.7402062 1.9090909 3.6000000
[513] 12.1371429 2.2000000 7.2461538 2.4272727 6.9750000 1.8200000 2.3040000 5.6000000
[521] 30.4200000 4.0000000 20.8615385 10.7500000 5.3333333 1.6800000 2.6742857 6.2769231
[529] 3.2400000 7.1563636 2.9880000 0.2076923 3.4560000 2.8800000 1.6524590 3.6000000
[537] 6.1764706 7.8400000 5.7240000 3.0500000 10.6666667 7.0400000 3.4800000 -0.5625000
[545] 0.4698947 0.4137931 3.2231405 7.4000000 7.3200000 2.6400000 2.9600000 0.3469880
[553] -1.0285714 6.8363636 -0.9062500 -0.1800000 6.6705882 2.4705882 3.4941176 6.5142857
[561] 1.1200000 2.0640000 1.3200000 9.9000000 0.2571429 5.0800000 2.8581818 2.6619718
[569] 6.6470588 1.8800000 2.5405091 1.1840000 2.6500000 8.6117647 2.9200000 7.9000000
[577] 3.3600000 0.1551724 3.4800000 2.8500000 1.9636364 1.7454545 3.3517241 10.5000000
[585] 2.8800000 7.7400000 8.7000000 1.4769231 6.6428571 2.9454545 1.5750000 0.5454545
[593] 1.7280000 1.9800000 1.3636364 3.3600000 4.1632653 3.1428571 6.7320000 4.0145455
[601] 0.1200000 2.3333333 20.2800000 2.3791304 8.2285714 13.3200000 3.0000000 0.0000000
[609] 0.8000000 5.1000000 13.5000000 3.0000000 3.6955200 0.6776471 4.5176471 2.4000000
[617] 1.8947368 6.5167883 0.6776471 3.1418182 3.6000000 -0.1500000 7.2000000 2.4000000
[625] 2.2666667 6.6461538 3.9600000 2.7818182 2.1600000 10.7368421 17.2444444 2.3261538
[633] 9.0000000 0.3818182 8.2000000 1.5000000 2.4000000 3.1090909 3.6428571 0.9692308
[641] 0.1125000 -1.3333333 3.7846154 2.6142857 -0.2571429 2.9280000 3.0260870 3.3646154
[649] 1.5384615 10.8000000 1.2240000 2.7500000 0.1125000 9.0600000 8.7000000 9.8313253
[657] 6.3473684 5.5862069 1.3800000 2.9217391 7.2000000 9.2142857 0.3000000 4.5000000
[665] 7.9000000 12.4000000 4.1071429 7.5000000 3.2400000 -0.4861091 0.9000000 3.6750000
[673] 5.4276923 7.0451613 -0.2173913 3.7107692 3.4457143 16.4000000 2.9090909 0.9000000
[681] 7.7142857 14.8800000 6.1565217 4.8648649 1.0800000 1.3440000 2.1818182 18.9000000
[689] 3.5625000 0.2817391 3.7894737 3.1680000 8.4600000 7.0200000 7.7777778 3.4500000
[697] 3.2533333 7.0666667 0.6338028 0.4400000 6.0000000 6.5505882 -0.4258065 0.7200000
[705] 3.3000000 3.5733333 5.1840000 2.1000000 3.5368421 5.0769231 4.8800000 1.2672000
[713] -1.4786730 2.3142857 -0.1476923 1.2857143 0.6857143 2.3400000 6.9000000 5.8125000
[721] 2.3680000 3.3517241 3.7894737 5.3866667 4.2545455 -0.3750000 0.7363636 6.1333333
[729] 3.3000000 1.8000000 3.7371429 2.5285714 11.0571429 6.6240000 2.3400000 1.1320755
[737] 4.2500000 1.9569231 3.3912000 2.9008000 7.7760000 4.0444444 2.4000000 4.8947368
[745] 0.9000000 -0.5760000 3.7333333 2.0914286 5.2500000 6.7200000 0.1200000 1.2396694
[753] 23.1000000 3.3692308 -1.6000000 5.1330363 6.1333333 2.8000000 6.5000000 2.8235294
[761] 3.0600000 3.2727273 2.2500000 3.2000000 0.5450000 3.4560000 2.2500000 0.3428571
[769] 0.8000000 2.8000000 9.1764706 20.4857143 4.8240000 6.1714286 3.2727273 9.5142857
[777] 0.6222222 11.7000000 -0.6600000 1.1700000 -0.3375000 3.8000000 1.0028571 2.4428571
[785] 1.6740000 12.7358491 4.5000000 2.7230769 3.4285714 1.6800000 11.5000000 2.2950000
[793] 0.2400000 2.7600000 1.9200000 2.3333333 6.7800000 4.1400000 5.4171429 4.5333333
[801] 1.1040000 4.2070588 5.3739130 3.7136842 18.7200000 2.9557895 -0.4200000 5.6914286
[809] 4.7154000 8.3200000 13.6666667 3.6809816 10.0884956 1.6500000 5.8500000 6.0000000
[817] 2.9400000 9.0000000 4.7076923 3.4285714 3.6545455 2.7600000 -2.7000000 4.3200000
[825] 3.0600000 -1.4608696 1.4040000 5.3333333 3.5345455 -0.0800000 1.8991304 4.1142857
[833] 2.9760000 9.0000000 1.9920000 3.4200000 30.2000000 6.9176471 0.9000000 4.0500000
[841] 10.5000000 3.4560000 5.9563636 6.1600000 1.9200000 3.8964706 0.2880000 -0.9000000
[849] 1.5300000 3.9000000 7.0819672 4.6200000 2.0600000 7.5789474 0.9913043 4.1538462
[857] 4.4526316 2.1666667 5.3142857 0.5454545 2.8333333 2.1853659 -1.2960000 -1.0800000
[865] 3.0000000 6.0000000 2.1767442 2.1000000 3.6981818 4.7040000 9.4615385 2.9333333
[873] 5.2800000 -1.9200000 4.3405714 6.6514286 5.1958763 2.5875000 3.5345455 0.7800000
[881] 1.2800000 2.0000000 3.0000000 3.1200000 -1.0588235 6.4500000 -0.3450000 3.6000000
[889] 3.5200000 3.3600000 5.0400000 3.7800000 5.0666667 2.5600000 1.8897638 5.3076923
[897] 3.7166667 2.2702703 0.3085714 1.0200000 7.6909091 3.0000000 7.5085714 2.3750000
[905] 4.2000000 2.8421053 3.1200000 2.5548387 4.0800000 4.3200000 1.5840000 3.1680000
[913] 4.5942857 1.6581818 3.7800000 7.6153846 26.0000000 16.2576923 2.7000000 2.4000000
[921] 1.8782609 3.8571429 1.0800000 8.0000000 1.5000000 3.7120000 2.2702703 7.4880000
[929] 2.2560000 5.3742857 2.2758621 -0.3900000 8.0616333 2.9364706 1.9285714 0.0000000
[937] -2.8000000 8.0000000 5.4109091 7.2000000 0.2470588 -0.5142857 0.1107692 2.5699482
[945] 5.1428571 8.2568807 0.9375000 -1.6744186 -0.5400000 11.6000000 2.1600000 3.0000000
[953] 0.9120000 6.5454545 4.7040000 2.5200000 5.4000000 2.8595745 -8.1600000 8.5371429
[961] 4.6200000 1.8500000 3.1058824 1.7431579 6.8347826 0.5714286 3.3600000 6.3529412
[969] 2.6000000 3.7800000 3.0000000 0.9600000 4.0800000 8.4600000 2.0625000 11.2000000
[977] 13.0285714 10.6666667 1.9200000 -1.2500000 4.2201290 1.2600000 2.2105263 5.5333333
[985] 1.4100000 1.3764706 4.0800000 1.4500000 7.8000000 2.4000000 1.8000000 6.0000000
[993] 4.8413793 2.4000000 9.6120000 2.8200000 5.1264000 3.2470588 2.0100000 1.5000000
[ reached getOption("max.print") -- omitted 3446 entries ]
class(savings)
[1] "numeric"
The output above lists the first 1,000 Savings values; R stops printing at its max.print limit and omits the remaining 3,446 entries. Beneath the values is the class R assigns to the data. For savings the class is “numeric”, whereas for checking it is “NULL”, because the data set contains no Checking column.
To calculate the mean, or average, of the checking column by hand, one would add up every individual entry and divide by the total number of rows. This would take a long time, but thankfully R has a command for this.
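Printing every value is rarely needed. The following sketch shows a few standard functions for inspecting a long vector without filling the console; `savings_demo` is a stand-in vector of the same length created only for this example, not the real Savings data.

```r
# Stand-in numeric vector with 4446 entries; replace with the real `savings`.
savings_demo <- seq(0, 10, length.out = 4446)

n_values <- length(savings_demo)      # how many values there are
first_ten <- head(savings_demo, 10)   # only the first ten values

n_values
first_ten
class(savings_demo)                   # the class, without printing the data
str(savings_demo)                     # class plus a short preview
```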
We have done an example using the checkings column. Compute the same using the savings column.
#Using the 'mean' function on checking to calculate the checking average and naming the average 'meanChecking'
meanChecking = mean(checking)
argument is not numeric or logical: returning NA
#Calling the average
meanChecking
[1] NA
#Find the average of the savings column and name it 'meanSaving'
meanSaving = mean(savings)
#Call meanSaving
meanSaving
[1] 3.860083
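The by-hand calculation described earlier (add the entries, divide by how many there are) can be sketched next to the built-in function. The values in `x` are toy numbers with one missing entry added on purpose; they are not the lab data.

```r
x <- c(4.20, 4.98, 1.98, NA)   # toy values, including one missing entry

# By hand: add the non-missing entries and divide by how many there are.
ok <- !is.na(x)
manual_mean <- sum(x[ok]) / sum(ok)

# Built-in: na.rm = TRUE drops missing values before averaging.
builtin_mean <- mean(x, na.rm = TRUE)

manual_mean
builtin_mean
all.equal(manual_mean, builtin_mean)

# Without na.rm, a single missing value makes the whole mean NA.
mean(x)
```

Note that `mean()` also returns NA, with a warning, when its argument is not numeric or logical, which is what happens for the NULL checking object above.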
Next, compute the standard deviation or spread of both the checkings and savings columns.
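#Computing the standard deviation of checking and naming it 'spreadChecking'
spreadChecking = sd(checking)
#Computing the standard deviation of savings and naming it 'spreadSavings'
spreadSavings = sd(savings)
#Calling the standard deviation of savings
spreadSavings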
[1] 3.726292
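For reference, `sd()` computes the sample standard deviation, which divides by n - 1 rather than n. A minimal sketch with toy values (not the lab data) shows the by-hand version agreeing with the built-in one.

```r
x <- c(4.20, 4.98, 1.98)   # toy values

n <- length(x)

# By hand: square root of the summed squared deviations over (n - 1).
manual_sd <- sqrt(sum((x - mean(x))^2) / (n - 1))

# Built-in sample standard deviation.
builtin_sd <- sd(x)

manual_sd
builtin_sd
all.equal(manual_sd, builtin_sd)
```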
Now we compute the SNR, the signal-to-noise ratio. Because R has no built-in function for it, we write the formula ourselves.
The SNR is the mean, or average, divided by the spread (the standard deviation).
#Compute the snr of Checking and name it snr_Checking
snr_Checking = meanChecking/spreadChecking
#Call snr_Checking
snr_Checking
[1] NA
#Find the snr of the savings and name it snr_Savings
snr_Savings = meanSaving/spreadSavings
#Call snr_Savings
snr_Savings
[1] 1.035905
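Since the same formula is applied to more than one column, it can be wrapped in a small function. This is a sketch only: `snr` is a hypothetical helper name, not a built-in, and returning NA for non-numeric input is a design choice made for this example.

```r
# Hypothetical helper (not a built-in): mean divided by standard deviation.
snr <- function(x) {
  # Return NA for non-numeric input, such as NULL from a missing column.
  if (!is.numeric(x)) {
    return(NA_real_)
  }
  mean(x, na.rm = TRUE) / sd(x, na.rm = TRUE)
}

x <- c(4.20, 4.98, 1.98)   # toy values, not the lab data
snr(x)
snr(NULL)                  # NA: there is nothing to compute
```

With the real data, `snr(savings)` would give the same value as `meanSaving/spreadSavings`, provided the column has no missing values.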
Of the Checking and Savings, which has a higher SNR? Why do you think that is?
Savings has the higher SNR, at about 1.04. The Checking SNR came out as NA because no Checking values could be extracted from the data, so the two cannot be compared numerically here. In general, when you consider the large group of people applying for a loan, their savings accounts would logically total a higher amount than their checking accounts would. This in turn could allow them to accumulate a higher SNR in the savings category than in the checking one.
After using Watson Analytics to find patterns in the data, save your work and upload a screenshot here. Refer to Task 1 on how to upload a photo.
Attached is a graph that answers the question “How are the values of Income and Expenses associated?” A dot (scatter) plot is used to show each of the 4,446 individuals whose information is recorded in the data set that I have used throughout this lab.
Above is the graph that shows the relationship between the debt each individual applicant has and their status, be it good or bad.
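Comparable charts can also be drawn directly in R. The sketch below is an analogue of the two graphs described above, not the Watson Analytics output; it uses a small toy data frame with made-up rows so that it runs on its own, and `toy` would be replaced by `mydata` for the real plots.

```r
# Toy rows for illustration only; use `mydata` for the real plots.
toy <- data.frame(
  Status   = factor(c("good", "bad", "good", "bad", "good")),
  Income   = c(120, 90, 200, 75, 150),
  Expenses = c(45, 60, 75, 35, 90),
  Debt     = c(0, 500, 0, 2500, 0)
)

pdf(NULL)  # discard the drawing; remove this line to see the plots in RStudio

# Scatter plot: one dot per applicant.
plot(toy$Income, toy$Expenses, xlab = "Income", ylab = "Expenses",
     main = "Expenses vs. Income")

# Box plot: distribution of Debt within each Status group.
boxplot(Debt ~ Status, data = toy, main = "Debt by Status")

invisible(dev.off())

# Numeric companions to the two charts.
income_expense_cor <- cor(toy$Income, toy$Expenses)
debt_by_status <- tapply(toy$Debt, toy$Status, mean)
income_expense_cor
debt_by_status
```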