Help


Brief

This dataset was taken from Kaggle and contains information about NBA players. It has 1340 observations and 21 features. The only categorical variable is the response variable, Target:

1: the player has a career of 5 years or more.

0: the player's career is shorter than 5 years.

Other features:

Name: Name of the player

GamesPlayed: Total number of games played

MinutesPlayed: Total minutes played

PointsPerGame: Points scored per game

FieldGoalsMade: Field goals made

FieldGoalsAttempt: Field goals attempted

FieldGoalPercent: Field goal percentage

3PointMade: Number of 3-point shots made

3PointAttempt: Number of 3-point shots attempted

3PointPercent: 3-point shooting percentage

FreeThrowMade

FreeThrowAttempt

FreeThrowPercent

OffensiveRebounds

DefensiveRebounds

Rebounds

Assists

Steals

Blocks

Turnovers

During data cleaning (get-and-clean-data.R), a Yeo-Johnson transformation was tried. Because the models fitted on the original dataset and on the transformed dataset performed about the same, I decided to keep the original dataset, which makes the model easier to interpret.
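As an illustration of what a Yeo-Johnson transformation does, here is a minimal base R sketch for a fixed lambda. It is not the code from get-and-clean-data.R: the function name `yeo_johnson`, the example values, and the choice of lambda are all assumptions made for this example, and the function assumes there are no missing values.

```r
# Yeo-Johnson power transformation for a fixed lambda (base R only).
# Hypothetical helper for illustration; assumes y has no NA values.
yeo_johnson <- function(y, lambda) {
  out <- numeric(length(y))
  pos <- y >= 0
  # Non-negative values: Box-Cox applied to y + 1
  if (abs(lambda) < 1e-8) {
    out[pos] <- log1p(y[pos])
  } else {
    out[pos] <- ((y[pos] + 1)^lambda - 1) / lambda
  }
  # Negative values: mirrored Box-Cox with power 2 - lambda
  if (abs(lambda - 2) < 1e-8) {
    out[!pos] <- -log1p(-y[!pos])
  } else {
    out[!pos] <- -((1 - y[!pos])^(2 - lambda) - 1) / (2 - lambda)
  }
  out
}

# Made-up right-skewed values standing in for a column such as 3PointMade
x <- c(0, 0.1, 0.3, 0.5, 1.2, 2.3)
x_log  <- yeo_johnson(x, lambda = 0)  # equals log(x + 1)
x_same <- yeo_johnson(x, lambda = 1)  # lambda = 1 leaves the data unchanged
```

In practice lambda is usually estimated from the data rather than fixed by hand. Unlike Box-Cox, the transformation is defined for zeros and negative values, which matters for columns with many zeros.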

When training the model, logistic regression, random forest, and decision tree models were compared. The ROC, accuracy, and sensitivity results show that logistic regression is the best model for this dataset. The final model is named nba_model.
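The sketch below shows how a logistic regression can be fitted and scored on accuracy and sensitivity in base R. It runs on simulated data, not on the Kaggle dataset, and it is not nba_model itself: the simulated relationship, the 75/25 split, the 0.5 cutoff, and the name `sketch_model` are assumptions made for this example.

```r
set.seed(42)
n <- 500

# Simulated stand-in data; NOT the Kaggle dataset
nba <- data.frame(
  GamesPlayed   = sample(15:82, n, replace = TRUE),
  PointsPerGame = round(runif(n, 1, 20), 1)
)
# Assumed relationship: more games and points raise the chance of a long career
p_true <- plogis(-4 + 0.06 * nba$GamesPlayed + 0.05 * nba$PointsPerGame)
nba$Target <- rbinom(n, 1, p_true)

# Hold out 25% of the rows for testing
test_idx <- sample(n, size = n / 4)
train <- nba[-test_idx, ]
test  <- nba[test_idx, ]

# Logistic regression: binomial family with the default logit link
sketch_model <- glm(Target ~ GamesPlayed + PointsPerGame,
                    data = train, family = binomial)

# Predicted probability of a career of 5 years or more
prob <- predict(sketch_model, newdata = test, type = "response")
pred <- as.integer(prob >= 0.5)

# Accuracy: share of correct predictions
accuracy <- mean(pred == test$Target)
# Sensitivity: share of true long careers that were predicted as long
sensitivity <- sum(pred == 1 & test$Target == 1) / sum(test$Target == 1)

print(c(accuracy = accuracy, sensitivity = sensitivity))
```

The same accuracy and sensitivity calculation can be applied to the predictions of a decision tree or a random forest, which keeps the comparison between models consistent.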

Table

Funner tables with gt and gtExtras packages

Table 3. Average results for NBA players, grouped by games played
GamesPlayed n Ave_3_PT Ave_Free_Throw Prob_career_over_5Yrs
15 1 0.30 2.00 0.27
18 1 0.00 0.20 0.20
19 2 0.30 0.70 0.20
20 1 0.00 2.70 0.42
21 1 0.00 0.60 0.23
22 2 0.05 0.40 0.16
23 3 0.20 0.97 0.27
24 3 0.43 1.23 0.22
25 2 0.20 0.75 0.30
26 3 0.00 0.97 0.29
27 2 0.40 1.30 0.23
31 3 0.30 1.00 0.25
32 3 0.40 1.10 0.41
33 3 0.07 0.77 0.32
34 6 0.37 0.87 0.30
35 8 0.10 0.91 0.31
36 7 0.17 0.87 0.27
37 7 0.20 1.14 0.35
38 8 0.29 0.99 0.37
39 5 0.08 0.94 0.42
40 5 0.24 0.86 0.37
41 5 0.14 0.82 0.32
42 5 0.48 1.22 0.37
43 5 0.02 1.38 0.43
44 3 0.07 0.63 0.29
45 4 0.00 0.90 0.46
46 5 0.10 0.72 0.46
47 7 0.16 1.16 0.45
48 6 0.33 2.05 0.58
49 5 0.10 1.10 0.43
50 6 0.30 1.53 0.47
51 7 0.27 1.51 0.54
52 7 0.04 1.36 0.58
53 9 0.17 1.06 0.53
54 4 0.17 1.55 0.54
55 6 0.13 1.43 0.56
56 5 0.32 1.66 0.58
57 6 0.15 0.92 0.51
58 4 0.10 1.60 0.68
59 8 0.44 1.44 0.57
60 2 0.05 1.45 0.58
61 8 0.40 1.65 0.62
62 8 0.35 1.78 0.66
63 7 0.16 1.86 0.76
64 8 0.41 1.89 0.62
65 6 0.15 1.73 0.71
66 8 0.46 1.55 0.64
67 4 0.12 1.80 0.78
68 9 0.14 1.48 0.67
69 2 0.20 1.55 0.67
70 8 0.40 2.56 0.78
71 7 0.01 2.47 0.74
72 7 0.30 3.19 0.83
73 7 0.51 1.91 0.77
74 4 0.08 3.17 0.80
75 4 0.35 1.95 0.78
76 11 0.47 2.70 0.81
77 14 0.35 1.61 0.77
78 14 0.26 2.94 0.81
79 12 0.32 2.59 0.80
80 22 0.35 3.33 0.87
81 21 0.23 3.36 0.86
82 21 0.30 2.64 0.85
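A table of this shape can be built by grouping players by GamesPlayed, counting them, and averaging the other columns. The base R sketch below uses a small made-up data frame. The column names `ThreePointMade`, `FreeThrowMade`, and `prob` (a predicted probability of a long career) are assumptions, as is the guess that the averages in Table 3 are plain means.

```r
# Made-up player rows for illustration; not taken from the dataset
players <- data.frame(
  GamesPlayed    = c(15, 19, 19, 23, 23, 23),
  ThreePointMade = c(0.3, 0.2, 0.4, 0.1, 0.2, 0.3),
  FreeThrowMade  = c(2.0, 0.6, 0.8, 1.0, 0.9, 1.1),
  prob           = c(0.30, 0.20, 0.24, 0.25, 0.27, 0.29)
)

# One summary row per distinct GamesPlayed value
groups <- split(players, players$GamesPlayed)
summary_tbl <- do.call(rbind, lapply(groups, function(d) {
  data.frame(
    GamesPlayed           = d$GamesPlayed[1],
    n                     = nrow(d),
    Ave_3_PT              = round(mean(d$ThreePointMade), 2),
    Ave_Free_Throw        = round(mean(d$FreeThrowMade), 2),
    Prob_career_over_5Yrs = round(mean(d$prob), 2)
  )
}))
rownames(summary_tbl) <- NULL

print(summary_tbl)
```

The resulting data frame could then be passed to `gt::gt()` for styling.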

Graph


Games Played vs Career Length

The number of games played is related to the length of a player's career. The result shows that a few players stayed in the NBA for at least 5 years even though their participation was very low. They may have had serious injuries or played only as backups.

3 PT vs Career Length

This graph suggests that the players in this dataset may not be good at 3-point shooting.

Graph


Assist vs Career Length

The graph shows that most players provide 0 to 3 assists per game.

Career length distribution

This result shows that more than 50% of the NBA players in the dataset have a career of 5 years or more.
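The share of each class can be checked directly from the Target column. The sketch below uses a short made-up vector in place of the real column, so the printed numbers are illustrative only.

```r
# Made-up Target values (1 = career of 5 years or more, 0 = shorter)
target <- c(1, 0, 1, 1, 0, 1, 1, 0, 1, 1)

counts <- table(target)        # number of players in each class
shares <- prop.table(counts)   # proportion of players in each class

print(round(shares, 2))
```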
