GitHub User and Details

The user is Brad Traversy, and the profile link is https://github.com/bradtraversy. The data is collected with the gh package, which calls the GitHub API. We use multiple libraries in this project for data manipulation. A token is used to authenticate the many requests made to the GitHub API. The user has 241 public repositories and about 50.9K followers.

Collecting the user’s login, name, public_repos, followers

We load all the required libraries at the start of the program to avoid confusion at later steps. Here we also set our token for API use. The gh package is used to pass the user’s name as a parameter and fetch his information. After the information has been fetched, we convert it into a data frame and select the required columns into another data frame. The kable function is used to view the output.
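One way to keep the token out of the script itself is to read it from an environment variable and fail early when it is missing. The sketch below is only an illustration under stated assumptions: the helper name `get_github_token` is hypothetical, and the demonstration uses a made-up variable name and a dummy value rather than a real token.

```r
# Hypothetical helper: read a token from an environment variable and
# stop with a clear message if the variable has not been set.
get_github_token <- function(var = "GITHUB_TOKEN") {
  token <- Sys.getenv(var, unset = "")
  if (!nzchar(token)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  token
}

# Demonstration with a dummy variable and value (not a real token)
Sys.setenv(DEMO_GITHUB_TOKEN = "dummy-token-for-illustration")
token <- get_github_token("DEMO_GITHUB_TOKEN")
nzchar(token)
```

With this pattern the real value can live in a file such as `.Renviron` that is not shared along with the report.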

library(gh)
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.6     v dplyr   1.0.8
## v tidyr   1.1.4     v stringr 1.4.0
## v readr   2.1.2     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(purrr)
library(ggplot2)
library(ggthemes)
library(jsonlite)
## 
## Attaching package: 'jsonlite'
## The following object is masked from 'package:purrr':
## 
##     flatten
library(kableExtra)
## 
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
## 
##     group_rows
library(httr)

# Personal access token (redacted; keep real tokens out of shared documents)
my_token = "<YOUR_GITHUB_TOKEN>"
Sys.setenv(GITHUB_TOKEN = my_token)
view_user = gh("/users/bradtraversy")


df_user = data.frame(do.call(cbind,view_user))


user_info_tb = df_user %>% select(login,name,public_repos,followers)
kable(user_info_tb)
login name public_repos followers
bradtraversy Brad Traversy 241 50950
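API records can contain missing (NULL) fields, which makes converting a response list to a data frame fragile. The sketch below shows one possible way to pick a fixed set of fields from a single record. The helper `pick_fields` is hypothetical, and `mock_user` is a hand-written stand-in for the list returned by the API, using the values from the table above plus an assumed empty `company` field.

```r
# Hypothetical helper: keep only the requested fields from one record,
# replacing missing (NULL) entries with NA so every field yields one value.
pick_fields <- function(record, fields) {
  values <- lapply(fields, function(f) {
    if (is.null(record[[f]])) NA else record[[f]]
  })
  names(values) <- fields
  as.data.frame(values, stringsAsFactors = FALSE)
}

# Mock record standing in for the API response (not fetched from GitHub)
mock_user <- list(
  login = "bradtraversy",
  name = "Brad Traversy",
  public_repos = 241L,
  followers = 50950L,
  company = NULL  # assumed missing field, for illustration only
)

user_row <- pick_fields(
  mock_user,
  c("login", "name", "public_repos", "followers", "company")
)
user_row
```

The result is a one-row data frame in which the missing field appears as NA instead of being dropped.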

Collecting the followers’ login, name, public_repos, followers

To fetch the followers’ information, we call the gh package with the followers endpoint (“/users/bradtraversy/followers”). Next, the map_chr function extracts each follower’s login, and the result is stored in the variable “login”. The lapply function is applied to “login” to fetch each follower’s profile one at a time; the results are stored in another variable, “ld”, with a missing name replaced by NA. The map_df function then builds the data frame with the required columns. A limit of 200 in the gh call keeps computation time down. To get all the data, “.limit = Inf” can be passed as the parameter instead.

view_followers = gh("/users/bradtraversy/followers", .limit = 200)
## i Running gh query
## i Running gh query, got 100 records of about 51000
login = map_chr(view_followers, function(x)
  {
  return(x$login)
})

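# Fetch each follower's full profile, one request per login
ld = lapply(login, function(i){
  u = gh(str_interp("/users/${i}"))
  # Some users have no display name; store a proper missing value
  if(is.null(u$name)) u$name = NA_character_
  return(u)
})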

follower_df = map_df(ld, magrittr::extract,c("login","name","public_repos","followers"))

kable(follower_df)
login name public_repos followers
lgs Luca G. Soave 955 137
jjoaquim José Carlos Joaquim 22 27
acangiano Antonio Cangiano 5 413
jameydavis Jamey Davis 8 25
tarasis Robert McGovern 25 26
jamur Rafael Jamur 26 7
dausech Douglas Ausech 55 7
souzaonofre Onofre Souza 118 75
fmo Mustafa Özyurt 4 12
b4youleap Bruce MacDonald 73 6
lyarinet Asif Agaria 30 9
renanviegas Renan Viegas 3 8
dharmang Dharmang 9 17
nippermh Mark Hallam 18 4
bsbodden Brian Sam-Bodden 72 64
rmax R Max Espinoza 81 529
benganl Lanton Hulisani Vhengani 4 7
Jorger Jorge Rubiano 157 160
jmac007 NA 5 5
kant Darío Hereñú 3965 145
vlada5 NA 213 10
tomshaw Tom Shaw 37 26
Wyvern {@MainActor (self: Self) in Ultra}(<U+F8FF>) 39 25
ubk77 Gianni Liburdi 3 9
elvan Elvan H 85 31
f1vlad f1vlad 36 17
alirezameskin Alireza Meskin 20 44
alinabeel Ali Nabeel Ahmed 55 24
knice Rob Knight 13 17
leonardorb Leo 33 78
giovanigenerali Giovani Generali 55 69
zendyani Belakhdar Abdeldjalil 51 44
rdricco renato ricco 26 21
AliRezaTaleghani AliReza Taleghani 29 7
allrude Ruud van Zuidam 12 33
harshals NA 18 3
davidsheardown Dav.id 51 4
baldrailers Julius Francisco 24 86
joshlong Josh Long 518 8039
jgwynn2901 John Gwynn 33 3
radub Radu Barbu 5 26
l3dlp L3DLP 16 109
janogarcia Jano Garcia 7 30
hrishikeshrt Hrishikesh Terdalkar 42 14
tinku99 Naveen Garg 15 36
neviim Jorge Silva 325 48
elusive John Gilliland 50 15
israelsantiago Israel Santiago 25 54
pankajspace Pankaj Wakchaure 31 5
esin Andrey Esin 33 6209
ik5 Ido Kanner 181 367
NickBlomberg Nick Blomberg 11 5
yxm0513 Simon Yang 46 11
farhany Farhan Yousaf 88 6
lkruscic Luka 13 4
coder-selvarajan Selvarajan Thangavel 25 13
dmrutgos David Rutgos 12 7
mamude Samir Mamude 7 23
chadrey Chad Reynoldson 9 6
nurikabe Evan Owens 67 65
psahni Prashant Sahni 104 21
odixon Omar 129 22
digamesystems John Price 8 5
Sicarius R.Sergiyko 6 4
anjanesh Anjanesh Lekshminarayanan 11 11
samvil Sammy Villarreal 3 4
gobetter Suthep Chumsri 20 5
jschuur Joost Schuur 14 15
miquelbrazil Miquel Brazil 10 29
ThomasGHenry Thomas G Henry 27 40
returnvalue NA 33 13
barbietunnie Babatunde Adeyemi 231 59
mafulafunk Martin Funk 22 20
jameserie James Erie 0 2
sunpech Steven S 49 16
pixelcool pixelcool 2 3
mahdimp Mahdi MP 47 73
avoitishin Andrey Voitishin 10 6
ismet Ismet Togay 2 8
AaronLaw Aaron Law 371 51
scottliu NA 78 5
johndel John Deliyiannis 26 71
mavisland Tanju Yildiz 37 80
rajakvk NA 57 17
harikt Hari K T 271 209
jamesxv7 Jaime Olmo 58 50
wayneburkett Wayne Burkett 11 15
mandava Bharat Mandava 7 86
indigo11 indigo 1 6
CodeNegar CodeNegar 17 6
casjay casjay 24 7
jessefrye Jesse 14 32
senseluo Luke Luo 16 24
AndiKod Andrei Curelaru 25 4
fly51fly 爱可可-爱生活 10 3176
jovenbarola Joven A. Barola 13 25
karthiks Karthik Sirasanagandla 95 36
gipcompany Atsushi Ishida 34 104
locohost Mark Deibert 10 3
Wemago André Philip 4 49
Mottie Rob Garrison 48 779
openstrikz NA 5 11
jfca Anthony Pantekoek 2 3
ryanjohnston Ryan Johnston 35 64
arturparkhisenko Artur Parkhisenko 92 163
9kopb 9kopb 341 67
diversen Dennis Iversen 100 43
etangreal Ernst Salzmann 29 74
pino1068 Giuseppe Di Pierri 17 6
drenovac martin drenovac 40 14
nekofar Milad Nekofar 53 246
kpessa Kurt 73 6
catull catull 33 33
bryanpsd Bryan P 14 3
kirils Kiril Stoilov 29 19
junmei Junmei Duan 35 4
harun NA 17 25
rahulh77 Rahul 32 5
pillar pillar 1 89
sachinsharma Sachin Sharma 26 4
CodyPChristian Cody P. Christian 7 29
elamurugan Ela 15 20
aleph2c Scott Volk 15 26
ulziibadrakh-p ULZIIBADRAKH Purevtogtokh 9 13
jsilva Joao Da Silva 7 27
anup4khandelwal Anup Khandelwal 16 5
aslamdoctor Aslam Doctor 43 37
aminnagpure Amin Nagpure 113 17
cenda Cenda Kovár 2 2
jvillama Josh 59 28
mir100 Vladimir Storch 0 7
chandanjog Chandan Jog 44 31
leopardxl Keith Emmanuel 41 2
rollemira Ira Mellor 13 6
kentjones Kent Jones 12 2
vishalkukreja Vishal Kukreja 22 5
cp0k Ilya Nabutovsky 18 7
jasondavis Jason Davis 633 83
akkiris Ali Kaan Kiris 0 15
manishnakar Manish Nakar 494 10
webgoon Webgoonie 29 4
mrlynn Michael Lynn 236 91
lanqy Lan Qingyong 98 81
tuamtium tuamtium 5 2
Ribeiro Geovanny Ribeiro 473 49
tutkun Samet Tutkun 61 8
marioluevanos Mario Luevanos 10 18
hengkiardo Hengki Sihombing 426 371
elisandroesp Elisandro Espindola 9 13
fajarsulaksono Fajar Sulaksono 43 7
sanjum308 Sanjay M 2 3
waizkode Michael N. Agbamuche 8 4
lmammino Luciano Mammino 250 916
Tmeister Enrique Chavez 99 113
psenger Philip A Senger 171 48
mikeputnam Mike Putnam 25 49
guptarajesh Rajesh Gupta 5 5
IamHavingFun CodingWarriorBen 214 6
eliasrodrigues Elias Rodrigues 25 42
JenWardell Jen Wardell 4 11
ccatto Catto 29 11
z3rocall Alessandro Gambin da Silva 8 18
ertugerata Ertugrul Erata 10 86
mjacobus Marcelo Jacobus 150 167
lcalla calla 14 3
pechef NA 0 11
maciejgunia Maciej Gunia 16 5
Furqan NA 13 7
dmcgill50 David McGill 37 7
chenyukang Yukang Chen 154 292
foonnnnn KennethPCF 230 31
andradeandrey Andrey Andrade 713 48
talbott Talbott Crowell 0 9
jp555soul Jason Paul 16 30
wymna wymna 12 5
lucciano Luciano Andrade 758 24
DomenicF Domenic Fiore 5 21
jermbo jermbo 45 53
tlandn Lan Tra 123 9
changsijay Jay Chang 2 19
xploregopi Gopi Ponnusamy 39 6
georgeyk George Kussumoto 42 88
mttfshr Matt Fisher 14 3
daveydee33 Dave Degeatano 36 10
jpiemeisl Jason Piemeisl 0 17
batermj Bater.Makhabel 2362 1311
borisbreuer Boris Breuer 12 4
fhenderson Francois Henderson 5 14
jalagamkalyan kalyan chakravarthy 0 3
fuinha Marcelo Anjos 1164 35
dirkplatts Dirk Platts 4 5
edgarronda Edgar Ronda 42 46
robmontesinos Roberto Montesinos 6 4
erainey Eric 11 10
infantiablue Truong Phan 19 15
antonymer Eric H 3 4
ertankayalar Ertan Kayalar 16 15
gnovaro Gustavo Novaro 93 60
dcarter0317 NA 19 4
drakiula Dragoș Haiduc 43 31
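To see why the limit matters, the number of API requests can be estimated. The sketch below is a rough calculation under assumptions: 100 records per page (suggested by the "got 100 records" line in the log), one extra request per follower for the profile lookup, and a rate limit of 5,000 requests per hour, which is the figure GitHub documents for authenticated REST API requests. The function name `estimate_requests` is hypothetical.

```r
# Hypothetical helper: estimate how many requests the followers step needs.
# per_page = 100 is an assumption based on the log output above.
estimate_requests <- function(n_followers, per_page = 100) {
  list_requests <- ceiling(n_followers / per_page)  # paging through the list
  profile_requests <- n_followers                   # one lookup per follower
  list_requests + profile_requests
}

requests_limited <- estimate_requests(200)    # 2 + 200 = 202
requests_all <- estimate_requests(50950)      # 510 + 50950 = 51460

# Assumed rate limit of 5000 requests per hour
hours_needed <- requests_all / 5000
round(hours_needed, 1)
```

Under these assumptions, fetching every follower would take more than ten hours of rate-limited requests, while the 200-follower sample needs only about two hundred requests.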

Collecting the repositories’ name, size, forks_count, open_issues_count, closed_issues_count

Using the gh package with the repositories endpoint, we fetch the repository data. An empty data frame, “repos_df”, is defined to store the results. Next, a for loop runs from 1 to the length of “repos” (the data fetched by the gh package), stores each repository’s fields in variables, and appends them to the data set. The data set is viewed with the kable function. A limit of 200 in the gh calls keeps computation time down; because the same limit applies to the closed-issues query, closed_issues_count is capped at 200. To get all the data, “.limit = Inf” can be passed as the parameter instead.
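The loop described above can also be written so that the data frame is not grown row by row. The following is a minimal sketch, not the code used in this report: `mock_repos` is a hand-written stand-in for the fetched list, using three rows from this section's table, and the `created_at` timestamps are placeholders in which only the year is taken from the table.

```r
# Mock input standing in for the fetched repository list (not real API output)
mock_repos <- list(
  list(name = "a4app", size = 14L, forks_count = 52L,
       created_at = "2017-01-01T00:00:00Z", open_issues_count = 2L),
  list(name = "acme_keystone", size = 469L, forks_count = 10L,
       created_at = "2017-01-01T00:00:00Z", open_issues_count = 0L),
  list(name = "aioweb", size = 952L, forks_count = 10L,
       created_at = "2014-01-01T00:00:00Z", open_issues_count = 2L)
)

# Build one single-row data frame per repository ...
rows <- lapply(mock_repos, function(repo) {
  data.frame(
    name = repo$name,
    size = repo$size,
    forks_count = repo$forks_count,
    created_year = as.integer(substr(repo$created_at, 1, 4)),
    open_issues_count = repo$open_issues_count,
    stringsAsFactors = FALSE
  )
})

# ... then bind them all at once
repos_sketch <- do.call(rbind, rows)
repos_sketch
```

Binding once at the end avoids copying the accumulated data frame on every iteration, which starts to matter as the number of repositories grows.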

repos = gh("GET /users/bradtraversy/repos", .limit = 200)
## i Running gh query
## i Running gh query, got 100 records of about 300
repos_data = lapply(repos,function(i)
  {
  if(is.null(i$size))
    i$size = 'unknown'
  return(i)
})


repos = gh("GET /users/bradtraversy/repos",username = "bradtraversy", .limit = 200)
## i Running gh query
## i Running gh query, got 100 records of about 300
# Empty data frame with the same columns that the loop below fills in
repos_df <- data.frame(name=character(),
                       size=integer(),
                       forks_count=integer(),
                       created_year=integer(),
                       open_issues_count=integer(),
                       closed_issues_count=integer())


for (i in seq_along(repos))
{
  # Pull the required fields for this repository
  name = repos[[i]]$name
  size = repos[[i]]$size
  created_year = as.integer(substring(repos[[i]]$created_at,1,4))
  forks = repos[[i]]$forks_count
  open_issues_count = repos[[i]]$open_issues_count

  # Query this repository's closed issues; the count is capped by .limit
  closed_issues_url <-
    paste0(repos[[i]]$url,"/issues?state=closed")

  closed_issues = gh(closed_issues_url,username = "bradtraversy",.limit = 200)
  closed_issues_count = length(closed_issues)

  # Append one row per repository
  repos_df <- rbind(repos_df, data.frame(name = name,
                                         size = size,
                                         forks_count = forks,
                                         created_year = created_year,
                                         open_issues_count = open_issues_count,
                                         closed_issues_count = closed_issues_count))
}
## i Running gh query
## i Running gh query, got 100 records of about 1200
## i Running gh query
## i Running gh query, got 100 records of about 300
## i Running gh query
## i Running gh query, got 100 records of about 200
kable(repos_df)
name size forks_count created_year open_issues_count closed_issues_count
50projects50days 457 3591 2020 39 58
A-to-Z-Resources-for-Students 2546 143 2018 2 0
a4app 14 52 2017 2 2
acme_keystone 469 10 2017 0 0
adonis40blog 135 18 2017 0 0
aioweb 952 10 2014 2 0
alexis_speech_assistant 21 175 2019 26 12
angular-crash-2021 328 473 2021 1 3
angular-crash-todolist 112 232 2019 2 5
angular60 17 101 2016 3 2
angularfs 16 42 2017 5 1
animiesample 1 15 2017 0 0
apex-legends-tracker 1547 18 2019 6 24
articlebase 131 21 2017 0 4
aurelia_customer_manager 666 5 2017 0 0
axios-crash 3 330 2019 0 0
barchart 1 10 2015 0 0
basicphonegap 1680 5 2015 0 0
bcc-home-page 250 0 2021 0 0
bitzprice 53 65 2018 0 1
bizlight_theme 1607 19 2016 1 1
bookmarker 144 116 2017 4 5
bookstore 2667 213 2015 6 3
bootsassy 3528 3 2014 1 0
bootstrap-bootcamp-website 31 249 2021 2 0
brainjs_examples 9 55 2018 1 2
breaking-bad-cast 931 144 2020 6 6
breakout_desktop_nw 175 26 2020 7 1
bs4-sass-starter 1326 12 2018 0 0
bs4starter 5 151 2017 5 2
bs4starter_alpha6 52 22 2017 2 0
bthosting_foundation 1040 24 2017 1 0
btre_project 2460 284 2018 13 25
casablanca-tutsplus 5704 3 2017 0 0
chatcord 69 945 2020 24 22
ciblog 559 164 2016 11 9
clientpanel_react 3625 22 2018 18 14
codegig 391 86 2018 9 10
coindex-cli 24 18 2020 5 3
contact-keeper 2859 200 2019 1 59
contact_keeper_api 471 48 2019 11 3
creative-agency-website 4057 108 2021 1 1
customer-cli 4 28 2017 4 2
customerbase 4 63 2017 3 2
deno-rest-api 7 71 2020 3 3
design-resources-for-developers 4728 8501 2020 1 200
devcamper-api 556 319 2019 28 33
devconnector_2.0 2131 1160 2019 5 200
devconnector_theme_sass 305 50 2019 2 0
devspace-blog 2231 39 2021 0 1
DevYouTubeList 383 33 2020 1 0
dj-events-backend 494 41 2021 1 0
dj-events-frontend 3267 128 2021 4 2
django-todolist 12 62 2017 2 5
djangocrashcourse 9 49 2017 2 0
docker-node-mongo 25 410 2018 8 7
ebookseller 126 55 2017 1 0
electron-course-files 17389 92 2020 78 58
electronshoppinglist 221 134 2017 6 8
embertasks 480 2 2014 0 0
emtasks 12 15 2015 1 0
escalate_theme 233 20 2016 0 0
expense-tracker-mern 6628 123 2020 31 10
expense-tracker-react 5029 259 2020 22 14
express_crash_course 86 199 2019 8 6
face_recognition_examples 5516 128 2019 0 0
fancy_form 105 27 2018 9 4
fastify-crash-course 42 24 2021 0 2
feedback-app 2155 188 2021 0 8
finddit 6762 31 2018 0 1
find_a_pet 21 24 2018 0 1
firebasecontact 3 128 2017 4 1
flask-wysiwyg 612 4 2017 0 0
flask_sqlalchemy_rest 4 103 2019 4 0
fluxboiler 70 30 2016 2 0
frontend-mentor-challenges 4543 0 2021 0 0
gatsby_crash_course 188 80 2018 0 0
github-finder 6715 289 2019 16 20
github-finder-app 3174 69 2021 2 10
githubsearch 27 21 2016 1 2
go_crash_course 6 104 2018 1 0
go_restapi 4 113 2018 3 2
grid-crash 12 32 2022 1 0
gulpexapp 9 40 2017 0 1
hapiapp 228 10 2017 6 2
house-marketplace 851 76 2021 2 3
html5audioplayer 90788 53 2014 4 0
huddle_styled_components 663 77 2021 0 0
hulu-webpage-clone 2394 118 2021 4 1
infinite_scroll_react_unsplash 307 25 2019 2 2
ionreddit 998 41 2016 5 0
it-logger 6304 84 2019 16 14
iweather 13702 56 2017 5 2
javascript_cardio 90 324 2018 12 48
jest_testing_basics 46 102 2018 0 1
jquery-githubfinder 30 48 2017 1 3
jquery.linkIt 132 3 2014 0 0
jquery.scrollSlide 508 7 2014 0 0
jquery_crash_course 6 131 2016 0 1
jsdoc-examples 385 36 2019 1 2
keystone-blog 310 12 2021 0 1
laravel-sanctum-api 73 99 2021 3 1
larticles_api 1268 134 2017 8 8
larticles_vue_app 679 101 2018 6 1
lead_manager_react_django 1212 300 2019 40 27
livestreamer_project_ideas 45 24 2019 3 1
loruki-website 253 475 2020 12 5
lsapp 2585 379 2017 12 15
lyricfinder 3348 95 2018 14 14
mailchimp_newsletter 27 35 2018 1 2
meanauthapp 2265 159 2017 33 22
meanblog 126 16 2017 0 0
mean_mytasklist 18395 151 2016 7 1
meetupz 371 44 2017 2 1
mern-tutorial 371 108 2022 2 2
mern_shopping_list 2691 422 2018 52 26
microposts_fullstack_vue 60 79 2018 12 8
mobilecontacts 1576 5 2015 0 0
modern_js_udemy_projects 44 500 2019 17 5
modern_portfolio 1026 486 2018 18 22
mongochat 3 145 2017 5 5
mongo_file_uploads 27 104 2018 7 4
movieinfo 1 86 2017 4 5
myflaskapp 11 259 2017 11 7
mylogin 71 21 2015 2 0
mysubscribers 1544 10 2015 0 0
mytodo_ci 504 24 2014 0 0
mytunes_landing 3480 79 2018 0 0
nestjs_rest_api 1027 71 2019 19 8
netlify_lambda 56 11 2018 0 1
next-crash-course 68 251 2021 3 2
next-markdown-blog 1088 130 2021 0 2
ng2-play 155 4 2016 0 0
ngauth0 16 19 2016 1 0
ngspotify 28 30 2016 2 1
node-api-proxy-server 42 65 2021 0 3
nodeauthapp 31 75 2017 7 9
nodecontactform 3 123 2017 3 2
nodekb 833 182 2017 14 7
nodetextapp 7 52 2017 1 3
nodeuploads 2 110 2017 1 1
node_babel_starter 33 2 2018 1 0
node_crash_course 64 238 2019 12 13
node_jwt_example 1 134 2017 1 2
node_passport_login 3728 1252 2016 50 59
node_paypal_sdk_sample 1 50 2017 0 0
node_push_notifications 9 107 2018 8 0
notion-video-schedule 136 14 2021 1 0
numberguesser 10 31 2017 0 1
nuxt_dadjokes 93 32 2019 0 0
part_manager 8 83 2019 1 3
passgen 5 40 2021 3 1
pdf_viewer 403 79 2019 4 0
php-crash 148 5 2022 0 0
phploginapp 9 43 2017 0 2
php_rest_myblog 13 380 2018 11 3
php_stripe_paypage 8 89 2018 5 0
pixabay_image_finder 201 25 2018 0 2
pollster_django_crash 12 110 2019 3 7
projectmanager 7 110 2016 1 2
proplistings 30 14 2017 2 2
proshop_mern 1245 948 2020 20 172
public-apis 2529 46 2019 0 0
pusherpoll 117 56 2018 3 6
python_bokeh_chart 7 63 2019 0 0
python_feedback_app 89 151 2019 4 0
python_folium_example 181 85 2018 1 1
python_sandbox 18 631 2018 7 10
railscms 164 7 2014 0 0
rblog 1168 9 2014 0 0
rcontacts 3 9 2016 0 0
react-admin-example 191 50 2020 1 2
react-crash-2021 625 1385 2021 17 12
react-tailwind-pixabay-gallery 5791 55 2020 12 7
reactcharts 13 69 2017 3 3
reactnativeapp 126 57 2017 0 0
react_crash_todo 787 331 2019 13 10
react_express_starter 54 464 2017 6 7
react_file_uploader 4373 152 2019 19 5
react_native_shopping_list 20006 110 2020 12 8
react_otka_auth 143 69 2018 1 1
react_redux_express_starter 2611 157 2018 25 14
react_step_form 151 209 2018 5 6
react_webpack_starter 31 178 2018 8 4
real-time-tweet-stream 22 34 2020 1 1
recaptchav2_node 15 29 2017 3 2
redusers 3 60 2017 0 1
redux_crash_course 160 335 2018 7 3
remix-blog 99 37 2021 2 0
repofinder 1524 6 2015 0 0
restify_customer_api 33 36 2018 2 1
rust_sandbox 8 115 2019 0 2
rxjs_boiler 2 93 2016 3 1
sampletextgen 6 12 2017 0 0
sass_starter_pack 3 67 2017 0 2
simple-electron-react 5158 65 2020 18 15
simple-rails-rest 18 33 2017 0 1
simple_react_pagination 4516 180 2019 19 8
skyapp_bootstrap 1988 58 2014 0 0
slack_jokebot 5 47 2018 0 0
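The closed_issues_count column counts every item returned by the issues endpoint. GitHub's REST API documentation notes that this endpoint also returns pull requests, which carry a `pull_request` field, so a stricter count would skip them. The sketch below illustrates the idea on hand-written mock items; the helper `count_closed_issues` and the example URL are hypothetical.

```r
# Hypothetical helper: count items that are issues, not pull requests.
# An item is treated as a pull request when it has a pull_request field.
count_closed_issues <- function(items) {
  is_pr <- vapply(items, function(item) !is.null(item$pull_request), logical(1))
  sum(!is_pr)
}

# Mock items standing in for one repository's closed-issues response
mock_items <- list(
  list(number = 1L, state = "closed"),
  list(number = 2L, state = "closed",
       pull_request = list(url = "https://example.invalid/pull/2")),
  list(number = 3L, state = "closed")
)

raw_count <- length(mock_items)                   # 3, counting every item
issue_count <- count_closed_issues(mock_items)    # 2, pull requests excluded
```

Whether pull requests should be included depends on the question being asked; the counts in the table above include whatever the endpoint returned, up to the limit of 200.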

Plot 1

This graph compares the sizes of the largest repositories, so repos_df is first filtered to rows with size > 5000 (the GitHub API reports repository size in kilobytes).

filter_df <- filter(repos_df, size > 5000)
ggplot(aes(x = name, y = size),
             data = filter_df) +
  geom_bar(stat = "identity") + 
  labs(title="Repositories", x="name", y="Size of Repository")+
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

From the graph above, the largest repository is html5audioplayer, with a size of more than 75000 (90788 in the table above), while the other repositories shown range from about 5000 to about 20000.
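The same comparison can be checked numerically. The sketch below uses a handful of rows copied from the repositories table (not the full data set), applies the same size > 5000 filter in base R, and sorts the result so the largest repository comes first.

```r
# A few rows copied from the repositories table above (subset, for illustration)
sizes <- data.frame(
  name = c("a4app", "iweather", "html5audioplayer", "electron-course-files",
           "mean_mytasklist", "react_native_shopping_list"),
  size = c(14, 13702, 90788, 17389, 18395, 20006),
  stringsAsFactors = FALSE
)

# Same filter as the plot, then sort from largest to smallest
large <- sizes[sizes$size > 5000, ]
large <- large[order(large$size, decreasing = TRUE), ]

# How many times larger the biggest repository is than the runner-up
ratio <- large$size[1] / large$size[2]
round(ratio, 1)  # about 4.5
```

In the plot itself, mapping `x = reorder(name, -size)` is one way to draw the bars in this order.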

Plot 2

We plot a graph from the followers data frame created earlier to learn more about the followers. The motive is to judge the possibility of fake followers based on their number of public repositories.

Follower_chck = filter(follower_df, public_repos < 5)

ggplot(aes(x = name, y = public_repos),data = Follower_chck) +
  geom_bar(stat = "identity", position = "dodge") + 
  theme_economist() +
  scale_color_gdocs() +
  theme(axis.text.x=element_text(angle = 30, vjust = 0.5)) +
  theme(plot.title = element_text(hjust = 0.5), legend.position = "bottom") +
  ggtitle("Followers Repositories") +
  xlab("Name of followers") +
  ylab ("No of repositories") 

From the graph above, multiple followers have no more than 4 public repositories, and several have 0. If a large share of followers had 0 repositories, that could cast doubt on the genuineness of the followers, and the repository count could serve as one feature for classifying followers as real or fake.
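The idea of judging followers by their repository counts can be made concrete as a simple share. The sketch below uses eight hand-picked rows from the followers table, so the resulting proportions only illustrate the calculation and say nothing about the full follower base.

```r
# Eight rows copied from the followers table above (hand-picked, not representative)
followers_sample <- data.frame(
  login = c("jameserie", "mir100", "akkiris", "pixelcool",
            "acangiano", "lgs", "joshlong", "jmac007"),
  public_repos = c(0, 0, 0, 2, 5, 955, 518, 5),
  stringsAsFactors = FALSE
)

# Share of followers with no public repositories
share_zero <- mean(followers_sample$public_repos == 0)     # 3 of 8 = 0.375

# Share of followers below the threshold used in the plot
share_under_5 <- mean(followers_sample$public_repos < 5)   # 4 of 8 = 0.5
```

Applied to the full follower_df, the same two lines would give the proportions needed to decide whether the number of empty accounts is unusually high.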