Do we have meaningful clusters of Mechanical Turk workers that can be grouped by age and years of computer programming experience?
First, we remove NAs, invalid ages and years of experience, and entries for which the difference between the worker's age and years of experience is smaller than 5 (assuming that the minimum age for a person to start programming is 5 years old).
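A minimal sketch of this cleaning step in base R. The column names come from the summary output; the toy data frame, the `clean_workers` helper, and the `max_age = 100` cutoff for an "invalid" age are assumptions made for illustration, since the original filtering code is not shown here.

```r
# Hypothetical toy data; the real survey data is not reproduced here.
workers <- data.frame(
  age = c(25, NA, 30, 19, 150, 40, 22),
  years_of_programming_experience = c(3, 2, 28, 1, 10, NA, -1)
)

# Keep rows with valid values where age minus experience is at least
# min_start_age. The max_age threshold is an assumed definition of "invalid".
clean_workers <- function(df, min_start_age = 5, max_age = 100) {
  exp_years <- df$years_of_programming_experience
  keep <- !is.na(df$age) & !is.na(exp_years) &
    df$age > 0 & df$age <= max_age &
    exp_years >= 0 &
    (df$age - exp_years) >= min_start_age
  df[keep, , drop = FALSE]
}

cleaned <- clean_workers(workers)
summary(cleaned)
cat("number of workers:", nrow(cleaned), "\n")
```

On the toy data only the rows with ages 25 and 19 survive: the others have a missing value, an implausible age, negative experience, or would imply starting to program before age 5.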
## years_of_programming_experience age
## Min. : 0.200 Min. :16
## 1st Qu.: 2.000 1st Qu.:24
## Median : 3.000 Median :28
## Mean : 5.239 Mean :30
## 3rd Qu.: 6.000 3rd Qu.:33
## Max. :35.000 Max. :71
## number of workers: 2000
## `geom_smooth()` using method = 'gam'
Many workers seem to have reported their experience in multiples of 5 years.
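One way to check this impression numerically is to compute the share of answers that fall exactly on a multiple of 5. The sketch below uses a hypothetical vector of answers, not the survey data; if answers were spread evenly over whole years, we would expect roughly one in five to be a multiple of 5.

```r
# Hypothetical reported years of experience (illustration only).
exp_years <- c(1, 2, 5, 5, 10, 3, 10, 15, 7, 20, 0.5, 5)

# Flag answers that are exact, positive multiples of 5.
is_multiple_of_5 <- exp_years > 0 & exp_years %% 5 == 0

# Share of rounded answers, and how often each rounded value occurs.
share <- mean(is_multiple_of_5)
counts <- table(exp_years[is_multiple_of_5])

cat("share of answers that are multiples of 5:", round(share, 2), "\n")
print(counts)
```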
What happens if we divide the workers into 5 clusters, which is the number of professions?
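A sketch of what a 5-cluster split could look like, assuming k-means on the two standardized variables. The clustering algorithm behind the plots is not stated here, so `kmeans()` is a stand-in, and the data below are synthetic.

```r
set.seed(42)

# Synthetic stand-in for the cleaned survey data (assumption, not real data).
n <- 200
workers <- data.frame(age = round(runif(n, 16, 71)))
workers$years_of_programming_experience <-
  pmin(round(rexp(n, rate = 1 / 5), 1), workers$age - 5)

# Standardize so that age and experience contribute on the same scale.
scaled <- scale(workers[, c("age", "years_of_programming_experience")])

# k-means with 5 centers; several random starts reduce the risk of a poor optimum.
fit <- kmeans(scaled, centers = 5, nstart = 25, iter.max = 50)
workers$cluster <- factor(fit$cluster)

# Cluster sizes and per-cluster means in the original units.
print(table(workers$cluster))
print(aggregate(cbind(age, years_of_programming_experience) ~ cluster,
                data = workers, FUN = mean))
```

Scaling is a design choice: without it, age (which spans a wider range) would dominate the Euclidean distances that k-means relies on.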
## `geom_smooth()` using method = 'loess'
We can see a lot of overlap between the clusters, so let’s try fewer clusters.
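Besides inspecting the plots, the candidate cluster counts can be compared numerically. The sketch below again assumes k-means and synthetic data, and reports the total within-cluster sum of squares and the share of variance explained by the split for each k. It complements the visual check rather than reproducing it.

```r
set.seed(1)

# Synthetic stand-in for the cleaned survey data (assumption, not real data).
n <- 200
workers <- data.frame(age = round(runif(n, 16, 71)))
workers$years_of_programming_experience <-
  pmin(round(rexp(n, rate = 1 / 5), 1), workers$age - 5)
scaled <- scale(workers)

# Fit k-means for each candidate number of clusters.
ks <- 2:5
fits <- lapply(ks, function(k) {
  kmeans(scaled, centers = k, nstart = 25, iter.max = 50)
})

# Collect simple fit statistics for each k.
comparison <- data.frame(
  k = ks,
  tot_withinss = sapply(fits, function(f) f$tot.withinss),
  between_over_total = sapply(fits, function(f) f$betweenss / f$totss)
)
print(comparison)
```

More clusters will generally lower the within-cluster sum of squares, so these numbers alone do not say which k is meaningful; they only show how much each extra cluster buys.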
## `geom_smooth()` using method = 'loess'
## `geom_smooth()` using method = 'gam'
## `geom_smooth()` using method = 'gam'
Only with two clusters do the circles not overlap. Interpreting that, we would have two large groups of workers: people above 35 years old, with a wide spread of programming experience, and people below 35, concentrated between 1 and 15 years of programming experience.
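A quick way to sanity-check this reading is to split the workers at age 35 and compare how spread out the experience is in each group. This is a sketch on synthetic data; the `age_group` column and the choice to put workers aged exactly 35 in the lower group are assumptions.

```r
set.seed(7)

# Synthetic stand-in for the cleaned survey data (assumption, not real data).
n <- 200
workers <- data.frame(age = round(runif(n, 16, 71)))
workers$years_of_programming_experience <-
  pmin(round(rexp(n, rate = 1 / 5), 1), workers$age - 5)

# Split at the age suggested by the two-cluster solution.
workers$age_group <- ifelse(workers$age > 35, "above 35", "35 or below")

# Range, median and standard deviation of experience within each group.
spread <- aggregate(
  years_of_programming_experience ~ age_group,
  data = workers,
  FUN = function(x) c(min = min(x), median = median(x), max = max(x), sd = sd(x))
)
print(spread)
```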
Thanks for reading! Is there anything I could improve? Please leave a comment - Christian.
“We are trying to prove ourselves wrong as quickly as possible, because only in that way we can find progress.” Richard P. Feynman