Welcome to the world of data science
Data science offers many languages and tools for any given task. Although you can often use whichever tool you prefer, analysts usually need to work on similar platforms so that they can share their code with one another. Learning what professionals in the data science industry use at work can give you a better idea of what you may be asked to do in the future.
In this project, we are going to find out what tools and languages professionals use in their day-to-day work. Our data comes from the Kaggle Data Science Survey, which includes responses from over 10,000 people who write code to analyze data in their daily work.
# Load necessary packages
library(tidyverse)
# Load the data
responses <- read_csv("kagglesurvey.csv")
Parsed with column specification:
cols(
  Respondent = col_double(),
  WorkToolsSelect = col_character(),
  LanguageRecommendationSelect = col_character(),
  EmployerIndustry = col_character(),
  WorkAlgorithmsSelect = col_character()
)
# Print the first 10 rows
head(responses, 10)
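The column specification message printed above can be avoided by declaring the column types up front. The following is a minimal sketch rather than the notebook's own code: it writes a small hypothetical stand-in for kagglesurvey.csv to a temporary file (the two data rows are invented for illustration) and reads it back with an explicit `col_types` that mirrors the parsed specification.

```r
library(readr)

# Hypothetical two-row stand-in for kagglesurvey.csv (invented values)
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Respondent,WorkToolsSelect,LanguageRecommendationSelect,EmployerIndustry,WorkAlgorithmsSelect",
  "1,\"Python,R\",Python,Technology,Regression",
  "2,SQL,R,Academic,NA"
), tmp)

# Declaring col_types makes the parsing explicit and suppresses the message
responses_toy <- read_csv(
  tmp,
  col_types = cols(
    Respondent = col_double(),
    WorkToolsSelect = col_character(),
    LanguageRecommendationSelect = col_character(),
    EmployerIndustry = col_character(),
    WorkAlgorithmsSelect = col_character()
  )
)
```

With the real file, the same `cols()` call could be passed to `read_csv("kagglesurvey.csv", ...)`, which also makes a changed or mistyped column easier to notice.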
The R vs Python debate
Within the field of data science, there is a lot of debate among professionals about whether R or Python should reign supreme. You can see from our last figure that R and Python are the two most commonly used languages, but it’s possible that many respondents use both R and Python. Let’s take a look at how many people use R, Python, and both tools.
# Create a new column called language preference
debate_tools <- responses %>%
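  mutate(language_preference = case_when(
    # Note: the pattern "R" matches a capital R anywhere in the string,
    # not only the tool named R
    str_detect(WorkToolsSelect, "R") & !str_detect(WorkToolsSelect, "Python") ~ "R",
    str_detect(WorkToolsSelect, "Python") & !str_detect(WorkToolsSelect, "R") ~ "Python",
    str_detect(WorkToolsSelect, "R") & str_detect(WorkToolsSelect, "Python") ~ "both",
    # Everything else, including missing answers, falls through to "neither"
    TRUE ~ "neither"
  ))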
# Print the first 6 rows
head(debate_tools)
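One caveat with `str_detect(WorkToolsSelect, "R")` is that it matches any capital R in the string, so a tool name that merely contains an R would count as using R. A stricter variant can be sketched as follows, assuming the tools in `WorkToolsSelect` are separated by commas with no surrounding spaces. The toy data and the `exact_r` and `exact_py` patterns are hypothetical and not part of the original analysis.

```r
library(dplyr)
library(stringr)

# Hypothetical answers; "RapidMiner" stands in for any tool name containing an R
toy <- tibble(
  WorkToolsSelect = c("R,SQL", "Python,RapidMiner", "Python,R", "SQL", NA)
)

# Match the tool only when it is a whole comma-separated entry
exact_r  <- "(^|,)R(,|$)"
exact_py <- "(^|,)Python(,|$)"

toy_pref <- toy %>%
  mutate(
    uses_r = str_detect(WorkToolsSelect, exact_r),
    uses_python = str_detect(WorkToolsSelect, exact_py),
    language_preference = case_when(
      uses_r & !uses_python ~ "R",
      uses_python & !uses_r ~ "Python",
      uses_r & uses_python ~ "both",
      TRUE ~ "neither"  # also catches missing answers
    )
  )
```

With the plain pattern "R", the second toy row would be labelled "both"; with the anchored pattern it is labelled "Python". Whether this changes the real counts depends on which tool names actually occur in the survey.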
Plotting R vs Python users
Now we just need to take a closer look at how many respondents use R, Python, and both!
# Group by language preference, calculate number of responses, and remove "neither"
debate_plot <- debate_tools %>%
  group_by(language_preference) %>%
  summarise(count = n()) %>%
  filter(language_preference != "neither")
`summarise()` ungrouping output (override with `.groups` argument)
# Create a bar chart
ggplot(debate_plot, aes(x = language_preference, y = count)) +
  geom_col()
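The `group_by()` plus `summarise(count = n())` pattern used for `debate_plot` can also be written with dplyr's `count()`, which returns an ungrouped result and so does not print the `.groups` message. This is a small sketch on invented toy data, not a replacement the original notebook uses.

```r
library(dplyr)

# Hypothetical preferences, for illustration only
toy <- tibble(
  language_preference = c("R", "Python", "both", "both", "neither")
)

# The pattern from the analysis, with .groups set to silence the message
by_summarise <- toy %>%
  group_by(language_preference) %>%
  summarise(count = n(), .groups = "drop") %>%
  filter(language_preference != "neither")

# Shorthand: count() groups, tallies, and ungroups in one step
by_count <- toy %>%
  count(language_preference, name = "count") %>%
  filter(language_preference != "neither")
```

Both objects hold the same rows and counts; the choice between them is mostly a matter of taste.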

Language recommendations
It looks like the largest group of professionals program in both Python and R. But what happens when they are asked which language they recommend to new learners? Do R lovers always recommend R?
# Group by, summarise, arrange, mutate, and filter
recommendations <- debate_tools %>%
  group_by(language_preference, LanguageRecommendationSelect) %>%
  summarise(count = n()) %>%
  arrange(language_preference, desc(count)) %>%
  mutate(row = row_number()) %>%
  filter(row <= 4)
`summarise()` regrouping output by 'language_preference' (override with `.groups` argument)
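Keeping the top rows per group with `arrange()`, `row_number()`, and `filter()` works because the summarised data is still grouped by `language_preference`. An alternative, sketched here on invented counts, is `slice_max()`, which picks the largest values within each group directly. The toy table and the choice of two rows per group are illustrative assumptions.

```r
library(dplyr)

# Hypothetical recommendation counts, for illustration only
toy <- tibble(
  language_preference = c("R", "R", "R", "Python", "Python", "Python"),
  LanguageRecommendationSelect = c("R", "Python", "SQL", "Python", "R", "SQL"),
  count = c(30L, 10L, 5L, 40L, 8L, 2L)
)

# Keep the two most recommended languages within each preference group
top2 <- toy %>%
  group_by(language_preference) %>%
  slice_max(count, n = 2, with_ties = FALSE) %>%
  ungroup()
```

Setting `with_ties = FALSE` guarantees at most `n` rows per group, which matches the behaviour of the `row <= 4` filter in the analysis.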
The most recommended language by the language used
Just one thing left. Let’s graphically determine which languages are most recommended based on the language that a person uses.
# Create a faceted bar plot
ggplot(recommendations, aes(x = LanguageRecommendationSelect, y = count)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)) +
  facet_wrap(~language_preference)
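If the groups differ a lot in size, a shared count axis can flatten the bars in the smaller panels. One option, sketched below on invented numbers, is to let each facet have its own y axis with `scales = "free_y"`. This is a presentation choice and an assumption about what the reader wants to compare, not something the original notebook does.

```r
library(ggplot2)

# Hypothetical counts, for illustration only
toy <- data.frame(
  language_preference = c("R", "R", "Python", "Python", "both", "both"),
  LanguageRecommendationSelect = c("R", "Python", "Python", "R", "Python", "R"),
  count = c(30, 5, 40, 4, 300, 200)
)

# Same faceted bar plot, but each panel gets its own count axis
p <- ggplot(toy, aes(x = LanguageRecommendationSelect, y = count)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)) +
  facet_wrap(~language_preference, scales = "free_y")
```

Free scales make the ranking within each panel easier to read, at the cost of making bar heights no longer comparable across panels.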

The moral of the story
So we’ve made it to the end. We’ve found that Python is the most popular language used among Kaggle data scientists, but R users aren’t far behind. And while Python users may highly recommend that new learners learn Python