Indrani, Indranil, Sharanya, Michael
Data Science with R
Evaluating Programming Language Popularity based on various measurements
Reddit and Stack Overflow as two platforms with different user bases
Using text mining techniques to better understand the data
Can we decide which topic/language a certain post is about?
How do the number of comments and the number of posts correlate with popularity?
How does the popularity of programming languages change over time?
Can we predict the popularity of programming languages in the future?
How do the two platforms compare based on programming languages?
Stack Overflow blog post on the growth of the R programming language
Stack Overflow blog post on the growth of the Python programming language
The PYPL and TIOBE indices of programming language popularity
Topic Modeling
Here is an example of the Reddit data set that we used:
| subreddit | title | author | selftext | id | domain | url | created_utc | score | num_comments | text_new | language | year | text_length | month | day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| programming | Watch ""Compiling and a Running a java program - Part 1"" on YouTube | repterx | | 3z2kmn | youtu.be | https://youtu.be/oF4X6jq69PI | 2016-01-01 | 1 | 0 | watch ""compiling and a running a java program - part 1"" on youtube | Java | 2016 | 69 | 1 | 1 |
| Python | MIT is offering one of the best, if not the best, ""Introduction to Computer Science"" courses for free and it uses Python 2.7! Course starts January 13th! | Longhorns2102 | | 3z2n8y | edx.org | https://www.edx.org/course/introduction-computer-science-mitx-6-00-1x-6 | 2016-01-01 | 153 | 50 | mit is offering one of the best, if not the best, ""introduction to computer science"" courses for free and it uses python 2.7! course starts january 13th! | Python | 2016 | 156 | 1 | 1 |
| javascript | What would be a nice short intro tutorial for beginners to JavaScript? | jamesfinn180 | Soon I’ll be running a 15 minute introductory class to JavaScript where I’ll be taking a group of adult students with little to no programming experience and will spend the time building a little application with them. If I do this well, they won’t get too confused with anything and by the end of the 15 minutes they will have something to be proud of. The idea is to entice them into learning more programming. What I was considering was to build a simple Rock, Paper, Scissors game. With this they’ll be introduced to variables, functions, function parameters and if else statements however it seems a bit convoluted with all the nested conditional statements and I don’t want to scare them. Also I might be a bit over zealous with the amount I can showcase in 15 minutes so perhaps less is more. So does anyone have any suggestions for a short and sweet JavaScript application that could be a good learning experience for newcomers? | 3z2xrh | self.javascript | https://www.reddit.com/r/javascript/comments/3z2xrh/what_would_be_a_nice_short_intro_tutorial_for/ | 2016-01-02 | 1 | 4 | soon i’ll be running a 15 minute introductory class to javascript where i’ll be taking a group of adult students with little to no programming experience and will spend the time building a little application with them. if i do this well, they won’t get too confused with anything and by the end of the 15 minutes they will have something to be proud of. the idea is to entice them into learning more programming. what i was considering was to build a simple rock, paper, scissors game. with this they’ll be introduced to variables, functions, function parameters and if else statements however it seems a bit convoluted with all the nested conditional statements and i don’t want to scare them. also i might be a bit over zealous with the amount i can showcase in 15 minutes so perhaps less is more. so does anyone have any suggestions for a short and sweet javascript application that could be a good learning experience for newcomers? what would be a nice short intro tutorial for beginners to javascript? | JavaScript | 2016 | 1009 | 1 | 2 |
Here is an example of the Stack Overflow data set that we used:
| Id | PostTypeId | AcceptedAnswerId | ParentId | CreationDate | DeletionDate | Score | ViewCount | Body | OwnerUserId | OwnerDisplayName | LastEditorUserId | LastEditorDisplayName | LastEditDate | LastActivityDate | Title | Tags | AnswerCount | FavoriteCount | ClosedDate | CommunityOwnedDate | ContentLicense | language | text_new | num_comments | id | text_length | year | month |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 57300494 | 1 | NA | NA | 2019-08-01 00:43:04 | NA | 0 | 34 | <p>When saving an animated GIF from a Numpy array of shape <code>(20, 64, 64, 3)</code> and loading it again, the shape is suddenly <code>(20, 64, 64)</code>. I think the array may contain indices into a color palette but I’m not sure how to access that. How can I restore the original data from the saved GIF?</p> <pre><code>import imageio import numpy as np imageio.mimsave(‘animation.gif’, np.zeros((20, 64, 64, 3))) np.array(imageio.mimread(‘animation.gif’)).shape # (20, 64, 64) </code></pre> | 1079110 | NA | 2019-08-01 00:43:04 | Why is the color dimension gone after saving and loading an animated GIF with imageio? | <python><animated-gif><python-imageio> | 0 | 1 | CC BY-SA 4.0 | Python | <p>when saving an animated gif from a numpy array of shape <code>(20, 64, 64, 3)</code> and loading it again, the shape is suddenly <code>(20, 64, 64)</code>. i think the array may contain indices into a color palette but i’m not sure how to access that. how can i restore the original data from the saved gif?</p> <pre><code>import imageio import numpy as np imageio.mimsave(‘animation.gif’, np.zeros((20, 64, 64, 3))) np.array(imageio.mimread(‘animation.gif’)).shape # (20, 64, 64) </code></pre> why is the color dimension gone after saving and loading an animated gif with imageio? | 2 | 57300494 | 586 | 2019 | 8 | |||||
| 57300495 | 1 | NA | NA | 2019-08-01 00:43:27 | NA | 32 | 12140 | <p><a href=“http://clang.llvm.org/docs/Modules.html” rel=“noreferrer”>Clang</a> and <a href=“http://blogs.msdn.com/b/vcblog/archive/2015/12/03/c-modules-in-vs-2015-update-1.aspx” rel=“noreferrer”>MSVC</a> already supports <a href=“https://github.com/cplusplus/modules-ts” rel=“noreferrer”>Modules TS</a> from unfinished C++20 standard. Can I build my modules based project with CMake or other build system and how?</p> <p>I tried <a href=“https://build2.org/” rel=“noreferrer”>build2</a>, it supports modules and it works very well, but i have a <a href=“https://stackoverflow.com/questions/57296089/build2-analog-of-cmakes-find-package”>question</a> about it’s dependency management (UPD: question is closed).</p> | 5468048 | 3204551 | 2019-10-05 20:37:33 | 2021-05-05 20:45:43 | How to use c++20 modules with CMake? | <c++><cmake><c++20><c++-modules> | 5 | 3 | CC BY-SA 4.0 | C++ | <p><a href=“http://clang.llvm.org/docs/modules.html” rel=“noreferrer”>clang</a> and <a href=“http://blogs.msdn.com/b/vcblog/archive/2015/12/03/c-modules-in-vs-2015-update-1.aspx” rel=“noreferrer”>msvc</a> already supports <a href=“https://github.com/cplusplus/modules-ts” rel=“noreferrer”>modules ts</a> from unfinished c++20 standard. can i build my modules based project with cmake or other build system and how?</p> <p>i tried <a href=“https://build2.org/” rel=“noreferrer”>build2</a>, it supports modules and it works very well, but i have a <a href=“https://stackoverflow.com/questions/57296089/build2-analog-of-cmakes-find-package”>question</a> about it’s dependency management (upd: question is closed).</p> how to use c++20 modules with cmake? | 3 | 57300495 | 752 | 2019 | 8 | ||||
| 57300497 | 1 | 57300621 | NA | 2019-08-01 00:44:06 | NA | 1 | 33 | <p>So basically I have an array of gallery items which contains class “GalleryItem” objects (contains gallery name and list of images and their names).</p> <p>I also have a component called “Gallery” which takes GalleryItem class object as a prop and renders it.</p> <p>What I want to do is possibility to navigate with …/galleries/:galleryName to rendering the specific gallery inside single page.</p> <p>Galleries render fine, but I need this to work together with nested routes!</p> <pre><code><Switch> <Route path=“/galleries/:name” render={(props) => <Gallery {…props} galleryItem={this.state.galleryItems[:name]} />} /> <Switch> </code></pre> <p>Obviously this doesn’t work so I’m asking how it’s done and what to know if I’m doing that absolutely wrong.</p> | 9854267 | 8330162 | 2019-09-16 15:42:44 | 2019-09-16 15:42:44 | Is there a way to use the nested route ID as a prop argument in React Router | <javascript><reactjs><routes><nested><react-router> | 2 | NA | CC BY-SA 4.0 | JavaScript | <p>so basically i have an array of gallery items which contains class “galleryitem” objects (contains gallery name and list of images and their names).</p> <p>i also have a component called “gallery” which takes galleryitem class object as a prop and renders it.</p> <p>what i want to do is possibility to navigate with …/galleries/:galleryname to rendering the specific gallery inside single page.</p> <p>galleries render fine, but i need this to work together with nested routes!</p> <pre><code><switch> <route path=“/galleries/:name” render={(props) => <gallery {…props} galleryitem={this.state.galleryitems[:name]} />} /> <switch> </code></pre> <p>obviously this doesn’t work so i’m asking how it’s done and what to know if i’m doing that absolutely wrong.</p> is there a way to use the nested route id as a prop argument in react router | 1 | 57300497 | 873 | 2019 | 8 |
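library(dplyr)    # group_by(), summarise(), n(), %>%
library(ggplot2)  # plotting

# Grouped bar chart: total number of posts per language, split by platform.
# Expects the columns `language` and `Platform`.
plot_language_counts_bar_comb <- function(data) {
  data %>%
    group_by(language, Platform) %>%
    summarise(N = n(), .groups = "drop") %>%
    ggplot(aes(x = reorder(language, -N), y = N, fill = Platform)) +
    geom_bar(stat = "identity", position = position_dodge()) +
    labs(title = "Total number of posts per language",
         x = "Language",
         y = "Number of posts") +
    # Colors are assigned in the order of the Platform levels
    scale_fill_manual(values = c('#FF5700', '#BCBBBB')) +
    theme_bw()
}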
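# Grouped bar chart: average number of comments per post, by language and platform.
# `color` is not used inside the function; it is kept so existing calls still work.
plot_language_comments_bar_combined <- function(data, color = '#FF5700') {
  data %>%
    group_by(language, Platform) %>%
    summarise(comments = sum(num_comments) / n(), .groups = "drop") %>%
    ggplot(aes(x = reorder(language, -comments), y = comments, fill = Platform)) +
    geom_bar(stat = "identity", position = position_dodge()) +
    scale_fill_manual(values = c('#FF5700', '#BCBBBB')) +
    labs(title = "Average number of comments per post for all languages",
         y = "Number of comments",
         x = "Programming Language") +
    theme_bw() +
    # Rotate the language labels so they do not overlap
    theme(axis.text.x = element_text(angle = 90))
}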
# Grouped bar chart: average text length per post, by language and platform.
plot_text_length_bar <- function(data) {
  data %>%
    group_by(language, Platform) %>%
    summarise(text_length = mean(text_length), .groups = "drop") %>%
    ggplot(aes(x = reorder(language, -text_length), y = text_length, fill = Platform)) +
    geom_bar(stat = "identity", position = position_dodge()) +
    labs(title = "Average text length per post for all languages",
         y = "Average text length",
         x = "Programming Language") +
    scale_fill_manual(values = c('#FF5700', '#BCBBBB')) +
    theme_bw()
}
Forecasting Model: ARIMA (Autoregressive Integrated Moving Average)
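The slides name ARIMA as the forecasting model but do not show how it was fitted. The following is a minimal sketch in base R, not the project's actual code. The monthly counts are synthetic and purely illustrative, and the order (1, 1, 0) is an assumption picked by hand; in practice the order would be chosen from the data, for example by comparing AIC values.

```r
# Synthetic monthly post counts for one language (illustrative only, not project data):
# a random walk with a small upward drift.
set.seed(1)
counts <- ts(round(200 + cumsum(rnorm(48, mean = 2, sd = 10))),
             start = c(2016, 1), frequency = 12)

# ARIMA(1, 1, 0): one autoregressive term on the first-differenced series.
# The order is an assumption made for this sketch.
fit <- arima(counts, order = c(1, 1, 0))

# Point forecasts and standard errors for the next 12 months.
fc <- predict(fit, n.ahead = 12)

# Approximate 95% prediction interval around each forecast.
forecast_table <- data.frame(
  time     = as.numeric(time(fc$pred)),
  forecast = as.numeric(fc$pred),
  lower    = as.numeric(fc$pred - 1.96 * fc$se),
  upper    = as.numeric(fc$pred + 1.96 * fc$se)
)
print(head(forecast_table))
```

The differencing step (the "I" in ARIMA) removes the trend, so the autoregressive part only has to model month-to-month changes. The interval widens with the forecast horizon because the standard errors in `fc$se` grow.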
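library(dplyr)        # select(), %>%
library(tm)           # corpus handling and text cleaning
library(topicmodels)  # LDA()

# `lrd` is the data frame of posts; `text_new` holds the lower-cased post text
text <- lrd %>% select(text_new)

# Create a corpus
docs <- Corpus(VectorSource(text$text_new))
docs <- docs %>%
  tm_map(removeNumbers) %>%
  tm_map(stripWhitespace) %>%
  tm_map(removePunctuation) %>%
  tm_map(content_transformer(tolower)) %>%
  tm_map(removeWords, stopwords("english"))

# Build the document-term matrix and drop terms that are absent from more than 97% of documents
dtm <- DocumentTermMatrix(docs)
dtm <- removeSparseTerms(dtm, 0.97)

# LDA cannot handle documents without any remaining terms, so drop empty rows
sel_idx <- slam::row_sums(dtm) > 0
dtm <- dtm[sel_idx, ]

# Fit the topic model; K (the number of topics) must be defined beforehand
model <- LDA(dtm, k = K)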
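To relate a fitted topic model to the question of which topic or language a post is about, one can inspect the top terms per topic and the most likely topic per document. The sketch below is an assumption about how that inspection could look, not the project's code. It uses six made-up toy posts instead of the Reddit data, and the choices `K = 3` and `seed = 1` are arbitrary.

```r
library(tm)
library(topicmodels)

# Toy posts (hypothetical, for illustration only)
toy_posts <- c(
  "python pandas dataframe numpy array",
  "python numpy array matrix pandas",
  "javascript react component router props",
  "javascript react router component state",
  "java class compile program object",
  "java compile class object program"
)

toy_dtm <- DocumentTermMatrix(Corpus(VectorSource(toy_posts)))

# Number of topics and seed are assumptions made for this sketch
K <- 3
toy_model <- LDA(toy_dtm, k = K, control = list(seed = 1))

# Top 3 terms of each topic: a matrix with one column per topic
top_terms <- terms(toy_model, 3)

# Most likely topic for each document
doc_topics <- topics(toy_model)

# Full document-topic probabilities (one row per document, rows sum to 1)
doc_topic_probs <- posterior(toy_model)$topics

print(top_terms)
print(doc_topics)
```

Comparing `doc_topics` with the known `language` column would show how well the unsupervised topics line up with the language labels.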