Tag Prediction on Stack Exchange

Group 3

Shravan Honade
Ritesh Sengar
Nikhil Patil
Dhananjay Ghate
Neena Chaudhari

2023-10-25

Primary objectives:

This project uses data and modeling to explore and predict tags for questions on Stack Exchange. The primary objectives are as follows:

Real-world Application: Improving tag prediction for Stack Exchange and other platforms could benefit millions of users.
Data-Driven Insights: EDA can reveal user behavior, popular topics, and challenges on Stack Exchange.
Automation: Automating tag prediction can save time and improve efficiency for moderators and users.
User Experience Improvement: Accurate tagging improves user engagement by making it easier to find relevant content.

List of EDA tasks and tools:

API Integration (httr package): httr will be used to interact with APIs and retrieve data for analysis.
Data Visualization (ggplot2): ggplot2 will be used to visualize data patterns and model performance.
Machine Learning Libraries (H2O, Keras): H2O, Keras, and possibly other libraries will serve as the machine learning frameworks for building the tag prediction models.
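
As a rough illustration of the retrieval and visualization steps, the sketch below pulls recent questions with httr and summarizes tag frequencies for plotting with ggplot2. The endpoint version (2.3), the query parameters, the use of jsonlite for parsing, and the helper names fetch_questions, count_tags, and plot_top_tags are assumptions made for this example, not part of the project plan.

```r
# Sketch only: endpoint, parameters, and helper names are assumptions.

# Retrieve one page of recent questions from the Stack Exchange API.
fetch_questions <- function(site = "stackoverflow", pagesize = 50) {
  resp <- httr::GET(
    "https://api.stackexchange.com/2.3/questions",
    query = list(order = "desc", sort = "creation",
                 site = site, pagesize = pagesize)
  )
  httr::stop_for_status(resp)  # fail loudly on HTTP errors
  body <- httr::content(resp, as = "text", encoding = "UTF-8")
  jsonlite::fromJSON(body)$items  # data frame; `tags` is a list column
}

# Count how often each tag occurs across a list of per-question tag vectors.
count_tags <- function(tag_lists) {
  counts <- table(unlist(tag_lists))
  df <- data.frame(tag = names(counts), n = as.integer(counts),
                   stringsAsFactors = FALSE)
  df[order(-df$n, df$tag), ]  # most frequent first, ties broken by name
}

# Horizontal bar chart of the most frequent tags.
plot_top_tags <- function(tag_df, top_n = 10) {
  top <- head(tag_df, top_n)
  ggplot2::ggplot(top, ggplot2::aes(x = reorder(tag, n), y = n)) +
    ggplot2::geom_col() +
    ggplot2::coord_flip() +
    ggplot2::labs(x = "Tag", y = "Questions")
}
```

A typical call chain would be questions <- fetch_questions(), then plot_top_tags(count_tags(questions$tags)). Unauthenticated requests to the API are rate limited, so a larger extraction would likely need an API key and paging.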

Objectives and goals:

Objective 1: Extract relevant data from the Stack Exchange API.

Objective 2: Conduct exploratory data analysis (EDA).

Objective 3: Develop a machine learning model.

Key Performance Indicators (KPIs):

KPI 1: Accuracy

KPI 2: Additional evaluation metrics (for example, precision, recall, and F1 score)

KPI 3: Reduction in Manual Tagging
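
Because a question can carry several tags, the accuracy and metric KPIs need a multi-label definition. The sketch below is one possible reading, not the project's settled choice: it treats accuracy as the share of questions whose predicted tag set matches exactly, and computes micro-averaged precision, recall, and F1 over all tags. The function name evaluate_tags and the input layout (two lists of character vectors) are assumptions for illustration.

```r
# Sketch only: metric definitions and the function name are assumptions.
# true_tags and pred_tags are lists of character vectors, one per question.
evaluate_tags <- function(true_tags, pred_tags) {
  stopifnot(length(true_tags) == length(pred_tags))

  # Correctly predicted tags per question
  tp <- mapply(function(t, p) length(intersect(t, p)), true_tags, pred_tags)
  # Number of distinct predicted and actual tags per question
  n_pred <- lengths(lapply(pred_tags, unique))
  n_true <- lengths(lapply(true_tags, unique))
  # TRUE when the predicted set equals the actual set
  exact <- mapply(function(t, p) setequal(t, p), true_tags, pred_tags)

  # Micro-averaging pools counts over all questions before dividing.
  # These are NaN if there are no predictions or no true tags at all.
  precision <- sum(tp) / sum(n_pred)
  recall <- sum(tp) / sum(n_true)
  f1 <- 2 * precision * recall / (precision + recall)

  c(exact_match = mean(exact), precision = precision,
    recall = recall, f1 = f1)
}

# Small made-up example (not project data)
true_tags <- list(c("r", "ggplot2"), c("python"), c("java", "spring"))
pred_tags <- list(c("r", "ggplot2"), c("python", "pandas"), c("java"))
metrics <- evaluate_tags(true_tags, pred_tags)
```

On this made-up example, 4 of 5 predicted tags are correct and 4 of 5 true tags are recovered, so precision, recall, and F1 are all 0.8, while only one of the three questions is an exact match.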