Benefits

An experimental study of school data can help school officials make decisions about improving the education system. This project aims to make the data collected about xx school reveal useful information. I plan to become a consultant, using my data science skills across various domains to present meaningful reports to government entities, companies, and organizations and help them make decisions. This project will therefore help build the skills needed to succeed in data science.

Research question

Example: Do students who study at least 10 hours weekly perform better in class than those who spend less time?
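One way to make this question testable is to split students at the 10-hour threshold and compare the mean grades of the two groups. The sketch below is only an illustration under assumptions: the function name `compare_groups`, the (hours, grade) record layout, and the toy numbers are all hypothetical, and Welch's t statistic is just one possible choice of comparison.

```python
from math import sqrt
from statistics import mean, variance


def compare_groups(records, threshold_hours=10):
    """Split (weekly_hours, grade) pairs at the threshold and compare mean grades."""
    high = [g for h, g in records if h >= threshold_hours]
    low = [g for h, g in records if h < threshold_hours]
    # Welch's t statistic: difference of means divided by its standard error.
    se = sqrt(variance(high) / len(high) + variance(low) / len(low))
    return {
        "mean_high": mean(high),
        "mean_low": mean(low),
        "t_stat": (mean(high) - mean(low)) / se,
    }


# Made-up (hours, grade) pairs purely for illustration; not real results.
toy = [(12, 15), (11, 14), (10, 16), (4, 11), (6, 12), (3, 10)]
result = compare_groups(toy)
print(result)
```

A large positive `t_stat` would suggest that the group studying more hours has the higher mean grade, but a proper p-value and a check of the test's assumptions would still be needed before drawing conclusions.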

Data source

We found an interesting dataset at https://archive.ics.uci.edu/ml/machine-learning-databases/00320/. The data come from a study of students taking a math and/or Portuguese language course at two schools.
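Once the files are downloaded, they need to be parsed into rows. The sketch below assumes the files are semicolon-delimited text with a header row and uses `school`, `studytime`, and `G3` as column names; the delimiter and the column names are assumptions that should be checked against the actual files, and the sample rows are invented for illustration.

```python
import csv
import io


def parse_student_csv(text, delimiter=";"):
    """Parse delimited text into a list of dicts, one per student row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


# Tiny made-up sample in the assumed layout; not rows from the real dataset.
sample = 'school;studytime;G3\n"GP";2;11\n"MS";4;15\n'
rows = parse_student_csv(sample)
print(rows[0])
```

All values come back as strings, so numeric columns still have to be converted during the scrubbing step.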

Overall Workflow

I will use the OSEMN process:
    1. Obtain Data
      • Unstructured Data
      • SQL database
    2. Scrub Data
      • Organizing data
      • Tidying up data
    3. Explore Data
      • Inspecting the data and understanding its characteristics
      • Looking for relationships, patterns, and values
    4. Model Data
      • Creating predictive models (I don’t think we cover data modeling in this class)
    5. Interpret Results
      • Explaining findings (Answering the research question)
      • Understanding the audience
      • Actionable information
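The five steps above can be sketched as a small pipeline. This is a minimal illustration, not the project's actual implementation: the function names, the `hours` and `grade` fields, the made-up rows, and the choice of a least-squares line as the model are all assumptions.

```python
from statistics import mean


def scrub(rows):
    """Scrub: keep rows with both fields present and convert them to numbers."""
    clean = []
    for row in rows:
        try:
            clean.append((float(row["hours"]), float(row["grade"])))
        except (KeyError, ValueError, TypeError):
            continue  # skip incomplete or non-numeric rows
    return clean


def explore(pairs):
    """Explore: basic summary of each variable."""
    hours, grades = zip(*pairs)
    return {"n": len(pairs), "mean_hours": mean(hours), "mean_grade": mean(grades)}


def model(pairs):
    """Model: ordinary least-squares line grade = intercept + slope * hours."""
    hours, grades = zip(*pairs)
    mh, mg = mean(hours), mean(grades)
    sxx = sum((h - mh) ** 2 for h in hours)
    sxy = sum((h - mh) * (g - mg) for h, g in pairs)
    slope = sxy / sxx
    return slope, mg - slope * mh


def interpret(slope):
    """Interpret: turn the fitted slope into a plain-language statement."""
    direction = "higher" if slope > 0 else "lower or equal"
    return (f"Each extra weekly study hour is associated with a "
            f"{direction} grade ({slope:+.2f} points).")


# Obtain: made-up rows standing in for the real source; one row is deliberately dirty.
raw = [{"hours": "2", "grade": "10"}, {"hours": "4", "grade": "12"},
       {"hours": "6", "grade": "14"}, {"hours": "", "grade": "9"}]
pairs = scrub(raw)
summary = explore(pairs)
slope, intercept = model(pairs)
message = interpret(slope)
print(summary)
print(message)
```

Keeping each step in its own function makes it easy to swap in a different data source or model later without touching the rest of the workflow.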

Challenges

There are a few challenges to overcome in this project:
    1. Data source: In this project, we will not be sampling a population. We will use collected data available online, and the format of such data is often not easy to process.
    2. Finding a good statistical model that will help answer the research question(s).
    3. Interpreting results: This is going to be crucial. How should the results be presented, which format is suitable, and who is the audience?