Capstone Proposal

What is the problem you want to solve?

  1. Understand the impact of the various facilities the government has provided to schools over the years
  2. Given these facility fields and the number of schools in villages, predict the literacy rate

Who is your client and why do they care about this problem? In other words, what will your client DO or DECIDE based on your analysis that they wouldn’t have otherwise?

  1. State education departments
  2. District Magistrates

What data are you going to use for this? How will you acquire this data?

The data is taken from www.dise.in/drc.htm. Fields to be studied:-

  1. Schools with Playground Facility
  2. Schools with Boundarywall
  3. Schools with Girls’ Toilet
  4. Schools with Boys’ Toilet
  5. Schools with Drinking Water
  6. Schools with Electricity
  7. Elementary Enrolment by School Category (Government) - Rural
  8. Schools with PTR
  9. Schools with SCR
  10. SC, ST and OBC enrolment

We will study the impact of these fields on the overall literacy rate.

We will use data from only 2-3 years and from only 5 states.
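As a minimal sketch of this restriction, the base R snippet below keeps only the chosen states and years. The data frame `dise`, its column names (`state`, `year`, `pct_electricity`), the state labels and all values are made-up placeholders for illustration; the real DISE export will have different names and layout.

```r
# Made-up stand-in for the DISE extract: 3 placeholder states x 4 years.
# Column names and values are assumptions, not the real DISE schema.
dise <- expand.grid(
  state = c("State A", "State B", "State C"),
  year = 2011:2014,
  stringsAsFactors = FALSE
)
dise$pct_electricity <- seq_len(nrow(dise))  # dummy values only

# The states and years to keep (hypothetical choices).
states_keep <- c("State A", "State B")
years_keep <- c(2013, 2014)

# Keep only the rows that match both the state and the year selection.
dise_subset <- dise[dise$state %in% states_keep & dise$year %in% years_keep, ]
print(dise_subset)
```

Keeping the selection in two named vectors makes it easy to change which states or years are included without touching the filter itself.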

In brief, outline your approach to solving this problem (knowing that this might change later).

Task 1:-

  1. Filter the data from the Excel sheet down to 3 years and 5 states
  2. Tidy it and remove unnecessary data
  3. Plot and visualise the data using ggplot

Task 2:-

  1. Correlation analysis, to figure out the relations between features / parameters
  2. Bring out patterns in the data to identify the relevance of the fields
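The correlation step could look roughly like the sketch below, which uses base R's `cor()` to build a Pearson correlation matrix and then ranks the facility fields by how strongly they correlate with literacy. The data is randomly generated and the column names are hypothetical; nothing here reflects actual DISE figures.

```r
set.seed(1)
n <- 30

# Synthetic placeholder data; column names are assumptions for illustration.
facilities <- data.frame(
  pct_drinking_water = runif(n, 50, 100),
  pct_electricity = runif(n, 20, 100),
  pct_girls_toilet = runif(n, 30, 100)
)
# Literacy is simulated to depend on electricity so that a pattern exists.
facilities$literacy_rate <-
  40 + 0.3 * facilities$pct_electricity + rnorm(n, sd = 5)

# Pairwise Pearson correlations between all numeric columns.
corr <- cor(facilities, use = "complete.obs", method = "pearson")

# Correlation of each facility field with the literacy rate, strongest first.
fields <- setdiff(colnames(corr), "literacy_rate")
lit_corr <- corr[fields, "literacy_rate"]
lit_corr <- lit_corr[order(abs(lit_corr), decreasing = TRUE)]

print(round(corr, 2))
print(round(lit_corr, 2))
```

Correlation only measures linear association, so a field with a weak coefficient may still matter in a model, and a strong one does not by itself show that the facility causes higher literacy.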

Task 3:-

  1. Build a statistical model on the data, such as a regression analysis
  2. Use additional test data to generate predictions and measure their accuracy
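One possible shape for Task 3 is sketched below: a linear regression fitted with `lm()` and scored on rows the model has not seen. Holding out part of the available rows is used here as a stand-in for the additional test data, which is an assumption rather than the plan stated above. The data is synthetic, and the column names and the choice of RMSE as the accuracy measure are illustrative only.

```r
set.seed(42)
n <- 100

# Synthetic placeholder data; column names are assumptions for illustration.
schools <- data.frame(
  pct_electricity = runif(n, 20, 100),
  pct_drinking_water = runif(n, 50, 100)
)
schools$literacy_rate <- 30 + 0.4 * schools$pct_electricity +
  0.1 * schools$pct_drinking_water + rnorm(n, sd = 3)

# Hold out 20 rows as test data; train on the remaining 80.
test_idx <- sample(seq_len(n), size = 20)
train <- schools[-test_idx, ]
test <- schools[test_idx, ]

# Fit the regression on the training rows only.
fit <- lm(literacy_rate ~ pct_electricity + pct_drinking_water, data = train)

# Predict on the held-out rows and measure error with RMSE.
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$literacy_rate - pred)^2))

print(summary(fit)$coefficients)
print(rmse)
```

RMSE is in the same units as the literacy rate, so it reads directly as a typical prediction error in percentage points.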

What are your deliverables? Typically, this would include code, along with a paper and/or a slide deck.

  1. R code for wrangling
  2. Visualisations using ggplot or Tableau
  3. Correlation analysis and the patterns it brings out
  4. Statistical model design, plus predictions on test data and their accuracy
  5. Data story presented as a video demo and PPT