SBIR Awards - Proposal

Team 7:
Ankit
Ashok Rayapati
Sivananda Reddy Bushireddy
Vamshidhar Reddy Kanamanthareddy

Data

What primary data will you use?
* SBIR Awards.csv
* The Small Business Innovation Research (SBIR) program is a United States Government program, coordinated by the Small Business Administration, intended to help certain small businesses conduct research and development (R&D).
* It is a three-phase award system that gives qualified small business concerns the opportunity to propose innovative ideas that meet the specific research and R&D needs of the federal government. Ref: SBIR.gov

Will you collect secondary data from other sources?
* No

Problem Description

What is the problem that you will be investigating?
* We will find the top keywords in each topic, starting from 3-5 sets of keywords (e.g., medical, agriculture, technology).
* The data set includes covariates, so we will use them in our analysis to check how they affect the outcome.
* We would like to investigate how the prevalence of topics changes over time.

Why is it interesting?
* We gain hands-on experience with a niche skill in text analytics using groundbreaking technology.
* It is also interesting to identify upcoming technologies through this program.

Analytics Plan

How will you analyze your data? What methods or tools will you use?
* We will choose the number of topics using a perplexity plot.
* Quantitative analysis of textual data using the quanteda package
* Keyword-assisted topic modeling using the keyATM Base, Covariates, and Dynamic models
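
To make the perplexity criterion concrete, here is a minimal sketch in Python rather than in the R packages named above. It assumes a fitted topic model has already produced document-topic proportions and topic-word probabilities; the names `perplexity`, `docs`, `theta`, and `phi` are hypothetical and the toy numbers are made up. Perplexity is the exponential of the negative average log-likelihood per token, so lower values indicate a better fit.

```python
import math

def perplexity(docs, theta, phi):
    """Perplexity of tokenized documents under a topic model.

    docs  : list of documents, each a list of word ids
    theta : theta[d][k] = proportion of topic k in document d
    phi   : phi[k][w]   = probability of word w under topic k
    """
    log_lik = 0.0
    n_tokens = 0
    for d, doc in enumerate(docs):
        for w in doc:
            # Probability of the word: mix the topics by the document's proportions.
            p_w = sum(theta[d][k] * phi[k][w] for k in range(len(phi)))
            log_lik += math.log(p_w)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)

# Toy check (made-up values): 2 topics, 4-word vocabulary, every word equally likely.
docs = [[0, 1, 2], [3, 3, 0]]
theta = [[0.5, 0.5], [0.9, 0.1]]
phi = [[0.25] * 4, [0.25] * 4]
uniform_ppl = perplexity(docs, theta, phi)
```

In practice this value would be computed once per candidate number of topics and plotted, and the chosen number would sit near the minimum or the point where the curve flattens.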

Evaluation Plan

How will you evaluate your results?
* We plan to visualize the keywords to see how they rank by proportion.
* We also plan to use a time trend plot to visualize trends in our data over time.
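
The two evaluation views above reduce to simple summaries, sketched here in Python as an illustration rather than as the project's actual R workflow. The sketch assumes that documents are already tokenized, that each award carries a year, and that a fitted model supplies per-document topic proportions; the function names and all toy values are hypothetical.

```python
from collections import Counter, defaultdict

def keyword_proportions(docs, keywords):
    """Share of all corpus tokens taken by each keyword, highest first."""
    counts = Counter(token for doc in docs for token in doc)
    total = sum(counts.values())
    props = {kw: counts[kw] / total for kw in keywords}
    return sorted(props.items(), key=lambda item: item[1], reverse=True)

def yearly_topic_prevalence(years, theta):
    """Mean proportion of each topic per year, from per-document proportions."""
    by_year = defaultdict(list)
    for year, doc_props in zip(years, theta):
        by_year[year].append(doc_props)
    return {
        year: [sum(col) / len(rows) for col in zip(*rows)]
        for year, rows in sorted(by_year.items())
    }

# Made-up toy inputs, purely to show the shapes involved.
docs = [["laser", "sensor", "laser"], ["vaccine", "sensor", "trial"]]
ranking = keyword_proportions(docs, ["sensor", "laser", "vaccine"])

years = [2019, 2019, 2020]                      # one (assumed) year per document
theta = [[0.8, 0.2], [0.4, 0.6], [0.1, 0.9]]    # per-document topic proportions
trend = yearly_topic_prevalence(years, theta)
```

Here `ranking` is the list behind a keyword-proportion plot, and `trend` holds one line per topic for a time trend plot. A keyword with a very low proportion is a signal that it may be too rare to guide its topic.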