Team 7:
Ankit
Ashok Rayapati
Sivananda Reddy
Bushireddy
Vamshidhar Reddy Kanamanthareddy
What primary data will you use?
* SBIR Awards.csv
* The Small Business Innovation Research (or SBIR) program is a United
States Government program, coordinated by the Small Business
Administration, intended to help certain small businesses conduct
research and development (R&D).
* This is a three-phase award system which provides qualified small
business concerns with opportunities to propose innovative ideas that
meet the specific research and research and development needs of the
federal Government. Ref: SBIR.gov
Will you collect secondary data from other sources?
* No
What is the problem that you will be investigating?
* We will try to find the top keywords in each topic, using 3-5 sets of
keywords (e.g., medical, agriculture, technology).
* Our data set includes covariates, so we will use them in our analysis
to check how they affect the outcome.
* We would like to investigate how the prevalence of topics changes over
time.
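
As a rough illustration of the last question, the sketch below averages
per-document topic proportions by award year. It is written in plain
Python for illustration only and is not the keyATM API; the function
name `prevalence_by_year` and the toy inputs are hypothetical, and the
topic proportions are assumed to come from an already fitted model.

```python
from collections import defaultdict


def prevalence_by_year(doc_years, doc_topic_props):
    """Average each topic's proportion over the documents of each year."""
    sums = {}
    counts = defaultdict(int)
    for year, props in zip(doc_years, doc_topic_props):
        if year not in sums:
            sums[year] = [0.0] * len(props)
        for k, p in enumerate(props):
            sums[year][k] += p
        counts[year] += 1
    # One list of mean topic proportions per year, in chronological order
    return {year: [s / counts[year] for s in sums[year]]
            for year in sorted(sums)}


# Hypothetical toy input: award years and per-document topic proportions
years = [2019, 2019, 2020]
theta = [[0.8, 0.2], [0.6, 0.4], [0.1, 0.9]]
trend = prevalence_by_year(years, theta)
```

Plotting each topic's yearly mean against the year would then show
whether a topic is becoming more or less prevalent.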
Why is it interesting?
* We get hands-on experience with a niche skill in text analytics using
groundbreaking technology
* It is also interesting to identify upcoming technologies through
these programs
How will you analyze your data? What methods or tools will you use?
* We will find the best number of topics using a perplexity plot
* Quantitative analysis of textual data using the quanteda package
* Advanced topic modeling using keyATM Base, keyATM Covariates, keyATM
Dynamic
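
To make the perplexity step concrete, here is a minimal sketch of how
held-out perplexity can be computed from a fitted topic model and
compared across candidate topic counts. It is plain Python for
illustration, not the quanteda or keyATM API; the names `perplexity` and
`best_num_topics`, the toy matrices, and the example scores are all
hypothetical. Picking the minimum is a simplification, since in practice
one usually reads the plot for the point where the curve flattens.

```python
import math


def perplexity(docs, theta, phi):
    """Perplexity of tokenized docs under a topic model.

    docs:  list of documents, each a list of word ids
    theta: theta[d][k] = proportion of topic k in document d
    phi:   phi[k][w]   = probability of word w under topic k
    """
    log_lik = 0.0
    n_tokens = 0
    for d, tokens in enumerate(docs):
        for w in tokens:
            # Word probability is the mixture over topics
            p = sum(theta[d][k] * phi[k][w] for k in range(len(phi)))
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)


def best_num_topics(perplexity_by_k):
    """Return the candidate topic count with the lowest perplexity."""
    return min(perplexity_by_k, key=perplexity_by_k.get)


# Toy check: a model that is uniform over 4 words has perplexity 4
docs = [[0, 1, 2, 3]]
theta = [[0.5, 0.5]]
phi = [[0.25] * 4, [0.25] * 4]
uniform_ppl = perplexity(docs, theta, phi)

# Hypothetical scores for three candidate topic counts (made-up numbers)
chosen_k = best_num_topics({5: 310.2, 10: 254.7, 15: 261.3})
```

Lower perplexity means the model assigns higher probability to the
held-out words, which is why the plot is read for its low region.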
How will you evaluate your results?
* We are planning to visualize the keywords to identify their rankings
by proportion.
* We are also planning to use a time trend plot to visualize trends in
our data over time.
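
The keyword ranking in the first point can be sketched as follows: each
keyword's proportion is its share of all tokens in the corpus, and the
keywords within a topic are then sorted by that share. This is a plain
Python illustration under that assumption, not the keyATM plotting
function; `keyword_proportions` and the toy documents and keyword sets
are hypothetical.

```python
from collections import Counter


def keyword_proportions(tokenized_docs, keywords):
    """Rank each topic's keywords by their share of all corpus tokens."""
    counts = Counter(tok for doc in tokenized_docs for tok in doc)
    total = sum(counts.values())
    ranked = {}
    for topic, words in keywords.items():
        props = [(w, counts[w] / total) for w in words]
        # Highest-proportion keyword first
        ranked[topic] = sorted(props, key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical toy corpus and keyword sets
docs = [
    ["vaccine", "trial", "sensor"],
    ["crop", "sensor", "soil"],
    ["vaccine", "crop", "vaccine"],
]
keywords = {"medical": ["trial", "vaccine"], "agriculture": ["crop", "soil"]}
ranked = keyword_proportions(docs, keywords)
```

A keyword that almost never appears in the corpus would show up at the
bottom of its topic's ranking, which is a useful signal that the keyword
set may need revising before fitting the model.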