1. Introduction


In this final project, we will analyze the full data set from the General Social Survey (GSS).

Please install the gssr package and read the instructions and examples here.

In this project, you will be required to finish the following tasks:

  1. Understand basic information and the data structure of the data set.
  2. Understand how to get information for each variable from the code book or the help documentation built into the R package.
  3. Read the recommended articles for hints on how the data set can be used to answer questions of interest.
  4. Do some research to find a meaningful topic that can be explored with the data set.
  5. Perform data analysis and finish a polished report to answer a series of meaningful questions on the chosen topic. At least one question needs to involve geospatial data analysis.
  6. Present your results in a presentation to communicate your research, questions, and findings effectively.
  7. (Bonus) If you create a professional-looking dashboard based on your report, you can earn a bonus of up to 10%, depending on the quality of your dashboard.


2. A Step-by-Step Guideline


1. Understand basic information and the data structure of the data set.

  • Install the gssr package, then read its main page to understand the basic structure of the data set. Obtain basic information about the data set from the GSS main page.

  • I am going to give a tutorial in class and share the videos with all of you.
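
As a starting point for this step, here is a minimal sketch of loading the data and checking its size. It assumes that gssr is published on the author's r-universe repository (the `gssr_repos` vector below is my own name for the repository list, and the URL may change), and that the cumulative file is exposed as `gss_all` with a `year` column. Check the package main page if either assumption fails.

```r
# Assumption: gssr is distributed through r-universe rather than CRAN.
gssr_repos <- c("https://kjhealy.r-universe.dev", "https://cloud.r-project.org")

if (!requireNamespace("gssr", quietly = TRUE)) {
  # Nothing is installed automatically; run the suggested command yourself.
  message("gssr is not installed; run: install.packages('gssr', repos = gssr_repos)")
} else {
  library(gssr)
  data(gss_all)                           # cumulative GSS file (assumed name)
  print(dim(gss_all))                     # number of respondents x number of variables
  print(range(as.numeric(gss_all$year)))  # first and last survey year included
}
```

The cumulative file is large, so loading it can take a while and uses a fair amount of memory.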


2. Understand how to get information for each variable from the code book or the help documentation built into the R package.

  • The gssr package main page has a tutorial on how to look up the meaning of each variable in R.

  • You can use the tool to search for useful variables for your analysis.

  • The full documentation for the data set serves as a reference. Skim the “Introduction” and “Index to Data Set” to get an idea of which variables are included.
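
A keyword search over variable labels is one way to shortlist candidates before reading the full documentation. The sketch below is only an illustration: `find_vars` is a hypothetical helper (not part of gssr), the toy dictionary uses made-up labels, and the bundled dictionary is assumed to be called `gss_dict` with `variable` and `label` columns, which may differ between package versions.

```r
# Hypothetical helper (not part of gssr): case-insensitive keyword search
# over a dictionary data frame that has `variable` and `label` columns.
find_vars <- function(dict, pattern) {
  hits <- grepl(pattern, dict$label, ignore.case = TRUE)
  dict[hits, c("variable", "label"), drop = FALSE]
}

# Toy dictionary with illustrative labels, so the sketch runs without gssr.
toy_dict <- data.frame(
  variable = c("happy", "region", "polviews"),
  label    = c("General happiness", "Region of interview",
               "Think of self as liberal or conservative"),
  stringsAsFactors = FALSE
)
print(find_vars(toy_dict, "happi"))

# With gssr installed, try the same search on the bundled dictionary
# (assumed name and column layout; skipped if the columns are not there).
if (requireNamespace("gssr", quietly = TRUE)) {
  data(gss_dict, package = "gssr")
  if (all(c("variable", "label") %in% names(gss_dict))) {
    print(head(find_vars(gss_dict, "happi")))
  }
}
```

Once a variable looks promising, confirm its exact wording, coding, and the years in which it was asked in the code book before building a question around it.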


4. Do some research to find a meaningful topic that can be explored with the data set.

To find questions that have depth and practical importance, you need to do some research and settle on a meaningful topic for your analysis report. You are required to cite at least one article (a link to the article is sufficient) in your Introduction section to describe what motivates your study.

5. Perform data analysis and finish a polished report to answer a series of meaningful questions on the chosen topic. The requirements are as follows:

Pick at least three questions of interest that can be explored with the GSS data set. The questions should meet the following requirements:

  • At least one question needs to explore data from the past 10 years (2012-2022).

  • At least one question should study a trend over time (compare the same variable across different years, from the 1970s to the present).

  • At least one question needs to involve geospatial data analysis. Check this link for the definition of regions.

  • The questions should be related to each other so that you can draw an overall conclusion in the Conclusion section.
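
The three data requirements above can be sketched on a toy data frame as follows. Everything in `toy` is invented for illustration. The column names `year`, `region`, and `happy`, the coding of `happy`, and the nine region labels are assumptions based on the GSS code book, so confirm them against the region definition link before relying on them. Real gssr columns may also carry value labels that need converting (for example with `as.numeric()`) first.

```r
# Toy stand-in for a GSS extract; the rows are invented for illustration only.
toy <- data.frame(
  year   = c(1972, 1972, 1996, 1996, 2012, 2012, 2022, 2022),
  region = c(1, 9, 3, 5, 1, 7, 9, 9),
  happy  = c(1, 2, 1, 3, 2, 2, 1, 3)  # assumed coding: 1 = very happy
)
toy$very_happy <- toy$happy == 1

# Requirement 1: restrict to the past 10 years (2012-2022).
recent <- subset(toy, year >= 2012 & year <= 2022)

# Requirement 2: trend over time, as the share of "very happy" per survey year.
trend <- aggregate(very_happy ~ year, data = toy, FUN = mean)

# Requirement 3: regional comparison, after attaching readable region names.
# Assumed labels for region codes 1 to 9.
region_labels <- c(
  "New England", "Middle Atlantic", "East North Central",
  "West North Central", "South Atlantic", "East South Central",
  "West South Central", "Mountain", "Pacific"
)
toy$region_name <- factor(toy$region, levels = 1:9, labels = region_labels)
by_region <- aggregate(very_happy ~ region_name, data = toy, FUN = mean)

print(recent)
print(trend)
print(by_region)
```

A regional summary like `by_region` is the table you would then join to map geometry for the geospatial question. The choice of mapping package is up to you.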

6. Present your results in a presentation to communicate your research, questions, and findings effectively.

You need to create PowerPoint slides to present your results in the presentation session. Your presentation should cover the main contents of your report (Introduction, Questions and Findings, Conclusion) and should last between 10 and 15 minutes. You must be well prepared for the talk, which serves as an exercise in presenting technical reports publicly.


3. Format Requirements and Rubrics


Data Analysis Report (60%)

  1. You are required to submit a PDF (text + plots) with the code hidden, along with the Rmd file that can be knitted to generate the PDF report. To hide the code in the knitted document, put the following line in a code chunk at the top of the Rmd file: knitr::opts_chunk$set(echo = FALSE)

  2. The final report should be at least 8 pages long and organized into three sections: Introduction, Questions and Findings, and Conclusion:

    • Introduction: this section shall introduce the specific topic that interests you from your research.
    • Questions and Findings: this section shall go through each of your questions, all of which should be logically related to the chosen topic. For each question, clearly state the question, present the visualization or analysis result that addresses it, and discuss your findings and thoughts in words.
    • Conclusion: summarize what you learned from the Questions and Findings section to give an overall conclusion regarding the chosen topic. Your analysis should offer comprehensive insight into the social issues under investigation.
  3. Rubrics:

  • Report Quality and Professionalism (overall organization, language quality, formatting, graph details, freedom from typos, etc.) 40%
  • Technical Rigor of visualization and analysis 30%
  • Depth and Logical structure 30%


Presentation (40%)

Rubrics:

  • Understanding 30%
  • Preparation 40%
  • Presentation 30%