Instructor Information | Course Information |
---|---|
Anthony Howell, PhD | Course Meeting Time: Mondays, 1:00-2:50pm |
Office: #322 School of Economics Bldg. | Course Meeting Location: Science Teaching Building (理教), Room 415 |
Email: tonyjhowell@pku.edu.cn | Office Hours: By Appt. |
In 2016, Glassdoor named ‘Data Scientist’ the best job of the year, based on current job trends across thousands of professions. Hal Varian, the chief economist at Google, said that the sexiest job in the next 10 years will be statistician. At the same time, there is a major global skills deficit in the tools required to perform in-depth data analysis. The McKinsey Global Institute, for instance, indicates that the “United States alone faces a shortage of 140,000 to 190,000 people with deep analytical skills.”
This course emphasizes learning basic statistical concepts and methods while gaining hands-on experience with data science projects. During the class, students will learn data visualization and analysis techniques using the R statistical software, work with multiple datasets, and be exposed to topics in applied economic analysis. After completing the course, students will be able to: (1) critically read economic research reports; (2) use statistical methods in their own work; (3) write professional reports and produce reproducible research; (4) pursue further coursework in statistics/econometrics.
Participation
The participation grade, which accounts for 30% of your final grade, is based on in-class attendance and participation in lab work.
Mini-Project
Create an infographic in R using World Bank API data and present it in class (2-3 minutes). The infographic and presentation will count for 20% of your final grade.
Final Research Report/Presentation
The final project accounts for 50% of your final grade, and will ask you to explore a broad data-driven policy question. The instructor will provide access to various social, economic or environmental datasets for students to explore and analyze. This project is intended to provide students with the complete experience of going from a study question and a rich data set to a full statistical report.
Students will be expected to:
While students may work in small groups to decide on appropriate statistical methodology and graphical/tabular summaries, each student will be required to produce and submit their own code and final report.
Summary of Grade Distribution
Activity | Grade Contribution |
---|---|
1. Participation | 30% |
2. Mini-Project Infographic | 20% |
3. Final Research Report/Presentation | 50% |
A WeChat group will be created to serve as the class discussion forum, to facilitate interaction between students and to promote broader participation. Students are expected to conduct themselves respectfully and to post comments and replies only in the context of the course. Rather than emailing the instructor or TA directly, please use the WeChat class group to ask general questions about specific problems with R, programming issues, and/or homework. Your question will probably help other classmates. You can also paste small snippets of code to clarify an idea. Students are encouraged to answer each other’s questions.
Class attendance is encouraged. As you can see from the grade distribution, 30% of your grade is related to your class attendance and participation through lab work and quizzes. If you consistently miss classes, it will not be possible to obtain a high grade, and you may even fail the class. If you have a planned and excusable absence, please notify the instructor beforehand.
You are encouraged to discuss lab work and homework problems with your fellow students. However, the work you submit must be your own: the course collaboration policy allows you to discuss the problems with other students, but requires that you complete the work on your own. Every line of text and line of code that you submit must be written by you personally.
Submissions that fail to properly acknowledge help from other students or non-class sources will receive no credit. Copied work will receive no credit. Any and all violations will be reported to school administration.
In order to participate in class, students are expected to bring their own laptops. Please see the instructor if you do not have access to a laptop.
I value students’ opinions regarding my teaching effectiveness and the content, pace, and level of difficulty of the course. I will take student feedback into consideration to make this course as exciting and engaging as possible. You can also leave anonymous feedback in the form of a note in my departmental mailbox.
Week | Date | Topic | Labs/Assignments |
---|---|---|---|
Section I: Data Visualization and Mapping | | | |
1 | 9/17 | Course introduction and R basics | Lab 1 |
2 | 9/24 | Holiday - No Class | |
3 | 10/1 | Holiday - No Class | |
4 | 10/8 | Data manipulation and programming basics | Lab 2 |
5 | 10/15 | Tidyverse and ggplot | Lab 3 |
6 | 10/22 | Mapping with World Bank API | |
Section II: Statistical and Spatial Modeling | | | |
7 | 10/29 | Regression with programming | Lab 4 |
8 | 11/5 | Causality and identification | Lab 5 |
9 | 11/12 | Spatial Analysis and Moran’s I | |
10 | 11/19 | Infographic Presentation | Mini-Project |
Section III: Network Analysis | | | |
11 | 11/26 | Creating and handling network data | Lab 6 |
12 | 12/3 | Topological properties and visualizing networks | Lab 7 |
13 | 12/10 | Statistical modeling of networks | |
14 | 12/17 | Final Project Prep. | |
15 | 12/24 | Student Presentations I | |
16 | 12/31 | Student Presentations II | Final Written Project: Due 1/3 |