  • This document demonstrates what your project should look like. The results, tables and graphs included here are only examples.

  • Each section includes a recommended word count to give you an idea of how much you should write.

  • The word count does not need to be exact, but avoid falling far below or going far above the recommended range. For example, if the recommendation is 100-300 words, do not submit that section with only 20 words or with 1000 words!

  • For each section, make sure that you include all the requested information.

1 Introduction

Recommended word count for the introduction: 100-200 words.

What to include in the introduction:

What is the Research Question?

  • Clearly state the research question.
  • Example: “What is the effect of education on income?”

Why Does It Matter?

  • Provide motivation for the research.
  • Example: “Understanding this relationship helps policymakers design better education policies.”

2 Literature Review

Recommended word count for the literature review: 100-300 words.

What to include in the literature review section: Discuss literature that is relevant to your research topic (3-4 papers will be enough, but you can include more).

3 Data

Recommended word count for the data section: 100 words.

What to include in the data section:

General Information

  • Describe the dataset.
  • Example: “The data comes from the Crime Survey of England, Scotland and Wales.”

Available Variables

  • List key variables.
  • Example:
    • education: Years of education
    • income: Annual income in GBP
    • age: Age of the respondent

Create a Summary Table

This is where you summarize some of the important variables that describe the data.

For example, what is the average age in your sample?

What is the average education level?

What does the distribution of income look like? etc.

# Load the packages used throughout this tutorial
library(stargazer)  # summary and regression tables
library(ggplot2)    # plots

# Simulate data. This is for demonstration only: in your project, use the
# actual crime data instead of simulating anything.
set.seed(123)
education <- rnorm(100, mean = 16, sd = 2)
income <- 20000 + 3000 * education + rnorm(100, mean = 0, sd = 5000)
age <- sample(25:60, 100, replace = TRUE)
# Combine the variables into a data frame and generate summary statistics
data <- data.frame(education, income, age)
stargazer(data, type = "html", title = "Summary Statistics", digits = 3)

Summary Statistics

Statistic    N     Mean          St. Dev.     Min           Max
education    100   16.181        1.826        11.382        20.375
income       100   68,004.700    7,123.882    52,069.920    84,127.540
age          100   43.730        10.209       26            60

# Plot a histogram of an important continuous variable of interest (e.g. income)
ggplot(data, aes(x=income)) +
  geom_histogram(binwidth=5000, fill="blue", alpha=0.7) +
  theme_minimal() +
  labs(title="Distribution of Income",
       x="Income (GBP)",
       y="Frequency")
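
The questions listed above (average age, average education, shape of the income distribution) can also be answered directly in base R, without a table package. The following is a minimal sketch on the same simulated demonstration data; the object names summary_stats and income_quartiles are illustrative choices, not part of the original template, and in your project you would use your actual data instead.

```r
# Simulated stand-in data, mirroring the demonstration above (not real survey data)
set.seed(123)
education <- rnorm(100, mean = 16, sd = 2)
income <- 20000 + 3000 * education + rnorm(100, mean = 0, sd = 5000)
age <- sample(25:60, 100, replace = TRUE)
data <- data.frame(education, income, age)

# Mean and standard deviation of every variable, one row per variable
summary_stats <- data.frame(
  mean = sapply(data, mean),
  sd   = sapply(data, sd)
)
print(round(summary_stats, 3))

# Quartiles of income give a quick numerical sense of its distribution
income_quartiles <- quantile(data$income, probs = c(0.25, 0.5, 0.75))
print(income_quartiles)
```

These numbers can then be quoted in the text next to the summary table and the histogram.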


4 Methodology

Recommended word count for the methodology section: 100 words.

What to include in the Methodology section: Discuss what estimation methods you will be using to analyze the data and to answer your research question.
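
As an illustration only, the demonstration regression used in this document (income regressed on education and age, estimated by ordinary least squares with lm) could be written out as follows. The symbols \beta_0, \beta_1, \beta_2 and \varepsilon_i are notation assumed here for the sketch; in your own project, replace the variables with those from your data.

```latex
\[
\text{income}_i = \beta_0 + \beta_1\,\text{education}_i + \beta_2\,\text{age}_i + \varepsilon_i ,
\qquad i = 1, \dots, 100
\]
```

Here \beta_1 is the expected change in income for one additional year of education, holding age constant, and \varepsilon_i is the error term.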

5 Results

Recommended word count for the results/discussion section: 500-1000 words.

What to include in the results section: Here, you need to display your results and discuss their meaning. Please find more information/examples below.

Display estimation results:

# Fit a regression model
model <- lm(income ~ education + age, data = data)

# Create a regression table
stargazer(model, type = "html", title = "Regression Results", digits = 3)

Regression Results

                       Dependent variable:
                       income
education              2,859.074***
                       (262.371)
age                    -101.376**
                       (46.920)
Constant               26,175.750***
                       (4,770.707)
Observations           100
R²                     0.562
Adjusted R²            0.553
Residual Std. Error    4,765.207 (df = 97)
F Statistic            62.131*** (df = 2; 97)
Note:                  *p<0.1; **p<0.05; ***p<0.01

You can also use various visualizations of your results:

# Scatter plot with regression line
ggplot(data, aes(x=education, y=income)) +
  geom_point(alpha=0.5) +
  geom_smooth(method="lm", se=FALSE, color="red") +
  theme_minimal() +
  labs(title="Effect of Education on Income",
       x="Years of Education",
       y="Income (GBP)")

Please make sure you do the following:

  • For each table or graph you display, include a brief discussion of what the results mean (interpretation, implications, statistical significance, etc.).

  • For example, for a simple regression, you would need to interpret the regression coefficients and discuss their implications.

  • Example: “Each additional year of education increases income by approximately £3,000.”
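
To support this kind of interpretation, it can help to pull the estimates out of the fitted model rather than reading them off the table by hand. The sketch below uses base R functions (summary, confint, coef) on the same simulated demonstration data; the names coef_table, ci and beta_education are illustrative and not part of the original template.

```r
# Simulated stand-in data, mirroring the demonstration above (not real survey data)
set.seed(123)
education <- rnorm(100, mean = 16, sd = 2)
income <- 20000 + 3000 * education + rnorm(100, mean = 0, sd = 5000)
age <- sample(25:60, 100, replace = TRUE)
data <- data.frame(education, income, age)

model <- lm(income ~ education + age, data = data)

# Estimates, standard errors, t-values and p-values in one matrix
coef_table <- summary(model)$coefficients
print(round(coef_table, 3))

# 95% confidence intervals for each coefficient
ci <- confint(model, level = 0.95)
print(round(ci, 3))

# Extract the education estimate so it can be quoted in the text
beta_education <- coef(model)["education"]
cat("One additional year of education is associated with a change in income of about",
    round(beta_education), "GBP, holding age constant.\n")
```

Reporting the confidence interval alongside the point estimate gives the reader a sense of how precisely the coefficient is estimated.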


Below, you can find some additional tips about how to generate tables:

  • Customizing Tables: Adding Labels and Notes
# Customize the table
stargazer(model, type = "html",
          title = "Customized Regression Table",
          dep.var.labels = "Income (GBP)",
          covariate.labels = c("Years of Education", "Age"),
          notes = "Standard errors in parentheses", digits = 3)

Customized Regression Table

                       Dependent variable:
                       Income (GBP)
Years of Education     2,859.074***
                       (262.371)
Age                    -101.376**
                       (46.920)
Constant               26,175.750***
                       (4,770.707)
Observations           100
R²                     0.562
Adjusted R²            0.553
Residual Std. Error    4,765.207 (df = 97)
F Statistic            62.131*** (df = 2; 97)
Note:                  *p<0.1; **p<0.05; ***p<0.01
                       Standard errors in parentheses

  • Comparing Different Estimation Models
# Fit another model
model2 <- lm(income ~ education, data = data)

# Compare models
stargazer(model, model2, type = "html",
          title = "Model Comparison",
          column.labels = c("Model 1", "Model 2"), digits = 3)

Model Comparison

                       Dependent variable: income
                       Model 1                   Model 2
                       (1)                       (2)
education              2,859.074***              2,868.821***
                       (262.371)                 (267.197)
age                    -101.376**
                       (46.920)
Constant               26,175.750***             21,584.850***
                       (4,770.707)               (4,350.615)
Observations           100                       100
R²                     0.562                     0.541
Adjusted R²            0.553                     0.536
Residual Std. Error    4,765.207 (df = 97)       4,853.574 (df = 98)
F Statistic            62.131*** (df = 2; 97)    115.278*** (df = 1; 98)
Note:                  *p<0.1; **p<0.05; ***p<0.01
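
Beyond placing the two models side by side, one possible way to compare them formally is an F-test for nested models using base R's anova(). This technique is an addition for illustration and is not part of the original template; the sketch assumes the smaller model (education only) is nested in the larger one (education and age), and the names comparison and adj_r2 are illustrative.

```r
# Simulated stand-in data, mirroring the demonstration above (not real survey data)
set.seed(123)
education <- rnorm(100, mean = 16, sd = 2)
income <- 20000 + 3000 * education + rnorm(100, mean = 0, sd = 5000)
age <- sample(25:60, 100, replace = TRUE)
data <- data.frame(education, income, age)

model  <- lm(income ~ education + age, data = data)  # larger model
model2 <- lm(income ~ education, data = data)        # nested, smaller model

# F-test of the null hypothesis that the extra term (age) adds nothing
comparison <- anova(model2, model)
print(comparison)

# Adjusted R-squared of both models, for a quick side-by-side check
adj_r2 <- c(model1 = summary(model)$adj.r.squared,
            model2 = summary(model2)$adj.r.squared)
print(round(adj_r2, 3))
```

A small p-value in the second row of the anova output suggests that the additional variable improves the fit; with a single added variable, this test is equivalent to the t-test on that variable's coefficient.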

6 Conclusion

Recommended word count for the conclusion: 100-200 words.

What to include in the conclusion:

Key Findings

  • Briefly summarize the main results.
  • Example: “Education has a significant positive effect on income.”

Policy Implications

  • Discuss how the findings can be used (if there are any such policy implications).
  • Example: “Investing in education can lead to higher earnings for individuals.”

Limitations

  • Acknowledge limitations.
  • Example: “The analysis does not account for unobserved factors like motivation.”