This document demonstrates what your project should look like. The results, tables and graphs I include here are only examples.
Each section includes a recommended word count to give you an idea of how much you should write.
The word count does not need to be exact, but avoid falling far below the limit or going far above it. For example, if I indicate that you can write anywhere between 100-300 words, do not submit that section with only 20 words or with 1000 words!
For each section, make sure that you include all the requested information.

1 Introduction

Recommended word count for the introduction: 100-200 words.

What to include in the introduction:

What is the Research Question?

Clearly state the research question.
Example: “What is the effect of education on income?”

Why Does It Matter?

Provide motivation for the research.
Example: “Understanding this relationship helps policymakers design better education policies.”

2 Literature Review

Recommended word count for the literature review: 100-300 words.

What to include in the literature review section: Discuss the literature that is relevant to your research topic (3-4 papers will be enough, but you can include more).

3 Data

Recommended word count for the data section: 100 words.

What to include in the data section:

General Information

Describe the dataset.
Example: “The data comes from the Crime Survey of England, Scotland and Wales.”

Available Variables

List key variables.
Example:
- education: Years of education
- income: Annual income in GBP
- age: Age of the respondent

Create a Summary Table

This is where you summarize some of the important variables that describe the data.

For example, what is the average age in your sample?

What is the average education level?

What does the distribution of income look like? etc.

# Load the package used to build the summary and regression tables
library(stargazer)

# Simulate data. This is for demonstration only: in your project you do not
# need to simulate anything, you simply use the actual crime data.
set.seed(123)
education <- rnorm(100, mean = 16, sd = 2)
income <- 20000 + 3000 * education + rnorm(100, mean = 0, sd = 5000)
age <- sample(25:60, 100, replace = TRUE)

# Use the data to generate summary statistics
data <- data.frame(education, income, age)
stargazer(data, type = "html", title = "Summary Statistics", digits = 3)

**Summary Statistics**

Statistic	N	Mean	St. Dev.	Min	Max

education	100	16.181	1.826	11.382	20.375
income	100	68,004.700	7,123.882	52,069.920	84,127.540
age	100	43.730	10.209	26	60
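
If you want to check the numbers in a summary table like the one above, the same statistics can be computed directly. The snippet below is only a sketch in base R: it re-creates the simulated demonstration data so that it runs on its own, and the object name `summary_stats` is made up for this illustration.

```r
# Re-create the demonstration data so this snippet runs on its own
set.seed(123)
education <- rnorm(100, mean = 16, sd = 2)
income <- 20000 + 3000 * education + rnorm(100, mean = 0, sd = 5000)
age <- sample(25:60, 100, replace = TRUE)
data <- data.frame(education, income, age)

# One row per variable, one column per statistic
summary_stats <- t(sapply(data, function(x) {
  c(N = length(x), Mean = mean(x), SD = sd(x), Min = min(x), Max = max(x))
}))
print(round(summary_stats, 3))
```

With your own data you would skip the simulation step and apply the same function to the variables you want to describe.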

# Load the plotting package
library(ggplot2)

# Plot a histogram of an important continuous variable of interest (e.g. income)
ggplot(data, aes(x = income)) +
  geom_histogram(binwidth = 5000, fill = "blue", alpha = 0.7) +
  theme_minimal() +
  labs(title = "Distribution of Income",
       x = "Income (GBP)",
       y = "Frequency")

4 Methodology

Recommended word count for the methodology section: 100 words.

What to include in the Methodology section: Discuss what estimation methods you will be using to analyze the data and to answer your research question.
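
It often helps to write the estimated model down as an equation. As an illustration only, the linear regression used in this demonstration (income on education and age) could be written as follows. The symbols are notation assumed here, not part of the original example: \(\beta_0, \beta_1, \beta_2\) are the coefficients and \(\varepsilon_i\) is the error term for respondent \(i\).

```latex
\[
\text{income}_i = \beta_0 + \beta_1\,\text{education}_i + \beta_2\,\text{age}_i + \varepsilon_i
\]
```

In your own project, replace the variables with the ones from your research question and state the estimation method (for example, ordinary least squares).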

5 Results

Recommended word count for the results/discussion section: 500-1000 words.

What to include in the results section: Here, you need to display your results and discuss their meaning. Please find more information/examples below.

Display estimation results:

# Fit a regression model
model <- lm(income ~ education + age, data = data)

# Create a regression table
stargazer(model, type = "html", title = "Regression Results", digits = 3)

**Regression Results**

	Dependent variable:

	income

education	2,859.074^***
	(262.371)

age	-101.376^**
	(46.920)

Constant	26,175.750^***
	(4,770.707)


Observations	100
R²	0.562
Adjusted R²	0.553
Residual Std. Error	4,765.207 (df = 97)
F Statistic	62.131^*** (df = 2; 97)

Note:	*p<0.1; **p<0.05; ***p<0.01

You can also use various visualizations of your results:

# Scatter plot with regression line
ggplot(data, aes(x=education, y=income)) +
  geom_point(alpha=0.5) +
  geom_smooth(method="lm", se=FALSE, color="red") +
  theme_minimal() +
  labs(title="Effect of Education on Income",
       x="Years of Education",
       y="Income (GBP)")

Please make sure you do the following:

For each table or graph you display, you should include a brief discussion of what the results mean (interpretation, implications, statistical significance, etc.).
For example, for a simple regression, you would need to interpret the regression coefficients and discuss their implications.
Example: “Holding age constant, each additional year of education is associated with approximately £2,900 higher annual income, and this estimate is statistically significant at the 1% level.”
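
To write an interpretation like this, you need the estimates, their standard errors and p-values. The snippet below is a minimal sketch of one way to pull them out of a fitted model in base R. It re-creates the simulated demonstration data so that it runs on its own, and the names `coef_table`, `ci`, `beta_edu` and `p_edu` are made up for this illustration.

```r
# Re-create the demonstration data and model so this snippet runs on its own
set.seed(123)
education <- rnorm(100, mean = 16, sd = 2)
income <- 20000 + 3000 * education + rnorm(100, mean = 0, sd = 5000)
age <- sample(25:60, 100, replace = TRUE)
data <- data.frame(education, income, age)
model <- lm(income ~ education + age, data = data)

# Coefficient table: estimate, standard error, t value and p-value
coef_table <- coef(summary(model))
print(round(coef_table, 3))

# 95% confidence intervals for each coefficient
ci <- confint(model, level = 0.95)
print(round(ci, 3))

# Turn the education estimate into an interpretation sentence
beta_edu <- coef_table["education", "Estimate"]
p_edu <- coef_table["education", "Pr(>|t|)"]
cat(sprintf(
  "One more year of education is associated with about GBP %.0f more annual income (p = %.3g).\n",
  beta_edu, p_edu
))
```

Reporting the confidence interval alongside the point estimate is optional, but it gives the reader a sense of how precise the estimate is.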

Below, you can find some additional tips about how to generate tables:

Customizing Tables: Adding Labels and Notes

# Customize the table
stargazer(model, type = "html",
          title = "Customized Regression Table",
          dep.var.labels = "Income (GBP)",
          covariate.labels = c("Years of Education", "Age"),
          notes = "Standard errors in parentheses", digits = 3)

**Customized Regression Table**

	Dependent variable:

	Income (GBP)

Years of Education	2,859.074^***
	(262.371)

Age	-101.376^**
	(46.920)

Constant	26,175.750^***
	(4,770.707)


Observations	100
R²	0.562
Adjusted R²	0.553
Residual Std. Error	4,765.207 (df = 97)
F Statistic	62.131^*** (df = 2; 97)

Note:	*p<0.1; **p<0.05; ***p<0.01
	Standard errors in parentheses

Comparing Different Estimation Models

# Fit another model
model2 <- lm(income ~ education, data = data)

# Compare models
stargazer(model, model2, type = "html",
          title = "Model Comparison",
          column.labels = c("Model 1", "Model 2"), digits = 3)

**Model Comparison**

	Dependent variable:

	income
	Model 1	Model 2
	(1)	(2)

education	2,859.074^***	2,868.821^***
	(262.371)	(267.197)

age	-101.376^**
	(46.920)

Constant	26,175.750^***	21,584.850^***
	(4,770.707)	(4,350.615)


Observations	100	100
R²	0.562	0.541
Adjusted R²	0.553	0.536
Residual Std. Error	4,765.207 (df = 97)	4,853.574 (df = 98)
F Statistic	62.131^*** (df = 2; 97)	115.278^*** (df = 1; 98)

Note:	*p<0.1; **p<0.05; ***p<0.01
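
When two models are nested, as they are here (Model 2 is Model 1 without age), you may also want a formal test to accompany the side-by-side table. The sketch below uses base R's `anova()` for an F-test and compares adjusted R². This is an optional addition rather than something the example above requires, and the names `comparison` and `adj_r2` are made up for this illustration.

```r
# Re-create the demonstration data and both models so this snippet runs on its own
set.seed(123)
education <- rnorm(100, mean = 16, sd = 2)
income <- 20000 + 3000 * education + rnorm(100, mean = 0, sd = 5000)
age <- sample(25:60, 100, replace = TRUE)
data <- data.frame(education, income, age)
model  <- lm(income ~ education + age, data = data)  # full model
model2 <- lm(income ~ education, data = data)        # restricted model

# F-test of the restricted model against the full model
comparison <- anova(model2, model)
print(comparison)

# Adjusted R-squared for each model
adj_r2 <- c(model = summary(model)$adj.r.squared,
            model2 = summary(model2)$adj.r.squared)
print(round(adj_r2, 3))
```

Because the two models differ by a single variable, this F-test is equivalent to the t-test on the age coefficient in the full model. It becomes more useful when you add or drop several variables at once.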

6 Conclusion

Recommended word count: 100-200 words.

What to include in the conclusion:

Key Findings

Briefly summarize the main results.
Example: “Education has a significant positive effect on income.”

Policy Implications

Discuss how the findings can be used (if there are any such policy implications).
Example: “Investing in education can lead to higher earnings for individuals.”

Limitations

Acknowledge limitations.
Example: “The analysis does not account for unobserved factors like motivation.”

Writing Results for an Empirical Project

Your name

2024-12-03