Ch. 1 - Data cleaning and summarizing with dplyr

The United Nations voting dataset

Filtering rows

Adding a year column

Adding a country column

Grouping and summarizing

Summarizing the full dataset

Summarizing by year

Summarizing by country

Sorting and filtering summarized data

Sorting by percentage of “yes” votes

Filtering summarized output


Ch. 2 - Data visualization with ggplot2

Visualization with ggplot2

Choosing an aesthetic

Plotting a line over time

Other ggplot2 layers

Visualizing by country

Summarizing by year and country

Plotting just the UK over time

Plotting multiple countries

Faceting

Faceting by country

Faceting with free y-axis

Choose your own countries


Ch. 3 - Tidy modeling with broom

Linear regression

Linear regression on the United States

Finding the slope of a linear regression

Finding the p-value of a linear regression

Tidying models with broom

Tidying a linear regression model

Combining models for multiple countries

Nesting for multiple models

Nesting a data frame

List columns

Unnesting

Fitting multiple models

Performing linear regression on each nested dataset

Tidying each linear regression model

Unnesting a data frame

Working with many tidy models

Filtering model terms

Filtering for significant countries

Sorting by slope


Ch. 4 - Joining and tidying

Joining datasets

Joining datasets with inner_join

Filtering the joined dataset

Visualizing colonialism votes

Tidy data

Tidy data observations

Using gather to tidy a dataset

Recoding the topics

Summarizing by country, year, and topic

Tidy modeling by topic and country

Nesting by topic and country

Interpreting tidy models

Checking models visually

Conclusion


About Michael Mallari

Michael is a hybrid thinker and doer, a byproduct of being a StrengthsFinder “Learner” over time. With nearly 20 years of engineering, design, and product experience, he helps organizations identify market needs, mobilize internal and external resources, and deliver delightful digital customer experiences that align with business goals. He has solved problems for brands ranging from Fortune 500 companies to early-stage startups to not-for-profit organizations.

Michael earned his BS in Computer Science from New York Institute of Technology and his MBA from the University of Maryland, College Park. He is also a candidate for an MS in Applied Analytics from Columbia University.

LinkedIn | Twitter | michaelmallari.com