Data Management and Visualizations

Introduction

In this blog post, I walk through the data management decisions I made for a simulated dataset of 1,500 records, then present frequency distributions and visualizations for three variables: Age, Income, and Education Level.

Step 1: Data Management Decisions

Handling Missing Data

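For the Age variable, missing values were imputed with the median of the observed ages. This keeps every record in the dataset instead of dropping rows with a missing age, and the median is less sensitive to extreme values than the mean. The trade-off is that all imputed records share a single value, which adds a spike at the median and somewhat understates the spread of Age.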
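
As a minimal sketch of this step, the snippet below imputes the median on a small toy vector and also keeps a flag for which entries were filled in. The vector and the names `age_was_missing`, `age_imputed`, `sd_observed`, and `sd_imputed` are illustrative assumptions, not part of the dataset used in this post.

```r
# Toy ages with two missing values (illustrative data, not the post's dataset)
age <- c(23, 35, NA, 51, 64, NA, 42)

# Record which entries were missing before filling them in
age_was_missing <- is.na(age)

# Median of the observed values only
age_median <- median(age, na.rm = TRUE)

# Replace the missing entries with that median
age_imputed <- ifelse(age_was_missing, age_median, age)

# Compare the spread before and after imputation
sd_observed <- sd(age, na.rm = TRUE)
sd_imputed <- sd(age_imputed)
```

Keeping the flag makes it possible to check later whether a pattern in the plots is driven by the imputed records.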

Variable Recoding

To provide a clearer understanding of age distribution, a new variable named ‘Age_Group’ was created by grouping ages into categories. This recoding allows for a more straightforward interpretation of the data.
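
Where a boundary age such as 25 ends up depends on the `right` argument of `cut()`. The sketch below uses a short, made-up vector of ages and a shortened set of breaks to compare the two conventions; the names `left_closed` and `right_closed` are only for illustration.

```r
# A few ages on or near the bin boundaries (illustrative values)
ages <- c(18, 24, 25, 26, 35)
breaks <- c(18, 25, 35, 45)

# right = FALSE: bins are closed on the left, i.e. [18,25) [25,35) [35,45)
left_closed <- cut(ages, breaks = breaks, right = FALSE)

# right = TRUE with include.lowest = TRUE: [18,25] (25,35] (35,45]
right_closed <- cut(ages, breaks = breaks, right = TRUE, include.lowest = TRUE)

# Side-by-side view of where each age lands
data.frame(ages, left_closed, right_closed)
```

With `right = FALSE`, an age of 25 falls into the bin that starts at 25; with `right = TRUE`, it stays in the bin that ends at 25. Whichever convention is used, the labels passed to `cut()` should describe those same intervals.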

# Load necessary libraries
library(tidyverse)
# Simulate the dataset (the seed is fixed so the results are reproducible)
set.seed(123)
n <- 1500
df <- data.frame(
  Age = sample(c(18:80), n, replace = TRUE),
  Income = sample(c(25000:100000), n, replace = TRUE),
  Education_Level = sample(c("High School", "Bachelor's", "Master's", "PhD"), n, replace = TRUE)
)

# Introduce missing values in Age
df$Age[sample(1:n, n/10)] <- NA

# Handle missing data in Age
df$Age[is.na(df$Age)] <- median(df$Age, na.rm = TRUE)

# Recode variables
# With right = FALSE each bin includes its lower bound and excludes its upper
# bound, i.e. [18,25), [25,35), ..., so the labels are written to match
df$Age_Group <- cut(
  df$Age,
  breaks = c(18, 25, 35, 45, 55, 65, 100),
  labels = c('18-24', '25-34', '35-44', '45-54', '55-64', '65+'),
  right = FALSE
)

Step 2: Run Frequency Distributions and Visualizations

Now, let’s explore the frequency distributions and create visualizations for the managed variables.

# Run frequency distributions
freq_age <- table(df$Age_Group)
freq_income <- table(df$Income)
freq_education <- table(df$Education_Level)

Frequency Distributions

Age Group

freq_age
## 
## 18-24 25-34 35-44 45-54 55-64   65+ 
##   138   236   219   381   209   317

The frequency distribution for Age Group reveals…

Income

freq_income
## 
## 25004 25205 25259 25351 25473 25527 25531 25542 25608 25662 25673 25717 25742 
##     1     1     1     1     1     1     1     1     1     1     1     1     1 
## [remaining rows omitted: the table continues through every distinct income value, and nearly all counts are 1]
## 99783 99826 99828 99850 99867 99883 99884 99921 
##     1     1     1     1     1     1     1     1

The frequency distribution for Income shows…
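
Because almost every income value occurs only once, a table of raw values says little about the shape of the distribution. One option is to group Income into bands before tabulating. The sketch below is self-contained and regenerates incomes with its own arbitrary seed, so its counts will not match the dataset above; the band edges and the names `income_band` and `freq_income_band` are assumptions chosen for illustration.

```r
set.seed(1)  # arbitrary seed for this sketch, not the seed used in the post
income <- sample(25000:100000, 1500, replace = TRUE)

# Group incomes into three 25,000-wide bands.
# right = FALSE closes each band on the left; include.lowest = TRUE also
# closes the top band on the right so that 100000 is not dropped.
income_band <- cut(
  income,
  breaks = c(25000, 50000, 75000, 100000),
  labels = c("25k-<50k", "50k-<75k", "75k-100k"),
  right = FALSE,
  include.lowest = TRUE
)

# Frequency distribution over the bands instead of the raw values
freq_income_band <- table(income_band)
freq_income_band
```

A histogram of Income would serve a similar purpose visually; the banded table is simply easier to quote in text.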

Education Level

freq_education
## 
##  Bachelor's High School    Master's         PhD 
##         356         360         411         373

The frequency distribution for Education Level indicates…

Visualizations

Histogram for Age

ggplot(df, aes(x = Age)) +
  geom_histogram(binwidth = 5, fill = "skyblue", color = "black") +
  labs(title = "Age Distribution", x = "Age", y = "Frequency")

Scatterplot of Income vs. Age

ggplot(df, aes(x = Age, y = Income)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "blue") +
  labs(title = "Scatterplot of Income vs. Age", x = "Age", y = "Income")
## `geom_smooth()` using formula = 'y ~ x'
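
Since Age and Income are sampled independently in this simulation, any slope in the fitted line reflects sampling noise, not a real relationship. A quick numeric check can accompany the plot. The sketch below regenerates comparable data with its own arbitrary seed (so the numbers will differ from the dataset above) and computes the correlation and the slope of the same kind of linear fit; the names `r`, `fit`, and `slope` are illustrative.

```r
set.seed(1)  # arbitrary seed for this sketch, not the seed used in the post
n <- 1500
age <- sample(18:80, n, replace = TRUE)
income <- sample(25000:100000, n, replace = TRUE)

# Pearson correlation between the two independently sampled variables
r <- cor(age, income)

# Slope of a simple linear regression of income on age,
# the same model that geom_smooth(method = "lm") draws
fit <- lm(income ~ age)
slope <- unname(coef(fit)["age"])

r
slope
```

With independent draws the correlation should be close to zero, so a nearly flat line in the scatterplot is the expected result here.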

Bar Chart for Education Level

ggplot(df, aes(x = Education_Level, fill = Education_Level)) +
  geom_bar() +
  labs(title = "Distribution of Education Level", x = "Education Level", y = "Count") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
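
By default, R sorts the categories of a character variable alphabetically, which is why Bachelor's appears before High School in the table and the bar chart. If an order by attainment is preferred, the levels can be set explicitly before plotting. The sketch below shows this on a small, made-up vector; the name `education_ordered` is an assumption for illustration.

```r
# A handful of education levels (illustrative values)
education <- c("PhD", "High School", "Master's", "Bachelor's", "High School")

# Default factor levels are sorted alphabetically
levels(factor(education))

# Set the levels explicitly so they run from lowest to highest attainment
education_ordered <- factor(
  education,
  levels = c("High School", "Bachelor's", "Master's", "PhD")
)

# Tables now follow the chosen order
levels(education_ordered)
table(education_ordered)
```

ggplot2 places discrete categories on the axis in factor-level order, so applying the same conversion to `df$Education_Level` before calling `ggplot()` would reorder the bars accordingly.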