Checking and updating Chart 2

Data wrangling

library(tidyverse)
Warning: package 'dplyr' was built under R version 4.2.3
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.3     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readxl)

sht <- readxl::read_excel(  
  path = here::here("data", "Stock+by+Tenure+-+Scotland+1993-2020+%26+Local+Authorities+2013-2020 (1).xls"), 
  sheet = "Tbl Stock by Tenure",
  range = "A7:O34",
  col_names = FALSE
)
New names:
• `` -> `...1`
• `` -> `...2`
• `` -> `...3`
• `` -> `...4`
• `` -> `...5`
• `` -> `...6`
• `` -> `...7`
• `` -> `...8`
• `` -> `...9`
• `` -> `...10`
• `` -> `...11`
• `` -> `...12`
• `` -> `...13`
• `` -> `...14`
• `` -> `...15`
names(sht) <- c(
  "Year",
  "Total number of dwellings (000s)",
  "Total number of occupied dwellings (000s)",
  "Total number of vacant dwellings (000s)",
  "Privately owned dwellings | Total number occupied dwellings",
  "Privately owned dwellings | Owner occupied | Number",
  "Privately owned dwellings | Owner occupied | Percentage",
  "Privately owned dwellings | Rented property with a job/business | Number",
  "Privately owned dwellings | Rented property with a job/business | Percentage",
  "Privately owned dwellings | Vacant private dwellings and second homes | Number",
  "Privately owned dwellings | Vacant private dwellings and second homes | Percentage",
  "Social rented dwellings | From housing associations | Number",
  "Social rented dwellings | From housing associations | Percentage",
  "Social rented dwellings | From local authorities, New Towns, Scottish Homes | Number",
  "Social rented dwellings | From local authorities, New Towns, Scottish Homes | Percentage"
)

sht |> 
  mutate(finyear_start = lubridate::my(Year))
# A tibble: 28 × 16
   Year     Total number of dwel…¹ Total number of occu…² Total number of vaca…³
   <chr>                     <dbl>                  <dbl>                  <dbl>
 1 Decembe…                  2193                     NA                    NA  
 2 Decembe…                  2210                     NA                    NA  
 3 Decembe…                  2230                     NA                    NA  
 4 Decembe…                  2248                     NA                    NA  
 5 Decembe…                  2266                     NA                    NA  
 6 Decembe…                  2283                     NA                    NA  
 7 Decembe…                  2303                     NA                    NA  
 8 Decembe…                  2322                     NA                    NA  
 9 March 2…                  2312.                  2212.                   99.6
10 March 2…                  2329.                  2229.                  100. 
# ℹ 18 more rows
# ℹ abbreviated names: ¹​`Total number of dwellings (000s)`,
#   ²​`Total number of occupied dwellings (000s)`,
#   ³​`Total number of vacant dwellings (000s)`
# ℹ 12 more variables:
#   `Privately owned dwellings | Total number occupied dwellings` <dbl>,
#   `Privately owned dwellings | Owner occupied | Number` <dbl>, …

The categories are different to previous version. No distinction between with and without mortgate. But let’s see what it shows regardless:

sht |> 
  pivot_longer(-Year, names_to = "Category", values_to = "value")
# A tibble: 392 × 3
   Year          Category                                                  value
   <chr>         <chr>                                                     <dbl>
 1 December 1993 Total number of dwellings (000s)                        2193   
 2 December 1993 Total number of occupied dwellings (000s)                 NA   
 3 December 1993 Total number of vacant dwellings (000s)                   NA   
 4 December 1993 Privately owned dwellings | Total number occupied dwel…   NA   
 5 December 1993 Privately owned dwellings | Owner occupied | Number     1217   
 6 December 1993 Privately owned dwellings | Owner occupied | Percentage   55.5 
 7 December 1993 Privately owned dwellings | Rented property with a job…  154   
 8 December 1993 Privately owned dwellings | Rented property with a job…    7.02
 9 December 1993 Privately owned dwellings | Vacant private dwellings a…   NA   
10 December 1993 Privately owned dwellings | Vacant private dwellings a…   NA   
# ℹ 382 more rows

Total number of dwellings by occupied or not occupied

sht |> 
  pivot_longer(-Year, names_to = "Category", values_to = "value") |> 
  filter(
    Category %in% c("Total number of dwellings (000s)", "Total number of occupied dwellings (000s)", "Total number of vacant dwellings (000s)")
  ) |> 
  mutate(
    cat = janitor::make_clean_names(Category)
  ) |> 
  mutate(finyear_start = lubridate::my(Year)) |> 
  ggplot(aes(x = finyear_start, y = value, group = Category, colour = Category, shape = Category)) + 
  geom_line() + geom_point() + 
  labs(
    x = "Year",
    y = "Number of dwellings (000s)",
    title = "Total number of dwellings over time", 
    caption = "Source: 1991 Census; NRS Dwellings Count; Scottish Household Survey"
  )
Warning: Removed 16 rows containing missing values (`geom_line()`).
Warning: Removed 16 rows containing missing values (`geom_point()`).

The total number of dwellings continues to rise. There appears no clear trend in the number of vacant dwellings over time.

How about proportion of each type?

cats_of_interest <-
  c(
    "Privately owned dwellings | Owner occupied | Number",
    "Privately owned dwellings | Rented property with a job/business | Number",
    "Privately owned dwellings | Vacant private dwellings and second homes | Number",
    "Social rented dwellings | From housing associations | Number",
    "Social rented dwellings | From local authorities, New Towns, Scottish Homes | Number"
  )



sht |> 
  pivot_longer(-Year, names_to = "Category", values_to = "value") |> 
  filter(
    Category %in% cats_of_interest
  ) |> 
  mutate(
    simpler_categories = case_when(
      Category == "Privately owned dwellings | Owner occupied | Number" ~ "Owner Occupied",
      Category == "Privately owned dwellings | Rented property with a job/business | Number" ~ "Private Rented",
      Category %in% c(
    "Social rented dwellings | From housing associations | Number",
    "Social rented dwellings | From local authorities, New Towns, Scottish Homes | Number"
      ) ~ "Social Rented",
    TRUE ~ "Other"
    )
  ) |> 
  group_by(Year, simpler_categories) |> 
  summarise(value = sum(value)) |> 
  ungroup() |> 
  mutate(finyear_start = lubridate::my(Year)) |> 
  ggplot(aes(x = finyear_start, y = value, group = simpler_categories, colour = simpler_categories, shape = simpler_categories)) + 
  geom_line() + geom_point() + 
  labs(
    x = "Year",
    y = "Number of dwellings (000s)",
    title = "Total number of dwellings over time", 
    caption = "Source: 1991 Census; NRS Dwellings Count; Scottish Household Survey"
  )
`summarise()` has grouped output by 'Year'. You can override using the
`.groups` argument.
Warning: Removed 8 rows containing missing values (`geom_line()`).
Warning: Removed 8 rows containing missing values (`geom_point()`).

The number of social rented dwellings fell over the 1990s and 2000s to reach around 600,000 since around 2000. The number of private rented dwellings increased over this period. Owner occupied dwellings show less of a clear trend, but remain the primary type of dwelling in Scotland.

Now as share:

cats_of_interest <-
  c(
    "Privately owned dwellings | Owner occupied | Number",
    "Privately owned dwellings | Rented property with a job/business | Number",
    "Privately owned dwellings | Vacant private dwellings and second homes | Number",
    "Social rented dwellings | From housing associations | Number",
    "Social rented dwellings | From local authorities, New Towns, Scottish Homes | Number"
  )



sht |> 
  pivot_longer(-Year, names_to = "Category", values_to = "value") |> 
  filter(
    Category %in% cats_of_interest
  ) |> 
  mutate(
    simpler_categories = case_when(
      Category == "Privately owned dwellings | Owner occupied | Number" ~ "Owner Occupied",
      Category == "Privately owned dwellings | Rented property with a job/business | Number" ~ "Private Rented",
      Category %in% c(
    "Social rented dwellings | From housing associations | Number",
    "Social rented dwellings | From local authorities, New Towns, Scottish Homes | Number"
      ) ~ "Social Rented",
    TRUE ~ "Other"
    )
  ) |> 
  group_by(Year, simpler_categories) |> 
  summarise(value = sum(value)) |> 
  ungroup() |> 
  mutate(finyear_start = lubridate::my(Year)) |> 
  ggplot(aes(x = finyear_start, y = value, group = simpler_categories, colour = simpler_categories, fill = simpler_categories)) + 
  geom_col(width = 364) + 
  labs(
    x = "Year",
    y = "Number of dwellings (000s)",
    title = "Total number of dwellings over time", 
    caption = "Source: 1991 Census; NRS Dwellings Count; Scottish Household Survey"
  )
`summarise()` has grouped output by 'Year'. You can override using the
`.groups` argument.
Warning: Removed 8 rows containing missing values (`position_stack()`).
Warning: `position_stack()` requires non-overlapping x intervals

This shows how the categories accumulate.

cats_of_interest <-
  c(
    "Privately owned dwellings | Owner occupied | Number",
    "Privately owned dwellings | Rented property with a job/business | Number",
    "Privately owned dwellings | Vacant private dwellings and second homes | Number",
    "Social rented dwellings | From housing associations | Number",
    "Social rented dwellings | From local authorities, New Towns, Scottish Homes | Number"
  )



sht |> 
  pivot_longer(-Year, names_to = "Category", values_to = "value") |> 
  filter(
    Category %in% cats_of_interest
  ) |> 
  mutate(
    simpler_categories = case_when(
      Category == "Privately owned dwellings | Owner occupied | Number" ~ "Owner Occupied",
      Category == "Privately owned dwellings | Rented property with a job/business | Number" ~ "Private Rented",
      Category %in% c(
    "Social rented dwellings | From housing associations | Number",
    "Social rented dwellings | From local authorities, New Towns, Scottish Homes | Number"
      ) ~ "Social Rented",
    TRUE ~ "Other"
    )
  ) |> 
  group_by(Year, simpler_categories) |> 
  summarise(value = sum(value)) |> 
  ungroup() |> 
  mutate(finyear_start = lubridate::my(Year)) |> 
  ggplot(aes(x = finyear_start, y = value, group = simpler_categories, colour = simpler_categories, fill = simpler_categories)) + 
  geom_col(width = 364, position = position_fill()) + 
  labs(
    x = "Year",
    y = "Proportion of dwellings",
    title = "The changing proportions of dwelling housing types over time", 
    caption = "Source: 1991 Census; NRS Dwellings Count; Scottish Household Survey"
  )
`summarise()` has grouped output by 'Year'. You can override using the
`.groups` argument.
Warning: Removed 8 rows containing missing values (`position_stack()`).
Warning: `position_stack()` requires non-overlapping x intervals

This is perhaps the most similar chart to the existing one.