Slope index of inequality

Julian Flowers

26 August 2015

Calculating the slope index of inequality (SII) in R

Read data

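The input is a data frame with one row per area, holding an area code and name, population, deprivation score, and one or more health measures. In this example the columns are Pr.code, CCG, Pop, IMD, Lem (male life expectancy) and Lef (female life expectancy).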

sii <- read.csv("~/Downloads/sii.csv")
head(sii)
##   Pr.code                                     CCG   Pop   IMD   Lem   Lef
## 1  A81001 NHS Hartlepool And Stockton-On-Tees CCG  4084 26.09 76.24 81.98
## 2  A81002 NHS Hartlepool And Stockton-On-Tees CCG 20111 28.44 77.12 81.81
## 3  A81003 NHS Hartlepool And Stockton-On-Tees CCG  3542 42.12 75.94 80.95
## 4  A81004                      NHS South Tees CCG  8513 31.01 76.72 81.36
## 5  A81005                      NHS South Tees CCG  7953 13.92 82.03 82.95
## 6  A81006 NHS Hartlepool And Stockton-On-Tees CCG 12221 29.81 76.76 81.40
dim(sii)
## [1] 7891    6

Load packages; look at the data summary; remove rows with missing values

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(broom)
library(Hmisc)
## Loading required package: lattice
## Loading required package: survival
## Loading required package: Formula
## Loading required package: ggplot2
## 
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:dplyr':
## 
##     combine, src, summarize
## The following objects are masked from 'package:base':
## 
##     format.pval, round.POSIXt, trunc.POSIXt, units
library(ggplot2)
sii <- tbl_df(sii)
summary(sii)
##     Pr.code                                              CCG      
##  A81001 :   1   NHS Northern, Eastern And Western Devon CCG: 123  
##  A81002 :   1   NHS Birmingham Crosscity CCG               : 115  
##  A81003 :   1   NHS Cambridgeshire and Peterborough CCG    : 106  
##  A81004 :   1   NHS Sandwell And West Birmingham CCG       : 105  
##  A81005 :   1   NHS Dorset CCG                             :  99  
##  A81006 :   1   NHS Liverpool CCG                          :  94  
##  (Other):7885   (Other)                                    :7249  
##       Pop             IMD             Lem             Lef       
##  Min.   :    0   Min.   : 2.86   Min.   :70.01   Min.   :76.61  
##  1st Qu.: 3700   1st Qu.:13.56   1st Qu.:77.16   1st Qu.:81.66  
##  Median : 6211   Median :21.41   Median :78.79   Median :83.04  
##  Mean   : 7053   Mean   :23.51   Mean   :78.69   Mean   :83.00  
##  3rd Qu.: 9552   3rd Qu.:31.74   3rd Qu.:80.32   3rd Qu.:84.34  
##  Max.   :52386   Max.   :68.47   Max.   :90.07   Max.   :91.88  
##                  NA's   :15      NA's   :260     NA's   :261
sii <- na.omit(sii)
summary(sii)
##     Pr.code                                              CCG      
##  A81001 :   1   NHS Northern, Eastern And Western Devon CCG: 120  
##  A81002 :   1   NHS Birmingham Crosscity CCG               : 112  
##  A81003 :   1   NHS Cambridgeshire and Peterborough CCG    : 104  
##  A81004 :   1   NHS Sandwell And West Birmingham CCG       : 100  
##  A81005 :   1   NHS Dorset CCG                             :  94  
##  A81006 :   1   NHS Liverpool CCG                          :  87  
##  (Other):7610   (Other)                                    :6999  
##       Pop             IMD             Lem             Lef       
##  Min.   :    0   Min.   : 2.86   Min.   :70.01   Min.   :76.61  
##  1st Qu.: 3703   1st Qu.:13.48   1st Qu.:77.16   1st Qu.:81.66  
##  Median : 6210   Median :21.23   Median :78.79   Median :83.04  
##  Mean   : 7058   Mean   :23.38   Mean   :78.69   Mean   :83.00  
##  3rd Qu.: 9564   3rd Qu.:31.43   3rd Qu.:80.31   3rd Qu.:84.34  
##  Max.   :52386   Max.   :68.47   Max.   :90.07   Max.   :91.88  
## 
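
The summary above shows a minimum Pop of 0, and na.omit() drops a row if any column is missing, including rows where only the other sex's life expectancy is missing. A minimal sketch of a narrower filter follows, assuming it is acceptable to drop zero-population rows and to filter per outcome. The data frame toy and its values are made up for illustration and are not taken from sii.csv.

```r
# Hypothetical toy data: one zero-population row, one missing IMD,
# and one row where only female life expectancy is missing.
toy <- data.frame(
  Pr.code = c("P1", "P2", "P3", "P4"),
  Pop     = c(0, 3700, 6210, 9550),
  IMD     = c(12.0, NA, 21.2, 31.4),
  Lem     = c(79.0, 77.2, 78.8, 80.3),
  Lef     = c(83.0, 81.7, NA, 84.3)
)

# Columns actually needed for the male life expectancy analysis
needed <- c("Pop", "IMD", "Lem")

# Keep rows that are complete on the needed columns and have a positive population
toy_male <- toy[complete.cases(toy[, needed]) & toy$Pop > 0, ]
toy_male$Pr.code
```

A zero-population row contributes nothing to the weighted fit but still counts as an observation, so removing it may change the standard errors slightly.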

Calculate mean IMD and life expectancy values per CCG (for use later)

aggsii <- summarise(group_by(sii, CCG), mean(IMD), mean(Lem), mean(Lef))
head(aggsii)
## # A tibble: 6 × 4
##                                      CCG `mean(IMD)` `mean(Lem)`
##                                   <fctr>       <dbl>       <dbl>
## 1 NHS Airedale, Wharfdale And Craven CCG    24.13765    77.61882
## 2                        NHS Ashford CCG    33.34071    78.61429
## 3                 NHS Aylesbury Vale CCG    15.43619    80.21000
## 4           NHS Barking And Dagenham CCG    21.22400    78.85450
## 5                         NHS Barnet CCG    13.09299    80.00672
## 6                       NHS Barnsley CCG    29.44405    77.43676
## # ... with 1 more variables: `mean(Lef)` <dbl>

Load the cumrank function (written by Hugh Mallinson)

source('~/Downloads/findXvals.R')
cumrank
## function (xvals) 
## {
##     prop <- xvals/sum(xvals)
##     cumprop <- numeric(length(xvals))
##     output <- numeric(length(xvals))
##     cumprop[1] <- prop[1]
##     output[1] <- prop[1]/2
##     for (i in 2:length(xvals)) {
##         cumprop[i] <- sum(prop[1:i])
##         output[i] <- prop[i]/2 + cumprop[i - 1]
##     }
##     FindXVals <- output
## }
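
For each row, cumrank returns the midpoint of that row's slice of the cumulative population share: everything before it plus half of its own share. The same quantity can be written without a loop. The sketch below is an alternative, not the sourced function; the name cumrank_vec is introduced here for illustration.

```r
# Vectorised sketch of the same midpoint calculation (hypothetical helper name).
# cumsum(prop) is the cumulative share up to and including each row;
# subtracting half of the row's own share gives the midpoint of its slice.
cumrank_vec <- function(xvals) {
  prop <- xvals / sum(xvals)
  cumsum(prop) - prop / 2
}

# Three areas with populations 1, 1 and 2: shares 0.25, 0.25, 0.5
cumrank_vec(c(1, 1, 2))
# A single area sits at the midpoint of the whole distribution
cumrank_vec(5)
```

One practical difference: in the loop version, 2:length(xvals) counts down to c(2, 1) when a group has a single row, so the result has length 2 and contains NA. The vectorised form returns 0.5 in that case.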

Calculate relative rank

sii1 <- sii %>% arrange(CCG, IMD) %>% group_by(CCG) %>% mutate(relrank = cumrank(Pop))
head(sii1); tail(sii1)
## Source: local data frame [6 x 7]
## Groups: CCG [1]
## 
##   Pr.code                                    CCG   Pop   IMD   Lem   Lef
##    <fctr>                                 <fctr> <int> <dbl> <dbl> <dbl>
## 1  B83006 NHS Airedale, Wharfdale And Craven CCG 10758  7.00 80.45 83.39
## 2  B83021 NHS Airedale, Wharfdale And Craven CCG 12748  9.84 77.68 78.64
## 3  B82028 NHS Airedale, Wharfdale And Craven CCG 13912 10.31 80.36 85.18
## 4  B83002 NHS Airedale, Wharfdale And Craven CCG  4526 10.70 80.17 85.45
## 5  B82099 NHS Airedale, Wharfdale And Craven CCG  4109 11.39 81.02 85.27
## 6  B82007 NHS Airedale, Wharfdale And Craven CCG  9309 12.04 75.77 80.71
## # ... with 1 more variables: relrank <dbl>
## Source: local data frame [6 x 7]
## Groups: CCG [1]
## 
##   Pr.code                 CCG   Pop   IMD   Lem   Lef   relrank
##    <fctr>              <fctr> <int> <dbl> <dbl> <dbl>     <dbl>
## 1  M81073 NHS Wyre Forest CCG  8503 13.82 79.63 85.55 0.6573086
## 2  M81068 NHS Wyre Forest CCG 13106 17.28 78.98 84.99 0.7535265
## 3  M81608 NHS Wyre Forest CCG  2921 18.94 80.71 85.69 0.8248896
## 4  M81090 NHS Wyre Forest CCG  3217 23.17 81.11 85.49 0.8522201
## 5  M81027 NHS Wyre Forest CCG  6922 44.47 76.46 82.30 0.8973658
## 6  M81015 NHS Wyre Forest CCG  8064 50.11 74.99 81.25 0.9640936

Set up variables for regression analysis (for SII for male life expectancy)

sii1 <- sii %>% arrange(CCG, IMD) %>% group_by(CCG) %>% mutate(relrank = cumrank(Pop), 
tot = sum(Pop), prop = Pop/tot, Y = sqrt(prop)*Lem, a = sqrt(prop), b = relrank * a)

head(sii1[,5:12])
## # A tibble: 6 × 8
##     Lem   Lef   relrank    tot       prop        Y         a           b
##   <dbl> <dbl>     <dbl>  <int>      <dbl>    <dbl>     <dbl>       <dbl>
## 1 80.45 83.39 0.0344245 156255 0.06884900 21.10935 0.2623909 0.009032676
## 2 77.68 78.64 0.1096413 156255 0.08158459 22.18775 0.2856302 0.031316860
## 3 80.36 85.18 0.1949506 156255 0.08903395 23.97826 0.2983856 0.058170435
## 4 80.17 85.45 0.2539503 156255 0.02896547 13.64433 0.1701925 0.043220422
## 5 81.02 85.27 0.2815814 156255 0.02629676 13.13843 0.1621628 0.045662013
## 6 75.77 80.71 0.3245176 156255 0.05957569 18.49404 0.2440813 0.079208690
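
The transformed variables amount to a population-weighted regression of life expectancy on relative rank. A sketch of the algebra follows; the symbols p_i, R_i and y_i are introduced here for illustration and correspond to prop, relrank and Lem in the code.

```latex
\begin{aligned}
p_i &= \frac{\mathrm{Pop}_i}{\sum_j \mathrm{Pop}_j},
\qquad
R_i = \sum_{j<i} p_j + \frac{p_i}{2}, \\[4pt]
\underbrace{\sqrt{p_i}\, y_i}_{Y}
 &= \beta_0 \underbrace{\sqrt{p_i}}_{a}
  + \beta_1 \underbrace{\sqrt{p_i}\, R_i}_{b}
  + \varepsilon_i, \\[4pt]
\sum_i \varepsilon_i^2
 &= \sum_i p_i \,\bigl(y_i - \beta_0 - \beta_1 R_i\bigr)^2 .
\end{aligned}
```

Because areas are sorted by ascending IMD within each CCG, R_i runs from near 0 (least deprived) to near 1 (most deprived). The coefficient on b is therefore the fitted difference in life expectancy across the full deprivation range, which is the SII.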

Calculate SIIs

# Fit one regression per CCG; "0 +" drops R's automatic intercept because a already plays that role
siiM <- sii1 %>% group_by(CCG) %>% do(tidy(lm(Y ~ 0 + a + b, data = .)))

# Keep the coefficient on b (the SII) and add approximate 95% CIs (estimate +/- 1.96 * SE)
siiM <- filter(siiM, term == "b") %>%
  mutate(lci = estimate - 1.96 * std.error,
         uci = estimate + 1.96 * std.error) %>%
  select(CCG, estimate, lci, uci)
 
head(siiM)
## Source: local data frame [6 x 4]
## Groups: CCG [6]
## 
##                                      CCG  estimate        lci       uci
##                                   <fctr>     <dbl>      <dbl>     <dbl>
## 1 NHS Airedale, Wharfdale And Craven CCG -6.388822  -9.381419 -3.396226
## 2                        NHS Ashford CCG -4.618791  -6.727257 -2.510325
## 3                 NHS Aylesbury Vale CCG -7.403199 -10.178371 -4.628027
## 4           NHS Barking And Dagenham CCG -6.277636  -7.384998 -5.170273
## 5                         NHS Barnet CCG -4.050214  -5.146874 -2.953554
## 6                       NHS Barnsley CCG -6.445051  -7.874487 -5.015614
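
As a cross-check, the no-intercept regression on the square-root-scaled variables should give the same slope as an ordinary weighted regression using lm()'s weights argument. The sketch below uses base R only and a made-up single-area data frame (toy); the values are illustrative and not taken from sii.csv.

```r
# Hypothetical data for one area, sorted by deprivation score
toy <- data.frame(
  Pop = c(4000, 12000, 3500, 8500, 8000),
  IMD = c(10.2, 18.5, 25.1, 31.0, 42.1),
  Lem = c(81.0, 80.1, 78.9, 77.5, 76.0)
)
toy <- toy[order(toy$IMD), ]

# Population share and relative rank (midpoint of cumulative share)
toy$prop    <- toy$Pop / sum(toy$Pop)
toy$relrank <- cumsum(toy$prop) - toy$prop / 2

# Transformed variables, as in the slides
toy$a <- sqrt(toy$prop)
toy$b <- toy$relrank * toy$a
toy$Y <- toy$a * toy$Lem

# Approach 1: no-intercept regression on the transformed variables
fit_transformed <- lm(Y ~ 0 + a + b, data = toy)

# Approach 2: weighted regression on the untransformed variables
fit_weighted <- lm(Lem ~ relrank, data = toy, weights = prop)

sii_transformed <- unname(coef(fit_transformed)["b"])
sii_weighted    <- unname(coef(fit_weighted)["relrank"])
c(transformed = sii_transformed, weighted = sii_weighted)
```

Both fits minimise the same weighted sum of squares, so the two slopes should agree up to floating-point error.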

Same for female LE

sii2 <- sii %>% arrange(CCG, IMD) %>% group_by(CCG) %>% mutate(relrank = cumrank(Pop), 
tot = sum(Pop), prop = Pop/tot, Y = sqrt(prop)*Lef, a = sqrt(prop), b = relrank * a)
# Fit one regression per CCG; "0 +" drops R's automatic intercept because a already plays that role
siiF <- sii2 %>% group_by(CCG) %>% do(tidy(lm(Y ~ 0 + a + b, data = .)))

# Keep the coefficient on b (the SII) and add approximate 95% CIs (estimate +/- 1.96 * SE)
siiF <- filter(siiF, term == "b") %>%
  mutate(lci = estimate - 1.96 * std.error,
         uci = estimate + 1.96 * std.error) %>%
  select(CCG, estimate, lci, uci)

head(siiF)
## Source: local data frame [6 x 4]
## Groups: CCG [6]
## 
##                                      CCG   estimate       lci         uci
##                                   <fctr>      <dbl>     <dbl>       <dbl>
## 1 NHS Airedale, Wharfdale And Craven CCG -3.9265523 -7.753242 -0.09986274
## 2                        NHS Ashford CCG -0.2193803 -1.826304  1.38754370
## 3                 NHS Aylesbury Vale CCG -3.7221070 -5.674428 -1.76978574
## 4           NHS Barking And Dagenham CCG -5.2663517 -6.472143 -4.06056026
## 5                         NHS Barnet CCG -3.2725833 -4.340735 -2.20443173
## 6                       NHS Barnsley CCG -4.4419603 -5.805199 -3.07872185
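
The male and female pipelines differ only in the outcome column, so the steps could be wrapped in one function. The sketch below is a base R alternative rather than the dplyr/broom code used in the slides. The names sii_one and sii_by_ccg and the toy data are introduced here for illustration; the column names CCG, Pop and IMD are assumed to match the input.

```r
# SII and approximate 95% CI for a single area group (hypothetical helper)
sii_one <- function(d, outcome) {
  d <- d[order(d$IMD), ]                   # least to most deprived
  prop <- d$Pop / sum(d$Pop)               # population share
  relrank <- cumsum(prop) - prop / 2       # midpoint of cumulative share
  a <- sqrt(prop)
  b <- relrank * a
  Y <- a * d[[outcome]]
  cf <- coef(summary(lm(Y ~ 0 + a + b)))   # coefficient table
  est <- cf["b", "Estimate"]
  se  <- cf["b", "Std. Error"]
  data.frame(estimate = est, lci = est - 1.96 * se, uci = est + 1.96 * se)
}

# Apply sii_one to every CCG and stack the results (hypothetical helper)
sii_by_ccg <- function(data, outcome) {
  pieces <- lapply(split(data, data$CCG, drop = TRUE), sii_one, outcome = outcome)
  data.frame(CCG = names(pieces), do.call(rbind, pieces), row.names = NULL)
}

# Made-up data for two areas, for illustration only
toy <- data.frame(
  CCG = rep(c("Area A", "Area B"), each = 4),
  Pop = c(4000, 12000, 3500, 8500, 8000, 5000, 9000, 2500),
  IMD = c(10, 20, 30, 40, 12, 18, 35, 50),
  Lem = c(81.0, 80.2, 78.5, 77.0, 80.5, 79.9, 78.0, 76.1),
  Lef = c(84.5, 84.0, 83.1, 82.0, 84.2, 83.8, 82.5, 81.0)
)
toy_male   <- sii_by_ccg(toy, "Lem")
toy_female <- sii_by_ccg(toy, "Lef")
toy_male
```

Passing the outcome as a column name keeps the ranking and weighting in one place, so a change to either applies to both sexes.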

Extract SIIs and plot

tidy(model)
##          term  estimate std.error statistic      p.value
## 1 (Intercept) -1.704137 0.1652248 -10.31405 2.156211e-20
## 2           x  0.886490 0.0427591  20.73220 2.064050e-52
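
The chunk that creates model is not shown. The x and y columns printed under "Compare distributions" match the female and male estimates from head(siiF) and head(siiM), so one plausible reading is a regression of male SII (y) on female SII (x). The sketch below follows that assumption using only the six pairs printed in this document, so its coefficients will not match the 209-CCG output above. The name sii_pairs is introduced here for illustration.

```r
# The six (female, male) SII pairs printed in this document
sii_pairs <- data.frame(
  x = c(-3.9265523, -0.2193803, -3.7221070, -5.2663517, -3.2725833, -4.4419603),
  y = c(-6.388822, -4.618791, -7.403199, -6.277636, -4.050214, -6.445051)
)

# Assumed model: male SII regressed on female SII
model <- lm(y ~ x, data = sii_pairs)

# Base R coefficient table; broom::tidy(model) returns the same numbers as a data frame
coef_table <- coef(summary(model))
coef_table

# Scatter plot with the fitted line (only drawn in an interactive session)
if (interactive()) {
  plot(y ~ x, data = sii_pairs,
       xlab = "SII, female life expectancy",
       ylab = "SII, male life expectancy")
  abline(model)
}
```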

Compare distributions

##            x         y
## 1 -3.9265523 -6.388822
## 2 -0.2193803 -4.618791
## 3 -3.7221070 -7.403199
## 4 -5.2663517 -6.277636
## 5 -3.2725833 -4.050214
## 6 -4.4419603 -6.445051

Show values

## # A tibble: 6 × 11
##                                      CCG  estimate        lci       uci
##                                   <fctr>     <dbl>      <dbl>     <dbl>
## 1 NHS Airedale, Wharfdale And Craven CCG -6.388822  -9.381419 -3.396226
## 2                        NHS Ashford CCG -4.618791  -6.727257 -2.510325
## 3                 NHS Aylesbury Vale CCG -7.403199 -10.178371 -4.628027
## 4           NHS Barking And Dagenham CCG -6.277636  -7.384998 -5.170273
## 5                         NHS Barnet CCG -4.050214  -5.146874 -2.953554
## 6                       NHS Barnsley CCG -6.445051  -7.874487 -5.015614
## # ... with 7 more variables: CCG.1 <fctr>, estimate.1 <dbl>, lci.1 <dbl>,
## #   uci.1 <dbl>, `mean(IMD)` <dbl>, `mean(Lem)` <dbl>, `mean(Lef)` <dbl>
## [1] 209  11
## Classes 'tbl_df', 'tbl' and 'data.frame':    209 obs. of  11 variables:
##  $ CCG       : Factor w/ 209 levels "NHS Airedale, Wharfdale And Craven CCG",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ estimate  : num  -6.39 -4.62 -7.4 -6.28 -4.05 ...
##  $ lci       : num  -9.38 -6.73 -10.18 -7.38 -5.15 ...
##  $ uci       : num  -3.4 -2.51 -4.63 -5.17 -2.95 ...
##  $ CCG.1     : Factor w/ 209 levels "NHS Airedale, Wharfdale And Craven CCG",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ estimate.1: num  -3.927 -0.219 -3.722 -5.266 -3.273 ...
##  $ lci.1     : num  -7.75 -1.83 -5.67 -6.47 -4.34 ...
##  $ uci.1     : num  -0.0999 1.3875 -1.7698 -4.0606 -2.2044 ...
##  $ mean(IMD) : num  24.1 33.3 15.4 21.2 13.1 ...
##  $ mean(Lem) : num  77.6 78.6 80.2 78.9 80 ...
##  $ mean(Lef) : num  82.1 83.5 84.1 82.7 83.9 ...
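
The code that builds this combined table is not shown. Column names such as CCG.1 and estimate.1 suggest the male and female results were bound side by side, which relies on both tables having the same row order. A minimal sketch of a key-based alternative using base R's merge() follows. It reuses three rows of values printed in this document; the object names sii_male, sii_female and combined, and the suffixes, are choices made here for illustration.

```r
# Male SII results for three CCGs (values as printed by head(siiM))
sii_male <- data.frame(
  CCG      = c("NHS Ashford CCG", "NHS Barnet CCG", "NHS Barnsley CCG"),
  estimate = c(-4.618791, -4.050214, -6.445051),
  lci      = c(-6.727257, -5.146874, -7.874487),
  uci      = c(-2.510325, -2.953554, -5.015614)
)

# Female SII results (values as printed by head(siiF)), deliberately in another order
sii_female <- data.frame(
  CCG      = c("NHS Barnsley CCG", "NHS Ashford CCG", "NHS Barnet CCG"),
  estimate = c(-4.4419603, -0.2193803, -3.2725833),
  lci      = c(-5.805199, -1.826304, -4.340735),
  uci      = c(-3.07872185, 1.38754370, -2.20443173)
)

# Join on CCG so rows are matched by name, with explicit suffixes per sex
combined <- merge(sii_male, sii_female, by = "CCG",
                  suffixes = c(".male", ".female"))
names(combined)
```

Joining on the CCG name avoids a duplicated key column and makes each estimate's sex explicit in the column name.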