R Markdown

Measuring Social Class

The aim is to define a scale that combines multiple factors: financial, occupational and educational measures, as well as social class as a social construct (alignment). The current factors are:

- NSECC Occupation Type
- SIMD Deprivation Index
- Education Level (ordinal scale)
- Household Income (ordinal scale)… potentially problematic
- Self-identification (ordinal)
** I have also considered parental occupation, but it is not included in the example below.

Each of these 5 factors could be given a weighting (a proportion) and combined into a final score. From this final score, each participant can then be assigned a social class.

Some details on the index factors:

Occupation Type (NSECC)

  • Already has an analytic class depending on occupation type: 1.1, 1.2, 2, 3, 4
  • Full-time students and occupations not stated/described are classed with an asterisk
  • Would need a workaround for the * (9?): could assign students as 9.
  • This metric would need to be inverted to match all the others, where 1 is the lowest
  • This is done by reversing the scale in the code.
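
The recoding described in the bullets above might look something like the sketch below. The raw codes are made up for illustration, and collapsing 1.1 and 1.2 into a single class 1 is an assumption rather than a decision taken in these notes.

```r
# Made-up raw NSECC codes, stored as text so that "*" can be represented
nsecc_raw <- c("7", "4", "*", "1.2", "2")

# Recode "*" as 9 (the workaround suggested above) and convert the rest
nsecc_num <- rep(NA_real_, length(nsecc_raw))
is_star <- nsecc_raw == "*"
nsecc_num[is_star] <- 9
# Assumption: 1.1 and 1.2 are both treated as class 1
nsecc_num[!is_star] <- floor(as.numeric(nsecc_raw[!is_star]))

# Reverse the scale so that 1 is the lowest and 9 the highest
nsecc_inverted <- (9 + 1) - nsecc_num
print(nsecc_inverted)
```

One consequence worth checking: with this workaround, anyone coded 9 (e.g. students) lands on 1, the lowest value, after the reversal, which may or may not be the intended placement.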

SIMD - Deprivation

  • With given postcode, can get a ranking of deprivation.
  • 1 = most deprived and goes to 10 in the overall rank.

Educational Level

  • ‘What is your highest educational level?’
  • Can create an ordinal scale from this:
  • 1 = Did not leave high school with qualifications
  • 2 = High school qualification
  • 3 = College qualification
  • 4 = Undergraduate degree
  • 5 = Postgraduate degree
  • 6 = PhD
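
One way to turn the survey answers into this ordinal scale is an ordered factor. This is only a sketch: the exact answer wording in `edu_levels` and the example `responses` are assumptions for illustration.

```r
# Answer options in ascending order (wording is assumed, not final)
edu_levels <- c(
  "Did not leave high school with qualifications",
  "High school qualification",
  "College qualification",
  "Undergraduate degree",
  "Postgraduate degree",
  "PhD"
)

# Made-up survey responses
responses <- c("Postgraduate degree", "High school qualification", "PhD")

# The integer code of an ordered factor is its position in edu_levels (1 to 6)
education <- as.integer(factor(responses, levels = edu_levels, ordered = TRUE))
print(education)
```

Any response that does not match one of the listed levels becomes NA, which makes unexpected answers easy to spot.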


Household Income

  • Create an ordinal scale for household income
  • 1 = below 15k
  • 2 = 16-25k (and so on upwards in bands)

Self-identification

  • ‘How would you describe your social class’

  • I have asked this as an open-box question before and it left participants deliberating a lot. It was more successful when they were given a drop-down box to choose from.

  • Could use more fine-grained options: lower WC, upper WC, lower MC, upper MC, lower UC, upper UC, not sure

  • From this, ordinal values are assigned for use in the score.
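
A small lookup vector is one way to assign those ordinal values. The sketch below assumes the six labels map to 1 to 6 in the order listed, and that "not sure" is treated as missing; both are assumptions, not settled choices.

```r
# Assumed mapping from drop-down label to ordinal value
self_scale <- c(
  "lower WC" = 1, "upper WC" = 2,
  "lower MC" = 3, "upper MC" = 4,
  "lower UC" = 5, "upper UC" = 6
)

# Made-up answers; "not sure" is not in the lookup, so it comes back as NA
answers <- c("upper WC", "not sure", "lower MC")
self_score <- unname(self_scale[answers])
print(self_score)
```

How to handle the NA for "not sure" (drop the participant, or reweight the remaining factors) would still need deciding.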


Let’s get started!

sc <- data.frame(
  PPS = c("LJ", "SW", "KJ", "RJ", "TJ"),
  NSECC = c(7, 4, 3, 8, 2), # this will need to be inverted so that 1 = lowest
  SIMD = c(1, 1, 9, 3, 7),
  Education = c(5, 5, 4, 2, 6),
  HI = c(4, 4, 5, 1, 3),
  Self = c(2, 2, 3, 6, 1)
)
# Print the dataframe
print(sc)
##   PPS NSECC SIMD Education HI Self
## 1  LJ     7    1         5  4    2
## 2  SW     4    1         5  4    2
## 3  KJ     3    9         4  5    3
## 4  RJ     8    3         2  1    6
## 5  TJ     2    7         6  3    1

Starting to convert the details

# Invert the NSECC score so that 1 = lowest, rather than the current 9 = lowest
sc$NSECC <- (9 + 1) - sc$NSECC

# Standardising the scores across the ranking systems
  # Maximum possible scores for each factor

max_scores <- c(
  NSECC = 9,
  SIMD = 10,
  Education = 6,
  Parent_occ = 9,   # parental occupation: not used in the example below
  HI = 5,
  Self = 6
)

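Normalise the raw scores by dividing each by its factor's maximum, so the top of every scale maps to 1. Note that the lowest raw score of 1 maps to 1/max rather than 0, so the floor differs slightly between factors.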

#Normalising the scores
sc$NSECC_norm <- sc$NSECC / max_scores["NSECC"]
sc$SIMD_norm <- sc$SIMD / max_scores["SIMD"]
sc$Education_norm <- sc$Education / max_scores["Education"]
sc$HI_norm <- sc$HI / max_scores["HI"]
sc$Self_norm <- sc$Self / max_scores["Self"]
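
Dividing by the maximum means the lowest attainable value differs by factor (for example 1/5 for HI but 1/10 for SIMD). If a true 0-1 range is wanted, min-max rescaling is one alternative. The helper `rescale_01` below is a hypothetical name and only a sketch; it assumes every scale starts at 1.

```r
max_scores <- c(NSECC = 9, SIMD = 10, Education = 6, HI = 5, Self = 6)

# Lowest attainable normalised value under the score / max approach
lowest_divide <- 1 / max_scores
print(round(lowest_divide, 3))

# Hypothetical helper: min-max rescaling, so the bottom of every scale is 0
rescale_01 <- function(x, max_score, min_score = 1) {
  (x - min_score) / (max_score - min_score)
}

# Household income scores 1, 3 and 5 on a 1-5 scale
print(rescale_01(c(1, 3, 5), max_score = max_scores[["HI"]]))
```

Switching to min-max rescaling would change the final scores, so the class cut-offs would need revisiting if it were adopted.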

Now to think about what factors have the biggest contribution towards overall score…hmmm

Note, I’m not sure whether we need to give self-identification a higher weighting or not?

It would be easy to go down a rabbit hole here: social class as a performative identity, where a speaker's self-realisation of class inherently shapes the semiotic resources (linguistic variables/perception) they project as aligning with that class.

# Assign weights of factors as percentages
weights <- c(
  NSECC = 0.30,      # Occupation = 30%
  SIMD = 0.20,       # Deprivation index = 20%
  HI = 0.15,         # Household income = 15%
  #Parent_occ = 0.10,
  Education = 0.15,  # Education level = 15%
  Self = 0.20        # Self-identification = 20% - this may need to rise, as it's a performative measure
)

##Check that the weights do indeed sum to 1, because I struggle with basic maths!

print(sum(weights)) #hoorah!
## [1] 1
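
Since decimal weights are stored as floating-point numbers, a tolerance-based check is a little safer than reading the printed total. The sketch below also shows one possible way to rescale the weights if `Parent_occ` were switched back on at 0.10; that rescaling is an assumption for illustration, not a decision made here.

```r
weights <- c(NSECC = 0.30, SIMD = 0.20, HI = 0.15, Education = 0.15, Self = 0.20)

# Stops with an error if the weights do not sum to 1 (within tolerance)
stopifnot(isTRUE(all.equal(sum(weights), 1)))

# Hypothetical: add parental occupation, then rescale so the total is 1 again
weights_with_parent <- c(weights, Parent_occ = 0.10)
weights_with_parent <- weights_with_parent / sum(weights_with_parent)
print(round(weights_with_parent, 3))
```

Rescaling keeps the relative importance of the existing factors the same, while shrinking each one slightly to make room for the new factor.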

Calculating final score with the weighted factors, using normalised values

# Calculate the weighted score using normalized values
sc$FinalScore <- with(sc,
                      NSECC_norm * weights["NSECC"] +
                        SIMD_norm * weights["SIMD"] +
                        Education_norm * weights["Education"] +
                        HI_norm * weights["HI"] +
                        Self_norm * weights["Self"]
)

print(sc)
##   PPS NSECC SIMD Education HI Self NSECC_norm SIMD_norm Education_norm HI_norm
## 1  LJ     3    1         5  4    2  0.3333333       0.1      0.8333333     0.8
## 2  SW     6    1         5  4    2  0.6666667       0.1      0.8333333     0.8
## 3  KJ     7    9         4  5    3  0.7777778       0.9      0.6666667     1.0
## 4  RJ     2    3         2  1    6  0.2222222       0.3      0.3333333     0.2
## 5  TJ     8    7         6  3    1  0.8888889       0.7      1.0000000     0.6
##   Self_norm FinalScore
## 1 0.3333333  0.4316667
## 2 0.3333333  0.5316667
## 3 0.5000000  0.7633333
## 4 1.0000000  0.4066667
## 5 0.1666667  0.6800000
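
The same weighted sum can be written as a matrix product, which avoids editing the formula every time a factor is added or removed. This is a sketch using the example values above (with NSECC already inverted); matching everything by name through `factors` is a design choice of the sketch, not part of the original code.

```r
sc <- data.frame(
  PPS = c("LJ", "SW", "KJ", "RJ", "TJ"),
  NSECC = c(3, 6, 7, 2, 8),      # already inverted, as in the output above
  SIMD = c(1, 1, 9, 3, 7),
  Education = c(5, 5, 4, 2, 6),
  HI = c(4, 4, 5, 1, 3),
  Self = c(2, 2, 3, 6, 1)
)
max_scores <- c(NSECC = 9, SIMD = 10, Education = 6, HI = 5, Self = 6)
weights <- c(NSECC = 0.30, SIMD = 0.20, HI = 0.15, Education = 0.15, Self = 0.20)

# Use the weight names to pick and order the columns consistently
factors <- names(weights)

# Divide each column by its maximum, then take the weighted sum per row
norm <- sweep(as.matrix(sc[, factors]), 2, max_scores[factors], "/")
final_score <- as.vector(norm %*% weights[factors])
print(round(final_score, 4))
```

Rounded to four decimal places, this reproduces the FinalScore column printed above.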

The final score

Current thinking: the upper 20% of the scale (0.80 and above) will be upper class. The remaining scores are split into 4 categories across working class and middle class.

Now let’s classify our participants’ social class, based on the updated rules below:

sc$SocialClass <- ifelse(
  sc$FinalScore >= 0.80, "Upper-Class",  # Upper Class: 0.80 and above
  ifelse(sc$FinalScore <= 0.2, "Lower_Working-Class",  # Lower Working Class: 0.20 and below
         ifelse(sc$FinalScore <= 0.4, "Upper_Working-Class",  # Upper Working Class: above 0.20, up to 0.40
                ifelse(sc$FinalScore <= 0.6, "Lower_Middle-Class",  # Lower Middle Class: above 0.40, up to 0.60
                       "Upper_Middle-Class"  # Upper Middle Class: above 0.60, below 0.80
                )
         )
  )
)

#Tadaaaaa… Here, participants are given a final score, and then a corresponding social class. I have entered my own data (LJ) and the result seems quite accurate. But we could wiggle the weightings of the factors around, and add or delete some of the factors.

# Print the dataframe with SocialClass assignment
print(sc)
##   PPS NSECC SIMD Education HI Self NSECC_norm SIMD_norm Education_norm HI_norm
## 1  LJ     3    1         5  4    2  0.3333333       0.1      0.8333333     0.8
## 2  SW     6    1         5  4    2  0.6666667       0.1      0.8333333     0.8
## 3  KJ     7    9         4  5    3  0.7777778       0.9      0.6666667     1.0
## 4  RJ     2    3         2  1    6  0.2222222       0.3      0.3333333     0.2
## 5  TJ     8    7         6  3    1  0.8888889       0.7      1.0000000     0.6
##   Self_norm FinalScore        SocialClass
## 1 0.3333333  0.4316667 Lower_Middle-Class
## 2 0.3333333  0.5316667 Lower_Middle-Class
## 3 0.5000000  0.7633333 Upper_Middle-Class
## 4 1.0000000  0.4066667 Lower_Middle-Class
## 5 0.1666667  0.6800000 Upper_Middle-Class
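
If the cut-offs are likely to be wiggled around, the nested `ifelse()` could be swapped for a small function built on `findInterval()`. The function name `classify_class` is hypothetical, and this is a sketch that aims to reproduce the same boundaries as the rules above rather than a replacement for them.

```r
# Hypothetical helper reproducing the banding rules used above
classify_class <- function(score) {
  labels <- c("Lower_Working-Class", "Upper_Working-Class",
              "Lower_Middle-Class", "Upper_Middle-Class")
  # left.open = TRUE closes each band on the right: <= 0.2, (0.2, 0.4], (0.4, 0.6], > 0.6
  band <- findInterval(score, c(0.2, 0.4, 0.6), left.open = TRUE) + 1
  out <- labels[band]
  # Scores of 0.80 and above override the middle-class label
  out[score >= 0.80] <- "Upper-Class"
  out
}

# Final scores from the example participants
scores <- c(LJ = 0.4316667, SW = 0.5316667, KJ = 0.7633333, RJ = 0.4066667, TJ = 0.68)
print(classify_class(scores))
```

With the cut-offs held in one vector, changing a boundary means editing a single number instead of a nested condition.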