ess <- read.csv("C:/Users/13640/Desktop/ESS11e04_1.csv", header = TRUE)
This study examines the relationship between educational attainment and subjective evaluations of household income. Unlike in many East Asian societies, where educational credentials often function as strict entry requirements for the labor market and are strongly linked to improvements in income and living standards, European countries tend to have more diverse education systems and employment structures. As a result, higher education does not necessarily translate directly into higher economic returns. Using cross-national European survey data, this study therefore explores whether higher levels of education are still consistently associated with a stronger sense of financial well-being in the European context, or whether the association is more limited and shaped by other factors.
To maintain the broadest possible geographical coverage, the sample was constructed by grouping observations by country and randomly selecting up to five respondents from each country. This approach ensures that all available countries are represented in the analysis, rather than allowing the results to be driven by a small number of countries with large sample sizes.
By controlling the overall sample size while preserving cross-national diversity, this sampling strategy allows the association rule analysis to capture patterns that are common across different national contexts, rather than reflecting the characteristics of any single country. As a result, the extracted rules are better suited for describing general relationships between education and subjective income perceptions at the European level.
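# Keep only the two variables of interest
data <- ess[, c("hincfel", "eisced")]
set.seed(123)  # make the random draw reproducible
# Split by country, draw up to five rows per country, and recombine
data <- do.call(
  rbind,
  lapply(
    split(data, ess$cntry),
    function(x) x[sample(nrow(x), min(5, nrow(x))), ]
  )
)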
str(data)
## 'data.frame': 150 obs. of 2 variables:
## $ hincfel: int 2 1 2 1 1 1 8 2 1 2 ...
## $ eisced : int 3 5 5 2 5 7 1 5 7 6 ...
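The per-country draw can be checked on a small example. The sketch below uses a made-up data frame (the country codes `AA`, `BB`, `CC`, the values, and the helper name `sample_per_group` are hypothetical, not taken from the survey file) and confirms that no group contributes more than five rows, while a group with fewer than five rows is kept in full.

```r
# Toy data standing in for the survey file (hypothetical values, not ESS data)
set.seed(123)
toy <- data.frame(
  cntry   = rep(c("AA", "BB", "CC"), times = c(12, 3, 8)),
  hincfel = sample(1:4, 23, replace = TRUE),
  eisced  = sample(1:7, 23, replace = TRUE)
)

# Draw at most k rows from each group (hypothetical helper)
sample_per_group <- function(df, group, k) {
  parts <- lapply(split(df, group), function(x) {
    x[sample(nrow(x), min(k, nrow(x))), , drop = FALSE]
  })
  do.call(rbind, parts)
}

toy_sample  <- sample_per_group(toy, toy$cntry, k = 5)
per_country <- table(toy_sample$cntry)
print(per_country)  # AA and CC are capped at 5 rows, BB keeps its 3 rows
```

The 150 rows reported above are consistent with 30 countries contributing five respondents each, although the output alone does not confirm the number of countries.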
The variables were relabeled and converted into factor form prior to the analysis. Specifically, hincfel was recoded into four categories describing subjective income conditions, and eisced was recoded into seven levels of educational attainment. Codes outside these ranges (for example, the value 8 that appears in hincfel) receive no label and become missing values. This step improves the interpretability of the variables and makes it easier to relate different education levels to income perceptions. In addition, converting the variables into factors allows them to be treated as discrete items, which is necessary for the subsequent association rule analysis.
data$hincfel <- factor(
  data$hincfel,
  levels = c(1, 2, 3, 4),
  labels = c(
    "Living comfortably",
    "Coping",
    "Difficult",
    "Very difficult"
  )
)
data$eisced <- factor(
  data$eisced,
  levels = c(1, 2, 3, 4, 5, 6, 7),
  labels = c(
    "Less than lower secondary",
    "Lower secondary",
    "Upper secondary",
    "Post-secondary non-tertiary",
    "Short-cycle tertiary",
    "Bachelor or equivalent",
    "Master or above"
  )
)
str(data)
## 'data.frame': 150 obs. of 2 variables:
## $ hincfel: Factor w/ 4 levels "Living comfortably",..: 2 1 2 1 1 1 NA 2 1 2 ...
## $ eisced : Factor w/ 7 levels "Less than lower secondary",..: 3 5 5 2 5 7 1 5 7 6 ...
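The seventh value of hincfel changes from 8 to NA between the two `str()` outputs. A minimal sketch of why, using a short made-up vector of codes (the vector `codes` is illustrative only): `factor()` turns any value not listed in `levels` into NA.

```r
# Illustrative codes; 8 is not among the declared levels
codes <- c(2, 1, 8, 4, 3)

labelled <- factor(
  codes,
  levels = 1:4,
  labels = c("Living comfortably", "Coping", "Difficult", "Very difficult")
)
print(labelled)

# Number of non-missing codes that were dropped by the recoding
n_dropped <- sum(is.na(labelled) & !is.na(codes))
print(n_dropped)
```

Counting the dropped codes in this way is a simple check that the recoding only removes values intended to be treated as missing.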
Looking at the frequency distribution, Living comfortably and Coping are the two most common income perception categories, with 61 respondents each; together they account for 122 of the 150 observations (about 81%). This suggests that, across the European countries covered in the sample, most respondents do not perceive their household financial situation as seriously difficult. In contrast, only 20 respondents describe their situation as Difficult and 6 as Very difficult, indicating that strong subjective financial strain is less common in the overall sample.
In addition, the variable contains only two missing values out of 150, too few to reveal any pattern of systematic missingness. As a result, the data remain usable without the need for complex missing-value treatments. This distribution also supports the use of hincfel as the outcome variable in the association rule analysis, as it allows differences in income perceptions across educational groups to be identified.
library(arules)  # provides the "transactions" class, apriori(), and inspect()
trans <- as(data, "transactions")
summary(trans)
## transactions as itemMatrix in sparse format with
## 150 rows (elements/itemsets/transactions) and
## 11 columns (items) and a density of 0.1793939
##
## most frequent items:
## hincfel=Living comfortably hincfel=Coping
## 61 61
## eisced=Post-secondary non-tertiary eisced=Master or above
## 40 30
## eisced=Lower secondary (Other)
## 27 77
##
## element (itemset/transaction) length distribution:
## sizes
## 1 2
## 4 146
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 2.000 2.000 1.973 2.000 2.000
##
## includes extended item information - examples:
## labels variables levels
## 1 hincfel=Living comfortably hincfel Living comfortably
## 2 hincfel=Coping hincfel Coping
## 3 hincfel=Difficult hincfel Difficult
##
## includes extended transaction information - examples:
## transactionID
## 1 AT.2227
## 2 AT.526
## 3 AT.195
head(data$hincfel)
## [1] Coping Living comfortably Coping Living comfortably
## [5] Living comfortably Living comfortably
## Levels: Living comfortably Coping Difficult Very difficult
str(data$hincfel)
## Factor w/ 4 levels "Living comfortably",..: 2 1 2 1 1 1 NA 2 1 2 ...
summary(data$hincfel)
## Living comfortably Coping Difficult Very difficult
## 61 61 20 6
## NA's
## 2
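The shares behind the frequency discussion can be derived directly from the counts printed by `summary(data$hincfel)`. The sketch below re-enters those counts by hand (the vector name `counts` is an illustrative choice) and converts them into proportions.

```r
# Counts copied from the summary() output above
counts <- c(
  "Living comfortably" = 61,
  "Coping"             = 61,
  "Difficult"          = 20,
  "Very difficult"     = 6,
  "NA"                 = 2
)

# Share of each category among all 150 sampled respondents
shares <- round(counts / sum(counts), 3)
print(shares)

# Combined share of the two most common categories
top_two <- sum(counts[c("Living comfortably", "Coping")]) / sum(counts)
print(round(top_two, 3))
```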
During the generation of association rules, items starting with hincfel= were first identified from the transaction labels, so that subjective household income (hincfel) was explicitly set as the outcome variable on the right-hand side (RHS) of the rules. The Apriori algorithm was then applied with a minimum support of 0.01, a minimum confidence of 0.1, and a minimum rule length of two, ensuring that each rule includes at least one explanatory condition and one outcome.
Based on the algorithm output, a total of 18 valid association rules were generated from a sample of 150 transactions and 11 items.
rhs_items <- itemLabels(trans)[
  grepl("^hincfel=", itemLabels(trans))
]
rules <- apriori(
  trans,
  parameter = list(
    supp = 0.01,
    conf = 0.1,
    minlen = 2
  ),
  appearance = list(
    rhs = rhs_items,
    default = "lhs"
  )
)
## Apriori
##
## Parameter specification:
## confidence minval smax arem aval originalSupport maxtime support minlen
## 0.1 0.1 1 none FALSE TRUE 5 0.01 2
## maxlen target ext
## 10 rules TRUE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## Absolute minimum support count: 1
##
## set item appearances ...[4 item(s)] done [0.00s].
## set transactions ...[11 item(s), 150 transaction(s)] done [0.00s].
## sorting and recoding items ... [11 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 done [0.00s].
## writing ... [18 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].
Looking at the overall distribution, most of the generated association rules show relatively low support. This is partly due to the limited sample size, and partly a consequence of the sampling strategy, which deliberately aims to maintain broad geographical coverage. As a result, specific combinations of education level and income perception are infrequent in the overall sample, and several of the retained rules rest on only two respondents. Low support should therefore not be interpreted as a lack of relevance, but it does mean that individual rules must be read with caution.
In terms of confidence, the six highest-confidence rules show medium to relatively high values (approximately 0.47 to 0.64). This indicates that, given certain educational conditions, one income perception is clearly more frequent than the others. In other words, education level shows some ability to differentiate subjective income evaluations, although with group sizes of only 13 to 40 respondents these proportions remain imprecise.
With respect to lift, at least nine of the 18 rules have lift values greater than 1, suggesting that the observed relationships are not driven solely by marginal distributions, but involve a degree of non-random co-occurrence. Rules with high lift but low support tend to reflect relative advantages or risks within specific groups, whereas rules with moderate lift and higher support are more informative for describing broader patterns in the sample. The highest lift in the results (3.85, for Upper secondary ⇒ Very difficult) is based on only two respondents.
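To make the three measures concrete, the sketch below recomputes them by hand for the rule {Master or above} ⇒ {Living comfortably}, using counts reported in the outputs: 18 respondents match both sides, 30 hold a master's degree or above, 61 report living comfortably, and there are 150 transactions. The helper name `rule_metrics` is an illustrative assumption and is not part of arules.

```r
# Support, confidence and lift from raw counts (hypothetical helper)
rule_metrics <- function(n_both, n_lhs, n_rhs, n_total) {
  support    <- n_both / n_total              # share of all transactions matching both sides
  confidence <- n_both / n_lhs                # share of LHS transactions that also match the RHS
  lift       <- confidence / (n_rhs / n_total)  # confidence relative to the RHS base rate
  c(support = support, confidence = confidence, lift = lift)
}

m <- rule_metrics(n_both = 18, n_lhs = 30, n_rhs = 61, n_total = 150)
print(round(m, 4))  # support 0.12, confidence 0.6, lift 1.4754
```

The values agree with the support, confidence, and lift that `apriori()` reports for this rule.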
Based on the overall results of the association rule analysis, {Master or above} ⇒ {Living comfortably} emerges as one of the most representative patterns. It applies to 18 of the 30 respondents with a master’s degree or higher (confidence 0.60, lift 1.48), so within this sample these individuals are more likely than average to describe their household income situation as “living comfortably.” Because the sample pools respondents from all countries, the rule suggests that higher education is often associated with more positive subjective economic perceptions, although the analysis does not control for objective income or employment and cannot show whether the pattern holds within each country. Overall, this rule points to a link between educational attainment and economic satisfaction.
The rule {Short-cycle tertiary} ⇒ {Living comfortably} further shows that individuals who complete short-cycle tertiary education are also more likely to report positive income perceptions (9 of 14 respondents, confidence 0.64, lift 1.58). This finding suggests that the subjective benefits of higher education are not limited to traditional academic pathways, and that vocational or application-oriented forms of tertiary education are also associated with more favorable evaluations of one’s economic situation.
Taken together, these rules point to a general pattern: as education levels increase, subjective household income perceptions tend to shift from a state of “coping” toward “living comfortably.” The pattern is not strictly monotonic, since bachelor-level respondents split evenly between the two categories. In the European context, education is therefore associated not only with economic returns but also with systematic differences in how individuals perceive their quality of life.
rules.by.conf<-sort(rules, by="confidence", decreasing=TRUE)
inspect(head(rules.by.conf))
## lhs rhs support confidence coverage lift count
## [1] {eisced=Short-cycle tertiary} => {hincfel=Living comfortably} 0.06000000 0.6428571 0.09333333 1.580796 9
## [2] {eisced=Upper secondary} => {hincfel=Coping} 0.05333333 0.6153846 0.08666667 1.513241 8
## [3] {eisced=Master or above} => {hincfel=Living comfortably} 0.12000000 0.6000000 0.20000000 1.475410 18
## [4] {eisced=Post-secondary non-tertiary} => {hincfel=Coping} 0.13333333 0.5000000 0.26666667 1.229508 20
## [5] {eisced=Bachelor or equivalent} => {hincfel=Living comfortably} 0.04666667 0.4666667 0.10000000 1.147541 7
## [6] {eisced=Bachelor or equivalent} => {hincfel=Coping} 0.04666667 0.4666667 0.10000000 1.147541 7
rules.by.lift<-sort(rules, by="lift", decreasing=TRUE) # sorting by lift
inspect(head(rules.by.lift))
## lhs rhs support confidence coverage lift count
## [1] {eisced=Upper secondary} => {hincfel=Very difficult} 0.01333333 0.1538462 0.08666667 3.846154 2
## [2] {eisced=Less than lower secondary} => {hincfel=Difficult} 0.01333333 0.2222222 0.06000000 1.666667 2
## [3] {eisced=Lower secondary} => {hincfel=Difficult} 0.04000000 0.2222222 0.18000000 1.666667 6
## [4] {eisced=Short-cycle tertiary} => {hincfel=Living comfortably} 0.06000000 0.6428571 0.09333333 1.580796 9
## [5] {eisced=Upper secondary} => {hincfel=Coping} 0.05333333 0.6153846 0.08666667 1.513241 8
## [6] {eisced=Master or above} => {hincfel=Living comfortably} 0.12000000 0.6000000 0.20000000 1.475410 18
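The top rule by lift, {Upper secondary} ⇒ {Very difficult}, rests on a count of 2. The sketch below is a rough sensitivity check rather than a formal test: it holds the group size (13 upper-secondary respondents) and the number of “Very difficult” answers (6 of 150) fixed, as a simplifying assumption, and shows how the lift would change if one respondent more or fewer matched the rule. The function name `lift_for` is illustrative.

```r
n_total <- 150  # transactions
n_lhs   <- 13   # respondents with upper secondary education (coverage 0.0867 * 150)
n_rhs   <- 6    # respondents reporting "Very difficult"

# Lift as a function of the number of respondents matching both sides,
# holding the margins fixed (a simplifying assumption)
lift_for <- function(n_both) (n_both / n_lhs) / (n_rhs / n_total)

sens <- data.frame(count = 1:3, lift = sapply(1:3, lift_for))
print(round(sens, 3))
```

A single respondent moves the lift by roughly 1.9 in either direction, which is why high-lift rules with very small counts are treated cautiously in the interpretation.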
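The scatter plot of the rules shows at a glance that most rules occur infrequently (low support), but when they do occur, the outcome tends to be fairly clear (confidence is not low). This suggests that, in the cross-national sample, specific combinations of education and income perception are not widespread, but within certain education levels respondents tend to judge their economic situation in a fairly consistent way.
The color scale for lift reveals another feature: rules with high lift are mostly concentrated at low support. Such rules point to income perceptions that are “more likely to appear” within particular educational groups, rather than to the mainstream pattern of the whole sample. Conversely, the more frequent rules have association strengths closer to the average and mostly reflect the overall distribution.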
library(arulesViz)  # provides the plot() method for rules objects
plot(rules)
## To reduce overplotting, jitter is added! Use jitter = 0 to prevent jitter.
rules.df <- as(rules, "data.frame")
head(rules.df)
## rules support
## 1 {eisced=Upper secondary} => {hincfel=Very difficult} 0.01333333
## 2 {eisced=Less than lower secondary} => {hincfel=Difficult} 0.01333333
## 3 {eisced=Less than lower secondary} => {hincfel=Living comfortably} 0.01333333
## 4 {eisced=Less than lower secondary} => {hincfel=Coping} 0.02000000
## 5 {eisced=Upper secondary} => {hincfel=Living comfortably} 0.01333333
## 6 {eisced=Upper secondary} => {hincfel=Coping} 0.05333333
## confidence coverage lift count
## 1 0.1538462 0.08666667 3.8461538 2
## 2 0.2222222 0.06000000 1.6666667 2
## 3 0.2222222 0.06000000 0.5464481 2
## 4 0.3333333 0.06000000 0.8196721 3
## 5 0.1538462 0.08666667 0.3783102 2
## 6 0.6153846 0.08666667 1.5132409 8
We divided the scatter plot into four regions using a support threshold of 0.07 and a confidence threshold of 0.4. The upper-left quadrant, characterized by low support and high confidence, is considered the most important area. Although the rules in this region do not occur frequently, once their conditions are met, the outcomes are comparatively consistent and occur more often than random co-occurrence would predict (lift between 1.15 and 1.58). For this reason, we treat these rules as having the strongest structural explanatory value and refer to them loosely as “deterministic” rules, although with confidence values between 0.47 and 0.64 they describe tendencies rather than certainties.
Within this upper-left quadrant, a clear pattern emerges: the rules are concentrated at medium and higher levels of education (upper secondary, short-cycle tertiary, and bachelor level), and the associated income perceptions are limited to “Living comfortably” and “Coping.” This suggests that when these education-related conditions hold, individuals tend to evaluate their household economic situation in a relatively consistent way, and that the dominant responses do not involve severe financial difficulty.
Among these rules, the link between short-cycle tertiary education and “Living comfortably” stands out most clearly. This indicates that vocational or application-oriented educational pathways are, in many cases, associated with more positive economic experiences. In contrast, income perceptions associated with bachelor-level education are more dispersed: seven respondents report living comfortably, while another seven describe their situation as merely coping. This variation suggests that the returns to a bachelor’s degree are not uniform and may depend on factors such as country context, labor market conditions, or industry differences.
rules_left_up <- subset(
  rules.df,
  support < 0.07 & confidence >= 0.4
)
rules_left_up_sorted <- rules_left_up[
  order(rules_left_up$lift, decreasing = TRUE),
]
rules_left_up_sorted
## rules support
## 7 {eisced=Short-cycle tertiary} => {hincfel=Living comfortably} 0.06000000
## 6 {eisced=Upper secondary} => {hincfel=Coping} 0.05333333
## 9 {eisced=Bachelor or equivalent} => {hincfel=Living comfortably} 0.04666667
## 10 {eisced=Bachelor or equivalent} => {hincfel=Coping} 0.04666667
## confidence coverage lift count
## 7 0.6428571 0.09333333 1.580796 9
## 6 0.6153846 0.08666667 1.513241 8
## 9 0.4666667 0.10000000 1.147541 7
## 10 0.4666667 0.10000000 1.147541 7
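The same two thresholds can be used to label all four regions of the plot rather than only the upper-left one. The sketch below applies them to a handful of rules whose support and confidence are copied (rounded) from the outputs above. The data frame `rules_tbl` and the region labels are illustrative choices, not objects from the analysis.

```r
# A few rules re-entered by hand from the inspect() outputs (rounded)
rules_tbl <- data.frame(
  rule = c(
    "Short-cycle tertiary => Living comfortably",
    "Upper secondary => Coping",
    "Master or above => Living comfortably",
    "Post-secondary non-tertiary => Coping",
    "Upper secondary => Very difficult"
  ),
  support    = c(0.0600, 0.0533, 0.1200, 0.1333, 0.0133),
  confidence = c(0.6429, 0.6154, 0.6000, 0.5000, 0.1538)
)

# Same cut-offs as in the text: support 0.07 and confidence 0.4
rules_tbl$region <- paste(
  ifelse(rules_tbl$support < 0.07, "low support", "high support"),
  ifelse(rules_tbl$confidence >= 0.4, "high confidence", "low confidence"),
  sep = " / "
)
print(rules_tbl[, c("rule", "region")])
```

With these cut-offs, the rule for a master's degree or above falls in the high-support, high-confidence region, so it is not part of the upper-left subset even though it is one of the strongest rules overall.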
Overall, in the European context, higher educational attainment does appear to be associated with more positive subjective evaluations of living standards, but this association is neither linear nor universal. Education is more likely to go together with favorable income perceptions only at higher levels, such as a master’s degree or above, or through vocationally oriented tertiary education and training pathways. In contrast, the benefits of secondary education or a general bachelor’s degree are less consistent and are not reliably linked to improved perceptions of financial well-being.
At the same time, higher education seems to be associated with a lower likelihood of extremely negative economic evaluations, but it does not guarantee that individuals will perceive themselves as “living comfortably.” Even among people with the same level of education, subjective income perceptions vary substantially, indicating that education alone is not sufficient to determine individual life experiences.
Taken together, the findings suggest that in European societies, education functions more as a form of “baseline protection” than as a decisive factor in shaping perceived prosperity. Subjective financial well-being is likely influenced by a combination of factors, including national institutions, labor market structures, industry characteristics, and individual career trajectories.