Preamble

This has been a long time coming! Last month, a paper on the prevalence of English in K-Pop lyrics came out open access in the journal Asian Englishes. You can read through it here. It provides relatively robust evidence that the prevalence of English, defined in multiple ways, has increased in K-Pop lyrics over the early twenty-first century. It also includes case studies looking into whether this increase is due to artists that use less English being replaced by those that use more, and whether Korean words are being replaced by their English translation equivalents. Click through if you want the details.

Here we develop and supplement the line of thinking found in this article by looking at a slightly expanded version of the collection of lyrics upon which it is based and seeing whether there are differences in the proportions of English used by artists who present exclusively as female or male.

The Data

A comprehensive account of the data as well as its selection and inclusion criteria can be found in the paper itself. In summary, though, it is a collection of user-generated transcriptions of song lyrics (largely from the website Bugs!) that appeared on the top ten albums of the years 2000 to 2020 in the Korean charts. Here, we supplement it to include up to the year 2024 for a nice, round 25 years' worth of songs on top ten albums.

Proportion of English

As we are dealing with thousands of songs here, we present summary visualisations that immediately demonstrate the increase in the prevalence of English in K-Pop over the surveyed period. For each song’s lyrics, we calculated the proportion of English words relative to Korean words, ranging from 0 (entirely in Korean, no English at all) to 1 (entirely in English, no Korean at all). (OK, given the differences between English and Korean and what counts as a ‘word’, it’s a bit more complicated than that, so for full details see the paper.) Songs were then grouped into five-year bins, and it is these that we visualise in a density plot.
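As a rough illustration of the calculation described above, here is a minimal sketch. It assumes a deliberately crude definition of a ‘word’ (whitespace-separated tokens, classed as Korean if they contain Hangul and as English if they consist only of Latin letters), which is not the definition used in the paper. The function names english_proportion and five_year_bin are hypothetical.

```python
import re

# Assumption: a token counts as Korean if it contains any Hangul syllable,
# and as English if it is made up only of Latin letters and apostrophes.
# The paper's own definition of a 'word' is more careful than this.
HANGUL = re.compile(r"[\uac00-\ud7a3]")
LATIN = re.compile(r"^[A-Za-z']+$")


def english_proportion(lyrics: str) -> float:
    """Return English tokens / (English + Korean tokens), from 0 to 1."""
    english = korean = 0
    for token in lyrics.split():
        token = token.strip(".,!?\"()")  # drop surrounding punctuation
        if HANGUL.search(token):
            korean += 1
        elif LATIN.match(token):
            english += 1
        # Anything else (digits, symbols, other scripts) is ignored.
    total = english + korean
    # Assumption: lyrics with no countable tokens are scored as 0.
    return english / total if total else 0.0


def five_year_bin(year: int, start: int = 2000) -> str:
    """Map a release year to a bin label such as '2000-2004'."""
    lower = start + 5 * ((year - start) // 5)
    return f"{lower}-{lower + 4}"
```

For example, english_proportion("사랑해 baby 너를 love") counts two Korean and two English tokens, giving 0.5, and five_year_bin(2023) gives "2020-2024".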

Density plots can be tricky to interpret. They are sometimes characterised as a sort of alternative visualisation to a histogram, smoothed using kernel density estimation. The units along the x-axis represent the proportion of English lyrics in a given song as defined above. On the y-axis, rather than showing something like the count of songs with a particular proportion of English lyrics, as we might in a histogram, we have the density. A property of density plots is that the area under the curve must have a total value of 1, and it is this property which determines the values that appear on the y-axis. If the values of the x-axis are proportionately larger, those on the y-axis are smaller, and vice versa.
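The area property can be checked numerically. The sketch below builds a simple Gaussian kernel density estimate by hand and integrates it with the trapezoid rule. The sample values and the bandwidth of 0.05 are made up for illustration, and this is not necessarily the estimator or bandwidth behind the plots here.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of `samples` at a point x."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        # Sum one Gaussian bump per sample, then normalise.
        bumps = (math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        return sum(bumps) / norm

    return density


def area_under(curve, lo, hi, steps=2000):
    """Approximate the integral of `curve` on [lo, hi] with the trapezoid rule."""
    width = (hi - lo) / steps
    total = 0.5 * (curve(lo) + curve(hi))
    total += sum(curve(lo + i * width) for i in range(1, steps))
    return total * width


# Made-up proportions of English for a handful of songs (not real data).
proportions = [0.02, 0.05, 0.10, 0.12, 0.30, 0.45, 0.80]
density = gaussian_kde(proportions, bandwidth=0.05)

# Integrate well beyond the data so the tails of the kernels are included.
print(round(area_under(density, -1.0, 2.0), 3))
```

Note that individual values on the y-axis can be greater than 1 (here, density(0.05) is above 3). Only the total area is constrained, not the height of the curve.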

To demonstrate this, compare the chart above with the one below, which multiplies the values on the x-axis by 100 to give the percentage, rather than the proportion, of English tokens. Note especially the corresponding decrease in the density values on the y-axis by a factor of 100, but the otherwise identical appearance of the distributions in the two visualisations.
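The factor of 100 follows from the standard change-of-variables rule for densities. As a sketch, write X for the proportion of English and Y = 100X for the percentage (these symbols are introduced here for illustration and do not appear in the paper):

```latex
f_Y(y) = \frac{1}{100}\, f_X\!\left(\frac{y}{100}\right),
\qquad
\int_{-\infty}^{\infty} f_Y(y)\,dy
  = \int_{-\infty}^{\infty} f_X(x)\,dx
  = 1 .
```

Stretching the x-axis by 100 therefore shrinks every value on the y-axis by 100, leaving the total area at 1. The same holds for a kernel density estimate as long as the bandwidth is scaled by 100 along with the data.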

By looking at the peaks of the distributions shown in these density plots, we can see how the prevalence of English in K-Pop lyrics changed over these successive five-year periods. Bearing in mind all the caveats above, higher values on the y-axis do indicate a greater number of songs at or around the proportion of English in their lyrics specified by the corresponding point on the x-axis. Thus, the high peaks towards the left side of the visualisation for the periods 2000-2004 and 2005-2009 tell us that a very large proportion of the songs of those eras featured very little English relative to Korean in their lyrics. We can see the peaks of the distributions for the subsequent periods move rightwards along the graph, meaning that increasing relative proportions of English appeared in ever more songs. For the most recent period, 2020-2024, it is even possible to discern a smaller second peak in the distribution towards the right edge of the graph. This represents songs with almost entirely English lyrics in which Korean only occasionally features.

Proportion of English by Gender

Although this analysis includes some solo artists, for expository convenience, we will use the terms girl groups and boy bands throughout to encompass all artists who present as a particular gender. Mixed, or ‘co-ed’, groups are excluded from this analysis. Below, we present density plots of the prevalence of English lyrics for the songs recorded by the boy bands and girl groups appearing in the collection of texts described above, once more divided into five-year bins.

The extent to which girl groups appear to use a greater quantity of English in their lyrics in comparison to boy bands is striking in these visualisations. Girl groups have shifted more rapidly and more extremely to using a greater proportion of English in their lyrics than boy bands. This is most vividly illustrated by the most recent five-year period, 2020-2024, which appears to show a greater proportion of releases by girl groups entirely or overwhelmingly in English (arbitrarily defined here as having a proportion of English lyrics over 0.75) than entirely or overwhelmingly in Korean (again, arbitrarily defined here as having a proportion of English lyrics under 0.25). This interpretation, however, comes with a couple of caveats. Most significantly, not only is the collection of texts quite unbalanced in terms of the relative number of songs by boy bands (2077) and girl groups (399) it contains, but the number of girl groups represented here is relatively small (25 girl groups in comparison to 107 boy bands). This means that it is not warranted to firmly conclude that the magnitude of the change in the prevalence of English in girl groups’ lyrics is generalisable outside of this collection. It should not, however, be simply dismissed. It may instead be taken as suggestive, rather than representative, of a tendency for English to be more prevalent in the lyrics of girl groups in comparison to boy bands.
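The comparison between the two arbitrary bands described above could be tabulated along the following lines. This is only a sketch: the function name share_by_band is hypothetical, and the example values are made up rather than taken from the collection.

```python
def share_by_band(proportions, low=0.25, high=0.75):
    """Return the share of songs below `low`, above `high`, and in between."""
    n = len(proportions)
    if n == 0:
        raise ValueError("no songs to summarise")
    mostly_korean = sum(p < low for p in proportions) / n
    mostly_english = sum(p > high for p in proportions) / n
    return {
        "mostly_korean": mostly_korean,
        "mixed": 1 - mostly_korean - mostly_english,
        "mostly_english": mostly_english,
    }


# Made-up proportions for illustration only (not the collection's values).
example = [0.10, 0.20, 0.40, 0.60, 0.80, 0.90, 0.95, 1.00]
print(share_by_band(example))
```

Running this separately for the boy band and girl group songs in each five-year bin would give the shares being compared, although with only 399 girl group songs in total the per-bin counts are small and the caveats above apply in full.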

Conclusion

While it may be impressionistically obvious that the prevalence of English in K-Pop lyrics has risen over the first 25 years of the twenty-first century, especially given the increasing popularity of K-Pop throughout the English-speaking world, such claims require rigorous interrogation and empirical verification. That is precisely what the paper aims to do. By observing variation in the increase in the prevalence of English between boy bands and girl groups here, we demonstrate that simply establishing the existence of the global, aggregated increase is just a starting point. This increase is not internally homogeneous. The contours of its distribution across artists, genres, and chart positions are certainly worth exploring. Furthermore, establishing facts on the ground and noting their co-presence with the Korean Wave does not in and of itself have much explanatory power. Looking into whether and how English content and its function have changed over the surveyed period(s) is an obvious avenue for future exploration, but one that is not necessarily amenable to quantitative investigation. For the foreseeable future, though, we will continue to monitor the prevalence of English in K-Pop lyrics and, no doubt, return to this collection of texts to examine other questions, albeit not necessarily from a purely quantitative perspective.

Acknowledgement
This work was supported by the Core University Program for Korean Studies of the Ministry of Education of the Republic of Korea and Korean Studies Promotion Service at the Academy of Korean Studies (AKS-2021-OLU-2250004).