Ukrainian vs Russian Language Use in Ukraine
I Introduction
Language has long shaped Ukrainian national identity and the country's internal political struggles. Ukrainian was historically repressed under foreign rule, while Russian dominated cities and public life and was often seen as a language of prestige and progress. After independence in 1991, overt suppression of Ukrainian ended, but the status of Russian remained deeply contested.
In the 2001 census, 67.5% of Ukraine's population identified Ukrainian as their native language, and 29.6% named Russian. But “native language” often reflected identity more than use. Most Ukrainians are bilingual and move between Ukrainian, Russian, and a hybrid known as Surzhyk. Understanding everyday language use is therefore difficult: traditional surveys capture what people claim they speak, not how they act.
This project uses Google Trends to track the relative usage of Ukrainian and Russian across time and region. By comparing the search frequencies of common queries in both languages, I examine patterns in digital behavior as a proxy for language preference in daily life.
The goal is to understand whether Ukrainians are shifting their linguistic habits online, especially in response to key political events such as the 2013–2014 Maidan Revolution and the 2022 Russian invasion. If search behavior reflects cultural or identity shifts, this approach may offer a window into Ukraine’s evolving national consciousness.
II Data and Methodology
This project uses data from Google Trends, accessed via the gtrendsR package in R, to estimate language usage in Ukraine. I compare the relative search interest of equivalent queries that are spelled differently in Ukrainian and Russian, both over time and across regions.
Google Trends provides normalized search scores (0–100). I compute a Ukrainian-to-Russian ratio to capture relative language preference:
Ratio = Ukrainian score / Russian score
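The ratio above can be sketched in a few lines. This is a minimal Python illustration, not the project's R code: the helper names `parse_hits` and `ua_ru_ratio` and the sample scores are hypothetical. It assumes that a score reported as "<1" (which gtrendsR can return for very low interest) is treated as 0.5, and that the ratio is left undefined when the Russian score is zero.

```python
from typing import Optional, Union


def parse_hits(raw: Union[str, int, float]) -> float:
    """Convert a Google Trends score to a float.

    Assumption: the string "<1" is mapped to 0.5, an arbitrary midpoint.
    """
    if isinstance(raw, str) and raw.strip() == "<1":
        return 0.5
    return float(raw)


def ua_ru_ratio(ua_score: float, ru_score: float) -> Optional[float]:
    """Return Ukrainian score / Russian score, or None if it is undefined."""
    if ru_score == 0:
        return None  # no Russian interest recorded, so the ratio is undefined
    return ua_score / ru_score


# Hypothetical normalized scores (0-100) for one term over four periods.
ua_scores = [parse_hits(x) for x in ["20", "25", "40", "60"]]
ru_scores = [parse_hits(x) for x in ["80", "50", "40", "30"]]

ratios = [ua_ru_ratio(u, r) for u, r in zip(ua_scores, ru_scores)]
print(ratios)
```

Returning `None` instead of infinity keeps undefined periods easy to filter out before plotting or averaging.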
Data was collected:
Nationally from 2010 to 2025 (weekly).
Regionally using ISO 3166-2 subdivision codes for Ukraine (e.g., UA-30 for the city of Kyiv).
Limitations
Relative scores: Google Trends data is scaled within queries and not comparable across unrelated searches.
Opacity: Sampling methods and geographic boundaries are not publicly disclosed.
User behavior bias: Search language reflects usage, not identity or fluency.
Content availability bias: Historically, more content was available in Russian. This may overstate Russian usage, especially in earlier years or content-heavy queries.
Digital divide: Results reflect internet users, possibly underrepresenting rural or older populations.
III National Level Trends
The figure below displays the national-level ratio for basic search terms in Ukrainian versus Russian from January 1, 2010 to May 1, 2025. A ratio above 1 means there were more searches in Ukrainian; a ratio below 1 means there were more searches in Russian. Each term shows a clear upward trend that appears to spike after the invasion. Depending on the search term, the ratio is somewhat unstable over time, and its overall level varies. For a term like “what”, the ratio is well over 1 in 2025, while for “money” it is still below 1.
Idea: Table for different ratios from huge composite of different terms? Should try a larger swath of terms in general if not limited by cap
The figure below displays a composite index of the search terms what, news, games, recipes, how, price, why, and money, all of which are spelled differently in the two languages. The composite shows a somewhat smoother trend, with a sharp increase that appears to begin around the full-scale invasion.
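The text does not say how the composite is built, so the following Python sketch shows only one possible construction: the geometric mean of the per-term Ukrainian-to-Russian ratios within a single period. The function name `composite_ratio` and the scores are hypothetical. Each per-term ratio is computed within its own query pair, which avoids comparing raw scores across unrelated searches.

```python
import math
from typing import Dict, Optional, Tuple


def composite_ratio(scores_by_term: Dict[str, Tuple[float, float]]) -> Optional[float]:
    """Geometric mean of per-term UA/RU ratios for one period.

    scores_by_term maps a term to (ukrainian_score, russian_score).
    Terms with a zero score on either side are skipped, because their
    log-ratio is undefined. Returns None if no term is usable.
    """
    log_ratios = [
        math.log(ua / ru)
        for ua, ru in scores_by_term.values()
        if ua > 0 and ru > 0
    ]
    if not log_ratios:
        return None
    return math.exp(sum(log_ratios) / len(log_ratios))


# Hypothetical scores for one month; not taken from the project's data.
month_scores = {
    "what": (60.0, 30.0),   # ratio 2.0
    "money": (20.0, 40.0),  # ratio 0.5
    "news": (50.0, 50.0),   # ratio 1.0
}

composite = composite_ratio(month_scores)
print(composite)
```

A geometric mean treats a ratio of 2 and a ratio of 0.5 as equal and opposite, which an arithmetic mean of ratios would not. Summing the scores before dividing is another option if all terms are known to share one scale.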
EXPAND ON BELOW MORE
The graphs indicate a break around the time of the full-scale Russian invasion. R’s breakpoints function (GO INTO MORE) indicates significant breakpoints after May 1, 2020, when the ratio abruptly decreased, and after August 1, 2022. The former coincided with COVID and appears to have been a blip, while the latter has been sustained.
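To make the idea of a breakpoint concrete, here is a deliberately simplified Python sketch. It is not the procedure used by R's breakpoints function, which can estimate several breaks in a regression model. It searches for a single shift in the mean by trying every admissible split and keeping the one with the smallest total squared error. The function names and the series are hypothetical.

```python
from typing import List, Optional


def sum_squared_error(values: List[float]) -> float:
    """Sum of squared deviations from the segment mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_single_break(series: List[float], min_segment: int = 2) -> Optional[int]:
    """Return index i so that series[:i] and series[i:] fit best as two
    constant-mean segments, or None if the series is too short.

    min_segment is the smallest number of points allowed in each segment.
    """
    best_index: Optional[int] = None
    best_cost = float("inf")
    for i in range(min_segment, len(series) - min_segment + 1):
        cost = sum_squared_error(series[:i]) + sum_squared_error(series[i:])
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index


# Hypothetical ratio series with a level shift after the fourth observation.
ratio_series = [0.40, 0.42, 0.41, 0.43, 0.80, 0.82, 0.79, 0.81]
break_index = best_single_break(ratio_series)
print(break_index)
```

This sketch always returns the best split, even when no real break exists, so it says nothing about statistical significance. A full analysis would also test whether the break is significant and allow for more than one break.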
IV Regional Trends
The table below ranks each oblast by its Ukrainian-to-Russian ratio over the full search period.
Table: Ukrainian Oblasts and Cities Ranked by Ukrainian-to-Russian Search Ratio (see note 1)
Composite Ukrainian v Russian Search Terms from Google Trends

| Oblast | Hits (UA) | Hits (RU) | UA/RU Ratio |
|---|---|---|---|
| Ternopil's'ka oblast | 100.00 | 20.00 | 5.00 |
| Ivano-Frankivs'ka oblast | 93.00 | 19.00 | 4.89 |
| Volyns'ka oblast | 97.00 | 23.00 | 4.22 |
| Lviv Oblast | 84.00 | 25.00 | 3.36 |
| Rivnens'ka oblast | 94.00 | 31.00 | 3.03 |
| Khmel'nyts'ka oblast | 71.00 | 38.00 | 1.87 |
| Zakarpats'ka oblast | 65.00 | 38.00 | 1.71 |
| Chernivets'ka oblast | 63.00 | 41.00 | 1.54 |
| Vinnyts'ka oblast | 61.00 | 42.00 | 1.45 |
| Zhytomyrs'ka oblast | 55.00 | 51.00 | 1.08 |
| Cherkas'ka oblast | 54.00 | 52.00 | 1.04 |
| Kyivs'ka oblast | 44.00 | 55.00 | 0.80 |
| Poltavs'ka oblast | 41.00 | 61.00 | 0.67 |
| Chernihivs'ka oblast | 37.00 | 65.00 | 0.57 |
| Kirovohrads'ka oblast | 36.00 | 66.00 | 0.55 |
| Sums'ka oblast | 32.00 | 67.00 | 0.48 |
| Kyiv city | 27.00 | 57.00 | 0.47 |
| Mykolaivs'ka oblast | 22.00 | 83.00 | 0.27 |
| Dnipropetrovsk Oblast | 18.00 | 78.00 | 0.23 |
| Khersons'ka oblast | 16.00 | 88.00 | 0.18 |
| Odessa Oblast | 13.00 | 82.00 | 0.16 |
| Zaporiz'ka oblast | 13.00 | 83.00 | 0.16 |
| Kharkiv Oblast | 12.00 | 81.00 | 0.15 |
| Donetsk Oblast | 6.00 | 99.00 | 0.06 |
| Luhans'ka oblast | 4.00 | 99.00 | 0.04 |
| Crimea | 0.00 | 100.00 | 0.00 |
| Sevastopol' city | 0.00 | 95.00 | 0.00 |

Note 1: Crimea and Sevastopol may have suppressed or missing data.
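The ranking in the table can be reproduced from the two hit columns. The Python sketch below uses four rows copied from the table; the function name `rank_by_ratio` is hypothetical, and rounding to two decimals is assumed from the table's formatting.

```python
from typing import List, Tuple

# (region, Ukrainian hits, Russian hits), copied from four rows of the table.
regions: List[Tuple[str, float, float]] = [
    ("Kyiv city", 27.0, 57.0),
    ("Ternopil's'ka oblast", 100.0, 20.0),
    ("Crimea", 0.0, 100.0),
    ("Lviv Oblast", 84.0, 25.0),
]


def rank_by_ratio(rows: List[Tuple[str, float, float]]) -> List[Tuple[str, float]]:
    """Return (region, UA/RU ratio) pairs sorted from highest to lowest ratio.

    Rows with zero Russian hits are skipped because the ratio is undefined.
    """
    ratios = [(name, round(ua / ru, 2)) for name, ua, ru in rows if ru > 0]
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)


ranked = rank_by_ratio(regions)
for name, ratio in ranked:
    print(f"{name}: {ratio:.2f}")
```

A ratio of 0.00 for Crimea here means only that no Ukrainian hits were returned, which may reflect suppressed or missing data rather than true absence.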
The map below shows the same data.
The figure below shows the composite over time. Across essentially every region there has been a clear upward trend that follows the same trajectory as the national trend, with a few exceptions. The clearest exceptions are Luhansk, Sevastopol, and Crimea. The latter two have been Russian-controlled since 2014 and have had very low Ukrainian search rates throughout, with a decline in Ukrainian likely occurring after Russian occupation.