ICD 10 codes translation using R

Author: Abdullah Abdelaziz

Affiliation: PhD student at Pharmacy Systems Outcomes and Policy, UIC

Updates
  • January 2023: Please see Section 5 ("ICD-10 codes after 2016") for how to deal with recent ICD-10 codes.

About

In this tutorial, I will demonstrate how to translate ICD-10-CM codes into their descriptions easily, without writing laborious and repetitive if_else() or case_when() calls.
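To make the contrast concrete, here is a minimal sketch with three codes. The tibbles `visits` and `lookup` are made up for illustration, and the descriptions are copied from the output shown later in this tutorial.

```r
library(dplyr)
library(tibble)

# Hypothetical data: three codes to translate
visits <- tibble(id = 1:3, code = c("E750", "M651", "X38XXXA"))

# The repetitive way: one case_when() branch per code
manual <- visits |>
  mutate(description = case_when(
    code == "E750"    ~ "GM2 gangliosidosis",
    code == "M651"    ~ "Other infective (teno)synovitis",
    code == "X38XXXA" ~ "Flood, initial encounter"
  ))

# The lookup-table way: keep the descriptions in a table and join once
lookup <- tibble(
  code = c("E750", "M651", "X38XXXA"),
  description = c(
    "GM2 gangliosidosis",
    "Other infective (teno)synovitis",
    "Flood, initial encounter"
  )
)

joined <- left_join(visits, lookup, by = "code")
```

Both approaches give the same column, but the join keeps working unchanged when the lookup table grows to tens of thousands of codes.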

Necessary packages

We will need the tidyverse and icd.data packages for this tutorial, plus archive and janitor for the section on newer codes. If you don’t have them installed, run install.packages(c("tidyverse", "icd.data", "archive", "janitor")) in your console first. After the installation is done, run the following chunk.
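If you are not sure which of these packages you already have, a small check like the one below can help. This is only a sketch: `missing_packages()` is a hypothetical helper written for this example, and the commented-out install line assumes every package is available from your configured repository.

```r
# Packages this tutorial loads
needed <- c("tidyverse", "icd.data", "archive", "janitor")

# Hypothetical helper: return the packages that are not installed yet
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(needed)
to_install

# Uncomment to install whatever is missing
# (assumes each package is available from your configured repository)
# if (length(to_install) > 0) install.packages(to_install)
```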

# Load necessary packages 
library(tidyverse)
library(icd.data)
library(archive)
library(janitor)

Create dummy data

I will demonstrate the method by creating a dummy dataset with two columns:

  1. id : Just random identifiers, nothing special!
  2. code: ICD-10 codes that you might have in your dataset and want to translate.

You don’t need to create this data. I am just doing this for illustration.

# Create dummy data having some ids and codes (I assume your data is somewhat similar to this dummy_data)
dummy_data <- tibble(
  id = 1:20, 
  # Just borrowing 20 random ICD-10 codes from the icd10cm2016 dataset in the icd.data package
  code = sample(as.character(icd10cm2016$code), 20)
)

dummy_data
id code
1 S52135N
2 T2071
3 E750
4 T375X2S
5 F11151
6 Y801
7 S91211S
8 S65211S
9 S62616S
10 Y37320A
11 M1A172
12 S00252D
13 M651
14 X38XXXA
15 T17400
16 S8254XQ
17 S52515F
18 T730XXS
19 S61356S
20 M1A1220

Translation

Now let’s translate the codes into their descriptions. We will use the icd10cm2016 dataset from the icd.data package, which is available once icd.data is loaded. Before the translation, we need to convert the code variable in icd10cm2016 from the icd10cm type to character, so that it has the same type as the code column in our data (I am not sure why the package’s author uses this special type).

icd10cm2016 <- icd10cm2016 |> 
  mutate(code = as.character(code))
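The same care applies to the code column in your own data: the join only matches when both key columns are plain character vectors written the same way. The codes in icd10cm2016 are stored without the decimal point (see the output below), so codes written like S52.135N need tidying first. The sketch below assumes a hypothetical `raw` tibble with dots, stray spaces, and lowercase letters; your data may need different steps.

```r
library(dplyr)
library(stringr)
library(tibble)

# Hypothetical raw codes as they might appear in a data extract
raw <- tibble(id = 1:3, code = c("S52.135N", " e75.0", "U07.1 "))

normalised <- raw |>
  mutate(
    code = code |>
      str_squish() |>                # trim stray whitespace
      str_remove_all(fixed(".")) |>  # drop the decimal point
      str_to_upper()                 # use upper-case letters
  )

normalised$code
```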

Now, we will do the translation with a simple left_join():

# Join on the shared code column (stating `by` explicitly avoids the join message)
left_join(dummy_data, icd10cm2016, by = "code") |> 
  # keep the relevant variables
  select(id, code, short_desc, long_desc)
id | code | short_desc | long_desc
1 | S52135N | Nondisp fx of nk of l rad, 7thN | Nondisplaced fracture of neck of left radius, subsequent encounter for open fracture type IIIA, IIIB, or IIIC with nonunion
2 | T2071 | Corrosion of third degree of ear [any part, except ear drum] | Corrosion of third degree of ear [any part, except ear drum]
3 | E750 | GM2 gangliosidosis | GM2 gangliosidosis
4 | T375X2S | Poisoning by antiviral drugs, intentional self-harm, sequela | Poisoning by antiviral drugs, intentional self-harm, sequela
5 | F11151 | Opioid abuse w opioid-induced psychotic disorder w hallucin | Opioid abuse with opioid-induced psychotic disorder with hallucinations
6 | Y801 | Theraputc and rehab physical medicine devices assoc w incdt | Therapeutic (nonsurgical) and rehabilitative physical medicine devices associated with adverse incidents
7 | S91211S | Lac w/o fb of right great toe w damage to nail, sequela | Laceration without foreign body of right great toe with damage to nail, sequela
8 | S65211S | Laceration of superficial palmar arch of right hand, sequela | Laceration of superficial palmar arch of right hand, sequela
9 | S62616S | Disp fx of proximal phalanx of right little finger, sequela | Displaced fracture of proximal phalanx of right little finger, sequela
10 | Y37320A | Milt op involving incendiary bullet, milt, init | Military operations involving incendiary bullet, military personnel, initial encounter
11 | M1A172 | Lead-induced chronic gout, left ankle and foot | Lead-induced chronic gout, left ankle and foot
12 | S00252D | Superficial fb of left eyelid and periocular area, subs | Superficial foreign body of left eyelid and periocular area, subsequent encounter
13 | M651 | Other infective (teno)synovitis | Other infective (teno)synovitis
14 | X38XXXA | Flood, initial encounter | Flood, initial encounter
15 | T17400 | Unspecified foreign body in trachea causing asphyxiation | Unspecified foreign body in trachea causing asphyxiation
16 | S8254XQ | Nondisp fx of med malleolus of r tibia, 7thQ | Nondisplaced fracture of medial malleolus of right tibia, subsequent encounter for open fracture type I or II with malunion
17 | S52515F | Nondisp fx of l radial styloid pro, 7thF | Nondisplaced fracture of left radial styloid process, subsequent encounter for open fracture type IIIA, IIIB, or IIIC with routine healing
18 | T730XXS | Starvation, sequela | Starvation, sequela
19 | S61356S | Open bite of right little finger w damage to nail, sequela | Open bite of right little finger with damage to nail, sequela
20 | M1A1220 | Lead-induced chronic gout, left elbow, without tophus | Lead-induced chronic gout, left elbow, without tophus (tophi)
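In your own data, the diagnosis column will often not be called code. A minimal sketch of how the join could look in that case is below; `claims`, its `dx` column, and the two-row `dictionary` (standing in for icd10cm2016) are assumptions made for illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical data whose diagnosis column is called `dx`
claims <- tibble(id = 1:2, dx = c("T730XXS", "X38XXXA"))

# A two-row stand-in for icd10cm2016 (descriptions copied from the output above)
dictionary <- tibble(
  code = c("T730XXS", "X38XXXA"),
  long_desc = c("Starvation, sequela", "Flood, initial encounter")
)

# Match `dx` in the data against `code` in the dictionary
translated <- left_join(claims, dictionary, by = c("dx" = "code"))
translated
```

The result keeps the name from the left-hand table (`dx`), so the rest of your pipeline does not need to change.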

ICD-10 codes after 2016

  • The demonstration above relies on the 2016 release, so it will not translate ICD-10 codes that were introduced after 2016.

  • Below, I will demonstrate how to solve this issue.


Get the most recent version of ICD-10

  • Since there is no ready-made package for this, we will use the CDC website to get the most recent release.
  • I downloaded the file called icd10cm-code descriptions- April 1 2023.zip.
#| warning: false
#| output: false

# Download the zip file and extract the dataset into the working directory
archive::archive_extract(
  "https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Publications/ICD10CM/April-1-2023-Update/icd10cm-code%20descriptions-%20April%201%202023.zip",
  files = "icd10cm-codes- April 1 2023.txt",
  dir = getwd()
)

# Read the dataset: one line per code, stored as a single-column tibble
df_raw <- tibble(read_lines("icd10cm-codes- April 1 2023.txt")) |>
  clean_names()

# Clean the dataset: characters 1-7 hold the code, the rest is the description
df_clean <- df_raw |>
  rename(code_and_desc = read_lines_icd10cm_codes_april_1_2023_txt) |>
  mutate(code = str_sub(code_and_desc, start = 1, end = 7)) |>
  mutate(description = str_sub(code_and_desc, start = 8)) |>
  select(-code_and_desc) |>
  # remove the padding spaces around codes and descriptions
  mutate(across(everything(), str_squish))

# Let's rename it
icd10_dictionary <- df_clean
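To see what the cleaning step does without downloading anything, the sketch below applies the same splitting logic to two sample lines. The lines are made up to mimic the layout the code above assumes (code padded to seven characters, then the description), so treat them as an illustration rather than a copy of the CDC file.

```r
library(dplyr)
library(stringr)
library(tibble)

# Two made-up lines in the assumed layout:
# code in characters 1-7, description from character 8 onwards
sample_lines <- c(
  "U071    COVID-19",
  "T730XXS Starvation, sequela"
)

parsed <- tibble(code_and_desc = sample_lines) |>
  mutate(
    code = str_sub(code_and_desc, start = 1, end = 7),
    description = str_sub(code_and_desc, start = 8)
  ) |>
  select(-code_and_desc) |>
  mutate(across(everything(), str_squish))

# A dictionary should have exactly one row per code
stopifnot(!any(duplicated(parsed$code)))
parsed
```

Running the same duplicate check on icd10_dictionary is a cheap way to confirm that the later left_join() cannot multiply rows.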

Testing the new dataset

  • In the previous step, we created icd10_dictionary, which we will use for the translation. Now I will reuse the dummy_data we made before, but add a new code that did not exist in 2016:

new_dummy_data <- dummy_data |> 
  add_row(id = 21, code = "U071") # COVID-19 code

Now, let’s translate as we did before:

new_dummy_data |> 
  left_join(icd10_dictionary, by = "code")
id code description
1 S52135N Nondisplaced fracture of neck of left radius, subsequent encounter for open fracture type IIIA, IIIB, or IIIC with nonunion
2 T2071 NA
3 E750 NA
4 T375X2S Poisoning by antiviral drugs, intentional self-harm, sequela
5 F11151 Opioid abuse with opioid-induced psychotic disorder with hallucinations
6 Y801 Therapeutic (nonsurgical) and rehabilitative physical medicine devices associated with adverse incidents
7 S91211S Laceration without foreign body of right great toe with damage to nail, sequela
8 S65211S Laceration of superficial palmar arch of right hand, sequela
9 S62616S Displaced fracture of proximal phalanx of right little finger, sequela
10 Y37320A Military operations involving incendiary bullet, military personnel, initial encounter
11 M1A172 NA
12 S00252D Superficial foreign body of left eyelid and periocular area, subsequent encounter
13 M651 NA
14 X38XXXA Flood, initial encounter
15 T17400 NA
16 S8254XQ Nondisplaced fracture of medial malleolus of right tibia, subsequent encounter for open fracture type I or II with malunion
17 S52515F Nondisplaced fracture of left radial styloid process, subsequent encounter for open fracture type IIIA, IIIB, or IIIC with routine healing
18 T730XXS Starvation, sequela
19 S61356S Open bite of right little finger with damage to nail, sequela
20 M1A1220 Lead-induced chronic gout, left elbow, without tophus (tophi)
21 U071 COVID-19
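  • This detected COVID-19 successfully, but you can see that we missed some codes too. I believe the reason is that some ICD-10 codes get deleted, retired, or modified over time. Another likely factor is that the five unmatched codes here (T2071, E750, M1A172, M651, T17400) look like category headers rather than complete codes, which icd10cm2016 includes but the CDC codes file appears to leave out. In such scenarios, I would apply the same steps using different versions of the ICD-10 codes to capture as many as I can.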
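One way to apply several versions at once is to stack the dictionaries and keep a single row per code. The sketch below is only an illustration under stated assumptions: `dict_2023` and `dict_2016` are tiny hand-made stand-ins for two dictionary versions, each with a code and a description column, and "ZZZ" is a deliberately invalid placeholder code.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-ins for two dictionary versions
dict_2023 <- tibble(
  code = c("U071", "T730XXS"),
  description = c("COVID-19", "Starvation, sequela")
)
dict_2016 <- tibble(
  code = c("T730XXS", "E750"),
  description = c("Starvation, sequela", "GM2 gangliosidosis")
)

# Stack the versions newest first, then keep the first row seen for each code
combined <- bind_rows(
  list(`2023` = dict_2023, `2016` = dict_2016),
  .id = "version"
) |>
  distinct(code, .keep_all = TRUE)

# "ZZZ" is a made-up code that no version contains
codes <- tibble(id = 1:4, code = c("U071", "T730XXS", "E750", "ZZZ"))
translated <- left_join(codes, combined, by = "code")

# Codes that no version could translate
unmatched <- filter(translated, is.na(description))
unmatched
```

With the real data, the 2016 stand-in could be built from icd10cm2016 by converting code to character and renaming long_desc to description. The version column then records which release supplied each description, and `unmatched` lists the codes that still need attention.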

And that’s it! Hope you found this useful. Don’t hesitate to reach out by email for questions: aabdel51@uic.edu