ICD 10 codes translation using R
- January 2023: Please see Section 5 for how to deal with recent ICD-10 codes
About
In this tutorial, I will demonstrate how to translate ICD-10-CM codes to their meanings easily, without laborious and repetitive if_else() or case_when() calls.
Necessary packages
We will need the tidyverse and icd.data packages for this tutorial, plus archive and janitor for the section on codes released after 2016. If you don’t have them installed, run install.packages(c("tidyverse", "icd.data", "archive", "janitor")) in your console first. After the installation is done, run the following chunk.
# Load necessary packages
library(tidyverse)
library(icd.data)
library(archive)
library(janitor)
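If you are unsure which of these packages are already installed, a small check like the one below can help. It is only a sketch in base R: the names needed, is_installed and missing_pkgs are mine, and it assumes the four package names used in this tutorial.

```r
# Packages used in this tutorial
needed <- c("tidyverse", "icd.data", "archive", "janitor")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

# Keep only the packages that still need to be installed
missing_pkgs <- needed[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Install these first: ", paste(missing_pkgs, collapse = ", "))
}
```

Anything listed in missing_pkgs can then be passed to install.packages().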
Create dummy data
I will demonstrate the method by creating a dummy dataset with two columns:
- id: Just random identifiers, nothing special!
- code: ICD-10 codes that you might have in your dataset and want to translate.
You don’t need to create this data. I am just doing this for illustration.
# Create dummy data with some ids and codes (I assume your data looks somewhat like dummy_data)
dummy_data <- tibble(
  id = 1:20,
  # Borrow 20 random ICD-10 codes from the icd10cm2016 dataset in the icd.data package.
  # The draw is random, so your codes will differ from the table shown here.
  code = sample(as.character(icd10cm2016$code), 20)
)
dummy_data
| id | code |
|---|---|
| 1 | S52135N |
| 2 | T2071 |
| 3 | E750 |
| 4 | T375X2S |
| 5 | F11151 |
| 6 | Y801 |
| 7 | S91211S |
| 8 | S65211S |
| 9 | S62616S |
| 10 | Y37320A |
| 11 | M1A172 |
| 12 | S00252D |
| 13 | M651 |
| 14 | X38XXXA |
| 15 | T17400 |
| 16 | S8254XQ |
| 17 | S52515F |
| 18 | T730XXS |
| 19 | S61356S |
| 20 | M1A1220 |
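Because the codes are drawn at random, rerunning the chunk gives a different table each time. If you want the same draw on every run, fixing the random seed first is one option. The sketch below uses a small hand-typed pool of codes copied from the table above instead of the full icd10cm2016 dataset, and the seed value is arbitrary.

```r
# Small pool of codes copied from the table above (stand-in for icd10cm2016$code)
code_pool <- c("S52135N", "T2071", "E750", "T375X2S", "F11151", "Y801")

set.seed(2023)                       # any fixed integer works
first_draw <- sample(code_pool, 3)

set.seed(2023)                       # same seed, so the same codes are drawn
second_draw <- sample(code_pool, 3)

identical(first_draw, second_draw)   # TRUE
```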
Translation
Now let’s translate the codes to their literal meanings. We will use the icd10cm2016 dataset from the icd.data package, which is available once icd.data is loaded. Before the translation, we need to convert the code variable in icd10cm2016 from the package’s own icd10cm class to plain character, so that it has the same type as the code column we join on (I am not sure why the package’s author chose this unusual class).
icd10cm2016 <- icd10cm2016 |>
  mutate(code = as.character(code))
Now, we will do the translation simply using left_join():
left_join(dummy_data, icd10cm2016, by = "code") |>
  # keep the relevant variables
  select(id, code, short_desc, long_desc)
| id | code | short_desc | long_desc |
|---|---|---|---|
| 1 | S52135N | Nondisp fx of nk of l rad, 7thN | Nondisplaced fracture of neck of left radius, subsequent encounter for open fracture type IIIA, IIIB, or IIIC with nonunion |
| 2 | T2071 | Corrosion of third degree of ear [any part, except ear drum] | Corrosion of third degree of ear [any part, except ear drum] |
| 3 | E750 | GM2 gangliosidosis | GM2 gangliosidosis |
| 4 | T375X2S | Poisoning by antiviral drugs, intentional self-harm, sequela | Poisoning by antiviral drugs, intentional self-harm, sequela |
| 5 | F11151 | Opioid abuse w opioid-induced psychotic disorder w hallucin | Opioid abuse with opioid-induced psychotic disorder with hallucinations |
| 6 | Y801 | Theraputc and rehab physical medicine devices assoc w incdt | Therapeutic (nonsurgical) and rehabilitative physical medicine devices associated with adverse incidents |
| 7 | S91211S | Lac w/o fb of right great toe w damage to nail, sequela | Laceration without foreign body of right great toe with damage to nail, sequela |
| 8 | S65211S | Laceration of superficial palmar arch of right hand, sequela | Laceration of superficial palmar arch of right hand, sequela |
| 9 | S62616S | Disp fx of proximal phalanx of right little finger, sequela | Displaced fracture of proximal phalanx of right little finger, sequela |
| 10 | Y37320A | Milt op involving incendiary bullet, milt, init | Military operations involving incendiary bullet, military personnel, initial encounter |
| 11 | M1A172 | Lead-induced chronic gout, left ankle and foot | Lead-induced chronic gout, left ankle and foot |
| 12 | S00252D | Superficial fb of left eyelid and periocular area, subs | Superficial foreign body of left eyelid and periocular area, subsequent encounter |
| 13 | M651 | Other infective (teno)synovitis | Other infective (teno)synovitis |
| 14 | X38XXXA | Flood, initial encounter | Flood, initial encounter |
| 15 | T17400 | Unspecified foreign body in trachea causing asphyxiation | Unspecified foreign body in trachea causing asphyxiation |
| 16 | S8254XQ | Nondisp fx of med malleolus of r tibia, 7thQ | Nondisplaced fracture of medial malleolus of right tibia, subsequent encounter for open fracture type I or II with malunion |
| 17 | S52515F | Nondisp fx of l radial styloid pro, 7thF | Nondisplaced fracture of left radial styloid process, subsequent encounter for open fracture type IIIA, IIIB, or IIIC with routine healing |
| 18 | T730XXS | Starvation, sequela | Starvation, sequela |
| 19 | S61356S | Open bite of right little finger w damage to nail, sequela | Open bite of right little finger with damage to nail, sequela |
| 20 | M1A1220 | Lead-induced chronic gout, left elbow, without tophus | Lead-induced chronic gout, left elbow, without tophus (tophi) |
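The join only matches codes that are written exactly as in the lookup table, that is, upper case and without the decimal point. If your own data stores codes like "E75.0", a small clean-up step before the join may be needed. The sketch below assumes that situation: lookup is a three-row stand-in for icd10cm2016 (descriptions copied from the output above) and raw_data is made up for illustration.

```r
library(dplyr)
library(stringr)
library(tibble)

# Tiny stand-in for icd10cm2016
lookup <- tibble(
  code = c("E750", "M651", "X38XXXA"),
  long_desc = c(
    "GM2 gangliosidosis",
    "Other infective (teno)synovitis",
    "Flood, initial encounter"
  )
)

# Hypothetical raw data with dots, lower case and stray spaces
raw_data <- tibble(id = 1:3, code = c("E75.0", " m65.1", "X38.XXXA"))

translated <- raw_data |>
  # drop everything that is not a letter or digit, then upper-case the result
  mutate(code = code |> str_remove_all("[^[:alnum:]]") |> str_to_upper()) |>
  left_join(lookup, by = "code")

translated
```

Spelling out by = "code" also avoids the message that left_join() prints when it has to guess the key.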
ICD-10 codes after 2016
The demonstration above will not help if your data contains ICD-10 codes introduced after 2016, because the icd10cm2016 dataset does not know about them.
Below, I will demonstrate how to solve this issue.
Get the most recent version of ICD-10
- Since we don’t have a ready-made package for this, we will use the CDC website to get the most recent release.
- I downloaded the file called icd10cm-code descriptions- April 1 2023.zip
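The text file inside the zip has no header and no delimiter. The cleaning code in this section assumes that the first seven characters of each line hold the code and that the description follows from the eighth character onward. The sketch below shows that split on two hand-built lines in the same assumed layout; sample_lines and parsed are names I made up, and the padding is built with str_pad() only to make the fixed width visible.

```r
library(dplyr)
library(stringr)
library(tibble)

# Two hand-built lines in the assumed layout: code padded to 7 characters,
# one space, then the description
sample_lines <- c(
  paste0(str_pad("U071", 7, side = "right"), " ", "COVID-19"),
  paste0(str_pad("X38XXXA", 7, side = "right"), " ", "Flood, initial encounter")
)

parsed <- tibble(raw = sample_lines) |>
  mutate(
    code = str_squish(str_sub(raw, start = 1, end = 7)),   # characters 1 to 7
    description = str_squish(str_sub(raw, start = 8))      # everything after
  ) |>
  select(-raw)

parsed
```

str_squish() removes the padding, so short codes such as U071 come out without trailing spaces.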
#| warning: false
#| output: false
# Download the zip file and extract the dataset into the working directory
archive::archive_extract(
  "https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Publications/ICD10CM/April-1-2023-Update/icd10cm-code%20descriptions-%20April%201%202023.zip",
  files = "icd10cm-codes- April 1 2023.txt",
  dir = getwd()
)
# Read the dataset: one line per code, stored in a single-column tibble
df_raw <- tibble(read_lines("icd10cm-codes- April 1 2023.txt")) |>
  clean_names()
# Clean the dataset: the first 7 characters hold the code, the rest is the description
df_clean <- df_raw |>
  rename(code_and_desc = read_lines_icd10cm_codes_april_1_2023_txt) |>
  mutate(
    code = str_sub(code_and_desc, start = 1, end = 7),
    description = str_sub(code_and_desc, start = 8)
  ) |>
  select(-code_and_desc) |>
  # trim the padding around codes and descriptions
  mutate(across(everything(), str_squish))
# Let's rename it
icd10_dictionary <- df_clean
Testing the new dataset
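Before relying on a dictionary like icd10_dictionary, it may be worth checking that each code appears only once, because a duplicated code would duplicate rows in the left_join(). The sketch below runs that check on a toy dictionary with a deliberate duplicate; toy_dictionary, duplicated_codes and deduplicated are illustrative names, not objects created elsewhere in this tutorial.

```r
library(dplyr)
library(tibble)

# Toy dictionary with one deliberately duplicated code
toy_dictionary <- tibble(
  code = c("U071", "X38XXXA", "X38XXXA"),
  description = c("COVID-19", "Flood, initial encounter", "Flood, initial encounter")
)

# Codes that occur more than once
duplicated_codes <- toy_dictionary |>
  count(code, name = "n_rows") |>
  filter(n_rows > 1)

# Keep the first row for each code
deduplicated <- distinct(toy_dictionary, code, .keep_all = TRUE)

duplicated_codes
```

If duplicated_codes comes back empty for your real dictionary, the join will return exactly one row per input row.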
- From the previous step, we generated icd10_dictionary, which we will use for the translation. Now, I will use the dummy_data that we made before, but I will add a new code that did not exist in 2016.
new_dummy_data <- dummy_data |>
  add_row(id = 21, code = "U071") # COVID-19 code
Now, let’s translate as we did before:
new_dummy_data |>
  left_join(icd10_dictionary, by = "code")
| id | code | description |
|---|---|---|
| 1 | S52135N | Nondisplaced fracture of neck of left radius, subsequent encounter for open fracture type IIIA, IIIB, or IIIC with nonunion |
| 2 | T2071 | NA |
| 3 | E750 | NA |
| 4 | T375X2S | Poisoning by antiviral drugs, intentional self-harm, sequela |
| 5 | F11151 | Opioid abuse with opioid-induced psychotic disorder with hallucinations |
| 6 | Y801 | Therapeutic (nonsurgical) and rehabilitative physical medicine devices associated with adverse incidents |
| 7 | S91211S | Laceration without foreign body of right great toe with damage to nail, sequela |
| 8 | S65211S | Laceration of superficial palmar arch of right hand, sequela |
| 9 | S62616S | Displaced fracture of proximal phalanx of right little finger, sequela |
| 10 | Y37320A | Military operations involving incendiary bullet, military personnel, initial encounter |
| 11 | M1A172 | NA |
| 12 | S00252D | Superficial foreign body of left eyelid and periocular area, subsequent encounter |
| 13 | M651 | NA |
| 14 | X38XXXA | Flood, initial encounter |
| 15 | T17400 | NA |
| 16 | S8254XQ | Nondisplaced fracture of medial malleolus of right tibia, subsequent encounter for open fracture type I or II with malunion |
| 17 | S52515F | Nondisplaced fracture of left radial styloid process, subsequent encounter for open fracture type IIIA, IIIB, or IIIC with routine healing |
| 18 | T730XXS | Starvation, sequela |
| 19 | S61356S | Open bite of right little finger with damage to nail, sequela |
| 20 | M1A1220 | Lead-induced chronic gout, left elbow, without tophus (tophi) |
| 21 | U071 | COVID-19 |
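- This detected COVID-19 successfully, but you can see that we missed some codes too. I believe one reason is that some ICD-10 codes were deleted, retired, or modified over time. In this example, though, the five missing codes (T2071, E750, M1A172, M651, T17400) all look like parent codes that need additional characters to be complete, so it is likely that this CDC file lists only complete codes, whereas icd10cm2016 also includes the parent codes. In such scenarios, I would apply the same steps using different versions of the ICD-10 code lists to capture as many codes as I can.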
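One way to combine several versions is to stack the dictionaries, newest first, and keep the first description found for each code. The sketch below is only an illustration of that idea: dict_2023 and dict_2016 are tiny hand-typed stand-ins for icd10_dictionary and icd10cm2016, the release column is my addition, and "ZZZZ" is a made-up placeholder that matches nothing.

```r
library(dplyr)
library(tibble)

# Tiny stand-ins for two dictionary versions
dict_2023 <- tibble(
  code = c("U071", "X38XXXA"),
  description = c("COVID-19", "Flood, initial encounter")
)
dict_2016 <- tibble(
  code = c("E750", "X38XXXA"),
  description = c("GM2 gangliosidosis", "Flood, initial encounter")
)

# Stack newest first; distinct() keeps the first row it sees for each code
combined_dictionary <- bind_rows(
  mutate(dict_2023, release = "2023"),
  mutate(dict_2016, release = "2016")
) |>
  distinct(code, .keep_all = TRUE)

# "ZZZZ" is a placeholder, not a real ICD-10 code
toy_data <- tibble(id = 1:4, code = c("U071", "E750", "X38XXXA", "ZZZZ"))

translated <- left_join(toy_data, combined_dictionary, by = "code")

# Codes that no version could translate
still_missing <- filter(translated, is.na(description))

translated
```

The release column records which version supplied each description, and still_missing lists whatever remains untranslated so you can decide whether another version is worth adding.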
And that’s it! Hope you found this useful. Don’t hesitate to reach out by email for questions: aabdel51@uic.edu