# Load necessary packages
library(tidyverse)
library(icd.data)
library(archive)
library(janitor)
ICD 10 codes translation using R
- January 2023: Please see Section 5 ("ICD-10 codes after 2016") for how to deal with recent ICD-10 codes.
About
In this tutorial, I will demonstrate how to translate ICD-10-CM codes into their descriptions without writing laborious, repetitive if_else() or case_when() calls.
Necessary packages
We need the tidyverse and icd.data packages for this tutorial, plus archive and janitor for the last section. If you don’t have them installed, run install.packages(c("tidyverse", "icd.data", "archive", "janitor")) in your console first. Once the installation is done, load the packages with the library() calls shown at the top of this page.
Create dummy data
I will demonstrate the method with a dummy dataset that has two columns:
- id: Just random identifiers, nothing special!
- code: ICD-10 codes that you might have in your dataset and want to translate.
You don’t need to create this data. I am just doing this for illustration.
# Create dummy data having some ids and codes (I assume your data is somewhat similar to this dummy_data)
dummy_data <- tibble(
  id = 1:20,
  # Just borrowing 20 random ICD-10 codes from the icd10cm2016 dataset from the icd.data package
  code = sample_n(as_tibble(icd10cm2016$code), 20) |> as_vector()
)
dummy_data
id | code |
---|---|
1 | S52135N |
2 | T2071 |
3 | E750 |
4 | T375X2S |
5 | F11151 |
6 | Y801 |
7 | S91211S |
8 | S65211S |
9 | S62616S |
10 | Y37320A |
11 | M1A172 |
12 | S00252D |
13 | M651 |
14 | X38XXXA |
15 | T17400 |
16 | S8254XQ |
17 | S52515F |
18 | T730XXS |
19 | S61356S |
20 | M1A1220 |
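
Because the codes are sampled at random, your table will differ from mine on every run. If you want a reproducible draw, one option is to fix the random seed before sampling. The sketch below is only an illustration: code_pool is a hypothetical stand-in for icd10cm2016$code, and make_dummy() is a helper name I made up for this example.

```r
library(tibble)

# Hypothetical pool of codes standing in for icd10cm2016$code
code_pool <- c("S52135N", "T2071", "E750", "T375X2S", "F11151", "Y801")

# Hypothetical helper: build a dummy dataset of n rows from a fixed seed
make_dummy <- function(n, seed) {
  set.seed(seed)  # fixing the seed makes sample() return the same draw each time
  tibble(id = seq_len(n), code = sample(code_pool, n))
}

first_draw  <- make_dummy(4, seed = 2023)
second_draw <- make_dummy(4, seed = 2023)

# Both draws contain the same codes in the same order
identical(first_draw, second_draw)
```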
Translation
Now let’s translate the codes to their literal meanings. We will use the icd10cm2016 dataset from the icd.data package, which is available as soon as icd.data is loaded. Before the translation, we need to convert the code variable in icd10cm2016 from the icd10cm type to character, so that it matches the character codes in our data (I am not sure why the package’s author used this unusual variable type).
icd10cm2016 <- icd10cm2016 |>
  mutate(code = as.character(code))
Now we can do the translation with a simple left_join():
# join on the shared code column
left_join(dummy_data, icd10cm2016, by = "code") |>
  # keep the relevant variables
  select(id, code, short_desc, long_desc)
id | code | short_desc | long_desc |
---|---|---|---|
1 | S52135N | Nondisp fx of nk of l rad, 7thN | Nondisplaced fracture of neck of left radius, subsequent encounter for open fracture type IIIA, IIIB, or IIIC with nonunion |
2 | T2071 | Corrosion of third degree of ear [any part, except ear drum] | Corrosion of third degree of ear [any part, except ear drum] |
3 | E750 | GM2 gangliosidosis | GM2 gangliosidosis |
4 | T375X2S | Poisoning by antiviral drugs, intentional self-harm, sequela | Poisoning by antiviral drugs, intentional self-harm, sequela |
5 | F11151 | Opioid abuse w opioid-induced psychotic disorder w hallucin | Opioid abuse with opioid-induced psychotic disorder with hallucinations |
6 | Y801 | Theraputc and rehab physical medicine devices assoc w incdt | Therapeutic (nonsurgical) and rehabilitative physical medicine devices associated with adverse incidents |
7 | S91211S | Lac w/o fb of right great toe w damage to nail, sequela | Laceration without foreign body of right great toe with damage to nail, sequela |
8 | S65211S | Laceration of superficial palmar arch of right hand, sequela | Laceration of superficial palmar arch of right hand, sequela |
9 | S62616S | Disp fx of proximal phalanx of right little finger, sequela | Displaced fracture of proximal phalanx of right little finger, sequela |
10 | Y37320A | Milt op involving incendiary bullet, milt, init | Military operations involving incendiary bullet, military personnel, initial encounter |
11 | M1A172 | Lead-induced chronic gout, left ankle and foot | Lead-induced chronic gout, left ankle and foot |
12 | S00252D | Superficial fb of left eyelid and periocular area, subs | Superficial foreign body of left eyelid and periocular area, subsequent encounter |
13 | M651 | Other infective (teno)synovitis | Other infective (teno)synovitis |
14 | X38XXXA | Flood, initial encounter | Flood, initial encounter |
15 | T17400 | Unspecified foreign body in trachea causing asphyxiation | Unspecified foreign body in trachea causing asphyxiation |
16 | S8254XQ | Nondisp fx of med malleolus of r tibia, 7thQ | Nondisplaced fracture of medial malleolus of right tibia, subsequent encounter for open fracture type I or II with malunion |
17 | S52515F | Nondisp fx of l radial styloid pro, 7thF | Nondisplaced fracture of left radial styloid process, subsequent encounter for open fracture type IIIA, IIIB, or IIIC with routine healing |
18 | T730XXS | Starvation, sequela | Starvation, sequela |
19 | S61356S | Open bite of right little finger w damage to nail, sequela | Open bite of right little finger with damage to nail, sequela |
20 | M1A1220 | Lead-induced chronic gout, left elbow, without tophus | Lead-induced chronic gout, left elbow, without tophus (tophi) |
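
A left join keeps every row of your data and fills the description with NA when a code is not found in the dictionary, so it is worth checking for unmatched codes after translating. Here is a minimal sketch with anti_join(), assuming a tiny made-up dictionary: mini_dictionary and mini_data are hypothetical objects, with descriptions copied from the table above.

```r
library(dplyr)
library(tibble)

# Hypothetical miniature dictionary (descriptions copied from the table above)
mini_dictionary <- tibble(
  code = c("E750", "X38XXXA"),
  long_desc = c("GM2 gangliosidosis", "Flood, initial encounter")
)

# Hypothetical data; "U071" is deliberately absent from the mini dictionary
mini_data <- tibble(id = 1:3, code = c("E750", "U071", "X38XXXA"))

# Translation: unmatched codes get NA in long_desc
translated <- left_join(mini_data, mini_dictionary, by = "code")

# Rows of mini_data whose code has no match in the dictionary
unmatched <- anti_join(mini_data, mini_dictionary, by = "code")

translated
unmatched
```

If unmatched has zero rows, every code in your data was translated.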
ICD-10 codes after 2016
The demonstration above will not help with ICD-10 codes that were introduced after 2016, because the dictionary we joined against is the 2016 release.
Below, I will demonstrate how to solve this issue.
Get the most recent version of ICD-10
- Since I am not aware of a ready-made package with the latest codes, we will get the most recent release from the CDC website.
- I downloaded the file called icd10cm-code descriptions- April 1 2023.zip
#| warning: false
#| output: false
# Download the zip file and extract dataset
archive::archive_extract(
  "https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Publications/ICD10CM/April-1-2023-Update/icd10cm-code%20descriptions-%20April%201%202023.zip",
  files = "icd10cm-codes- April 1 2023.txt", dir = getwd())
# read the dataset
df_raw <- tibble(read_lines("icd10cm-codes- April 1 2023.txt")) |>
  clean_names()
# clean the dataset: the first 7 characters hold the code, the rest is the description
df_clean <- df_raw |>
  rename(code_and_desc = read_lines_icd10cm_codes_april_1_2023_txt) |>
  mutate(code = str_sub(code_and_desc, start = 1, end = 7)) |>
  mutate(description = str_sub(code_and_desc, start = 8)) |>
  select(-code_and_desc) |>
  mutate_all(str_squish)
# Let's rename it
icd10_dictionary <- df_clean
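
To see what the cleaning step does without downloading anything, here is a small sketch of the same str_sub() and str_squish() logic on a few in-memory lines. The lines in raw_lines are hypothetical: I typed them to mimic the layout the code above assumes (the code padded to 7 characters, then the description), so check them against the real file before relying on this.

```r
library(tibble)
library(dplyr)
library(stringr)

# Hypothetical lines mimicking the assumed layout:
# code padded to 7 characters, a space, then the description
raw_lines <- c(
  "U071    COVID-19",
  "X38XXXA Flood, initial encounter",
  "T730XXS Starvation, sequela"
)

parsed <- tibble(code_and_desc = raw_lines) |>
  mutate(
    # characters 1-7 hold the code; str_squish() drops the padding
    code = str_squish(str_sub(code_and_desc, start = 1, end = 7)),
    # everything from character 8 onward is the description
    description = str_squish(str_sub(code_and_desc, start = 8))
  ) |>
  select(code, description)

parsed
```

Naming the column directly inside tibble(), as done here, is one way to avoid the long auto-generated column name that needs renaming afterwards.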
Testing the new dataset
- From the previous step, we generated icd10_dictionary, which we will use for the translation. Now, I will use the dummy_data that we made before, but I will add a new code that did not exist in 2016.
new_dummy_data <- dummy_data |>
  add_row(id = 21, code = "U071") # COVID-19 code
Now, let’s translate as we did before:
new_dummy_data |>
  left_join(icd10_dictionary, by = "code")
id | code | description |
---|---|---|
1 | S52135N | Nondisplaced fracture of neck of left radius, subsequent encounter for open fracture type IIIA, IIIB, or IIIC with nonunion |
2 | T2071 | NA |
3 | E750 | NA |
4 | T375X2S | Poisoning by antiviral drugs, intentional self-harm, sequela |
5 | F11151 | Opioid abuse with opioid-induced psychotic disorder with hallucinations |
6 | Y801 | Therapeutic (nonsurgical) and rehabilitative physical medicine devices associated with adverse incidents |
7 | S91211S | Laceration without foreign body of right great toe with damage to nail, sequela |
8 | S65211S | Laceration of superficial palmar arch of right hand, sequela |
9 | S62616S | Displaced fracture of proximal phalanx of right little finger, sequela |
10 | Y37320A | Military operations involving incendiary bullet, military personnel, initial encounter |
11 | M1A172 | NA |
12 | S00252D | Superficial foreign body of left eyelid and periocular area, subsequent encounter |
13 | M651 | NA |
14 | X38XXXA | Flood, initial encounter |
15 | T17400 | NA |
16 | S8254XQ | Nondisplaced fracture of medial malleolus of right tibia, subsequent encounter for open fracture type I or II with malunion |
17 | S52515F | Nondisplaced fracture of left radial styloid process, subsequent encounter for open fracture type IIIA, IIIB, or IIIC with routine healing |
18 | T730XXS | Starvation, sequela |
19 | S61356S | Open bite of right little finger with damage to nail, sequela |
20 | M1A1220 | Lead-induced chronic gout, left elbow, without tophus (tophi) |
21 | U071 | COVID-19 |
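- This detected the COVID-19 code successfully, but you can see that some other codes were not matched. I believe the reason is that some ICD-10 codes were deleted, retired, or modified over time. Another possible reason, which I have not checked here, is that this CDC file may list only complete (billable) codes, whereas icd10cm2016 also includes shorter category-level codes such as E750. In such scenarios, I would repeat the steps above with different versions of the ICD-10-CM code list to capture as many codes as I can.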
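
The idea of falling back on several releases can be sketched as follows. This is only an illustration under stated assumptions: dict_2016 and dict_2023 are hypothetical miniature dictionaries (descriptions copied from the tables above), and the release column is something I added here so the newest description wins when a code appears in both.

```r
library(dplyr)
library(tibble)

# Hypothetical miniature dictionaries standing in for two releases
dict_2016 <- tibble(
  code = c("E750", "T730XXS"),
  description = c("GM2 gangliosidosis", "Starvation, sequela"),
  release = 2016
)
dict_2023 <- tibble(
  code = c("T730XXS", "U071"),
  description = c("Starvation, sequela", "COVID-19"),
  release = 2023
)

# Stack the releases, newest first, and keep one row per code
combined_dictionary <- bind_rows(dict_2016, dict_2023) |>
  arrange(desc(release)) |>
  distinct(code, .keep_all = TRUE)

# Hypothetical data containing an old-only, a shared, and a new-only code
mini_data <- tibble(id = 1:3, code = c("E750", "T730XXS", "U071"))

result <- left_join(mini_data, combined_dictionary, by = "code")
result
```

The release column in the result also tells you which version each description came from, which is handy when you report your methods.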
And that’s it! Hope you found this useful. Don’t hesitate to reach out by email for questions: aabdel51@uic.edu