Is it reliable?
Image segmentation is a computer vision technique that classifies the pixels of an image into classes, categories, or labels [1].
Computers can do this using a variety of methods, but essentially they rely on the features of a pixel and its surrounding pixels to identify it.
It combines image classification and object detection: each pixel is classified as belonging to a particular label, which in turn defines the boundary of that label.
The output of the model is known as a segmentation mask.
flowchart LR
    A[Image Segmentation] --> B[Semantic Segmentation]
    A --> C[Instance Segmentation]
    A --> D[Panoptic Segmentation]
Types of image segmentation
Image segmentation vs object detection vs Image classification
Semantic segmentation is the simplest type of image segmentation: the model assigns a class label to every pixel in the image.
It carries no information on other aspects, such as the number of objects.
Instance segmentation models will delineate the shape of each object instance.
It therefore gives information about the number of objects
This is more difficult, as the model has to distinguish between similar objects that may partly overlap.
Pixels that do not belong to an object instance are not classified at all.
Panoptic segmentation is a combination of semantic segmentation and instance segmentation.
It classifies every pixel in the image and also assigns a discrete object ID, thereby allowing distinction between objects of the same type.
It is the most difficult type of image segmentation.
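The difference between the three outputs can be sketched with small label arrays. This is a toy illustration only: the image, the class numbering (0 = background, 1 = object) and the instance IDs are made up, and a real model would predict these arrays rather than have them typed in.

```python
import numpy as np

# Hypothetical 4x6 image with two separate objects of the same class.
# Semantic segmentation: one class label per pixel (0 = background, 1 = object).
semantic = np.array([
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
])

# Instance segmentation: each object gets its own ID; 0 means "not an object".
instance = np.array([
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
])

# Panoptic segmentation: every pixel carries a (class label, instance ID) pair.
panoptic = np.stack([semantic, instance], axis=-1)

# The semantic mask alone cannot tell how many objects there are;
# the instance IDs can.
n_objects = len(np.unique(instance[instance > 0]))
print("objects found:", n_objects)
print("pixel (1, 4) is (class, instance):", tuple(panoptic[1, 4]))
```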
Issues with segmentation
Parameter | Volume Overlap Based Metrics | Surface Based Metrics | Moment Based Metrics |
---|---|---|---|
Examples | Dice similarity, Jaccard index | Hausdorff distance, MSD | Centroid distance |
Advantage | Easy to compute | Sensitive to boundaries | Useful for smaller volumes |
Disadvantage | Low sensitivity to complex boundaries | Requires pre-specified tolerance thresholds | Insensitive to contour shape |
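As a rough sketch of how the overlap and moment based metrics in the table are computed, the following uses two toy binary masks. The function names and the masks are hypothetical, distances are in pixel units rather than millimetres, and the convention of returning 1.0 when both masks are empty is an assumption.

```python
import numpy as np

def dice_and_jaccard(reference, predicted):
    """Return (Dice similarity, Jaccard index) for two binary masks."""
    ref = np.asarray(reference, dtype=bool)
    pred = np.asarray(predicted, dtype=bool)
    intersection = np.logical_and(ref, pred).sum()
    union = np.logical_or(ref, pred).sum()
    total = ref.sum() + pred.sum()
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0, 1.0
    return 2.0 * intersection / total, intersection / union

def centroid_distance(reference, predicted):
    """Euclidean distance between mask centroids, in pixel units."""
    c_ref = np.argwhere(np.asarray(reference, dtype=bool)).mean(axis=0)
    c_pred = np.argwhere(np.asarray(predicted, dtype=bool)).mean(axis=0)
    return float(np.linalg.norm(c_ref - c_pred))

# Toy example: the predicted 2x2 square is shifted one pixel to the right.
reference = np.zeros((4, 4), dtype=bool)
reference[1:3, 1:3] = True
predicted = np.zeros((4, 4), dtype=bool)
predicted[1:3, 2:4] = True

dice, jaccard = dice_and_jaccard(reference, predicted)
shift = centroid_distance(reference, predicted)
print(f"Dice={dice:.2f}  Jaccard={jaccard:.2f}  centroid distance={shift:.1f} px")
```

Note that the one-pixel shift halves the Dice score for this small structure, which is one reason overlap metrics are hard to compare across volumes of very different size.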
timeline
    section Simple Methods
        1990 : Intensity based : Edge detection : Active shape models
    section Atlas based
        2000 : Deformable image registration
    section Non deep learning
        2003 : Support vector machines
        2012 : Random forest
    section Deep learning
        2016 : Convolutional Neural Networks
        2019 : Generative Adversarial Models
Note
Timeline adapted from Harrison K et al. Machine Learning for Auto-Segmentation in Radiotherapy Planning. Clin Oncol 2022;34:74–88.
Parameter | Intensity Based Methods | Deformable Registration Based | Simple Machine Learning | Deep Learning |
---|---|---|---|---|
Computation | Simple | Simple | Simple | High |
Training cases needed | None | 20 - 30 | 50 - 100 | >100 |
How it works | Relies on differences in HU. | Uses deformable registration | Feature selection of geometric features | Millions of parameters “learned”. |
Context “Learned” | -NA- | -NA- | Geometric features | Anatomic features |
Limitation | Limited utility in soft tissue | Limited capability to adapt to anatomy | Feature selection is not robust | Difficult to optimize and explain how it works |
Intensity based methods are simple to use:
They rely on the intensity difference between anatomical structures and on the homogeneity of intensity within a structure
Examples: automatic lung segmentation, bladder segmentation using points
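A minimal sketch of the intensity based idea is a simple HU window. The function name, the toy slice and the window of -1000 to -400 HU are illustrative assumptions, not clinically validated values; a real lung tool would also need to exclude the air outside the patient.

```python
import numpy as np

def threshold_segment(hu_image, lower, upper):
    """Return a boolean mask of pixels whose HU value lies in [lower, upper]."""
    hu = np.asarray(hu_image)
    return (hu >= lower) & (hu <= upper)

# Toy "CT slice": soft tissue at about 40 HU with a lung-like region at -800 HU.
ct_slice = np.full((5, 5), 40)
ct_slice[1:4, 1:4] = -800

# Assumed, illustrative HU window for air-filled lung.
lung_mask = threshold_segment(ct_slice, lower=-1000, upper=-400)
print(lung_mask.astype(int))
```

The same approach struggles in soft tissue, where neighbouring organs have nearly identical HU values and no single window separates them.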
Shape based methods start with a shape, which is then deformed to match an anatomical structure.
Active contour methods start with a predefined shape that deforms to the organ
Shape models use pre-existing manually drawn shapes to build a statistical model of the shape of a volume
Active appearance models extend this by including information about the HU values and textures of the volume
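One common way to build such a statistical shape model is principal component analysis (PCA) over corresponding contour points. The sketch below assumes this approach and uses synthetic ellipses in place of manually drawn contours; `make_ellipse` and all the numbers are hypothetical.

```python
import numpy as np

def make_ellipse(a, b, n_points=16):
    """Return an ellipse contour as a flat vector (x0, y0, x1, y1, ...)."""
    t = np.linspace(0, 2 * np.pi, n_points, endpoint=False)
    return np.column_stack([a * np.cos(t), b * np.sin(t)]).ravel()

# Stand-in for manually drawn training shapes: ellipses of varying width.
training = np.array([make_ellipse(a, 1.0) for a in (1.0, 1.5, 2.0, 2.5, 3.0)])

# Mean shape and principal modes of variation (PCA via SVD).
mean_shape = training.mean(axis=0)
centered = training - mean_shape
_, singular_values, modes = np.linalg.svd(centered, full_matrices=False)
explained = singular_values**2 / np.sum(singular_values**2)

# A new plausible shape: the mean plus one standard deviation along mode 1.
std_mode1 = singular_values[0] / np.sqrt(len(training) - 1)
new_shape = (mean_shape + std_mode1 * modes[0]).reshape(-1, 2)

print("variance explained by first mode:", round(float(explained[0]), 3))
```

Because the toy shapes differ only in width, a single mode captures essentially all the variation; real organs need several modes.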
Atlas based segmentation relies on a set of cases in which the desired volumes have been manually segmented, called an atlas
A deformable image registration algorithm is used to deform the “atlas” image onto the new image.
The contours on the “atlas” image are then deformed based on the deformation vectors obtained from the image registration
Performance depends on the anatomical variability between the atlas and the new patient
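The contour propagation step can be sketched as follows. The deformation field is assumed to be given already (computing it is the job of the registration algorithm, which is not shown), and `warp_labels`, the nearest-neighbour sampling and the uniform shift are illustrative choices.

```python
import numpy as np

def warp_labels(atlas_labels, displacement):
    """Propagate atlas labels onto the new image grid.

    displacement[r, c] holds the (row, col) offset from pixel (r, c) of the
    new image to the matching location in the atlas. Nearest-neighbour
    sampling keeps the labels as integers.
    """
    rows, cols = np.indices(atlas_labels.shape)
    src_r = np.rint(rows + displacement[..., 0]).astype(int)
    src_c = np.rint(cols + displacement[..., 1]).astype(int)
    # Clamp to the image so out-of-range samples reuse the edge pixels.
    src_r = np.clip(src_r, 0, atlas_labels.shape[0] - 1)
    src_c = np.clip(src_c, 0, atlas_labels.shape[1] - 1)
    return atlas_labels[src_r, src_c]

# Toy atlas: a 2x2 "organ" (label 1) near the top-left corner.
atlas = np.zeros((6, 6), dtype=int)
atlas[1:3, 1:3] = 1

# Assumed deformation field: every pixel of the new image maps to the
# atlas location two rows up and two columns left.
field = np.full((6, 6, 2), -2.0)

propagated = warp_labels(atlas, field)
print(propagated)
```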
flowchart TD
    A[Atlas Based automatic segmentation] --> B[Single patient atlas]
    A --> C[Multi Patient Atlas]
    A --> D[STAPLE*]
*STAPLE = Simultaneous Truth & Performance Level Estimation Algorithm
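With a multi patient atlas, each atlas produces its own candidate contour and the candidates have to be fused. The sketch below uses plain majority voting as a simpler stand-in; it is not STAPLE, which instead weights each candidate by an estimate of its performance. The function name and the masks are hypothetical.

```python
import numpy as np

def majority_vote(candidate_masks):
    """Fuse binary masks: a pixel is kept if more than half the atlases agree."""
    stack = np.asarray(candidate_masks, dtype=bool)
    votes = stack.sum(axis=0)
    return votes * 2 > stack.shape[0]

# Three candidate segmentations of the same 2x3 image from three atlases.
candidates = [
    [[1, 1, 0], [0, 0, 0]],
    [[1, 0, 0], [0, 1, 0]],
    [[1, 1, 0], [0, 0, 1]],
]

fused = majority_vote(candidates)
print(fused.astype(int))
```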
Convolution refers to a mathematical operation on two functions that produces a third function: one function is slid over the other and, at each position, the overlapping values are multiplied and summed
Confusingly convolution refers to both the result of the operation as well as the process of the operation. 😱
CNNs are a special type of neural network designed for grids / arrays of numbers
Reduce the number of parameters (weights) that have to be learned from the input
Take into account the correlation between adjacent pixels of data
Building a CNN
In the image above we show a hypothetical image with white pixels (with value 255) and dark pixels with value 0.
Also shown is a “kernel” or “filter”. This is an array of numbers that is designed to extract correlated features in the image.
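The kernel operation can be sketched on a toy image of white (255) and dark (0) pixels. The image, the vertical edge kernel and the function name are illustrative assumptions, not the ones in the original figure. The code flips the kernel, as the mathematical definition of convolution requires; many deep learning libraries skip the flip and compute cross-correlation, which makes no practical difference when the kernel values are learned.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """2D convolution without padding (output shrinks by kernel size - 1)."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    flipped = kernel[::-1, ::-1]  # true convolution flips the kernel
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the overlapping values and sum them.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out

# Toy image: left half white (255), right half dark (0).
image = np.zeros((5, 6))
image[:, :3] = 255

# Assumed kernel that responds to vertical edges.
kernel = np.array([[1, 0, -1]] * 3)

response = convolve2d_valid(image, kernel)
print(response)
```

The response is zero in the uniform regions and large in magnitude only where the window straddles the white to dark boundary, which is how a kernel extracts a correlated feature such as an edge.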
Images from Visualizing and Understanding Convolutional Networks [3]
In the images we can see what each convolution layer of a CNN will be detecting best.
Progressively across layers, features become grouped into like objects.
The variations reduce.
The discriminating parts of the images are “enhanced”
Heilemann et al (2023) [4] evaluated three AI algorithms for generating organ at risk (OAR) contours
100 patients across 6 treatment sites.
Only organs at risk were segmented
Inter-rater variability assessed (clinical manual contour as reference)
Class I | Class II | Class III |
---|---|---|
Good performance | Reasonable performance | Unacceptable clinically |
Accepted directly | Small adjustments | Not worthwhile to correct |
Usually minimal corrections | Some corrections but load acceptable | Manual delineation faster |
Brain, eye, femoral head, kidney etc | Rectum, mandible, lips, heart, bladder etc | Stomach, Bowel etc |
Bits and pieces of contours
Disconnected segments
Systematic Shift
Automatic segmentation systems currently available have a reasonable performance for delineation of organs at risk
Target volume delineation accuracy remains challenging and uncertain.
Modern nnU-Net based architectures give very good performance in most scenarios
However, these models are not “intelligent” and have no “understanding” of anatomy; they can fail spectacularly and sometimes insidiously
They will not replace the human in the loop at any point in the near future.
1. What is image segmentation? https://www.ibm.com/topics/image-segmentation
2. Nielsen CP, Lorenzen EL, Jensen K, Eriksen JG, Johansen J, Gyldenkerne N, et al. Interobserver variation in organs at risk contouring in head and neck cancer according to the DAHANCA guidelines. Radiother Oncol 2024;197:110337. https://doi.org/10.1016/j.radonc.2024.110337.
3. Zeiler MD, Fergus R. Visualizing and understanding convolutional networks. arXiv [cs.CV] 2013.
4. Heilemann G, Buschmann M, Lechner W, Dick V, Eckert F, Heilmann M, et al. Clinical implementation and evaluation of auto-segmentation tools for multi-site contouring in radiotherapy. Phys Imaging Radiat Oncol 2023;28:100515. https://doi.org/10.1016/j.phro.2023.100515.