Background
Spontaneous intracerebral haemorrhage is bleeding within the brain parenchyma in the absence of trauma or surgery, which may extend into the ventricles and subarachnoid space. The volumes of intracerebral haemorrhage (ICH), perihaematomal oedema (PHE) and intraventricular haemorrhage (IVH) are well-established imaging biomarkers and consistent independent predictors of functional outcome after spontaneous ICH. Manual delineation of these biomarkers is labour intensive and prone to human error. An efficient and effective automated segmentation tool could therefore provide quantitative outcome measures for clinical trials and accelerate studies of the disease in large patient cohorts.
Objective
To benchmark and refine deep learning algorithms for the semantic segmentation of ICH, PHE and IVH on non-contrast computed tomography (NCCT) scans of spontaneous ICH patients.
Methods
We compared the performance of multiple deep learning models on 1,732 annotated baseline NCCT scans obtained from the TICH-2 international multicentre trial. In addition, different loss functions were examined with the 3D nnUNet to address the problem of class imbalance.
Volumetric agreement between automated and human measurements was assessed using concordance and Bland–Altman plots. Spatial overlap between ground truth and predicted lesions was quantified using the Dice similarity coefficient (DSC).
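The DSC measures voxel-wise overlap between two binary masks as twice the intersection divided by the sum of the mask sizes. A minimal NumPy sketch of the metric (illustrative only, not the study's evaluation code; the empty-mask convention is an assumption):

```python
import numpy as np

def dice_coefficient(ground_truth: np.ndarray, prediction: np.ndarray) -> float:
    """Dice similarity coefficient between two binary segmentation masks.

    DSC = 2|A ∩ B| / (|A| + |B|). By a common convention (assumed here),
    two empty masks score 1.0 (perfect agreement on the absence of lesion).
    """
    gt = ground_truth.astype(bool)
    pred = prediction.astype(bool)
    total = gt.sum() + pred.sum()
    if total == 0:
        return 1.0  # both masks empty
    return float(2.0 * np.logical_and(gt, pred).sum() / total)
```

For example, two masks of four voxels each that share two voxels give a DSC of 2×2/8 = 0.5.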
Results
On the test cohort (n=174, 10% of the dataset), the top-performing models achieved median DSCs of 0.92 (IQR 0.89-0.94), 0.66 (0.58-0.71) and 1.00 (0.87-1.00) for ICH, PHE and IVH respectively. UNet-based networks showed satisfactory performance on ICH and PHE segmentation, with no significant difference between them (p>0.05), yet all nnUNet variants achieved significantly higher accuracy than BLAST-CT and DeepLabv3+ for all labels (p<0.05). In particular, the Focal model showed a significant improvement in IVH segmentation over Tversky, 2D nnUNet, UNet, BLAST-CT and DeepLabv3+ (p<0.05).
For the Focal model, ICH, PHE and IVH volumes achieved concordance of 0.98 (substantial), 0.88 (poor) and 0.99 (substantial) respectively. The mean volumetric differences between ground truth and prediction were 0.32 mL (95% CI −8.35 to 9.00), 1.14 mL (−9.53 to 11.8) and 0.06 mL (−1.71 to 1.84) respectively.
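The Bland–Altman intervals above summarise the spread of paired volume differences around the mean difference. A short sketch of the standard calculation (mean difference and mean ± 1.96 × SD limits of agreement; the function name and data are illustrative, not the study's code):

```python
import numpy as np

def bland_altman_limits(reference_ml, predicted_ml):
    """Mean volumetric difference and 95% limits of agreement (in mL).

    Standard Bland–Altman limits: mean difference ± 1.96 × SD of the
    paired differences (sample SD, ddof=1).
    """
    diffs = np.asarray(predicted_ml, dtype=float) - np.asarray(reference_ml, dtype=float)
    mean_diff = diffs.mean()
    sd = diffs.std(ddof=1)
    return mean_diff, mean_diff - 1.96 * sd, mean_diff + 1.96 * sd
```

A plot of the per-scan difference against the per-scan mean, with these three horizontal lines, gives the usual Bland–Altman figure.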
Conclusions
UNet-based networks provide robust segmentation of spontaneous ICH on CT images, and Focal loss can address the class imbalance arising from IVH being present in only 30% of the dataset.
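Focal loss mitigates class imbalance by down-weighting well-classified voxels, so the rare foreground class (here IVH) dominates less of the loss. An illustrative NumPy sketch of the binary form (the γ=2, α=0.25 defaults are the commonly cited ones and are not necessarily the values used in this study):

```python
import numpy as np

def focal_loss(p: np.ndarray, y: np.ndarray,
               gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Binary focal loss, averaged over voxels.

    Scales cross-entropy by (1 - p_t)^gamma, where p_t is the predicted
    probability of the true class, so easy voxels contribute little and
    hard (often minority-class) voxels dominate the gradient.
    """
    eps = 1e-7
    p = np.clip(p, eps, 1.0 - eps)
    p_t = np.where(y == 1, p, 1.0 - p)          # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))
```

With gamma = 0 and alpha = 0.5 this reduces to half the ordinary binary cross-entropy, which makes the modulating effect of the (1 − p_t)^γ factor easy to verify.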