Quantify structure reconstruction accuracy
The dataset used for this tutorial is a reconstruction of a human metastatic lymph node and can be downloaded from:
https://drive.google.com/drive/folders/1SwzdBRVat83-pmN3RTKLBuivWv5l7lfs?usp=sharing
This module provides comprehensive metrics for evaluating structural accuracy of interpolation results, supporting both 2D slice-level and 3D volume-level evaluation with boundary-based and voxel-based metrics.
Features
2D and 3D Support: Evaluate both individual slices and entire volumes
Boundary-based Metrics: Hausdorff Distance, Average Surface Distance, Chamfer Distance, Boundary IoU
Voxel-based Metrics: Dice Coefficient, Jaccard Index (IoU), FPR, FNR
Batch Evaluation: Process multiple slice pairs with automatic file matching
Metrics Overview
Boundary-based Metrics
Boundary IoU: Intersection over Union on boundary bands
Hausdorff Distance (HD): Largest distance from any point on one boundary to its nearest point on the other boundary
HD95: 95th percentile of those nearest-boundary distances (more robust to outliers than HD)
Average Surface Distance (ASD): Symmetric mean surface distance
Chamfer Distance: Average distance from each boundary point to the nearest point on the other boundary
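To make the boundary-based definitions above concrete, here is a minimal NumPy sketch. It is not UniST's implementation. It assumes that the boundary is the set of foreground pixels with at least one 4-connected background neighbour, that distances are Euclidean in pixel units, that HD95 is the larger of the two directed 95th-percentile distances, and that Chamfer distance is the sum of the two directed mean distances (one common convention). The helper names `boundary_points` and `surface_distances` are hypothetical, and Boundary IoU is not covered.

```python
import numpy as np

def boundary_points(mask):
    """Return (N, 2) coordinates of foreground pixels that touch background."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    # A pixel is interior when all four direct neighbours are foreground.
    interior = (
        padded[:-2, 1:-1] & padded[2:, 1:-1] & padded[1:-1, :-2] & padded[1:-1, 2:]
    )
    return np.argwhere(mask & ~interior)

def surface_distances(true_mask, pred_mask):
    """Boundary distance metrics; assumes both masks are non-empty."""
    a = boundary_points(true_mask)
    b = boundary_points(pred_mask)
    # Brute-force pairwise Euclidean distances (fine for small masks only).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = d.min(axis=1)  # each true boundary point -> nearest predicted point
    b_to_a = d.min(axis=0)  # each predicted boundary point -> nearest true point
    return {
        "HD": float(max(a_to_b.max(), b_to_a.max())),
        "HD95": float(max(np.percentile(a_to_b, 95), np.percentile(b_to_a, 95))),
        "ASD": float((a_to_b.mean() + b_to_a.mean()) / 2),
        "ChamferDistance": float(a_to_b.mean() + b_to_a.mean()),
    }

# Toy example: a 4x4 square and the same square shifted right by one pixel.
true_mask = np.zeros((10, 10), dtype=bool)
true_mask[2:6, 2:6] = True
pred_mask = np.zeros((10, 10), dtype=bool)
pred_mask[2:6, 3:7] = True

metrics = surface_distances(true_mask, pred_mask)
print(metrics)
```

The brute-force distance matrix grows with the product of the two boundary sizes, so a real implementation would more likely use a distance transform or a KD-tree.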
Voxel-based Metrics
Dice Coefficient: F1 score for overlap
Jaccard Index (IoU): Intersection over Union
Hausdorff Distance (Voxel): HD computed on all voxels
HD95 (Voxel): 95th percentile HD on all voxels
Average Surface Distance (Voxel): ASD computed on all voxels
Chamfer Distance (Voxel): Chamfer distance computed on all voxels
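False Positive Rate (FPR): FP / (FP + TN), the fraction of background voxels predicted as foreground
False Negative Rate (FNR): FN / (FN + TP), the fraction of foreground voxels predicted as background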
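The overlap metrics in this list can be sketched from the four confusion-matrix counts. The snippet below is an illustration under the standard textbook definitions; whether UniST computes them in exactly this way (for example, how it handles empty masks) is an assumption, and `overlap_metrics` is a hypothetical helper name.

```python
import numpy as np

def overlap_metrics(true_mask, pred_mask):
    """Dice, IoU, FPR and FNR from binary masks (2D or 3D).

    Assumes foreground and background both occur, so no denominator is zero.
    """
    t = np.asarray(true_mask, dtype=bool)
    p = np.asarray(pred_mask, dtype=bool)
    tp = np.count_nonzero(t & p)    # foreground in both
    fp = np.count_nonzero(~t & p)   # predicted foreground, true background
    fn = np.count_nonzero(t & ~p)   # predicted background, true foreground
    tn = np.count_nonzero(~t & ~p)  # background in both
    return {
        "Dice": 2 * tp / (2 * tp + fp + fn),
        "IoU": tp / (tp + fp + fn),
        "FPR": fp / (fp + tn),
        "FNR": fn / (fn + tp),
    }

# Toy example: a 4x4 square and the same square shifted right by one pixel.
true_mask = np.zeros((10, 10), dtype=bool)
true_mask[2:6, 2:6] = True
pred_mask = np.zeros((10, 10), dtype=bool)
pred_mask[2:6, 3:7] = True

overlap = overlap_metrics(true_mask, pred_mask)
print(overlap)
```

In this toy case TP = 12, FP = 4, FN = 4 and TN = 80, so FPR stays small even though a quarter of the true foreground is missed. The same happens whenever background dominates the image.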
import unist
! pip install imagecodecs
Requirement already satisfied: imagecodecs in /opt/anaconda3/envs/unist/lib/python3.10/site-packages (2025.3.30)
Requirement already satisfied: numpy in /opt/anaconda3/envs/unist/lib/python3.10/site-packages (from imagecodecs) (1.26.4)
2D
from unist.metrics import evaluate_slices
results_df = evaluate_slices(
    true_dir="/Users/shuilan/Documents/GitHub/UniST/tumor_occupancy_tifs_gray", # path to ground truth images (.tif)
    pred_dir="/Users/shuilan/Documents/GitHub/UniST/tumor_occupancy_tifs_rgb_10slices_avg_thresh", # path to predicted images (.tif)
    true_pattern=r'_bin(\d+)_gray\.tif$',
    pred_pattern=r'^occupancy(\d+)\.tif$',
    output_csv="/Users/shuilan/Documents/GitHub/UniST/evaluation_results.csv",
    compute_boundary=True,
    compute_voxel=True,
    boundary_thickness=1,
    boundary_tolerance=1,
    verbose=True
)
Found 15 matched pairs for evaluation.
--------------------------------------------------
Processing pair 1: occupancy_z57.97_bin1_gray.tif vs occupancy1.tif
✓ Computed 13 metrics
Processing pair 2: occupancy_z86.96_bin2_gray.tif vs occupancy2.tif
✓ Computed 13 metrics
Processing pair 3: occupancy_z115.94_bin3_gray.tif vs occupancy3.tif
✓ Computed 13 metrics
Processing pair 4: occupancy_z144.93_bin4_gray.tif vs occupancy4.tif
✓ Computed 13 metrics
Processing pair 8: occupancy_z260.87_bin8_gray.tif vs occupancy8.tif
✓ Computed 13 metrics
Processing pair 9: occupancy_z318.84_bin9_gray.tif vs occupancy9.tif
✓ Computed 13 metrics
Processing pair 15: occupancy_z492.75_bin15_gray.tif vs occupancy15.tif
✓ Computed 13 metrics
Processing pair 16: occupancy_z521.74_bin16_gray.tif vs occupancy16.tif
✓ Computed 13 metrics
Processing pair 17: occupancy_z550.72_bin17_gray.tif vs occupancy17.tif
✓ Computed 13 metrics
Processing pair 21: occupancy_z666.67_bin21_gray.tif vs occupancy21.tif
✓ Computed 13 metrics
Processing pair 22: occupancy_z695.65_bin22_gray.tif vs occupancy22.tif
✓ Computed 13 metrics
Processing pair 24: occupancy_z753.62_bin24_gray.tif vs occupancy24.tif
✓ Computed 13 metrics
Processing pair 26: occupancy_z811.59_bin26_gray.tif vs occupancy26.tif
✓ Computed 13 metrics
Processing pair 31: occupancy_z956.52_bin31_gray.tif vs occupancy31.tif
✓ Computed 13 metrics
Processing pair 32: occupancy_z985.51_bin32_gray.tif vs occupancy32.tif
✓ Computed 13 metrics
✓ Results saved to: /Users/shuilan/Documents/GitHub/UniST/evaluation_results.csv
Summary Statistics:
        Dice     IoU  HD_voxel  HD95_voxel  ASD_voxel  ChamferDistance_voxel     FPR     FNR  BoundaryIoU       HD    HD95     ASD  ChamferDistance
mean  0.7501  0.7144   22.0124      1.1132     0.3955                 0.7910  0.0293  0.2605       0.8271  22.0124  1.1132  0.3987           0.7975
std   0.3663  0.4182   44.9466      1.6347     0.5811                 1.1622  0.0455  0.3819       0.2544  44.9466  1.6347  0.5858           1.1715
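The pairing in the log above depends on `true_pattern` and `pred_pattern` each capturing a slice number in group 1. The sketch below shows how such patterns could pair files by that number using only the standard library. How `evaluate_slices` matches files internally is an assumption; `index_by_id` is a hypothetical helper, and `occupancy5.tif` is a made-up file name included only to show an unmatched prediction.

```python
import re

true_pattern = re.compile(r'_bin(\d+)_gray\.tif$')
pred_pattern = re.compile(r'^occupancy(\d+)\.tif$')

def index_by_id(filenames, pattern):
    """Map the integer captured by group 1 to its file name; skip non-matches."""
    indexed = {}
    for name in filenames:
        match = pattern.search(name)
        if match:
            indexed[int(match.group(1))] = name
    return indexed

# Ground-truth names taken from the log above.
true_files = [
    "occupancy_z57.97_bin1_gray.tif",
    "occupancy_z86.96_bin2_gray.tif",
    "occupancy_z260.87_bin8_gray.tif",
]
# "occupancy5.tif" is hypothetical and has no ground-truth counterpart here.
pred_files = ["occupancy1.tif", "occupancy2.tif", "occupancy5.tif", "occupancy8.tif"]

true_by_id = index_by_id(true_files, true_pattern)
pred_by_id = index_by_id(pred_files, pred_pattern)

# Keep only slice numbers present on both sides, in ascending order.
pairs = [
    (true_by_id[i], pred_by_id[i])
    for i in sorted(true_by_id.keys() & pred_by_id.keys())
]
for true_name, pred_name in pairs:
    print(f"{true_name} vs {pred_name}")
```

Under this scheme, slices without a counterpart are skipped. That would explain why the pair numbers in the log are not consecutive.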
3D
This step is computationally intensive; we recommend running it on a machine with sufficient memory.
Here we test on a subset of the volume that contains 4 slices.
from unist.metrics import evaluate_volume_from_dirs
results = evaluate_volume_from_dirs(
    true_dir="/Users/shuilan/Documents/GitHub/UniST/tumor_occupancy_tifs_gray_sub",
    pred_dir="/Users/shuilan/Documents/GitHub/UniST/tumor_occupancy_tifs_rgb_10slices_avg_thresh_sub",
    true_pattern=r'_bin(\d+)_gray\.tif$',
    pred_pattern=r'^occupancy(\d+)\.tif$',
    compute_boundary=True,
    compute_voxel=True,
    verbose=True,
)
print(f"3D Dice: {results['Dice']:.4f}")
print(f"3D HD95: {results['HD95']:.4f}")
Stacked 4 slices into volume shape (4, 266, 200)
3D Dice: 0.6160
3D HD95: 1.7321
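The log line above reports a stacked volume of shape (4, 266, 200). As a rough illustration of what stacking means here (not UniST's code), the sketch below builds four synthetic 266 x 200 slices and stacks them along a new leading z axis with NumPy. The synthetic data and the `> 0` binarization are assumptions made for this example.

```python
import numpy as np

# Four synthetic 8-bit slices standing in for the .tif images.
slices = [np.zeros((266, 200), dtype=np.uint8) for _ in range(4)]
for z, image in enumerate(slices):
    # Draw a 50x50 square that drifts one pixel per slice.
    image[100:150, 50 + z:100 + z] = 255

# Stack along a new first axis (z) and binarize.
volume = np.stack(slices, axis=0) > 0
print(volume.shape, volume.dtype)
```

Stacking keeps every slice in memory at once, and 3D distance metrics operate on all boundary voxels of the volume together. This is one reason the 3D evaluation is heavier than the slice-by-slice one.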
results
{'Dice': 0.6160497682219381,
'IoU': 0.44513867196205564,
'HD_voxel': 27.910571473905726,
'HD95_voxel': 1.7320508075688772,
'ASD_voxel': 0.46173223021893384,
'ChamferDistance_voxel': 0.9234644604378677,
'FPR': 0.029736855982703158,
'FNR': 0.44652567975830815,
'BoundaryIoU': 0.44484987489574646,
'HD': 27.910571473905726,
'HD95': 1.7320508075688772,
'ASD': 0.4620237222467427,
'ChamferDistance': 0.9240474444934854}
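The reported values can be sanity-checked against a few simple relations. Under the standard definitions, Dice and IoU are linked by Dice = 2 * IoU / (1 + IoU). The other two checks are observations about these particular numbers rather than documented behaviour: ChamferDistance equals twice ASD, and HD95 equals the square root of 3. The latter is the diagonal distance between neighbouring voxels if unit isotropic spacing is assumed. The values in the sketch are copied from the output above.

```python
import math

# Values copied from the 3D output above.
reported = {
    "Dice": 0.6160497682219381,
    "IoU": 0.44513867196205564,
    "HD95": 1.7320508075688772,
    "ASD": 0.4620237222467427,
    "ChamferDistance": 0.9240474444934854,
}

# Dice and IoU are two views of the same overlap counts.
dice_from_iou = 2 * reported["IoU"] / (1 + reported["IoU"])
dice_ok = math.isclose(dice_from_iou, reported["Dice"], rel_tol=1e-6)

# Observed relation in this output: ChamferDistance is twice ASD.
chamfer_ok = math.isclose(reported["ChamferDistance"], 2 * reported["ASD"], rel_tol=1e-6)

# Observed relation in this output: HD95 is sqrt(3) voxel units.
hd95_ok = math.isclose(reported["HD95"], math.sqrt(3), rel_tol=1e-6)

print(dice_ok, chamfer_ok, hd95_ok)
```

All three checks hold for the numbers shown, so the large gap between HD (about 27.9) and HD95 (about 1.7) most likely comes from a small number of outlying boundary voxels rather than a systematic offset.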