Comprehensive Summary
Naznine et al. tested observer agreement in the interpretation of DMSA scintigraphy scans for renal scarring in pediatric patients. Technetium-99m dimercaptosuccinic acid (DMSA) scans from 220 patients were included, selected on the basis of image clarity from 693 individuals under 18 years of age. Four experienced radiologists analyzed the images so that agreement could be calculated both between radiologists (inter-observer) and between each radiologist's two readings of the same images, made 3-4 weeks apart (intra-observer). Observers were unaware of the patients' medical information, of the other observers' interpretations, and of the fact that they were shown the same images twice during the 3-4 week period. For each scan, observers recorded an overall impression (scar or no scar), the location of any kidney defect (not affected, upper pole, mid-zone, lower pole, multiple zones), and the percentage of renal parenchymal tissue involved (0%, <10%, 10-24%, 25-49%, 50-74%, and >74%). The study noted that observers used the Make Sense AI platform to visually annotate the location and severity of the kidney defects they identified.
For intra-observer agreement, Cohen's kappa scores (where 0 indicates chance-level agreement and 1.00 indicates perfect agreement) ranged from 0.704 to 0.95 for the observers' overall impressions (scar, no scar), while Kendall's tau-b values (scale of -1 to +1) ranged from 0.43 to 0.91 for percentage of kidney involvement. For inter-observer agreement, comparing all collected data between radiologists showed low kappa scores (0.0801-0.22) and low Kendall's tau-b values (0.12-0.303) for left kidney interpretations. Observers 1 and 4 had substantial agreement with each other (kappa scores of 0.64-0.82 and Kendall's tau-b values of 0.57-0.71), whereas observer 2 showed the lowest agreement of both types, that is, both with their own repeat readings and with the other observers.
Overall, intra-observer agreement was substantial, while inter-observer agreement varied widely between observer pairs across all three categories, demonstrating significant variability in DMSA scintigraphy interpretation. Naznine et al. noted limitations including the small number of radiologists and the use of single-photon emission computed tomography (SPECT) without 3D resolution.
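To make the two agreement statistics above concrete, the following is a minimal sketch of how Cohen's kappa (for the scar / no scar impression) and Kendall's tau-b (for the ordinal involvement grade) can be computed. The function names and all ratings are hypothetical and invented for illustration; they are not data from the study, and the summary does not state which software the authors used.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cohen_kappa(a, b):
    """Chance-corrected agreement between two sets of categorical labels."""
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Agreement expected if each reader labelled independently at their own base rates.
    p_expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    # Undefined (division by zero) if both readers use a single identical label throughout.
    return (p_observed - p_expected) / (1 - p_expected)


def kendall_tau_b(x, y):
    """Rank correlation for ordinal grades, adjusted for tied pairs."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: contributes to neither term
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / sqrt(
        (untied + tied_x_only) * (untied + tied_y_only)
    )


# Hypothetical overall impressions from two readers on the same ten scans.
reader_1 = ["scar", "scar", "none", "none", "scar", "none", "none", "scar", "none", "none"]
reader_2 = ["scar", "none", "none", "none", "scar", "none", "scar", "scar", "none", "none"]

# Hypothetical involvement grades coded 0 (0%) to 5 (>74%) from one reader's two sessions.
grades_first = [0, 1, 1, 3, 2, 5, 0, 4]
grades_second = [0, 1, 2, 3, 2, 4, 0, 4]

print(round(cohen_kappa(reader_1, reader_2), 3))
print(round(kendall_tau_b(grades_first, grades_second), 3))
```

In this made-up example the two readers agree on 8 of 10 scans, yet kappa works out to about 0.58 because roughly half of that agreement would be expected by chance alone. This is why raw percent agreement is usually not reported on its own in observer studies.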
Outcomes and Implications
Urinary tract infections (UTIs) in younger children are challenging to detect because their symptoms are nonspecific, and they consequently often go unrecognized by medical professionals. Children with untreated UTIs have an increased risk of renal scarring and other complications. DMSA scans are vital for stratifying these risks and planning appropriate treatment. Standardized interpretation of DMSA scans is crucial for this stratification; however, the observer variability in this study demonstrates a lack of standardization. More uniform training of interpreting radiologists is therefore needed for DMSA scans to be a reliable diagnostic tool. In the future, machine learning may be able to improve the efficiency, accuracy, objectivity, and consistency of DMSA image evaluation.