Comprehensive Summary
This study assesses the ability of Vision Transformers (ViTs) to diagnose knee osteoarthritis (OA). A total of 750 MRI images of various OA grades (Grades 1, 2, 3, and 4) were collected and analyzed by a ViT, followed by data analysis to evaluate its performance in OA grading. The metrics used were sensitivity, accuracy, specificity, precision, and F1 score, which is the harmonic mean of precision and recall (the ratio of correctly predicted positive cases to all actual positive cases). Overall, the model performed well, although accuracy varied across grades: all groups except Grades 0 and 2 achieved over 90% accuracy. With the increasing availability of MRI, this study presents a highly accurate model relative to previously developed X-ray-based models. Despite this accuracy, however, ViTs have limitations in computational and data requirements, since large datasets and substantial compute are needed to train and run a novel model.
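To make the metrics named above concrete, the following minimal sketch shows how they could be derived for a single grade from a multi-class confusion matrix, treating that grade as the positive class and all other grades as negative. The function name `per_class_metrics` and the example matrix are hypothetical and chosen only for illustration; the counts are not data from the study, and the study's own evaluation code may differ.

```python
def per_class_metrics(confusion, cls):
    """One-vs-rest metrics for class index `cls`.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j (square matrix, list of lists).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)

    tp = confusion[cls][cls]                             # true positives
    fn = sum(confusion[cls]) - tp                        # missed cases of this class
    fp = sum(confusion[i][cls] for i in range(n)) - tp   # other classes predicted as cls
    tn = total - tp - fn - fp                            # everything else

    def ratio(num, den):
        # Guard against empty denominators (e.g. a class never predicted).
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # also called recall
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    accuracy = ratio(tp + tn, total)
    # F1 is the harmonic mean of precision and recall.
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical 3-class confusion matrix (illustrative counts only).
example = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]

for grade in range(len(example)):
    print(grade, per_class_metrics(example, grade))
```

Because per-grade accuracy in this one-vs-rest sense counts true negatives, it can look high even when sensitivity for that grade is modest, which is one reason such studies tend to report several metrics side by side.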
Outcomes and Implications
The severity of osteoarthritis is often difficult to grade, and previous X-ray-based models have fallen short of high accuracy in pursuit of shorter diagnostic times. This study presents a highly capable and efficient model using MRI scans, which, as MRI becomes more available and affordable in clinical settings, could mean greater accuracy and shorter diagnostic times in the clinic. However, limitations inherent to ViT development, such as large dataset requirements and computational cost, still pose challenges for implementation. Further research is needed to optimize and implement a ViT model, but it remains promising for practical applications on the strength of its grading performance.