Recently, the Smart Lab team at the Hong Kong University of Science and Technology, in collaboration with The Third Affiliated Hospital of Southern Medical University, proposed a deep learning method that incorporates Co-Plane Attention across MRI Sequences (CoPAS) to classify knee abnormalities. The paper was published in Nature Communications. The following is a summary of the paper.
Various knee abnormalities can arise from aging or injury, resulting in knee pain and dysfunction. The accurate diagnosis of knee abnormalities is crucial for developing personalized treatment plans and improving the overall quality of life for patients.
The researchers collected the largest multi-sequence knee magnetic resonance imaging (MRI) dataset with the most comprehensive range of abnormalities, comprising 1748 subjects and 12 types of abnormalities. The model outperforms junior radiologists and remains competitive with senior radiologists. With the assistance of the model's output, the diagnostic accuracy of all radiologists improved significantly. The interpretability analysis demonstrated that the model's decision-making process is consistent with clinical knowledge, enhancing its credibility and reliability in clinical practice.
Figure 1. Overview of this study. A multi-center knee MRI dataset was collected from five clinical centers, encompassing 1748 patients in total. Twelve knee abnormalities are covered in this study, with patient-level annotations obtained by combining MRI and arthroscopy results.
The data in this study were collected from five clinical centers in China: The Third Affiliated Hospital of Southern Medical University (denoted as the internal dataset), The Seventh Affiliated Hospital of Southern Medical University, Zhujiang Hospital of Southern Medical University, Foshan Hospital of Traditional Chinese Medicine, and The Fifth Affiliated Hospital of Sun Yat-sen University. Twelve types of knee abnormalities were included: meniscal tear (MENI); anterior cruciate ligament tear (ACL); cartilage damage (CART); posterior cruciate ligament injury (PCL); medial collateral ligament injury (MCL); lateral collateral ligament injury (LCL); joint effusion (EFFU); bone contusion (CONT); synovial plica (PLICA); cyst (CYST); infrapatellar fat pad injury (IFP); and patellar retinaculum injury (PR). Patient-level annotations of the abnormalities were provided by experienced doctors using both arthroscopy and MRI. Six human radiologists, including junior and senior professionals, were involved in simulating and verifying the effectiveness of the model's assistance in a clinical environment: they first provided their assessments independently and then re-evaluated the cases alongside the model's results.
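For readers who want to prototype a similar setup, the patient-level annotations naturally map to a multi-hot label vector over the 12 abnormality classes. The following Python sketch is purely illustrative; the class order and data format are assumptions, not the dataset's actual release format.

```python
import numpy as np

# The 12 abnormality classes listed above, in a fixed (assumed) order.
ABNORMALITIES = [
    "MENI", "ACL", "CART", "PCL", "MCL", "LCL",
    "EFFU", "CONT", "PLICA", "CYST", "IFP", "PR",
]

def encode_labels(findings):
    """Encode a patient's confirmed abnormalities as a multi-hot vector."""
    label = np.zeros(len(ABNORMALITIES), dtype=np.float32)
    for name in findings:
        label[ABNORMALITIES.index(name)] = 1.0
    return label

# Example: a patient with a meniscal tear and joint effusion.
print(encode_labels({"MENI", "EFFU"}))  # 1.0 at the MENI and EFFU positions
```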
CoPAS outperforms other existing deep learning models, achieving average AUC-ROC values of 0.812, 0.721, and 0.726 in the internal and external dataset evaluations across different sequence settings, demonstrating its robustness to multi-center variations. When compared with junior and senior radiologists on the internal data, CoPAS surpasses the junior radiologists on all abnormalities. Although its average accuracy is slightly lower than that of the senior radiologists (ACC 0.78 vs. 0.80), it still outperforms them on 5 of the 12 abnormalities (MENI, ACL, CART, PCL, and MCL). Notably, the model achieves higher accuracy on tasks involving complex spatial morphology, such as the cruciate ligaments, indicating that it handles spatial relations between slices and sequences better than the radiologists.
Figure 2. The comparison between CoPAS and radiologists in the internal dataset. CoPAS surpasses the junior radiologist on all abnormalities and outperforms the senior radiologist in nearly half of the classes.
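For readers curious how such per-abnormality AUC-ROC numbers are typically obtained for a multi-label classifier, the sketch below uses scikit-learn on placeholder data; it is not the paper's evaluation code, and the arrays are randomly generated stand-ins.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

ABNORMALITIES = ["MENI", "ACL", "CART", "PCL", "MCL", "LCL",
                 "EFFU", "CONT", "PLICA", "CYST", "IFP", "PR"]

# Placeholder data: y_true is (n_patients, 12) multi-hot ground truth,
# y_score is (n_patients, 12) predicted probabilities from a model.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(200, 12))
y_score = np.clip(y_true * 0.6 + rng.random((200, 12)) * 0.4, 0.0, 1.0)

# One AUC-ROC per abnormality, then the macro average quoted in the text.
per_class_auc = {name: roc_auc_score(y_true[:, i], y_score[:, i])
                 for i, name in enumerate(ABNORMALITIES)}
print(per_class_auc)
print("average AUC-ROC:", np.mean(list(per_class_auc.values())))
```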
To evaluate the model’s ability to assist human radiologists in diagnosis, the radiologists’ performance was compared with and without the model’s classification output on the external data. The evaluation involved six radiologists: four junior radiologists (with 2, 2, 5, and 7 years of experience) and two senior radiologists (with 10 and 30 years of experience). CoPAS outperformed the junior radiologists in most instances. Most importantly, the radiologists’ accuracy improved significantly when aided by the model, especially for MENI, CART, PLICA, CYST, and IFP. These results underline the model’s effectiveness in providing substantial assistance in diagnosing complex knee abnormalities in clinical settings.
Figure 3. ROC curve of the model’s prediction for each abnormality, together with the (FPR, TPR) points of the radiologists. Each radiologist is denoted by two points, with and without a cross, indicating their results before and after AI assistance. CoPAS outperforms the junior radiologists and brings significant improvements when assisting all radiologists.
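The significance of the with- versus without-assistance comparison can, for example, be assessed with a McNemar test on paired per-case decisions. The sketch below is a generic illustration on placeholder data; the paper's actual statistical procedure may differ.

```python
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

# Placeholder per-case correctness (1 = correct, 0 = wrong) for one radiologist
# on one abnormality, before and after seeing the model's output.
rng = np.random.default_rng(1)
correct_before = rng.integers(0, 2, size=150)
correct_after = np.maximum(correct_before, rng.integers(0, 2, size=150))

# 2x2 contingency table of paired decisions (rows: before, columns: after).
table = np.array([
    [np.sum((correct_before == 1) & (correct_after == 1)),
     np.sum((correct_before == 1) & (correct_after == 0))],
    [np.sum((correct_before == 0) & (correct_after == 1)),
     np.sum((correct_before == 0) & (correct_after == 0))],
])

result = mcnemar(table, exact=True)  # exact binomial test on discordant pairs
print("accuracy without AI:", correct_before.mean())
print("accuracy with AI:   ", correct_after.mean())
print("McNemar p-value:    ", result.pvalue)
```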
Images from different planes offer different perspectives, and each knee abnormality has an optimal diagnostic MRI plane. This is highlighted by comparing an empirical table provided by radiologists with the certainty level and accuracy of the model’s output on each MRI plane. Both qualitative and quantitative comparisons show a high correlation between the model’s results and clinical preference, as the prediction certainty of each plane matches the probability of observation in the clinic. The model exhibits plane-related preferences similar to those of human radiologists: it has derived a set of corresponding rules that maximize the prediction probability during decision-making, allowing it to produce more reliable results in clinical practice.
Figure 4. Correlations derived from (a) clinical observation probability, (b) model prediction certainty, and (c) model prediction accuracy. Thicker lines indicate stronger relations between abnormality and plane. The similar patterns in the black boxes suggest that the model has learned a diagnostic preference that aligns with clinical observations.
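The full CoPAS architecture is described in the paper. Purely to illustrate the general idea of attending across MRI planes and reading off a plane preference, the PyTorch sketch below fuses per-plane feature vectors with a standard attention layer whose weights can be inspected; the module names, dimensions, and use of nn.MultiheadAttention are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CrossPlaneFusion(nn.Module):
    """Illustrative fusion of sagittal/coronal/axial features via attention."""

    def __init__(self, dim=256, num_classes=12):
        super().__init__()
        self.query = nn.Parameter(torch.randn(1, 1, dim))   # learned query token
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, plane_feats):
        # plane_feats: (batch, 3, dim), one feature vector per MRI plane.
        q = self.query.expand(plane_feats.size(0), -1, -1)
        fused, weights = self.attn(q, plane_feats, plane_feats)
        logits = self.head(fused.squeeze(1))   # (batch, num_classes)
        return logits, weights.squeeze(1)      # (batch, 3) attention over planes

model = CrossPlaneFusion()
feats = torch.randn(2, 3, 256)  # placeholder per-plane features for 2 patients
logits, plane_weights = model(feats)
print(plane_weights)  # rough proxy for the model's per-plane preference
```

Averaged over a validation set and grouped by abnormality, such plane weights would yield a plane-preference matrix in the spirit of panels (b) and (c) of Figure 4, although the paper's own interpretability analysis is based on the certainty and accuracy of its actual model outputs.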
The deep learning model CoPAS is competitive with senior radiologists and can improve the accuracy of radiologists’ diagnoses. The findings, which show consistency between clinical observations and the model’s predictions, not only demonstrate its interpretability but also highlight its potential to identify and validate new clinical insights.
For more details, interested readers can refer to the published paper.