Authors: Xiang Fang, Tao Cui, Jianqing Qiu, Hong Duan, Wenli Zhang (firstname.lastname@example.org)
Affiliation: Sichuan University, West China Hospital, Chengdu, China
Dear Editor in Chief,
Basha and colleagues [1] conducted a prospective study of 143 patients to highlight the diagnostic value of ultrasonography in the assessment of patients with anterior knee pain (AKP). Although high diagnostic accuracy was reported, we would caution against the widespread use of ultrasonography as an alternative to MRI for the following two reasons.
First, the gold standard (MRI), as stated by the authors, was performed within 1 week after ultrasonography, which makes the two tests less comparable and raises a major dilemma about whether those patients should receive any treatment during that week. If these ultrasonography-confirmed AKP patients were left untreated, an ethical issue would arise. However, any therapeutic intervention could affect the diagnostic validity of the gold standard and, in turn, that of ultrasonography. Moreover, the 7-day timeframe itself might bias the operating characteristics (false positive fraction, true positive fraction) of the tests [2]. Hence, we suggest that the authors describe in detail how they made the tests comparable and how they handled this dilemma.
Second, the authors used Cohen's kappa statistic to determine the agreement between ultrasonography and MRI. However, this statistic is not always appropriate for assessing agreement between qualitative variables. The kappa value depends heavily on the prevalence in the concordant cells. This means that two tables can both show 80% overall agreement in the concordant cells and yet produce different kappa values, because kappa is affected by differences in prevalence [3, 4]. For example, concordant cells of 40% and 40% versus 70% and 10% do not have equivalent values of kappa. In such an instance, Cohen's kappa values will be 0.61 (good consensus) and 0.30 (fair consensus), respectively. Moreover, kappa also depends on the number of categories. When a variable has more than two categories, or an ordinal scale with three or more ordered categories is applied, a weighted kappa becomes the better option [3, 4]. Finally, unequal marginal distributions in the two raters' responses constitute another critical flaw. Hence, using Cohen's kappa raises substantial conceptual issues and can result in an unreliable evaluation of agreement. Readers should interpret the ultrasonography-MRI agreement results cautiously.
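The prevalence dependence described above can be illustrated with a minimal sketch. It assumes a 2x2 table given as proportions and, because the discordant cells are not specified in the example, it assumes the remaining 20% is split equally between them (10% and 10%). The function name `cohen_kappa` and the cell labels are hypothetical, chosen only for this illustration. Under this assumed split the two tables give kappa values of about 0.60 and 0.375; these are not identical to the figures quoted above, which presumably rest on a different split of the discordant cells, but they show the same pattern of equal observed agreement with unequal kappa.

```python
def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 agreement table.

    a: both tests positive, d: both tests negative (concordant cells)
    b: test 1 positive / test 2 negative, c: the reverse (discordant cells)
    Inputs may be counts or proportions.
    """
    total = a + b + c + d
    p_observed = (a + d) / total
    # Chance agreement from the marginal distributions of each test.
    row_pos, row_neg = (a + b) / total, (c + d) / total
    col_pos, col_neg = (a + c) / total, (b + d) / total
    p_expected = row_pos * col_pos + row_neg * col_neg
    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    # Assumption: the 20% disagreement is split equally (10% / 10%).
    balanced = cohen_kappa(0.40, 0.10, 0.10, 0.40)  # concordant 40% and 40%
    skewed = cohen_kappa(0.70, 0.10, 0.10, 0.10)    # concordant 70% and 10%
    print(f"balanced prevalence: kappa = {balanced:.3f}")
    print(f"skewed prevalence:   kappa = {skewed:.3f}")
```

Both tables have 80% observed agreement; only the expected chance agreement, which is driven by the marginal prevalence, differs between them.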
1. Basha MAA, Eldib DB, Aly SA et al (2020) Diagnostic accuracy of ultrasonography in the assessment of anterior knee pain. Insights Imaging 11(1):107
2. Pepe MS (2003) The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford University Press, New York
3. Sabour S (2020) Agreement between questionnaires and registry data on routes to diagnosis and milestone dates of the cancer diagnostic pathway; Methodological issues. Cancer Epidemiol 67:101741
4. Szklo M, Nieto FJ (2014) Epidemiology Beyond the Basics, 3rd edn. Jones and Bartlett Publisher, New York