Statistics Question 148: Cohen's Kappa Coefficient

Question

Chest radiographs are the best method for diagnosing pneumonia but are often not available in developing countries. Therefore, in 1990 the World Health Organization developed guidelines for diagnosis of non-severe pneumonia that comprised clinical symptoms of fast breathing alone. However, fast breathing can have causes other than pneumonia, and thus children who are given a diagnosis of non-severe pneumonia on the basis of fast breathing alone may receive antibiotics unnecessarily.

Children aged 2 to 59 months with non-severe pneumonia diagnosed on the basis of the WHO guidelines were invited to participate from outpatient departments of six hospitals in Pakistan. In total 2000 children were enrolled, for whom 1848 chest radiographs were available for assessment. Two consultant radiologists used standardised criteria to evaluate the chest radiographs, with no clinical information available to them. The primary outcome was diagnosis of pneumonia (absent or present) from chest radiographs.

Cohen’s coefficient κ for agreement between the two radiologists in their diagnoses was 0.46. A small number of children were given a diagnosis of bronchiolitis. The researchers concluded that most children with non-severe pneumonia diagnosed on the basis of the current WHO definition had normal chest radiographs.

Which of the following statements, if any, are true?

·a) Cohen’s κ was calculated as the proportion of overall agreement between radiologists in their diagnoses.

·b) If no agreement existed between the radiologists, κ would equal zero.

·c) The agreement between the radiologists in their diagnoses can be interpreted as very good.

Hint: only one of the statements is true.

Answer

Statement b is true, while a and c are false.

Cohen’s coefficient κ is a measure of agreement between the two radiologists in their diagnoses made on the basis of the chest radiographs. The coefficient is not calculated as the proportion of radiographs for which there was overall agreement in diagnosis (a is false). Cohen’s κ is a measure of agreement between the radiologists taking into account agreement that would have occurred through random variation—that is, the expected agreement even if the radiologists showed no concordance in their diagnostic criteria. It was derived by comparing the overall observed and expected proportions of agreement between the radiologists. The formula for Cohen’s κ can be found in most statistical texts.
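
For reference, the coefficient is usually written in terms of the observed proportion of agreement (p_o) and the proportion of agreement expected through random variation alone (p_e):

$$\kappa = \frac{p_o - p_e}{1 - p_e}$$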

The cross tabulation of the radiologists’ diagnoses for the 1848 children, made on the basis of their chest radiographs, is shown in the table. The overall percentage of agreement in diagnoses was [(176+1252+52)/1848]×100=80.1%. However, the overall percentage agreement may be misleading as a measure of agreement, because some agreement would have been expected through random variation alone, that is, even if the two radiologists showed no concordance in how they reached a diagnosis.

To illustrate this, radiologist B diagnosed pneumonia in 416 of the children (22.5% of the total). If the two radiologists showed no concordance in their diagnoses, we would expect radiologist A to diagnose pneumonia, no pneumonia, or bronchiolitis in these 416 children in the same proportions shown overall by radiologist A. Overall, radiologist A diagnosed pneumonia in 14% of the children, bronchiolitis in 5%, and no pneumonia in 81%. Therefore, if no concordance existed between the radiologists, then simply at random we would expect radiologist A to diagnose pneumonia in 14% of the 416 children diagnosed as having pneumonia by radiologist B, that is, 58.24 children. Expected frequencies rarely take integer values but will always sum to the row and column marginal totals. Similarly, of the 1366 children whom radiologist B diagnosed as being without pneumonia, radiologist A would be expected to diagnose 0.81×1366=1106.46 without pneumonia simply at random. Finally, of the 66 children in whom radiologist B diagnosed bronchiolitis, radiologist A would be expected to diagnose it in 0.05×66=3.3 children. The two radiologists would therefore have been expected to agree in their diagnoses for 58.24+1106.46+3.3=1168 children in total, even if they showed no concordance in their criteria for diagnosis. Hence the overall expected percentage agreement that would have occurred randomly was (1168/1848)×100=63.2%.

Cohen’s κ is the observed agreement in excess of that expected by chance, expressed as a proportion of the maximum possible agreement in excess of chance: κ=(0.801−0.632)/(1−0.632)=0.46, the value reported by the researchers.

Radiologists’ diagnoses made on the basis of chest radiographs of children who had non-severe pneumonia diagnosed according to WHO guidelines. Values are frequencies (percentages).

[Table image not reproduced here: cross tabulation of the two radiologists’ diagnoses (pneumonia, no pneumonia, bronchiolitis).]

If perfect agreement existed between the radiologists then Cohen’s κ would have equalled 1, and if there was no agreement beyond that expected through random variation then κ would equal 0 (b is true). Cohen’s κ can also take negative values, which would occur if the observed agreement was worse than that expected through random variation, for example if the radiologists tended to disagree systematically in their diagnoses.

It is difficult to specify what value of κ constitutes good agreement; this depends in part on the context of the assessment and the clinical importance of agreement. It has been suggested that agreement is poor if κ is less than 0.20, fair if it is between 0.21 and 0.40, moderate if it is between 0.41 and 0.60, good if it is between 0.61 and 0.80, and very good if it is between 0.81 and 1.00. On that scale the κ of 0.46 reported here represents only moderate agreement (c is false). It is also possible to calculate a 95% confidence interval for the population value of Cohen’s κ to provide a measure of the precision of the sample estimate, although the researchers above did not do so.
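
For illustration only, one commonly quoted large-sample approximation for the standard error of κ is se(κ) ≈ √[p_o(1−p_o)/(n(1−p_e)²)]; the sketch below applies it to the figures derived above. It is not stated which method, if any, the researchers would have used, so the resulting interval is purely illustrative:

```python
import math

# Illustrative only: approximate 95% CI for kappa using a common large-sample
# standard-error approximation, se(kappa) ~ sqrt(p_o*(1 - p_o) / (n*(1 - p_e)**2)).
# The figures are those derived in the worked example above.
n, p_o, p_e = 1848, 0.801, 0.632
kappa = (p_o - p_e) / (1 - p_e)
se = math.sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
lower, upper = kappa - 1.96 * se, kappa + 1.96 * se
print(f"kappa = {kappa:.2f}, approximate 95% CI {lower:.2f} to {upper:.2f}")
```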

Cohen’s κ as described above treats all disagreements between categories as equally serious, and was originally developed to measure agreement between two raters on a nominal classification. If the classification is on an ordinal scale, so that the categories have a natural order, such as “poor,” “fair,” “good,” and “excellent,” then the unweighted κ coefficient will not provide the best measure of agreement. A weighted Cohen’s κ may be used instead, which gives more weight (credit) to pairs of ratings that disagree but are close on the ordinal scale, and less weight to those that disagree the most. The weights given to disagreements between raters depend on how serious such disagreements are judged to be.
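
As a small illustration (unrelated to the study data), scikit-learn’s cohen_kappa_score can compute both the unweighted and a linearly weighted κ for ordinal ratings; the ratings below are invented purely for demonstration:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical ordinal ratings from two raters (0=poor, 1=fair, 2=good, 3=excellent).
rater_1 = [0, 1, 2, 3, 2, 1, 0, 3, 2, 1]
rater_2 = [0, 2, 2, 3, 1, 1, 1, 3, 3, 0]

unweighted = cohen_kappa_score(rater_1, rater_2)                    # all disagreements weighted equally
weighted = cohen_kappa_score(rater_1, rater_2, weights="linear")    # near-misses penalised less
print(f"unweighted kappa = {unweighted:.2f}, linearly weighted kappa = {weighted:.2f}")
```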

So the answer is statement b.

Learn a little every day, and you will become stronger!
