Statistics Question 176: Log transformation of data

Question

Researchers evaluated the effectiveness of early abdominopelvic computed tomography in patients with acute abdominal pain of unknown cause. A randomised controlled trial study design was used. The intervention was early computed tomography (within 24 hours of admission). The control treatment was standard practice (radiological investigations as indicated). In total, 55 patients were randomised to early computed tomography and 55 to control treatment.

The main outcome measures included length of hospital stay. The distribution of length of hospital stay was positively skewed. The logarithm function was used to transform the observations, and Student’s t test was then used to compare the treatment groups. The length of hospital stay for the standard practice group was on average 1.1 days longer than that in the early computed tomography group (geometric mean 6.4 days (range 1 to 60) versus 5.3 days (range 1 to 31)). The ratio of geometric means (standard treatment versus early computed tomography) was 1.21 (95% confidence interval 0.92 to 1.56).

The overall conclusions of the study were that early abdominopelvic computed tomography for acute abdominal pain may reduce length of hospital stay and mortality. Furthermore, it could also identify unforeseen conditions and potentially serious complications.

Which of the following statements, if any, are true?

·a) The purpose of the logarithm transformation of length of hospital stay was to achieve a normal distribution.

·b) In each treatment group, the geometric mean of length of hospital stay was larger than the arithmetic mean.

·c) The standard practice group spent on average 21% longer in hospital than the early computed tomography group.

·d) The difference between treatments in length of hospital stay was significant at the 5% level.

Hint: more than one statement is correct.

Answer

Statements a and c are true, while b and d are false.

Student’s t test compares the mean of a variable measured on a continuous scale between two independent groups. Described in a previous question, Student’s t test is sometimes referred to as the independent samples t test, the two sample t test, or simply the t test. Student’s t test is a parametric test, and assumptions are made about the data for the test to be applied. Parametric tests have been described in a previous question. It is assumed that the variable to be compared is approximately normally distributed in both groups and that the variances in the two groups are equal.

In the example above, the aim was to establish whether there was a significant difference between the treatment groups in length of hospital stay. The distribution of length of hospital stay was skewed, and therefore Student’s t test could not be used on the original observations. Two options were available. Firstly, a non-parametric test could have been performed that did not make assumptions about the distribution of the data. The Wilcoxon rank sum test or Mann-Whitney U test, described in a previous question, could have been used. Alternatively, the observations of length of hospital stay could have been transformed, and hopefully the transformed data would then meet the assumptions of Student’s t test. The outcome variable, length of hospital stay, was transformed using the logarithm function (referred to simply as “log transformed”), which involved obtaining the logarithm of each observation. It was not indicated whether natural logarithms (to base e, a mathematical constant, ≈2.718) or common logarithms (to base 10) were obtained, but either would have been suitable. After transformation the data were approximately normally distributed (a is true), permitting Student’s t test to be used. Although it was not indicated whether the second assumption of equal variances required for Student’s t test was met after transformation, a logarithmic transformation typically achieves equality of variances between two groups if it was not already present.
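The second option described above can be sketched roughly as follows. This is only an illustration: the lengths of stay are made-up values (the trial's individual observations are not given), the function name `pooled_t_statistic` is hypothetical, and natural logarithms are assumed. The sketch stops at the t statistic and its degrees of freedom; the P value would then come from the t distribution, typically through a statistics package.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for Student's t test with a pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # statistics.variance gives the sample variance (divisor n - 1).
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # The pooled variance reflects the equal-variance assumption of the test.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df


# Hypothetical, right-skewed lengths of stay in days (NOT the trial data).
standard_stays = [2, 3, 4, 5, 5, 6, 8, 12, 20, 45]
early_ct_stays = [1, 2, 3, 4, 5, 5, 6, 7, 10, 25]

# Log transform every observation, then compare the group means of the logs.
log_standard = [math.log(days) for days in standard_stays]
log_early_ct = [math.log(days) for days in early_ct_stays]

t_stat, df = pooled_t_statistic(log_standard, log_early_ct)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

Using base 10 instead of base e would rescale every log value by the same constant, so the t statistic would be unchanged.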

The distribution of length of hospital stay was skewed to the right, and therefore the arithmetic mean would be disproportionally raised by a small number of high values in the right hand tail of the distribution. The median length of hospital stay would be a better measure of central location than the arithmetic mean. However, the length of hospital stay was normally distributed after the log transformation, and in such circumstances the geometric mean is a good measure of central location. The geometric mean for each treatment group was derived by anti-logging (that is, back transforming) the arithmetic mean of the log transformed length of hospital stay for each treatment group. Anti-logging involves raising e or 10 (depending on whether natural logarithms or logarithms to base 10 were used to transform length of hospital stay) to the power of the group mean of the log transformed data. The geometric means are on the same scale and with the same units as the original outcome measure.
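As a small sketch of the back transformation just described, the snippet below derives a geometric mean by anti-logging the arithmetic mean of the log transformed values. The data are invented for illustration and the helper name `geometric_mean_via_logs` is hypothetical. It also checks that the result does not depend on the base of the logarithm.

```python
import math
import statistics


def geometric_mean_via_logs(values, base=math.e):
    """Anti-log of the arithmetic mean of the log transformed values."""
    mean_log = statistics.mean([math.log(v, base) for v in values])
    # Raise the base to the power of the mean log to return to the original scale.
    return base ** mean_log


# Hypothetical, right-skewed lengths of stay in days (NOT the trial data).
stays = [1, 2, 3, 3, 4, 5, 5, 7, 12, 31]

gm_natural = geometric_mean_via_logs(stays, math.e)  # natural logarithms
gm_common = geometric_mean_via_logs(stays, 10)       # logarithms to base 10
arithmetic = statistics.mean(stays)

print(f"geometric mean (base e):  {gm_natural:.2f} days")
print(f"geometric mean (base 10): {gm_common:.2f} days")
print(f"arithmetic mean:          {arithmetic:.2f} days")
```

For these made-up values the two geometric means agree and both are smaller than the arithmetic mean, which is pulled upwards by the few long stays.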

The distribution of length of hospital stay was skewed to the right, and therefore the geometric mean will be larger in value than the median yet smaller than the arithmetic mean (b is false). It was reported that both treatment groups had a median length of stay of five days, while the mean on the original (untransformed) scale was 6.6 (SD=5.8) days in the early computed tomography group and 9.2 (SD=9.8) days in the standard practice group. The geometric mean was 5.3 days in the early computed tomography group and 6.4 days in the standard practice group.

The anti-log of the mean difference between treatment groups for the log transformed data is equivalent to the ratio of the geometric means. Anti-logging the limits of the 95% confidence interval for the mean difference of the log transformed data gives a 95% confidence interval for the ratio of the geometric means. The geometric means were 6.4 days for the standard practice group and 5.3 days for the early computed tomography group. The ratio of these two means (6.4÷5.3) was 1.21, with a corresponding 95% confidence interval of 0.92 to 1.56. This ratio is interpreted in a similar fashion to a relative risk, and therefore the standard practice group stayed in hospital on average 21% longer than the early computed tomography group (c is true). The 95% confidence interval is an interval estimate for the population parameter of the ratio of geometric means, representing the uncertainty of the sample in estimating the population parameter as a result of sampling error. Therefore, with a probability of 0.95 it was estimated that in the population the standard practice group, when compared with the early computed tomography group, may have as much as an 8% shorter stay or a 56% longer stay in hospital. The 95% confidence interval for the ratio of geometric means straddled unity and therefore, as described in a previous question, the difference between treatment groups in length of hospital stay was not significant at the 5% level of significance (d is false). This was reflected by the reported P value for the test of this ratio, which was 0.17.
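The arithmetic in the paragraph above can be checked with a short sketch. It uses only the figures reported in the question (geometric means of 6.4 and 5.3 days, confidence interval 0.92 to 1.56); the helper name `percent_change` is hypothetical, and the log means are reconstructed from the rounded geometric means rather than taken from the trial data.

```python
import math


def percent_change(ratio):
    """Express a ratio of geometric means as a percentage difference."""
    return (ratio - 1.0) * 100.0


# Geometric means reported in the question (days).
gm_standard, gm_early_ct = 6.4, 5.3

# Anti-logging the difference in mean logs gives the ratio of geometric means.
log_difference = math.log(gm_standard) - math.log(gm_early_ct)
ratio = math.exp(log_difference)

# Reported 95% confidence interval for the ratio of geometric means.
ci_lower, ci_upper = 0.92, 1.56

# An interval that contains 1 (no difference) is not significant at the 5% level.
includes_unity = ci_lower <= 1.0 <= ci_upper

print(f"ratio of geometric means: {ratio:.2f}")
print(f"lower limit: {percent_change(ci_lower):+.0f}%")
print(f"upper limit: {percent_change(ci_upper):+.0f}%")
print(f"interval includes 1: {includes_unity}")
```

The lower limit of 0.92 corresponds to an 8% shorter stay and the upper limit of 1.56 to a 56% longer stay, matching the interpretation given above.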

In medicine many variables have a distribution that is skewed to the right, and the logarithm transformation is typically used to achieve a normal distribution in the data. Such a data transformation is important in statistical analysis; although it may appear as a way of manipulating data to get the desired result, a logarithm scale is simply an alternative means of representing data originally measured on a linear scale. It has advantages if the distribution of a variable is normal after transformation, since it permits parametric statistical tests to be performed rather than non-parametric ones. Parametric statistical tests allow the estimation of treatment effects using confidence intervals in addition to hypothesis testing, whereas non-parametric tests are generally limited to hypothesis testing alone. Sometimes it is possible to obtain estimates of treatment effects when performing non-parametric tests, but generally it is not straightforward.

The logarithm transformation is one of several transformations that may be applied in statistical analysis. Generally, a data transformation will be applied so that the data satisfy the assumptions of a statistical test or procedure that is to be applied. The choice of transformation typically depends on the type of variable, scale of measurement, or shape of the distribution of the variable.

So the answer is a and c.

Learn a little every day and you will grow stronger!

