Forecast verification methods - Part 2

遥远的星辰

Methods

http://www.cawcr.gov.au/projects/verification/?spm=5176.11409106.555.15.4a101e8b5HOXT5#Methods_for_spatial_forecasts
This article is a translation of the web page linked above, made for personal study and research. Corrections are welcome.

Standard verification methods

“Eyeball” verification
One of the oldest and best verification methods is the good old-fashioned visual, or “eyeball”, method: look at the forecast and observations side by side and use human judgment to discern the forecast errors. Common ways to present data are as time series and maps.
The eyeball method is great if you only have a few forecasts, or you have lots of time, or you’re not interested in quantitative verification statistics. Even when you do want statistics, it is a very good idea to look at the data from time to time!
However, the eyeball method is not quantitative, and it is very prone to individual, subjective biases of interpretation. Therefore it must be used with caution in any formal verification procedure.
The following sections give fairly brief descriptions of the standard verification methods and scores for dichotomous, multi-category, continuous, and probabilistic forecasts. For greater detail and discussion of the standard methods see Stanski et al. (1989) or one of the excellent books on forecast verification and statistics.

Methods for dichotomous (yes/no) forecasts

A dichotomous forecast says, “yes, an event will happen”, or “no, the event will not happen”. Rain and fog prediction are common examples of yes/no forecasts. For some applications a threshold may be specified to separate “yes” and “no”, for example, winds greater than 50 knots.
To verify this type of forecast we start with a contingency table that shows the frequency of “yes” and “no” forecasts and occurrences. The four combinations of forecasts (yes or no) and observations (yes or no), called the joint distribution, are:
hit - event forecast to occur, and did occur
miss - event forecast not to occur, but did occur
false alarm - event forecast to occur, but did not occur
correct negative - event forecast not to occur, and did not occur

The total numbers of observed and forecast occurrences and non-occurrences are given on the lower and right sides of the contingency table, and are called the marginal distribution.
The contingency table is a useful way to see what types of errors are being made. A perfect forecast system would produce only hits and correct negatives, and no misses or false alarms.
A large variety of categorical statistics are computed from the elements in the contingency table to describe particular aspects of forecast performance. We will illustrate these statistics using a (made-up) example. Suppose a year’s worth of official daily rain forecasts and observations produced the following contingency table (the counts are those used in the worked examples for each score):

                 Observed yes   Observed no   Total
Forecast yes          82             38        120
Forecast no           23            222        245
Total                105            260        365

Categorical statistics that can be computed from the yes/no contingency table are given below. Sometimes these scores are known by alternate names shown in parentheses.

Accuracy (fraction correct)
Accuracy = (hits + correct negatives) / total
Answers the question: Overall, what fraction of the forecasts were correct?
Range: 0 to 1. Perfect score: 1.
Characteristics: Simple, intuitive. Can be misleading since it is heavily influenced by the most common category, usually “no event” in the case of rare weather.
In the example above, Accuracy = (82+222) / 365 = 0.83, indicating that 83% of all forecasts were correct.
Bias score (frequency bias)
BIAS = (hits + false alarms) / (hits + misses)
Answers the question: How did the forecast frequency of “yes” events compare to the observed frequency of “yes” events?
Range: 0 to ∞. Perfect score: 1.
Characteristics: Measures the ratio of the frequency of forecast events to the frequency of observed events. Indicates whether the forecast system has a tendency to underforecast (BIAS<1) or overforecast (BIAS>1) events. Does not measure how well the forecast corresponds to the observations, only measures relative frequencies.
In the example above, BIAS = (82+38) / (82+23) = 1.14, indicating slight overforecasting of rain frequency.
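As a minimal sketch, accuracy and frequency bias can be computed directly from the four contingency-table counts. The function and argument names here are illustrative choices, not taken from the original page; the counts are those of the rain example.

```python
def accuracy(hits, misses, false_alarms, correct_negatives):
    """Fraction of all forecasts that were correct."""
    total = hits + misses + false_alarms + correct_negatives
    return (hits + correct_negatives) / total


def frequency_bias(hits, misses, false_alarms):
    """Ratio of the number of "yes" forecasts to the number of observed "yes" events."""
    return (hits + false_alarms) / (hits + misses)


# Rain example: 82 hits, 23 misses, 38 false alarms, 222 correct negatives.
print(round(accuracy(82, 23, 38, 222), 2))   # 0.83
print(round(frequency_bias(82, 23, 38), 2))  # 1.14
```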
Probability of detection (hit rate) (also denoted H)
POD = hits / (hits + misses)
Answers the question: What fraction of the observed “yes” events were correctly forecast?
Range: 0 to 1. Perfect score: 1.
Characteristics: Sensitive to hits, but ignores false alarms. Very sensitive to the climatological frequency of the event. Good for rare events. Can be artificially improved by issuing more “yes” forecasts to increase the number of hits. Should be used in conjunction with the false alarm ratio (below). POD is also an important component of the Relative Operating Characteristic (ROC) used widely for probabilistic forecasts.
In the example above, POD = 82 / (82+23) = 0.78, indicating that roughly 3/4 of the observed rain events were correctly predicted.
False alarm ratio
FAR = false alarms / (hits + false alarms)
Answers the question: What fraction of the predicted “yes” events actually did not occur (i.e., were false alarms)?
Range: 0 to 1. Perfect score: 0.
Characteristics: Sensitive to false alarms, but ignores misses. Very sensitive to the climatological frequency of the event. Should be used in conjunction with the probability of detection (above).
In the example above, FAR = 38/(82+38) = 0.32, indicating that in roughly 1/3 of the forecast rain events, rain was not observed.
Probability of false detection (false alarm rate) (also denoted F)
POFD = false alarms / (correct negatives + false alarms)
Answers the question: What fraction of the observed “no” events were incorrectly forecast as “yes”?
Range: 0 to 1. Perfect score: 0.
Characteristics: Sensitive to false alarms, but ignores misses. Can be artificially improved by issuing fewer “yes” forecasts to reduce the number of false alarms. Not often reported for deterministic forecasts, but is an important component of the Relative Operating Characteristic (ROC) used widely for probabilistic forecasts.
In the example above, POFD = 38/(222+38) = 0.15, indicating that for 15% of the observed “no rain” events the forecasts were incorrect.
Success ratio
SR = hits / (hits + false alarms)
Answers the question: What fraction of the forecast “yes” events were correctly observed?
Range: 0 to 1. Perfect score: 1.
Characteristics: Gives information about the likelihood of an observed event, given that it was forecast. It is sensitive to false alarms but ignores misses. SR is equal to 1-FAR. POD is plotted against SR in the categorical performance diagram.
In the example above, SR = 82/(82+38) = 0.68, indicating that for 68% of the forecast rain events, rain was actually observed.
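The four ratio scores above (POD, FAR, POFD and SR) differ only in which cells of the contingency table they use. The following sketch, with illustrative function names, reproduces the worked values of the rain example and checks the identity SR = 1 - FAR.

```python
def pod(hits, misses):
    """Probability of detection: share of observed events that were forecast."""
    return hits / (hits + misses)


def far(hits, false_alarms):
    """False alarm ratio: share of "yes" forecasts that did not verify."""
    return false_alarms / (hits + false_alarms)


def pofd(false_alarms, correct_negatives):
    """Probability of false detection: share of observed non-events forecast as "yes"."""
    return false_alarms / (false_alarms + correct_negatives)


def success_ratio(hits, false_alarms):
    """Share of "yes" forecasts that verified; equals 1 - FAR."""
    return hits / (hits + false_alarms)


hits, misses, false_alarms, correct_negatives = 82, 23, 38, 222
print(round(pod(hits, misses), 2))                      # 0.78
print(round(far(hits, false_alarms), 2))                # 0.32
print(round(pofd(false_alarms, correct_negatives), 2))  # 0.15
print(round(success_ratio(hits, false_alarms), 2))      # 0.68
```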
Threat score (critical success index) (also denoted CSI)
TS = hits / (hits + misses + false alarms)
Answers the question: How well did the forecast “yes” events correspond to the observed “yes” events?
Range: 0 to 1, 0 indicates no skill. Perfect score: 1.
Characteristics: Measures the fraction of observed and/or forecast events that were correctly predicted. It can be thought of as the accuracy when correct negatives have been removed from consideration, that is, TS is only concerned with forecasts that count. Sensitive to hits, penalizes both misses and false alarms. Does not distinguish source of forecast error. Depends on climatological frequency of events (poorer scores for rarer events) since some hits can occur purely due to random chance.
In the example above, TS = 82/(82+23+38) = 0.57, meaning that slightly more than half of the “rain” events (observed and/or predicted) were correctly forecast.
Equitable threat score (Gilbert skill score) (also denoted GSS)
ETS = (hits - hits_random) / (hits + misses + false alarms - hits_random), where hits_random = (hits + misses) x (hits + false alarms) / total
Answers the question: How well did the forecast “yes” events correspond to the observed “yes” events (accounting for hits due to chance)?
Range: -1/3 to 1, 0 indicates no skill. Perfect score: 1.
Characteristics: Measures the fraction of observed and/or forecast events that were correctly predicted, adjusted for hits associated with random chance (for example, it is easier to correctly forecast rain occurrence in a wet climate than in a dry climate). The ETS is often used in the verification of rainfall in NWP models because its “equitability” allows scores to be compared more fairly across different regimes. Sensitive to hits. Because it penalises both misses and false alarms in the same way, it does not distinguish the source of forecast error.
In the example above, hits_random = (105 x 120) / 365 ≈ 34, so ETS = (82-34)/(82+23+38-34) = 0.44. ETS gives a lower score than TS.
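A short sketch of the threat score and its chance-corrected variant may help show where the two differ. It assumes the usual definition of the random-hits term, hits_random = (hits + misses) x (hits + false alarms) / total; the function names are illustrative.

```python
def threat_score(hits, misses, false_alarms):
    """TS / CSI: hits divided by all observed and/or forecast events."""
    return hits / (hits + misses + false_alarms)


def equitable_threat_score(hits, misses, false_alarms, correct_negatives):
    """ETS / GSS: threat score with the hits expected by chance removed."""
    total = hits + misses + false_alarms + correct_negatives
    hits_random = (hits + misses) * (hits + false_alarms) / total
    return (hits - hits_random) / (hits + misses + false_alarms - hits_random)


# Rain example: the unrounded random-hits term is about 34.5.
print(round(threat_score(82, 23, 38), 2))                 # 0.57
print(round(equitable_threat_score(82, 23, 38, 222), 2))  # 0.44
```

Removing the chance hits from both numerator and denominator is what makes ETS lower than TS whenever some hits are expected by chance.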

Hanssen and Kuipers discriminant (true skill statistic, Peirce’s skill score) (also denoted TSS and PSS)
HK = hits / (hits + misses) - false alarms / (false alarms + correct negatives)
Answers the question: How well did the forecast separate the “yes” events from the “no” events?
Range: -1 to 1, 0 indicates no skill. Perfect score: 1.
Characteristics: Uses all elements in contingency table. Does not depend on climatological event frequency. The expression is identical to HK = POD - POFD, but the Hanssen and Kuipers score can also be interpreted as (accuracy for events) + (accuracy for non-events) - 1. For rare events HK is unduly weighted toward the first term (same as POD), so this score may be more useful for more frequent events. Can be expressed in a form similar to the ETS except the hits_random term is unbiased. See Woodcock (1976) for a comparison of HK with other scores.
In the example above, HK = 82 / (82+23) - 38 / (38+222) = 0.63.

Heidke skill score (Cohen’s k)
HSS = [(hits + correct negatives) - expected_correct_random] / [N - expected_correct_random], where expected_correct_random = (1/N) x [(hits + misses) x (hits + false alarms) + (correct negatives + misses) x (correct negatives + false alarms)]
Answers the question: What was the accuracy of the forecast relative to that of random chance?
Range: -1 to 1, 0 indicates no skill. Perfect score: 1.
Characteristics: Measures the fraction of correct forecasts after eliminating those forecasts which would be correct due purely to random chance. This is a form of the generalized skill score, where the score in the numerator is the number of correct forecasts, and the reference forecast in this case is random chance. In meteorology, at least, random chance is usually not the best forecast to compare to - it may be better to use climatology (long-term average value) or persistence (forecast = most recent observation, i.e., no change) or some other standard.
In the example above, HSS = 0.61.
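The two skill scores above can be sketched as follows. The Heidke score is written in its standard 2x2 form, with the number of forecasts expected to be correct by chance computed from the marginal totals; the function names are illustrative.

```python
def hanssen_kuipers(hits, misses, false_alarms, correct_negatives):
    """HK = POD - POFD."""
    return hits / (hits + misses) - false_alarms / (false_alarms + correct_negatives)


def heidke_skill_score(hits, misses, false_alarms, correct_negatives):
    """HSS: correct forecasts beyond chance, relative to the best possible."""
    n = hits + misses + false_alarms + correct_negatives
    expected_correct = (
        (hits + misses) * (hits + false_alarms)
        + (correct_negatives + misses) * (correct_negatives + false_alarms)
    ) / n
    return (hits + correct_negatives - expected_correct) / (n - expected_correct)


print(round(hanssen_kuipers(82, 23, 38, 222), 2))     # 0.63
print(round(heidke_skill_score(82, 23, 38, 222), 2))  # 0.61
```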

Odds ratio
OR = (hits x correct negatives) / (misses x false alarms)
Answers the question: What is the ratio of the odds of a “yes” forecast being correct, to the odds of a “yes” forecast being wrong?
Odds ratio - Range: 0 to ∞, 1 indicates no skill. Perfect score: ∞
Log odds ratio - Range: -∞ to ∞, 0 indicates no skill. Perfect score: ∞
Characteristics: Measures the ratio of the odds of making a hit to the odds of making a false alarm. The logarithm of the odds ratio is often used instead of the original value. Takes prior probabilities into account. Gives better scores for rarer events. Less sensitive to hedging. Do not use if any of the cells in the contingency table are equal to 0. Used widely in medicine but not yet in meteorology - see Stephenson (2000) for more information.
Note that the odds ratio is not the same as the ratio of the probability of making a hit (hits / # forecasts) to the probability of making a false alarm (false alarms / # forecasts), since both of those can depend on the climatological frequency (i.e., the prior probability) of the event.
In the example above, OR = (82 x 222) / (23 x 38) = 20.8, indicating that the odds of a “yes” prediction being correct are over 20 times greater than the odds of a “yes” forecast being incorrect.

Odds ratio skill score (Yule’s Q)
ORSS = [(hits x correct negatives) - (misses x false alarms)] / [(hits x correct negatives) + (misses x false alarms)]
Answers the question: What was the improvement of the forecast over random chance?
Range: -1 to 1, 0 indicates no skill. Perfect score: 1
Characteristics: Independent of the marginal totals (i.e., of the threshold chosen to separate “yes” and “no”), so is difficult to hedge. See Stephenson (2000) for more information.
In the example above, ORSS = [(82 x 222)-(23 x 38)] / [(82 x 222)+(23 x 38)] = 0.91.
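A minimal sketch of the odds ratio, its logarithm and the odds ratio skill score follows. Because the text advises against using these scores when any cell is zero, the illustrative helper below raises an error in that case; that guard is a design choice of this sketch, not something prescribed by the original page.

```python
import math


def odds_ratio(hits, misses, false_alarms, correct_negatives):
    """OR = (hits * correct negatives) / (misses * false alarms)."""
    if 0 in (hits, misses, false_alarms, correct_negatives):
        raise ValueError("odds ratio is not usable when a table cell is 0")
    return (hits * correct_negatives) / (misses * false_alarms)


def odds_ratio_skill_score(hits, misses, false_alarms, correct_negatives):
    """ORSS (Yule's Q) = (OR - 1) / (OR + 1)."""
    a = hits * correct_negatives
    b = misses * false_alarms
    return (a - b) / (a + b)


or_value = odds_ratio(82, 23, 38, 222)
print(round(or_value, 1))                                 # 20.8
print(round(math.log(or_value), 2))                       # log odds ratio
print(round(odds_ratio_skill_score(82, 23, 38, 222), 2))  # 0.91
```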

Methods for multi-category forecasts

Methods for verifying multi-category forecasts also start with a contingency table showing the frequency of forecasts and observations in the various bins. It is analogous to a scatter plot for categories.
Multi-category contingency table (table omitted)
In this table n(Fi,Oj) denotes the number of forecasts in category i that had observations in category j, N(Fi) denotes the total number of forecasts in category i, N(Oj) denotes the total number of observations in category j, and N is the total number of forecasts.
The distributions approach to forecast verification examines the relationship among the elements in the multi-category contingency table. A perfect forecast system would have non-zero elements only along the diagonal, and values of 0 for all entries off the diagonal. The off-diagonal elements give information about the specific nature of the forecast errors. The marginal distributions (N’s at right and bottom of table) show whether the forecast produces the correct distribution of categorical values when compared to the observations. Murphy and Winkler (1987), Murphy et al. (1989) and Brooks and Doswell (1996) develop this approach in detail.
The advantage of the distributions approach is that the nature of the forecast errors can more easily be diagnosed. The disadvantage is that it is more difficult to condense the results into a single number. There are fewer statistics that summarize the performance of multi-category forecasts. However, any multi-category forecast verification can be converted to a series of K-1 yes/no-type verifications by defining “yes” to be “in category i” or “in category i or higher”, and “no” to be “not in category i” or “below category i”.

Histogram - Plot the relative frequencies of forecast and observed categories
Answers the question: How well did the distribution of forecast categories correspond to the distribution of observed categories?
Characteristics: Shows similarity between location, spread, and skewness of forecast and observed distributions. Does not give information on the correspondence between the forecasts and observations. Histograms give information similar to box plots.
Accuracy
Accuracy = (1/N) x sum over i of n(Fi,Oi)
Answers the question: Overall, what fraction of the forecasts were in the correct category?
Range: 0 to 1. Perfect score: 1.
Characteristics: Simple, intuitive. Can be misleading since it is heavily influenced by the most common category.
Heidke skill score
HSS = [(1/N) x sum of n(Fi,Oi) - (1/N^2) x sum of N(Fi) N(Oi)] / [1 - (1/N^2) x sum of N(Fi) N(Oi)]
Answers the question: What was the accuracy of the forecast in predicting the correct category, relative to that of random chance?
Range: -∞ to 1, 0 indicates no skill. Perfect score: 1.
Characteristics: Measures the fraction of correct forecasts after eliminating those forecasts which would be correct due purely to random chance. This is one form of a generalized skill score, where the score in the numerator is the number of correct forecasts, and the reference forecast in this case is random chance. Requires a large sample size to make sure that the elements of the contingency table are all adequately sampled. In meteorology, at least, random chance is usually not the best forecast to compare to - it may be better to use climatology (long-term average value) or persistence (forecast is most recent observation, i.e., no change) or some other standard.
Hanssen and Kuipers discriminant (true skill statistic, Peirce’s skill score)
HK = [(1/N) x sum of n(Fi,Oi) - (1/N^2) x sum of N(Fi) N(Oi)] / [1 - (1/N^2) x sum of N(Oi)^2]
Answers the question: What was the accuracy of the forecast in predicting the correct category, relative to that of random chance?
Range: -1 to 1, 0 indicates no skill. Perfect score: 1
Characteristics: Similar to the Heidke skill score (above), except that in the denominator the fraction of correct forecasts due to random chance is for an unbiased forecast.
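The multi-category accuracy, Heidke and Hanssen-Kuipers scores can be sketched from a K x K table, using the standard forms in which the chance term is built from the marginal totals. The layout `table[i][j] = n(Fi, Oj)` and the function name are assumptions of this sketch. As a sanity check, the 2x2 rain example is passed in as a two-category table and should reproduce the dichotomous values.

```python
def multicategory_scores(table):
    """Return (accuracy, HSS, HK) for table[i][j] = n(Fi, Oj)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    # Fraction of forecasts on the diagonal (correct category).
    correct = sum(table[i][i] for i in range(k)) / n
    # Marginal relative frequencies of forecasts (rows) and observations (columns).
    p_fcst = [sum(table[i]) / n for i in range(k)]
    p_obs = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Fraction expected correct by chance given the marginals.
    chance = sum(pf * po for pf, po in zip(p_fcst, p_obs))
    hss = (correct - chance) / (1 - chance)
    # HK uses the chance term of an unbiased forecast in the denominator.
    hk = (correct - chance) / (1 - sum(po * po for po in p_obs))
    return correct, hss, hk


# Rows: forecast yes/no; columns: observed yes/no.
acc, hss, hk = multicategory_scores([[82, 38], [23, 222]])
print(round(acc, 2), round(hss, 2), round(hk, 2))  # 0.83 0.61 0.63
```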
Gerrity score
The score is built from elements sij of a scoring matrix, with one expression for the diagonal entries (i = j) and another for the off-diagonal entries (i ≠ j), and with the sample probabilities (observed frequencies) given by pi = N(Oi)/N. (Formulas omitted.)
Answers the question: What was the accuracy of the forecast in predicting the correct category, relative to that of random chance?
Range: -1 to 1, 0 indicates no skill. Perfect score: 1
Characteristics: Uses all entries in the contingency table, does not depend on the forecast distribution, and is equitable (i.e., random and constant forecasts score a value of 0). Unlike HSS and HK, GS does not reward conservative forecasting, but rather rewards forecasts for correctly predicting the less likely categories. Smaller errors are penalized less than larger forecast errors. This is achieved through the use of the scoring matrix. A more detailed discussion and examples for 3-category forecasts can be found in Jolliffe and Stephenson (2012).

Methods for forecasts of continuous variables

Verifying forecasts of continuous variables measures how the values of the forecasts differ from the values of the observations. The continuous verification methods and statistics will be demonstrated on a sample data set of 10 temperature forecasts taken from Stanski et al. (1989) (table omitted).
Verification of continuous forecasts often includes some exploratory plots such as scatter plots and box plots, as well as various summary scores.
Scatter plot - Plots the forecast values against the observed values.
Answers the question: How well did the forecast values correspond to the observed values?
Characteristics: Good first look at correspondence between forecast and observations. An accurate forecast will have points on or near the diagonal.
Scatter plots of the error can reveal relationships between the observed or forecast values and the errors.

Box plot - Plot boxes to show the range of data falling between the 25th and 75th percentiles, with a horizontal line inside the box showing the median value and whiskers showing the complete range of the data.
Answers the question: How well did the distribution of forecast values correspond to the distribution of observed values?
Characteristics: Shows similarity between location, spread, and skewness of forecast and observed distributions. Does not give information on the correspondence between the forecasts and observations. Box plots give information similar to histograms.
Mean error
ME = (1/N) x sum of (Fi - Oi)
Answers the question: What is the average forecast error?
Range: -∞ to ∞. Perfect score: 0.
Characteristics: Simple, familiar. Also called the (additive) bias. Does not measure the magnitude of the errors. Does not measure the correspondence between forecasts and observations, i.e., it is possible to get a perfect score for a bad forecast if there are compensating errors.
In the example above, Mean Error = 0.8 C.
(Multiplicative) bias
Bias = [(1/N) x sum of Fi] / [(1/N) x sum of Oi]
Answers the question: How does the average forecast magnitude compare to the average observed magnitude?
Range: -∞ to ∞. Perfect score: 1.
Characteristics: Simple, familiar. Best suited for quantities that have 0 as a lower or upper bound. Does not measure the magnitude of the errors. Does not measure the correspondence between forecasts and observations, i.e., it is possible to get a perfect score for a bad forecast if there are compensating errors.
In the example above, Bias = 1.06.
Mean absolute error
MAE = (1/N) x sum of |Fi - Oi|
Answers the question: What is the average magnitude of the forecast errors?
Range: 0 to ∞. Perfect score: 0.
Characteristics: Simple, familiar. Does not indicate the direction of the deviations.
In the example above, MAE = 2.8 C.
Root mean square error
RMSE = sqrt[(1/N) x sum of (Fi - Oi)^2]
Answers the question: What is the average magnitude of the forecast errors?
Range: 0 to ∞. Perfect score: 0.
Characteristics: Simple, familiar. Measures “average” error, weighted according to the square of the error. Does not indicate the direction of the deviations. The RMSE gives greater weight to large errors than to smaller errors, which may be a good thing if large errors are especially undesirable, but may also encourage conservative forecasting.
In the example above, RMSE = 3.2 C.
The root mean square factor is similar to RMSE, but gives a multiplicative error instead of an additive error.
Mean squared error
MSE = (1/N) x sum of (Fi - Oi)^2
Measures the mean squared difference between the forecasts and observations.
Range: 0 to ∞. Perfect score: 0.
Characteristics: Can be decomposed into component error sources following Murphy (1987). Units of MSE are the square of the basic units.
In the example above, MSE = 10 degrees squared.
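The summary scores for continuous variables follow directly from the paired forecast and observed values. The sketch below uses a small made-up data set (the Stanski et al. sample data are not reproduced in this article, so these numbers are purely illustrative) and an illustrative function name.

```python
import math


def continuous_scores(forecasts, observations):
    """Return mean error, multiplicative bias, MAE, MSE and RMSE."""
    n = len(forecasts)
    errors = [f - o for f, o in zip(forecasts, observations)]
    mse = sum(e * e for e in errors) / n
    return {
        "mean_error": sum(errors) / n,
        "bias": sum(forecasts) / sum(observations),  # ratio of the two means
        "mae": sum(abs(e) for e in errors) / n,
        "mse": mse,
        "rmse": math.sqrt(mse),
    }


# Hypothetical values, chosen only to make the arithmetic easy to follow.
scores = continuous_scores([3.0, 5.0, 4.0, 8.0], [2.0, 6.0, 4.0, 4.0])
print(scores)
```

In this made-up case the single error of 4 dominates: it accounts for 16 of the 18 units of summed squared error, which is why RMSE exceeds MAE.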
Linear error in probability space (LEPS)
LEPS = (1/N) x sum of |CDFo(Fi) - CDFo(Oi)|
Measures the error in probability space as opposed to measurement space, where CDFo() is the cumulative distribution function of the observations, determined from an appropriate climatology.
Range: 0 to 1. Perfect score: 0.
Characteristics: Does not discourage forecasting extreme values if they are warranted. Requires knowledge of climatological PDF. Not yet in wide usage - Potts et al. (1996) derived an improved version of the LEPS score that is equitable and does not “bend back” (give better scores for worse forecasts near the extremes) (formula omitted).
In the example above, suppose the climatological temperature is normally distributed with a mean of 14 C and variance of 50 C^2. Then according to the first expression, LEPS = 0.106.
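A minimal sketch of the basic LEPS calculation, assuming (as in the worked example) a normally distributed climatology, so that the climatological CDF can be written with the error function. The helper names are illustrative, and the sample temperatures are not reproduced here, so no attempt is made to recompute the value 0.106.

```python
import math


def normal_cdf(x, mean, variance):
    """Cumulative distribution function of a normal climatology."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * variance)))


def leps(forecasts, observations, mean, variance):
    """Mean absolute difference between forecast and observed CDF values."""
    diffs = [
        abs(normal_cdf(f, mean, variance) - normal_cdf(o, mean, variance))
        for f, o in zip(forecasts, observations)
    ]
    return sum(diffs) / len(diffs)


# A perfect forecast scores 0 regardless of how extreme the values are.
print(leps([10.0, 20.0], [10.0, 20.0], mean=14.0, variance=50.0))  # 0.0
```

Because the error is measured between CDF values, a 2-degree miss near the climatological mean costs more than the same miss far out in the tail, where the CDF is nearly flat.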
Stable equitable error in probability space (SEEPS)
The score combines n(Fi,Oj), the joint occurrence of forecast category i and observed category j in the 3x3 contingency table, with the elements of a scoring matrix (score formula and scoring matrix omitted).
Like LEPS, SEEPS measures the error in probability space as opposed to measurement space. It was developed to assess rainfall forecasts, where (1-p1) is the climatological probability of rain (i.e., accumulation exceeding 0.2 mm, following WMO guidelines), and p2=2p3 divides the climatological cumulative rainfall distribution into “light” (lower 2/3 of rain rates ≥0.2 mm) and “heavy” (upper 1/3 of rain rates ≥0.2 mm). In the accompanying diagram (not reproduced here), tL/H is the threshold delineating “light” and “heavy” rain.
Range: 0 to 1. Perfect score: 0.
Characteristics: Encourages forecasting of all categories. Resistant to hedging. Requires knowledge of climatological PDF. 1-SEEPS may be preferred as it is positively oriented. Use of locally derived thresholds allows aggregation/comparison of scores across climatologically varying regimes. For further stability require 0.1 < p1 < 0.85, that is, a climate that is neither so dry nor so wet that rain (or no rain) becomes an extreme event. For more information see Rodwell et al. (2010).
Correlation coefficient
r = sum of (Fi - Fmean)(Oi - Omean) / {sqrt[sum of (Fi - Fmean)^2] x sqrt[sum of (Oi - Omean)^2]}
Addresses the question: How well did the forecast values correspond to the observed values?
Range: -1 to 1. Perfect score: 1.
Characteristics: Good measure of linear association or phase error. Visually, the correlation measures how close the points of a scatter plot are to a straight line. Does not take forecast bias into account - it is possible for a forecast with large errors to still have a good correlation coefficient with the observations. Sensitive to outliers.
In the example above, r = 0.914.
Anomaly correlation
With anomalies F' = F - C and O' = O - C, the uncentred form is AC = sum of F'O' / sqrt[sum of F'^2 x sum of O'^2]; the centred form additionally subtracts the mean anomaly from F' and O' before forming the same ratio.
Addresses the question: How well did the forecast anomalies correspond to the observed anomalies?
Range: -1 to 1. Perfect score: 1.
Characteristics: Measures correspondence or phase difference between forecast and observations, subtracting out the climatological mean at each point, C, rather than the sample mean values. The anomaly correlation is frequently used to verify output from numerical weather prediction (NWP) models. AC is not sensitive to forecast bias, so a good anomaly correlation does not guarantee accurate forecasts. Both forms of the equation are in common use - see Jolliffe and Stephenson (2012) or Wilks (2011) for further discussion.
In the example above, if the climatological temperature is 14 C, then AC = 0.904. AC is more often used in spatial verification.
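The difference between the ordinary correlation and the anomaly correlation is only in what is subtracted before correlating: the sample means in the first case, the climatological value at each point in the second. The sketch below implements the Pearson coefficient and, as an assumption, the uncentred form of the anomaly correlation; the data are made up and the function names are illustrative.

```python
import math


def pearson_r(forecasts, observations):
    """Correlation after subtracting the sample means."""
    n = len(forecasts)
    f_mean = sum(forecasts) / n
    o_mean = sum(observations) / n
    cov = sum((f - f_mean) * (o - o_mean) for f, o in zip(forecasts, observations))
    f_var = sum((f - f_mean) ** 2 for f in forecasts)
    o_var = sum((o - o_mean) ** 2 for o in observations)
    return cov / math.sqrt(f_var * o_var)


def anomaly_correlation(forecasts, observations, climatology):
    """Uncentred correlation of anomalies relative to a per-point climatology."""
    f_anom = [f - c for f, c in zip(forecasts, climatology)]
    o_anom = [o - c for o, c in zip(observations, climatology)]
    num = sum(fa * oa for fa, oa in zip(f_anom, o_anom))
    return num / math.sqrt(sum(fa * fa for fa in f_anom) * sum(oa * oa for oa in o_anom))


# Hypothetical data: the forecast is the observation plus a constant 3-degree bias.
obs = [10.0, 12.0, 14.0, 18.0]
fcst = [13.0, 15.0, 17.0, 21.0]
print(pearson_r(fcst, obs))                         # perfect correlation despite the bias
print(anomaly_correlation(fcst, obs, [14.0] * 4))
```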
S1 score
S1 = 100 x sum of |∆F - ∆O| / sum of max(|∆F|, |∆O|), where ∆F (∆O) refers to the horizontal gradient in the forecast (observations).
Answers the question: How well did the forecast gradients correspond to the observed gradients?
Range: 0 to ∞. Perfect score: 0.
Characteristics: It is usually applied to geopotential height or sea level pressure fields in meteorology. Long historical records in NWP show improvement in model performance over the years. Because S1 depends only on gradients, good scores can be achieved even when the forecast values are biased. Also depends on spatial resolution of the forecast.
Skill score
Skill score = (score_forecast - score_reference) / (score_perfect - score_reference)
Answers the question: What is the relative improvement of the forecast over some reference forecast?
Range: Lower bound depends on what score is being used to compute skill and what reference forecast is used, but upper bound is always 1; 0 indicates no improvement over the reference forecast. Perfect score: 1.
Characteristics: Implies information about the value or worth of a forecast relative to an alternative (reference) forecast. In meteorology the reference forecast is usually persistence (no change from most recent observation) or climatology. The skill score can be unstable for small sample sizes. When MSE is the score used in the above expression then the resulting statistic is called the reduction of variance.
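The generic skill score can be written as a one-line function of three numbers: the score of the forecast, the score of the reference forecast, and the score a perfect forecast would obtain. The sketch below uses illustrative names and made-up MSE values; with MSE as the underlying score (perfect value 0) it gives the reduction of variance.

```python
def skill_score(score_forecast, score_reference, score_perfect=0.0):
    """Relative improvement of the forecast over a reference forecast."""
    return (score_forecast - score_reference) / (score_perfect - score_reference)


# Hypothetical MSE values: forecast 2.0, climatology reference 8.0, perfect 0.0.
print(skill_score(2.0, 8.0))  # 0.75: the forecast removes 75% of the reference MSE
print(skill_score(8.0, 8.0))  # 0.0: no improvement over the reference
```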

See also Methods for spatial forecasts for more scientific/diagnostic techniques.
See also Other methods for additional scores for forecasts of continuous variables.

Methods for probabilistic forecasts

概率预测的验证方法

A probabilistic forecast gives a probability of an event occurring, with a value between 0 and 1 (or 0 and 100%). In general, it is difficult to verify a single probabilistic forecast. Instead, a set of probabilistic forecasts, pi, is verified using observations that those events either occurred (oi=1) or did not occur (oi=0).
概率预测给出事件发生的概率，其值在0和1之间（或0和100%）。一般来说，很难验证一个单一的概率预报。相反，一组概率预测pi，使用那些事件发生(oi=1)或未发生(oi=0)的观测来验证。
An accurate probability forecast system has:
一个准确的概率预测系统包含：
reliability - agreement between forecast probability and mean observed frequency
可靠性-预测概率与平均观测频率的一致性
sharpness - tendency to forecast probabilities near 0 or 1, as opposed to values clustered around the mean
锐度-预测接近0或1的概率的倾向，而不是围绕平均值的数值
resolution - ability of the forecast to resolve the set of sample events into subsets with characteristically different outcomes
分辨率-预测将样本事件集分解为具有不同结果的子集的能力
Reliability diagram (called “attributes diagram” when the no-resolution and no-skill with respect to climatology lines are included)
可靠性图表-（包含被称为无分辨率且无技能W.R.T.气候线的“属性图”）

The reliability diagram plots the observed frequency against the forecast probability, where the range of forecast probabilities is divided into K bins (for example, 0-5%, 5-15%, 15-25%, etc.). The sample size in each bin is often included as a histogram or values beside the data points.
可靠性图表根据预测概率绘制观测频率，其中预测概率的范围被划分为K个区间（例如，0-5％、5-15％、15-25％等）。每一个区间中的样本大小通常作为数据点旁边的直方图或值来表示。
Answers the question: How well do the predicted probabilities of an event correspond to their observed frequencies?
回答的问题：一个事件的预测概率与其观测频率的对应程度如何？
Characteristics: Reliability is indicated by the proximity of the plotted curve to the diagonal. The deviation from the diagonal gives the conditional bias. If the curve lies below the line, this indicates overforecasting (probabilities too high); points above the line indicate underforecasting (probabilities too low). The flatter the curve in the reliability diagram, the less resolution it has. A forecast of climatology does not discriminate at all between events and non-events, and thus has no resolution. Points between the “no skill” line and the diagonal contribute positively to the Brier skill score. The frequency of forecasts in each probability bin (shown in the histogram) shows the sharpness of the forecast.
特点：可靠性是由绘制曲线与对角线的接近程度来表示的。偏离对角线给出了条件偏差。如果曲线位于线下方，则表示超预测（概率太高）；线上方的点表示低预测（概率太低）。在可靠性图中，曲线越平坦，其分辨率越低。气候学的预测根本不区分事件和非事件，因而没有分辨率。“无技能”线和对角线之间的点对Brier技能得分有积极的贡献。在每个概率区间中的预测频率（直方图中所示）显示了预测的锐度。
The reliability diagram is conditioned on the forecasts (i.e., given that an event was predicted, what was the outcome?), and can be expected to give information on the real meaning of the forecast. It is a good partner to the ROC, which is conditioned on the observations. Some users may find a reliability table (table of observed relative frequency associated with each forecast probability) easier to understand than a reliability diagram.
可靠性图表是以预测为条件的（也就是说，考虑到一个事件是预测的，结果是什么？），并可期望给出预测的真实意义的信息。它是以观测为条件的ROC的良好伙伴。一些用户可能会发现可靠性表（与每个预测概率相关的观测相对频率表）比可靠性图更容易理解。
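The numbers behind a reliability diagram (or the reliability table mentioned above) can be sketched as follows. This is an illustrative implementation, not the one used by the original page: the function name, the bin edges and the sample forecasts are all assumptions.

```python
def reliability_table(probabilities, outcomes, bin_edges):
    """One (mean forecast probability, observed frequency, count) row per non-empty bin."""
    rows = []
    last_bin = len(bin_edges) - 2
    for k in range(len(bin_edges) - 1):
        low, high = bin_edges[k], bin_edges[k + 1]
        # Bins are [low, high), except the last one, which also includes its upper edge.
        members = [
            (p, o)
            for p, o in zip(probabilities, outcomes)
            if low <= p < high or (k == last_bin and p == high)
        ]
        if not members:
            continue
        count = len(members)
        mean_probability = sum(p for p, _ in members) / count
        observed_frequency = sum(o for _, o in members) / count
        rows.append((mean_probability, observed_frequency, count))
    return rows


# Hypothetical probability forecasts and binary outcomes (1 = event occurred).
probabilities = [0.1, 0.1, 0.1, 0.1, 0.5, 0.5, 0.9, 0.9, 0.9, 0.9]
outcomes = [0, 0, 0, 1, 1, 0, 1, 1, 1, 0]

table = reliability_table(probabilities, outcomes, [0.0, 0.25, 0.75, 1.0])
for mean_probability, observed_frequency, count in table:
    print(f"{mean_probability:.2f}  {observed_frequency:.2f}  n={count}")
```

Plotting the second column against the first gives the reliability curve, and the counts give the sharpness histogram. In this sample the lowest bin lies above the diagonal (underforecasting) and the highest bin lies below it (overforecasting).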

Brier score
Brier评分
Answers the question: What is the magnitude of the probability forecast errors?
回答的问题：概率预测误差的大小是多少？
Measures the mean squared probability error. Murphy (1973) showed that it could be partitioned into three terms: (1) reliability, (2) resolution, and (3) uncertainty.
测量均方概率误差。Murphy（1973）表明它可以被划分为三个项：（1）可靠性，（2）分辨率，和（3）不确定性。
Range: 0 to 1. Perfect score: 0.
范围：0至1。完美得分：0。
Characteristics: Sensitive to the climatological frequency of the event: the rarer the event, the easier it is to get a good BS without having any real skill. Negative orientation (smaller score is better); this can be “fixed” by subtracting BS from 1.
特点：对事件的气候频率敏感：事件越罕见，就越容易在没有任何真正技能的情况下得到一个好的BS。负向（分数更好）-可以通过从1减去BS来“修正”。
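As a sketch of the mean squared probability error and of Murphy's three-term partition, the code below assumes the standard definitions `BS = (1/N) Σ (p_i - o_i)²` and `BS = reliability - resolution + uncertainty`. Forecasts are grouped by their distinct probability values so that the partition is exact. Function names and data are hypothetical.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probability and binary outcome."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


def brier_decomposition(probabilities, outcomes):
    """Return (reliability, resolution, uncertainty), grouping by distinct forecast value."""
    n = len(outcomes)
    base_rate = sum(outcomes) / n
    groups = {}
    for p, o in zip(probabilities, outcomes):
        groups.setdefault(p, []).append(o)
    reliability = 0.0
    resolution = 0.0
    for p, observed in groups.items():
        frequency = sum(observed) / len(observed)
        reliability += len(observed) * (p - frequency) ** 2
        resolution += len(observed) * (frequency - base_rate) ** 2
    uncertainty = base_rate * (1.0 - base_rate)
    return reliability / n, resolution / n, uncertainty


# Hypothetical probability forecasts and binary outcomes (1 = event occurred).
probabilities = [0.1, 0.1, 0.1, 0.1, 0.5, 0.5, 0.9, 0.9, 0.9, 0.9]
outcomes = [0, 0, 0, 1, 1, 0, 1, 1, 1, 0]

bs = brier_score(probabilities, outcomes)
reliability, resolution, uncertainty = brier_decomposition(probabilities, outcomes)

print(round(bs, 3), round(reliability - resolution + uncertainty, 3))
```

The uncertainty term depends only on the observed base rate, which is why a rare event can yield a small BS without real skill.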
Brier skill score
Brier技巧评分
Answers the question: What is the relative skill of the probabilistic forecast over that of climatology, in terms of predicting whether or not an event occurred?
回答问题：在预测事件是否发生方面，概率预测比气候学的相对技巧怎么样？
Range: -∞ to 1; 0 indicates no skill when compared to the reference forecast. Perfect score: 1.
范围：-∞至1，0表示与参考预测相比没有技能。完美得分：1。
Characteristics: Measures the improvement of the probabilistic forecast relative to a reference forecast (usually the long-term or sample climatology), thus taking climatological frequency into account. Not strictly proper. Unstable when applied to small data sets; the rarer the event, the larger the number of samples needed.
特点：衡量概率预报相对于参考预报（通常是长期或样本气候学）的改进，同时把气候频率考虑在内了。不严格的。应用到小数据集时不稳定；事件越罕见，需要的样本数量就越大。
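A minimal sketch of the Brier skill score, assuming the usual form `BSS = 1 - BS / BS_reference` with the sample climatology (the observed base rate, issued as a constant probability) as the reference forecast. Names and data are hypothetical.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probability and binary outcome."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


def brier_skill_score(probabilities, outcomes):
    """BSS relative to the sample climatology (assumed choice of reference)."""
    base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    if reference == 0.0:
        raise ValueError("BSS is undefined when the event always or never occurs")
    return 1.0 - brier_score(probabilities, outcomes) / reference


# Hypothetical probability forecasts and binary outcomes (1 = event occurred).
probabilities = [0.1, 0.1, 0.1, 0.1, 0.5, 0.5, 0.9, 0.9, 0.9, 0.9]
outcomes = [0, 0, 0, 1, 1, 0, 1, 1, 1, 0]

bss = brier_skill_score(probabilities, outcomes)
print(round(bss, 3))
```

Because the reference score shrinks towards zero for rare events, the ratio becomes noisy in small samples, which matches the instability noted above.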

Relative operating characteristic
相对操作特性
Plot the hit rate (POD) against the false alarm rate (POFD), using a set of increasing probability thresholds (for example, 0.05, 0.15, 0.25, etc.) to make the yes/no decision. The area under the ROC curve is frequently used as a score.
相对操作特性–使用增加概率阈值(例如，0.05、0.15、0.25等)来作出是/否决策的绘图命中率(POD)与虚报率(POFD)。ROC曲线下的面积经常被作为评分。

Answers the question: What is the ability of the forecast to discriminate between events and non-events?
回答的问题：预测区分事件和非事件的能力如何？
ROC: Perfect: Curve travels from bottom left to top left of diagram, then across to top right of diagram. Diagonal line indicates no skill.
ROC：完美：曲线从图左下到左上方，然后越过图表右上方。对角线表示没有技能。
ROC area: Range: 0 to 1, 0.5 indicates no skill. Perfect score: 1
ROC面积：范围：0到1，0.5表示没有技能。完美得分：1
Characteristics: ROC measures the ability of the forecast to discriminate between two alternative outcomes, thus measuring resolution. It is not sensitive to bias in the forecast, so says nothing about reliability. A biased forecast may still have good resolution and produce a good ROC curve, which means that it may be possible to improve the forecast through calibration. The ROC can thus be considered as a measure of potential usefulness.
特征：ROC测量预测是否能区分两种不同的结果，从而测量分辨率。它对预测中的偏差不敏感，所以不涉及可靠性。偏置预测仍可能具有良好的分辨率，并产生良好的ROC曲线，这意味着有可能通过校准来改进预测。因此，ROC可以被视为衡量潜在有用性的指标。
The ROC is conditioned on the observations (i.e., given that an event occurred, what was the corresponding forecast?). It is therefore a good companion to the reliability diagram, which is conditioned on the forecasts.
ROC是以观测为条件的（也就是说，如果发生了事件，那么相关的预测是什么？）因此，它是以预测为条件的可靠性图表的一个很好的同伴。
More information on ROC can be found in Mason 1982, Jolliffe and Stephenson 2012 (ch.3), and the WISE site.
关于ROC的更多信息可以在Mason 1982、Jolliffe和史蒂芬森2012（CH.3）和智者站点中找到。
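The ROC construction described above can be sketched as follows: each probability threshold turns the forecasts into yes/no decisions, giving one (POFD, POD) point, and the area is estimated with the trapezoidal rule. The trapezoidal estimate is one common choice rather than necessarily the one used by the original page. Function names, thresholds and data are hypothetical.

```python
def roc_points(probabilities, outcomes, thresholds):
    """Sorted (POFD, POD) points, including the (0, 0) and (1, 1) end points."""
    n_events = sum(outcomes)
    n_non_events = len(outcomes) - n_events
    if n_events == 0 or n_non_events == 0:
        raise ValueError("need at least one event and one non-event")
    points = {(0.0, 0.0), (1.0, 1.0)}
    for threshold in thresholds:
        # Forecast "yes" whenever the probability reaches the threshold.
        hits = sum(1 for p, o in zip(probabilities, outcomes) if p >= threshold and o == 1)
        false_alarms = sum(1 for p, o in zip(probabilities, outcomes) if p >= threshold and o == 0)
        points.add((false_alarms / n_non_events, hits / n_events))
    return sorted(points)


def roc_area(points):
    """Trapezoidal estimate of the area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical probability forecasts and binary outcomes (1 = event occurred).
probabilities = [0.1, 0.1, 0.1, 0.1, 0.5, 0.5, 0.9, 0.9, 0.9, 0.9]
outcomes = [0, 0, 0, 1, 1, 0, 1, 1, 1, 0]

points = roc_points(probabilities, outcomes, [0.05, 0.3, 0.7])
area = roc_area(points)
print(points, round(area, 2))
```

An area of 0.5 corresponds to the no-skill diagonal, so a value above 0.5 indicates some ability to discriminate events from non-events.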

Discrimination diagram
判别图
Plot the likelihood of each forecast probability when the event occurred and when it did not occur. A summary score can be computed as the absolute value of the difference between the mean values of the two distributions.
判别图-绘制事件发生时和未发生时的每个预测概率的可能性。摘要得分可以计算并作为每个分布的平均值之差值的绝对值。

Answers the question: What is the ability of the forecast to discriminate between events and non-events?
回答问题：预测区分事件和非事件的能力是如何？
Perfect discrimination is when there is no overlap between the distributions of forecast probabilities for observed events and non-events. As with the ROC, the discrimination diagram is conditioned on the observations (i.e., given that an event occurred, what was the corresponding forecast?). Some users may find the discrimination diagram easier to understand than the ROC.
当观测事件和非事件的预测概率分布之间没有重叠时，就是完全判别。与ROC一样，判别图是以观察为条件的（即，给定事件发生，相应的预测是什么？）有些用户可能会发现识别图比ROC更容易理解。
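A minimal sketch of the two conditional distributions and the summary score described above. The function names and the sample data are hypothetical.

```python
from collections import Counter


def likelihoods(probabilities, outcomes, event):
    """Relative frequency of each forecast probability, given the outcome (1 or 0)."""
    selected = [p for p, o in zip(probabilities, outcomes) if o == event]
    counts = Counter(selected)
    return {p: count / len(selected) for p, count in sorted(counts.items())}


def discrimination_score(probabilities, outcomes):
    """Absolute difference between mean forecast probability for events and non-events."""
    for_events = [p for p, o in zip(probabilities, outcomes) if o == 1]
    for_non_events = [p for p, o in zip(probabilities, outcomes) if o == 0]
    return abs(sum(for_events) / len(for_events) - sum(for_non_events) / len(for_non_events))


# Hypothetical probability forecasts and binary outcomes (1 = event occurred).
probabilities = [0.1, 0.1, 0.1, 0.1, 0.5, 0.5, 0.9, 0.9, 0.9, 0.9]
outcomes = [0, 0, 0, 1, 1, 0, 1, 1, 1, 0]

given_event = likelihoods(probabilities, outcomes, 1)
given_non_event = likelihoods(probabilities, outcomes, 0)
score = discrimination_score(probabilities, outcomes)

print(given_event, given_non_event, round(score, 2))
```

Plotting the two dictionaries as side-by-side histograms gives the discrimination diagram. The less the two distributions overlap, the larger the summary score.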

Ranked probability score
排序概率得分
where M is the number of forecast categories, pk is the predicted probability in forecast category k, and ok is an indicator (0=no, 1=yes) for the observation in category k.
排序概率得分其中，M是预测类别的数目，pk是预测类别k中的预测概率，并且ok是针对类别k中的观测的指示符(0＝否，1＝是)。
Answers the question: How well did the probability forecast predict the category that the observation fell into?
回答的问题：概率预测对观测结果的类别预测有多好？
Range: 0 to 1. Perfect score: 0.
范围：0到1。满分：0。
Characteristics: Measures the sum of squared differences in cumulative probability space for a multi-category probabilistic forecast. Penalizes forecasts more severely when their probabilities are further from the actual outcome. Negative orientation - can “fix” by subtracting RPS from 1. For two forecast categories the RPS is the same as the Brier Score.
特点：测量多类别概率预测的累积概率空间平方差之和。如果预测的概率与实际结果相差甚远，就会更严重地惩罚预测。负向-可以通过从1减去RPS来“修正”。对于两个预测类别，RPS与Brier评分相同。
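The RPS equation that defines M, pk and ok is not reproduced on this page. The sketch below assumes the usual form `RPS = 1/(M-1) * Σ_{m=1..M} [ Σ_{k=1..m} p_k - Σ_{k=1..m} o_k ]²`, which is consistent with the stated 0 to 1 range. The function name and the example forecasts are hypothetical.

```python
def ranked_probability_score(category_probabilities, observed_category):
    """RPS for one forecast; observed_category is the 0-based index of the category that occurred."""
    n_categories = len(category_probabilities)
    if n_categories < 2:
        raise ValueError("need at least two categories")
    cumulative_forecast = 0.0
    cumulative_observed = 0.0
    total = 0.0
    for k, probability in enumerate(category_probabilities):
        cumulative_forecast += probability
        cumulative_observed += 1.0 if k == observed_category else 0.0
        total += (cumulative_forecast - cumulative_observed) ** 2
    return total / (n_categories - 1)


# Two hypothetical three-category forecasts; the lowest category (index 0) occurred.
near_miss = ranked_probability_score([0.6, 0.3, 0.1], 0)
far_miss = ranked_probability_score([0.6, 0.1, 0.3], 0)

# With two categories the RPS equals the Brier score term (p - o)^2.
two_category = ranked_probability_score([0.7, 0.3], 0)

print(round(near_miss, 3), round(far_miss, 3), round(two_category, 3))
```

Both three-category forecasts give 0.6 to the observed category, but the one that puts more of the remaining probability in the most distant category scores worse, which is the distance penalty described in the characteristics.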
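Continuous version: for a continuous variable the sum over categories becomes an integral over the variable, $\mathrm{RPS} = \int_{-\infty}^{\infty} \left[P_{fcst}(x) - P_{obs}(x)\right]^2 \, dx$, where $P_{fcst}$ and $P_{obs}$ are the forecast and observed cumulative distributions.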
Ranked probability skill score
排序概率技巧评分
Answers the question: What is the relative improvement of the probability forecast over climatology in predicting the category that the observations fell into?
回答这个问题：在预测观测结果所属的类别时，概率预测相对于气候学的相对改进是什么？
Range: -∞ to 1; 0 indicates no skill when compared to the reference forecast. Perfect score: 1.
范围：-∞到1，0表示与参考预测相比没有技能。满分：1。
Characteristics: Measures the improvement of the multi-category probabilistic forecast relative to a reference forecast (usually the long-term or sample climatology). Strictly proper. Takes climatological frequency into account. Unstable when applied to small data sets.
特点：衡量相对于参考预测（通常是长期或样本气候学）的多类别概率预测的改进。绝对正确。考虑到气候频率。应用于小数据集时不稳定。
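A minimal sketch of the ranked probability skill score, assuming the usual form `RPSS = 1 - mean(RPS) / mean(RPS_reference)` with a fixed climatological distribution as the reference. The equal-odds climatology, the function names and the three cases are hypothetical.

```python
def ranked_probability_score(category_probabilities, observed_category):
    """RPS for one forecast; observed_category is a 0-based category index."""
    cumulative_forecast = 0.0
    cumulative_observed = 0.0
    total = 0.0
    for k, probability in enumerate(category_probabilities):
        cumulative_forecast += probability
        cumulative_observed += 1.0 if k == observed_category else 0.0
        total += (cumulative_forecast - cumulative_observed) ** 2
    return total / (len(category_probabilities) - 1)


def mean_rps(forecasts, observed_categories):
    """Average RPS over a set of forecast/observation pairs."""
    scores = [ranked_probability_score(f, o) for f, o in zip(forecasts, observed_categories)]
    return sum(scores) / len(scores)


def rpss(forecasts, observed_categories, climatology):
    """Skill of the forecasts relative to always issuing the climatological distribution."""
    reference = mean_rps([climatology] * len(observed_categories), observed_categories)
    if reference == 0.0:
        raise ValueError("RPSS is undefined when the reference is perfect")
    return 1.0 - mean_rps(forecasts, observed_categories) / reference


# Hypothetical three-category (below / near / above normal) forecasts.
forecasts = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.3, 0.6]]
observed_categories = [0, 1, 2]
climatology = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]

skill = rpss(forecasts, observed_categories, climatology)
print(round(skill, 4))
```

Three cases are far too few for a stable estimate; the sample is only there to make the arithmetic easy to follow.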
Relative value (value score) (Richardson, 2000; Wilks, 2001)
相对价值（价值评分）（理查德森，2000；威尔克斯，2001）

Answers the question: For a cost/loss ratio C/L for taking action based on a forecast, what is the relative improvement in economic value between climatological and perfect information?
回答的问题：对于基于预测采取行动的成本/损失比C/L，在气候信息和完美信息之间经济价值的相对改善是什么？
Range: -∞ to 1. Perfect score: 1.
范围：-∞到1。满分：1。
Characteristics: The relative value is a skill score of expected expense, with climatology as the reference forecast. Because the cost/loss ratio is different for different users of forecasts, the value is generally plotted as a function of C/L.
特点：相对价值是以气候学为参考预测的预期支出技巧评分。由于不同预测用户的成本/损失率不同，因此该值通常以C/L的函数表示。
Like ROC, it gives information that can be used in decision making. When applied to a probabilistic forecast system (for example, an ensemble prediction system), the optimal value for a given C/L may be achieved by a different forecast probability threshold than the optimal value for a different C/L. In this case it is necessary to compute relative value curves for the entire range of probabilities, then select the optimal values (the upper envelope of the relative value curves) to represent the value of the probabilistic forecast system.
和ROC一样，它提供了可以用于决策的信息。当应用于概率预测系统（例如，集合预报系统）时，给定C/L的最佳值可以通过不同于不同C/L的最佳值的预测概率阈值来实现。在这种情况下，需要计算整个概率范围的相对价值曲线，然后选择最优值（相对值曲线的上包络）来表示概率预测系统的值。
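The relative value calculation can be sketched with the simple static cost/loss model: taking action costs C whether or not the event occurs, and an unprotected event costs L. Expenses below are expressed in units of L, so only the ratio C/L matters. The value is then `V = (E_climate - E_forecast) / (E_climate - E_perfect)`. This is an illustrative sketch under those assumptions; the function names, thresholds and data are hypothetical.

```python
def contingency_table(probabilities, outcomes, threshold):
    """Counts (hits, false alarms, misses, correct negatives) for a probability threshold."""
    hits = sum(1 for p, o in zip(probabilities, outcomes) if p >= threshold and o == 1)
    false_alarms = sum(1 for p, o in zip(probabilities, outcomes) if p >= threshold and o == 0)
    misses = sum(1 for p, o in zip(probabilities, outcomes) if p < threshold and o == 1)
    correct_negatives = len(outcomes) - hits - false_alarms - misses
    return hits, false_alarms, misses, correct_negatives


def relative_value(hits, false_alarms, misses, correct_negatives, cost_loss_ratio):
    """Skill score of expected expense, with climatology as the reference."""
    n = hits + false_alarms + misses + correct_negatives
    base_rate = (hits + misses) / n
    if not (0.0 < cost_loss_ratio < 1.0 and 0.0 < base_rate < 1.0):
        raise ValueError("need 0 < C/L < 1 and an event that sometimes, but not always, occurs")
    expense_forecast = cost_loss_ratio * (hits + false_alarms) / n + misses / n
    # Climatology: always protect (cost C/L) or never protect (loss at the base rate).
    expense_climate = min(cost_loss_ratio, base_rate)
    expense_perfect = cost_loss_ratio * base_rate
    return (expense_climate - expense_forecast) / (expense_climate - expense_perfect)


def value_envelope(probabilities, outcomes, thresholds, cost_loss_ratio):
    """Best relative value over all probability thresholds for one C/L."""
    return max(
        relative_value(*contingency_table(probabilities, outcomes, t), cost_loss_ratio)
        for t in thresholds
    )


# Hypothetical probability forecasts and binary outcomes (1 = event occurred).
probabilities = [0.1, 0.1, 0.1, 0.1, 0.5, 0.5, 0.9, 0.9, 0.9, 0.9]
outcomes = [0, 0, 0, 1, 1, 0, 1, 1, 1, 0]
thresholds = [0.05, 0.3, 0.7]

value_mid = relative_value(*contingency_table(probabilities, outcomes, 0.3), 0.5)
envelope_high = value_envelope(probabilities, outcomes, thresholds, 0.7)

print(round(value_mid, 3), round(envelope_high, 3))
```

In this sample, a user with C/L = 0.7 does best with the 0.7 threshold while the 0.3 threshold gives a negative value, which is why the upper envelope over thresholds is taken for each C/L.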
See also Methods for ensemble prediction systems for more scientific/diagnostic techniques.
请参阅集合预报系统的预测验证方法获取更多科学/诊断技术。