UCAS Climate Statistics: Notes

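These notes cover key concepts in climate statistics, including summary statistics and theoretical distributions (normal, gamma, t, and others) and their use in hypothesis testing. They explain the principles of parametric tests, such as tests of the mean and variance and tests of the correlation coefficient. They also discuss climate time series analysis, such as harmonic and spectral analysis, and the importance of testing for abrupt climate change.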

Fall semester 2019. Compiled from the Climate Statistics course taught by Yan Zhongwei (严中伟), Hua Lijuan (华丽娟), and Qian Cheng (钱诚).

Chapter 2: Summary Statistics

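Robustness: the analysis is not sensitive to the distributional characteristics of the data.
(The mean $\mu$ is appropriate when the data are normally distributed but misleading when they are skewed, so it is not robust.)

Resistance: the analysis is not unduly influenced by extreme values; that is, when a small part of the data changes, even by a large amount, the result of the statistical method does not change much. The mean $\mu$ is not resistant either.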

Location

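Mean $\mu$
Location statistics that are more robust/resistant than the mean $\mu$: the median, the trimmed mean, and the trimean $\text{Trimean}=\frac{q_{0.25}+2q_{0.5}+q_{0.75}}{4}$
Percentiles: sort the data and read off the values at given ranks (e.g., the median, the upper quartile, the lower quartile).
Geometric mean (geomean) / harmonic mean (harmmean)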
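
To make the resistance argument concrete, here is a minimal sketch comparing the mean, median, trimmed mean, and trimean on a small made-up sample before and after one value is replaced by an outlier. The `quantile` helper uses plain linear interpolation between order statistics, which is only one of several quantile conventions and is an assumption here, as are the 10% trimming fraction and the data.

```python
import statistics

def quantile(sorted_x, p):
    """Linear-interpolation quantile of already sorted data (0 <= p <= 1)."""
    pos = p * (len(sorted_x) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_x) - 1)
    return sorted_x[lo] + (pos - lo) * (sorted_x[hi] - sorted_x[lo])

def trimean(x):
    """Trimean = (q25 + 2*q50 + q75) / 4."""
    s = sorted(x)
    return (quantile(s, 0.25) + 2 * quantile(s, 0.5) + quantile(s, 0.75)) / 4

def trimmed_mean(x, frac=0.1):
    """Mean after dropping the lowest and highest `frac` of the values."""
    s = sorted(x)
    k = int(frac * len(s))
    return statistics.mean(s[k:len(s) - k])

# Hypothetical data: the second sample replaces the largest value by an outlier.
clean = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
dirty = clean[:-1] + [1000]

for name, f in [("mean", statistics.mean), ("median", statistics.median),
                ("trimmed mean", trimmed_mean), ("trimean", trimean)]:
    print(f"{name:12s} clean={f(clean):8.2f}  with outlier={f(dirty):8.2f}")
```

Only the mean moves when the outlier is introduced; the other three statistics depend on the central order statistics and stay put for this sample.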

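Spread/Dispersion (magnitude of variation)

Anomaly: departure from the mean, $x'=x-\bar{x}$

**Variance** $s^2$

Standard deviation $s$

A spread statistic that is more robust/resistant than the variance: $\text{IQR}=q_{0.75}-q_{0.25}$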
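
The same comparison can be sketched for spread: the standard deviation reacts strongly to a single extreme value, while the IQR does not. The data are made up, and the linear-interpolation `quantile` helper is an assumed convention rather than the one used in the course.

```python
import statistics

def quantile(sorted_x, p):
    """Linear-interpolation quantile of already sorted data (0 <= p <= 1)."""
    pos = p * (len(sorted_x) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_x) - 1)
    return sorted_x[lo] + (pos - lo) * (sorted_x[hi] - sorted_x[lo])

def iqr(x):
    """Interquartile range q75 - q25."""
    s = sorted(x)
    return quantile(s, 0.75) - quantile(s, 0.25)

# Hypothetical data: one value replaced by an outlier in the second sample.
clean = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
dirty = clean[:-1] + [1000]

for label, data in [("clean", clean), ("with outlier", dirty)]:
    # statistics.stdev uses the n-1 denominator, matching the sample variance s^2.
    print(f"{label:13s} s={statistics.stdev(data):8.2f}  IQR={iqr(data):5.2f}")
```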

Symmetry (statistics of distribution shape)

The sample skewness coefficient is commonly used to characterize the shape of the data distribution, that is, its symmetry about the central value.

Skewness coefficient $\gamma=\frac{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{3}}{s^{3}}$

Yule-Kendall index (more robust)
$\lambda_{YK}=\frac{\left(q_{0.75}-q_{0.5}\right)-\left(q_{0.5}-q_{0.25}\right)}{IQR}=\frac{q_{0.25}-2q_{0.5}+q_{0.75}}{IQR}$
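
As an illustration of the two symmetry measures, the sketch below evaluates the skewness coefficient and the Yule-Kendall index on a symmetric and a right-skewed sample. Both samples are invented, and the quantiles again use an assumed linear-interpolation convention.

```python
import statistics

def quantile(sorted_x, p):
    """Linear-interpolation quantile of already sorted data (0 <= p <= 1)."""
    pos = p * (len(sorted_x) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_x) - 1)
    return sorted_x[lo] + (pos - lo) * (sorted_x[hi] - sorted_x[lo])

def skewness(x):
    """gamma = [1/(n-1) * sum (x_i - xbar)^3] / s^3."""
    n = len(x)
    xbar = statistics.mean(x)
    s = statistics.stdev(x)  # n-1 denominator
    m3 = sum((xi - xbar) ** 3 for xi in x) / (n - 1)
    return m3 / s ** 3

def yule_kendall(x):
    """lambda_YK = (q25 - 2*q50 + q75) / IQR."""
    s = sorted(x)
    q25, q50, q75 = (quantile(s, p) for p in (0.25, 0.5, 0.75))
    return (q25 - 2 * q50 + q75) / (q75 - q25)

symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]     # hypothetical, symmetric about 5
skewed = [1, 2, 2, 3, 3, 4, 6, 9, 30]       # hypothetical, long right tail

for label, data in [("symmetric", symmetric), ("right-skewed", skewed)]:
    print(f"{label:13s} gamma={skewness(data):6.3f}  "
          f"lambda_YK={yule_kendall(data):6.3f}")
```

Both measures are zero for the symmetric sample and positive for the right-skewed one; the Yule-Kendall index ignores how far out the value 30 lies, which is what makes it more resistant.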

Correlation statistics

Standardized anomalies are: 1. dimensionless; 2. of mean 0 and standard deviation 1.
$z=\frac{x-\bar{x}}{s_{x}}=\frac{x^{\prime}}{s_{x}}$
Correlation: $r_{xy}=\frac{\operatorname{Cov}(x, y)}{s_{x} s_{y}}$, where the covariance is $\operatorname{Cov}(x, y)=\frac{1}{n-1} \sum_{i=1}^{n}\left[\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)\right]$
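
A short sketch of these formulas, on made-up paired data: it standardizes both variables, computes the correlation from the covariance, and checks that the same value comes out as the sum of products of standardized anomalies divided by $n-1$. The function names are illustrative only.

```python
import statistics

def standardize(x):
    """Standardized anomalies z = (x - xbar) / s_x."""
    xbar = statistics.fmean(x)
    s = statistics.stdev(x)
    return [(xi - xbar) / s for xi in x]

def covariance(x, y):
    """Sample covariance with the n-1 denominator."""
    n = len(x)
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    return sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Pearson correlation r_xy = Cov(x, y) / (s_x * s_y)."""
    return covariance(x, y) / (statistics.stdev(x) * statistics.stdev(y))

# Hypothetical paired observations.
x = [12.0, 15.5, 9.8, 20.1, 17.3, 11.2]
y = [30.2, 41.0, 22.5, 47.9, 40.3, 25.0]

zx, zy = standardize(x), standardize(y)
r = correlation(x, y)
r_from_z = sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

print(f"mean(z)={statistics.fmean(zx):.2e}  sd(z)={statistics.stdev(zx):.6f}")
print(f"r={r:.6f}  r from standardized anomalies={r_from_z:.6f}")
```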

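Pearson correlation reflects the strength of the linear relationship between paired data.
Spearman rank correlation reflects the strength of the monotonic relationship between paired data.
Autocorrelation (in time and in space)
Cross-correlation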
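
The difference between linear and monotonic association, and the idea of correlating a series with a lagged copy of itself, can be sketched as follows. Assumptions: the data are invented, the `ranks` helper does not handle ties, and `autocorrelation` uses one common estimator (lagged products of anomalies divided by the total sum of squared anomalies), which may differ from the estimator used in the course.

```python
import math
import statistics

def pearson(x, y):
    """Pearson correlation from the sample covariance and standard deviations."""
    n = len(x)
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / (n - 1)
    return cov / (statistics.stdev(x) * statistics.stdev(y))

def ranks(x):
    """Ranks 1..n of the values in x (assumes no ties)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0] * len(x)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def autocorrelation(x, lag):
    """Lag-k autocorrelation: lagged anomaly products over total squared anomalies."""
    xbar = statistics.fmean(x)
    num = sum((x[t] - xbar) * (x[t + lag] - xbar) for t in range(len(x) - lag))
    den = sum((xi - xbar) ** 2 for xi in x)
    return num / den

# Monotonic but strongly nonlinear relationship (hypothetical data).
x = list(range(1, 11))
y = [math.exp(xi) for xi in x]
print(f"Pearson={pearson(x, y):.3f}  Spearman={spearman(x, y):.3f}")

# A series that flips sign every step has strongly negative lag-1 autocorrelation.
alternating = [1.0, -1.0] * 5
print(f"lag-1 autocorrelation={autocorrelation(alternating, 1):.2f}")
```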

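Empirical distributions

Histogram and cumulative frequency distribution: both show where the data are concentrated.

**Glyph scatterplot**: a scatterplot whose points carry extra information, for example color or size encoding additional variables.

Correlation matrix

Scatterplot matrix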

Looking vertically along the column for Ithaca precipitation, or horizontally along the row for Canandaigua precipitation, the eye is drawn to the largest few data values, which appear to line up. Most of the precipitation points correspond to small amounts and therefore hug the opposite axes. Focusing on the plot of Canandaigua versus Ithaca precipitation, **it is apparent that the two locations received most of their precipitation for the month on the same few days**. Also evident is the association of precipitation with milder minimum temperatures that was seen in previous examinations of these same data. The closer relationships between maximum and maximum, or minimum and minimum temperature variables at the two locations (as compared to the maximum versus minimum-temperature relationships at one location) can also be seen clearly.

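Correlation map (an extension of the correlation matrix), e.g., the one-point correlation map. Spatial correlation generally weakens with distance, but teleconnections also exist between distant locations.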
The surprising feature in Figure 3.28 is the region in the eastern tropical Pacific, centered on Easter Island, for which the correlations with Djakarta pressure are strongly negative. This negative correlation implies that in years when average pressures at Djakarta (and nearby locations, such as Darwin) are high, pressures in the eastern Pacific are low, and vice versa. This correlation pattern is an expression in the surface pressure data of the El Niño-Southern Oscillation (ENSO) phenomenon, sketched in Example 3.5, and is an example of what has come to be known as a teleconnection pattern. In the ENSO warm phase, the center of tropical Pacific convection moves eastward, producing lower than average pressures near Easter Island and higher than average pressures at Djakarta. When the precipitation shifts westward during the cold phase, pressures are low at Djakarta and high at Easter Island.

Teleconnectivity summarizes the one-point correlation maps by taking, for each base gridpoint $i$, a value from the correlation matrix: $T_{i}=\left|\min_{j} r_{i,j}\right|$ (p. 70)
Figure 3.29 Teleconnectivity, or absolute value of the strongest negative correlation from each of many one-point correlation maps plotted at the base gridpoint, for winter 500-mb heights. From Wallace and Blackmon (1983).
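
The teleconnectivity definition amounts to a one-line reduction over each row of the correlation matrix. The sketch below uses a tiny hand-written 3x3 matrix as a stand-in for real gridpoint correlations; both the matrix values and the function name are hypothetical.

```python
def teleconnectivity(r):
    """T_i = |min_j r[i][j]|: strength of the most negative correlation in row i."""
    return [abs(min(row)) for row in r]

# Hypothetical symmetric correlation matrix for three gridpoints.
r = [
    [1.0, 0.6, -0.8],
    [0.6, 1.0, -0.3],
    [-0.8, -0.3, 1.0],
]

T = teleconnectivity(r)
print(T)
```

Gridpoints 0 and 2 share the strongest negative correlation, so both receive the largest teleconnectivity value; on a real grid, plotting $T_i$ at each base gridpoint gives a map like the one described in the caption.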

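Theoretical distributions

Advantages:

  • Compactness: a few parameters suffice to describe the data, with no need to handle every data value as an empirical distribution requires.
  • Smoothing and interpolation: the fitted distribution is continuous and less affected by outliers.
  • Extrapolation: a theoretical distribution helps estimate the probability of values in the tails, beyond the range of the observed data.

Discrete distributions are not covered.

Continuous distributions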

PDF (probability density function): $\int_{x} f(x)\,dx=1$

CDF (cumulative distribution function)

$F(x)=\operatorname{Pr}\{X \leq x\}=\int_{X \leq x} f(x)\,dx$
$0 \leq F(x) \leq 1$
$F^{-1}(p)=x(F)$: given a probability, the CDF can be inverted to recover the value of the random variable.
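
These three properties can be checked numerically. The sketch below uses the standard normal distribution from Python's `statistics.NormalDist` purely as an example of a continuous distribution; the integration limits and step count in the trapezoidal rule are arbitrary choices.

```python
from statistics import NormalDist

dist = NormalDist(mu=0.0, sigma=1.0)  # standard normal, chosen only as an example

def integrate_pdf(d, a, b, steps=10_000):
    """Trapezoidal-rule integral of d.pdf over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (d.pdf(a) + d.pdf(b))
    for k in range(1, steps):
        total += d.pdf(a + k * h)
    return total * h

area = integrate_pdf(dist, -8.0, 8.0)   # the PDF integrates to (nearly) 1
p = dist.cdf(1.0)                       # F(x) = Pr{X <= x}
x_back = dist.inv_cdf(p)                # F^{-1}(p) recovers x

print(f"area under pdf = {area:.6f}")
print(f"F(1.0) = {p:.6f}, F^-1(F(1.0)) = {x_back:.6f}")
```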

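Central limit theorem: for very large $n$, the arithmetic mean or the sum of independent, identically distributed data is approximately normally distributed; the mean follows $N(\mu, \sigma^{2}/n)$ and the sum follows $N(n\mu, n\sigma^{2})$.

Law of large numbers: when the number of trials is large, the relative frequency of an event can be used in place of its probability.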
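
Both statements can be illustrated by simulation. In this sketch the parent distribution is Exponential(1) (mean 1, variance 1, clearly non-normal), so the sample means should have mean close to 1 and variance close to $1/n$; the event for the law of large numbers has true probability 0.3. The distribution, sample sizes, and seed are all arbitrary choices made for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

def sample_means(n, trials):
    """Means of `trials` independent samples of size n from Exponential(1)."""
    return [statistics.fmean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(trials)]

# Central limit theorem: the sample mean has mean mu and variance sigma^2 / n.
n = 50
means = sample_means(n, trials=2000)
mean_of_means = statistics.fmean(means)
var_of_means = statistics.variance(means)
print(f"mean of sample means = {mean_of_means:.3f} (mu = 1)")
print(f"n * variance of sample means = {n * var_of_means:.3f} (sigma^2 = 1)")

# Law of large numbers: relative frequency approaches the event probability.
trials = 100_000
freq = sum(random.random() < 0.3 for _ in range(trials)) / trials
print(f"relative frequency = {freq:.4f} (probability = 0.3)")
```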

Gaussian $z=\frac{x-\bar{x}}{s}$
Gamma

$\alpha$: shape parameter; $\beta$: scale parameter

$D=\ln (\bar{x})-\frac{1}{n} \sum_{i=1}^{n} \ln \left(x_{i}\right)$

$\hat{\alpha}=\frac{0.5000876+0.1648852 D-0.0544274 D^{2}}{D},\quad 0 \leq D \leq 0.5772$
or $\hat{\alpha}=\frac{8.898919+9.059950 D+0.9775373 D^{2}}{17.79728 D+11.968477 D^{2}+D^{3}},\quad 0.5772 \leq D \leq 17.0$
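
The shape estimate above is straightforward to code. The following sketch computes $D$ and $\hat{\alpha}$ with the two polynomial approximations and then, as an added assumption not stated in these notes, sets the scale to $\hat{\beta}=\bar{x}/\hat{\alpha}$, which follows from the gamma mean being $\alpha\beta$. The synthetic data come from `random.gammavariate` with shape 2 and scale 3, chosen arbitrarily so the estimates can be compared with known values.

```python
import math
import random

def gamma_fit(x):
    """Estimate gamma shape (alpha) and scale (beta) from strictly positive data."""
    n = len(x)
    xbar = sum(x) / n
    D = math.log(xbar) - sum(math.log(xi) for xi in x) / n
    if D <= 0:
        raise ValueError("D must be positive (data must not all be equal)")
    if D <= 0.5772:
        alpha = (0.5000876 + 0.1648852 * D - 0.0544274 * D ** 2) / D
    elif D <= 17.0:
        alpha = ((8.898919 + 9.059950 * D + 0.9775373 * D ** 2)
                 / (17.79728 * D + 11.968477 * D ** 2 + D ** 3))
    else:
        raise ValueError("D outside the range covered by the approximation")
    beta = xbar / alpha  # assumption: scale from the relation mean = alpha * beta
    return alpha, beta

random.seed(1)  # fixed seed so the sketch is repeatable
# Synthetic sample with known parameters: shape 2, scale 3.
data = [random.gammavariate(2.0, 3.0) for _ in range(20_000)]
alpha_hat, beta_hat = gamma_fit(data)
print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
```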
