First, load the data into R. The dataset is small, so it can be entered directly as a data frame:
```
# Annual observations for 2010 to 2019, one row per year
data <- data.frame(
  year = c(2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019),
  tourists = c(2320, 2703, 3638.49, 3970.1, 4862.1, 5027.21, 5622.02, 6290.6, 7036.59, 7739.11),
  # Note: the 2018 and 2019 values of tourism_income are identical as given
  tourism_income = c(252.8, 303.1, 435.23, 501.2, 560.3, 600.71, 681.91, 785.29, 904.76, 904.76),
  foreign_income = c(4600, 5230, 5590, 3711, 4919, 5588, 6280, 7506, 8341, 8548),
  gdp_per_capita = c(48955, 58950, 65692, 72774, 82654, 89646, 99150, 112559, 120944, 128856),
  gdp = c(2207.99, 2630.30, 2933.2, 3252.01, 3697.89, 4016.84, 4449.38, 5064.92, 5466.17, 5850.08)
)
```
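If the figures are kept in a CSV file rather than typed in, `read.csv()` produces the same kind of data frame. The sketch below is only an illustration: it passes the first three rows as inline text through the `text` argument so that it runs without any file, and the file name `tourism.csv` mentioned in the comment is hypothetical.

```r
# Inline CSV text stands in for a real file; only the first three years are included
csv_text <- "year,tourists,tourism_income,foreign_income,gdp_per_capita,gdp
2010,2320,252.8,4600,48955,2207.99
2011,2703,303.1,5230,58950,2630.30
2012,3638.49,435.23,5590,65692,2933.2"

# With a file on disk this would be read.csv("tourism.csv") (hypothetical file name)
csv_data <- read.csv(text = csv_text)

# Inspect column names and types before running any analysis
str(csv_data)
```

Checking the structure with `str()` right after import helps catch columns that were read as text instead of numbers.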
Next, compute descriptive statistics for every variable with the `summary()` function:
```
summary(data)
```
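The output is as follows (values are shown to about four significant digits; exact column spacing and last-digit rounding can differ slightly between R versions):
```
      year         tourists    tourism_income  foreign_income  gdp_per_capita        gdp
 Min.   :2010   Min.   :2320   Min.   :252.8   Min.   :3711   Min.   : 48955   Min.   :2208
 1st Qu.:2012   1st Qu.:3721   1st Qu.:451.7   1st Qu.:4997   1st Qu.: 67462   1st Qu.:3013
 Median :2014   Median :4945   Median :580.5   Median :5589   Median : 86150   Median :3857
 Mean   :2014   Mean   :4921   Mean   :593.0   Mean   :6031   Mean   : 88018   Mean   :3957
 3rd Qu.:2017   3rd Qu.:6123   3rd Qu.:759.4   3rd Qu.:7200   3rd Qu.:109207   3rd Qu.:4911
 Max.   :2019   Max.   :7739   Max.   :904.8   Max.   :8548   Max.   :128856   Max.   :5850
```
For each variable, `summary()` reports the minimum, first quartile, median, mean, third quartile, and maximum. The statistics for `year` only describe the time span and carry no analytical meaning. Note that the minimum of `foreign_income` (3711) is the 2013 value rather than the first year, because that series dips in 2013 before rising again.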
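`summary()` does not report measures of spread such as the standard deviation. As a minimal sketch of how these could be added, the block below computes the standard deviation, the coefficient of variation, and the range for each variable. The helper name `spread_stats` is made up for this example, and the data frame is repeated so that the snippet runs on its own.

```r
# Same data as above, repeated so the snippet is self-contained
data <- data.frame(
  year = 2010:2019,
  tourists = c(2320, 2703, 3638.49, 3970.1, 4862.1, 5027.21, 5622.02, 6290.6, 7036.59, 7739.11),
  tourism_income = c(252.8, 303.1, 435.23, 501.2, 560.3, 600.71, 681.91, 785.29, 904.76, 904.76),
  foreign_income = c(4600, 5230, 5590, 3711, 4919, 5588, 6280, 7506, 8341, 8548),
  gdp_per_capita = c(48955, 58950, 65692, 72774, 82654, 89646, 99150, 112559, 120944, 128856),
  gdp = c(2207.99, 2630.30, 2933.2, 3252.01, 3697.89, 4016.84, 4449.38, 5064.92, 5466.17, 5850.08)
)

# Hypothetical helper: spread statistics that summary() does not show
spread_stats <- function(x) {
  c(sd = sd(x),              # sample standard deviation
    cv = sd(x) / mean(x),    # coefficient of variation (unitless)
    range = max(x) - min(x)) # distance between largest and smallest value
}

# Apply to every column except year; transpose so each variable is a row
spread <- t(sapply(data[, -1], spread_stats))
print(round(spread, 3))
```

The coefficient of variation is unitless, so it allows the variability of series measured on very different scales (for example `tourism_income` and `gdp_per_capita`) to be compared directly.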
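To measure the pairwise linear association between the variables, use the `cor()` function, which computes Pearson correlation coefficients by default. The `year` column is excluded, and the result is rounded to two decimals for readability:
```
round(cor(data[, 2:6]), 2)
```
The output is as follows:
```
               tourists tourism_income foreign_income gdp_per_capita  gdp
tourists           1.00           0.99           0.86           0.99 0.99
tourism_income     0.99           1.00           0.85           0.99 0.99
foreign_income     0.86           0.85           1.00           0.88 0.88
gdp_per_capita     0.99           0.99           0.88           1.00 1.00
gdp                0.99           0.99           0.88           1.00 1.00
```
`tourists`, `tourism_income`, `gdp_per_capita`, and `gdp` are almost perfectly correlated with one another (0.99 or higher after rounding). `foreign_income` is also strongly correlated with the other variables, but noticeably less so (0.85 to 0.88), which is consistent with its drop to 3711 in 2013 while the other series keep rising. With only ten annual observations that all trend upward, these coefficients largely reflect the shared trend and do not by themselves indicate a causal relationship.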
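Because all of the series rise over the decade, a high Pearson coefficient can partly reflect the common time trend. The sketch below shows two possible follow-up checks, offered as an illustration rather than a required procedure: `cor.test()` to obtain a p-value and confidence interval for one pair, and correlations of year-over-year changes computed with `diff()`. The choice of the `tourists` and `gdp` pair is arbitrary, and the data frame is repeated so that the snippet runs on its own.

```r
# Same data as above, repeated so the snippet is self-contained
data <- data.frame(
  year = 2010:2019,
  tourists = c(2320, 2703, 3638.49, 3970.1, 4862.1, 5027.21, 5622.02, 6290.6, 7036.59, 7739.11),
  tourism_income = c(252.8, 303.1, 435.23, 501.2, 560.3, 600.71, 681.91, 785.29, 904.76, 904.76),
  foreign_income = c(4600, 5230, 5590, 3711, 4919, 5588, 6280, 7506, 8341, 8548),
  gdp_per_capita = c(48955, 58950, 65692, 72774, 82654, 89646, 99150, 112559, 120944, 128856),
  gdp = c(2207.99, 2630.30, 2933.2, 3252.01, 3697.89, 4016.84, 4449.38, 5064.92, 5466.17, 5850.08)
)

# Check 1: significance test for a single pair (Pearson by default)
ct <- cor.test(data$tourists, data$gdp)
print(ct$estimate)  # correlation coefficient
print(ct$p.value)   # p-value for H0: true correlation is zero
print(ct$conf.int)  # 95% confidence interval

# Check 2: correlate year-over-year changes instead of levels,
# which removes most of the shared upward trend
diffs <- as.data.frame(lapply(data[, -1], diff))  # 9 rows: 2011 to 2019 changes
diff_cor <- cor(diffs)
print(round(diff_cor, 2))
```

If the correlations of the differenced series turn out much lower than those of the levels, that would suggest the level correlations are driven mainly by the trend.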