20180402-A · US Tuition Costs · ggplot2, line chart · R data visualization case study with source code

Full collection of all posts: Tidy Tuesday

2018 collection: 2018

US Tuition Costs

Average Tuition and Educational Attainment in the United States


Tidy Tuesday on GitHub:
Thomas Mock (2022). Tidy Tuesday: A weekly data project aimed at the R ecosystem. https://github.com/rfordatascience/tidytuesday





1. Environment setup

# Use a mirror in China to speed up package installation
options("repos" = c(CRAN = "https://mirrors.tuna.tsinghua.edu.cn/CRAN/"))

2. Set the working directory

wkdir <- '/home/user/R_workdir/TidyTuesday/2018/2018-04-02_US_Tuition_Costs/src-a'
setwd(wkdir)

3. Load R packages

library(scales)
library(tidyverse)
library(extrafont)
library(gghighlight)
library(showtext)
# Tested on Ubuntu; without this, Chinese characters in the plot come out garbled
showtext_auto()

4. Load the data

df_input <- readxl::read_excel("../data/us_avg_tuition.xlsx")

# Take a quick look at the data
glimpse(df_input)
## Rows: 50
## Columns: 13
## $ State     <chr> "Alabama", "Alaska", "Arizona", "Arkansas", "California", "C…
## $ `2004-05` <dbl> 5682.838, 4328.281, 5138.495, 5772.302, 5285.921, 4703.777, …
## $ `2005-06` <dbl> 5840.550, 4632.623, 5415.516, 6082.379, 5527.881, 5406.967, …
## $ `2006-07` <dbl> 5753.496, 4918.501, 5481.419, 6231.977, 5334.826, 5596.348, …
## $ `2007-08` <dbl> 6008.169, 5069.822, 5681.638, 6414.900, 5672.472, 6227.002, …
## $ `2008-09` <dbl> 6475.092, 5075.482, 6058.464, 6416.503, 5897.888, 6284.137, …
## $ `2009-10` <dbl> 7188.954, 5454.607, 7263.204, 6627.092, 7258.771, 6948.473, …
## $ `2010-11` <dbl> 8071.134, 5759.153, 8839.605, 6900.912, 8193.739, 7748.201, …
## $ `2011-12` <dbl> 8451.902, 5762.421, 9966.716, 7028.991, 9436.426, 8315.632, …
## $ `2012-13` <dbl> 9098.069, 6026.143, 10133.503, 7286.580, 9360.574, 8792.856,…
## $ `2013-14` <dbl> 9358.929, 6012.445, 10296.200, 7408.495, 9274.193, 9292.954,…
## $ `2014-15` <dbl> 9496.084, 6148.808, 10413.844, 7606.410, 9186.824, 9298.599,…
## $ `2015-16` <dbl> 9751.101, 6571.340, 10646.278, 7867.297, 9269.844, 9748.188,…
# Check the column names
colnames(df_input)
##  [1] "State"   "2004-05" "2005-06" "2006-07" "2007-08" "2008-09" "2009-10"
##  [8] "2010-11" "2011-12" "2012-13" "2013-14" "2014-15" "2015-16"

5. Data preprocessing

5.1 Rename columns

# Shorten the column names so they are easier to show on the plot
tuition <- df_input %>%
  dplyr::rename(
    "state" = "State",
    "2004" = "2004-05",
    "2005" = "2005-06",
    "2006" = "2006-07",
    "2007" = "2007-08",
    "2008" = "2008-09",
    "2009" = "2009-10",
    "2010" = "2010-11",
    "2011" = "2011-12",
    "2012" = "2012-13",
    "2013" = "2013-14",
    "2014" = "2014-15",
    "2015" = "2015-16")

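Listing every column by hand works, but the same renaming can also be done by pattern. The sketch below is only an illustration on made-up toy data (the `toy` table and its values are assumptions, not part of the tuition file): it strips the trailing "-YY" from every "YYYY-YY" column with `dplyr::rename_with()` and lower-cases `State`.

```r
library(dplyr)

# Toy data with the same column-name pattern as the tuition table (values are made up)
toy <- tibble::tibble(State = c("A", "B"),
                      `2004-05` = c(1, 2),
                      `2005-06` = c(3, 4))

# Keep only the starting year of each "YYYY-YY" column, then lower-case "State"
toy_renamed <- toy %>%
  rename_with(~ sub("-[0-9]{2}$", "", .x),
              .cols = matches("^[0-9]{4}-[0-9]{2}$")) %>%
  rename(state = State)

colnames(toy_renamed)
```

This form keeps working if more academic years are added to the spreadsheet, whereas the explicit list has to be extended by hand.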
5.2 Reshape the data

# Pivot the data from wide format to long format
tuition <- pivot_longer(tuition, 
                        cols = "2004":"2015",
                        names_to = "year",
                        values_to = "cost")
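To make the reshaping step concrete, here is a minimal sketch on a made-up two-state table (`toy_wide` and its numbers are assumptions for illustration only). It selects the year columns with `cols = -state`, meaning "everything except `state`", instead of a column range.

```r
library(tidyr)

# Toy wide table: one row per state, one column per year (values are made up)
toy_wide <- tibble::tibble(state = c("A", "B"),
                           `2004` = c(100, 200),
                           `2005` = c(150, 260))

# Each year column becomes a (year, cost) pair, giving one row per state and year
toy_long <- pivot_longer(toy_wide,
                         cols = -state,
                         names_to = "year",
                         values_to = "cost")
toy_long
```

Note that `year` comes out as a character column, because it is built from the column names. That is why the plotting code later converts it with `as.numeric(year)`.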

5.3 Compute the change and percent change per state

df_plot <- tuition %>%
  group_by(state) %>%
  arrange(year) %>%
  # Call the function as dplyr::mutate; otherwise it can clash with the plyr function of the same name (I hit this error myself...)
  dplyr::mutate(change = (dplyr::last(cost) - dplyr::first(cost)),
         change_perc = change/dplyr::first(cost)*100) %>%
  ungroup()
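The grouped calculation can be checked by hand on a tiny example. The sketch below uses made-up numbers (`toy_long` is an assumption for illustration) and sorts within each group via `arrange(..., .by_group = TRUE)`, a small variation on the code above that gives the same first and last values per state.

```r
library(dplyr)

# Toy long table: two states, three years each (values are made up)
toy_long <- tibble::tibble(
  state = c("A", "A", "A", "B", "B", "B"),
  year  = c("2004", "2005", "2006", "2004", "2005", "2006"),
  cost  = c(100, 150, 250, 200, 210, 220)
)

toy_change <- toy_long %>%
  group_by(state) %>%
  arrange(year, .by_group = TRUE) %>%
  # Within each state: last year's cost minus first year's cost,
  # then that difference as a percentage of the first year's cost
  dplyr::mutate(change = dplyr::last(cost) - dplyr::first(cost),
                change_perc = change / dplyr::first(cost) * 100) %>%
  ungroup()

# One row per state: A changes by 150 (150%), B by 20 (10%)
distinct(toy_change, state, change, change_perc)
```

Every row of a state carries the same `change` and `change_perc`, which is what lets a row-wise condition such as `change_perc > 100` pick out whole lines later on.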

6. Plot with ggplot2

# Note: the plot is built step by step here for explanation; in practice the steps can be chained together
gg <- ggplot(df_plot, aes(x = as.numeric(year), y = cost, color = state))
gg <- gg + geom_line(size = 2)
gg <- gg + geom_point(size = 4)
gg <- gg + scale_colour_viridis_d()
# gghighlight highlights the lines that meet the condition and greys out the rest
gg <- gg + gghighlight(change_perc > 100, use_group_by = FALSE, unhighlighted_params = list(color = "grey", size = .5))
gg <- gg + scale_x_continuous(breaks = seq(2004, 2015, 1))
# Use scales::dollar_format() to show values with a dollar sign
gg <- gg + scale_y_continuous(labels = scales::dollar_format())
gg <- gg + labs(title = "Change in college tuition across US states",
                subtitle = "From 2004 to 2015, tuition rose by more than 100% in three states",
                x = NULL,
                y = NULL)
# theme_minimal(): a minimal theme without axis borders
gg <- gg + theme_minimal()
# theme() adjusts non-data elements to polish the final appearance
gg <- gg + theme(
  # panel.grid.major: major grid lines; element_blank() removes them
  panel.grid.major = element_blank(),
  # panel.grid.minor: minor grid lines; element_blank() removes them
  panel.grid.minor = element_blank(),
  # axis.text: axis tick labels
  axis.text = element_text(color = "black", size = 12),
  # axis.title: axis titles
  axis.title = element_text(color = "black", size = 10),
  # plot.title: main title
  plot.title = element_text(color = "black", size = 20, face = "bold"),
  # plot.subtitle: subtitle
  plot.subtitle = element_text(color = "red", size = 12),
  # plot.background: background of the whole plot
  plot.background = element_rect(fill = "white"))
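The `labels = scales::dollar_format()` argument used for the y axis works because `dollar_format()` returns a function, and ggplot2 calls that function on the axis breaks. A minimal sketch of this outside of a plot (the input numbers are made up):

```r
library(scales)

# dollar_format() returns a labelling function rather than a formatted string
fmt <- dollar_format()

# Applying it to numbers gives the text that would appear on the axis
fmt(c(5000, 10000))
# "$5,000"  "$10,000"
```

In recent versions of scales the same labeller is also available as `label_dollar()`, so either name can be passed to `scale_y_continuous(labels = ...)`.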

7. Save the plot as PDF and PNG

gg


albert = '20180402-A-01'
ggsave(filename = paste0(albert, ".pdf"), width = 8.6, height = 5, device = cairo_pdf)
ggsave(filename = paste0(albert, ".png"), width = 8.6, height = 5, dpi = 100, device = "png")

8. Session info

sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.4 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] showtext_0.9-5    showtextdb_3.0    sysfonts_0.8.8    gghighlight_0.3.3
##  [5] extrafont_0.18    forcats_0.5.2     stringr_1.4.1     dplyr_1.0.10     
##  [9] purrr_0.3.4       readr_2.1.2       tidyr_1.2.1       tibble_3.1.8     
## [13] ggplot2_3.3.6     tidyverse_1.3.2   scales_1.2.1     
## 
## loaded via a namespace (and not attached):
##  [1] httr_1.4.4          sass_0.4.2          jsonlite_1.8.0     
##  [4] viridisLite_0.4.1   modelr_0.1.9        bslib_0.4.0        
##  [7] assertthat_0.2.1    highr_0.9           googlesheets4_1.0.1
## [10] cellranger_1.1.0    yaml_2.3.5          ggrepel_0.9.1      
## [13] Rttf2pt1_1.3.10     pillar_1.8.1        backports_1.4.1    
## [16] glue_1.6.2          extrafontdb_1.0     digest_0.6.29      
## [19] rvest_1.0.3         colorspace_2.0-3    htmltools_0.5.3    
## [22] pkgconfig_2.0.3     broom_1.0.1         haven_2.5.1        
## [25] tzdb_0.3.0          googledrive_2.0.0   generics_0.1.3     
## [28] farver_2.1.1        ellipsis_0.3.2      cachem_1.0.6       
## [31] withr_2.5.0         cli_3.3.0           magrittr_2.0.3     
## [34] crayon_1.5.1        readxl_1.4.1        evaluate_0.16      
## [37] fs_1.5.2            fansi_1.0.3         xml2_1.3.3         
## [40] textshaping_0.3.6   tools_4.2.1         hms_1.1.2          
## [43] gargle_1.2.1        lifecycle_1.0.1     munsell_0.5.0      
## [46] reprex_2.0.2        compiler_4.2.1      jquerylib_0.1.4    
## [49] systemfonts_1.0.4   rlang_1.0.5         grid_4.2.1         
## [52] rstudioapi_0.14     labeling_0.4.2      rmarkdown_2.16     
## [55] gtable_0.3.1        DBI_1.1.3           R6_2.5.1           
## [58] lubridate_1.8.0     knitr_1.40          fastmap_1.1.0      
## [61] utf8_1.2.2          ragg_1.2.3          stringi_1.7.8      
## [64] Rcpp_1.0.9          vctrs_0.4.1         dbplyr_2.2.1       
## [67] tidyselect_1.1.2    xfun_0.32

Test data

Download the accompanying data: us_avg_tuition.xlsx
