Building a count pivot table in R with the built-in table function and the spread function from tidyr

Article link: https://blog.csdn.net/weixin_41578567/article/details/103328571

Generating a count pivot table in R with spread from tidyr

1. Sample data
2. Two approaches (built-in table function and tidyr's spread function)
- 2.1 Built-in table function
- 2.2 The spread function from tidyr

1. Sample data

Start with the structure of the sample data (total_score).
It has two main columns: company_industry and final_class.
Goal: count the rows for each final_class within each company_industry and lay the counts out as a pivot table.
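
To make the goal concrete, here is a minimal stand-in for total_score. The values are made up for illustration and are not the author's data; only the two column names come from the article.

```r
# Hypothetical stand-in for total_score; the rows are invented for illustration.
total_score <- data.frame(
  company_industry = c("Finance", "Finance", "Finance", "Retail", "Retail", "Energy"),
  final_class      = c("A", "A", "B", "B", "C", "A")
)

# One row per entity, two columns.
str(total_score)
```

The target pivot table would have one row per company_industry and one column per final_class, with the row counts in the cells.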

2. Two approaches (built-in table function and tidyr's spread function)

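table
- The code is short, but getting an ordinary data frame out of it takes extra processing, and it can only count.
spread
- Combines with dplyr to compute other summary statistics as well, so it is more flexible.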
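
The difference in flexibility can be sketched as follows. The data and the numeric column score are assumptions added for illustration; the article only names company_industry and final_class.

```r
library(dplyr)
library(tidyr)

# Hypothetical data; `score` is an assumed numeric column, not from the article.
total_score <- data.frame(
  company_industry = c("Finance", "Finance", "Finance", "Retail", "Retail", "Energy"),
  final_class      = c("A", "A", "B", "B", "C", "A"),
  score            = c(90, 80, 70, 60, 50, 85)
)

# table() can only count the combinations.
counts <- table(total_score$company_industry, total_score$final_class)

# With dplyr + spread the summary can be any statistic, here the mean score.
mean_wide <- total_score %>%
  group_by(company_industry, final_class) %>%
  summarise(mean_score = mean(score)) %>%
  ungroup() %>%
  spread(final_class, mean_score)

print(counts)
print(mean_wide)
```

Swapping mean(score) for n(), sum(score), or another summary changes the statistic without touching the reshaping step.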

2.1 Built-in table function

# Cross-tabulate the two columns, then convert the two-way table to a wide data frame
t1 = as.data.frame.matrix(table(total_score$company_industry, total_score$final_class))

Inspecting t1 shows one row per company_industry value and one column per final_class value.

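To get this wide layout from table, the result has to be converted with as.data.frame.matrix (plain as.data.frame would return the counts in long format, with a Freq column). The first argument to table (the company_industry column) ends up as the row names rather than as a column of its own, so it has to be added back separately: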

# Prepend the row names as a regular column
t1 = cbind(row.names(t1), t1)
# Rename that new first column
names(t1)[1] = 'company_industry'

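
The table approach can be run end to end as in the sketch below. The data is a made-up stand-in for total_score, and naming the column inside cbind and clearing the row names are small variations on the steps above, not the author's exact code.

```r
# Hypothetical stand-in for total_score; the rows are invented for illustration.
total_score <- data.frame(
  company_industry = c("Finance", "Finance", "Finance", "Retail", "Retail", "Energy"),
  final_class      = c("A", "A", "B", "B", "C", "A")
)

# Cross-tabulate, then convert the two-way table to a wide data frame.
t1 <- as.data.frame.matrix(
  table(total_score$company_industry, total_score$final_class)
)

# Move the row names into a named first column, then drop them as row names.
t1 <- cbind(company_industry = row.names(t1), t1)
row.names(t1) <- NULL

print(t1)
```

Combinations that never occur show up as 0 here, because table counts every pairing of the observed values.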

2.2 The spread function from tidyr

First load the required packages:

library(dplyr)
library(tidyr)

Look at the result of group_by + summarise first:

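# Note: dplyr verbs take column names as bare names, without quotes.
# %>% must end a line rather than start one, otherwise R evaluates the first line on its own.
# '主体数量' means "number of entities".
total_score %>%
  group_by(company_industry, final_class) %>%
  summarise('主体数量' = n())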

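
As a side note, dplyr's count() gives the same long-format counts in one step. The sketch below uses made-up data and an English column name n in place of the article's '主体数量'.

```r
library(dplyr)

# Hypothetical stand-in for total_score; the rows are invented for illustration.
total_score <- data.frame(
  company_industry = c("Finance", "Finance", "Finance", "Retail", "Retail", "Energy"),
  final_class      = c("A", "A", "B", "B", "C", "A")
)

# Explicit grouping followed by a row count per group.
long_a <- total_score %>%
  group_by(company_industry, final_class) %>%
  summarise(n = n()) %>%
  ungroup()

# count() groups, counts, and ungroups in a single call.
long_b <- total_score %>%
  count(company_industry, final_class)

print(long_b)
```

Only combinations that actually occur get a row, which is why the wide table built from this result can contain NA.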
Then use spread to turn it into a pivot table:

# Each final_class value becomes a column holding its count
total_score %>%
  group_by(company_industry, final_class) %>%
  summarise('主体数量' = n()) %>%
  spread(final_class, '主体数量')

By default spread leaves combinations that do not occur as NA; to replace them with another value, pass the fill argument:

total_score %>%
  group_by(company_industry, final_class) %>%
  summarise('主体数量' = n()) %>%
  # fill = 0 replaces the NA cells with 0
  spread(final_class, '主体数量', fill = 0)

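
In tidyr 1.0.0 and later, spread is superseded by pivot_wider, so readers on a recent version may prefer the equivalent below. This is a sketch on made-up data, not the author's code; it assumes tidyr 1.0.0 or newer and uses n as the count column name.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for total_score; the rows are invented for illustration.
total_score <- data.frame(
  company_industry = c("Finance", "Finance", "Finance", "Retail", "Retail", "Energy"),
  final_class      = c("A", "A", "B", "B", "C", "A")
)

wide <- total_score %>%
  count(company_industry, final_class) %>%   # long format: one row per observed pair
  pivot_wider(
    names_from  = final_class,               # new column names come from final_class
    values_from = n,                         # cell values come from the counts
    values_fill = list(n = 0)                # missing combinations become 0
  )

print(wide)
```

The values_fill argument plays the role of fill = 0 in spread.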