I'll answer your questions one by one:
1. Read the Bike_share.xlsx file and name it bike_share.
```R
library(readxl)
bike_share <- read_excel("Bike_share.xlsx")
```
2. Convert the hour variable to a factor and draw a box plot comparing the total rental counts across hours.
```R
bike_share$hour <- factor(bike_share$hour)
boxplot(count ~ hour, data = bike_share)
```
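As a small illustration of why the factor conversion is safe here, the sketch below uses made-up hour values (not taken from Bike_share.xlsx). When `factor()` is given numeric input it sorts the unique values numerically before turning them into labels, so the boxes should appear in hour order rather than alphabetical order.

```R
# Toy hour values, purely illustrative
hour <- c(0, 10, 2, 10)
f <- factor(hour)
# Levels follow numeric order of the original values, not string order
levels(f)
```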
3. Add a season column that groups month into four bins: months 1-3 as "spring", 4-6 as "summer", 7-9 as "autumn", and 10-12 as "winter". Save the result as bike_share2, then compute the average rental count for each season.
```R
bike_share2 <- bike_share
bike_share2$season <- cut(bike_share2$month, breaks = c(0,3,6,9,12),
                           labels = c("spring", "summer", "autumn", "winter"), include.lowest = TRUE)
aggregate(count ~ season, data = bike_share2, mean)
```
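To see how `cut()` assigns months to seasons, here is a minimal sketch on a toy vector of months (hypothetical values, not read from the file). With the default `right = TRUE`, each interval is closed on the right, so the boundary months 3, 6, 9, and 12 fall into the earlier season.

```R
# Toy months covering both ends of every interval
month <- c(1, 3, 4, 6, 7, 9, 10, 12)
# Intervals are (0,3], (3,6], (6,9], (9,12]
season <- cut(month, breaks = c(0, 3, 6, 9, 12),
              labels = c("spring", "summer", "autumn", "winter"))
data.frame(month, season)
```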
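4. Draw a bar chart showing each season's share of the rental count within each hour, setting the position argument so that all bars have the same height.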
```R
library(ggplot2)
# Map y to count so the fill shows shares of rentals, not shares of rows
ggplot(bike_share2, aes(x = hour, y = count, fill = season)) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Share of hourly bike rentals by season", x = "Hour", y = "Percentage")
```
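The per-hour seasonal shares the question asks for can also be checked numerically. The sketch below uses a hypothetical toy data frame with the same column names (`hour`, `season`, `count`), sums the rentals per hour and season, and divides each row by its hour total, which is the normalisation a filled bar chart performs.

```R
# Hypothetical toy data: two hours, two seasons
toy <- data.frame(
  hour   = factor(c(0, 0, 0, 1, 1)),
  season = c("spring", "spring", "summer", "spring", "summer"),
  count  = c(10, 30, 60, 20, 20)
)
# Total rentals per hour (rows) and season (columns)
totals <- xtabs(count ~ hour + season, data = toy)
# Divide each row by its sum so every hour adds up to 1
shares <- prop.table(totals, margin = 1)
shares
```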
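5. Draw a bar chart of the average rental count for each hour, faceted by working days versus non-working days, with the facets arranged in 2 rows.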
```R
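# Label working vs. non-working days (assumes workingday is coded 1/0)
bike_share$day_type <- ifelse(bike_share$workingday == 1, "Workday", "Weekend/Holiday")
# stat_summary() reduces the raw rows to one mean per hour before drawing the bars
ggplot(bike_share, aes(x = hour, y = count, fill = day_type)) +
  stat_summary(fun = mean, geom = "bar") +
  facet_wrap(~day_type, nrow = 2) +
  labs(title = "Hourly bike rentals by day type", x = "Hour", y = "Average bike rentals")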
```
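If you want to inspect the averages themselves, a minimal sketch with `aggregate()` on a toy data frame is shown below. The values are invented for illustration, and the coding of `workingday` as 1/0 is the same assumption made in the answer above.

```R
# Hypothetical toy data with the same column names as bike_share
toy <- data.frame(
  hour       = c(8, 8, 8, 8, 17, 17),
  workingday = c(1, 1, 0, 0, 1, 0),
  count      = c(100, 200, 30, 50, 300, 80)
)
toy$day_type <- ifelse(toy$workingday == 1, "Workday", "Weekend/Holiday")
# One mean per combination of hour and day type
hourly_mean <- aggregate(count ~ hour + day_type, data = toy, FUN = mean)
hourly_mean
```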
6. Convert the date variable to a standard date format, sum the rental counts by date, and draw a line chart of the daily rental totals.
```R
bike_share$date <- as.Date(bike_share$date, format = "%Y-%m-%d")
daily_rentals <- aggregate(count ~ date, data = bike_share, sum)
ggplot(daily_rentals, aes(x = date, y = count)) +
  geom_line() +
  labs(title = "Daily bike rentals", x = "Date", y = "Total bike rentals")
```
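The conversion and daily summation can be tried on a tiny example first. The sketch below assumes the dates arrive as text in `YYYY-MM-DD` form; the rows are made up and are not from Bike_share.xlsx. If `read_excel()` has already parsed the column as a date-time, `as.Date()` should still return a Date and the `format` argument is not used.

```R
# Hypothetical toy data: two records on the first day, one on the second
toy <- data.frame(
  date  = c("2011-01-01", "2011-01-01", "2011-01-02"),
  count = c(16, 40, 25)
)
toy$date <- as.Date(toy$date, format = "%Y-%m-%d")
# One row per calendar day with the summed rentals
daily <- aggregate(count ~ date, data = toy, FUN = sum)
daily
```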