Common R data frame operations

vshadow

Category: R language

Article link: https://blog.csdn.net/vshadow/article/details/79383639

Get the data dimensions (rows, columns)

> dim(df)

[1] 14  5
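The post does not show how `df` was created, so here is a minimal sketch on a toy data frame. The name `weather` and its values are assumptions made for illustration. It shows that `dim()` returns the row count followed by the column count, and that `nrow()` and `ncol()` return each part separately.

```r
# Toy data frame (illustrative values, not the data set used in this post)
weather <- data.frame(
  Outlook = c("sunny", "rainy", "overcast", "rainy"),
  Windy   = c(FALSE, TRUE, FALSE, TRUE),
  stringsAsFactors = TRUE  # R >= 4.0 no longer converts strings to factors by default
)

d      <- dim(weather)   # integer vector: c(rows, columns)
n_rows <- nrow(weather)  # same value as d[1]
n_cols <- ncol(weather)  # same value as d[2]
print(d)
```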

Get the data structure

> str(df)

'data.frame':    14 obs. of  5 variables:
 $ Outlook    : Factor w/ 3 levels "overcast","rainy",..: 3 3 1 2 2 2 1 3 3 2 ...
 $ Temperature: Factor w/ 3 levels "cool","hot","mild": 2 2 2 3 1 1 1 3 1 3 ...
 $ Humidity   : Factor w/ 2 levels "high","normal": 1 1 1 1 2 2 2 1 2 2 ...
 $ Windy      : logi  FALSE TRUE FALSE FALSE FALSE TRUE ...
 $ Play       : Factor w/ 2 levels "no","yes": 1 1 2 2 2 1 2 1 2 2 ...

Get summary statistics

> summary(df)

     Outlook  Temperature   Humidity   Windy          Play 
 overcast:4   cool:4      high  :7   Mode :logical   no :5 
 rainy   :5   hot :4      normal:7   FALSE:8         yes:9 
 sunny   :5   mild:6                 TRUE :6

Select specific columns

> ndata <- data.frame(df$Temperature, df$Outlook)

> ndata

   df.Temperature df.Outlook
1             hot      sunny
2             hot      sunny
3             hot   overcast
4            mild      rainy
5            cool      rainy
6            cool      rainy
7            cool   overcast
8            mild      sunny
9            cool      sunny
10           mild      rainy
11           mild      sunny
12           mild   overcast
13            hot   overcast
14           mild      rainy
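Building a new data frame from `df$...` vectors produces prefixed column names such as `df.Temperature`. As an alternative sketch, indexing by column name keeps the original names. The `weather` data frame below is a toy example with assumed values.

```r
# Toy data frame (illustrative values only)
weather <- data.frame(
  Outlook     = c("sunny", "overcast", "rainy"),
  Temperature = c("hot", "hot", "mild"),
  Humidity    = c("high", "high", "normal"),
  stringsAsFactors = TRUE
)

# Select two columns by name; the result keeps the original column names
sel <- weather[, c("Temperature", "Outlook")]

# Selecting a single column returns a vector unless drop = FALSE is given
one <- weather[, "Outlook", drop = FALSE]

print(names(sel))
```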

View part of the data:

head(df)   # first 6 rows by default

tail(df)   # last 6 rows by default

Assign new column names

> names(ndata) <- c("temp", "out")

> names(ndata)

[1] "temp" "out"

Get the column names

> names(tdata)

[1] "Outlook" "Temperature" "Humidity" "Windy" "Play"

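Type conversion

# as.character() turns a logical column into the strings "TRUE" / "FALSE"
df$Windy <- as.character(df$Windy)

# Only needed when Windy is coded as 0/1; for a logical column these two lines change nothing
df$Windy[df$Windy == "0"] <- "FALSE"
df$Windy[df$Windy == "1"] <- "TRUE"

df$Windy <- as.factor(df$Windy)

> summary(df$Windy)

FALSE  TRUE 
    8     6 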
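The "0" / "1" replacement step only matters when the column is stored as 0/1 codes. A minimal sketch of that case, using a hypothetical vector `windy_raw` that is not part of the original data, maps the codes to labels in one call to `factor()`:

```r
# Hypothetical 0/1 coded column (illustrative values)
windy_raw <- c(0, 1, 0, 0, 1)

# Map the codes 0 and 1 to the labels "FALSE" and "TRUE" in a single step
windy <- factor(windy_raw, levels = c(0, 1), labels = c("FALSE", "TRUE"))

# Count how often each level occurs
counts <- table(windy)
print(counts)
```

If a logical vector is wanted instead of a factor, `as.logical(windy_raw)` converts 0 to FALSE and 1 to TRUE directly.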

Extract the first 2 rows

> result <- tdata[1:2,]

> result

  Outlook Temperature Humidity Windy Play
1   sunny         hot     high FALSE   no
2   sunny         hot     high  TRUE   no

# Extract the 3rd and 5th rows with the 2nd and 4th columns.

> result <- tdata[c(3,5),c(2,4)]

> result

  Temperature Windy
3         hot FALSE
5        cool FALSE
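Rows can also be picked with a logical condition instead of positions. The sketch below uses a toy data frame `weather` with assumed values, so the rows it returns are illustrative only.

```r
# Toy data frame (illustrative values only)
weather <- data.frame(
  Outlook = c("sunny", "sunny", "overcast", "rainy"),
  Windy   = c(FALSE, TRUE, FALSE, TRUE),
  Play    = c("no", "no", "yes", "no"),
  stringsAsFactors = TRUE
)

# Keep only the rows where Windy is TRUE
windy_rows <- weather[weather$Windy, ]

# Combine a row condition with a selection of columns by name
sunny_play <- weather[weather$Outlook == "sunny", c("Outlook", "Play")]

print(windy_rows)
```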

Combine columns

cbind

Combine rows

rbind
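A short sketch of both functions on toy data frames; the names `a`, `b`, and `more` are made up for illustration. `cbind()` needs the same number of rows on both sides, while `rbind()` needs matching column names.

```r
a    <- data.frame(id = 1:2, x = c("a", "b"))
b    <- data.frame(y = c(TRUE, FALSE))
more <- data.frame(id = 3:4, x = c("c", "d"))

# cbind() puts columns side by side: 2 rows, 3 columns
wide <- cbind(a, b)

# rbind() stacks rows on top of each other: 4 rows, 2 columns
long <- rbind(a, more)

print(dim(wide))
print(dim(long))
```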

Remove rows that contain NA

na.omit(df)
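A minimal sketch of what `na.omit()` does, using a hypothetical data frame `scores`. Note that it returns a new data frame and leaves the original unchanged, so the result has to be assigned.

```r
# Hypothetical data with missing values in the score column
scores <- data.frame(id = 1:4, score = c(10, NA, 7, NA))

# Drop every row that has at least one NA
clean <- na.omit(scores)

# Number of rows that were removed
n_dropped <- nrow(scores) - nrow(clean)

# complete.cases() gives the same row filter as a logical vector
complete <- scores[complete.cases(scores), ]

print(clean)
```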

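Add a column

# mutate() comes from the dplyr package
library(dplyr)

rs <- mutate(rs, flg = (beg_dif > 0 & end_dif > 0))

or, in base R:

# Name the column explicitly so that cbind() adds it as "flg"
flg <- data.frame(flg = rs$beg_dif > 0 & rs$end_dif > 0)

rs <- cbind(rs, flg)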

Delete a column by using a negative select in subset().

rs <- subset(rs, select = -flg)
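The post does not show `rs`, so the sketch below assumes a toy version with the two columns `beg_dif` and `end_dif` and made-up values. It adds `flg` with plain `$` assignment and then removes it in three equivalent base R ways.

```r
# Toy stand-in for rs (illustrative values only)
rs <- data.frame(beg_dif = c(1, -2, 3), end_dif = c(4, 5, -6))

# Add the flag column directly with $ assignment
rs$flg <- rs$beg_dif > 0 & rs$end_dif > 0

# 1. Negative select in subset(), as in the text above
rs_a <- subset(rs, select = -flg)

# 2. Keep every column whose name is not "flg"
rs_b <- rs[, setdiff(names(rs), "flg"), drop = FALSE]

# 3. Assign NULL to the column on a copy
rs_c <- rs
rs_c$flg <- NULL

print(names(rs_a))
```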

A fairly detailed reference on data frame operations:

https://www.datacamp.com/community/tutorials/15-easy-solutions-data-frame-problems-r
