The R function cut converts a numeric variable into a factor
Description
cut divides the range of x into intervals and codes the values in x according to which interval they fall. The leftmost interval corresponds to level one, the next leftmost to level two and so on.
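A minimal sketch of this coding, using a made-up vector and break points chosen only for illustration (neither comes from the original post):

```r
x <- c(1, 4, 6, 9)                      # illustrative values
f <- cut(x, breaks = c(0, 3, 7, 10))    # three intervals: (0,3], (3,7], (7,10]

levels(f)        # "(0,3]"  "(3,7]"  "(7,10]"
as.integer(f)    # 1 2 2 3  (leftmost interval is level one)
```

The integer codes follow the left-to-right order of the intervals, which is why 4 and 6 share level two.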
Usage
cut(x, ...)

## Default S3 method:
cut(x, breaks, labels = NULL,
    include.lowest = FALSE, right = TRUE, dig.lab = 3,
    ordered_result = FALSE, ...)
Arguments
x | a numeric vector which is to be converted to a factor by cutting. |
breaks | either a numeric vector of two or more unique cut points or a single number (greater than or equal to 2) giving the number of intervals into which x is to be cut. |
labels | labels for the levels of the resulting category. By default, labels are constructed using "(a,b]" interval notation. If labels = FALSE, simple integer codes are returned instead of a factor. |
include.lowest | logical, indicating if an ‘x[i]’ equal to the lowest (or highest, for right = FALSE) ‘breaks’ value should be included. |
right | logical, indicating if the intervals should be closed on the right (and open on the left) or vice versa. |
dig.lab | integer which is used when labels are not given. It determines the number of digits used in formatting the break numbers. |
ordered_result | logical: should the result be an ordered factor? |
... | further arguments passed to or from other methods. |
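How right, include.lowest and labels interact is easiest to see when the data sit exactly on the break points. The sketch below uses a hypothetical vector x and break vector b, chosen only for illustration:

```r
x <- c(0, 5, 10)   # illustrative values lying exactly on the breaks
b <- c(0, 5, 10)

# Default: intervals are (0,5] and (5,10], so the lowest break itself is excluded
r_default <- as.character(cut(x, b))                         # NA       "(0,5]"  "(5,10]"

# include.lowest = TRUE closes the first interval on the left: [0,5]
r_lowest <- as.character(cut(x, b, include.lowest = TRUE))   # "[0,5]"  "[0,5]"  "(5,10]"

# right = FALSE flips the closed side: [0,5) and [5,10), so the highest break is excluded
r_left <- as.character(cut(x, b, right = FALSE))             # "[0,5)"  "[5,10)" NA

# labels = FALSE returns integer codes instead of a factor
codes <- cut(x, b, labels = FALSE, include.lowest = TRUE)    # 1 1 2
```

A value outside every interval becomes NA rather than raising an error, so it is worth checking the result with is.na() after cutting.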
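Examples
# Grade 70 random scores between 50 and 100
vec <- sample(50:100, 70, replace = TRUE)
break1 <- fivenum(vec)              # min, lower hinge, median, upper hinge, max
break2 <- c(49, 60, 70, 90, 100)    # fixed cut points; 49 keeps a score of 50 inside (49,60]
labels <- c("poor", "fair", "good", "excellent")
# With the defaults (right = TRUE, include.lowest = FALSE) the first interval is
# (min, ...], so every score equal to min(vec) becomes NA
score_grade <- cut(vec, break1, labels, ordered_result = TRUE)
is.na(score_grade)                  # TRUE wherever vec equals min(vec)
# break2 starts below the smallest possible score, so no value is lost
score_grade <- cut(vec, break2, labels, ordered_result = TRUE)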
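When the breaks come from fivenum(), the first break equals the minimum of the data, so the minimum falls outside the first interval unless include.lowest = TRUE is set. A small sketch with a hand-picked vector (hypothetical scores, not the random vec above) shows this alternative to lowering the first break:

```r
scores <- c(50, 55, 62, 68, 71, 77, 83, 90, 96, 100)   # illustrative, fixed values
b <- fivenum(scores)                                    # 50 62 74 90 100
grades <- c("poor", "fair", "good", "excellent")

# Default settings: the score equal to the first break (50) is dropped to NA
n_na_default <- sum(is.na(cut(scores, b, labels = grades)))                          # 1

# include.lowest = TRUE closes the first interval on the left, so nothing is lost
n_na_lowest <- sum(is.na(cut(scores, b, labels = grades, include.lowest = TRUE)))    # 0
```

Which fix to prefer depends on the data: a fixed lower break such as 49 suits known score ranges, while include.lowest = TRUE suits breaks computed from the data itself.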
## iris data frame
q25 <- quantile(iris$Sepal.Length, 0.25)
q50 <- quantile(iris$Sepal.Length, 0.5)
q75 <- quantile(iris$Sepal.Length, 0.75)
groupvec <- c(min(iris$Sepal.Length), q25, q50, q75, max(iris$Sepal.Length))
labels <- c('A', 'B', 'C', 'D')
# cut arguments, in order: the object, the break points, the group labels;
# include.lowest = TRUE keeps the minimum Sepal.Length in group 'A'
iris$new_col <- with(iris, cut(Sepal.Length, breaks = groupvec,
                               labels = labels, include.lowest = TRUE))
head(iris)
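The same quartile grouping can probably be written more compactly, since quantile() called without probs returns the 0%, 25%, 50%, 75% and 100% points in one vector. The sketch below assumes the built-in iris data; the names qbreaks and grp are introduced here only for illustration:

```r
# Default probs are seq(0, 1, 0.25), i.e. min, quartiles and max in one call
qbreaks <- quantile(iris$Sepal.Length)

grp <- cut(iris$Sepal.Length, breaks = qbreaks,
           labels = c("A", "B", "C", "D"), include.lowest = TRUE)

table(grp)        # number of rows in each quartile group
anyNA(grp)        # FALSE: include.lowest = TRUE keeps the minimum in group "A"
```

Calling table() on the result is a quick way to confirm that every row landed in a group and that the groups are roughly balanced.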