R Programming -- Factors

  1. Factors


    Often your data needs to be grouped by category: blood pressure by age range, accidents by auto manufacturer, and so forth. R has a special collection type called a factor to track these categorized values.

  2. Creating Factors 5.1

    It's time to take inventory of the ship's hold. We'll make a vector for you with the type of booty in each chest.

    To categorize the values, simply pass the vector to the factor function:

    > chests <- c('gold', 'silver', 'gems', 'gold', 'gems')
    > types <- factor(chests)
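
    As a quick sanity check, a sketch like the following confirms what factor returned. The helper calls class, is.factor, and length are standard base R functions, but they are not part of the original lesson and are used here only for illustration.

    ```r
    # Same data as in the lesson
    chests <- c('gold', 'silver', 'gems', 'gold', 'gems')
    types <- factor(chests)

    # The original vector holds strings; the new object is a factor
    class(chests)      # "character"
    class(types)       # "factor"
    is.factor(types)   # TRUE
    length(types)      # 5, one entry per chest
    ```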
    
  3. There are a couple of differences between the original vector and the new factor that are worth noting. Print the chests vector:

    > print(chests)
    [1] "gold"   "silver" "gems"   "gold"   "gems"  
    
  4. You see the raw vector of strings, repeated values and all. Now print the types factor:

    > print(types)
    [1] gold   silver gems   gold   gems  
    Levels: gems gold silver
    

    Printed at the bottom, you'll see the factor's "levels", the set of unique values (sorted alphabetically by default). Notice also that there are no quotes around the values. That's because they're not strings; they're actually integer references to one of the factor's levels.

  5. Let's take a look at the underlying integers. Pass the factor to the as.integer function:

    > as.integer(types)
    [1] 2 3 1 2 1
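
    Each integer is a position in the factor's levels, so "gold" shows up as 2 because it is the second level. The sketch below, which is not part of the original lesson, uses the levels function (covered in the next step) to turn the integers back into the original strings.

    ```r
    chests <- c('gold', 'silver', 'gems', 'gold', 'gems')
    types <- factor(chests)

    codes <- as.integer(types)        # 2 3 1 2 1

    # Indexing the levels by the codes recovers the original strings
    rebuilt <- levels(types)[codes]
    identical(rebuilt, chests)        # TRUE

    # as.character performs the same lookup in one step
    identical(as.character(types), chests)  # TRUE
    ```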
    
  6. You can get only the factor levels with the levels function:

    > levels(types)
    [1] "gems"   "gold"   "silver"
    
  7. Plots With Factors 5.2

    You can use a factor to separate plots into categories. Let's graph our five chests by weight and value, and show their type as well. We'll create two vectors for you; weights will contain the weight of each chest, and prices will track how much the chests are worth.

    Now, try calling plot to graph the chests by weight and value.

    > weights <- c(300, 200, 100, 250, 150)
    > prices <- c(9000, 5000, 12000, 7500, 18000)
    > plot(weights, prices)
    
    [Figure: scatter plot of prices (y-axis) against weights (x-axis)]
  8. We can't tell which chest is which, though. Fortunately, we can use a different plot character for each type by converting the factor to integers and passing them to the pch argument of plot.

    > plot(weights, prices, pch=as.integer(types))
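
    To see which symbol each chest receives, it can help to look at the values passed to pch before plotting. This is a small sketch outside the original lesson; pch_codes is a name introduced here for illustration. In R's default symbol set, 1 is a circle, 2 is a triangle, and 3 is a plus sign.

    ```r
    chests <- c('gold', 'silver', 'gems', 'gold', 'gems')
    types <- factor(chests)

    # One plot-character code per chest, taken from the factor's integer codes
    pch_codes <- as.integer(types)   # 2 3 1 2 1

    # Pair each chest with its code to inspect the mapping
    data.frame(chest = chests, pch = pch_codes)
    ```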
    

    "Circle", "Triangle", and "Plus Sign" still aren't great descriptions for treasure, though. Let's add a legend to show what the symbols mean.

    [Figure: the same scatter plot, with a different plot character for each chest type]
  9. The legend function takes a location to draw in, a vector with label names, and a vector with numeric plot character IDs.

    > legend("topright", c("gems","gold","silver"),pch=1:3)
    

    Next time the boat's taking on water, it would be wise to dump the silver and keep the gems!

    [Figure: the scatter plot with a legend in the top-right corner labeling gems, gold, and silver]
  10. If you hard-code the labels and plot characters, you'll have to update them every time you change the plot factor. Instead, it's better to derive them by using the levels function on your factor:

    > legend("topright",levels(types),pch=1:length(levels(types)))
    
    [Figure: the same plot and legend, now with labels taken from levels(types)]
  11. Chapter 5 Completed


    A long inland march has brought us to the end of Chapter 5. We've stumbled across another badge!

    Factors help you divide your data into groups. In this chapter, we've shown you how to create them, and how to use them to make plots more readable.
