Factors
Often your data needs to be grouped by category: blood pressure by age range, accidents by auto manufacturer, and so forth. R has a special collection type called a factor to track these categorized values.
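A minimal sketch of creating a factor, assuming the five chest types that appear in this chapter's printed output are held in a character vector. The name chests is our own choice for illustration; this section does not confirm it.

```r
# Hypothetical vector name; the values match the factor printed in this chapter.
chests <- c("gold", "silver", "gems", "gold", "gems")

# factor() collects the unique values as levels (sorted alphabetically by default).
types <- factor(chests)

print(chests)  # character vector: values print in quotes
print(types)   # factor: values print without quotes, followed by its levels
```

Printing both side by side makes the difference between plain strings and categorized values easy to see.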
You see the raw list of strings, repeated values and all. Now print the types factor:
> print(types)
[1] gold   silver gems   gold   gems
Levels: gems gold silver
Printed at the bottom, you'll see the factor's "levels" - groups of unique values. Notice also that there are no quotes around the values. That's because they're not strings; they're actually integer references to one of the factor's levels.
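To make the "integer references" idea concrete, here is a short sketch that inspects a factor's levels and the integer code stored for each element. It rebuilds the types factor from the values printed above so that it runs on its own.

```r
types <- factor(c("gold", "silver", "gems", "gold", "gems"))

levels(types)      # the unique groups: "gems" "gold" "silver"
as.integer(types)  # the integer code stored for each element: 2 3 1 2 1

# Indexing the levels by the codes recovers the original strings.
levels(types)[as.integer(types)]
```

Each code is simply the position of that element's value within the levels, which is why gold (the second level) shows up as 2.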
Plots With Factors 5.2
You can use a factor to separate plots into categories. Let's graph our five chests by weight and value, and show their type as well. We'll create two vectors for you: weights will contain the weight of each chest, and prices will track how much the chests are worth.
Now, try calling plot to graph the chests by weight and value.
> weights <- c(300, 200, 100, 250, 150)
> prices <- c(9000, 5000, 12000, 7500, 18000)
> plot(weights, prices)
[Plot: prices versus weights for the five chests]
We can't tell which chest is which, though. Fortunately, we can use different plot characters for each type by converting the factor to integers, and passing it to the pch argument of plot.
> plot(weights, prices, pch=as.integer(types))
"Circle", "Triangle", and "Plus Sign" still aren't great descriptions for treasure, though. Let's add a legend to show what the symbols mean.
[Plot: prices versus weights, with a different symbol for each chest type]
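To see which symbol each chest receives, the sketch below lines up every chest's type with its integer code and the base R plot character that code selects. The symbol_names vector and the symbols data frame are illustrative helpers of our own, not part of the lesson.

```r
types <- factor(c("gold", "silver", "gems", "gold", "gems"))

# Base R plot characters 1, 2, and 3 draw a circle, a triangle, and a plus sign.
symbol_names <- c("circle", "triangle", "plus sign")  # illustrative labels only

# One row per chest: its type, its integer code, and the symbol that code selects.
codes <- as.integer(types)
symbols <- data.frame(type = types, pch = codes, symbol = symbol_names[codes])
print(symbols)
```

Because the codes follow the order of the levels, gems chests are drawn as circles, gold as triangles, and silver as plus signs.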
The legend function takes a location to draw in, a vector with label names, and a vector with numeric plot character IDs.
> legend("topright", c("gems","gold","silver"), pch=1:3)
Next time the boat's taking on water, it would be wise to dump the silver and keep the gems!
[Plot: prices versus weights, with a legend in the top right labeling gems, gold, and silver]
If you hard-code the labels and plot characters, you'll have to update them every time you change the plot factor. Instead, it's better to derive them by using the levels function on your factor.
[Plot: prices versus weights, with a legend labeling gems, gold, and silver]
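One way to derive the legend from the factor is sketched below. The exact call used in the lesson is not shown in this text, so treat this as an assumption; type_levels is a helper name of our own, and the data is repeated so the block runs by itself.

```r
types   <- factor(c("gold", "silver", "gems", "gold", "gems"))
weights <- c(300, 200, 100, 250, 150)
prices  <- c(9000, 5000, 12000, 7500, 18000)

plot(weights, prices, pch = as.integer(types))

# Labels and symbol IDs both come from the factor, so they stay in sync with it.
type_levels <- levels(types)
legend("topright", type_levels, pch = seq_along(type_levels))
```

If a new chest type is added to the factor later, the legend picks up the extra label and symbol without any further edits.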
Chapter 5 Completed
A long inland march has brought us to the end of Chapter 5. We've stumbled across another badge!
Factors help you divide your data into groups. In this chapter, we've shown you how to create them, and how to use them to make plots more readable.