I’ve come to learn that there are many ways to apply R data frame filters, and I won’t cover them all. Some methods require a separate package, and others are scenario based (interactive, for example).
I’ll cover basic R data frame filters and also review how to convert factors as part of a filtering exercise. I’ve used the ChickWeight data set for this demonstration.
R Data Frame Filters with $
The basic syntax for the filter is dataset$filter_column followed by a logical test.
If I want all ChickWeight rows with a weight less than 60, the test is ChickWeight$weight < 60.
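Run on its own, that test can be sketched as follows. The variable name is_light is only for illustration and is not used elsewhere in this post.

```r
# Load the built-in data set (it ships with base R's datasets package)
data("ChickWeight")

# The logical test on its own: one TRUE/FALSE value per row
is_light <- ChickWeight$weight < 60

head(is_light)    # TRUE TRUE TRUE FALSE FALSE FALSE
class(is_light)   # "logical"

# The vector is exactly as long as the data frame has rows
length(is_light) == nrow(ChickWeight)
```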
However, the result is just a TRUE/FALSE (logical) vector; the filter hasn’t been applied to a selection yet, as covered in my previous post. To get useful results, the filter can be added to a bracket selection.
> filter <- ChickWeight$weight < 60 # now the vector is stored
> head(ChickWeight[filter,]) # keeps only the rows where filter is TRUE
   weight Time Chick Diet
1      42    0     1    1
2      51    2     1    1
3      59    4     1    1
13     40    0     2    1
14     49    2     2    1
15     58    4     2    1
Really, that’s about it – very straightforward. There are functions and other complex ways of applying filters, but to start, this method is very flexible and powerful. Note that the filter condition doesn’t need to be assigned to a variable; it’s just cleaner in this example (IMO) versus writing the filter inline in the brackets.
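For comparison, here is a minimal sketch of the inline form, plus one way to combine two tests with &. The names light and light_day0 are made up for this example, and the Time == 0 condition is just an assumed second test to show the pattern.

```r
data("ChickWeight")

# Same filter as before, written inline inside the brackets
light <- ChickWeight[ChickWeight$weight < 60, ]

# Two tests combined with & (element-wise AND): light chicks on day 0 only
light_day0 <- ChickWeight[ChickWeight$weight < 60 & ChickWeight$Time == 0, ]

nrow(light)
nrow(light_day0)
```

The trailing comma inside the brackets matters: the filter goes in the row position, and leaving the column position empty keeps every column.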
Applying the Filter to a Factor
For this next demonstration, I’ve moved to the InsectSprays data set:
data("InsectSprays")
Recall that R stores categorical columns as factors: each value is kept as an integer code that points to a text label (a level). You can see with InsectSprays that the spray column is a factor (a plain one, not an ordered one, since str() reports Factor rather than Ord.factor):
> str(InsectSprays)
'data.frame': 72 obs. of 2 variables:
 $ count: num 10 7 20 14 14 12 10 23 17 20 ...
 $ spray: Factor w/ 6 levels "A","B","C","D",..: 1 1 1 1 1 1 1 1 1 1 ...
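By default, trying to filter the data frame where spray is equal to its numeric code (i.e. 1, 2, 3, 4, 5, 6) will fail, because a factor is compared by its labels ("A", "B", and so on) rather than by its underlying codes. No error is raised; the result is simply empty: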
> InsectSprays[InsectSprays$spray == 1,]
[1] count spray
<0 rows> (or 0-length row.names)
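As a quick sketch of the difference, comparing against a level label does match rows. The name spray_a is only illustrative.

```r
data("InsectSprays")

# The labels R compares against when you use == on a factor
levels(InsectSprays$spray)

# Filtering by label works, because "A" is one of the levels
spray_a <- InsectSprays[InsectSprays$spray == "A", ]
nrow(spray_a)

# Filtering by the number 1 matches nothing: no level is labelled "1"
nrow(InsectSprays[InsectSprays$spray == 1, ])
```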
There are many real world scenarios where numbers are preferred for statistics, filtering, sorting, user interface control values, and more. So, if you need to convert the factor level to a numeric value, the as.numeric() function is your go-to: it returns each row’s integer level code (1 for "A", 2 for "B", and so on), which can then be used in the filter.
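A minimal sketch of that conversion, assuming you want the rows for the first level. The names spray_code and first_spray are made up for this example.

```r
data("InsectSprays")

# Convert the factor to its integer level codes: "A" -> 1, "B" -> 2, ...
spray_code <- as.numeric(InsectSprays$spray)

# Now the numeric comparison matches rows
first_spray <- InsectSprays[spray_code == 1, ]
head(first_spray)
nrow(first_spray)
```

One caveat: as.numeric() returns the position of each level, not the label itself. If a factor's labels are themselves digits (say "10", "20", "30"), as.numeric(as.character(x)) is the usual way to get the labelled values back.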
That’s all for this brief summary on R Data Frame Filters.