Consider a data frame containing a factor.
When I take a subset of it with subset() or another indexing function, a new data frame is created.
However, the factor variable retains all of its original levels, even those that no longer occur in the new data frame.
This causes problems when plotting or when using functions that rely on factor levels.
Is there a way to remove the unused levels from a factor in the subsetted data frame?
Below is my example:
data <- data.frame(letters = letters[1:10],
                   numbers = 1:10,
                   stringsAsFactors = TRUE)  # needed on R >= 4.0 so that letters is a factor
levels(data$letters)
## [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
subdata <- subset(data, numbers <= 5)
subdata
##   letters numbers
## 1       a       1
## 2       b       2
## 3       c       3
## 4       d       4
## 5       e       5
## But the levels are still there!
levels(subdata$letters)
## [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
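To illustrate the kind of problem I mean, here is a minimal sketch using table(), which reports every level of the factor, including the ones that no longer occur in the subset. The stringsAsFactors = TRUE argument is my assumption so that the column is a factor on recent R versions, and the name counts is only for illustration.

```r
# Rebuild the example; stringsAsFactors = TRUE is assumed so that
# `letters` is a factor on R >= 4.0 as well.
data <- data.frame(letters = letters[1:10],
                   numbers = 1:10,
                   stringsAsFactors = TRUE)
subdata <- subset(data, numbers <= 5)

# table() tabulates over all levels, so "f" to "j" appear with a count of 0
counts <- table(subdata$letters)
counts

# The subset has 5 distinct values but still carries 10 levels
length(unique(subdata$letters))
## [1] 5
nlevels(subdata$letters)
## [1] 10
```

The same zero-count categories show up as empty bars or empty legend entries when plotting, which is what I would like to avoid.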