Internally stored as integers
Factors are internally stored as integer codes that index into the factor's levels. These codes can be revealed by applying as.numeric(…) or as.integer(…) to a factor:
fac <- factor(c('foo', 'bar', 'foo', 'baz', 'bar', 'foo'));
as.numeric(fac);
#
# 3 1 3 2 1 3
#
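When the levels are not specified explicitly (as is the case in the last code snippet), the levels are the sorted unique values, so the codes are assigned in alphabetical order. This applies regardless of whether the factor is ordered.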
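The relationship between codes and levels can be sketched as follows. The variable names codes, restored and num_fac are chosen for illustration only. The second half shows a common pitfall: for a factor whose labels look like numbers, as.numeric() returns the codes, not the labelled values.

```r
fac <- factor(c('foo', 'bar', 'foo', 'baz', 'bar', 'foo'))

# Each integer code is an index into levels(fac), so indexing
# the levels with the codes gives back the original strings.
codes    <- as.integer(fac)
restored <- levels(fac)[codes]
identical(restored, as.character(fac))   # TRUE

# Pitfall: number-like labels (num_fac is a made-up example).
num_fac <- factor(c('10', '20', '10', '5'))
as.numeric(num_fac)                  # the codes, not 10 20 10 5
as.numeric(as.character(num_fac))    # 10 20 10 5
```

Converting with as.character() first is the usual way to recover the labelled numbers.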
Creating ordered factors
An ordered factor can be created by passing ordered = TRUE when factor() is called. In this case, the variable's class attribute contains both ordered and factor.
Note that days does not contain 'Tue', but 'Tue' is still specified as a level in the call to factor(). Using table(…) shows the count for each level, even for levels that do not occur in the data:
days <- factor(c('Wed', 'Mon', 'Fri', 'Fri', 'Wed', 'Sun', 'Mon', 'Wed', 'Thu', 'Thu', 'Sat'),
   ordered = TRUE,
   levels  = c('Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat')
);
attr(days, 'class');
#
# "ordered" "factor"
#
table(days);
#
# Sun Mon Tue Wed Thu Fri Sat
#   1   2   0   3   2   2   1
#
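As a minimal sketch of what the ordering provides: elements of an ordered factor can be compared along the order of the levels. The names before_wed and counts are illustrative and not part of the original example.

```r
days <- factor(c('Wed', 'Mon', 'Fri', 'Fri', 'Wed', 'Sun', 'Mon', 'Wed', 'Thu', 'Thu', 'Sat'),
   ordered = TRUE,
   levels  = c('Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat'))

is.ordered(days)             # TRUE

# Comparison follows the order of the levels, not the alphabet.
before_wed <- days < 'Wed'
sum(before_wed)              # 3 (Mon, Sun, Mon)

# max() returns the element with the highest level.
as.character(max(days))      # "Sat"

# The unused level 'Tue' is still counted, with a count of zero.
counts <- table(days)
counts[['Tue']]              # 0
```

With an unordered factor, the < comparison is not meaningful and returns NA with a warning.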
Omit 'Levels' when printing
When a factor is printed, its levels are, by default, also displayed on a separate line.
To omit this line and print only the factor's elements, one of the as.XYZ() functions (for example as.character()) can be used:
f <- factor(c('foo', 'bar', 'foo', 'baz', 'bar'));
f
#
# [1] foo bar foo baz bar
# Levels: bar baz foo
as.character(f);
#
# [1] "foo" "bar" "foo" "baz" "bar"
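An alternative sketch that keeps the factor's unquoted display: the print method for factors accepts a max.levels argument, and a value of 0 suppresses the extra Levels line. The variable out is only used here to inspect the printed text.

```r
f <- factor(c('foo', 'bar', 'foo', 'baz', 'bar'))

# max.levels = 0: do not print the 'Levels:' line.
print(f, max.levels = 0)
#
# [1] foo bar foo baz bar

# Capture the printed lines to confirm that only one line is produced.
out <- capture.output(print(f, max.levels = 0))
length(out)    # 1
```

Unlike as.character(f), this prints the elements without quotes and leaves f itself unchanged as a factor.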