Creating a new class
Although the casual user might not realize it, R is actually a fully object oriented language, as every variable used in an R program is an object, or instance of a class. Classes in R are of two main types: S3 and S4. S3 classes (so named because they were defined for version 3 of the S language, the precursor to R) are older and, although many built-in R classes are of the S3 type, it’s considered good practice to create any new classes according to the more recent S4 standard, so that’s what we’ll look at in this post.
If you’re familiar with class definition techniques in languages such as Java, C++ or C#, R’s methods for defining classes will seem a bit bizarre. At a minimum, an R class must have a name; it can also have one or more data fields, known as slots, each of which must be declared with an existing data type. A class is created using setClass():
setClass("numbers", representation(a = "numeric", b = "numeric"))
num1 = new("numbers", a = 12, b = 42)
num1@a
[1] 12
We’ve created a class called numbers which contains two numeric fields: a and b. The representation() call passed as the second argument of setClass() lists the slot names and their associated data types.
An object can be created from a class using the new() function (this is about the only feature of R classes that would be familiar to a ‘regular’ object-oriented programmer!), which takes as its first argument the name of the class, followed by initial values for its slots. Once the num1 object has been created, its slots can be referred to by using the object’s name followed by @ followed by the slot name, as shown.
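As a quick check on the above, here is a small sketch that inspects the class and its object with a few helpers from the methods package (slotNames(), is(), slot() and isVirtualClass()). It redefines the same numbers class so that it runs on its own; the object name bad is just a placeholder for this example.

```r
library(methods)  # normally attached already; listed so the sketch is self-contained

setClass("numbers", representation(a = "numeric", b = "numeric"))
num1 = new("numbers", a = 12, b = 42)

slotNames("numbers")        # "a" "b": the slots declared for the class
is(num1, "numbers")         # TRUE: num1 is an instance of numbers
slot(num1, "b")             # 42: functional equivalent of num1@b
isVirtualClass("numbers")   # FALSE: objects can be created from it

# A slot only accepts values of its declared type, so this call fails.
bad = tryCatch(new("numbers", a = "twelve", b = 42),
               error = function(e) e)
inherits(bad, "error")      # TRUE
```

The last call shows that the data types given in representation() are enforced when an object is created, not just recorded.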
Adding methods to a class
In most OO languages, methods can be added to a class by writing them inside the class definition. Such methods belong to that class and need have no connection with any code outside the class (indeed, proper object oriented design often precludes outside connections). In R, things are quite different. A method can be added to a class using the setMethod() function, but the procedure for doing so is a bit tricky. As an example, suppose we want to add a method to numbers which prints out the slot a for a given object. In order to do this, we must override an existing function so that it operates on a numbers object; setMethod() can’t be used with a name that doesn’t yet refer to a function.
For example, there is a print() function built in to R, so we could call our new method print and customize it so that it prints out the a slot of a numbers object. Here’s how it’s done:
setMethod("print", "numbers", function(x) {
  cat(paste("a =", x@a))
})
print(num1)
a = 12
The first argument to setMethod() is the method’s name, which must match that of an existing function. The second argument is the class to which the method is to be added. The third argument is a definition of the method which overrides the existing definition for that class, and which will be called whenever print() is invoked on a numbers object. In this case, the function uses the cat() function to print out "a =" followed by the value of a. The function is invoked as shown.
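One related detail: the print method above runs when print() is called explicitly. When you just type an S4 object’s name at the prompt, R calls show() instead, so it can be worth defining a show method too. The following is only a sketch along those lines, reusing the same numbers class; note that the argument of show() is named object.

```r
library(methods)

setClass("numbers", representation(a = "numeric", b = "numeric"))
num1 = new("numbers", a = 12, b = 42)

# show() is the generic used when an S4 object is auto-printed;
# its single argument is called 'object', so the method uses that name.
setMethod("show", "numbers", function(object) {
  cat("a =", object@a, "\n")
})

num1         # auto-printing dispatches to the show() method
show(num1)   # the same method, called explicitly
```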
One important point must be emphasized here. The argument name (x) in the function definition must match that in the definition of the function that is being overridden. If you’re overriding a built-in R function, you’ll need to check the documentation to see what name is used for the argument(s) of the function you’re overriding. The documentation for print() gives the first argument name as x, so we have to use that name in our own definition. In fact, the documentation says explicitly: “x: an object used to select a method”.
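Besides reading the help page, the argument names can be looked up from within R. A minimal sketch using formals(), which returns the argument list of an ordinary (non-primitive) function; the variable names print_args and summary_args are just placeholders:

```r
# Argument names of the function we intend to override.
print_args = names(formals(print))
print_args     # "x" "..."

# Other built-in functions use other names, so it pays to check each one.
summary_args = names(formals(summary))
summary_args   # "object" "..."
```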
What if we want to add a method with a name of our own choosing? In that case, we need to define a function with that name first, outside the class, and then override it with a method for the class. For example, if we wanted a method a.b that prints out both a and b we could write:
a.b = function(obj) {}
setMethod("a.b", "numbers", function(obj) {
  cat(paste("a =", obj@a, "b =", obj@b))
})
a.b(num1)
a = 12 b = 42
We first define a.b as an empty function that takes a single argument called obj. We can then use setMethod() to override this function so that it works for a numbers object. Again, we must use the same argument name (obj) in the method definition as was used in the original function definition. Calling a.b() on a numbers object gives the expected result. If we call a.b() on any other data type, the original (empty) definition of a.b is called, which returns nothing, so the result is NULL.
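An alternative to starting from an empty function is to declare the generic explicitly with setGeneric(), which is part of the same methods package. The sketch below assumes the same numbers class; the function name describe is made up for this example.

```r
library(methods)

setClass("numbers", representation(a = "numeric", b = "numeric"))
num1 = new("numbers", a = 12, b = 42)

# Declare the generic; standardGeneric() does the dispatch on obj.
# 'describe' is a hypothetical name chosen for this sketch.
setGeneric("describe", function(obj) standardGeneric("describe"))

# Attach a method for numbers objects, using the same argument name.
setMethod("describe", "numbers", function(obj) {
  cat("a =", obj@a, "b =", obj@b, "\n")
})

describe(num1)

# No default method exists here, so other types raise an error
# rather than quietly returning NULL.
res = tryCatch(describe(42), error = function(e) "no applicable method")
res
```

The difference from the empty-function approach is the behaviour on other data types: with no default definition to fall back on, calling describe() on anything that isn’t a numbers object is an error.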
Prototypes and default values
In our definition of the numbers class, the slots a and b were defined as numeric data types, but no default values were given. If we create a new object without giving values for these slots, we get an object with empty numeric vectors:
> num2 = new("numbers")
> num2
An object of class "numbers"
Slot "a":
numeric(0)
Slot "b":
numeric(0)
If we want the option of not specifying values for one or more of the slots, we can provide a prototype argument to setClass():
setClass("numbersDef", representation(a = "numeric", b = "numeric"), prototype(a = 100, b = 666))
> num2 = new("numbersDef")
> num2
An object of class "numbersDef"
Slot "a":
[1] 100
Slot "b":
[1] 666
> num3 = new("numbersDef", b = 222)
> num3
An object of class "numbersDef"
Slot "a":
[1] 100
Slot "b":
[1] 222
We can now create a numbersDef object by specifying values for none, one or both of the slots, with the prototype defaults filling in any that are missing.
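To round this off, here is a sketch that checks all three cases programmatically and also shows that a prototype value has to fit the declared slot type. The class name badDef and the object names none, one, both and bad are invented for this example.

```r
library(methods)

setClass("numbersDef",
         representation(a = "numeric", b = "numeric"),
         prototype(a = 100, b = 666))

none = new("numbersDef")                # both slots take prototype values
one  = new("numbersDef", b = 222)       # a from the prototype, b supplied
both = new("numbersDef", a = 1, b = 2)  # prototype values not used

c(none@a, none@b)   # 100 666
c(one@a, one@b)     # 100 222
c(both@a, both@b)   # 1 2

# A prototype value must match the slot's type; "badDef" is a
# hypothetical class defined only to trigger the error.
bad = tryCatch(
  setClass("badDef", representation(a = "numeric"), prototype(a = "text")),
  error = function(e) e
)
inherits(bad, "error")   # TRUE
```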