In the {ExtendDataFrames} package we need to create a class. For this
example, we have created the <birthdays> class. It is very simple:
fundamentally, the class needs just two columns, named name and
birthday. The names must be exact (i.e. case- and plural-sensitive), but
we’ll come back to this aspect later. The role of the class is to store
the name of a person and their birthday. It is important that a
well-designed class has a role; otherwise the built-in data frame will
probably suffice.
For a full explanation of the functions used in this vignette, look at the source code and see the other vignette.
We can create a <birthdays> object by calling the class constructor.
First we create a data frame that can be converted to a <birthdays>
object and pass that to the constructor, which is called birthdays(). In
other R packages class constructors may be called new_[class-name](),
build_[class-name](), or similar.
df <- data.frame(
  name = c("kevin", "stacey"),
  birthday = c(as.Date("2001-01-01"), as.Date("2002-01-02"))
)
birthdays <- birthdays(df)
birthdays
#> A `birthdays` object with 2 rows and 2 cols
We can check whether our new object has been assigned the correct class attribute.
class(birthdays)
#> [1] "birthdays" "data.frame"
attributes(birthdays)
#> $names
#> [1] "name" "birthday"
#>
#> $class
#> [1] "birthdays" "data.frame"
#>
#> $row.names
#> [1] 1 2
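The class attribute shown above can be produced by a very small constructor. The following is a minimal sketch only, not the package's actual birthdays() source: the function name new_birthdays_sketch() and the class name birthdays_sketch are hypothetical and chosen so they do not clash with the package.

```r
# Hypothetical constructor sketch; names are illustrative, not from the package.
new_birthdays_sketch <- function(x) {
  # Only data frames can be extended this way.
  stopifnot(is.data.frame(x))
  # Prepend the new class so that "data.frame" methods remain the fallback.
  class(x) <- c("birthdays_sketch", class(x))
  x
}

df <- data.frame(
  name = c("kevin", "stacey"),
  birthday = as.Date(c("2001-01-01", "2002-01-02"))
)
obj <- new_birthdays_sketch(df)
class(obj)
#> [1] "birthdays_sketch" "data.frame"
```

Placing the new class first in the class vector is what lets S3 dispatch try custom methods before falling back to the data frame ones.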
The benefit of extending the data frame class is that we inherit all of
the methods already written for data frames, and can override these
methods with our own custom <birthdays> methods. We have written a
custom print() method (print.birthdays()) and will use the inherited
summary() and str() methods.
print(birthdays)
#> A `birthdays` object with 2 rows and 2 cols
summary(birthdays)
#> name birthday
#> Length:2 Min. :2001-01-01
#> Class :character 1st Qu.:2001-04-02
#> Mode :character Median :2001-07-03
#> Mean :2001-07-03
#> 3rd Qu.:2001-10-02
#> Max. :2002-01-02
str(birthdays)
#> Classes 'birthdays' and 'data.frame': 2 obs. of 2 variables:
#> $ name : chr "kevin" "stacey"
#> $ birthday: Date, format: "2001-01-01" "2002-01-02"
The print method returns its argument invisibly, which we can check:
tmp <- print(birthdays)
#> A `birthdays` object with 2 rows and 2 cols
identical(tmp, birthdays)
#> [1] TRUE
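A print method with this behaviour might look roughly like the sketch below. It is written for a hypothetical class named birthdays_sketch rather than the package's own class, and the message format is only modelled on the output shown above.

```r
# Hypothetical print method for an illustrative class "birthdays_sketch".
print.birthdays_sketch <- function(x, ...) {
  cat(
    "A `birthdays_sketch` object with", nrow(x), "rows and", ncol(x), "cols\n"
  )
  # Return the input invisibly so that assigning or piping the result of
  # print() does not print the object a second time.
  invisible(x)
}

obj <- structure(
  data.frame(name = "kevin", birthday = as.Date("2001-01-01")),
  class = c("birthdays_sketch", "data.frame")
)
tmp <- print(obj)
identical(tmp, obj)
#> [1] TRUE
```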
We can also write methods for operations that do not already have a
generic function defined. Here we write a birthdays_per_month() method,
which does as it says on the tin.
birthdays_per_month(birthdays)
#> Jan
#> 2
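Adding a method with no existing generic means defining the generic first and then the class-specific method. The sketch below assumes a hypothetical class birthdays_sketch and a hypothetical generic births_per_month_sketch(); it is one plausible way to get a per-month count and is not taken from the package source.

```r
# Hypothetical generic: dispatches on the class of `x`.
births_per_month_sketch <- function(x, ...) {
  UseMethod("births_per_month_sketch")
}

# Hypothetical method for the illustrative class "birthdays_sketch".
births_per_month_sketch.birthdays_sketch <- function(x, ...) {
  # month.abb holds English month abbreviations in calendar order.
  months <- month.abb[as.integer(format(x$birthday, "%m"))]
  counts <- table(factor(months, levels = month.abb))
  # Keep only the months that actually occur.
  counts[counts > 0]
}

obj <- structure(
  data.frame(
    name = c("kevin", "stacey"),
    birthday = as.Date(c("2001-01-01", "2002-01-02"))
  ),
  class = c("birthdays_sketch", "data.frame")
)
births_per_month_sketch(obj)
```

Using a factor with all twelve levels keeps the months in calendar order rather than alphabetical order before the empty months are dropped.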
The <birthdays> class is validated upon construction and can also be
validated interactively when working with it:
validate_birthdays(birthdays)
validate_birthdays(birthdays[1, ])
try(validate_birthdays(birthdays[, 1]))
#> Removing crucial column in `<birthdays>` returning `<data.frame>`
#> Error in validate_birthdays(birthdays[, 1]) :
#> input must contain 'name' and 'birthday' columns
birthdays$age <- seq_len(nrow(birthdays))
validate_birthdays(birthdays)
For a full explanation of the behaviour in the above code chunk see the other vignette.
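To make the column check concrete, here is a minimal validator sketch. The name validate_birthdays_sketch() and the class birthdays_sketch are hypothetical; only the error message is modelled on the output above, and the real validate_birthdays() may check more than this.

```r
# Hypothetical validator sketch: checks only for the two required columns.
validate_birthdays_sketch <- function(x) {
  required <- c("name", "birthday")
  if (!is.data.frame(x) || !all(required %in% names(x))) {
    stop("input must contain 'name' and 'birthday' columns", call. = FALSE)
  }
  # Return the input invisibly so the validator can be used in pipelines.
  invisible(x)
}

obj <- structure(
  data.frame(
    name = c("kevin", "stacey"),
    birthday = as.Date(c("2001-01-01", "2002-01-02"))
  ),
  class = c("birthdays_sketch", "data.frame")
)

validate_birthdays_sketch(obj)               # passes silently
obj$age <- seq_len(nrow(obj))                # extra columns are allowed
validate_birthdays_sketch(obj)               # still passes
try(validate_birthdays_sketch(obj["name"]))  # fails: 'birthday' is missing
```

Because the check uses %in% on exact names, a column called Name or birthdays would not satisfy it, which matches the case- and plural-sensitivity described earlier.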
Other methods for the class can also be added, for example:
birthday_paradox(birthdays)
#> num_coincidence prob_coincindence
#> 0.000000000 0.002739726
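The probability in the output above equals 1/365, which is what the classic birthday-problem formula gives for two people. The sketch below computes that formula under the usual assumptions of 365 equally likely days and independent birthdays. Whether birthday_paradox() computes its result this way is an assumption, and prob_shared_birthday_sketch() is a hypothetical name.

```r
# Probability that at least two of `n` people share a birthday, assuming
# `days` equally likely and independent birthdays (hypothetical helper).
prob_shared_birthday_sketch <- function(n, days = 365) {
  if (n > days) {
    return(1)  # pigeonhole principle: a shared birthday is certain
  }
  # P(all distinct) = (days / days) * ((days - 1) / days) * ...
  1 - prod((days - seq_len(n) + 1) / days)
}

prob_shared_birthday_sketch(2)
#> [1] 0.002739726
```

With 23 people the same function returns a value just above one half, which is the result the paradox is named for.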
Another example, this time creating a <birthdays> object with randomised names and birthdays:
df <- data.frame(
  name = randomNames::randomNames(10),
  birthday = sample(
    x = seq.Date(
      from = as.Date("2020-01-01"),
      to = as.Date("2022-01-01"),
      by = 1
    ),
    size = 10
  )
)
birthdays <- birthdays(df)
birthdays
#> A `birthdays` object with 10 rows and 2 cols
as.data.frame(birthdays)
#> name birthday
#> 1 Holiday, Brianna 2020-09-08
#> 2 Ramirez, Rachel 2020-09-11
#> 3 el-Radi, Isaam 2021-11-28
#> 4 Ranzinger, Pantea 2020-05-08
#> 5 Soto Reyes, Elexsis 2021-09-25
#> 6 Reynolds, Jaylen 2021-05-14
#> 7 Jundt, Andrew 2020-05-30
#> 8 el-Sabet, Taaliba 2021-12-16
#> 9 Sur, Yo Han 2021-06-02
#> 10 Lacota, Anissa 2021-08-02