The function indToPair takes a dataset of observations (such as individuals in an infectious disease outbreak) and transforms it into a dataset of pairs.

indToPair(
  indData,
  indIDVar,
  separator = "_",
  dateVar = NULL,
  units = c("mins", "hours", "days", "weeks"),
  ordered = FALSE
)

Arguments

indData

An individual-level dataframe.

indIDVar

The name (in quotes) of the column with the individual ID.

separator

The character to be used to separate the individual IDs when creating the pairID.

dateVar

The name (in quotes) of the column with the dates that the individuals are observed (optional unless ordered = TRUE). This column must be a date or date-time (POSIXt) object. If supplied, the time difference between individuals will be calculated in the units specified.

units

The units for the time difference. Only used if dateVar is supplied. Must be one of "mins", "hours", "days", "weeks".

ordered

A logical indicating whether only ordered pairs should be returned (<dateVar>.1 before <dateVar>.2 or <dateVar>.1 = <dateVar>.2). If FALSE, a dataframe of all pairs will be returned.
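As a minimal sketch of what the dateVar and units arguments imply, the base R snippet below converts a character column to a Date, checks its class, and shows how a single time difference reads in each of the four supported units. The data frame `raw` and its column names are hypothetical, and the use of difftime() here illustrates the units only; it is not a claim about how indToPair computes the difference internally.

```r
# Hypothetical raw data with dates stored as character strings
raw <- data.frame(
  individualID = c("A", "B"),
  infectionDate = c("2020-01-01", "2020-01-05"),
  stringsAsFactors = FALSE
)

# dateVar must be a Date or POSIXt column, so convert before pairing
raw$infectionDate <- as.Date(raw$infectionDate, format = "%Y-%m-%d")
isDate <- inherits(raw$infectionDate, c("Date", "POSIXt"))

# One interval expressed in each of the supported units
t1 <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC")
t2 <- as.POSIXct("2020-01-02 12:00:00", tz = "UTC")
unitNames <- c("mins", "hours", "days", "weeks")
diffs <- sapply(unitNames, function(u) {
  as.numeric(difftime(t2, t1, units = u))
})
```

Here `diffs` holds the same 36-hour interval as 2160 minutes, 36 hours, 1.5 days, and roughly 0.214 weeks, which is why the choice of units should match the time scale of the data.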

Value

A dataframe of either all possible pairs of individuals (ordered = FALSE) or ordered pairs of individuals (ordered = TRUE). The dataframe will have all of the original variables with suffixes ".1" and ".2" corresponding to the original values of <indIDVar>.1 and <indIDVar>.2.

Added to the dataframe will be a column called pairID, which is <indIDVar>.1 and <indIDVar>.2 separated by separator.

If dateVar is provided, the dataframe will also include a variable <dateVar>.Diff giving the time difference between dateVar for <indIDVar>.1 and <indIDVar>.2 in the units specified.

Details

The function requires an ID column (indIDVar) to identify the individual observations. The resulting pair-level dataframe will have a pairID column which combines the individual IDs for that pair.

The function can either output all possible pairs (ordered = FALSE) or only ordered pairs (ordered = TRUE), where the order is determined by a date variable (dateVar). If ordered = TRUE, then dateVar must be provided; if ordered = FALSE, it is optional. In both cases, if dateVar is provided, the output will include the time difference between the individuals in the pair in the units specified ("mins", "hours", "days", "weeks").
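The transformation described above can be sketched in base R on a toy dataset. This is an illustration of the output shape, not the function's actual implementation. The data frame `toy` and its column names are hypothetical, and two details are assumptions not stated on this page: that an individual is never paired with itself, and that the difference is computed as <dateVar>.2 minus <dateVar>.1 (so it is non-negative for ordered pairs).

```r
# Hypothetical individual-level data
toy <- data.frame(
  individualID = c("A", "B", "C"),
  infectionDate = as.Date(c("2020-01-01", "2020-01-05", "2020-01-03")),
  stringsAsFactors = FALSE
)

# All combinations of row indices, dropping self-pairs (assumption)
idx <- expand.grid(i = seq_len(nrow(toy)), j = seq_len(nrow(toy)))
idx <- idx[idx$i != idx$j, ]

# Copy the original variables for each member of the pair, adding the
# ".1" and ".2" suffixes
first <- toy[idx$i, ]
second <- toy[idx$j, ]
names(first) <- paste0(names(toy), ".1")
names(second) <- paste0(names(toy), ".2")
pairs <- cbind(first, second)
rownames(pairs) <- NULL

# pairID: the two individual IDs joined by the separator
pairs$pairID <- paste(pairs$individualID.1, pairs$individualID.2, sep = "_")

# Time difference in days (assumed direction: second minus first)
pairs$infectionDate.Diff <- as.numeric(
  difftime(pairs$infectionDate.2, pairs$infectionDate.1, units = "days")
)

# Ordered pairs: keep rows where the first date is on or before the second
orderedPairs <- pairs[pairs$infectionDate.1 <= pairs$infectionDate.2, ]
```

With three individuals this gives six rows in `pairs` (both A_B and B_A appear) and three rows in `orderedPairs` (A_B, A_C, C_B). Under the assumptions above, ties in the date variable would keep both directions of a pair in the ordered output.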

Examples

## Create a dataset of all pairs with no date variable
pairU <- indToPair(indData = indData, indIDVar = "individualID")

## Create a dataset of all pairs with a date variable
pairUD <- indToPair(indData = indData, indIDVar = "individualID",
                    dateVar = "infectionDate", units = "days")

## Create a dataset of ordered pairs
pairO <- indToPair(indData = indData, indIDVar = "individualID",
                   dateVar = "infectionDate", units = "days", ordered = TRUE)