The function estimateR uses the relative transmission probabilities to estimate the individual-level, time-level, and average effective reproductive numbers for an outbreak.
estimateR(
  df,
  indIDVar,
  dateVar,
  pVar,
  timeFrame = c("days", "months", "weeks", "years"),
  rangeForAvg = NULL,
  bootSamples = 0,
  alpha = 0.05,
  progressBar = TRUE
)
df
- The name of the data set with transmission probabilities (column pVar), individual IDs (columns <indIDVar>.1 and <indIDVar>.2), and the dates of observation (columns <dateVar>.1 and <dateVar>.2).
indIDVar
- The name (in quotes) of the individual ID columns (data frame df must have variables called <indIDVar>.1 and <indIDVar>.2).
dateVar
- The name (in quotes) of the columns with the dates that the individuals are observed (data frame df must have variables called <dateVar>.1 and <dateVar>.2), which must be date or date-time (POSIXt) objects.
pVar
- The column name (in quotes) of the transmission probabilities.
timeFrame
- The time frame used to calculate Rt (one of "days", "months", "weeks", "years").
rangeForAvg
- A vector with the starting and ending time periods to be used to calculate the average effective reproductive number.
bootSamples
- The number of bootstrap samples; if 0, then no confidence intervals are calculated.
alpha
- The alpha level for the confidence intervals.
progressBar
- A logical indicating if a progress bar should be printed (default is TRUE).
A list with five elements:
RiDf
- a data frame with the individual-level reproductive numbers. Column names:
<indIDVar>
- the individual ID with name specified.
<dateVar>
- the date the individual was observed with name specified.
Ri
- the individual-level reproductive number.
nInfectees
- the number of possible infectees for this individual.
RtDf
- a data frame with the time-level reproductive numbers. Column names:
time
- the time frame corresponding to the reproductive number estimate
(day for "days" and "weeks", month for "months", year for "years").
timeRank
- the rank of the time frame.
Rt
- the time-level reproductive number for this time frame.
ciLower
- lower bound of confidence interval for Rt
(only if bootSamples > 0).
ciUpper
- upper bound of confidence interval for Rt
(only if bootSamples > 0).
RtAvgDf
- a data frame with the average effective reproductive number. Column names:
RtAvg
- the average time-level reproductive number over the range specified in rangeForAvg.
ciLower
- lower bound of confidence interval for RtAvg
(only if bootSamples > 0).
ciUpper
- upper bound of confidence interval for RtAvg
(only if bootSamples > 0).
timeFrame
- a vector with the timeFrame input
rangeForAvg
- a vector with the rangeForAvg input
The effective reproductive number is the average number of cases an infectious case will produce in a population of both susceptible and non-susceptible individuals. The rationale behind this reproductive number estimation follows Wallinga and Teunis (2004), where the individual-level reproductive number is estimated by summing the relative probabilities that the individual infected each other individual.
If \(p_{ij}\) equals the relative probability that case \(i\) was infected by case \(j\), then the individual-level reproductive number (\(R_j\)) is calculated by:
$$R_j = \sum_{m \ne j} {p_{mj}}$$
The time-level reproductive number is then estimated by averaging the individual-level reproductive numbers for all individuals observed in the same time frame (days, weeks, months, or years, as set by timeFrame).
Finally, the time-level reproductive numbers are averaged to estimate the average effective reproductive number within rangeForAvg.
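Written out, and assuming the plain unweighted averages described above, the two averaging steps take the following form. The symbols \(t_j\) (the time frame in which case \(j\) was observed), \(n_t\) (the number of cases observed in time frame \(t\)), \(T\) (the set of time frames selected by rangeForAvg), and \(\overline{R}\) (the average effective reproductive number) are notation introduced here for illustration only:
$$R_t = \frac{1}{n_t} \sum_{j \,:\, t_j = t} R_j, \qquad \overline{R} = \frac{1}{|T|} \sum_{t \in T} R_t$$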
To get the best estimate of the average effective reproductive number, one should
only consider the stable portion of the outbreak (exclude the beginning and end).
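The three steps described above can be sketched in base R on a toy data set. This is an illustration only, not the package's internal code: the data values, the column names (infector, infectee, p, id, month), and the choice to include both endpoints of the range are assumptions made for this sketch.

```r
# Toy pairwise data (hypothetical values): each row is an ordered pair in
# which `infector` may have infected `infectee` with relative probability `p`.
# For each infectee, the probabilities over its possible infectors sum to 1.
pairs <- data.frame(
  infector = c("A", "A", "B", "A", "B", "C"),
  infectee = c("B", "C", "C", "D", "D", "D"),
  p        = c(1.0, 0.7, 0.3, 0.2, 0.3, 0.5)
)

# Hypothetical observation month for each case.
cases <- data.frame(
  id    = c("A", "B", "C", "D"),
  month = c(1, 1, 2, 2)
)

# Step 1: individual-level R = sum of p over everyone the case may have infected.
ri <- aggregate(p ~ infector, data = pairs, FUN = sum)
names(ri) <- c("id", "Ri")

# Cases that never appear as a possible infector get Ri = 0.
riDf <- merge(cases, ri, by = "id", all.x = TRUE)
riDf$Ri[is.na(riDf$Ri)] <- 0

# Step 2: time-level R = mean of Ri over the cases observed in each month.
rtDf <- aggregate(Ri ~ month, data = riDf, FUN = mean)
names(rtDf) <- c("month", "Rt")

# Step 3: average R = mean of Rt over the chosen range.
# Assumption of this sketch: both endpoints of the range are included.
rangeForAvg <- c(1, 2)
inRange <- rtDf$month >= rangeForAvg[1] & rtDf$month <= rangeForAvg[2]
rtAvg <- mean(rtDf$Rt[inRange])

print(riDf)
print(rtDf)
print(rtAvg)
```

Because each infectee's probabilities sum to 1, the individual-level values in this toy example add up to the number of cases that have a possible infector, which is a quick sanity check on the first step.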
If bootSamples > 0, bootstrap confidence intervals will be estimated for both the time-level and average reproductive numbers using parametric bootstrapping.
Wallinga J, Teunis P. Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures. American Journal of Epidemiology. 2004 Sep 15;160(6):509-16.
## Use the nbResults data frame included in the package which has the results
## of the nbProbabilities() function on a TB-like outbreak.
## Getting initial estimates of the reproductive number
# (without specifying rangeForAvg and without confidence intervals)
rInitial <- estimateR(nbResults, dateVar = "infectionDate",
                      indIDVar = "individualID", pVar = "pScaled",
                      timeFrame = "months")
#> Please choose the stable portion of the outbreak to calculate the average Rt
## Finding the stable portion of the outbreak for rangeForAvg using plot of Rt
cut1 <- 25
cut2 <- 125
# Optional plot to determine the cutpoints above
# ggplot(data = rInitial$RtDf, aes(x = timeRank, y = Rt)) +
# geom_point() +
# geom_line() +
# geom_hline(data = rInitial$RtAvgDf, aes(yintercept = RtAvg), size = 0.7) +
# geom_vline(aes(xintercept = cut1), linetype = 2, size = 0.7) +
# geom_vline(aes(xintercept = cut2), linetype = 2, size = 0.7)
## Finding the final reproductive number estimates with confidence intervals
# NOTE: in practice, run with bootSamples > 2.
rFinal <- estimateR(nbResults, dateVar = "infectionDate",
                    indIDVar = "individualID", pVar = "pScaled",
                    timeFrame = "months", rangeForAvg = c(cut1, cut2),
                    bootSamples = 2, alpha = 0.05)
rFinal$RtAvgDf
#> RtAvg ciLower ciUpper
#> 1 1.125558 0.8799709 1.343096