Richard Aubrey White - R-Package “plnr” (version 2022.6.8) Published on CRAN


Changes since last version

The R-package “plnr” (version 2022.6.8) has been published on CRAN. “plnr” is a part of the splverse, a set of R packages developed to help solve problems that frequently occur when performing infectious disease surveillance. “plnr” has two vignettes that briefly show the mental model behind “plnr”:

Concept

Broad technical terms
Object	Description
argset	A named list containing a set of arguments.
analysis	The fundamental unit scheduled in `plnr`. Each analysis consists of (1) one argset and (2) one action function that takes two arguments: data (named list) and argset (named list).
plan	The overarching “scheduler”. Each plan consists of (1) one data pull and (2) one list of analyses.
Different types of plans
Plan Type	Description
Single-function plan	Same action function applied multiple times with different argsets applied to the same datasets.
Multi-function plan	Different action functions applied to the same datasets.
Plan Examples
Plan Type	Example
Single-function plan	Multiple strata (e.g. locations, age groups) that you need to apply the same function to (e.g. outbreak detection, trend detection, graphing).
Single-function plan	Multiple variables (e.g. multiple outcomes, multiple exposures) that you need to apply the same statistical methods to (e.g. regression models, correlation plots).
Multi-function plan	Creating the output for a report (e.g. multiple different tables and graphs).

In brief, we work within the mental model where we have one (or more) datasets and we want to run multiple analyses on these datasets. These multiple analyses can take the form of:

Single-function plans: One action function (e.g. table_1) called multiple times with different argsets (e.g. year=2019, year=2020).
Multi-function plans: Multiple action functions (e.g. table_1, table_2), each called one or more times with its own argsets (e.g. table_1 with year=2019; table_2 with year=2019 and year=2020).
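
The two plan types can be illustrated with a minimal base-R sketch. This is not the `plnr` API: the toy dataset, the bodies of `table_1` and `table_2`, and the `run_plan` helper are all assumptions made for illustration. The only thing taken from the text is that every action function receives `data` and `argset`.

```r
# Base-R sketch of the mental model; none of this is the plnr API.

# One shared "data pull": a named list of datasets (toy values).
data <- list(
  cases = data.frame(
    year = c(2019, 2019, 2020, 2020),
    n    = c(10, 12, 7, 9)
  )
)

# Hypothetical action functions; both take exactly (data, argset).
table_1 <- function(data, argset) {
  d <- data$cases[data$cases$year == argset$year, ]
  sum(d$n)
}
table_2 <- function(data, argset) {
  d <- data$cases[data$cases$year == argset$year, ]
  mean(d$n)
}

# Single-function plan: one action function, several argsets.
single_plan <- list(
  list(fn = table_1, argset = list(year = 2019)),
  list(fn = table_1, argset = list(year = 2020))
)

# Multi-function plan: several action functions, each with its own argsets.
multi_plan <- list(
  list(fn = table_1, argset = list(year = 2019)),
  list(fn = table_2, argset = list(year = 2019)),
  list(fn = table_2, argset = list(year = 2020))
)

# Running a plan is the same loop in both cases.
run_plan <- function(plan, data) {
  lapply(plan, function(analysis) analysis$fn(data, analysis$argset))
}

single_results <- run_plan(single_plan, data)
multi_results  <- run_plan(multi_plan, data)
```

In this sketch the only difference between the two plan types is which functions appear in the list of analyses; the code that runs them is identical.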

By demanding that all analyses use the same data sources we can:

Be efficient by keeping data-pulling to a minimum (it happens only once, at the start).
Better enforce the concept that data-cleaning and analysis should be completely separate.
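
The "pull once, analyse many times" idea might look like this in plain base R. The `pull_data` and `count_cases` functions and the toy data are hypothetical and are not part of `plnr`; the counter exists only to make the single pull visible.

```r
# Base-R sketch, not the plnr API; all names here are hypothetical.

pull_count <- 0

# The data pull: loading (and any cleaning) happens here, and only here.
pull_data <- function() {
  pull_count <<- pull_count + 1  # track how often the pull runs
  list(
    cases = data.frame(
      location = c("a", "b", "a", "b"),
      n        = c(3, 5, 4, 6)
    )
  )
}

# The analysis: it only reads from `data` and never loads or cleans anything.
count_cases <- function(data, argset) {
  sum(data$cases$n[data$cases$location == argset$location])
}

argsets <- list(list(location = "a"), list(location = "b"))

data <- pull_data()  # pulled once, before any analysis runs
results <- lapply(argsets, function(argset) count_cases(data, argset))
```

However many argsets are added, `pull_data()` still runs a single time, and the analysis function has no way to reach the raw source directly.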

By demanding that all analysis functions only use two arguments (data and argset) we can:

Reduce mental fatigue by working within the same mental model for each analysis.
Make it easier for analyses to be exchanged with each other and iterated on.
Easily schedule the running of each analysis.
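
As a rough illustration of why a fixed `(data, argset)` signature helps, the base-R sketch below swaps one analysis function for another without touching the argset or the calling code. The function names, the `run_one` helper, and the toy data are assumptions for this example, not the `plnr` API.

```r
# Base-R sketch, not the plnr API; all names here are hypothetical.
data <- list(x = c(1, 2, 3, 10))

# Two interchangeable action functions with the same (data, argset) signature.
summarise_mean   <- function(data, argset) mean(data$x) * argset$scale
summarise_median <- function(data, argset) median(data$x) * argset$scale

analyses <- list(
  list(fn = summarise_mean, argset = list(scale = 1)),
  list(fn = summarise_mean, argset = list(scale = 100))
)

# Run a single analysis by index, e.g. while developing or debugging it.
run_one <- function(analyses, data, i) {
  analyses[[i]]$fn(data, analyses[[i]]$argset)
}

before <- run_one(analyses, data, 2)

# Iterating on the method only means swapping the function; the argset and
# the calling code stay unchanged.
analyses[[2]]$fn <- summarise_median
after <- run_one(analyses, data, 2)
```

Because every analysis is called the same way, a scheduler needs no per-function logic: it can run one analysis, a subset, or all of them with the same call.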

By including all of this in one Plan class, we can easily maintain a good overview of all the analyses (i.e. outputs) that need to be run.