This blog post has also been posted here.
Changes since last version
The R-package “plnr” (version 2022.6.8) has been published on CRAN. “plnr” is a part of the splverse, a set of R packages developed to help solve problems that frequently occur when performing infectious disease surveillance. “plnr” has two vignettes that briefly show the mental model behind “plnr”:
Concept
Broad technical terms

| Object | Description |
|---|---|
| argset | A named list containing a set of arguments. |
| analysis | These are the fundamental units that are scheduled in |
| plan | This is the overarching “scheduler”: |
Different types of plans

| Plan Type | Description |
|---|---|
| Single-function plan | Same action function applied multiple times with different argsets applied to the same datasets. |
| Multi-function plan | Different action functions applied to the same datasets. |
Plan Examples

| Plan Type | Example |
|---|---|
| Single-function plan | Multiple strata (e.g. locations, age groups) that you need to apply the same function to (e.g. outbreak detection, trend detection, graphing). |
| Single-function plan | Multiple variables (e.g. multiple outcomes, multiple exposures) that you need to apply the same statistical methods to (e.g. regression models, correlation plots). |
| Multi-function plan | Creating the output for a report (e.g. multiple different tables and graphs). |
In brief, we work within the mental model where we have one (or more) datasets and we want to run multiple analyses on these datasets. These multiple analyses can take the form of:
- Single-function plans: One action function (e.g. `table_1`) called multiple times with different argsets (e.g. `year=2019`, `year=2020`).
- Multi-function plans: Multiple action functions (e.g. `table_1`, `table_2`) called multiple times with different argsets (e.g. `table_1`: `year=2019`, while for `table_2`: `year=2019` and `year=2020`).
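The two plan types can be sketched in base R. This is only an illustration of the mental model, not the “plnr” API: the `cases` dataset, the bodies of `table_1` and `table_2`, and the `run_plan` helper are all hypothetical.

```r
# Hypothetical dataset, "pulled" once and shared by every analysis.
data <- list(
  cases = data.frame(
    year = c(2019, 2019, 2020, 2020),
    n    = c(10, 20, 30, 40)
  )
)

# Hypothetical action functions: each takes exactly `data` and `argset`.
table_1 <- function(data, argset) {
  d <- data$cases
  sum(d$n[d$year == argset$year])
}
table_2 <- function(data, argset) {
  d <- data$cases
  mean(d$n[d$year == argset$year])
}

# Single-function plan: one action function, several argsets.
single_plan <- list(
  list(fn = table_1, argset = list(year = 2019)),
  list(fn = table_1, argset = list(year = 2020))
)

# Multi-function plan: several action functions, each with its own argsets.
multi_plan <- list(
  list(fn = table_1, argset = list(year = 2019)),
  list(fn = table_2, argset = list(year = 2019)),
  list(fn = table_2, argset = list(year = 2020))
)

# Run every analysis in a plan against the same data.
run_plan <- function(plan, data) {
  lapply(plan, function(analysis) analysis$fn(data, analysis$argset))
}

single_results <- run_plan(single_plan, data)
multi_results  <- run_plan(multi_plan, data)
```

In both cases an analysis is just the pairing of one argset with one action function; the plan type only describes how many distinct action functions appear.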
By demanding that all analyses use the same data sources we can:
- Be efficient by requiring a minimal amount of data pulling (this happens only once, at the start).
- Better enforce the concept that data cleaning and analysis should be completely separate.
By demanding that all analysis functions only use two arguments (`data` and `argset`) we can:
- Reduce mental fatigue by working within the same mental model for each analysis.
- Make it easier for analyses to be exchanged with each other and iterated on.
- Easily schedule the running of each analysis.
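The benefit of a shared two-argument signature can be illustrated with a small base-R sketch. It is not the “plnr” implementation: `trend_diff`, `trend_lm`, `run_analysis`, and the `cases` dataset are hypothetical names invented for this example.

```r
# Hypothetical dataset shared by all analyses.
data <- list(
  cases = data.frame(year = c(2019, 2020, 2021), n = c(10, 30, 20))
)

# Two interchangeable (hypothetical) trend measures with the same signature.
trend_diff <- function(data, argset) {
  d <- data$cases[data$cases$year <= argset$year_max, ]
  d$n[nrow(d)] - d$n[1]  # last value minus first value
}
trend_lm <- function(data, argset) {
  d <- data$cases[data$cases$year <= argset$year_max, ]
  unname(coef(lm(n ~ year, data = d))["year"])  # slope of a linear fit
}

# A generic runner: it only needs to know the function's name, because
# every action function is called the same way.
run_analysis <- function(analysis, data) {
  fn <- get(analysis$fn_name, mode = "function")
  fn(data, analysis$argset)
}

analysis  <- list(fn_name = "trend_diff", argset = list(year_max = 2021))
result_v1 <- run_analysis(analysis, data)

# Iterating on the method means changing one field, not the calling code.
analysis$fn_name <- "trend_lm"
result_v2 <- run_analysis(analysis, data)
```

Because the runner never inspects what an action function does, swapping one method for another does not touch the scheduling code.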
By including all of this in one `Plan` class, we can easily maintain a good overview of all the analyses (i.e. outputs) that need to be run.
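As a rough illustration of that overview, the sketch below keeps the data and the analyses in a single plain list and summarises them in a data frame. It is a stand-in for the idea only: the real `Plan` class is not used here, and the `plan` structure, the analysis names, and the `overview` table are assumptions made for this example.

```r
# Hypothetical stand-in for a plan: one object holding data and analyses.
plan <- list(
  data = list(
    cases = data.frame(year = c(2019, 2020), n = c(10, 30))
  ),
  analyses = list(
    table_1_2019 = list(fn_name = "table_1", argset = list(year = 2019)),
    table_2_2019 = list(fn_name = "table_2", argset = list(year = 2019)),
    table_2_2020 = list(fn_name = "table_2", argset = list(year = 2020))
  )
)

# One row per analysis (i.e. per output that needs to be produced).
overview <- data.frame(
  analysis = names(plan$analyses),
  fn_name  = unname(vapply(plan$analyses, function(a) a$fn_name, character(1))),
  year     = unname(vapply(plan$analyses, function(a) a$argset$year, numeric(1))),
  stringsAsFactors = FALSE
)

print(overview)
```

Each row corresponds to one output, so the table doubles as a checklist of what the plan will produce.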