Background

APLUS ML (A Python Library for Usefulness Simulations of Machine Learning Models) is a simulation framework for conducting usefulness assessments of machine learning models in workflows, as originally published in this 2023 JBI paper.

It aims to quantitatively answer the question: If I use this ML model within this workflow, will the benefits outweigh the costs, and by how much?

APLUS graphical abstract

APLUS was originally developed for clinical workflows in healthcare settings. However, APLUS is applicable to any workflow that involves a model making decisions on a stream of datapoints, and we encourage contributors from any domain to use and extend APLUS.

Motivation

Despite rapid advancements in machine learning (ML) for healthcare, model deployment remains limited as traditional ML evaluation metrics (e.g., AUROC, F1 Score) fail to account for the complexities of real-world workflows such as staffing constraints, resource limitations, treatment heterogeneity, and variable patient flow.

These operational conditions can greatly distort the realized impact of introducing an ML model into a clinical setting.

APLUS addresses this problem by evaluating ML models under realistic workflow constraints via simulation.

Components

Below, we provide a high-level overview of the three components of APLUS. We provide a more formal specification of the concepts underlying APLUS on the Concept Guide page.

1. Simulation

APLUS uses discrete-event simulation to model workflows.

Each patient’s journey through the workflow is represented as an ordered sequence of states over evenly-spaced time steps, which enables modeling cycles and resource-dependent prioritization (e.g., Stage IV cancer patients receiving preferential access to treatment over Stage II patients).

Transitions between states can be determined probabilistically or via conditional expressions, which enables modeling heterogeneous treatment effects, patient subpopulations, and system-level resource constraints.

APLUS outputs a structured dictionary that contains the full history of the simulation (e.g., states, transitions, and achieved utilities) to enable arbitrary downstream analyses.

2. Configuration

APLUS represents workflows as a state machine specified via a YAML config file. The config consists of three sections:

  1. Metadata – Essential initialization details, such as:

    • Workflow name

    • Paths to required files (e.g., CSV files with model predictions)

    • Column mappings for tabular data (e.g. which column in a CSV corresponds to patient IDs)

  2. Variables – Parameters of the simulation. There are four types of variables:

    • Simulation-Level Variables: Tracked by the simulation engine itself; these measure the progression of time within the simulation (e.g., the number of timesteps the simulation has run, or how long a patient has been in a state).

    • Patient-Level Properties: Unique, individual-level properties associated with each patient (e.g., age, cancer stage, ML model predictions).

    • System-Level Resources: Attributes of the overall system shared across all patients (e.g., MRI availability, budget, specialist capacity).

    • Constants: Fixed values. These can be any primitive Python type (integer, float, string, or boolean) or basic Python data structure (list, dict, set).

  3. States – The steps of the workflow. There are three types of states:

    • Start: Where all patients begin their journey through the workflow. There is always exactly 1 start state.

    • Intermediate: Transition points within the workflow. There can be 0+ intermediate states.

    • End: Terminal states, which indicate the completion of the workflow. There must be 1+ end states.
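The three config sections can be pictured as a nested mapping. The sketch below uses a plain Python dict in place of the parsed YAML file. Every key name in it (`metadata`, `path_to_predictions`, `type`, and so on) is a hypothetical placeholder chosen for illustration, not the actual APLUS schema.

```python
from collections import defaultdict

# Hypothetical layout: these key names are illustrative assumptions,
# not the real APLUS YAML schema.
config = {
    "metadata": {
        "name": "example-workflow",
        "path_to_predictions": "predictions.csv",
        "patient_id_column": "patient_id",
    },
    "variables": {
        "age": {"type": "property"},
        "nurse_capacity": {"type": "resource"},
        "cost_per_visit": {"type": "constant", "value": 120.0},
    },
    "states": {
        "admitted": {"type": "start"},
        "treated": {"type": "intermediate"},
        "discharged": {"type": "end"},
    },
}


def names_by_type(section):
    """Group the entry names of one config section by their "type" field."""
    grouped = defaultdict(list)
    for name, spec in section.items():
        grouped[spec["type"]].append(name)
    return dict(grouped)


states_by_type = names_by_type(config["states"])
variables_by_type = names_by_type(config["variables"])
```

Grouping entries by type like this makes it easy to see at a glance which variables are shared resources and which states are terminal.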

3. Visualization

APLUS has a set of predefined visualizations for workflows which are generated using Graphviz.

Definitions

Here, we provide a more detailed specification of the concepts underlying the APLUS simulation framework.

Simulation

We use discrete event simulation to simulate our workflow \(W\).

In other words, we represent the world as progressing through a set of discrete, evenly spaced timesteps \(\lambda = 0, 1, \ldots, N\). Each timestep \(\lambda\) could represent a second, minute, hour, day, etc.; the interpretation is up to the user.

Events \(A\) and \(B\) that occur within the same timestep \(\lambda\) can have arbitrary ordering if there does not exist a strict \(A \rightarrow B\) or \(B \rightarrow A\) dependency between these events. In other words, if 3 patients have an MRI and 2 patients have a blood test on timestep \(\lambda = 3\), then assuming none of these events are dependent on each other, the ordering in which the blood tests and MRIs occur will be random.
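As a minimal sketch of this ordering rule, the snippet below shuffles the independent events of a single timestep. The function name `order_events` and the event tuples are hypothetical. The sketch assumes no event depends on another, so it does not show how APLUS resolves dependent events.

```python
import random


def order_events(events, rng):
    """Return the events of one timestep in a random order.

    Assumes that none of the events depend on one another.
    """
    shuffled = list(events)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled


# Timestep 3: three MRIs and two blood tests, all independent.
events = [
    ("MRI", "patient_1"),
    ("MRI", "patient_2"),
    ("MRI", "patient_3"),
    ("blood test", "patient_4"),
    ("blood test", "patient_5"),
]
ordered = order_events(events, random.Random(0))
```

Passing in a seeded `random.Random` keeps the ordering reproducible across runs.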

A “duration” refers to a number of timesteps (i.e. a length of time).

Workflow

A workflow \(W\) is simply a set of states \(S\).

States

Each state \(s \in S\) has associated with it:

  1. A duration \(\lambda_s\) representing how many timesteps an agent will wait in this state before transitioning to another state

  2. A set of utilities \(U_s\)

  3. A set of resource deltas \(R_s \subseteq R\) that specify how various resources \(r \in R_s\) change when an agent arrives at this state

  4. A set of transitions \(T_s \subseteq T\)

  5. A type \(\tau_s \in \{\text{start}, \text{normal}, \text{end}\}\)

Invariants:

  • \(|\{ s \in S \mid \tau_s = \text{start} \}| = 1\)

  • \(\forall s \in S\) such that \(\tau_s = \text{start}, |T_s| > 0\)

  • \(\forall s \in S\) such that \(\tau_s = \text{normal}, |T_s| > 0\)

  • \(\forall s \in S\) such that \(\tau_s = \text{end}, |T_s| = 0\)
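The invariants above can be checked mechanically. The following sketch uses a hypothetical `State` dataclass and `check_invariants` helper, neither of which is part of the APLUS API, and represents each transition set \(T_s\) as a list of destination state names.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Hypothetical stand-in for a workflow state (not an APLUS class)."""
    name: str
    type: str  # "start", "normal", or "end"
    transitions: list = field(default_factory=list)  # destination state names


def check_invariants(states):
    """Return a list of human-readable invariant violations (empty if valid)."""
    errors = []
    n_start = sum(1 for s in states if s.type == "start")
    if n_start != 1:
        errors.append(f"expected exactly 1 start state, found {n_start}")
    for s in states:
        if s.type in ("start", "normal") and len(s.transitions) == 0:
            errors.append(f"{s.type} state {s.name!r} has no outgoing transitions")
        if s.type == "end" and len(s.transitions) > 0:
            errors.append(f"end state {s.name!r} must not have outgoing transitions")
    return errors


valid_workflow = [
    State("admitted", "start", ["treated"]),
    State("treated", "normal", ["discharged"]),
    State("discharged", "end"),
]
print(check_invariants(valid_workflow))  # prints []
```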

Transitions

Given the set of all transitions \(T\), each transition \(t \in T_s \subseteq T\) has associated with it:

  1. A source state \(s \in S\)

  2. A destination state \(s' \in S\) (where \(s'\) could be the same as \(s\))

  3. A duration \(\lambda_t\) representing how many timesteps an agent will wait, after having chosen this transition \(t\), before moving to state \(s'\)

  4. A condition \(c_t \in C\) that, only when TRUE, allows the agent to take this transition \(t\) to state \(s'\)

  5. A set of utilities \(U_t\)

  6. A set of resource deltas \(R_t \subseteq R\) that specify how various resources \(r \in R_t\) change after an agent takes this transition

Utilities

Given the set of all utilities \(U\), each utility \(u \in U\) has associated with it:

  1. A value \(u_v \in \mathbb{R}\) representing the numeric value of this utility

  2. A unit \(u_u\) (e.g. QALYs, US dollars, years, etc.)

  3. A condition \(c_u \in C\) that, only when TRUE, has the simulation record that this utility value \(u_v\) for unit \(u_u\) was achieved

Conditions

A condition \(c \in C\) determines whether a transition can be taken or a utility is recorded. A condition \(c\) can take the form of either:

  1. A probability, i.e. \(c \in \mathbb{R}\) with \(0 \le c \le 1\), in which case the condition is TRUE with probability \(c\); OR

  2. An arbitrary Python expression which evaluates to TRUE or FALSE
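To make the two forms concrete, here is a minimal sketch of how such a condition could be evaluated. The helper `condition_holds` and its arguments are hypothetical, and the use of `eval` with emptied builtins is an assumption made for illustration, not a description of how APLUS evaluates expressions.

```python
import random


def condition_holds(condition, variables, rng):
    """Evaluate a condition given as a probability or a Python expression.

    `variables` maps the names usable inside an expression to their values.
    """
    if isinstance(condition, bool):
        return condition
    if isinstance(condition, (int, float)):
        if not 0 <= condition <= 1:
            raise ValueError(f"probability must lie in [0, 1], got {condition}")
        # rng.random() is in [0, 1), so 1.0 always holds and 0.0 never does.
        return rng.random() < condition
    # String expression. eval() is unsafe on untrusted input; emptying the
    # builtins only limits what a trusted config author can reference.
    return bool(eval(condition, {"__builtins__": {}}, dict(variables)))


rng = random.Random(0)
patient = {"age": 70, "nurse_capacity": 1}
print(condition_holds("age > 65 and nurse_capacity > 0", patient, rng))  # prints True
print(condition_holds(1.0, patient, rng))  # prints True
```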

Resources

A resource \(r \in R\) is a constrained resource that is shared across all patients. This represents a hospital-level constraint of a workflow (e.g. fiscal budget, number of nurses, MRI machine availability, etc.). Each resource \(r\) has associated with it:

  1. A level \(r_l \in \mathbb{N}\) which represents the current value of the resource

  2. An initial amount \(r_i \in \mathbb{N}\) which ensures \(r_l = r_i\) when \(\lambda = 0\)

  3. A maximum capacity \(r_m \in \mathbb{N}\) which ensures that \(r_l \le r_m\)

  4. A refill amount \(r_a \in \mathbb{N}\) that represents how much this resource gets increased after \(\lambda_r\) timesteps have elapsed since the last refill

  5. A refill duration \(\lambda_r \in \mathbb{N}\) that represents how many timesteps must elapse before the resource is increased to a value of \(\min(r_l + r_a, r_m)\)
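The refill rule can be sketched as follows. The `Resource` class and its `tick` method are hypothetical. The sketch assumes that refills are counted from timestep 0 and that a refill is capped at the maximum capacity so that \(r_l \le r_m\) always holds.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical shared resource (not an APLUS class)."""
    level: int            # r_l, starts at the initial amount r_i
    max_capacity: int     # r_m
    refill_amount: int    # r_a
    refill_duration: int  # lambda_r

    def tick(self, timestep):
        """Refill every `refill_duration` timesteps, capped at `max_capacity`."""
        due = (
            timestep > 0
            and self.refill_duration > 0
            and timestep % self.refill_duration == 0
        )
        if due:
            self.level = min(self.level + self.refill_amount, self.max_capacity)


nurses = Resource(level=1, max_capacity=5, refill_amount=3, refill_duration=2)
levels = []
for t in range(1, 5):
    nurses.tick(t)
    levels.append(nurses.level)
print(levels)  # prints [1, 4, 4, 5]
```

The final refill would raise the level to 7, but the cap holds it at the maximum capacity of 5.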

Important

In order to decrement a resource, you need to specify a resource delta on the relevant state/transition. Otherwise, if you just require that nurse_capacity > 0 for a transition, then the simulation will not automatically decrement nurse_capacity by 1 when that transition is taken (which can be surprising to some users). This is often a cause of infinite loops, or situations where changing the \(r_i\), \(r_m\), or \(r_a\) of a resource has no effect on the model’s achieved utility.
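The difference is easy to see in a toy example. The function below is a hypothetical illustration and does not use the APLUS API. It counts how many of five patients pass a `nurse_capacity > 0` check, with and without a resource delta of -1.

```python
def admit_patients(n_patients, nurse_capacity, delta):
    """Count how many patients pass a `nurse_capacity > 0` condition.

    `delta` is the resource delta applied each time the transition is taken.
    """
    admitted = 0
    for _ in range(n_patients):
        if nurse_capacity > 0:
            admitted += 1
            nurse_capacity += delta  # only changes if a delta was specified
    return admitted


print(admit_patients(5, nurse_capacity=2, delta=-1))  # prints 2
print(admit_patients(5, nurse_capacity=2, delta=0))   # prints 5
```

With no delta, the condition never becomes FALSE, so the starting capacity of 2 has no effect on the outcome.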

Patients

Each patient \(p \in P\) has associated with it:

  1. A start timestep \(\lambda_p\) representing the timestep of the simulation at which the patient began progressing through the workflow (e.g. the day that the patient was admitted to the hospital)

  2. A current state \(s_p \in S\). The patient always starts at a state \(s_p\) where \(\tau_{s_p} = \text{start}\)

  3. A set of properties \(\rho_p\) which can be anything (integers, floats, strings, dictionaries, lists, etc.)

  4. A history object \(H_p\) which captures all of the past states, transitions, and utilities that the patient achieved.

Running a Simulation

Each patient \(p\) starts their workflow at the state \(s\), where \(\tau_s = \text{start}\). Note: This is the same for all patients.

Each patient \(p\) starts their workflow at timestep \(\lambda_p\). Note: This can vary across patients.

Each patient \(p\) then progresses through the states of the workflow, according to the applicable transitions and conditions. The patient stops their journey when either of the following conditions is met:

  • The patient reaches a state \(s\) where \(\tau_s = \text{end}\); OR

  • The simulation is terminated prematurely after a set number of timesteps have occurred
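The procedure above can be sketched as a small loop. Everything here is a simplified illustration under stated assumptions. The `WORKFLOW` dict and `simulate` function are hypothetical, conditions are restricted to probabilities, and transitions are tried in listed order. Non-end states are assumed to have a duration of at least 1, and utilities, resources, and transition durations are omitted.

```python
import random

# Hypothetical workflow: type, duration, and (probability, destination) pairs.
WORKFLOW = {
    "admitted": {"type": "start", "duration": 1, "transitions": [(1.0, "treated")]},
    "treated": {
        "type": "normal",
        "duration": 2,
        "transitions": [(0.5, "treated"), (1.0, "discharged")],
    },
    "discharged": {"type": "end", "duration": 0, "transitions": []},
}


def simulate(start_timesteps, max_timesteps, seed=0):
    """Return each patient's history as a list of (timestep, state) pairs."""
    rng = random.Random(seed)
    start_state = next(n for n, s in WORKFLOW.items() if s["type"] == "start")
    current = {}  # patient id -> current state name
    wait = {}     # patient id -> timesteps left in the current state
    history = {pid: [] for pid in start_timesteps}

    def enter(pid, state, t):
        current[pid] = state
        wait[pid] = WORKFLOW[state]["duration"]
        history[pid].append((t, state))

    for t in range(max_timesteps):  # premature termination at max_timesteps
        for pid, start_t in start_timesteps.items():
            if t < start_t:
                continue  # patient has not started yet
            if pid not in current:
                enter(pid, start_state, t)
                continue
            if WORKFLOW[current[pid]]["type"] == "end":
                continue  # journey is complete
            wait[pid] -= 1
            if wait[pid] > 0:
                continue  # still waiting in the current state
            # Take the first transition whose probabilistic condition holds.
            for probability, destination in WORKFLOW[current[pid]]["transitions"]:
                if rng.random() < probability:
                    enter(pid, destination, t)
                    break
    return history


histories = simulate({"p1": 0, "p2": 3}, max_timesteps=200)
```

Calling `simulate` with a small `max_timesteps` shows the second stopping condition: patients whose start timestep has not been reached end up with an empty history.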