API Reference

The API is organized as follows:

  • Config: Configuration for the simulation. This Config class can be used as an alternative to specifying a YAML config file.

  • Models: Models used by the simulation engine. Namely, Patient, State, Transition, History, and Utility.

  • Simulation: Simulation engine. This Simulation class is responsible for running patients through the workflow (i.e. running the actual simulation).

  • Run: Helper functions for running a set of simulations.

  • Draw: Helper functions for drawing the workflow.

  • Parse: Helper functions for parsing the YAML config file.


Config

Python Pydantic models for defining an APLUS simulation configuration.

class aplusml.config.Config(*, metadata: ConfigMetadata, variables: Dict[str, ConfigVariable] = {}, states: Dict[str, ConfigState] = {})[source]

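Specification for an APLUS simulation. A complete specification provides all three fields (metadata, variables, and states), although the signature gives variables and states an empty-dictionary default, so only metadata must be passed explicitly.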

Each field is a Pydantic model, as defined in this API.

Instead of specifying this Config object directly, you can use the Simulation.create_from_yaml method to load a YAML file that follows the schema in Configuration.

Use via:

config = Config(
    metadata=ConfigMetadata(...),
    variables={"variable_id": ConfigVariable(...)},  # maps variable IDs to variables
    states={"state_id": ConfigState(...)},  # maps state IDs to states
)
simulation = aplusml.Simulation.create_from_config(config)

Parameters:
  • metadata (ConfigMetadata) – Metadata section of config.

  • variables (Dict[str, ConfigVariable]) – Variables section of config.

  • states (Dict[str, ConfigState]) – States section of config.

is_valid() bool[source]

Return TRUE if the Config is valid, FALSE otherwise.

class aplusml.config.ConfigMetadata(*, name: str | None = None, path_to_properties: str | None = None, properties_col_for_patient_id: str | None = None, patient_sort_preference_property: ConfigPatientSortPreferenceProperty | None = None)[source]

Metadata section of config.

Parameters:
  • name (Optional[str]) – Name of the simulation.

  • path_to_properties (Optional[str]) – Path to CSV file where each row is a patient, each column is a property. This is optional – if you don’t have a patient property CSV, you can leave this as None. Note: Only properties explicitly enumerated in the ‘variables’ config section will be imported.

  • properties_col_for_patient_id (Optional[str]) – Column name in the CSV that contains unique patient IDs.

  • patient_sort_preference_property (Optional[ConfigPatientSortPreferenceProperty]) – Property to sort patients by when prioritizing allocation of a finite resource.

is_valid() bool[source]

Return TRUE if the ConfigMetadata is valid, FALSE otherwise.

class aplusml.config.ConfigPatientSortPreferenceProperty(*, variable: str, is_ascending: bool = True)[source]

Patient sort preference property section of config.

Parameters:
  • variable (str) – Name of a property (must be defined in the variables config section) that will be used to sort patients when prioritizing allocation of a finite resource

  • is_ascending (bool) – True = ascending order, False = descending. Default: True

class aplusml.config.ConfigState(*, type: Literal['start', 'end', 'intermediate'] = 'intermediate', label: str | None = None, transitions: List[ConfigTransition] = [], duration: int = 0, utilities: str | int | float | bool | List[ConfigUtility] = [], resource_deltas: Dict[str, float] = {})[source]

State section of config – i.e. a dictionary mapping State IDs to States.

Parameters:
  • type (str) – Whether the state is a start, end, or intermediate state within the workflow. Default: “intermediate”.

  • label (Optional[str]) – Human-readable label for the state. Default: value of key.

  • transitions (List[ConfigTransition]) – List of possible state transitions.

  • duration (int) – Number of timesteps to wait before transitions are evaluated. Default: 0

  • utilities (Union[str, int, float, bool, List[ConfigUtility]]) – If str, int, float, or bool, it’s evaluated as a Python expression. Default: [].

  • resource_deltas (Dict[str, float]) – Changes to resource levels from entering this state. Default: {}. [key] = name of a resource defined in variables. [value] = how much to change each resource level AS SOON AS this state is hit

is_valid(id: str) bool[source]

Return TRUE if the ConfigState is valid, FALSE otherwise.

class aplusml.config.ConfigTransition(*, dest: str, label: str | None = '', if_: str | bool | None = None, prob: int | float | str | None = None, duration: int = 0, utilities: str | int | float | bool | List[ConfigUtility] = [], resource_deltas: Dict[str, float] = {})[source]

Transition within a State.

The transitions of a state must follow one of three patterns:

  • All transitions have an ‘if’ condition (if the last transition has no ‘if’, it defaults to always TRUE)

  • All transitions have a ‘prob’ condition (if the last transition has no ‘prob’, it defaults to 1 - (sum of the other probs))

  • The first set of transitions have an ‘if’ condition, and the remaining transitions have a ‘prob’
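The default rule for ‘prob’ can be sketched in plain Python. This is an illustrative sketch rather than APLUS's own implementation: resolve_probs is a hypothetical helper, and it works on a bare list of ‘prob’ values instead of ConfigTransition objects.

```python
from typing import List, Optional


def resolve_probs(probs: List[Optional[float]]) -> List[float]:
    """Fill in a missing final probability as 1 - (sum of the others).

    `probs` holds the 'prob' value of each probabilistic transition, in
    order; only the last entry may be None.
    """
    if any(p is None for p in probs[:-1]):
        raise ValueError("only the last transition may omit 'prob'")
    known = [p for p in probs if p is not None]
    if probs and probs[-1] is None:
        remainder = 1.0 - sum(known)
        if remainder < 0:
            raise ValueError("probabilities sum to more than 1")
        return known + [remainder]
    return list(known)


print(resolve_probs([0.25, 0.5, None]))  # [0.25, 0.5, 0.25]
```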

Parameters:
  • dest (str) – ID of the destination state.

  • label (Optional[str]) – Human-readable label for the transition. Default: “”.

  • if (Optional[Union[str, bool]]) – A Python expression. If it evaluates to TRUE, then the transition is taken.

  • prob (Optional[Union[str, float, int]]) – Probability of the transition.

  • duration (int) – Number of timesteps to wait before transitions are evaluated. Default: 0

  • utilities (Union[str, int, float, bool, List[ConfigUtility]]) – If str, int, float, or bool, it’s evaluated as a Python expression. Default: [].

  • resource_deltas (Dict[str, float]) – Changes to resource levels from taking this transition. Default: {}. [key] = name of a resource defined in variables. [value] = how much to change each resource level AS SOON AS this transition is taken

is_valid(state_id: str) bool[source]

Return TRUE if the ConfigTransition is valid, FALSE otherwise.

class aplusml.config.ConfigUtility(*, value: int | float | str | None = None, if_: str | bool | None = None, unit: str = '')[source]

Utility within a State or Transition.

Parameters:
  • value – If str, it’s evaluated as a Python expression.

  • if – A Python expression. If it evaluates to TRUE, then this utility’s value is applied. Note: These ‘if’ conditions are not mutually exclusive (i.e. if multiple conditions within the same State evaluate to TRUE, then their values are simply summed together)

  • unit – Measurement unit. Default: ‘’.
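Because ‘if’ conditions on utilities are not mutually exclusive, every utility whose condition holds contributes to the total. A minimal sketch of that behavior follows. It assumes totals are kept per unit, and it uses hypothetical (value, unit, if) tuples and a hypothetical sum_applicable_utilities helper rather than ConfigUtility objects.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional, Tuple

# Each utility is (value_expression, unit, if_expression_or_None).
UtilitySpec = Tuple[str, str, Optional[str]]


def sum_applicable_utilities(
    utilities: List[UtilitySpec], variables: Dict[str, Any]
) -> Dict[str, float]:
    """Sum, per unit, the values of all utilities whose condition is true."""
    totals: Dict[str, float] = defaultdict(float)
    for value_expr, unit, if_expr in utilities:
        # A utility with no condition always applies.
        if if_expr is None or eval(if_expr, {}, dict(variables)):
            totals[unit] += float(eval(value_expr, {}, dict(variables)))
    return dict(totals)


utilities = [
    ("100", "USD", "y_hat > 0.5"),
    ("50", "USD", "age >= 65"),
    ("0.1", "QALY", "y_hat > 0.9"),
]
print(sum_applicable_utilities(utilities, {"y_hat": 0.7, "age": 70}))
# {'USD': 150.0}
```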

is_valid(state_id: str) bool[source]

Return TRUE if the ConfigUtility is valid, FALSE otherwise.

class aplusml.config.ConfigVariable(*, type: Literal['scalar', 'resource', 'property', 'simulation'] = 'scalar', value: int | float | bool | str | list | dict | set | None = None, init_amount: int | None = None, max_amount: int | None = None, refill_amount: int | None = None, refill_duration: int | None = None, column: str | None = None, distribution: Literal['bernoulli', 'exponential', 'binomial', 'normal', 'poisson', 'uniform'] | None = None, mean: int | float | None = None, std: int | float | None = None, start: int | float | None = None, end: int | float | None = None)[source]

Variable section of config – i.e. a dictionary mapping variable IDs to variables.

Parameters:
  • type (str) –

    Type of variable. Must be one of: ‘scalar’, ‘resource’, ‘property’, ‘simulation’.

    • scalar: A scalar value that is shared across all patients. It can be used to model things like the sensitivity of a screening test, the prevalence of a disease, etc. If set, then value must be specified.

    • resource: A finite resource that is shared across all patients. It can be decremented, incremented, and reset by the simulation. It can be used to model things like hospital beds, lab capacity, etc. If set, then init_amount, max_amount, refill_amount, and refill_duration must be specified.

    • property: A property that is unique to each patient (i.e. each patient may have a different value for this property). It can be used to model things like the age of a patient, the gender of a patient, etc. If set, then column (if loaded from a CSV file), value (if specified as a constant), or distribution (if randomly sampled) must be specified.

    • simulation: A simulation variable that is set and tracked automatically by APLUS. Must be one of: ‘time_left_in_sim’, ‘time_already_in_sim’, ‘sim_current_timestep’.

  • value (Optional[Union[int, float, bool, str, list, dict, set]]) – Scalar value. Must be a valid Python type. Use ‘!!set’ tag for sets.

  • init_amount (Optional[int]) – Initial amount of the resource.

  • max_amount (Optional[int]) – Maximum amount of resource allowed.

  • refill_amount (Optional[int]) – Amount added per refill.

  • refill_duration (Optional[int]) – Time interval between refills.

  • column (Optional[str]) – If this property is loaded from a CSV file, use the column parameter to specify the column name (e.g. ‘y’ or ‘y_hat_dl’). Each row of the CSV will be a patient, and the value of this property for each patient will be the value of the column in the CSV file.

  • distribution (Optional[VALID_DISTRIBUTION_TYPES]) – If randomly sampled, specify the distribution type.

  • mean (Optional[Union[int, float]]) – Mean value for distribution.

  • std (Optional[Union[int, float]]) – Standard deviation.

  • start (Optional[Union[int, float]]) – Minimum value.

  • end (Optional[Union[int, float]]) – Maximum value.
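To make the four resource fields concrete, here is a small sketch of how a resource level might evolve. The refill rule (add refill_amount every refill_duration timesteps, capped at max_amount) is an assumption based on the field descriptions, and SketchResource is a hypothetical stand-in, not an APLUS class.

```python
class SketchResource:
    """Hypothetical stand-in for a 'resource' variable (not the APLUS class)."""

    def __init__(self, init_amount: int, max_amount: int,
                 refill_amount: int, refill_duration: int) -> None:
        self.level = init_amount
        self.max_amount = max_amount
        self.refill_amount = refill_amount
        self.refill_duration = refill_duration
        self.last_refill_timestep = 0

    def apply_delta(self, delta: int) -> None:
        # Mirrors a resource_deltas entry such as {'MRI': -1}.
        self.level += delta

    def maybe_refill(self, timestep: int) -> None:
        # Assumed rule: refill once refill_duration timesteps have elapsed,
        # never exceeding max_amount.
        if timestep - self.last_refill_timestep >= self.refill_duration:
            self.level = min(self.max_amount, self.level + self.refill_amount)
            self.last_refill_timestep = timestep


mri = SketchResource(init_amount=2, max_amount=3, refill_amount=2, refill_duration=5)
mri.apply_delta(-1)
mri.apply_delta(-1)
mri.maybe_refill(timestep=3)
print(mri.level)  # 0 (no refill yet)
mri.maybe_refill(timestep=5)
print(mri.level)  # 2
mri.maybe_refill(timestep=10)
print(mri.level)  # 3 (capped at max_amount)
```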

is_valid(id: str) bool[source]

Return TRUE if the ConfigVariable is valid, FALSE otherwise.


Models

Classes used for modeling components of an APLUS simulation – namely: Utility, Transition, State, History, and Patient.

class aplusml.models.History(current_timestep: int, state_id: str, transition_idx: int, state_utility_idxs: List[int], transition_utility_idxs: List[int], state_utility_vals: List[float], transition_utility_vals: List[float], sim_variables: Dict)[source]

The history of a patient’s states and transitions.

current_timestep

The current timestep.

Type:

int

state_id

The ID of the current state.

Type:

str

transition_idx

The index of the current transition.

Type:

int

state_utility_idxs

The indices of the utilities of the current state.

Type:

List[int]

transition_utility_idxs

The indices of the utilities of the current transition.

Type:

List[int]

state_utility_vals

The evaluated utility values of the current state.

Type:

List[float]

transition_utility_vals

The evaluated utility values of the current transition.

Type:

List[float]

__init__(current_timestep: int, state_id: str, transition_idx: int, state_utility_idxs: List[int], transition_utility_idxs: List[int], state_utility_vals: List[float], transition_utility_vals: List[float], sim_variables: Dict)[source]
Parameters:
  • current_timestep (int) – The current timestep. Example: 1

  • state_id (str) – The ID of the current state. Example: ‘state1’

  • transition_idx (int) – The index of the current transition. Example: 0

  • state_utility_idxs (List[int]) – The indices of the utilities of the current state. Example: [0]

  • transition_utility_idxs (List[int]) – The indices of the utilities of the current transition. Example: [0]

  • state_utility_vals (List[float]) – The evaluated utility values of the current state. Example: [100000]

  • transition_utility_vals (List[float]) – The evaluated utility values of the current transition. Example: [100000]

  • sim_variables (Dict) – The variables of the simulation. Example: {'y_hat': 0.5}

class aplusml.models.Patient(id: str, start_timestep: int, properties: dict | None = None)[source]

A patient in the simulation.

id

The ID of the patient. IMPORTANT: Must be unique across all patients.

Type:

str

start_timestep

The start timestep of the patient, i.e. at which timestep in the simulation this patient starts their workflow.

Type:

int

properties

The properties of the patient.

Type:

dict

history

The history of the patient, i.e. all past states, transitions, and utilities the patient has experienced.

Type:

List[History]

current_state

The ID of the current state the patient is in.

Type:

str

__init__(id: str, start_timestep: int, properties: dict | None = None)[source]
Parameters:
  • id (str) – The ID of the patient. Example: patient1

  • start_timestep (int) – The start timestep of the patient. Example: 1

  • properties (dict, optional) – The properties of the patient. Example: {'y_hat': 0.5}

get_state_history()[source]

Get the state history of this patient. Returns a list of state IDs (in chronological order) that the patient has visited.

get_sum_utilities(simulation: aplusml.sim.Simulation) Dict[str, float][source]

Returns a dictionary of the sum of the utilities of the patient’s history.

Parameters:

simulation (Simulation) – The simulation.

Returns:

A dictionary where each [key] is a unit, and each [value] is the sum of the utilities of the patient’s history for that unit. Example: {'USD': 100000}

Return type:

Dict[str, float]
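The per-unit aggregation can be pictured as follows. This sketch assumes that each History entry's utility indices point into the utilities of the corresponding state, which would explain why the Simulation is passed in. The dictionaries below are simplified, hypothetical stand-ins for History and State objects, and transition utilities are left out for brevity.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical stand-in: the unit of each state's utilities, by position.
state_utility_units: Dict[str, List[str]] = {
    "start": ["USD"],
    "treat": ["USD", "QALY"],
}

# Simplified history entries holding only the fields used below.
history = [
    {"state_id": "start", "state_utility_idxs": [0], "state_utility_vals": [-100.0]},
    {"state_id": "treat", "state_utility_idxs": [0, 1], "state_utility_vals": [-2500.0, 0.5]},
]


def sum_utilities(entries: List[dict], units_by_state: Dict[str, List[str]]) -> Dict[str, float]:
    """Add up evaluated utility values, grouped by their unit."""
    totals: Dict[str, float] = defaultdict(float)
    for entry in entries:
        units = units_by_state[entry["state_id"]]
        for idx, val in zip(entry["state_utility_idxs"], entry["state_utility_vals"]):
            totals[units[idx]] += val
    return dict(totals)


print(sum_utilities(history, state_utility_units))
# {'USD': -2600.0, 'QALY': 0.5}
```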

repr_state_history(is_show_timesteps: bool = False)[source]

Get the state history in a human-readable format.

Parameters:

is_show_timesteps (bool, optional) – If TRUE, then show the timestep of each state transition. Defaults to FALSE.

class aplusml.models.State(id: str, label: str, type: str, duration: int, utilities: List[Utility], transitions: List[Transition], resource_deltas: Dict[str, float])[source]

A state in the workflow.

id

The ID of the state. Must be unique across all states.

Type:

str

label

The label of the state.

Type:

str

type

The type of the state. Must be one of: ‘start’, ‘intermediate’, ‘end’

Type:

str

duration

The duration of the state. Must be a non-negative integer.

Type:

int

utilities

The utilities of the state.

Type:

List[Utility]

transitions

The transitions of the state.

Type:

List[Transition]

__init__(id: str, label: str, type: str, duration: int, utilities: List[Utility], transitions: List[Transition], resource_deltas: Dict[str, float])[source]
Parameters:
  • id (str) – The ID of the state.

  • label (str) – The label of the state.

  • type (str) – The type of the state. Must be one of: ‘start’, ‘intermediate’, ‘end’

  • duration (int) – The duration of the state. Example: 1

  • utilities (List[Utility]) – The utilities of the state. Example: [Utility(value='100000', unit='USD')]

  • transitions (List[Transition]) – The transitions of the state. Example: [Transition(dest='state2', label='To State 2', duration=1, utilities=[Utility(value='100000', unit='USD')], resource_deltas={'MRI': -1})]

  • resource_deltas (Dict[str, float]) – The resource deltas of the state. Example: {'MRI': -1}

print()[source]

Print the state in a human-readable format.

serialize()[source]

Serialize the state into a dictionary.

class aplusml.models.Transition(dest: str, label: str, duration: int, utilities: List[Utility], resource_deltas: Dict[str, float], if_: str | bool | None = None, prob: str | float | None = None)[source]

A transition between states.

dest

The destination state.

Type:

str

label

The label of the transition.

Type:

str

duration

The duration of the transition.

Type:

int

utilities

The utilities of the transition.

Type:

List[Utility]

resource_deltas

The resource deltas of the transition.

Type:

Dict[str, float]

if_

The condition for the transition, specified as a Python expression.

Type:

Optional[Union[str, bool]]

prob

The probability of the transition.

Type:

Optional[Union[str, float]]

__init__(dest: str, label: str, duration: int, utilities: List[Utility], resource_deltas: Dict[str, float], if_: str | bool | None = None, prob: str | float | None = None)[source]
Parameters:
  • dest (str) – The destination state.

  • label (str) – The label of the transition.

  • duration (int) – The duration of the transition. Example: 1

  • utilities (List[Utility]) – The utilities of the transition. Example: [Utility(value='100000', unit='USD')]

  • resource_deltas (Dict[str, float]) – The resource deltas of the transition. Example: {'MRI': -1}

  • if (Union[str, bool], optional) – The condition for the transition, specified as a Python expression. Defaults to None. Example: ‘y_hat > 0.5’

  • prob (Union[str, float], optional) – The probability of the transition. Defaults to None. Example: 0.5

get_variables_in_conditional() List[str][source]

Returns a list of variables involved in the conditional expression.

is_conditional_if() bool[source]

Returns True if the transition is conditional on a Python expression.

is_conditional_prob() bool[source]

Returns True if the transition is probabilistic.

print()[source]

Print the transition in a human-readable format.

serialize()[source]

Serialize the transition into a dictionary.

class aplusml.models.Utility(value: str, unit: str = '', if_: str | None = None)[source]

A utility is a value that is associated with being in a state or undergoing a transition.

value

The value of the utility. Example: ‘100000’

Type:

str

unit

The unit of the utility. Defaults to ‘’. Example: ‘USD’, ‘days’, ‘kg’, ‘cm’, etc.

Type:

str, optional

if_

The condition for the utility, specified as a Python expression. Defaults to None. Example: ‘y_hat > 0.5’

Type:

str, optional

__init__(value: str, unit: str = '', if_: str | None = None)[source]

A utility is a value that is associated with being in a state or undergoing a transition.

Parameters:
  • value (str) – The value of the utility. Example: ‘100000’

  • unit (str, optional) – The unit of the utility. Defaults to ‘’. Example: ‘USD’, ‘days’, ‘kg’, ‘cm’, etc.

  • if (str, optional) – The condition for the utility, specified as a Python expression. Defaults to None. Example: ‘y_hat > 0.5’

is_conditional_if() bool[source]

Returns True if the utility is conditional on a Python expression.


Simulation

Core APLUS simulation engine which progresses patients through a given workflow

class aplusml.sim.Simulation[source]

The core APLUS simulation engine which progresses patients through a given workflow.

This class manages the entire simulation process including:

  • Patient state transitions and history tracking

  • Variable evaluation and management

  • Resource allocation and replenishment

  • Utility calculations

  • Workflow visualization

The simulation operates on a timestep basis, processing patients through states and transitions based on conditional and probabilistic rules defined in the workflow configuration.

metadata

Configuration metadata for the simulation

Type:

dict

variables

Variables used in the simulation, keyed by ID

Type:

Dict[str, Dict]

variable_history

History of variable values over time

Type:

Dict[str, List[Tuple[int, Any]]]

states

States in the workflow, keyed by ID

Type:

Dict[str, State]

current_timestep

Current simulation timestep

Type:

int

__init__()[source]

Initializes the simulation with default values

classmethod create_from_config(config: Config, path_to_patient_properties: str | None = None) aplusml.sim.Simulation[source]

Create a Simulation object from a Config object.

If path_to_patient_properties is provided, then it overwrites the current Metadata section’s path_to_properties key.

Parameters:
  • config (Config) – A Python Config object

  • path_to_patient_properties (str) – Path to patient properties CSV file. Optional.

Returns:

A Simulation object that contains all of the metadata, variables, states, and transitions from the Config object

Return type:

Simulation

classmethod create_from_yaml(path_to_yaml: str, path_to_patient_properties: str | None = None) aplusml.sim.Simulation[source]

Create a Simulation object from YAML

If path_to_patient_properties is provided, then it overwrites the current Metadata section’s path_to_properties key.

Parameters:
  • path_to_yaml (str) – Path to YAML file

  • path_to_patient_properties (str) – Path to patient properties CSV file. Optional.

Returns:

A Simulation object that contains all of the metadata, variables, states, and transitions from the YAML file

Return type:

Simulation

create_patients_for_simulation(patients: List[Patient], func_match_patient_to_property_column: Callable | None = None, is_overwrite_existing_properties: bool = True, random_seed: int = 0) List[Patient][source]

Creates a deep copy of patients and initializes their properties.

  1. Deep copies patients using pickle

  2. Sorts patients by ID

  3. Initializes patient properties from:
    • Constants

    • CSV file data (using matching function)

    • Random distributions

Parameters:
  • patients (List[Patient]) – Template patients to copy and initialize

  • func_match_patient_to_property_column (Callable, optional) – Function to match patients to rows in properties CSV. Takes (patient_id, random_idx, df, column). Required if using CSV properties without ID column. Defaults to None.

  • is_overwrite_existing_properties (bool, optional) – If TRUE, then overwrite each patient’s existing properties with their default values, as specified in self.variables. If FALSE, then only initialize properties that are not already set. Defaults to True.

  • random_seed (int, optional) – Random seed for reproducibility. Defaults to 0.

Returns:

New list of initialized patients

Return type:

List[Patient]

Raises:

ValueError – If property configuration is invalid or required matching function missing
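The copy, sort, and initialize steps described above can be sketched with the standard library alone. Patients are modeled here as plain dictionaries, and copy_and_init and its defaults argument are hypothetical names; the real method also handles CSV and randomly sampled properties, which this sketch omits.

```python
import pickle
from typing import Any, Dict, List


def copy_and_init(
    patients: List[Dict[str, Any]],
    defaults: Dict[str, Any],
    is_overwrite_existing_properties: bool = True,
) -> List[Dict[str, Any]]:
    # Round-tripping through pickle produces independent deep copies.
    copies = pickle.loads(pickle.dumps(patients))
    copies.sort(key=lambda p: p["id"])
    for patient in copies:
        for name, value in defaults.items():
            # Only fill in missing properties unless overwriting is requested.
            if is_overwrite_existing_properties or name not in patient["properties"]:
                patient["properties"][name] = value
    return copies


templates = [
    {"id": "p2", "properties": {"age": 80}},
    {"id": "p1", "properties": {}},
]
fresh = copy_and_init(templates, {"age": 50}, is_overwrite_existing_properties=False)
print([p["id"] for p in fresh])                 # ['p1', 'p2']
print([p["properties"]["age"] for p in fresh])  # [50, 80]
print(templates[1]["properties"])               # {} (templates are untouched)
```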

draw_workflow_diagram(path_to_file: str | None = None, is_display: bool = True, figsize: Tuple[int, int] = (20, 20))[source]

Visualizes the workflow as a directed graph using pydot.

Creates a visual diagram showing:

  • States as nodes

  • Transitions as edges

  • Transition conditions and probabilities

  • State/transition utilities and durations

Parameters:
  • path_to_file (str, optional) – Path to save diagram. Must include supported file extension. If None, diagram is not saved. Defaults to None.

  • is_display (bool, optional) – Whether to display diagram (useful for Jupyter). Defaults to True.

  • figsize (Tuple[int, int], optional) – Figure size for matplotlib. Defaults to (20, 20).

Raises:

ValueError – If path_to_file has unsupported file extension

evaluate_duration(patient: Patient, duration: str, variables: Dict[str, Any]) int[source]

Evaluates how long a patient should remain in a state or transition.

Duration can be a constant or an expression using available variables.

Parameters:
  • patient (Patient) – The patient to evaluate duration for

  • duration (str) – The duration expression to evaluate

  • variables (Dict[str, Any]) – Variables available for duration evaluation

Returns:

Number of timesteps the duration should last

Return type:

int

evaluate_expression(patient: Patient, expression: Any, variables: dict, expression_compiled: CodeType | None = None) Any[source]

Evaluates a Python expression in the context of a patient and variables.

Handles evaluation of:

  • Literal values (bool, int, float)

  • String expressions using Python’s eval()

  • Pre-compiled expressions for better performance

The expression can reference any variables passed in the variables dict.

Parameters:
  • patient (Patient) – The patient context for evaluation

  • expression (Any) – The expression to evaluate - can be literal value or string expression

  • variables (dict) – Dictionary of variables available during evaluation

  • expression_compiled (CodeType, optional) – Pre-compiled version of the expression

Returns:

The result of evaluating the expression

Return type:

Any

Raises:

NameError – If expression references undefined variables
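The three cases listed above (literals, string expressions, and pre-compiled expressions) can be illustrated with Python's built-in compile() and eval(). This is a simplified sketch: the evaluate function below is hypothetical, ignores the patient argument, and is not the method's actual body.

```python
from types import CodeType
from typing import Any, Dict, Optional


def evaluate(expression: Any, variables: Dict[str, Any],
             compiled: Optional[CodeType] = None) -> Any:
    # Literal values need no evaluation.
    if isinstance(expression, (bool, int, float)):
        return expression
    # Reuse a pre-compiled code object when one is supplied.
    code = compiled if compiled is not None else compile(expression, "<expr>", "eval")
    return eval(code, {}, dict(variables))


variables = {"y_hat": 0.7, "threshold": 0.5}
print(evaluate(True, variables))                      # True
print(evaluate("y_hat > threshold", variables))       # True
precompiled = compile("y_hat * 2", "<expr>", "eval")
print(evaluate("y_hat * 2", variables, precompiled))  # 1.4
```

Compiling once and reusing the code object avoids re-parsing the same string for every patient and timestep. Since eval() executes arbitrary Python, expressions should only come from trusted configuration files.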

evaluate_transition_if(patient: Patient, transition: Transition, variables: dict) bool[source]

Evaluates whether a conditional transition should be taken.

Checks if the transition has an ‘if’ condition and evaluates it in the context of the given patient and variables. If there is no condition, returns True.

Parameters:
  • patient (Patient) – The patient attempting the transition

  • transition (Transition) – The transition to evaluate

  • variables (dict) – Variables available for condition evaluation

Returns:

True if transition condition is met or no condition exists, False otherwise

Return type:

bool

evaluate_transition_prob(patient: Patient, transition: Transition, variables: Dict[str, Any]) float[source]

Evaluates the probability of a probabilistic transition.

For transitions with a ‘prob’ value, evaluates the probability expression. For transitions without a ‘prob’, returns None (the probability will be set to the remaining probability).

Parameters:
  • patient (Patient) – The patient attempting the transition

  • transition (Transition) – The transition to evaluate

  • variables (Dict[str, Any]) – Variables available for probability evaluation

Returns:

Evaluated probability value between 0 and 1, or None if no probability specified

Return type:

float

evaluate_utility_if(patient: Patient, utility: Utility, variables: Dict[str, Any]) bool[source]

Evaluates whether a utility should be applied based on its condition.

Similar to evaluate_transition_if, but for utility conditions. Returns True if utility has no condition or if condition evaluates to True.

Parameters:
  • patient (Patient) – The patient to evaluate utility for

  • utility (Utility) – The utility to evaluate

  • variables (Dict[str, Any]) – Variables available for condition evaluation

Returns:

True if utility should be applied, False otherwise

Return type:

bool

evaluate_utility_value(patient: Patient, utility: Utility, variables: Dict[str, Any]) Any[source]

Evaluates the actual value of a utility.

Evaluates the utility’s value expression in the context of the patient and variables. The value can be a constant or an expression using available variables.

Parameters:
  • patient (Patient) – The patient context for evaluation

  • utility (Utility) – The utility whose value should be evaluated

  • variables (Dict[str, Any]) – Variables available for value evaluation

Returns:

The evaluated utility value

Return type:

Any

evaluate_variables(patient: Patient) Dict[str, Any][source]

Evaluates all variables for a given patient at the current timestep.

Processes different variable types:

  • scalar: Returns constant value

  • resource: Returns current resource level

  • property: Returns patient-specific property value

  • simulation: Returns simulation-specific values (time remaining, elapsed time, current timestep)

  • function: Executes and returns function result

Parameters:

patient (Patient) – The patient to evaluate variables for

Returns:

Dictionary mapping variable IDs to their evaluated values

Return type:

Dict[str, Any]

Raises:

AssertionError – If number of evaluated variables doesn’t match total variables

get_all_utility_units() List[str][source]

Gets all unique utility unit types used in the workflow.

Examines all utilities across all states and transitions to find unique unit types (e.g. “QALY”, “USD”).

Returns:

List of unique utility unit names

Return type:

List[str]

init_patients(patients: List[Patient])[source]

Initializes patients for simulation.

For each patient:

  • Sets initial state to ‘start’

  • Initializes empty history list

Parameters:

patients (List[Patient]) – List of patients to initialize, modified in place

init_run(random_seed: int = 0)[source]

Initializes the simulation for a new run.

Resets simulation state by:

  • Setting current timestep to 0

  • Initializing all variables

  • Setting random seeds for reproducibility

Parameters:

random_seed (int, optional) – Seed for random number generators. Defaults to 0.

init_variables(variables: Dict[str, dict])[source]

Initializes resource variables with their starting values.

Note: Modifies ‘variables’ in place

For each resource variable, sets:

  • Initial resource level from init_amount

  • Last refill timestep to 0

  • Empty history tracking list

Parameters:

variables (Dict[str, dict]) – Dictionary of variables from config, modified in place

is_valid_patients(patients: List[Patient]) Tuple[bool, str][source]

Validates a list of patients for simulation.

Currently checks:

  • All patients have unique IDs

Parameters:

patients (List[Patient]) – List of patients to validate

Returns:

(is_valid, error_message), where is_valid is True if validation passes and error_message contains details if validation fails

Return type:

Tuple[bool, str]
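A uniqueness check that returns the same (is_valid, error_message) shape might look like the sketch below. check_unique_ids is a hypothetical helper operating on bare ID strings, and the wording of the error message is an assumption.

```python
from collections import Counter
from typing import List, Tuple


def check_unique_ids(patient_ids: List[str]) -> Tuple[bool, str]:
    """Return (True, '') if all IDs are unique, else (False, details)."""
    duplicates = sorted(pid for pid, n in Counter(patient_ids).items() if n > 1)
    if duplicates:
        return False, f"Duplicate patient IDs: {duplicates}"
    return True, ""


print(check_unique_ids(["p1", "p2", "p3"]))  # (True, '')
print(check_unique_ids(["p1", "p2", "p1"]))  # (False, "Duplicate patient IDs: ['p1']")
```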

load_seismometer_patients_for_simulation(patients: List[Patient], df_patients_seismometer: pandas.DataFrame, func_match_patient_to_property_column: Callable | None = None, random_seed: int = 0) List[Patient][source]

Loads patient properties from an Epic Seismometer dataframe.

Similar to create_patients_for_simulation but loads properties from a provided dataframe instead of CSV file.

Parameters:
  • patients (List[Patient]) – Template patients to initialize

  • df_patients_seismometer (pd.DataFrame) – Dataframe containing patient properties

  • func_match_patient_to_property_column (Callable, optional) – Function to match patients to rows in dataframe. Takes (patient_id, random_idx, df, column). Required if not using ID column. Defaults to None.

  • random_seed (int, optional) – Random seed for reproducibility. Defaults to 0.

Returns:

Patients with initialized properties

Return type:

List[Patient]

Raises:

ValueError – If property configuration is invalid or required matching function missing

log(string: str)[source]

Logs a message if logging is enabled.

Format: t=<current timestep of simulation> | {string}

Parameters:

string (str) – Message to log

run(all_patients: List[Patient], max_timesteps: int | None = None, random_seed: int = 0, is_print_log: bool = False, is_print_tqdm: bool = True) List[Patient][source]

Runs the simulation by progressing all patients through the workflow.

Core simulation loop that:

  1. Processes patients in order of admission time and preference sorting

  2. Moves each patient through states based on transitions

  3. Tracks utilities and resource usage

  4. Handles patient pausing/unpausing for state/transition durations

  5. Continues until all patients finish or max timesteps reached

Performance: Takes ~3 seconds for 15,000 patients, ~10 seconds for 50,000 patients.

Parameters:
  • all_patients (List[Patient]) – All patients to simulate across all admit days. Modified in place - only history and current_state attributes are changed.

  • max_timesteps (int, optional) – Maximum timesteps to simulate. If reached, simulation stops with current_timestep = max_timesteps - 1. Defaults to None.

  • random_seed (int, optional) – Random seed for reproducibility. Defaults to 0.

  • is_print_log (bool, optional) – Whether to print debug logs. Defaults to False.

  • is_print_tqdm (bool, optional) – Whether to print a progress bar. Defaults to True.

Returns:

The patients after simulation with updated histories.

Return type:

List[Patient]

Raises:

AssertionError – If patients are invalid or simulation state becomes invalid
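The overall shape of such a timestep loop, including pausing a patient for a state's duration, is sketched below. It is deliberately minimal and rests on several assumptions: STATES is a hypothetical linear workflow with one next state each, patients are plain dictionaries, and utilities, resources, preference sorting, and transition selection are all omitted.

```python
from typing import Dict, List, Optional

# Hypothetical workflow: each state has a duration and a single next state.
STATES: Dict[str, dict] = {
    "start": {"duration": 0, "next": "triage"},
    "triage": {"duration": 2, "next": "end"},
    "end": {"duration": 0, "next": None},
}


def run_sketch(patients: List[dict], max_timesteps: Optional[int] = None) -> List[dict]:
    for p in patients:
        p["current_state"] = "start"
        p["history"] = [(p["start_timestep"], "start")]
        # Earliest timestep at which the patient may leave the current state.
        p["resume_at"] = p["start_timestep"] + STATES["start"]["duration"]
    t = 0
    while any(p["current_state"] != "end" for p in patients):
        if max_timesteps is not None and t >= max_timesteps:
            break
        for p in sorted(patients, key=lambda q: q["start_timestep"]):
            # Advance until the patient finishes or is paused in a state.
            while p["current_state"] != "end" and p["resume_at"] <= t:
                nxt = STATES[p["current_state"]]["next"]
                p["current_state"] = nxt
                p["history"].append((t, nxt))
                p["resume_at"] = t + STATES[nxt]["duration"]
        t += 1
    return patients


done = run_sketch([{"id": "p1", "start_timestep": 0}, {"id": "p2", "start_timestep": 1}])
for p in done:
    print(p["id"], p["history"])
# p1 [(0, 'start'), (0, 'triage'), (2, 'end')]
# p2 [(1, 'start'), (1, 'triage'), (3, 'end')]
```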

select_transition(patient: Patient, transitions: List[Transition], variables: Dict[str, Any]) int[source]

Determines which transition a patient should take from their current state.

First evaluates all conditional (‘if’) transitions in order. If none are true, then evaluates probabilistic transitions. For probabilistic transitions, ensures probabilities sum to 1 and handles default probability case.

Parameters:
  • patient (Patient) – The patient attempting to transition

  • transitions (List[Transition]) – Available transitions from current state

  • variables (Dict[str, Any]) – Variables available for transition evaluation

Returns:

Index of selected transition, or None if no valid transition found

Return type:

int

Raises:

AssertionError – If no valid transition can be selected
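The two-stage selection described above can be sketched as follows. This is a minimal illustration, not the aplusml implementation: SketchTransition and select_transition_sketch are hypothetical names, and the assumption that a transition without an explicit probability receives an equal share of the remaining probability mass is ours.

```python
import random
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class SketchTransition:
    """Hypothetical stand-in for Transition (not the aplusml class)."""
    dest: str
    condition: Optional[Callable[[Dict[str, Any]], bool]] = None  # 'if' transition
    prob: Optional[float] = None  # None = "default" share (assumed semantics)


def select_transition_sketch(
    transitions: List[SketchTransition],
    variables: Dict[str, Any],
    rng: random.Random,
) -> int:
    # Stage 1: conditional transitions, in listed order; the first true one wins.
    for idx, t in enumerate(transitions):
        if t.condition is not None and t.condition(variables):
            return idx

    # Stage 2: fall back to the probabilistic transitions.
    prob_idxs = [i for i, t in enumerate(transitions) if t.condition is None]
    assert prob_idxs, "No valid transition can be selected"
    explicit = sum(transitions[i].prob or 0.0 for i in prob_idxs)
    n_default = sum(1 for i in prob_idxs if transitions[i].prob is None)
    if n_default:
        # Assumption: defaults split whatever probability mass is left over.
        assert explicit <= 1.0 + 1e-9, "Explicit probabilities exceed 1"
        default_share = max(0.0, (1.0 - explicit) / n_default)
    else:
        assert abs(explicit - 1.0) < 1e-9, "Probabilities must sum to 1"
        default_share = 0.0
    weights = [
        default_share if transitions[i].prob is None else transitions[i].prob
        for i in prob_idxs
    ]
    return rng.choices(prob_idxs, weights=weights, k=1)[0]
```

Passing an explicit random.Random instance keeps the draw reproducible, in the same spirit as the random_seed argument of run().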

aplusml.sim.get_unit_utility_baselines(patients: List[Patient], utilities: Dict[str, float], y_true_column_name: str = 'ground_truth') Dict[str, float][source]

Calculates baseline utility metrics for different treatment strategies.

Computes average per-patient utility for:
  • Treat all patients

  • Treat no patients

  • Perfect treatment (treat only true positives)

Parameters:
  • patients (List[Patient]) – Patients to analyze

  • utilities (Dict[str, float]) – Utility values for TP/FP/FN/TN outcomes

  • y_true_column_name (str, optional) – Patient property containing ground truth. Defaults to ‘ground_truth’.

Returns:

Average utilities for ‘all’, ‘none’, and ‘perfect’ strategies

Return type:

Dict[str, float]
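As a rough illustration of the three baselines, the sketch below works on a plain list of binary ground-truth labels rather than Patient objects. The function name and the lowercase utility keys ('tp', 'fp', 'fn', 'tn') are assumptions made for this example.

```python
from typing import Dict, List


def unit_utility_baselines_sketch(
    y_true: List[int], utilities: Dict[str, float]
) -> Dict[str, float]:
    """Average per-patient utility under three fixed treatment strategies."""
    n = len(y_true)
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = n - n_pos
    return {
        # Treat everyone: positives become TPs, negatives become FPs.
        "all": (n_pos * utilities["tp"] + n_neg * utilities["fp"]) / n,
        # Treat no one: positives become FNs, negatives become TNs.
        "none": (n_pos * utilities["fn"] + n_neg * utilities["tn"]) / n,
        # Perfect treatment: positives become TPs, negatives become TNs.
        "perfect": (n_pos * utilities["tp"] + n_neg * utilities["tn"]) / n,
    }
```

For example, with one positive among four patients and utilities tp=10, fp=-2, fn=-10, tn=0, the sketch returns 1.0 for ‘all’, -2.5 for ‘none’, and 2.5 for ‘perfect’.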

aplusml.sim.log_patients(simulation: Simulation, patients: List[Patient])[source]

Prints detailed debug information about patients.

For each patient, prints:
  • ID and start timestep

  • Properties

  • State history

  • Sum of utilities

Parameters:
  • simulation (Simulation) – Simulation context

  • patients (List[Patient]) – Patients to log

aplusml.sim.sort_patient_by_preference(patients: List[Patient], property_to_sort_by: str | None = None, is_ascending: bool = True) List[int][source]

Returns indices that would sort patients by specified property.

Can sort by:
  • Direct Patient attributes (id, start_timestep)

  • Patient properties dictionary values

Parameters:
  • patients (List[Patient]) – Patients to sort

  • property_to_sort_by (str, optional) – Property name to sort by. Defaults to None.

  • is_ascending (bool, optional) – Sort order. Defaults to True.

Returns:

Indices that would sort the patients list

Return type:

List[int]
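Returning indices rather than a reordered list lets the caller keep the original list intact. A minimal sketch of that idea, using a hypothetical SketchPatient stand-in and assuming that passing None leaves the original order unchanged:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SketchPatient:
    """Hypothetical stand-in for Patient (not the aplusml class)."""
    id: str
    start_timestep: int
    properties: Dict[str, Any] = field(default_factory=dict)


def sort_indices_sketch(
    patients: List[SketchPatient],
    property_to_sort_by: Optional[str] = None,
    is_ascending: bool = True,
) -> List[int]:
    if property_to_sort_by is None:
        # Assumption: no sort property means "keep the current order".
        return list(range(len(patients)))

    def sort_key(i: int) -> Any:
        p = patients[i]
        # Prefer a direct attribute (id, start_timestep), else the properties dict.
        if hasattr(p, property_to_sort_by):
            return getattr(p, property_to_sort_by)
        return p.properties[property_to_sort_by]

    return sorted(range(len(patients)), key=sort_key, reverse=not is_ascending)
```

The returned list can then be applied as `[patients[i] for i in indices]`.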


Run

Wrapper functions around sim.py to help run/track multiple simulations

aplusml.run.run_test(simulation: Simulation, all_patients: list[Patient], labels: list, keys2values: list[dict[dict]], df: pandas.DataFrame | None = None, func_run_test: Callable | None = None, func_match_patient_to_property_column: Callable | None = None, is_refresh_patients: bool = False, is_use_multi_processing: bool = False) pandas.DataFrame[source]

Runs multiple simulation tests with different variable settings.

This is the main entry point for running simulation experiments. It supports:

  1. Testing multiple configurations in parallel or serial

  2. Custom test functions (e.g. threshold testing)

  3. Patient property refreshing between runs

  4. Appending results to existing dataframes

  5. Multiprocessing for improved performance

The function pairs each label with its corresponding variable settings from keys2values and runs the simulation with those settings. Results from all runs are combined into a single dataframe.

Parameters:
  • simulation (sim.Simulation) – Base simulation object

  • all_patients (list[sim.Patient]) – List of patients to simulate

  • labels (list) – Names for each test configuration

  • keys2values (list[dict[dict]]) – List of variable settings to test. Each dict maps variable names to new values/settings for that test run.

  • df (pd.DataFrame, optional) – Existing results to append to. Defaults to None.

  • func_run_test (Callable, optional) – Custom function to run each test. Typically test_diff_thresholds. If None, runs basic simulation. Defaults to None.

  • func_match_patient_to_property_column (Callable, optional) – Function to match patients to properties in CSV. Required if refreshing patients. Defaults to None.

  • is_refresh_patients (bool, optional) – If True, recreates patients with new properties between runs. Defaults to False.

  • is_use_multi_processing (bool, optional) – If True, runs tests in parallel using available CPU cores. Defaults to False.

Returns:

Combined results from all test runs. Format depends on func_run_test, but always includes a label column identifying the test configuration.

Return type:

pd.DataFrame

References

For usage examples, see:
  • Tutorial (PAD)

aplusml.run.test_diff_thresholds(simulation: Simulation, all_patients: list[Patient], thresholds: list[float], utility_unit: str = '', positive_outcome_state_ids: list[str] = ['positive_end_state'], **kwargs) pandas.DataFrame[source]

Tests different model threshold values to find optimal cutoff point for binary predictions.

For each threshold value, runs the simulation and calculates utility metrics. The simulation must contain a model_threshold variable that is used to binarize probabilistic predictions. After testing all thresholds, sets the simulation and patients to use the threshold that maximizes mean utility.

Parameters:
  • simulation (sim.Simulation) – Simulation object containing workflow definition

  • all_patients (list[sim.Patient]) – List of patients to simulate through workflow

  • thresholds (list[float]) – List of threshold values to test (between 0 and 1)

  • utility_unit (str, optional) – Name of utility unit to optimize (e.g. 'qaly', 'usd'). Defaults to ''.

  • positive_outcome_state_ids (list[str], optional) – State IDs that represent positive outcomes (treatment administered). Used to calculate work per timestep. Defaults to ['positive_end_state'].

  • **kwargs – Additional arguments passed to simulation.run()

Returns:

Results for each threshold tested with columns

  • threshold: Threshold value tested

  • mean_utility: Mean utility achieved across all patients

  • std_utility: Standard deviation of utilities

  • sem_utility: Standard error of mean utility

  • mean_work_per_timestep: Average number of positive outcomes per timestep

Return type:

pd.DataFrame

Raises:

AssertionError – If model_threshold variable is not defined in simulation.variables
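The threshold sweep can be sketched without the simulation engine. In the version below, the full workflow simulation is replaced by a direct confusion-matrix lookup, so it only illustrates the sweep-and-pick-the-best logic. The function name, the `>=` binarization rule, and the use of the population standard deviation are assumptions of this example.

```python
import math
import statistics
from typing import Dict, List, Tuple


def sweep_thresholds_sketch(
    y_true: List[int],
    y_prob: List[float],
    thresholds: List[float],
    utilities: Dict[str, float],
) -> Tuple[List[dict], float]:
    rows = []
    for thr in thresholds:
        per_patient = []
        for y, p in zip(y_true, y_prob):
            treated = p >= thr  # assumed binarization rule
            if treated:
                key = "tp" if y == 1 else "fp"
            else:
                key = "fn" if y == 1 else "tn"
            per_patient.append(utilities[key])
        std = statistics.pstdev(per_patient)
        rows.append({
            "threshold": thr,
            "mean_utility": statistics.fmean(per_patient),
            "std_utility": std,
            "sem_utility": std / math.sqrt(len(per_patient)),
        })
    # Keep the threshold that maximizes mean utility.
    best = max(rows, key=lambda r: r["mean_utility"])
    return rows, best["threshold"]
```

In the real function each threshold triggers a complete simulation run, so utilities also reflect durations, resources, and workflow constraints that this lookup ignores.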


Draw

Functions for drawing graphs using graphviz.

This module provides utilities for creating and formatting graphviz diagrams of workflows, including HTML table generation and text escaping.

aplusml.draw._html_escape(text: str) str[source]

Escape HTML special characters for use in HTML tables in graphviz.

Replaces special characters with their HTML entity equivalents to ensure proper rendering in graphviz HTML-like labels.

Parameters:

text (str) – The input text to escape.

Returns:

The escaped text with the following replacements:

  • & → &amp;

  • < → &lt;

  • > → &gt;

  • Newline → <br align="left"/>

Return type:

str
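A minimal sketch of the replacements listed above (the function name is hypothetical). The order matters: the ampersand must be escaped first so that the entities produced by later replacements are not escaped again, and the newline is handled last so that the inserted tag keeps its angle brackets.

```python
def html_escape_sketch(text: str) -> str:
    """Escape text for use inside a graphviz HTML-like label."""
    return (
        text.replace("&", "&amp;")   # first, so later entities stay intact
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace("\n", '<br align="left"/>')  # last, so the tag is not escaped
    )
```

For instance, `html_escape_sketch("a < b & c\nd")` yields `a &lt; b &amp; c<br align="left"/>d`.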

aplusml.draw.create_node_label(title: str, id: str | None, duration: float, utilities: list, resource_deltas: dict, is_edge: bool = False) str[source]

Creates an HTML-formatted label string for use in graphviz diagrams, containing information about the node’s title, duration, utilities, and resource changes.

Parameters:
  • title (str) – Title of the node.

  • duration (float) – Duration of the node.

  • utilities (list) – List of utilities associated with the node.

  • resource_deltas (dict) – Dictionary of resource deltas associated with the node.

  • is_edge (bool, optional) – Whether the node represents an edge. Defaults to False.

Returns:

An HTML-formatted string containing the node’s label.

Return type:

str

Note

The returned string uses graphviz’s HTML-like label syntax and includes:
  • A title section with optional edge formatting

  • Duration information

  • List of utilities

  • List of resource changes


Plot

Plotting functions for visualizing simulation results.

Some quick notes:
  • For all functions, the df_preds dataframe must contain two columns – ‘y’ (ground truth) and ‘y_hat’ (predicted labels).

  • For all functions, the utilities dictionary must contain four keys – ‘tp’, ‘fp’, ‘tn’, ‘fn’ – which map to the utility values for each class.

aplusml.plot.calc_model_performance_metrics(df_preds: pandas.DataFrame, thresholds: list[float] = [0.5]) dict[source]

Returns a dictionary containing performance metrics for this model

Parameters:
  • df_preds (pd.DataFrame) – Dataframe with two columns, ‘y’ and ‘y_hat’

  • thresholds (list[float]) – Cutoffs for thresholding predictions to binary 0/1 (used for accuracy, Matthews correlation)

Returns:

Dictionary containing performance metrics

Return type:

dict

aplusml.plot.calc_pearsonr(a: numpy.ndarray, b: numpy.ndarray) float[source]

Pearson correlation coefficient between two 1d vectors

Parameters:
  • a (np.ndarray) – First vector

  • b (np.ndarray) – Second vector

Returns:

Pearson correlation coefficient

Return type:

float
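For reference, the Pearson coefficient is the covariance of the two vectors divided by the product of their standard deviations. A dependency-free sketch on plain Python lists (the function name is hypothetical, and the library itself takes NumPy arrays):

```python
import math
from typing import Sequence


def pearsonr_sketch(a: Sequence[float], b: Sequence[float]) -> float:
    n = len(a)
    assert n == len(b) and n > 1, "Vectors must have equal length > 1"
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Sums of centered products; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)
```

A constant vector has zero variance, so this sketch raises ZeroDivisionError in that case.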

aplusml.plot.calc_spearmanr(a: numpy.ndarray, b: numpy.ndarray) float[source]

Spearman correlation coefficient between two 1d vectors

Parameters:
  • a (np.ndarray) – First vector

  • b (np.ndarray) – Second vector

Returns:

Spearman correlation coefficient

Return type:

float
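The Spearman coefficient is the Pearson coefficient computed on ranks instead of raw values. The sketch below uses average ranks for ties, which is the usual convention; the function names are hypothetical.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearmanr_sketch(a: Sequence[float], b: Sequence[float]) -> float:
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return cov / math.sqrt(va * vb)
```

Because only ranks are compared, any strictly increasing relationship (linear or not) gives a coefficient of 1.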

aplusml.plot.get_df_utility_from_df_preds(df_preds: pandas.DataFrame, utilities: dict, thresholds: numpy.ndarray, is_add_0_and_1: bool = True) pandas.DataFrame[source]

Adds metrics (Utilities + TP/FP/TN/FN) to DataFrame of predictions

Parameters:
  • df_preds (pd.DataFrame) – DataFrame containing ‘y_hat’ (predictions) and ‘y’ (ground truth) columns

  • utilities (dict) – Dictionary containing utility values for each class. Must contain four keys: ‘tp’, ‘fp’, ‘tn’, ‘fn’

  • thresholds (np.ndarray) – Array of model thresholds to consider

  • is_add_0_and_1 (bool) – If TRUE, add 0 and 1 to the start/end of the thresholds array

Returns:

DataFrame that mirrors “df_preds”, but with additional columns

Return type:

pd.DataFrame

aplusml.plot.make_model_utility_plots(df_preds: pandas.DataFrame, utilities: dict, is_show: bool = False, path_to_plots_prefix: str | None = None)[source]

Run all plotting functions

Parameters:
  • df_preds (pd.DataFrame) – DataFrame with two columns, ‘y’ and ‘y_hat’

  • is_show (bool, optional) – If TRUE, show plots. Defaults to False.

  • path_to_plots_prefix (str, optional) – Prefix for path to saved plot files. Defaults to None.

aplusml.plot.plot_bar_mean_utilities(title: str, plot_avg_utilities: dict[float]) plotnine.ggplot[source]

Plots bar chart of mean utilities for different settings

Parameters:
  • title (str) – Title of the plot

  • plot_avg_utilities (dict[float]) – Dictionary of mean utilities for different settings

Returns:

A plotnine plot object showing bar chart of mean utilities

Return type:

p9.ggplot

aplusml.plot.plot_calibration_curve(df_preds: pandas.DataFrame, ax: matplotlib.pyplot.Axes | None = None) matplotlib.pyplot.figure[source]
Generate Calibration curve where:
  • x-axis = mean predicted probability for each bin

  • y-axis = fraction of positive cases in each bin

The calibration curve shows how well the predicted probabilities match the actual observed frequencies. A perfectly calibrated model will follow the diagonal line y=x.

Parameters:
  • df_preds (pd.DataFrame) – DataFrame containing ‘y_hat’ (predictions) and ‘y’ (ground truth) columns

  • ax (plt.Axes, optional) – Matplotlib axes to plot on. If None, creates new figure. Defaults to None.

Returns:

Matplotlib figure object containing the plot

Return type:

plt.figure
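The points of a calibration curve can be computed by binning the predictions and averaging within each bin. The sketch below assumes equal-width bins over [0, 1]; the function name and the n_bins parameter are hypothetical, and the library's binning strategy may differ.

```python
from typing import List, Sequence, Tuple


def calibration_points_sketch(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 10
) -> List[Tuple[float, float]]:
    """Return (mean predicted probability, fraction of positives) per non-empty bin."""
    # Each accumulator holds [sum of predictions, sum of labels, count].
    acc = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        acc[idx][0] += p
        acc[idx][1] += y
        acc[idx][2] += 1
    return [(sp / n, sy / n) for sp, sy, n in acc if n > 0]
```

Plotting these points against the diagonal y = x shows where the model over- or under-predicts risk.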

aplusml.plot.plot_confusion_matrix(df_preds: pandas.DataFrame, utilities: dict | None = None, threshold: float | None = None, ax: matplotlib.pyplot.Axes | None = None) matplotlib.pyplot.figure[source]
Generate Confusion Matrix @ highest utility threshold (if threshold is not specified) where:
  • x-axis = predictions (‘y_hat’)

  • y-axis = ground truth (‘y’)

Parameters:
  • df_preds (pd.DataFrame) – DataFrame containing ‘y_hat’ (predictions) and ‘y’ (ground truth) columns

  • utilities (dict, optional) – Dictionary containing utility values for each class

  • threshold (float, optional) – Cutoff threshold for confusion matrix. If not specified, uses the threshold with highest utility

  • ax (plt.Axes, optional) – Matplotlib axes to plot on. If None, creates new figure. Defaults to None.

Returns:

Matplotlib figure object containing the plot

Return type:

plt.figure

aplusml.plot.plot_decision_curve(df_preds: pandas.DataFrame, utilities: dict, ax: matplotlib.pyplot.Axes | None = None) matplotlib.pyplot.figure[source]
Generate Decision Curve for model where:
  • y-axis = X - U_none (i.e. everything is relative to Treat None)

  • x-axis = R

Risk Threshold (R) is defined as “the cutpoint for calling a result positive that maximizes expected utility”.

Thus, “Treat None” on this graph = U_none - U_none = 0.

Thus, at R = r, the “Treat All” graph shows what the net benefit would be if you treated everyone AND the optimal risk threshold = r.

Guarantee: “Treat All” and “Treat None” lines should intersect at R = Prevalence.

Parameters:
  • df_preds (pd.DataFrame) – DataFrame containing ‘y_hat’ (predictions) and ‘y’ (ground truth) columns

  • utilities (dict) – Dictionary containing utility values for each class

  • ax (plt.Axes, optional) – Matplotlib axes to plot on. If None, creates new figure. Defaults to None.

Returns:

Matplotlib figure object containing the plot

Return type:

plt.figure
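The intersection guarantee can be checked with the standard net benefit formula from decision curve analysis, in which benefit is expressed in units of true positives. This is a sketch under that assumption: the library plots utility relative to Treat None and may scale the y-axis differently, and the function names are hypothetical.

```python
def net_benefit_sketch(n_tp: int, n_fp: int, n_total: int, r: float) -> float:
    """Standard net benefit at risk threshold r (0 <= r < 1)."""
    # False positives are weighted by the odds of the risk threshold.
    return n_tp / n_total - (n_fp / n_total) * (r / (1.0 - r))


def net_benefit_treat_all_sketch(prevalence: float, r: float) -> float:
    """Treating everyone makes every positive a TP and every negative an FP."""
    return prevalence - (1.0 - prevalence) * (r / (1.0 - r))


def net_benefit_treat_none_sketch() -> float:
    """Treating no one yields neither TPs nor FPs."""
    return 0.0
```

Setting r equal to the prevalence makes the two terms of the Treat All expression cancel, which is why the Treat All line crosses the Treat None line (zero) at that point.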

aplusml.plot.plot_dodged_bar_mean_utilities(title: str, df: pandas.DataFrame, label_sort_order: list[str] | None = None, label_names: list[str] | None = None, color_sort_order: list[str] | None = None, color_names: list[str] | None = None, is_percent_of_optimistic: bool = False, x_label: str | None = None) plotnine.ggplot[source]

Plot relative utility using the optimistic model as a baseline.

Creates a bar plot where bars are grouped by label (x-axis) and colored by a secondary category. Utility values are measured relative to a baseline, optionally as a percentage of the optimistic model’s performance.

Parameters:
  • title (str) – Title of the plot

  • df (pd.DataFrame) – DataFrame containing columns: y, label, color

  • label_sort_order (list[str], optional) – Custom ordering for x-axis labels. Defaults to None.

  • label_names (list[str], optional) – Display names for labels. Defaults to None.

  • color_sort_order (list[str], optional) – Custom ordering for color categories. Defaults to None.

  • color_names (list[str], optional) – Display names for color categories. Defaults to None.

  • is_percent_of_optimistic (bool, optional) – If True, show utilities as percentage of optimistic model. Defaults to False.

  • x_label (str, optional) – Custom x-axis label. Defaults to None.

Returns:

A plotnine plot object showing grouped bars of relative utilities

Return type:

p9.ggplot

aplusml.plot.plot_hist_pred(df_preds: pandas.DataFrame, ax: matplotlib.pyplot.Axes | None = None) matplotlib.pyplot.figure[source]
Generate Histogram of predictions
  • x-axis = predictions (‘y_hat’)

  • y-axis = density

Parameters:
  • df_preds (pd.DataFrame) – DataFrame containing ‘y_hat’ (predictions) and ‘y’ (ground truth) columns

  • ax (plt.Axes, optional) – Matplotlib axes to plot on. If None, creates new figure. Defaults to None.

Returns:

Matplotlib figure object containing the plot

Return type:

plt.figure

aplusml.plot.plot_line_compare_multiple_settings(title: str, df: list[pandas.DataFrame], x_label: str | None = None, line_labels: list[str] | None = None) plotnine.ggplot[source]

Plots multiple lines on the same x- and y-axis. Used to generate Figure 2 from Jung et al. 2021

Parameters:
  • title (str) – Title of the plot

  • df (list[pd.DataFrame]) – List of DataFrames, each containing columns: x, y, line

  • x_label (str, optional) – Label for the x-axis. Defaults to None.

  • line_labels (list[str], optional) – Labels for the lines. Defaults to None.

Returns:

A plotnine plot object showing multiple lines

Return type:

p9.ggplot

aplusml.plot.plot_line_mean_utilities(title: str, df: pandas.DataFrame, group_sort_order: list | None = None, groups_to_drop: list | None = None, label_sort_order: list | None = None, color_sort_order: list | None = None, color_names: list[str] | None = None, shape_sort_order: list | None = None, is_percent_of_optimistic: bool = False, color_title: str | None = None, shape_title: str | None = None, x_label: str | None = None) plotnine.ggplot[source]

Creates a line plot comparing mean utilities across different groups and settings.

This function creates a line plot where each line represents a group of related points. Lines can be differentiated by color and points by shape. Unlike the dodged bar plot, this supports multiple grouping dimensions.

Parameters:
  • title (str) – Title of the plot

  • df (pd.DataFrame) – DataFrame containing at minimum these columns: - y: numeric utility values - label: categories for x-axis - group: groups points that should be connected by lines Optional columns: - color: categories for line colors - shape: categories for point shapes

  • group_sort_order (list, optional) – Custom ordering for line groups. Defaults to None.

  • groups_to_drop (list, optional) – Groups to exclude from plot. Defaults to None.

  • label_sort_order (list, optional) – Custom ordering for x-axis labels. Defaults to None.

  • color_sort_order (list, optional) – Custom ordering for colors. Defaults to None.

  • color_names (list[str], optional) – Display names for color categories. Defaults to None.

  • shape_sort_order (list, optional) – Custom ordering for point shapes. Defaults to None.

  • is_percent_of_optimistic (bool, optional) – If True, show utilities as percentage of optimistic model. Defaults to False.

  • color_title (str, optional) – Title for color legend. Defaults to None.

  • shape_title (str, optional) – Title for shape legend. Defaults to None.

  • x_label (str, optional) – Custom x-axis label. Defaults to None.

Returns:

A plotnine plot object showing utility comparisons across groups

Return type:

p9.ggplot

aplusml.plot.plot_mean_utility_v_threshold(title: str, df: pandas.DataFrame, label_sort_order: list[str] | None = None, label_names: list[str] | None = None, label_title: str = '') plotnine.ggplot[source]

Plot mean utility vs. threshold

Parameters:
  • title (str) – Title of the plot

  • df (pd.DataFrame) – Output of run.run_test(). Columns: threshold, mean_utility, std_utility, sem_utility, mean_work_per_timestep, label

  • label_sort_order (list[str], optional) – Custom ordering for x-axis labels. Defaults to None.

  • label_names (list[str], optional) – Display names for labels. Defaults to None.

  • label_title (str, optional) – Title of label legend. Defaults to ''.

Returns:

Plot object

Return type:

p9.ggplot

aplusml.plot.plot_ppv_v_utility_curve(df_preds: pandas.DataFrame, utilities: dict, ax: matplotlib.pyplot.Axes | None = None) matplotlib.pyplot.figure[source]
Generate PPV v. Unit Utility curve where:
  • x-axis = precision (PPV)

  • y-axis = unit utility

Parameters:
  • df_preds (pd.DataFrame) – DataFrame containing ‘y_hat’ (predictions) and ‘y’ (ground truth) columns

  • utilities (dict) – Dictionary containing utility values for each class

  • ax (plt.Axes, optional) – Matplotlib axes to plot on. If None, creates new figure. Defaults to None.

Returns:

Matplotlib figure object containing the plot

Return type:

plt.figure

aplusml.plot.plot_precision_recall_curve(df_preds: pandas.DataFrame, utilities: dict | None = None, ax: matplotlib.pyplot.Axes | None = None) matplotlib.pyplot.figure[source]
Generate Precision-Recall curve with utility contours where:
  • x-axis = recall (TPR)

  • y-axis = precision (PPV)

Explanation of Expected Utility (EU) calculations (Appendix A): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2804257/pdf/nihms-108324.pdf

Parameters:
  • df_preds (pd.DataFrame) – DataFrame containing ‘y_hat’ (predictions) and ‘y’ (ground truth) columns

  • utilities (dict, optional) – Dictionary containing utility values for each class

  • ax (plt.Axes, optional) – Matplotlib axes to plot on. If None, creates new figure. Defaults to None.

Returns:

Matplotlib figure object containing the plot

Return type:

plt.figure

aplusml.plot.plot_pred_v_true(df_preds: pandas.DataFrame, ax: matplotlib.pyplot.Axes | None = None) matplotlib.pyplot.figure[source]

Generate a scatter plot comparing predicted vs true values with density coloring.

Parameters:
  • df_preds (pd.DataFrame) – DataFrame containing ‘y_hat’ (predictions) and ‘y’ (ground truth) columns

  • ax (plt.Axes, optional) – Matplotlib axes to plot on. If None, creates new figure. Defaults to None.

Returns:

Matplotlib figure object containing the plot

Return type:

plt.figure

aplusml.plot.plot_relative_utility_curve(lines: list[dict[pandas.DataFrame]], utilities: dict, xlim: Tuple = (0, 1), ylim: Tuple = (-0.1, None), ax: matplotlib.pyplot.Axes | None = None) matplotlib.pyplot.figure[source]
Generate Relative Utility Curve where:
  • y-axis = relative utility

  • x-axis = risk threshold

“The relative utility curve plots the fraction of the expected utility of perfect prediction obtained by the risk prediction model at the optimal cut point associated with the risk threshold R.”

Parameters:
  • lines (list[dict[pd.DataFrame]]) – List of dictionaries containing ‘df’ (dataframe) and ‘label’ (string)

  • utilities (dict) – Dictionary containing utility values for each class

  • xlim (Tuple, optional) – Tuple containing the x-axis limits. Defaults to (0, 1).

  • ylim (Tuple, optional) – Tuple containing the y-axis limits. Defaults to (-0.1, None).

  • ax (plt.Axes, optional) – Matplotlib axes to plot on. If None, creates new figure. Defaults to None.

Returns:

Matplotlib figure object containing the plot

Return type:

plt.figure

aplusml.plot.plot_roc_curve(df_preds: pandas.DataFrame, utilities: dict | None = None, ax: matplotlib.pyplot.Axes | None = None) matplotlib.pyplot.figure[source]
Generate ROC curve with indifference curves where:
  • x-axis = false positive rate (FPR)

  • y-axis = true positive rate (TPR)

Explanation of indifference curves: http://www0.cs.ucl.ac.uk/staff/ucacbbl/roc/

Parameters:
  • df_preds (pd.DataFrame) – DataFrame containing ‘y_hat’ (predictions) and ‘y’ (ground truth) columns

  • utilities (dict, optional) – Dictionary containing utility values for each class

  • ax (plt.Axes, optional) – Matplotlib axes to plot on. If None, creates new figure. Defaults to None.

Returns:

Matplotlib figure object containing the plot

Return type:

plt.figure

aplusml.plot.plot_threshold_v_ppv_tpr_work(df_preds: pandas.DataFrame, utilities: dict, ax: matplotlib.pyplot.Axes | None = None) matplotlib.pyplot.figure[source]
Generate Cutoff Threshold v. PPV/TPR/Work curve where:
  • x-axis = cutoff threshold

  • y-axis = PPV/TPR/Work

Parameters:
  • df_preds (pd.DataFrame) – DataFrame containing ‘y_hat’ (predictions) and ‘y’ (ground truth) columns

  • utilities (dict) – Dictionary containing utility values for each class

  • ax (plt.Axes, optional) – Matplotlib axes to plot on. If None, creates new figure. Defaults to None.

Returns:

Matplotlib figure object containing the plot

Return type:

plt.figure

aplusml.plot.plot_threshold_v_utility_curve(df_preds: pandas.DataFrame, utilities: dict, ax: matplotlib.pyplot.Axes | None = None) matplotlib.pyplot.figure[source]
Generate Cutoff Threshold v. Unit Utility curve where:
  • x-axis = cutoff threshold

  • y-axis = unit utility

Parameters:
  • df_preds (pd.DataFrame) – DataFrame containing ‘y_hat’ (predictions) and ‘y’ (ground truth) columns

  • utilities (dict) – Dictionary containing utility values for each class

  • ax (plt.Axes, optional) – Matplotlib axes to plot on. If None, creates new figure. Defaults to None.

Returns:

Matplotlib figure object containing the plot

Return type:

plt.figure

aplusml.plot.plot_work_v_ppv_tpr_fpr(df_preds: pandas.DataFrame, utilities: dict, ax: matplotlib.pyplot.Axes | None = None) matplotlib.pyplot.figure[source]

Generate Work v. PPV/TPR/FPR curve where:

  • x-axis = work (%)

  • y-axis = PPV/TPR/FPR

Parameters:
  • df_preds (pd.DataFrame) – DataFrame containing ‘y_hat’ (predictions) and ‘y’ (ground truth) columns

  • utilities (dict) – Dictionary containing utility values for each class

  • ax (plt.Axes, optional) – Matplotlib axes to plot on. If None, creates new figure. Defaults to None.

Returns:

Matplotlib figure object containing the plot

Return type:

plt.figure

aplusml.plot.plot_work_v_utility(df_preds: pandas.DataFrame, utilities: dict, ax: matplotlib.pyplot.Axes | None = None) matplotlib.pyplot.figure[source]
Generate Work v. Unit Utility curve where:
  • x-axis = work (%)

  • y-axis = unit utility

Parameters:
  • df_preds (pd.DataFrame) – DataFrame containing ‘y_hat’ (predictions) and ‘y’ (ground truth) columns

  • utilities (dict) – Dictionary containing utility values for each class

  • ax (plt.Axes, optional) – Matplotlib axes to plot on. If None, creates new figure. Defaults to None.

Returns:

Matplotlib figure object containing the plot

Return type:

plt.figure


Parse

Functions to parse APLUS config YAML files into Python objects for use in sim.py. Contains a list of required and optional keys for each config section.

aplusml.parse.is_keys_valid(yaml_entry: dict, yaml_entry_id: str, yaml_entry_type: str, valid_keys: dict[str], is_check_required_keys: bool = True, is_check_optional_keys: bool = True) bool[source]

Check that the keys in a given YAML entry are valid (i.e. all required keys are present, and no extraneous keys are present).

Parameters:
  • yaml_entry (dict) – The YAML entry to check

  • yaml_entry_id (str) – The ID of the YAML entry

  • yaml_entry_type (str) – The type of YAML entry

  • valid_keys (dict) – A dictionary of valid keys for the YAML entry

  • is_check_required_keys (bool) – Whether to check that all required keys are present

  • is_check_optional_keys (bool) – Whether to check that no extraneous keys are present

Returns:

True if the keys are valid, False otherwise

Return type:

bool
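The two checks (required keys present, no extraneous keys) reduce to set comparisons. The sketch below is illustrative only: the function name and the split of valid keys into separate required and optional sets are assumptions, since the layout of valid_keys is not specified here.

```python
from typing import Iterable


def is_keys_valid_sketch(
    entry: dict,
    required: Iterable[str],
    optional: Iterable[str],
    is_check_required_keys: bool = True,
    is_check_optional_keys: bool = True,
) -> bool:
    keys = set(entry)
    required, optional = set(required), set(optional)
    # Every required key must appear in the entry.
    if is_check_required_keys and not required <= keys:
        return False
    # Any key that is neither required nor optional is extraneous.
    if is_check_optional_keys and keys - required - optional:
        return False
    return True
```

Turning off one of the flags lets a caller validate partially specified entries, for example while a config is still being assembled.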

aplusml.parse.is_valid_config_yaml(yaml: dict) bool[source]

Return TRUE if the provided YAML config is valid, FALSE otherwise.

aplusml.parse.load_yaml(path_to_yaml: str) dict[source]

Loads YAML file into a Python dictionary.

Parameters:

path_to_yaml (str) – Path to YAML file

Returns:

Python dictionary

Return type:

dict

aplusml.parse.parse_yaml_into_config(path_to_yaml: str) Config[source]

Parse a YAML file into a Config object, printing errors if parsing or validation fails.

Parameters:

path_to_yaml (str) – Path to YAML file.

Returns:

Config object.

Return type:

Config