multiml.task.MLBaseTask

class multiml.task.MLBaseTask(phases=None, input_var_names=None, output_var_names=None, save_var_names=None, pred_var_names=None, true_var_names=None, var_names=None, model=None, model_args=None, optimizer=None, optimizer_args=None, scheduler=None, scheduler_args=None, loss=None, loss_args=None, max_patience=None, loss_weights=None, load_weights=False, save_weights=False, metrics=None, num_epochs=10, batch_size=64, num_workers=0, verbose=None, **kwargs)

Base task class for (deep) machine learning tasks.

__init__(phases=None, input_var_names=None, output_var_names=None, save_var_names=None, pred_var_names=None, true_var_names=None, var_names=None, model=None, model_args=None, optimizer=None, optimizer_args=None, scheduler=None, scheduler_args=None, loss=None, loss_args=None, max_patience=None, loss_weights=None, load_weights=False, save_weights=False, metrics=None, num_epochs=10, batch_size=64, num_workers=0, verbose=None, **kwargs)

Initialize ML base task.

This base class is inherited by the deep learning task classes KerasBaseTask() and PytorchBaseTask().

input_var_names and output_var_names specify the data for model inputs and outputs. If input_var_names is a list, e.g. ['var0', 'var1'], the model receives data in the format [(batch size, k), (batch size, k)], where k is the arbitrary shape of each variable. If input_var_names is a tuple, e.g. ('var0', 'var1'), the model receives data with shape (batch size, M, k), where M is the number of variables. If output_var_names is a list, the model must return a list of tensors, one for each variable. If output_var_names is a tuple, the model must return a single tensor.

pred_var_names and true_var_names specify the data for loss calculations. If pred_var_names is given, only the variables indicated by pred_var_names are selected from the model outputs before being passed to the loss calculation. Please see KerasBaseTask() or PytorchBaseTask() for actual examples.
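The list and tuple input layouts can be illustrated with plain NumPy arrays. This is a minimal sketch, not multiml's implementation: gather_inputs and the data dict are hypothetical names introduced here, and the sketch assumes every variable has the same shape so that the tuple case can be stacked.

```python
import numpy as np

batch_size, k = 4, 3

# Hypothetical per-variable data, each with shape (batch size, k).
data = {
    'var0': np.zeros((batch_size, k)),
    'var1': np.ones((batch_size, k)),
}


def gather_inputs(var_names, data):
    """Illustrative helper (an assumption, not part of multiml)."""
    if isinstance(var_names, list):
        # List: one array per variable, each (batch size, k).
        return [data[name] for name in var_names]
    if isinstance(var_names, tuple):
        # Tuple: variables stacked on a new axis, giving (batch size, M, k).
        return np.stack([data[name] for name in var_names], axis=1)
    # Single string: the variable's array as-is.
    return data[var_names]


list_inputs = gather_inputs(['var0', 'var1'], data)
tuple_inputs = gather_inputs(('var0', 'var1'), data)

print([x.shape for x in list_inputs])
print(tuple_inputs.shape)
```

Here M is 2 because two variable names are given; a model written for the tuple layout would index the second axis to recover each variable.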

Parameters:
  • phases (list) – list of ML phases to run, e.g. ['train', 'test']. If None is given, ['train', 'valid', 'test'] is used.

  • input_var_names (str or list or tuple) – input variable names in StoreGate.

  • output_var_names (str or list or tuple) – output variable names of model.

  • save_var_names (str or list) – variable names saved to StoreGate.

  • pred_var_names (str or list) – prediction variable names passed to loss.

  • true_var_names (str or list or tuple) – true variable names.

  • var_names (str) – shortcut string of "input output true" variable names. This shortcut cannot be used to specify multiple variables.

  • model (str or obj) – name of model, or class object of model.

  • model_args (dict) – args of model, e.g. dict(param0=0, param1=1).

  • optimizer (str or obj) – name of optimizer, or class object of optimizer.

  • optimizer_args (dict) – args of optimizer.

  • scheduler (str or obj) – name of scheduler, or class object of scheduler.

  • scheduler_args (dict) – args of scheduler.

  • loss (str or obj) – name of loss, or class object of loss.

  • loss_args (dict) – args of loss.

  • max_patience (int) – maximum patience for early stopping. Early stopping is enabled if max_patience is given.

  • loss_weights (list) – scalar coefficients to weight the loss.

  • load_weights (bool or str) – user-defined algorithms should assume the following behavior. If False, model weights are not loaded. If True, model weights are loaded from the default location. If a str, weights are loaded from the given path.

  • save_weights (bool or str) – user-defined algorithms should assume the following behavior. If False, model weights are not saved. If True, model weights are saved to the default location. If a str, weights are saved to the given path.

  • metrics (list) – metrics of evaluation.

  • num_epochs (int) – number of epochs.

  • batch_size (int or dict) – mini-batch size. A dict can be given to set a different batch_size for train, valid, and test.

  • num_workers (int) – number of workers for dataloaders.

  • verbose (int) – verbosity option for the fitting step. If None, it is set based on logger.MIN_LEVEL.
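The documented semantics of batch_size, load_weights/save_weights, and max_patience can be sketched as small standalone helpers. Everything below is an assumption made for illustration rather than multiml code: the function names, the per-phase dict keys, the default path, and the rule that patience counts consecutive epochs without an improved validation loss.

```python
def resolve_batch_size(batch_size, phase):
    """Return the batch size for a phase from an int or a per-phase dict."""
    if isinstance(batch_size, dict):
        # Assumed keys: 'train', 'valid', 'test'.
        return batch_size[phase]
    return batch_size


def resolve_weights_path(option, default_path):
    """Map a bool-or-str option to a path, or None when nothing is done."""
    if option is False:
        return None
    if option is True:
        return default_path
    return option  # a str is treated as an explicit path


def epochs_run(valid_losses, max_patience=None):
    """Count epochs executed before early stopping triggers (assumed rule)."""
    best = float('inf')
    patience = 0
    for epoch, loss in enumerate(valid_losses):
        if loss < best:
            best, patience = loss, 0
        else:
            patience += 1
        # Early stopping is active only when max_patience is given.
        if max_patience is not None and patience >= max_patience:
            return epoch + 1
    return len(valid_losses)


losses = [1.0, 0.9, 0.95, 0.97, 0.8]
print(resolve_batch_size({'train': 64, 'valid': 256, 'test': 256}, 'valid'))
print(resolve_weights_path(True, 'hypothetical/default/path'))
print(epochs_run(losses, max_patience=2))
print(epochs_run(losses))
```

With max_patience=2 the loop stops after the second consecutive epoch without improvement, so the final, better loss is never reached; leaving max_patience as None runs every epoch.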

Methods

__init__([phases, input_var_names, ...])

Initialize ML base task.

build_model()

Build model.

compile()

Compile model, optimizer and loss.

compile_loss()

Compile loss.

compile_model()

Compile model.

compile_optimizer()

Compile optimizer.

compile_var_names()

Compile var_names.

do_test()

Perform test phase or not.

do_train()

Perform train phase or not.

do_valid()

Perform valid phase or not.

dump_model([extra_args])

Dump current model to saver.

execute()

Execute a task.

finalize()

Finalize base task.

fit([train_data, valid_data])

Fit model.

fit_predict([fit_args, predict_args])

Fit and predict model.

get_input_true_data(phase)

Get input and true data.

get_input_var_shapes([phase])

Get shape of input_var_names.

get_metadata(metadata_key)

Returns metadata.

get_pred_index()

Returns prediction index passed to loss calculation.

get_unique_id()

Returns unique identifier of task.

load_metadata()

Load metadata.

load_model()

Load pre-trained model path from Saver.

predict([data, phase])

Predict model.

predict_update([data, phase])

Predict and update data in StoreGate.

set_hps(params)

Set hyperparameters to this task.

show_info()

Print information.

update(data[, phase])

Update data in storegate.

Attributes

input_saver_key

Return input_saver_key.

input_var_names

Returns input_var_names.

job_id

Return job_id of task.

ml

Returns ML data class.

name

Return name of task.

output_saver_key

Return output_saver_key.

output_var_names

Returns output_var_names.

phases

Returns ML phases.

pool_id

Return pool_id of task.

pred_var_names

Returns pred_var_names.

save_var_names

Returns save_var_names.

saver

Return saver of task.

storegate

Return storegate of task.

subtask_id

Return subtask_id of task.

task_id

Return task_id of task.

trial_id

Return trial_id of task.

true_var_names

Returns true_var_names.
