multiml.task package

Module contents

class multiml.task.Task

Bases: object

Every task must inherit from this base class.

Multi-ai agents assume that the initialize, execute, finalize, and set_hps methods are available.

abstract execute()

Execute the task.

abstract finalize()

Finalize the task.

abstract set_hps(params)

Set hyperparameters of this task.
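
The contract above can be pictured with Python's standard abc module. The following is a minimal sketch written for illustration only, not multiml's source: TaskSketch and EchoTask are hypothetical names, and only the three abstract methods listed above are modeled.

```python
from abc import ABC, abstractmethod


class TaskSketch(ABC):
    """Hypothetical stand-in for the Task interface described above."""

    @abstractmethod
    def execute(self):
        """Execute the task."""

    @abstractmethod
    def finalize(self):
        """Finalize the task."""

    @abstractmethod
    def set_hps(self, params):
        """Set hyperparameters of this task."""


class EchoTask(TaskSketch):
    """Hypothetical concrete task that records the calls it receives."""

    def __init__(self):
        self.calls = []

    def set_hps(self, params):
        self.calls.append(('set_hps', dict(params)))

    def execute(self):
        self.calls.append(('execute', None))

    def finalize(self):
        self.calls.append(('finalize', None))


task = EchoTask()
task.set_hps({'hp_layer': 5})
task.execute()
task.finalize()
print([name for name, _ in task.calls])  # ['set_hps', 'execute', 'finalize']
```

In this sketch, a subclass that leaves any of the three methods unimplemented cannot be instantiated, which is one way to enforce the interface that agents rely on.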

class multiml.task.BaseTask(saver=None, input_saver_key='tmpkey', output_saver_key='tmpkey', storegate=None, data_id=None, name=None)

Bases: Task

Base task class for the default functions.

All subtasks defined by users need to inherit from BaseTask. In a user-defined class, super().__init__() must be called in the __init__() method. A task instance is assumed to have its methods called in the following sequence: set_hps() -> execute() -> finalize(). If a task instance is registered with TaskScheduler as a subtask, self._task_id and self._subtask_id are set automatically by TaskScheduler.

Examples

>>> task = BaseTask()
>>> task.set_hps({'hp_layer': 5, 'hp_epoch': 256})
>>> task.execute()
>>> task.finalize()
__init__(saver=None, input_saver_key='tmpkey', output_saver_key='tmpkey', storegate=None, data_id=None, name=None)

Initialize base task.

Parameters:
  • saver (Saver) – Saver class instance to record metadata.

  • input_saver_key (str) – unique saver key to retrieve metadata.

  • output_saver_key (str) – unique saver key to save metadata.

  • storegate (StoreGate) – StoreGate class instance to manage data.

  • data_id (str) – data_id of StoreGate, which is set by set_hps().

  • name (str) – task’s name. If None, the class name is used instead.

execute()

Execute base task.

Users implement their algorithms.

finalize()

Finalize base task.

Users implement their algorithms.

set_hps(params)

Set hyperparameters to this task.

Attributes (self._XXX) are set automatically from the keys and values of the given dict. For example, if {'key0': 0, 'key1': 1} is given, self._key0 = 0 and self._key1 = 1 are created.
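
The attribute creation described above, together with the set_hps() -> execute() -> finalize() sequence, can be sketched with a small stand-in class. This is not multiml's implementation: BaseTaskSketch and MyTask are hypothetical names, the setattr loop is an assumption about how the behavior could be achieved, and the scale argument exists only for illustration.

```python
class BaseTaskSketch:
    """Hypothetical stand-in that mimics the documented set_hps() behavior."""

    def __init__(self, name=None):
        # If no name is given, fall back to the class name.
        self._name = name if name is not None else self.__class__.__name__

    @property
    def name(self):
        return self._name

    def set_hps(self, params):
        # Each key/value pair becomes a private attribute: 'key0' -> self._key0.
        for key, value in params.items():
            setattr(self, '_' + key, value)

    def execute(self):
        pass

    def finalize(self):
        pass


class MyTask(BaseTaskSketch):
    """Hypothetical user-defined task."""

    def __init__(self, scale=2, **kwargs):
        super().__init__(**kwargs)  # the parent initializer must be called
        self._scale = scale
        self.result = None

    def execute(self):
        # Uses a hyperparameter that set_hps() created.
        self.result = self._key0 * self._scale


task = MyTask(scale=3)
task.set_hps({'key0': 7, 'key1': 1})
task.execute()
task.finalize()
print(task.name, task._key0, task._key1, task.result)  # MyTask 7 1 21
```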

property name

Return name of task.

property job_id

Return job_id of task.

property trial_id

Return trial_id of task.

property task_id

Return task_id of task.

property subtask_id

Return subtask_id of task.

property pool_id

Return pool_id of task.

property storegate

Return storegate of task.

property saver

Return saver of task.

property input_saver_key

Return input_saver_key.

property output_saver_key

Return output_saver_key.

get_unique_id()

Returns unique identifier of task.

class multiml.task.MLBaseTask(phases=None, input_var_names=None, output_var_names=None, save_var_names=None, pred_var_names=None, true_var_names=None, var_names=None, model=None, model_args=None, optimizer=None, optimizer_args=None, scheduler=None, scheduler_args=None, loss=None, loss_args=None, max_patience=None, loss_weights=None, load_weights=False, save_weights=False, metrics=None, num_epochs=10, batch_size=64, num_workers=0, verbose=None, **kwargs)

Bases: BaseTask

Base task class for (deep) machine learning tasks.

__init__(phases=None, input_var_names=None, output_var_names=None, save_var_names=None, pred_var_names=None, true_var_names=None, var_names=None, model=None, model_args=None, optimizer=None, optimizer_args=None, scheduler=None, scheduler_args=None, loss=None, loss_args=None, max_patience=None, loss_weights=None, load_weights=False, save_weights=False, metrics=None, num_epochs=10, batch_size=64, num_workers=0, verbose=None, **kwargs)

Initialize ML base task.

This base class is inherited by the deep learning task classes KerasBaseTask() and PytorchBaseTask().

input_var_names and output_var_names specify the data for model inputs and outputs. If input_var_names is a list, e.g. ['var0', 'var1'], the model receives data in the format [(batch size, k), (batch size, k)], where k is the arbitrary shape of each variable. If input_var_names is a tuple, e.g. ('var0', 'var1'), the model receives data with shape (batch size, M, k), where M is the number of variables. If output_var_names is a list, the model must return a list of tensors, one for each variable. If output_var_names is a tuple, the model must return a single tensor.

pred_var_names and true_var_names specify the data for loss calculations. If pred_var_names is given, only the variables indicated by pred_var_names are selected from the model outputs before being passed to the loss calculation. Please see KerasBaseTask() or PytorchBaseTask() for actual examples.
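
The difference between the list and tuple forms of input_var_names is easiest to see from the resulting shapes. The sketch below uses plain nested lists in place of arrays so that it runs without dependencies; gather_inputs, shape and the store dict are hypothetical helpers for illustration and do not reproduce how multiml reads data from StoreGate.

```python
def shape(data):
    """Return the shape of a regular nested list as a tuple."""
    dims = []
    while isinstance(data, list):
        dims.append(len(data))
        data = data[0]
    return tuple(dims)


def gather_inputs(store, var_names):
    """Hypothetical helper: arrange stored variables as the text describes.

    store maps a variable name to nested lists of shape (batch size, k).
    """
    if isinstance(var_names, list):
        # list -> one array per variable: [(batch, k), (batch, k)]
        return [store[name] for name in var_names]
    if isinstance(var_names, tuple):
        # tuple -> a single array with a variable axis: (batch, M, k)
        columns = [store[name] for name in var_names]
        batch_size = len(columns[0])
        return [[column[i] for column in columns] for i in range(batch_size)]
    raise TypeError('this sketch covers list and tuple only')


# Toy data: batch size 4, k = 3 for both variables.
store = {
    'var0': [[0.0, 0.1, 0.2] for _ in range(4)],
    'var1': [[1.0, 1.1, 1.2] for _ in range(4)],
}

as_list = gather_inputs(store, ['var0', 'var1'])
as_tuple = gather_inputs(store, ('var0', 'var1'))

print([shape(x) for x in as_list])  # [(4, 3), (4, 3)]
print(shape(as_tuple))              # (4, 2, 3)
```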

Parameters:
  • phases (list) – list indicating the ML phases, e.g. ['train', 'test']. If None is given, ['train', 'valid', 'test'] is set.

  • input_var_names (str or list or tuple) – input variable names in StoreGate.

  • output_var_names (str or list or tuple) – output variable names of model.

  • save_var_names (str or list) – variable names saved to StoreGate.

  • pred_var_names (str or list) – prediction variable names passed to loss.

  • true_var_names (str or list or tuple) – true variable names.

  • var_names (str) – str of “input output true” variable names, as a shortcut. This shortcut cannot be used to specify multiple variables.

  • model (str or obj) – name of model, or class object of model.

  • model_args (dict) – args of model, e.g. dict(param0=0, param1=1).

  • optimizer (str or obj) – name of optimizer, or class object of optimizer.

  • optimizer_args (dict) – args of optimizer.

  • scheduler (str or obj) – name of scheduler, or class object of scheduler.

  • scheduler_args (dict) – args of scheduler.

  • loss (str or obj) – name of loss, or class object of loss.

  • loss_args (dict) – args of loss.

  • max_patience (int) – max number of patience for early stopping. Early stopping is enabled if max_patience is given.

  • loss_weights (list) – scalar coefficients to weight the loss.

  • load_weights (bool or str) – user-defined algorithms should assume the following behavior. If False, do not load model weights. If True, load model weights from the default location. If str, load weights from the given path.

  • save_weights (bool or str) – user-defined algorithms should assume the following behavior. If False, do not save model weights. If True, save model weights to the default location. If str, save weights to the given path.

  • metrics (list) – metrics of evaluation.

  • num_epochs (int) – number of epochs.

  • batch_size (int or dict) – size of mini batch. A dict can be given to set a different batch_size for train, valid and test.

  • num_workers (int) – number of workers for dataloaders.

  • verbose (int) – verbose option for the fitting step. If None, it is set based on logger.MIN_LEVEL.
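
The bool-or-str convention of load_weights and save_weights listed above can be handled with a small dispatch on the option's type. The following is a sketch under stated assumptions: resolve_weights_path and DEFAULT_PATH are hypothetical names, and the actual default location used by a given task is not specified here.

```python
DEFAULT_PATH = 'weights/default'  # hypothetical default location


def resolve_weights_path(option, default_path=DEFAULT_PATH):
    """Hypothetical helper: map a bool-or-str option to a path or None.

    False -> None (skip), True -> default location, str -> the given path.
    """
    if isinstance(option, bool):
        return default_path if option else None
    if isinstance(option, str):
        return option
    raise TypeError('expected bool or str, got ' + type(option).__name__)


for option in (False, True, 'my_weights/epoch10'):
    print(repr(option), '->', resolve_weights_path(option))
```

A user-defined task could call such a helper once for load_weights and once for save_weights, and skip the corresponding step whenever None is returned.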

set_hps(params)

Set hyperparameters to this task.

Attributes (self._XXX) are set automatically from the keys and values of the given dict. Hyperparameters that start with 'model__', 'optimizer__' and 'loss__' are considered args of the model, optimizer and loss, respectively. If the value of a hyperparameter is a str that starts with 'saver__', the value is retrieved from the Saver instance; please see the example below.

Parameters:

params (dict) – key and value of hyperparameters.

Example

>>> hps_dict = {
...     'num_epochs': 10,  # normal hyperparameter
...     'optimizer__lr': 0.01,  # hyperparameter of optimizer
...     'saver_hp': 'saver__key__value',  # hyperparameter from saver
... }
>>> task.set_hps(hps_dict)
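
The prefix convention can be sketched as a routing step that separates plain hyperparameters from those aimed at the model, optimizer or loss. This is an illustration rather than multiml's code: split_hps is a hypothetical helper, 'model__layers' is a made-up key, and the 'saver__' lookup is left out because it depends on a Saver instance.

```python
PREFIXES = ('model', 'optimizer', 'loss')


def split_hps(params):
    """Hypothetical helper: separate prefixed hyperparameters from plain ones."""
    plain = {}
    component_args = {prefix: {} for prefix in PREFIXES}
    for key, value in params.items():
        for prefix in PREFIXES:
            marker = prefix + '__'
            if key.startswith(marker):
                # 'optimizer__lr' -> component_args['optimizer']['lr']
                component_args[prefix][key[len(marker):]] = value
                break
        else:
            # No known prefix matched: treat as a normal hyperparameter.
            plain[key] = value
    return plain, component_args


plain, component_args = split_hps({
    'num_epochs': 10,
    'optimizer__lr': 0.01,
    'model__layers': [32, 32],
})
print(plain)           # {'num_epochs': 10}
print(component_args)  # {'model': {'layers': [32, 32]}, 'optimizer': {'lr': 0.01}, 'loss': {}}
```
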
execute()

Execute a task.

fit(train_data=None, valid_data=None)

Fit model.

Parameters:
  • train_data (ndarray) – training data.

  • valid_data (ndarray) – validation data.

predict(data=None, phase=None)

Predict model.

Parameters:

data (ndarray) – prediction data.

update(data, phase='auto')

Update data in storegate.

Parameters:
  • data (ndarray) – new data.

  • phase (str) – train, valid, test, auto.

fit_predict(fit_args=None, predict_args=None)

Fit and predict model.

Parameters:
  • fit_args (dict) – arbitrary dict passed to fit().

  • predict_args (dict) – arbitrary dict passed to predict().

Returns:

results of prediction.

Return type:

ndarray or list
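
One way to read fit_predict() is as fit() followed by predict(), with each dict forwarded to its method. The sketch below assumes the dicts are unpacked as keyword arguments, which the description above does not confirm; FitPredictSketch is a hypothetical class whose predict() simply doubles its input so the flow is visible.

```python
class FitPredictSketch:
    """Hypothetical stand-in showing how fit_predict() could chain calls."""

    def __init__(self):
        self.log = []

    def fit(self, train_data=None, valid_data=None):
        self.log.append(('fit', train_data, valid_data))

    def predict(self, data=None, phase=None):
        self.log.append(('predict', data, phase))
        # Placeholder "model": double every input value.
        return [] if data is None else [x * 2 for x in data]

    def fit_predict(self, fit_args=None, predict_args=None):
        # Assumption: each dict is unpacked as keyword arguments.
        fit_args = {} if fit_args is None else fit_args
        predict_args = {} if predict_args is None else predict_args
        self.fit(**fit_args)
        return self.predict(**predict_args)


task = FitPredictSketch()
result = task.fit_predict(
    fit_args={'train_data': [1, 2]},
    predict_args={'data': [3, 4], 'phase': 'test'},
)
print(result)  # [6, 8]
```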

predict_update(data=None, phase=None)

Predict and update data in StoreGate.

Parameters:

data (ndarray) – data passed to predict() method.

property phases

Returns ML phases.

property input_var_names

Returns input_var_names.

property output_var_names

Returns output_var_names.

property save_var_names

Returns save_var_names.

property pred_var_names

Returns pred_var_names.

property true_var_names

Returns true_var_names.

property ml

Returns ML data class.

compile()

Compile model, optimizer and loss.

Compiled objects will be available via self.ml.model, self.ml.optimizer and self.ml.loss.

Examples

>>> # compile all together,
>>> self.compile()
>>> # which is equivalent to:
>>> self.build_model() # set self._model
>>> self.compile_model() # set self.ml.model
>>> self.compile_optimizer() # set self.ml.optimizer
>>> self.compile_loss() # set self.ml.loss
build_model()

Build model.

compile_var_names()

Compile var_names.

compile_model()

Compile model.

compile_optimizer()

Compile optimizer.

compile_loss()

Compile loss.

load_model()

Load pre-trained model path from Saver.

Returns:

model path.

Return type:

str

dump_model(extra_args=None)

Dump current model to saver.

Parameters:

extra_args (dict) – extra metadata to be stored together with model.

load_metadata()

Load metadata.

get_input_true_data(phase)

Get input and true data.

Parameters:

phase (str) – data type (train, valid, test or None).

Returns:

(input, true) data for model.

Return type:

tuple

get_input_var_shapes(phase='train')

Get shape of input_var_names.

Parameters:

phase (str) – train, valid, test or None.

Returns:

shape of a variable, or list of shapes

Return type:

shape, or list of shapes

get_metadata(metadata_key)

Returns metadata.

Parameters:

metadata_key (str) – key of Saver().

Returns:

arbitrary object stored in Saver.

Return type:

object

get_pred_index()

Returns prediction index passed to loss calculation.

Returns:

list of prediction index.

Return type:

list
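
A prediction index of this kind can be pictured as the positions of pred_var_names within output_var_names, which are then used to pick model outputs before the loss is computed. The sketch below is an assumption about that mapping, not multiml's implementation; pred_index and select_outputs are hypothetical helpers and the variable names are invented.

```python
def pred_index(output_var_names, pred_var_names):
    """Hypothetical helper: positions of pred_var_names in output_var_names."""
    return [output_var_names.index(name) for name in pred_var_names]


def select_outputs(outputs, index):
    """Keep only the model outputs that are passed on to the loss."""
    return [outputs[i] for i in index]


output_var_names = ['out_a', 'out_b', 'out_c']
pred_var_names = ['out_c', 'out_a']

index = pred_index(output_var_names, pred_var_names)
print(index)  # [2, 0]

# Strings stand in for the tensors returned by a model.
outputs = ['tensor_a', 'tensor_b', 'tensor_c']
print(select_outputs(outputs, index))  # ['tensor_c', 'tensor_a']
```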

do_train()

Whether to perform the train phase.

do_valid()

Whether to perform the valid phase.

do_test()

Whether to perform the test phase.
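
These three checks can be sketched as membership tests against the phases list. The sketch assumes that each method returns a bool and that phases falls back to ['train', 'valid', 'test'] when None is given, as described for the phases parameter; PhaseSketch is a hypothetical class, not the library's code.

```python
DEFAULT_PHASES = ['train', 'valid', 'test']


class PhaseSketch:
    """Hypothetical stand-in for the phase checks of an ML task."""

    def __init__(self, phases=None):
        # None falls back to all three phases; copy to avoid shared state.
        self._phases = list(DEFAULT_PHASES if phases is None else phases)

    def do_train(self):
        return 'train' in self._phases

    def do_valid(self):
        return 'valid' in self._phases

    def do_test(self):
        return 'test' in self._phases


task = PhaseSketch(phases=['train', 'test'])
print(task.do_train(), task.do_valid(), task.do_test())  # True False True
```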

show_info()

Print information.

class multiml.task.SkleanPipelineTask(phases=None, input_var_names=None, output_var_names=None, save_var_names=None, pred_var_names=None, true_var_names=None, var_names=None, model=None, model_args=None, optimizer=None, optimizer_args=None, scheduler=None, scheduler_args=None, loss=None, loss_args=None, max_patience=None, loss_weights=None, load_weights=False, save_weights=False, metrics=None, num_epochs=10, batch_size=64, num_workers=0, verbose=None, **kwargs)

Bases: MLBaseTask

Wrapper task to process an sklearn object.

execute()

Execute fit.