multiml.task.basic.ml_base module
MLBaseTask module.
- class multiml.task.basic.ml_base.MLBaseTask(phases=None, input_var_names=None, output_var_names=None, save_var_names=None, pred_var_names=None, true_var_names=None, var_names=None, model=None, model_args=None, optimizer=None, optimizer_args=None, scheduler=None, scheduler_args=None, loss=None, loss_args=None, max_patience=None, loss_weights=None, load_weights=False, save_weights=False, metrics=None, num_epochs=10, batch_size=64, num_workers=0, verbose=None, **kwargs)
Bases: BaseTask
Base task class for (deep) machine learning tasks.
- __init__(phases=None, input_var_names=None, output_var_names=None, save_var_names=None, pred_var_names=None, true_var_names=None, var_names=None, model=None, model_args=None, optimizer=None, optimizer_args=None, scheduler=None, scheduler_args=None, loss=None, loss_args=None, max_patience=None, loss_weights=None, load_weights=False, save_weights=False, metrics=None, num_epochs=10, batch_size=64, num_workers=0, verbose=None, **kwargs)
Initialize ML base task.
This base class is inherited by the deep learning task classes KerasBaseTask() and PytorchBaseTask().
input_var_names and output_var_names specify the data for model inputs and outputs. If input_var_names is a list, e.g. ['var0', 'var1'], the model receives data in the format [(batch size, k), (batch size, k)], where k is the arbitrary shape of each variable. If input_var_names is a tuple, e.g. ('var0', 'var1'), the model receives data of shape (batch size, M, k), where M is the number of variables. If output_var_names is a list, the model must return a list of tensors, one for each variable. If output_var_names is a tuple, the model must return a single tensor.
pred_var_names and true_var_names specify the data for loss calculations. If pred_var_names is given, only the variables indicated by pred_var_names are selected from the model outputs before being passed to the loss calculation. Please see KerasBaseTask() or PytorchBaseTask() for actual examples.
- Parameters:
phases (list) – list indicating ML phases, e.g. ['train', 'test']. If None is given, ['train', 'valid', 'test'] is set.
input_var_names (str or list or tuple) – input variable names in StoreGate.
output_var_names (str or list or tuple) – output variable names of model.
save_var_names (str or list) – variable names saved to StoreGate.
pred_var_names (str or list) – prediction variable names passed to loss.
true_var_names (str or list or tuple) – true variable names.
var_names (str) – string of "input output true" variable names, as a shortcut. The shortcut cannot be used to specify multiple variables.
model (str or obj) – name of model, or class object of model.
model_args (dict) – args of model, e.g. dict(param0=0, param1=1).
optimizer (str or obj) – name of optimizer, or class object of optimizer
optimizer_args (dict) – args of optimizer.
scheduler (str or obj) – name of scheduler, or class object of scheduler
scheduler_args (dict) – args of scheduler.
loss (str or obj) – name of loss, or class object of loss
loss_args (dict) – args of loss.
max_patience (int) – maximum patience for early stopping. early_stopping is enabled if max_patience is given.
loss_weights (list) – scalar coefficients to weight the loss.
load_weights (bool or str) – user-defined algorithms should assume the following behavior. If False, model weights are not loaded. If True, model weights are loaded from the default location. If str, weights are loaded from the given path.
save_weights (bool or str) – user-defined algorithms should assume the following behavior. If False, model weights are not saved. If True, model weights are saved to the default location. If str, weights are saved to the given path.
metrics (list) – metrics of evaluation.
num_epochs (int) – number of epochs.
batch_size (int or dict) – size of mini-batch. A different batch_size can be set for train, valid and test.
num_workers (int) – number of workers for dataloaders.
verbose (int) – verbosity option for the fitting step. If None, it is set based on logger.MIN_LEVEL.
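The list and tuple conventions for input_var_names described above can be illustrated with a small shape calculation. This is only a sketch in plain Python, not multiml code: model_input_shapes and var_shapes are hypothetical names, and the requirement that variables in a tuple share the same shape k is an assumption made so that they can be stacked.

```python
def model_input_shapes(input_var_names, var_shapes, batch_size):
    """Return the shape(s) of the data a model would receive.

    var_shapes maps a variable name to its per-sample shape k
    (a hypothetical input used only for this illustration).
    """
    if isinstance(input_var_names, list):
        # list: one array per variable, each of shape (batch size, k)
        return [(batch_size, *var_shapes[name]) for name in input_var_names]
    if isinstance(input_var_names, tuple):
        # tuple: variables are stacked into one array of shape (batch size, M, k);
        # assumption: all variables share the same shape k
        shapes = {tuple(var_shapes[name]) for name in input_var_names}
        if len(shapes) != 1:
            raise ValueError("variables in a tuple must share one shape")
        (k,) = shapes
        return (batch_size, len(input_var_names), *k)
    raise TypeError("this sketch only covers list and tuple")


var_shapes = {"var0": (3,), "var1": (3,)}
list_shapes = model_input_shapes(["var0", "var1"], var_shapes, batch_size=64)
tuple_shape = model_input_shapes(("var0", "var1"), var_shapes, batch_size=64)
print(list_shapes)  # [(64, 3), (64, 3)]
print(tuple_shape)  # (64, 2, 3)
```

With M = 2 variables of shape k = (3,), the list form yields two separate inputs and the tuple form yields one stacked input.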
- set_hps(params)
Set hyperparameters to this task.
Class attributes (self._XXX) are automatically set based on the keys and values of the given dict. Hyperparameters starting with 'model__', 'optimizer__' and 'loss__' are considered as args of model, optimizer and loss, respectively. If the value of a hyperparameter is a str and starts with 'saver__', the value is retrieved from the Saver instance; please see the examples below.
- Parameters:
params (dict) – key and value of hyperparameters.
Example
>>> hps_dict = {
...     'num_epochs': 10,  # normal hyperparameter
...     'optimizer__lr': 0.01,  # hyperparameter of optimizer
...     'saver_hp': 'saver__key__value',  # hyperparameter from saver
... }
>>> task.set_hps(hps_dict)
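The prefix-based routing described for set_hps() can be sketched as follows. This is a minimal illustration and not multiml's implementation: split_hps and the keys of the returned dict are hypothetical, and values starting with 'saver__' are only flagged here, since how they are looked up in Saver is not specified above.

```python
PREFIXES = ("model", "optimizer", "loss")


def split_hps(params):
    """Group hyperparameters by their 'model__', 'optimizer__' or 'loss__' prefix."""
    routed = {"attributes": {}, "from_saver": {}}
    routed.update({f"{prefix}_args": {} for prefix in PREFIXES})

    for key, value in params.items():
        if isinstance(value, str) and value.startswith("saver__"):
            # value would be retrieved from Saver; only recorded in this sketch
            routed["from_saver"][key] = value
            continue
        prefix, sep, name = key.partition("__")
        if sep and prefix in PREFIXES:
            # e.g. 'optimizer__lr' becomes optimizer_args['lr']
            routed[f"{prefix}_args"][name] = value
        else:
            # plain hyperparameter, stored as a self._XXX style attribute name
            routed["attributes"][f"_{key}"] = value
    return routed


routed = split_hps({
    "num_epochs": 10,
    "optimizer__lr": 0.01,
    "saver_hp": "saver__key__value",
})
print(routed["attributes"])      # {'_num_epochs': 10}
print(routed["optimizer_args"])  # {'lr': 0.01}
print(routed["from_saver"])      # {'saver_hp': 'saver__key__value'}
```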
- execute()
Execute a task.
- fit(train_data=None, valid_data=None)
Fit model.
- Parameters:
train_data (ndarray) – training data.
valid_data (ndarray) – validation data.
- predict(data=None, phase=None)
Predict model.
- Parameters:
data (ndarray) – prediction data.
- update(data, phase='auto')
Update data in StoreGate.
- Parameters:
data (ndarray) – new data.
phase (str) – train, valid, test, auto.
- fit_predict(fit_args=None, predict_args=None)
Fit and predict model.
- Parameters:
fit_args (dict) – arbitrary dict passed to fit().
predict_args (dict) – arbitrary dict passed to predict().
- Returns:
results of prediction.
- Return type:
ndarray or list
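The relationship between fit_predict(), fit() and predict() can be sketched with a toy class. ToyTask is a hypothetical stand-in, not a multiml class, and the sketch assumes that fit_args and predict_args are unpacked as keyword arguments; the toy predict() simply doubles its input so that the result is easy to check.

```python
class ToyTask:
    """Hypothetical stand-in that records calls instead of training a model."""

    def __init__(self):
        self.calls = []

    def fit(self, train_data=None, valid_data=None):
        self.calls.append(("fit", train_data, valid_data))

    def predict(self, data=None, phase=None):
        self.calls.append(("predict", data, phase))
        # toy prediction: double each input value
        return [2 * x for x in (data or [])]

    def fit_predict(self, fit_args=None, predict_args=None):
        # assumption: the dicts are forwarded as keyword arguments
        fit_args = {} if fit_args is None else fit_args
        predict_args = {} if predict_args is None else predict_args
        self.fit(**fit_args)
        return self.predict(**predict_args)


task = ToyTask()
result = task.fit_predict(
    fit_args={"train_data": [1, 2]},
    predict_args={"data": [3, 4], "phase": "test"},
)
print(result)      # [6, 8]
print(task.calls)  # [('fit', [1, 2], None), ('predict', [3, 4], 'test')]
```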
- predict_update(data=None, phase=None)
Predict and update data in StoreGate.
- Parameters:
data (ndarray) – data passed to the predict() method.
- property phases
Returns ML phases.
- property input_var_names
Returns input_var_names.
- property output_var_names
Returns output_var_names.
- property save_var_names
Returns save_var_names.
- property pred_var_names
Returns pred_var_names.
- property true_var_names
Returns true_var_names.
- property ml
Returns ML data class.
- compile()
Compile model, optimizer and loss.
Compiled objects will be available via self.ml.model, self.ml.optimizer and self.ml.loss.
Examples
>>> # compile all together,
>>> self.compile()
>>> # which is equivalent to:
>>> self.build_model()  # set self._model
>>> self.compile_model()  # set self.ml.model
>>> self.compile_optimizer()  # set self.ml.optimizer
>>> self.compile_loss()  # set self.ml.loss
- build_model()
Build model.
- compile_var_names()
Compile var_names.
- compile_model()
Compile model.
- compile_optimizer()
Compile optimizer.
- compile_loss()
Compile loss.
- load_model()
Load pre-trained model path from Saver.
- Returns:
model path.
- Return type:
str
- dump_model(extra_args=None)
Dump current model to Saver.
- Parameters:
extra_args (dict) – extra metadata to be stored together with model.
- load_metadata()
Load metadata.
- get_input_true_data(phase)
Get input and true data.
- Parameters:
phase (str) – data type (train, valid, test or None).
- Returns:
(input, true) data for model.
- Return type:
tuple
- get_input_var_shapes(phase='train')
Get shape of input_var_names.
- Parameters:
phase (str) – train, valid, test or None.
- Returns:
shape of a variable, or list of shapes
- Return type:
ndarray.shape or list
- get_metadata(metadata_key)
Returns metadata.
- Parameters:
metadata_key (str) – key of Saver().
- Returns:
arbitrary object stored in Saver.
- Return type:
Obj
- get_pred_index()
Returns prediction index passed to loss calculation.
- Returns:
list of prediction index.
- Return type:
list
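One way to picture the prediction index is as the positions of pred_var_names within output_var_names, which are then used to select model outputs before the loss calculation. The sketch below is an assumption about that mapping, not multiml's code: pred_index and select_outputs are hypothetical helpers, and output_var_names is assumed to be a list.

```python
def pred_index(output_var_names, pred_var_names):
    """Return positions of pred_var_names within output_var_names."""
    if pred_var_names is None:
        # assumption: without pred_var_names, no selection is applied
        return None
    if isinstance(pred_var_names, str):
        pred_var_names = [pred_var_names]
    return [list(output_var_names).index(name) for name in pred_var_names]


def select_outputs(outputs, index):
    """Keep only the model outputs that are passed to the loss."""
    if index is None:
        return outputs
    return [outputs[i] for i in index]


index = pred_index(["out0", "out1", "out2"], ["out2", "out0"])
selected = select_outputs(["a", "b", "c"], index)
print(index)     # [2, 0]
print(selected)  # ['c', 'a']
```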
- do_train()
Perform train phase or not.
- do_valid()
Perform valid phase or not.
- do_test()
Perform test phase or not.
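The phase checks can be read as membership tests on the configured phases. The following sketch assumes that reading: resolve_phases and do_phase are hypothetical helpers, and the default list follows the description of the phases parameter above.

```python
DEFAULT_PHASES = ["train", "valid", "test"]


def resolve_phases(phases=None):
    """Return the given phases, or the documented default if None."""
    return list(DEFAULT_PHASES) if phases is None else list(phases)


def do_phase(phases, name):
    """Assumed meaning of do_train(), do_valid() and do_test()."""
    return name in phases


phases = resolve_phases(["train", "test"])
print(do_phase(phases, "train"))  # True
print(do_phase(phases, "valid"))  # False
print(resolve_phases())           # ['train', 'valid', 'test']
```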
- show_info()
Print information.