	AutoML Overview
Nano provides built-in AutoML support through hyperparameter optimization.
By simply changing imports, you can search the model architecture (e.g. by specifying search spaces in layer/activation/function arguments when defining the model) or the training procedure (e.g. by specifying search spaces in learning_rate or batch_size). Use model.search (tensorflow) / Trainer.search (pytorch) to launch search trials, and model.search_summary (tensorflow) / Trainer.search_summary (pytorch) to review the search results.
Under the hood, the objects (layers, activations, model, etc.) are implicitly turned into searchable objects at creation, which allows search spaces to be specified in their init arguments. Nano HPO collects those search spaces and passes them to the underlying HPO engine (i.e. Optuna) which generates hyperparameter suggestions accordingly. The instantiation and execution of the corresponding objects are delayed until the hyperparameter values are available in each trial.
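The sections below walk through each step in detail. As a quick preview, a minimal TensorFlow sketch of the workflow looks roughly like the following (x_train/y_train stand in for your own data, and the hyperparameter values are illustrative only):
import bigdl.nano.automl as nano_automl
import bigdl.nano.automl.hpo.space as space
nano_automl.hpo_config.enable_hpo_tf()
from bigdl.nano.tf.keras.layers import Dense
from bigdl.nano.automl.tf.keras import Sequential
# define a model with a search space in a layer argument
model = Sequential()
model.add(Dense(units=space.Categorical(32, 64), activation="relu"))
model.add(Dense(10, activation="softmax"))
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="adam", metrics=["accuracy"])
# run the search, review the results, then fit with the best hyperparameters
model.search(n_trials=2, target_metric='accuracy', direction="maximize",
             x=x_train, y=y_train, validation_split=0.2, epochs=2)
study = model.search_summary()
model.fit(x_train, y_train, epochs=2, validation_split=0.2)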
Install
If you have not installed BigDL-Nano, follow Nano Install Guide to install it according to your system and framework (i.e. tensorflow or pytorch).
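For example, the install command is typically one of the following, depending on your framework (check the install guide for the exact options for your system):
pip install bigdl-nano[tensorflow]
pip install bigdl-nano[pytorch]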
Next, install a few dependencies required for Nano HPO using the commands below.
pip install ConfigSpace
pip install optuna
Search Spaces
Search spaces are value range specifications that the search engine uses for sampling hyperparameters. The available search spaces in Nano HPO are defined in bigdl.nano.automl.hpo.space. Refer to Search Space API doc for more details.
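For illustration, the space types used throughout this guide are constructed like the following (a quick sketch; see the Search Space API doc for the exact signatures):
import bigdl.nano.automl.hpo.space as space
# choose one value from a discrete set of candidates
units = space.Categorical(32, 64)
# sample a float from a range, here on a log scale
learning_rate = space.Real(0.0001, 0.01, log=True)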
For Tensorflow Users
Enable/Disable HPO for tensorflow
For tensorflow training, you should call hpo_config.enable_hpo_tf before using Nano HPO.
hpo_config.enable_hpo_tf will dynamically add searchable layers, activations, functions, optimizers, etc. into the bigdl.nano.tf module. When importing layers, you need to change the imports from tf.keras.layers to bigdl.nano.tf.keras.layers so that you can specify search spaces in their init arguments. Note that even if you don't need to search the model architecture, you still need to change the imports to use HPO.
import bigdl.nano.automl as nano_automl
nano_automl.hpo_config.enable_hpo_tf()
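With HPO enabled, the layer imports described above change, for example, like this (the same searchable imports are used in the model-definition examples below):
# before: from tensorflow.keras.layers import Dense, Conv2D, Flatten
from bigdl.nano.tf.keras.layers import Dense, Conv2D, Flatten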
To disable HPO, use hpo_config.disable_hpo_tf. This will remove the searchable objects from the bigdl.nano.tf module.
import bigdl.nano.automl as nano_automl
nano_automl.hpo_config.disable_hpo_tf()
Search the Model Architecture
To search different versions of your model, you can specify search spaces when defining the model using either the Sequential API, the Functional API, or by subclassing tf.keras.Model.
using Sequential API
You can specify search spaces in layer arguments. Note that search spaces can only be specified in keyword arguments (which means Dense(space.Int(...)) should be changed to Dense(units=space.Int(...))). Remember to import Sequential from bigdl.nano.automl.tf.keras instead of tensorflow.keras.
import bigdl.nano.automl.hpo.space as space
from bigdl.nano.tf.keras.layers import Dense, Conv2D, Flatten
from bigdl.nano.automl.tf.keras import Sequential
model = Sequential()
model.add(Conv2D(
    filters=space.Categorical(32, 64),
    kernel_size=space.Categorical(3, 5),
    strides=space.Categorical(1, 2),
    activation=space.Categorical("relu", "linear"),
    input_shape=input_shape))
model.add(Flatten())
model.add(Dense(10, activation="softmax"))
using Functional API
You can specify search spaces in layer arguments. Note that if a layer is used more than once in the model, we strongly suggest you specify a prefix for each search space in such layers to distinguish them; otherwise they will share the same search space (the last space will override all previous definitions), as shown in the example below. Remember to import Model from bigdl.nano.automl.tf.keras instead of tensorflow.keras.
import bigdl.nano.automl.hpo.space as space
from bigdl.nano.tf.keras import Input
from bigdl.nano.tf.keras.layers import Dense, Dropout
from bigdl.nano.automl.tf.keras import Model
inputs = Input(shape=(784,))
x = Dense(units=space.Categorical(8,16,prefix='dense_1'), activation="linear")(inputs)
x = Dense(units=space.Categorical(32,64,prefix='dense_2'), activation="tanh")(x)
x = Dropout(rate=space.Real(0.1,0.5, prefix='dropout'))(x)
outputs = Dense(units=10)(x)
model = Model(inputs=inputs, outputs=outputs, name="mnist_model")
by Subclassing tf.keras.Model
For models defined by subclassing tf.keras.Model, use the decorator @hpo.tfmodel to turn the model into a searchable object. Then you will be able to specify either search spaces or normal values in the model init arguments.
import tensorflow as tf
import bigdl.nano.automl.hpo.space as space
import bigdl.nano.automl.hpo as hpo
@hpo.tfmodel()
class MyModel(tf.keras.Model):
    def __init__(self, filters, kernel_size, strides, num_classes=10):
        super().__init__()
        self.conv1 = tf.keras.layers.Conv2D(filters=filters,
                            kernel_size=kernel_size,
                            strides=strides,
                            activation="relu")
        self.max1  = tf.keras.layers.MaxPooling2D(3)
        self.bn1   = tf.keras.layers.BatchNormalization()
        self.gap   = tf.keras.layers.GlobalAveragePooling2D()
        self.dense = tf.keras.layers.Dense(num_classes)
    def call(self, inputs, training=False):
        x = self.conv1(inputs)
        x = self.max1(x)
        x = self.bn1(x)
        x = self.gap(x)
        return self.dense(x)
model = MyModel(
    filters=hpo.space.Categorical(32, 64),
    kernel_size=hpo.space.Categorical(3, 5),
    strides=hpo.space.Categorical(1, 2)
)
Search the Learning Rate
To search the learning rate, specify a search space for the learning_rate argument of the optimizer passed to model.compile. Remember to import the optimizer from bigdl.nano.tf.optimizers instead of tf.keras.optimizers.
import bigdl.nano.automl.hpo.space as space
from tensorflow import keras
from bigdl.nano.tf.optimizers import RMSprop
model.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=RMSprop(learning_rate=space.Real(0.0001, 0.01, log=True)),
    metrics=["accuracy"],
)
Search the Batch Size
To search the batch size, specify a search space for the batch_size argument of model.search.
import bigdl.nano.automl.hpo.space as space
model.search(n_trials=2, target_metric='accuracy', direction="maximize",
    x=x_train, y=y_train,validation_data=(x_valid, y_valid),
    batch_size=space.Categorical(128,64))
Launch Hyperparameter Search and Review the Results
To launch hyperparameter search, call model.search after compile, as shown below. model.search runs n_trials trials (i.e. n_trials sets of hyperparameter combinations are evaluated) and optimizes the target_metric in the specified direction. Besides the search arguments, you also need to specify the fit arguments in model.search, which will be used in the fitting process of each trial. Refer to API docs for details.
Call model.search_summary to retrieve the search results, which you can use to get all trial statistics in pandas dataframe format, pick the best trial, or do visualizations.  Examples of search results analysis and visualization can be found here.
Finally, model.fit will automatically fit the model using the best set of hyperparameters found in the search. You can also use the hyperparameters from a particular trial other than the best one. Refer to API docs for details.
model = ... # define the model
model.compile(...)
model.search(n_trials=100, target_metric='accuracy', direction="maximize",
    x=x_train, y=y_train, batch_size=32, epochs=20, validation_split=0.2)
study = model.search_summary()
model.fit(...)
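Since the underlying engine is Optuna, the object returned by model.search_summary can typically be inspected like an Optuna study. The sketch below assumes the standard Optuna Study interface; attribute names may differ slightly in Nano:
# all trial statistics as a pandas DataFrame
df = study.trials_dataframe()
# hyperparameters of the best trial
print(study.best_trial.params)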
For PyTorch Users
Nano HPO currently supports hyperparameter search only for pytorch-lightning modules.
Search the Model Architecture
To search the model architecture, use the decorator @hpo.plmodel() to turn the model into a searchable object. Put the arguments that you want to search into the model's init arguments and use them to construct the model. The arguments can be either space or non-space values, as shown below.
import torch
import pytorch_lightning as pl
import bigdl.nano.automl.hpo.space as space
import bigdl.nano.automl.hpo as hpo
@hpo.plmodel()
class MyModel(pl.LightningModule):
    """Customized Model."""
    def __init__(self,out_dim1,out_dim2,dropout_1,dropout_2):
        super().__init__()
        layers = []
        input_dim = 32
        for out_dim, dropout in [(out_dim1, dropout_1),(out_dim2,dropout_2)]:
            layers.append(torch.nn.Linear(input_dim, out_dim))
            layers.append(torch.nn.Tanh())
            layers.append(torch.nn.Dropout(dropout))
            input_dim = out_dim
        layers.append(torch.nn.Linear(input_dim, 2))
        self.layers: torch.nn.Module = torch.nn.Sequential(*layers)
        self.save_hyperparameters()
    def forward(self, x):
        return self.layers(x)
model = MyModel(
    out_dim1=space.Categorical(16,32),
    out_dim2=space.Categorical(16,32),
    dropout_1=space.Categorical(0.1, 0.2, 0.3, 0.4, 0.5),
    dropout_2=0.5)
Search the Learning Rate
learning_rate can be specified in the init arguments of your model. You can use learning_rate to construct the optimizer in configure_optimizers(), as shown below.
import torch
import pytorch_lightning as pl
import bigdl.nano.automl.hpo.space as space
import bigdl.nano.automl.hpo as hpo
@hpo.plmodel()
class MyModel(pl.LightningModule):
    def __init__(self, ..., learning_rate=0.1):
        ...
        self.save_hyperparameters()
    def configure_optimizers(self):
        # set learning rate in the optimizer
        self.optimizer = torch.optim.Adam(self.layers.parameters(),
                                          lr=self.hparams.learning_rate)
        return [self.optimizer], []
model = MyModel(..., learning_rate=space.Real(0.001,0.01,log=True))
Search the Batch Size
batch_size can be specified in the init arguments of your model. You can use the batch_size to construct the DataLoader in train_dataloader(), as shown below.
import pytorch_lightning as pl
from torch.utils.data import DataLoader
import bigdl.nano.automl.hpo.space as space
import bigdl.nano.automl.hpo as hpo
@hpo.plmodel()
class MyModel(pl.LightningModule):
    def __init__(self, ..., batch_size=16):
        ...
        self.save_hyperparameters()
    def train_dataloader(self):
        # set the batch size in train dataloader
        return DataLoader(RandomDataset(32, 64),
                          batch_size=self.hparams.batch_size)
model = MyModel(..., batch_size = space.Categorical(32,64))
Launch Hyperparameter Search and Review the Results
First of all, import Trainer from bigdl.nano.pytorch instead of pytorch_lightning. Remember to set use_hpo=True when initializing the Trainer.
To launch hyperparameter search, call Trainer.search after the model is defined. Trainer.search takes the decorated model as input. Similar to tensorflow, Trainer.search runs n_trials trials (i.e. n_trials sets of hyperparameter combinations are evaluated) and optimizes the target_metric in the specified direction. There is an extra argument max_epochs, which is used only in the fitting process of the search trials and does not affect Trainer.fit. Trainer.search returns a model configured with the best set of hyperparameters.
Call Trainer.search_summary to retrieve the search results, which you can use to get all trial statistics in pandas dataframe format, pick the best trial, or do visualizations. Examples of search results analysis and visualization can be found here.
Finally, you can use Trainer.fit() to fit the best model. You can also get a model constructed with the hyperparameters from a particular trial other than the best one. Refer to the Trainer.search API doc for more details.
from bigdl.nano.pytorch import Trainer
model = MyModel(...)
trainer = Trainer(..., use_hpo=True)
best_model = trainer.search(
    model,
    target_metric='val_loss',
    direction='minimize',
    n_trials=100,
    max_epochs=20,
)
study = trainer.search_summary()
trainer.fit(best_model)