Implementing new searchers

Searchers determine the policy used to explore a search space. Search spaces encode the set of architectures to be considered.

Searcher API

We have a very simple API for a searcher, composed of two main methods and two auxiliary methods.

class Searcher:
    """Abstract base class from which new searchers should inherit.

    A searcher takes a function that, when called, returns a new search space,
    i.e., a search space where none of the hyperparameters have been assigned
    values yet. Searchers essentially sample a sequence of models from the
    search space by specifying its hyperparameters sequentially. After the
    sampled architecture has been evaluated somehow, the state of the searcher
    can be updated with the performance information, guaranteeing that future
    architectures are sampled from the search space in a more informed manner.

    Args:
        search_space_fn (() -> (dict[str, deep_architect.core.Input], dict[str, deep_architect.core.Output])):
            Search space function that when called returns a dictionary of
            inputs and a dictionary of outputs encoding the search space from
            which models can be sampled by specifying all of its
            hyperparameters.
    """

    def __init__(self, search_space_fn):
        self.search_space_fn = search_space_fn

    def sample(self):
        """Returns a model from the search space.

        Models are encoded via a dictionary of inputs, a dictionary of outputs,
        a list of hyperparameter values, and a searcher evaluation token. The
        forward computation for the model can then be done, as all values for
        the hyperparameters have been chosen.

        Returns:
            (dict[str, deep_architect.core.Input], dict[str, deep_architect.core.Output], list[object], dict[str, object]):
                Tuple encoding the model sampled from the search space.
                The positional elements have the following semantics:

                1: Dictionary of names to inputs of the model.
                2: Dictionary of names to outputs of the model.
                3: List of values that can be used to replay the sequence of
                values assigned to the hyperparameters, and therefore to
                reproduce, given the search space function, the sampled model.
                4: Searcher evaluation token that is sufficient for the
                searcher to update its state when combined with the results of
                the evaluation.
        """
        raise NotImplementedError

    def update(self, val, searcher_eval_token):
        """Updates the state of the searcher based on the searcher evaluation
        token for a particular evaluation and the results of that evaluation.

        Args:
            val (object): Result of the evaluation to use to update the state of the searcher.
            searcher_eval_token (dict[str, object]): Searcher evaluation token
                that is sufficient for the searcher to update its state when
                combined with the results of the evaluation.
        """
        raise NotImplementedError

Implementing a new searcher is done by inheriting from this class. A searcher is initialized with a function that returns a new search space, which is simply a dictionary of inputs and a dictionary of outputs. The sample method picks an architecture from this search space by assigning values to the independent hyperparameters, one at a time, until no unassigned independent hyperparameters are left.
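
To make the interplay between sample and update concrete, here is a minimal sketch of a typical search loop. The names my_search_space_fn and evaluate_model are hypothetical placeholders (any search space function and any evaluation procedure returning a scalar metric would do); RandomSearcher is the searcher shown later in this section.

# Minimal sketch of a search loop; `my_search_space_fn` and `evaluate_model`
# are hypothetical placeholders, not part of the library.
searcher = RandomSearcher(my_search_space_fn)
for _ in range(128):
    inputs, outputs, vs, searcher_eval_token = searcher.sample()
    # Evaluate the sampled architecture somehow, e.g., train it and measure
    # validation accuracy.
    val = evaluate_model(inputs, outputs)
    # Pass the result back so that future samples are more informed.
    searcher.update(val, searcher_eval_token)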

Sample returns the inputs and outputs, which encode the architecture. The other two returned elements deserve a bit more explanation. The third element returned by sample is the list of hyperparameter values that led to the returned architecture. If we call the search space function, iterate over the unassigned independent hyperparameters, and assign the values from this list in order, we obtain the same architecture back. The combination of the search space function and this list of hyperparameter values therefore encodes the sampled architecture.
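
Given the list of values vs returned by sample, the architecture can be reconstructed with the specify helper from deep_architect.searchers.common; this is exactly what the SMBO searcher below does in its update method. A minimal sketch:

from deep_architect.searchers.common import specify

# Rebuild the sampled architecture by replaying the recorded hyperparameter
# values `vs` over a fresh copy of the search space.
inputs, outputs = search_space_fn()
specify(outputs, vs)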

The fourth element returned is what we call a searcher evaluation token. This token keeps any information necessary for the searcher to update its state once the result of evaluating the sampled architecture is passed back to the searcher in the call to update. The token is especially useful when multiple workers are serviced by a single searcher: the results of architecture evaluations may arrive in a different order than they were relayed to the workers, and the token allows the searcher to identify the architecture to which each result corresponds.
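
As a sketch, suppose two architectures are sampled back to back and their evaluations happen to finish in reverse order; the tokens keep the pairing straight (evaluate_model is again a hypothetical placeholder):

inputs1, outputs1, vs1, token1 = searcher.sample()
inputs2, outputs2, vs2, token2 = searcher.sample()
# The second evaluation finishes first; the tokens identify which
# architecture each result belongs to, so the update order does not matter.
searcher.update(evaluate_model(inputs2, outputs2), token2)
searcher.update(evaluate_model(inputs1, outputs1), token1)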

Finally, we look at the update method of the searcher. It takes the results of the evaluation and the searcher evaluation token (which allows the searcher to identify which architecture the results refer to) and updates the state of the searcher with this new information. Updates change the state of the searcher and therefore its future behavior.

The other two auxiliary methods (which we have omitted above) are save_state and load_state, which allow us to save the state of the searcher to disk and load it back. This is especially useful for long-running searches that have to be restarted multiple times due to job length limits or hardware failures.
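
A hedged sketch of how checkpointing might look; we assume here that save_state and load_state take a folder path, so check searchers/common.py for the exact signatures:

# Assumption: save_state/load_state take a folder path.
searcher.save_state('./searcher_checkpoint')

# ... later, in a fresh job ...
searcher = RandomSearcher(my_search_space_fn)
searcher.load_state('./searcher_checkpoint')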

We will now go over two different searchers to help the reader ground the discussion.

Random searcher

The simplest possible searcher is a random searcher, which assigns a random value to each of the unassigned hyperparameters.

from deep_architect.searchers.common import random_specify, Searcher


class RandomSearcher(Searcher):

    def __init__(self, search_space_fn):
        Searcher.__init__(self, search_space_fn)

    def sample(self):
        # Draw a fresh copy of the search space and assign random values to
        # all of its unassigned independent hyperparameters.
        inputs, outputs = self.search_space_fn()
        vs = random_specify(outputs)
        # Random search keeps no state, so an empty evaluation token suffices.
        return inputs, outputs, vs, {}

    def update(self, val, searcher_eval_token):
        # Random search ignores evaluation feedback.
        pass

The implementation of this searcher is very short. It relies on random_specify, which is also very compact; we copy it here for reference.

import numpy as np

import deep_architect.core as co
import deep_architect.hyperparameters as hp


def random_specify_hyperparameter(hyperp):
    """Chooses a random value for an unspecified hyperparameter.

    The hyperparameter becomes specified after the call.

    Args:
        hyperp (deep_architect.core.Hyperparameter): Hyperparameter to specify.
    """
    assert not hyperp.has_value_assigned()

    if isinstance(hyperp, hp.Discrete):
        # Pick one of the candidate values uniformly at random.
        v = hyperp.vs[np.random.randint(len(hyperp.vs))]
        hyperp.assign_value(v)
    else:
        raise ValueError('Unsupported hyperparameter type: %s' % type(hyperp))
    return v


def random_specify(output_lst):
    """Chooses random values for all the unspecified hyperparameters.

    The hyperparameters will all be specified after this call, meaning that
    the compile and forward functionalities will be available to be called.

    Args:
        output_lst (list[deep_architect.core.Output]): List of outputs from
            which all the modules in the search space can be reached by
            backward traversal, and therefore all the currently unspecified
            hyperparameters of the search space.
    """
    hyperp_value_lst = []
    for h in co.unassigned_independent_hyperparameter_iterator(output_lst):
        v = random_specify_hyperparameter(h)
        hyperp_value_lst.append(v)
    return hyperp_value_lst

These are the two main auxiliary functions: one randomly specifies a single hyperparameter, and the other picks a random architecture from the search space by choosing values for all its hyperparameters independently at random. Both are concise and self-explanatory.
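
As a quick illustration of the single-hyperparameter helper, the sketch below specifies a discrete hyperparameter; we assume hp.Discrete is constructed from a list of candidate values, as suggested by the use of hyperp.vs above:

import deep_architect.hyperparameters as hp

# Hypothetical hyperparameter over the number of hidden units.
h = hp.Discrete([32, 64, 128])
v = random_specify_hyperparameter(h)
assert h.has_value_assigned() and v in [32, 64, 128]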

SMBO searcher

Let us now look at an SMBO searcher, which is more complex than the random searcher. We copy the implementation here for ease of reference.

from deep_architect.searchers.common import random_specify, specify, Searcher
from deep_architect.surrogates.common import extract_features
import numpy as np


class SMBOSearcher(Searcher):

    def __init__(self, search_space_fn, surrogate_model, num_samples, eps_prob):
        Searcher.__init__(self, search_space_fn)
        self.surr_model = surrogate_model
        # Number of random candidates scored by the surrogate per sample call.
        self.num_samples = num_samples
        # Probability of sampling an architecture uniformly at random instead.
        self.eps_prob = eps_prob

    def sample(self):
        if np.random.rand() < self.eps_prob:
            # Epsilon-greedy exploration: ignore the surrogate and pick a
            # random architecture.
            inputs, outputs = self.search_space_fn()
            best_vs = random_specify(outputs)
        else:
            # Optimize the surrogate by random search: draw num_samples random
            # architectures and keep the one with the highest surrogate score.
            best_model = None
            best_vs = None
            best_score = -np.inf
            for _ in range(self.num_samples):
                inputs, outputs = self.search_space_fn()
                vs = random_specify(outputs)

                feats = extract_features(inputs, outputs)
                score = self.surr_model.eval(feats)
                if score > best_score:
                    best_model = (inputs, outputs)
                    best_vs = vs
                    best_score = score

            inputs, outputs = best_model

        # The list of hyperparameter values is enough to rebuild the
        # architecture in update, so it serves as the evaluation token.
        searcher_eval_token = {'vs': best_vs}
        return inputs, outputs, best_vs, searcher_eval_token

    def update(self, val, searcher_eval_token):
        # Replay the recorded hyperparameter values to reconstruct the
        # evaluated architecture, then refit the surrogate with its result.
        (inputs, outputs) = self.search_space_fn()
        specify(outputs, searcher_eval_token['vs'])
        feats = extract_features(inputs, outputs)
        self.surr_model.update(val, feats)

This searcher can be found in searchers/smbo_random.py. An SMBO (surrogate model based optimization) searcher relies on a surrogate function over the space of architectures that gives us a score estimating the performance of each architecture.

An architecture is sampled from the search space by optimizing the surrogate function. In the implementation above, this optimization is done by drawing a number of random architectures from the search space, evaluating the surrogate function on each of them, and picking the best one. Additionally, with a fixed probability (eps_prob), we instead pick an architecture uniformly at random, which keeps the searcher exploring.

In this case, updates to the searcher correspond to updates to the surrogate function with the observed results. The searcher policy should improve as the surrogate function becomes more accurate with the accumulation of evaluation data. The API definition for a surrogate function can be found in surrogates/common.py.
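
From its usage above, a surrogate model only needs an eval method that maps features to a score and an update method that incorporates an observed result (see surrogates/common.py for the actual base class). The following is a hedged sketch of a trivial surrogate that ignores the features and predicts the running mean of the results seen so far, together with a hypothetical instantiation of the searcher:

# Hedged sketch of a minimal surrogate; see deep_architect/surrogates/common.py
# for the actual interface.
class MeanSurrogate:

    def __init__(self):
        self.sum = 0.0
        self.num = 0

    def eval(self, feats):
        # Ignores the features; always predicts the running mean.
        return self.sum / self.num if self.num > 0 else 0.0

    def update(self, val, feats):
        self.sum += val
        self.num += 1


# Hypothetical instantiation; my_search_space_fn is a placeholder.
searcher = SMBOSearcher(my_search_space_fn, MeanSurrogate(), num_samples=64,
                        eps_prob=0.1)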

Concluding remarks

Implementing a new searcher amounts to implementing the sample and update methods for it. We point the reader to the searchers folder for more examples. There is a single searcher per file. We very much welcome searcher contributions. If you would like to contribute a search algorithm that you developed, please open an issue to discuss the implementation.