Configuration

Syntax to set the hyperparameter ranges, search strategy, and other aspects of your sweep.

Use these configuration fields to customize your sweep. There are two ways to specify your configuration:

  1. YAML file: best for distributed sweeps. See examples here.

  2. Python data structure: best for running a sweep from a Jupyter Notebook.

Top-level key    Meaning
name             The name of the sweep, displayed in the W&B UI
description      Text description of the sweep (notes)
program          Training script to run (required)
metric           Specify the metric to optimize (used by some search strategies and stopping criteria)
method           Specify the search strategy (required)
early_terminate  Specify the stopping criteria (optional, defaults to no early stopping)
parameters       Specify parameter bounds to search (required)
project          Specify the project for this sweep
entity           Specify the entity for this sweep
command          Specify the command line used to run the training script
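
As a rough illustration of the Python data structure option, the top-level keys above could be arranged in a dictionary like the one below. The script name train.py, the sweep name, and the hyperparameter names are placeholders rather than values taken from this page, and the closing comment only indicates where such a dictionary would typically be handed to wandb.sweep.

```python
# Minimal sketch of a sweep configuration as a Python dictionary.
# "train.py", "learning_rate", and "batch_size" are placeholder names.
sweep_config = {
    "name": "example-sweep",
    "description": "Sketch of a sweep defined in Python",
    "program": "train.py",
    "method": "bayes",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "early_terminate": {"type": "hyperband", "min_iter": 3},
    "parameters": {
        "learning_rate": {"distribution": "uniform", "min": 0.0001, "max": 0.1},
        "batch_size": {"values": [16, 32, 64]},
    },
}

# The keys this page marks as required.
REQUIRED_KEYS = ("program", "method", "parameters")
missing = [key for key in REQUIRED_KEYS if key not in sweep_config]

# With the wandb package installed, the dictionary would typically be passed on:
#   sweep_id = wandb.sweep(sweep_config, project="my-project")
```

Checking for the required keys before creating the sweep is optional, but it catches a forgotten method or parameters entry early.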

Metric

Specify the metric to optimize. This metric should be logged explicitly to W&B by your training script. For example, if you want to minimize the validation loss of your model:

# [model training code that returns validation loss as valid_loss]
wandb.log({"val_loss" : valid_loss})

metric sub-key  Meaning
name            Name of the metric to optimize
goal            minimize or maximize (default is minimize)
target          Value that you'd like to achieve for the metric you're optimizing. When any run in the sweep achieves that target value, the sweep's state will be set to "Finished." This means all agents with active runs will finish those jobs, but no new runs will be launched in the sweep.

The metric specified needs to be a "top level" metric:

This will NOT work:

Sweep configuration:
metric:
  name: my_metric.nested

Code:
nested_metrics = {"nested": 4}
wandb.log({"my_metric": nested_metrics})

To work around this limitation, the script should also log the nested metric at the top level, like this:

Sweep configuration:
metric:
  name: my_metric_nested

Code:
nested_metrics = {"nested": 4}
wandb.log({"my_metric": nested_metrics})
wandb.log({"my_metric_nested": nested_metrics["nested"]})
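
When many metrics are nested, the workaround can be automated. The sketch below uses a hypothetical helper, flatten_metrics, which is not part of wandb; it lifts nested dictionary entries to top-level keys joined with an underscore, so the result could be passed to wandb.log.

```python
def flatten_metrics(metrics, sep="_", prefix=""):
    """Return a copy of `metrics` with nested dicts lifted to top-level keys."""
    flat = {}
    for key, value in metrics.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse so that deeper levels are joined with the same separator.
            flat.update(flatten_metrics(value, sep=sep, prefix=name))
        else:
            flat[name] = value
    return flat


nested = {"my_metric": {"nested": 4}}
flat = flatten_metrics(nested)  # {"my_metric_nested": 4}
# wandb.log(flat) would then expose my_metric_nested as a top-level metric.
```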

Examples

Maximize
metric:
  name: val_loss
  goal: maximize

Minimize
metric:
  name: val_loss

Target
metric:
  name: val_loss
  goal: maximize
  target: 0.1

Search Strategy

Specify the search strategy with the method key in the sweep configuration.

method  Meaning
grid    Grid search iterates over all possible combinations of parameter values.
random  Random search chooses random sets of values.
bayes   Bayesian optimization uses a Gaussian process to model the function and then chooses parameters to optimize the probability of improvement. This strategy requires a metric key to be specified.

Examples

Random search
method: random

Grid search
method: grid

Bayes search
method: bayes
metric:
  name: val_loss
  goal: minimize
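
To make the difference between grid and random concrete, here is a small sketch of both strategies over a list-valued search space. It illustrates the idea only and is not how the sweep server is implemented; the search space and function names are hypothetical.

```python
import itertools
import random

# Hypothetical search space: each hyperparameter maps to its candidate values.
search_space = {"learning_rate": [0.01, 0.1], "batch_size": [16, 32, 64]}


def grid_search(space):
    """Yield every combination of values, one dict per combination."""
    names = list(space)
    for combo in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, combo))


def random_search(space, num_samples, seed=None):
    """Yield `num_samples` combinations, each value drawn independently."""
    rng = random.Random(seed)
    for _ in range(num_samples):
        yield {name: rng.choice(values) for name, values in space.items()}


grid_runs = list(grid_search(search_space))  # 2 * 3 = 6 combinations
random_runs = list(random_search(search_space, num_samples=4, seed=0))
```

Grid search terminates once every combination has been tried, whereas random search can keep drawing samples for as long as it is allowed to run.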

Stopping Criteria

Early termination is an optional feature that speeds up hyperparameter search by stopping poorly performing runs. When early stopping is triggered, the agent stops the current run and gets the next set of hyperparameters to try.

early_terminate sub-key  Meaning
type                     Specify the stopping algorithm

We support the following stopping algorithm(s):

type       Meaning
hyperband  Use the hyperband method

Hyperband stopping evaluates whether a program should be stopped or permitted to continue at one or more brackets during the execution of the program. Brackets are configured at static iterations for a specified metric. An iteration here counts how many times the metric has been logged, so if the metric is logged once per epoch, each iteration corresponds to one epoch.

In order to specify the bracket schedule, either min_iter or max_iter needs to be defined.

early_terminate sub-key  Meaning
min_iter                 Specify the iteration for the first bracket
max_iter                 Specify the maximum number of iterations for the program
s                        Specify the total number of brackets (required for max_iter)
eta                      Specify the bracket multiplier schedule (default: 3)

Examples

Hyperband (min_iter)
early_terminate:
  type: hyperband
  min_iter: 3

Brackets: 3, 9 (3 * eta), 27 (9 * eta), 81 (27 * eta)

Hyperband (max_iter)
early_terminate:
  type: hyperband
  max_iter: 27
  s: 2

Brackets: 9 (27 / eta), 3 (9 / eta)
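
The bracket arithmetic in the two examples above can be reproduced with a few lines of Python. This is a sketch inferred from those examples, not the actual scheduler code; the function names are hypothetical, and the count argument is an assumption added only to keep the min_iter list finite.

```python
def brackets_from_min_iter(min_iter, eta=3, count=4):
    """Brackets grow from min_iter by a factor of eta.

    `count` is a hypothetical cutoff used only to keep the list finite.
    """
    return [min_iter * eta ** k for k in range(count)]


def brackets_from_max_iter(max_iter, s, eta=3):
    """Brackets shrink from max_iter by a factor of eta, s times."""
    return [max_iter // eta ** k for k in range(1, s + 1)]


print(brackets_from_min_iter(3))        # [3, 9, 27, 81]
print(brackets_from_max_iter(27, s=2))  # [9, 3]
```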

Parameters

Describe the hyperparameters to explore. For each hyperparameter, specify the name and the possible values as a list of constants (for any method) or specify a distribution (for random or bayes).

Values                           Meaning
values: [(type1), (type2), ...]  Specifies all valid values for this hyperparameter. Compatible with grid.
value: (type)                    Specifies the single valid value for this hyperparameter. Compatible with grid.
distribution: (distribution)     Selects a distribution from the distribution table below. If not specified, will default to categorical if values is set, to int_uniform if max and min are set to integers, to uniform if max and min are set to floats, or to constant if value is set.
min: (float) max: (float)        Maximum and minimum valid values for uniform-distributed hyperparameters.
min: (int) max: (int)            Maximum and minimum values for int_uniform-distributed hyperparameters.
mu: (float)                      Mean parameter for normal- or lognormal-distributed hyperparameters.
sigma: (float)                   Standard deviation parameter for normal- or lognormal-distributed hyperparameters.
q: (float)                       Quantization step size for quantized hyperparameters.

Example

grid - single value
parameter_name:
  value: 1.618

grid - multiple values
parameter_name:
  values:
    - 8
    - 6
    - 7
    - 5
    - 3
    - 0
    - 9

random or bayes - normal distribution
parameter_name:
  distribution: normal
  mu: 100
  sigma: 10
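
The default rules for the distribution key can be read as a small decision procedure. The sketch below encodes them in a hypothetical helper, infer_distribution, which is not part of wandb. Treating a mix of integer and float bounds as uniform is an assumption, since this page only covers the all-integer and all-float cases.

```python
def infer_distribution(param):
    """Guess the distribution for one parameter entry, following the stated defaults."""
    if "distribution" in param:
        return param["distribution"]
    if "values" in param:
        return "categorical"
    if "min" in param and "max" in param:
        both_ints = isinstance(param["min"], int) and isinstance(param["max"], int)
        # Assumption: mixed int/float bounds are treated as a continuous range.
        return "int_uniform" if both_ints else "uniform"
    if "value" in param:
        return "constant"
    raise ValueError("cannot infer a distribution for this parameter")


print(infer_distribution({"values": [8, 6, 7]}))    # categorical
print(infer_distribution({"min": 1, "max": 10}))    # int_uniform
print(infer_distribution({"min": 0.0, "max": 1.0})) # uniform
print(infer_distribution({"value": 1.618}))         # constant
```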

Distributions

Name           Meaning
constant       Constant distribution. Must specify value.
categorical    Categorical distribution. Must specify values.
int_uniform    Discrete uniform distribution on integers. Must specify max and min as integers.
uniform        Continuous uniform distribution. Must specify max and min as floats.
q_uniform      Quantized uniform distribution. Returns round(X / q) * q where X is uniform. q defaults to 1.
log_uniform    Log-uniform distribution. Returns a value between exp(min) and exp(max) such that the natural logarithm is uniformly distributed between min and max.
q_log_uniform  Quantized log uniform. Returns round(X / q) * q where X is log_uniform. q defaults to 1.
normal         Normal distribution. Return value is normally distributed with mean mu (default 0) and standard deviation sigma (default 1).
q_normal       Quantized normal distribution. Returns round(X / q) * q where X is normal. q defaults to 1.
log_normal     Log normal distribution. Returns a value X such that the natural logarithm log(X) is normally distributed with mean mu (default 0) and standard deviation sigma (default 1).
q_log_normal   Quantized log normal distribution. Returns round(X / q) * q where X is log_normal. q defaults to 1.

Example

constant
parameter_name:
  distribution: constant
  value: 2.71828

categorical
parameter_name:
  distribution: categorical
  values:
    - elu
    - celu
    - gelu
    - selu
    - relu
    - prelu
    - lrelu
    - rrelu
    - relu6

uniform
parameter_name:
  distribution: uniform
  min: 0
  max: 1

q_uniform
parameter_name:
  distribution: q_uniform
  min: 0
  max: 256
  q: 1
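
To show what several of the distributions mean in practice, the sketch below draws one value from a parameter entry using only the Python standard library. It follows the formulas in the distributions table but is not the sampler W&B uses. The function names are hypothetical, and treating both int_uniform endpoints as inclusive is an assumption.

```python
import math
import random


def quantize(x, q=1):
    """Round x to the nearest multiple of q: round(x / q) * q."""
    return round(x / q) * q


def sample(param, rng):
    """Draw one value for a parameter entry (covers a subset of the table)."""
    dist = param["distribution"]
    if dist == "constant":
        return param["value"]
    if dist == "categorical":
        return rng.choice(param["values"])
    if dist == "int_uniform":
        # Assumption: both endpoints are inclusive.
        return rng.randint(param["min"], param["max"])
    if dist == "uniform":
        return rng.uniform(param["min"], param["max"])
    if dist == "q_uniform":
        return quantize(rng.uniform(param["min"], param["max"]), param.get("q", 1))
    if dist == "log_uniform":
        # The exponent is uniform on [min, max], so the result lies in [exp(min), exp(max)].
        return math.exp(rng.uniform(param["min"], param["max"]))
    if dist == "normal":
        return rng.gauss(param.get("mu", 0), param.get("sigma", 1))
    if dist == "log_normal":
        return math.exp(rng.gauss(param.get("mu", 0), param.get("sigma", 1)))
    raise ValueError(f"unsupported distribution: {dist}")


rng = random.Random(0)
batch_size = sample({"distribution": "q_uniform", "min": 0, "max": 256, "q": 8}, rng)
```

Note that min and max for log_uniform are bounds on the exponent, so a range of -5 to 0 yields values between exp(-5) and 1.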

Command Line

The sweep agent constructs a command line in the following format by default:

/usr/bin/env python train.py --param1=value1 --param2=value2

On Windows machines, /usr/bin/env is omitted. On UNIX systems, it ensures that the right Python interpreter is chosen based on the environment.
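
On the receiving side, a training script can read these --name=value flags with argparse. The following is a minimal sketch, assuming two placeholder hyperparameters, learning_rate and batch_size, which are not taken from this page.

```python
import argparse


def parse_args(argv=None):
    """Parse hyperparameters passed as --name=value flags."""
    parser = argparse.ArgumentParser(description="Hypothetical training script")
    # learning_rate and batch_size are placeholder hyperparameter names.
    parser.add_argument("--learning_rate", type=float, default=0.01)
    parser.add_argument("--batch_size", type=int, default=32)
    return parser.parse_args(argv)


# Simulates: /usr/bin/env python train.py --learning_rate=0.1 --batch_size=64
args = parse_args(["--learning_rate=0.1", "--batch_size=64"])
print(args.learning_rate, args.batch_size)  # 0.1 64
```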

This command line can be modified by specifying a command key in the configuration file.

By default the command is defined as:

command:
- ${env}
- ${interpreter}
- ${program}
- ${args}

Command Macro       Expansion
${env}              /usr/bin/env on UNIX systems, omitted on Windows
${interpreter}      Expands to "python".
${program}          Training script specified by the sweep configuration program key
${args}             Expanded arguments in the form --param1=value1 --param2=value2
${args_no_hyphens}  Expanded arguments in the form param1=value1 param2=value2
${json}             Arguments encoded as JSON
${json_file}        The path to a file containing the args encoded as JSON
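
The macro table can be read as a simple substitution pass over the command list. The sketch below mimics that expansion for illustration; it is not the agent's actual implementation, expand_command is a hypothetical name, and ${json_file} is left out because it would require writing a file.

```python
import json

DEFAULT_COMMAND = ["${env}", "${interpreter}", "${program}", "${args}"]


def expand_command(command, program, params, is_windows=False):
    """Expand the command macros into an argument list.

    ${json_file} is not handled here because it requires writing a file.
    """
    expanded = []
    for token in command:
        if token == "${env}":
            if not is_windows:  # omitted on Windows
                expanded.append("/usr/bin/env")
        elif token == "${interpreter}":
            expanded.append("python")
        elif token == "${program}":
            expanded.append(program)
        elif token == "${args}":
            expanded.extend(f"--{k}={v}" for k, v in params.items())
        elif token == "${args_no_hyphens}":
            expanded.extend(f"{k}={v}" for k, v in params.items())
        elif token == "${json}":
            expanded.append(json.dumps(params))
        else:
            expanded.append(token)  # literal argument, passed through unchanged
    return expanded


print(expand_command(DEFAULT_COMMAND, "train.py", {"param1": "value1", "param2": "value2"}))
# ['/usr/bin/env', 'python', 'train.py', '--param1=value1', '--param2=value2']
```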

Examples:

Set python interpreter

To hardcode the Python interpreter, you can specify the interpreter explicitly:

command:
- ${env}
- python3
- ${program}
- ${args}
Add extra parameters

Add extra command line arguments not specified by sweep configuration parameters:

command:
- ${env}
- ${interpreter}
- ${program}
- "-config"
- your-training-config
- ${args}
Omit arguments

If your program does not use argument parsing, you can avoid passing arguments altogether and take advantage of wandb.init() picking up sweep parameters automatically:

command:
- ${env}
- ${interpreter}
- ${program}
Use with Hydra

You can change the command to pass arguments the way tools like Hydra expect:

command:
- ${env}
- ${interpreter}
- ${program}
- ${args_no_hyphens}

Common Questions

Nested Config

Right now, sweeps do not support nested values, but we plan on supporting this in the near future.