2026-05-02

The Preflight Check That Runs Before Every Sweep

Before sweep() submits a single task, it runs two checks against your parameter space and function. Here is what it looks for, and why five targeted boundary probes catch bugs that random local tests miss.

You know you should validate before a large cloud run. In practice, the validation step is the one that gets skipped when the deadline is close and the function looks like it works on the three configurations you tested by hand.

sweep() now runs that check for you.

What fires before your function runs

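The first check is static. Before executing anything, the SDK reads your parameter dictionary and flags structural patterns that commonly cause runtime failures, such as a range whose minimum is 0, which will raise if that parameter is ever used as a divisor or passed to log().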

These warnings appear in the terminal before a single probe runs. No execution required.
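A minimal sketch of what a static pass of this kind might look like. This is not the SDK's implementation: `find_static_warnings` is a hypothetical helper, and the only rule it encodes is the min=0 case that shows up in the example output later in this post.

```python
def find_static_warnings(params: dict) -> list[str]:
    """Hypothetical static pass: inspect the parameter dict without running anything."""
    warnings = []
    for name, spec in params.items():
        # Only numeric ranges are inspected in this sketch.
        if spec.get("type") != "range":
            continue
        if spec["min"] == 0:
            warnings.append(
                f"'{name}' has min=0: division by '{name}' or log('{name}') "
                "will raise at the minimum boundary"
            )
    return warnings


params = {
    "n_assets":      {"type": "range", "min": 5,   "max": 50},
    "target_return": {"type": "range", "min": 0.0, "max": 0.05},
}
for message in find_static_warnings(params):
    print(message)
```

The point of the sketch is that nothing here calls your function: the check only reads the dictionary, which is why it can run before any probe.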

Five probes that reach the corners

The second check actually runs your function, but against five specific parameter sets rather than a random sample:

  1. all-min — every dimension at its minimum value
  2. all-max — every dimension at its maximum value
  3. midpoint — every dimension at the midpoint of its range
  4. cross (lo→hi) — first half of dimensions at min, second half at max
  5. cross (hi→lo) — first half at max, second half at min

The all-min and all-max probes catch the obvious edge cases. The cross-probes are where boundary probing earns its place. Many bugs only appear when two parameters sit at opposite extremes at the same time, and a small random local sample is unlikely to draw that combination. The cross-probes guarantee it for every pair of parameters that falls on opposite sides of the split.
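The five probe points can be built from the min/max ranges alone. The sketch below is one way to do it, not the SDK's code: `boundary_probes` is a hypothetical helper, and how an odd number of dimensions is split is an assumption.

```python
def boundary_probes(params: dict) -> dict:
    """Hypothetical construction of the five probe points from min/max ranges."""
    names = list(params)
    # Assumption: with an odd count, the extra dimension goes to the second half.
    half = len(names) // 2
    first, second = names[:half], names[half:]

    lo = {n: params[n]["min"] for n in names}
    hi = {n: params[n]["max"] for n in names}
    mid = {n: (params[n]["min"] + params[n]["max"]) / 2 for n in names}

    return {
        "all-min": lo,
        "all-max": hi,
        "midpoint": mid,
        "cross (lo→hi)": {**{n: lo[n] for n in first}, **{n: hi[n] for n in second}},
        "cross (hi→lo)": {**{n: hi[n] for n in first}, **{n: lo[n] for n in second}},
    }


params = {
    "n_assets":      {"min": 5,   "max": 50},
    "target_return": {"min": 0.0, "max": 0.05},
    "lookback_days": {"min": 20,  "max": 252},
    "risk_aversion": {"min": 0.5, "max": 10.0},
}
probes = boundary_probes(params)
print(probes["cross (hi→lo)"])
```

Note that the midpoint here is a plain float average, so integer dimensions would need rounding before being passed to a function that expects an int. The sketch does not attempt that.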

A worked example: portfolio risk estimation

Consider a parameter sweep over a mean-variance portfolio optimizer. The function generates synthetic returns, builds a covariance matrix, and inverts it to find optimal weights:

Python
import numpy as np


def estimate_portfolio_risk(
    n_assets: int,
    target_return: float,
    lookback_days: int,
    risk_aversion: float,
) -> dict:
    rng = np.random.default_rng(42)
    returns = rng.standard_normal((lookback_days, n_assets)) * 0.01

    cov = np.cov(returns.T)
    means = returns.mean(axis=0)

    inv_cov = np.linalg.inv(cov)  # singular when lookback_days <= n_assets
    weights = inv_cov @ (means - target_return) / risk_aversion
    weights /= weights.sum()

    return {"portfolio_volatility": float((weights @ cov @ weights) ** 0.5)}

The parameter space spans a wide range of asset counts and history lengths:

Python
from combinate import run_preflight

params = {
    "n_assets":      {"type": "range", "min": 5,   "max": 50},
    "target_return": {"type": "range", "min": 0.0,  "max": 0.05},
    "lookback_days": {"type": "range", "min": 20,  "max": 252},
    "risk_aversion": {"type": "range", "min": 0.5, "max": 10.0},
}
spec = {"method": "random", "sampler": "sobol", "samples": 500}

report = run_preflight(estimate_portfolio_risk, params, "random", spec)

The terminal output:

Output
  Combinate  pre-sweep analysis
  Method:               random · sobol
  Tasks (estimated):    500
  Parameter dims:       4

  ⚠  'target_return' has min=0 — division by 'target_return' or log('target_return') will raise at the minimum boundary

  Running boundary probes...

  ✓  all-min                 1ms       {portfolio_volatility: 0.0071}
  ✓  all-max                 2ms       {portfolio_volatility: 0.0009}
  ✓  midpoint                1ms       {portfolio_volatility: 0.0031}
  ✓  cross (lo→hi)           1ms       {portfolio_volatility: 0.0082}
  ✗  cross (hi→lo)           LinAlgError: Singular matrix
       n_assets=50, target_return=0.05, lookback_days=20, risk_aversion=0.5

  ⚠  1/5 boundary probes failed (~20% boundary failure rate)
       Errors that appear at boundaries often affect a proportional share of cloud tasks,
       leaving correlated gaps in your results. Recommend fixing before submitting.

Two findings before a single cloud task was submitted:

  1. Static: target_return has min=0, which the SDK flags without running anything — if target_return ever ends up in a denominator or a log(), that's a crash at the boundary.

  2. Probe failure: the cross (hi→lo) probe set n_assets=50 (max) and lookback_days=20 (min) at the same time. With 50 assets and only 20 days of return history, the sample covariance matrix has rank at most 19, so the 50×50 matrix is singular and np.linalg.inv raises LinAlgError. This combination is unlikely to appear in a small hand-tested sample, but the region where lookback_days <= n_assets is a real part of the space the 500 tasks are drawn from.
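The rank deficiency can be confirmed locally without inverting anything. The sketch below rebuilds the covariance matrix from the failing probe's parameters and checks its rank directly, rather than relying on the inversion raising:

```python
import numpy as np

n_assets, lookback_days = 50, 20  # the failing cross (hi→lo) probe

rng = np.random.default_rng(42)
returns = rng.standard_normal((lookback_days, n_assets)) * 0.01
cov = np.cov(returns.T)  # shape (n_assets, n_assets)

# Mean-centering removes one degree of freedom, so in exact arithmetic
# the rank is at most lookback_days - 1, well short of n_assets.
rank = np.linalg.matrix_rank(cov)
print(cov.shape, rank)
```

A rank below n_assets means the matrix has no inverse, whatever the inversion routine happens to do with it.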

The fix is one guard at the top of the function:

Python
def estimate_portfolio_risk(
    n_assets: int,
    target_return: float,
    lookback_days: int,
    risk_aversion: float,
) -> dict:
    if lookback_days <= n_assets:
        return {
            "portfolio_volatility": float("nan"),
            "valid": False,
            "reason": "underdetermined_covariance",
        }

    rng = np.random.default_rng(42)
    returns = rng.standard_normal((lookback_days, n_assets)) * 0.01
    cov = np.cov(returns.T)
    means = returns.mean(axis=0)
    inv_cov = np.linalg.inv(cov)
    weights = inv_cov @ (means - target_return) / risk_aversion
    weights /= weights.sum()

    return {
        "portfolio_volatility": float((weights @ cov @ weights) ** 0.5),
        "valid": True,
    }

Run preflight again and all five probes pass. Then scale to the full 500-task cloud sweep.
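With the guard in place, some share of the sweep will return valid: False instead of crashing. A back-of-envelope estimate of that share follows, assuming n_assets and lookback_days are each sampled uniformly and continuously over their ranges. That is an assumption: how the sampler treats integer parameters is not covered here.

```python
n_min, n_max = 5, 50      # n_assets range
d_min, d_max = 20, 252    # lookback_days range

# The guard fires where lookback_days <= n_assets. Inside the sampling
# rectangle that is the triangle d_min <= lookback_days <= n_assets <= n_max.
side = n_max - d_min
invalid_area = side * side / 2
total_area = (n_max - n_min) * (d_max - d_min)

fraction = invalid_area / total_area
print(f"{fraction:.1%} of tasks, about {fraction * 500:.0f} of 500")
```

Under those assumptions the guarded region is a few percent of the space, so expect a small but non-trivial number of NaN rows to filter on the valid flag when you aggregate results.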

Accessing the report programmatically

run_preflight returns a PreflightReport with everything the terminal printed, structured for code:

Python
print(report.task_count)           # estimated number of tasks
print(report.static_warnings)      # list of warning strings from static analysis
print(report.probe_failures)       # list of ProbeResult where error is not None
print(report.median_task_s)        # median probe wall time in seconds
print(report.estimated_wall_s)     # task_count / parallelism * median_task_s
print(report.estimated_cost_usd)   # planning estimate

Each ProbeResult in report.probe_results has label, params, elapsed_s, output, and error. The params field is the exact dict that was passed to your function, so you can reproduce any failing probe directly.
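Reproducing a failing probe then comes down to calling your function with the recorded params. The sketch below uses a stand-in `ProbeResult` dataclass with the fields listed above, not the SDK's own class, and assumes the params dict is passed as keyword arguments. `reproduce` and `fragile` are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ProbeResult:
    """Stand-in with the fields the post lists; not the SDK's class."""
    label: str
    params: dict
    elapsed_s: float
    output: Any = None
    error: Optional[str] = None


def reproduce(fn: Callable[..., Any], probe: ProbeResult) -> Any:
    # Assumption: the SDK passes params to the function as keyword arguments.
    return fn(**probe.params)


def fragile(n_assets: int, lookback_days: int) -> float:
    """Toy function that fails in the same corner as the worked example."""
    if lookback_days <= n_assets:
        raise ValueError("underdetermined covariance")
    return lookback_days / n_assets


failing = ProbeResult(
    label="cross (hi→lo)",
    params={"n_assets": 50, "lookback_days": 20},
    elapsed_s=0.001,
    error="ValueError: underdetermined covariance",
)

try:
    reproduce(fragile, failing)
except ValueError as exc:
    print(f"{failing.label} reproduced: {exc}")
```

Running the failing call in a debugger or notebook this way gives you the full traceback locally, with no cloud submission involved.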

It runs automatically

sweep() calls run_preflight() before every submission by default. The analysis prints to stderr and does not block or modify the sweep. To skip it:

Python
result = sweep(fn, params=params, config=config, preflight=False)

To run it manually and gate on the result before submitting:

Python
import combinate as cb

report = cb.run_preflight(fn, params, "random", spec)
if report.probe_failures:
    print("fix these before submitting")
else:
    result = cb.sweep(fn, params=params, sampling_spec=spec, config=config, preflight=False)

The two-step path is useful when the preflight output is going into a log or notebook where you want to inspect it separately from the sweep run.
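The same gate can be turned into an exit code for a CI job. This is a sketch under assumptions: `preflight_exit_code` and its `fail_on_warnings` flag are hypothetical, and the report is a stand-in object carrying only the probe_failures and static_warnings attributes named earlier.

```python
from types import SimpleNamespace


def preflight_exit_code(report, fail_on_warnings: bool = False) -> int:
    """Hypothetical gate: 0 if the sweep may proceed, 1 otherwise."""
    if report.probe_failures:
        return 1
    if fail_on_warnings and report.static_warnings:
        return 1
    return 0


# Stand-in for a PreflightReport, using only attribute names from the post.
report = SimpleNamespace(
    probe_failures=[],
    static_warnings=["'target_return' has min=0"],
)

code = preflight_exit_code(report)
print(f"preflight exit code: {code}")
# In a CI script you might finish with: sys.exit(code)
```

Treating static warnings as non-fatal by default mirrors the worked example, where the min=0 warning was informational and only the probe failure needed a fix.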
