Parameter Sweeps for Growth Model Tuning

Most growth and affiliate teams working on credit card products have a Python model somewhere that scores offer configurations. It takes APR, credit limit, rewards rate, risk threshold. It returns projected LTV or approval rate or both. Someone built it, it lives in a repo, and the people who run it mostly run it against two or three configurations at a time because that's what's tractable locally.

The data side is usually in decent shape — Meta, Google, TikTok APIs into a warehouse, reasonable tooling for trend analysis. The bottleneck is rarely the data. It's that the model itself only gets interrogated at a handful of operating points, because running 500 combinations in a notebook means either waiting or writing job orchestration that nobody wants to write.

Here's what that scoring function typically looks like, using analytical approximations to stay self-contained (in production it would pull from pre-computed warehouse metrics):

● Python
import combinate as cb

def score_offer(
    credit_limit: float,
    apr: float,
    rewards_rate: float,
    risk_threshold: float,
) -> dict:
    approval_rate = max(0.0, 1.0 - risk_threshold * 0.4)
    avg_monthly_balance = credit_limit * 0.28
    annual_interest_revenue = avg_monthly_balance * 12 * (apr / 100) * 0.55
    annual_rewards_cost = avg_monthly_balance * 12 * rewards_rate
    net_annual_revenue = (annual_interest_revenue - annual_rewards_cost) * approval_rate
    predicted_ltv = net_annual_revenue * 3.2  # ~3.2-year average tenure
    return {
        "predicted_ltv": round(predicted_ltv, 2),
        "approval_rate": round(approval_rate, 3),
        "net_annual_revenue": round(net_annual_revenue, 2),
    }

result = cb.sweep(
    score_offer,
    params={
        "credit_limit": {"type": "range", "min": 2000, "max": 20000},
        "apr": [15.99, 19.99, 24.99, 29.99],
        "rewards_rate": {"type": "range", "min": 0.01, "max": 0.025},
        "risk_threshold": {"type": "range", "min": 0.25, "max": 0.65},
    },
    sampling_spec={
        "method": "random",
        "sampler": "sobol",
        "samples": 500,
        "seed": 42,
    },
)

print(result.describe())

500 Sobol-sampled points run in parallel. When they come back, you have the full input/output table:

● Python
import pandas as pd

rows = [
    {**task.parameter_values, **task.inline_output}
    for task in result.succeeded_tasks
]
df = pd.DataFrame(rows)

# Which APR tier produces the best median predicted LTV?
print(df.groupby("apr")["predicted_ltv"].median().sort_values(ascending=False))

# Top combinations by predicted LTV
print(
    df.nlargest(10, "predicted_ltv")[
        ["credit_limit", "apr", "rewards_rate", "risk_threshold", "predicted_ltv"]
    ]
)

The DSP side of this looks nearly identical. If your paid social pod has aggregated performance data from a recent flight, the model is a function that takes bid floor, frequency cap, audience multiplier:

● Python
def model_bid_performance(
    cpm_floor: float,
    frequency_cap: int,
    audience_multiplier: float,
) -> dict:
    # Baseline metrics pre-loaded from your data warehouse
    base_ctr = 0.0042
    base_cvr = 0.031
    base_reach = 1_200_000

    effective_reach = base_reach * audience_multiplier * (1.0 - frequency_cap * 0.04)
    predicted_conversions = effective_reach * base_ctr * base_cvr
    spend = (cpm_floor / 1000) * effective_reach
    estimated_cpa = spend / max(predicted_conversions, 1)
    return {
        "estimated_cpa": round(estimated_cpa, 2),
        "predicted_conversions": round(predicted_conversions),
        "effective_reach": round(effective_reach),
    }

result = cb.sweep(
    model_bid_performance,
    params={
        "cpm_floor": {"type": "range", "min": 2.0, "max": 18.0},
        "frequency_cap": [3, 5, 7, 10],
        "audience_multiplier": {"type": "range", "min": 0.6, "max": 1.4},
    },
    sampling_spec={
        "method": "random",
        "sampler": "halton",
        "samples": 300,
        "seed": 7,
    },
)

A nested loop over those three parameters gives you 36 combinations (3 × 4 × 3) if you pick three values per continuous input. That's fine until you want coverage across a real range, or you want to hand it off to a coding agent to run overnight, or you want to run the same model against two different baseline datasets and compare.
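To make the coverage point concrete, here is a minimal sketch comparing that 36-point nested-loop grid with 36 points from a Halton sequence over the same ranges. The `radical_inverse` and `scale` helpers and the three grid values per axis are illustrative assumptions written from the textbook definition of the sequence; this is not Combinate's sampler, whose internals aren't covered here.

```python
from itertools import product

def radical_inverse(i: int, base: int) -> float:
    """Van der Corput radical inverse of i in `base`; returns a value in [0, 1)."""
    value, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        value += digit / denom
    return value

def scale(u: float, lo: float, hi: float) -> float:
    """Map a unit-interval value onto [lo, hi)."""
    return lo + u * (hi - lo)

FREQUENCY_CAPS = [3, 5, 7, 10]

# Nested-loop grid: three hand-picked values per continuous input (illustrative).
grid = list(product([2.0, 10.0, 18.0], FREQUENCY_CAPS, [0.6, 1.0, 1.4]))

# Halton points: one prime base per dimension. The base-5 dimension picks a
# frequency cap by index. Start at i=1 because i=0 maps to the origin.
halton = [
    (
        scale(radical_inverse(i, 2), 2.0, 18.0),
        FREQUENCY_CAPS[int(radical_inverse(i, 5) * len(FREQUENCY_CAPS))],
        scale(radical_inverse(i, 3), 0.6, 1.4),
    )
    for i in range(1, len(grid) + 1)
]

print(len(grid), "grid points,", len({p[0] for p in grid}), "distinct cpm_floor values")
print(len(halton), "Halton points,", len({p[0] for p in halton}), "distinct cpm_floor values")
```

Same budget of 36 evaluations, but the grid only ever probes three bid floors while the Halton set probes 36 different ones. That gap is what widens as you add parameters.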

One thing worth being clear about: Combinate runs the function. It does not pull from Meta or Google or your warehouse — that's your problem and you've already solved it. The way it works in practice is you pre-compute whatever baselines your function needs, load them at startup (as module-level constants or from a file the worker can reach), and sweep from there. The function signature is the interface. What it does internally is up to you.
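A rough sketch of that pattern, using only the standard library: baselines live in a JSON file, get loaded once at module import, and the swept function just reads them. The file name `flight_baselines.json` and its layout are hypothetical, and the block writes the file itself only so it runs standalone; in practice an upstream warehouse job would produce it somewhere the worker can read.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical file an upstream warehouse job would write; created here only
# so the sketch runs on its own. Values match the constants used earlier.
BASELINE_PATH = Path(tempfile.gettempdir()) / "flight_baselines.json"
BASELINE_PATH.write_text(
    json.dumps({"base_ctr": 0.0042, "base_cvr": 0.031, "base_reach": 1_200_000})
)

# Loaded once at import time, so each call only does arithmetic.
BASELINES = json.loads(BASELINE_PATH.read_text())

def model_bid_performance(
    cpm_floor: float,
    frequency_cap: int,
    audience_multiplier: float,
) -> dict:
    effective_reach = (
        BASELINES["base_reach"] * audience_multiplier * (1.0 - frequency_cap * 0.04)
    )
    predicted_conversions = (
        effective_reach * BASELINES["base_ctr"] * BASELINES["base_cvr"]
    )
    spend = (cpm_floor / 1000) * effective_reach
    return {
        "estimated_cpa": round(spend / max(predicted_conversions, 1), 2),
        "predicted_conversions": round(predicted_conversions),
        "effective_reach": round(effective_reach),
    }

# Single-point sanity check before handing the function to a sweep.
print(model_bid_performance(cpm_floor=10.0, frequency_cap=5, audience_multiplier=1.0))
```

Swapping in a different flight's baselines then means pointing `BASELINE_PATH` at a different file; the swept signature doesn't change.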

What you start building once the execution bottleneck is gone

The first sweep you run is usually a direct replacement for the nested loop you were already doing. But once you're getting 500-point coverage of a parameter space in roughly the same time it used to take to run 12 combinations, the model itself tends to evolve.

One thing that shows up pretty quickly: most of the interesting questions in offer and campaign design are two-objective problems, not one. Maximizing predicted LTV and maximizing approval rate pull in different directions: a higher risk threshold lowers credit risk, which helps per-account value, but it also shrinks the approved population. (The toy scorer above has no credit-loss term, so it understates that tension; a production model would include one.) When you can actually see 500 samples across that space, you can extract the Pareto frontier: the set of configurations where you genuinely can't improve one objective without giving something up on the other.

● Python
import pandas as pd

# `result` here is the offer-scoring sweep from the first example, not the bid sweep.
rows = [
    {**task.parameter_values, **task.inline_output}
    for task in result.succeeded_tasks
]
df = pd.DataFrame(rows)

def pareto_front(df, obj1, obj2):
    """Return rows not dominated by any other row (both objectives maximized)."""
    vals = df[[obj1, obj2]].to_numpy()
    keep = []
    for a1, a2 in vals:
        # Dominated: some other row is at least as good on both objectives
        # and strictly better on at least one.
        dominated = (
            (vals[:, 0] >= a1) & (vals[:, 1] >= a2)
            & ((vals[:, 0] > a1) | (vals[:, 1] > a2))
        ).any()
        keep.append(not dominated)
    # Boolean mask is positional, so this also works on a non-default index.
    return df.loc[keep].sort_values(obj1)

frontier = pareto_front(df, "predicted_ltv", "approval_rate")
print(frontier[["credit_limit", "apr", "rewards_rate", "risk_threshold", "predicted_ltv", "approval_rate"]])

That frontier is a portfolio decision, not an optimization answer. A product manager picking between, say, a 24.99% APR card that clears a 68% approval rate and a 19.99% card that clears 54% has real information to work with. Without the sweep, you'd have two data points and a gut feeling.
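If "not dominated" still feels abstract, a five-row toy case makes it easy to check by eye. The configurations below are made up for illustration and do not come from any sweep; the `dominates` helper is a plain-Python restatement of the dominance rule for two maximized objectives.

```python
# Made-up (predicted_ltv, approval_rate) pairs, for illustration only.
configs = {
    "A": (900.0, 0.60),
    "B": (800.0, 0.70),
    "C": (700.0, 0.65),   # dominated by B: lower on both objectives
    "D": (600.0, 0.80),
    "E": (600.0, 0.75),   # dominated by D: same LTV, lower approval
}

def dominates(p, q):
    """True if p is at least as good as q on both objectives and better on one."""
    return p[0] >= q[0] and p[1] >= q[1] and (p[0] > q[0] or p[1] > q[1])

frontier = sorted(
    name
    for name, point in configs.items()
    if not any(dominates(other, point) for other in configs.values())
)
print(frontier)  # ['A', 'B', 'D']
```

A, B, and D survive because moving between them always trades LTV for approval rate. C and E drop out because another row beats or matches them on both counts.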

The other direction that gets interesting: running the same sweep against different population baselines. If your underwriting team segments applicants into prime and near-prime pools with meaningfully different balance-to-limit ratios and tenure distributions, you can parameterize those baselines and sweep the same offer model twice:

● Python
# Run once per segment, compare frontiers
for segment_name, baselines in [
    ("prime", {"avg_btl": 0.22, "avg_tenure_years": 4.1}),
    ("near_prime", {"avg_btl": 0.41, "avg_tenure_years": 2.6}),
]:
    def score_offer_segment(
        credit_limit: float,
        apr: float,
        rewards_rate: float,
        risk_threshold: float,
        avg_btl: float = baselines["avg_btl"],
        avg_tenure_years: float = baselines["avg_tenure_years"],
    ) -> dict:
        approval_rate = max(0.0, 1.0 - risk_threshold * 0.4)
        avg_monthly_balance = credit_limit * avg_btl
        annual_interest_revenue = avg_monthly_balance * 12 * (apr / 100) * 0.55
        annual_rewards_cost = avg_monthly_balance * 12 * rewards_rate
        net_annual_revenue = (annual_interest_revenue - annual_rewards_cost) * approval_rate
        return {
            "segment": segment_name,
            "predicted_ltv": round(net_annual_revenue * avg_tenure_years, 2),
            "approval_rate": round(approval_rate, 3),
        }

    seg_result = cb.sweep(
        score_offer_segment,
        params={
            "credit_limit": {"type": "range", "min": 2000, "max": 20000},
            "apr": [15.99, 19.99, 24.99, 29.99],
            "rewards_rate": {"type": "range", "min": 0.01, "max": 0.025},
            "risk_threshold": {"type": "range", "min": 0.25, "max": 0.65},
        },
        sampling_spec={"method": "random", "sampler": "sobol", "samples": 500, "seed": 42},
    )
    # collect and compare frontiers across segments...

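In this simplified sketch, approval rate depends only on the risk threshold, so both segments share the same approval curve and the frontiers differ only in LTV: near-prime's higher balance utilization raises revenue per account while its shorter tenure pulls lifetime value back down. A production model would give each segment its own approval and loss assumptions, and the near-prime frontier would typically sit lower on LTV once losses are counted. Whether that tradeoff makes sense for a given product depends on your cost of acquisition and your servicing model, but at least you're looking at the actual surface, not extrapolating from a few manual test configs.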
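One way to keep the swept signature at exactly four parameters while still varying the baselines is a small factory that closes over them. This is a sketch under assumptions: `make_segment_scorer` is a hypothetical helper, the formulas are copied from the segment loop above, and whether Combinate can ship a closure to its workers is something to confirm against its documentation.

```python
def make_segment_scorer(segment_name: str, avg_btl: float, avg_tenure_years: float):
    """Return a scorer with the four-parameter offer signature for one segment."""
    def score_offer(
        credit_limit: float,
        apr: float,
        rewards_rate: float,
        risk_threshold: float,
    ) -> dict:
        approval_rate = max(0.0, 1.0 - risk_threshold * 0.4)
        avg_monthly_balance = credit_limit * avg_btl
        annual_interest_revenue = avg_monthly_balance * 12 * (apr / 100) * 0.55
        annual_rewards_cost = avg_monthly_balance * 12 * rewards_rate
        net_annual_revenue = (annual_interest_revenue - annual_rewards_cost) * approval_rate
        return {
            "segment": segment_name,
            "predicted_ltv": round(net_annual_revenue * avg_tenure_years, 2),
            "approval_rate": round(approval_rate, 3),
        }
    return score_offer

SEGMENTS = {
    "prime": {"avg_btl": 0.22, "avg_tenure_years": 4.1},
    "near_prime": {"avg_btl": 0.41, "avg_tenure_years": 2.6},
}
scorers = {name: make_segment_scorer(name, **b) for name, b in SEGMENTS.items()}

# One fixed offer (illustrative values inside the swept ranges), scored per segment.
offer = dict(credit_limit=10_000, apr=19.99, rewards_rate=0.015, risk_threshold=0.4)
for name, scorer in scorers.items():
    print(scorer(**offer))
```

Scoring a single fixed offer like this is a cheap check on what the formulas imply before reading a frontier. Here LTV scales with `avg_btl * avg_tenure_years`, so near-prime comes out roughly 18% above prime (0.41 × 2.6 versus 0.22 × 4.1) at the same approval rate, which is a property of a sketch with no loss term.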

None of this requires changing how the model is written. The function signature stays the same. You're just asking more of it.