Jacobian diagnostics#

When JacobianAutograd and JacobianNumerical disagree on a model, the next question is why. This notebook is a runnable companion to Differentiability and Jacobian Tools. It walks through three progressively trickier models, shows how compare_jacobian_dicts localises the disagreement to specific parameters, and closes with a decision tree for reading a report.

LINEAR2D: the test model with an analytical Jacobian and a zero parameter. Demonstrates parameter_space="direct".
GL06SIMEX: a small SFC model whose autograd and numerical Jacobians agree cleanly under the default log-space setting.
GL06INSOUT: the production Tobin portfolio model, which uses a step-function torch.where inside portfolio_switches. The diagnostic shows up as a subset of parameters with visibly worse agreement than the others, and testing different values of epsilon in the numerical differentiation confirms that the disagreement is caused by the step function.

import sys, os
sys.path.insert(0, os.path.abspath('.'))

import macrostat
import torch

from macrostat.models import get_model
from macrostat.diff import (
    JacobianAutograd,
    JacobianNumerical,
    compare_jacobian_dicts,
)
from _doc_helpers import (
    mean_nominal_output,
    mean_national_income,
    final_state_norm,
)

print('macrostat version:', macrostat.__version__)
print('torch version:    ', torch.__version__)

macrostat version: 0.5.7.post1.dev7+g35c4d7655.d20260409
torch version:     2.11.0+cu130

LINEAR2D: direct-space differences#

LINEAR2D is a 2x2 linear map \(x_{t+1} = A x_t\) with six parameters: four matrix entries and two initial conditions. The initial condition x0_2 defaults to zero, so log-space perturbation is undefined for it. We therefore set parameter_space="direct" with a small epsilon=1e-5, which is the standard central-difference setup. With an analytical Jacobian available on this model, we expect low but non-zero relative error dominated by truncation of the finite-difference formula.

model = get_model('LINEAR2D')()

jac_auto = JacobianAutograd(model).compute(
    loss_fn=final_state_norm, mode='rev',
)
jac_num = JacobianNumerical(
    model, parameter_space='direct', epsilon=1e-5,
).compute(loss_fn=final_state_norm, mode='central', num_workers=1)

report = compare_jacobian_dicts(
    jac_auto, jac_num,
    method_a='autograd', method_b='numerical (direct, eps=1e-5)',
    rtol=1e-3, atol=1e-6,
)
print(report.summary())

Jacobian comparison: autograd vs numerical (direct, eps=1e-5)
Parameters compared: 6
Overall max abs diff: 5.045e-03  max rel diff: 5.045e-03

Parameter                                   max_abs   mean_abs    max_rel   mean_rel        close nan/inf
---------------------------------------------------------------------------------------------------------
x0_2                                      5.045e-03  5.045e-03  5.045e-03  5.045e-03          0/1      no
x0_1                                      4.689e-03  4.689e-03  4.689e-03  4.689e-03          0/1      no
a12                                       3.227e-04  3.227e-04  2.999e-04  2.999e-04          1/1      no
a22                                       1.304e-04  1.304e-04  1.304e-04  1.304e-04          1/1      no
a11                                       2.871e-04  2.871e-04  9.710e-05  9.710e-05          1/1      no
a21                                       1.221e-04  1.221e-04  9.525e-05  9.525e-05          1/1      no

All six parameters land in the ballpark of 1e-4 to 1e-3 relative error. The two initial conditions x0_1 and x0_2 sit at the top of the list because they enter linearly and so pick up the full truncation error of the central-difference formula; the matrix entries roll into the propagation and average down. There is no outlier. This is the reference shape of a clean report.

GL06SIMEX: log-space on a smooth SFC model#

GL06SIMEX is the smallest SFC model in the suite: three free parameters (TaxRate, PropensityToConsumeIncome, PropensityToConsumeSavings) and no portfolio switch. All behaviour is smooth, so log-space perturbation with the default epsilon=1e-3 should agree with autograd to roughly 1e-4 relative error.

model = get_model('GL06SIMEX')()

jac_auto = JacobianAutograd(model).compute(
    loss_fn=mean_national_income, mode='rev',
)
jac_num = JacobianNumerical(model).compute(
    loss_fn=mean_national_income, mode='central', num_workers=1,
)

report = compare_jacobian_dicts(
    jac_auto, jac_num,
    method_a='autograd', method_b='numerical (log, eps=1e-3)',
    rtol=1e-3, atol=1e-6,
)
print(report.summary())

Jacobian comparison: autograd vs numerical (log, eps=1e-3)
Parameters compared: 3
Overall max abs diff: 2.362e-02  max rel diff: 4.386e-04

Parameter                                   max_abs   mean_abs    max_rel   mean_rel        close nan/inf
---------------------------------------------------------------------------------------------------------
PropensityToConsumeIncome                 4.344e-03  4.344e-03  4.386e-04  4.386e-04          1/1      no
PropensityToConsumeSavings                2.014e-03  2.014e-03  2.034e-04  2.034e-04          1/1      no
TaxRate                                   2.362e-02  2.362e-02  5.301e-05  5.301e-05          1/1      no

All three parameters agree to better than 5e-4 relative error. The max absolute differences look larger (2e-2 on TaxRate) but that reflects the scale of the Jacobian entry, not a disagreement: once normalised by the magnitude, the report is clean. This is the pattern to expect on a smooth model: relative error sits near the log-space truncation floor.

GL06INSOUT: localising a step-function disagreement#

The Tobin portfolio model behaves differently. Some parameters agree well and others show visibly worse agreement. The goal here is not to fix the disagreement but to use compare_jacobian_dicts to identify which parameters are affected, and then to confirm, by testing different values of epsilon in the numerical differentiation, that the cause is a step-function operation in the model code rather than a smooth numerical artefact.

model = get_model('GL06INSOUT')()

sample_params = [
    'WealthShareM2_Constant',
    'WealthShareBills_Constant',
    'WealthShareBonds_Constant',
    'TaxRate',
    'InventorySalesRatioBaseline',
]

jac_auto = JacobianAutograd(model).compute(
    loss_fn=mean_nominal_output, mode='rev',
)
jac_num = JacobianNumerical(model).compute(
    loss_fn=mean_nominal_output, mode='central', num_workers=1,
    param_names=sample_params,
)

jac_auto_subset = {k: jac_auto[k] for k in sample_params}

report = compare_jacobian_dicts(
    jac_auto_subset, jac_num,
    method_a='autograd', method_b='numerical (log, eps=1e-3)',
    rtol=1e-3, atol=1e-6,
)
print(report.summary())

Jacobian comparison: autograd vs numerical (log, eps=1e-3)
Parameters compared: 5
Overall max abs diff: 1.899e+00  max rel diff: 2.077e-01

Parameter                                   max_abs   mean_abs    max_rel   mean_rel        close nan/inf
---------------------------------------------------------------------------------------------------------
WealthShareBills_Constant                 1.899e+00  1.899e+00  2.077e-01  2.077e-01          0/1      no
WealthShareM2_Constant                    5.841e-02  5.841e-02  5.841e-02  5.841e-02          0/1      no
WealthShareBonds_Constant                 1.534e-01  1.534e-01  7.729e-03  7.729e-03          0/1      no
InventorySalesRatioBaseline               9.693e-02  9.693e-02  3.941e-03  3.941e-03          0/1      no
TaxRate                                   3.356e-01  3.356e-01  1.868e-04  1.868e-04          1/1      no

The report is sorted by descending relative disagreement. TaxRate and InventorySalesRatioBaseline sit at the bottom with relative errors in the 1e-4 to 1e-3 range, consistent with the clean-model baseline from GL06SIMEX. The three Tobin constants are two to three orders of magnitude worse: WealthShareBonds_Constant at 8e-3, WealthShareM2_Constant at 6e-2, and WealthShareBills_Constant at 2e-1.

The disagreement is concentrated on the three Tobin portfolio constants and not on the unrelated parameters. This tells us the affected parameters share a structural role in the model, and that role must involve a non-smooth operation. In GL06INSOUT, the portfolio_switches method uses a step-function torch.where to select between the M1- and M2-dominated regimes of the Tobin allocation. Parameters that shift the baseline portfolio composition can push the economy across that switch for some timesteps. When a central-difference perturbation of size epsilon straddles the discontinuity, the numerical Jacobian sees a finite jump while autograd reports zero gradient at the switch.

Testing different values of epsilon in the numerical differentiation#

Testing different values of epsilon in the numerical differentiation will reveal whether the disagreement comes from a step-function operation or from a smooth numerical artefact. A step-function operation produces a non-monotone relationship between epsilon and the relative disagreement: at large epsilon the perturbation spans a region much wider than the discontinuity and the numerical derivative is dominated by whichever side of the switch each perturbed trajectory lands on, producing a large disagreement with autograd; at very small epsilon roundoff and cancellation in the central-difference formula erode the agreement from the other direction; and between these two regimes sits a value where the disagreement is smallest. A smooth numerical artefact, by contrast, produces a monotone curve.

The following cell computes the numerical Jacobian at three values of epsilon on the parameter with the largest disagreement in the previous report, WealthShareBills_Constant.

worst = 'WealthShareBills_Constant'
eps_values = [1e-2, 1e-3, 1e-5]
rel_diffs = {}

for eps in eps_values:
    jn = JacobianNumerical(model, epsilon=eps).compute(
        loss_fn=mean_nominal_output, mode='central', num_workers=1,
        param_names=[worst],
    )
    a = jac_auto[worst]
    n = jn[worst]
    denom = torch.maximum(a.abs(), n.abs()).clamp(min=1e-12)
    rel_diffs[eps] = ((a - n).abs() / denom).max().item()

print(f'Relative disagreement on {worst} vs epsilon:')
for eps, r in rel_diffs.items():
    print(f'  eps={eps:>.0e}  max_rel_diff={r:.3e}')

Relative disagreement on WealthShareBills_Constant vs epsilon:
  eps=1e-02  max_rel_diff=1.186e+00
  eps=1e-03  max_rel_diff=2.077e-01
  eps=1e-05  max_rel_diff=2.503e-01

The three values show the expected non-monotone pattern. At eps=1e-2 the perturbation is large enough that the relative disagreement exceeds 100%, well above the other two values. At eps=1e-3 the disagreement is smallest, around 20%. At eps=1e-5 it increases again as floating-point cancellation erodes the finite-difference signal. A smooth numerical source of error would produce a monotone curve in epsilon; the non-monotone pattern observed here is consistent with a step-function operation, and matches the portfolio_switches diagnosis.

Decision tree: reading a `compare_jacobian_dicts` report#

When an autograd vs numerical comparison fails on a MacroStat model, use the following checklist before reaching for deeper debugging.

Every parameter at max_rel_diff below roughly 1e-3. Nothing is wrong. The report is clean.
One or more parameters with nan/inf. Autograd found a NaN or Inf in the gradient. Typical causes are divide-by-zero or log(0) inside the model behaviour; instrument with torch.autograd.set_detect_anomaly(True) and rerun.
A derived parameter (residual of a LinearConstraint) reports max_rel_diff of order unity, but autograd gives exactly zero. Expected. The constraint system absorbs all sensitivity into the free parameters of that group. See the constraint-system notes in Differentiability and Jacobian Tools and the worked example in Parameter constraints. If you are numerically perturbing the derived parameter directly, stop: that perturbation violates the constraint. Perturb the free parameters instead.
A small subset of parameters shows disagreement two or more orders of magnitude worse than the rest, and the worst offenders share a structural role in the model. Suspect a step-function operation. Recompute the numerical Jacobian at several values of epsilon on the worst offender as shown in the previous section; a non-monotone relationship between epsilon and the relative disagreement confirms the diagnosis. Locate the switch with a grep for torch.where, torch.clamp, or explicit if branches in the behaviour code, and replace it with one of the smooth alternatives provided by Behavior.diffwhere, Behavior.tanhmask, Behavior.diffmin, or Behavior.diffmax.
All parameters disagree uniformly, around 1e-2 or worse. Likely a bug in the numerical configuration rather than the model. Check whether parameter_space="log" is being used with zero parameters (should raise, but verify), whether epsilon is appropriate for the parameter scale, and whether the loss function itself is well-conditioned.

For the constraint system specifically, see Constraints and the constraints example notebook.