Differentiability and Jacobian Tools#
This page describes the macrostat.diff module, which provides utilities
for computing and validating Jacobians of MacroStat models using PyTorch.
Overview#
The differentiability tools are designed for two use cases:
Development-time checks that a model is differentiable and that gradients are sensible (no NaNs/Infs, forward vs reverse AD match, etc.).
Calibration and analysis that require Jacobians or gradient information with respect to model parameters.
The core pieces live in macrostat.diff:
macrostat.diff.JacobianBasemacrostat.diff.JacobianAutogradmacrostat.diff.JacobianNumericalmacrostat.diff.check_model_differentiability()macrostat.diff.compare_jacobian_dicts()
Quick start#
Given a MacroStat model instance (for example macrostat.models.GL06SIMEX
or the small linear test model macrostat.models.LINEAR2D), you can
define a scalar loss and compute gradients as follows:
import torch
from macrostat.models import get_model
from macrostat.diff import JacobianAutograd, JacobianNumerical
# Build a model
ModelClass = get_model("LINEAR2D")
model = ModelClass()
# Define a loss on the model outputs
target = torch.tensor([1.0, 0.0])
def loss_fn(output: dict[str, torch.Tensor]) -> torch.Tensor:
y = output["State"][-1] # final 2D state
return torch.nn.functional.mse_loss(y, target)
# Autograd-based gradients
jac_auto = JacobianAutograd(model)
grads_rev = jac_auto.compute(loss_fn=loss_fn, mode="rev")
# Numerical finite-difference gradients (default: log-space, eps=1e-3)
jac_num_log = JacobianNumerical(model)
grads_num_log = jac_num_log.compute(loss_fn=loss_fn, mode="central")
# Or in direct space, useful when any parameter equals zero
jac_num_direct = JacobianNumerical(
model, epsilon=1e-5, parameter_space="direct"
)
grads_num_direct = jac_num_direct.compute(loss_fn=loss_fn, mode="central")
Direct vs log-space perturbations#
JacobianNumerical supports two perturbation
schemes, controlled by the parameter_space argument.
Direct space parameter_space="direct" perturbs each parameter
additively:
This is the textbook central difference. It works for any parameter
value, including zero, but the absolute step size has no relation to
the parameter scale, so the same epsilon is too small for
parameters with magnitudes near unity and too large for parameters
near zero.
Log space parameter_space="log" (the default) perturbs each
parameter multiplicatively:
The perturbation is now a fixed fraction of the parameter value,
so the same epsilon (default 1e-3, roughly a 0.1% relative
step) is appropriate across parameters of very different magnitudes.
This is the recommended default for SFC and behavioral models, where
parameter scales differ by many orders of magnitude.
Choice of epsilon is a tradeoff: large values amplify
truncation error from the finite-difference formula, small values
amplify floating-point roundoff. For float32 in log space, values
between 1e-4 and 1e-2 typically work well; 1e-3 is the
package default.
Zero-parameter caveat. Log-space perturbation is undefined when a
parameter equals zero (you cannot take the log of zero, and the
denominator vanishes). JacobianNumerical
detects this case during compute and raises
ValueError: Cannot use log-space with zero parameters: [...].
For example, the LINEAR2D test model has
a parameter x0_2 that defaults to zero, so its tests must
explicitly pass parameter_space="direct". Negative parameters
are handled in log space using a sign-preserving transformation
sign(p) * exp(log(abs(p)) +/- eps).
High-level differentiability check#
For a more complete diagnostic, use
macrostat.diff.check_model_differentiability():
from macrostat.diff import check_model_differentiability
report = check_model_differentiability(
model=model,
loss_fn=loss_fn,
scenario=0,
rtol=1e-5,
atol=1e-8,
compare_forward_reverse=True,
compare_numerical=True,
numerical_mode="central",
epsilon=1e-3,
parameter_space="log",
)
print(report.summary())
The resulting macrostat.diff.DifferentiabilityReport contains:
nan_or_inf– whether any gradient contains NaNs or Infs.rel_err_fwd_rev– relative discrepancy between forward- and reverse-mode autograd gradients.rel_err_autodiff_num– relative discrepancy between autograd and numerical finite-difference gradients.
Rather than only returning a pass/fail flag, the report is intended to help
you interpret how accurate the gradients are (e.g. “accurate to about
1e-3 relative error” on a given model and loss).
Set parameter_space="direct" if your model has any parameter
equal to zero. The default is "log".
Per-parameter Jacobian comparison#
When the high-level check reports a disagreement, the next question
is which parameter is responsible.
compare_jacobian_dicts() performs an
element-wise, per-parameter comparison of two Jacobian dictionaries
and returns a JacobianComparisonReport with
the following fields per parameter:
max_abs_diff,mean_abs_diff– absolute differences across all elements of the parameter’s Jacobian slice.max_rel_diff,mean_rel_diff– the same, normalised bymax(|a|, |b|)clipped to one.num_close/num_elements– how many entries are within theatol/rtoltolerance.has_nan_inf– whether either Jacobian contains NaNs or Infs.
A typical workflow:
from macrostat.diff import (
JacobianAutograd, JacobianNumerical, compare_jacobian_dicts,
)
jac_auto = JacobianAutograd(model).compute(loss_fn=loss_fn, mode="rev")
jac_num = JacobianNumerical(model).compute(loss_fn=loss_fn, mode="central")
report = compare_jacobian_dicts(
jac_auto, jac_num, method_a="autograd", method_b="numerical"
)
print(report.summary())
The summary prints one row per parameter, sorted by descending relative disagreement, so the offending parameters are at the top. See Jacobian diagnostics for a runnable walkthrough that applies this workflow to LINEAR2D and several GL06 variants.
Expected disagreements#
Two classes of “disagreement” between autograd and numerical Jacobians are expected and not bugs.
Constraint-derived parameters. When a model declares a
LinearConstraint, the derived
(residual) parameter is no longer free: its value is fixed by the
adding-up identity. Autograd correctly reports a zero gradient for
the derived parameter and absorbs the sensitivity into the free
parameters of the same group. Numerical Jacobians computed by
perturbing the free parameters agree with autograd; numerical
Jacobians computed by perturbing the derived parameter directly
will disagree, because such a perturbation would violate the
constraint. See Constraints for the residual
parameterization and Parameter constraints with LinearConstraint for a worked
example.
Step-function operations. Operations like
torch.where(x > 0, a, b), torch.clamp, and any other
discontinuous switch are non-differentiable through the condition
argument. Autograd reports zero gradient at the discontinuity, while
finite differences see a finite jump whenever the perturbation
straddles it. The disagreement is real but local, and it disappears
when the step function is replaced by a smooth approximation. The
Behavior base class provides
diffwhere(),
tanhmask(),
diffmin(), and
diffmax() for this purpose.
Jacobian diagnostics shows how to localize this kind of
disagreement to a specific parameter, timestep, and output.
Test model: LINEAR2D#
To validate the implementation itself, MacroStat ships with a tiny linear
test model macrostat.models.LINEAR2D.LINEAR2D. Its dynamics are:
where \(x_t\) is a 2D state vector and \(A\) is a fully parameterised 2x2 matrix. This model has an analytical Jacobian, so it is used in the test suite to confirm that:
Forward and reverse autograd agree to high precision, and
Autograd gradients match finite-difference gradients up to a small relative error.
Because LINEAR2D has a parameter (x0_2) that defaults to zero,
log-space perturbation is undefined for it, and its differentiability
tests set parameter_space="direct" explicitly.