PyTorch API

The PyTorch layer subclasses torch.nn.Module, so it composes with other PyTorch modules and supports end-to-end training.

CvxpyLayer

class cvxpylayers.torch.CvxpyLayer

Bases: Module

A differentiable convex optimization layer for PyTorch.

This layer wraps a parametrized CVXPY problem, solving it in the forward pass and computing gradients via implicit differentiation in the backward pass.

Example

>>> import cvxpy as cp
>>> import torch
>>> from cvxpylayers.torch import CvxpyLayer
>>>
>>> # Define a simple QP
>>> x = cp.Variable(2)
>>> A = cp.Parameter((3, 2))
>>> b = cp.Parameter(3)
>>> problem = cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b)), [x >= 0])
>>>
>>> # Create the layer
>>> layer = CvxpyLayer(problem, parameters=[A, b], variables=[x])
>>>
>>> # Solve with gradients
>>> A_t = torch.randn(3, 2, requires_grad=True)
>>> b_t = torch.randn(3, requires_grad=True)
>>> (solution,) = layer(A_t, b_t)
>>> solution.sum().backward()
__init__(problem, parameters, variables, solver=None, gp=False, verbose=False, canon_backend=None, solver_args=None)

Initialize the differentiable optimization layer.

Parameters:
  • problem (Problem) – A CVXPY Problem. Must be DPP-compliant (problem.is_dpp() must return True).

  • parameters (list[Parameter]) – List of CVXPY Parameters that will be filled with values at runtime. Order must match the order of tensors passed to forward().

  • variables (list[Variable]) – List of CVXPY Variables whose optimal values will be returned by forward(). Order determines the order of returned tensors.

  • solver (str | None) – CVXPY solver to use (e.g., cp.CLARABEL, cp.SCS). If None, uses the default diffcp solver.

  • gp (bool) – If True, the problem is treated as a disciplined parametrized geometric program; parameter values are log-transformed before solving. See the Example below.

  • verbose (bool) – If True, print solver output.

  • canon_backend (str | None) – Backend for canonicalization. Options are ‘diffcp’, ‘cuclarabel’, or None (auto-select).

  • solver_args (dict[str, Any] | None) – Default keyword arguments passed to the solver. Can be overridden per-call in forward().

Raises:
  • AssertionError – If problem is not DPP-compliant.

  • ValueError – If parameters or variables are not part of the problem.

Return type:

None
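
Example

When gp=True, the problem must be a disciplined parametrized geometric program. A minimal sketch of such a log-log convex problem (the parameter values and constraints are illustrative, not part of the API):

>>> x = cp.Variable(pos=True)
>>> y = cp.Variable(pos=True)
>>> z = cp.Variable(pos=True)
>>> a = cp.Parameter(pos=True)
>>> b = cp.Parameter(pos=True)
>>> c = cp.Parameter()
>>>
>>> # Log-log convex: minimize 1/(x*y*z) under posynomial constraints
>>> problem = cp.Problem(
...     cp.Minimize(1 / (x * y * z)),
...     [a * (x * y + x * z + y * z) <= b, x >= y ** c],
... )
>>> assert problem.is_dgp(dpp=True)
>>>
>>> layer = CvxpyLayer(problem, parameters=[a, b, c],
...                    variables=[x, y, z], gp=True)
>>> a_t = torch.tensor(2.0, requires_grad=True)
>>> b_t = torch.tensor(1.0, requires_grad=True)
>>> c_t = torch.tensor(0.5, requires_grad=True)
>>> x_s, y_s, z_s = layer(a_t, b_t, c_t)
>>> (x_s + y_s + z_s).backward()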

forward(*params, solver_args=None)

Solve the optimization problem and return optimal variable values.

Parameters:
  • *params (Tensor) – Tensor values for each CVXPY Parameter, in the same order as the parameters argument to __init__. Each tensor's shape must match the corresponding Parameter's shape, optionally with a batch dimension prepended. Batched and unbatched parameters can be mixed; unbatched parameters are broadcast across the batch (see the Example below).

  • solver_args (dict[str, Any] | None) – Keyword arguments passed to the solver, overriding any defaults set in __init__.

Returns:

Tuple of tensors containing optimal values for each CVXPY Variable specified in the variables argument to __init__. If inputs are batched, outputs will have matching batch dimensions.

Raises:

RuntimeError – If the solver fails to find a solution.

Return type:

tuple[Tensor, ...]

Example

>>> # Single problem
>>> (x_opt,) = layer(A_tensor, b_tensor)
>>>
>>> # Batched: solve 10 problems in parallel
>>> A_batch = torch.randn(10, 3, 2)
>>> b_batch = torch.randn(10, 3)
>>> (x_batch,) = layer(A_batch, b_batch)  # x_batch.shape = (10, 2)
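
Unbatched inputs may be mixed with batched ones, and solver settings can be overridden per call. A sketch of both (the "eps" key assumes the default diffcp/SCS solver; valid keys depend on the solver you chose at construction):

>>> # Mixed: A is batched, b is shared and broadcast across the batch
>>> A_batch = torch.randn(10, 3, 2, requires_grad=True)
>>> b_shared = torch.randn(3, requires_grad=True)
>>> (x_mixed,) = layer(A_batch, b_shared)  # x_mixed.shape = (10, 2)
>>>
>>> # Per-call solver settings override the __init__ defaults
>>> (x_tight,) = layer(A_batch, b_shared, solver_args={"eps": 1e-8})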

Usage Example

import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

# Define problem
n, m = 2, 3
x = cp.Variable(n)
A = cp.Parameter((m, n))
b = cp.Parameter(m)
problem = cp.Problem(
    cp.Minimize(cp.sum_squares(A @ x - b)),
    [x >= 0]
)

# Create layer
layer = CvxpyLayer(problem, parameters=[A, b], variables=[x])

# Solve with gradients
A_t = torch.randn(m, n, requires_grad=True)
b_t = torch.randn(m, requires_grad=True)
(x_sol,) = layer(A_t, b_t)

# Backpropagate
x_sol.sum().backward()
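
Since the backward pass differentiates through the solver's approximate solution, it can be worth checking gradients numerically. A minimal sketch using torch.autograd.gradcheck, assuming double-precision inputs and the default diffcp/SCS solver (the "eps" tolerance key is an assumption; check your solver's options):

from torch.autograd import gradcheck

A_t = torch.randn(m, n, dtype=torch.double, requires_grad=True)
b_t = torch.randn(m, dtype=torch.double, requires_grad=True)

# A tight solver tolerance keeps numerical and analytic gradients close
assert gradcheck(
    lambda A, b: layer(A, b, solver_args={"eps": 1e-8})[0].sum(),
    (A_t, b_t),
    atol=1e-4,
)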

GPU Usage

For GPU acceleration with CuClarabel:

import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

# problem, A, b, x, m, and n are as defined in the usage example above
device = torch.device("cuda")

layer = CvxpyLayer(
    problem,
    parameters=[A, b],
    variables=[x],
    solver=cp.CUCLARABEL
).to(device)

A_gpu = torch.randn(m, n, device=device, requires_grad=True)
b_gpu = torch.randn(m, device=device, requires_grad=True)
(x_sol,) = layer(A_gpu, b_gpu)

Integration with nn.Module

import cvxpy as cp
import torch
import torch.nn as nn
from cvxpylayers.torch import CvxpyLayer

class OptNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 6)

        # Unconstrained least squares: minimize ||A x - b||^2
        x = cp.Variable(2)
        A = cp.Parameter((3, 2))
        b = cp.Parameter(3)
        problem = cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b)))

        self.cvx = CvxpyLayer(problem, parameters=[A, b], variables=[x])

    def forward(self, features, b):
        # Map the feature vector to a batch of (3, 2) parameter matrices
        A = self.fc(features).view(-1, 3, 2)
        (x,) = self.cvx(A, b)
        return x
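
The layer's learnable parameters (here, the weights of self.fc) train like those of any other module. A minimal, hypothetical training loop; the feature, right-hand-side, and target tensors are placeholders shaped to match the layer:

model = OptNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

features = torch.randn(32, 10)  # batch of 32 feature vectors
b = torch.randn(32, 3)          # batched right-hand sides
target = torch.randn(32, 2)     # supervision for the solver's output

for step in range(100):
    optimizer.zero_grad()
    x = model(features, b)      # differentiable solve
    loss = torch.nn.functional.mse_loss(x, target)
    loss.backward()             # gradients flow through the QP solution
    optimizer.step()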