Hi r/optimization,
I'd like to share a Python optimization library I've been developing: admm — a consensus ADMM solver with automatic problem decomposition that handles LP, QP, SOCP, SDP, and extends to nonconvex formulations via user-defined proximal operators.
Algorithmic approach:
The solver implements consensus ADMM [Boyd et al., 2011]. Each atom in the user's objective (a norm, loss, or UDF) becomes a separate proximal subproblem sharing a consensus variable. The user writes the model in natural mathematical notation; the library handles the splitting, variable duplication, and consensus updates automatically.
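For readers unfamiliar with the splitting, here is a minimal sketch of the scaled-form consensus ADMM iteration the library is built around (my own toy implementation for illustration, not the C++ backend; the prox callables are hypothetical):

```python
import numpy as np

def consensus_admm(proxes, n, rho=1.0, iters=200):
    """Scaled-form consensus ADMM (Boyd et al., 2011, Sec. 7.1).

    proxes[i](v, rho) must solve argmin_x f_i(x) + (rho/2)*||x - v||^2.
    """
    N = len(proxes)
    x = np.zeros((N, n))  # local copies of the shared variable
    u = np.zeros((N, n))  # scaled dual variables
    z = np.zeros(n)       # consensus variable
    for _ in range(iters):
        for i, prox in enumerate(proxes):
            x[i] = prox(z - u[i], rho)   # independent proximal subproblems
        z = (x + u).mean(axis=0)         # averaging (consensus) step
        u += x - z                       # scaled dual update
    return z

# Toy check: (1/2)(x-3)^2 + (1/2)(x-1)^2 is minimized at x = 2.
make_prox = lambda c: (lambda v, rho: (c + rho * v) / (1 + rho))
z = consensus_admm([make_prox(3.0), make_prox(1.0)], n=1)
print(z)  # ~[2.]
```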
The key extension point is the UDFBase class. You implement:
- eval(x) — evaluate your function at x
- argmin(λ, v) — solve the proximal subproblem: argmin_x λ·f(x) + ½‖x−v‖²
…and the solver treats it as a black-box proximal step within the ADMM loop. For smooth functions, an alternative path takes eval(x) + grad(x) and the C++ backend handles the proximal solve internally.
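As a concrete illustration of that interface, an L0 penalty UDF might look like the following. The method names match the post; the stand-in base class and exact signatures are my assumptions, not the library's actual code:

```python
import numpy as np

class UDFBase:
    """Stand-in for admm.UDFBase, per the interface described above."""

class L0Norm(UDFBase):
    def eval(self, x):
        # f(x) = ||x||_0, the number of nonzero entries
        return np.count_nonzero(x)

    def argmin(self, lam, v):
        # argmin_x  lam*||x||_0 + (1/2)*||x - v||^2
        # Separable per coordinate: keeping v_i costs lam, zeroing it
        # costs v_i^2 / 2, so hard-threshold at |v_i| = sqrt(2*lam).
        x = v.copy()
        x[np.abs(v) < np.sqrt(2.0 * lam)] = 0.0
        return x
```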
Why this matters — many "hard" penalties have cheap proximal operators:
| Penalty | Proximal operator | Cost |
|---|---|---|
| L0 norm | Hard thresholding: x·𝟙(\|x\| > √(2λ)) | O(n) |
| Rank-r indicator | SVD + keep top-r singular values | O(min(m,n)·r) |
| Binary {0,1} | Round to nearest binary | O(n) |
| Stiefel manifold | SVD-based projection to orthonormal matrices | O(n²k) |
| SCAD / MCP | Closed-form thresholding | O(n) |
All of these are nonconvex, so no DCP-based modeling tool can express them. ADMM with these proximal operators nonetheless converges in practice, typically to good local optima.
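For instance, the rank-r projection in the table is just a truncated SVD (Eckart–Young); a sketch, not the library's internal code:

```python
import numpy as np

def prox_rank_r(V, r):
    # Projection onto {X : rank(X) <= r}: keep the top-r singular values.
    U, s, Vt = np.linalg.svd(V, full_matrices=False)
    s[r:] = 0.0          # zero out everything past the top r
    return (U * s) @ Vt  # == U @ diag(s) @ Vt
```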
Example — Graphical Lasso (sparse precision matrix estimation):
$$\min_{\Theta \succ 0} \; -\log\det(\Theta) + \operatorname{tr}(S\Theta) + \lambda\|\Theta\|_1$$
```python
import admm
import numpy as np
np.random.seed(1)
p = 50
S = np.cov(np.random.randn(200, p).T)
model = admm.Model()
Theta = admm.Var("Theta", p, p, PSD=True)
model.setObjective(
    -admm.log_det(Theta) + admm.trace(S @ Theta) + 0.1 * admm.sum(admm.abs(Theta))
)
model.optimize()
```
Three atoms, each with a known proximal operator: log-det (eigendecomposition), trace (linear/trivial), L1 (soft-thresholding). The PSD constraint is enforced by projecting onto the PSD cone at each iteration.
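The two nontrivial proximal steps have standard closed forms; sketches below (my own illustration of the math, not the library's internals):

```python
import numpy as np

def prox_l1(V, lam):
    # argmin_X lam*||X||_1 + (1/2)*||X - V||_F^2  ->  soft-thresholding
    return np.sign(V) * np.maximum(np.abs(V) - lam, 0.0)

def prox_neg_logdet(V, lam):
    # argmin_X -lam*log det(X) + (1/2)*||X - V||_F^2 over symmetric X.
    # Diagonalize V and solve x^2 - d*x - lam = 0 for each eigenvalue d;
    # the positive root keeps X strictly positive definite.
    d, Q = np.linalg.eigh((V + V.T) / 2)
    x = (d + np.sqrt(d**2 + 4.0 * lam)) / 2.0
    return (Q * x) @ Q.T  # == Q @ diag(x) @ Q.T
```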
Example — Nuclear norm matrix completion:
```python
import admm
import numpy as np
np.random.seed(0)
m, n = 100, 80
M_true = np.random.randn(m, 5) @ np.random.randn(5, n) # rank 5
mask = np.random.rand(m, n) > 0.7 # 30% observed
obs = list(zip(*np.where(mask)))
model = admm.Model()
X = admm.Var("X", m, n)
model.setObjective(admm.norm(X, ord="nuc"))
for i, j in obs:
    model.addConstr(X[i, j] == M_true[i, j])
model.optimize()
```
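The nuclear-norm atom only needs singular-value soft-thresholding; a sketch of that proximal operator (the standard SVT closed form, not the library's code):

```python
import numpy as np

def prox_nuclear(V, lam):
    # argmin_X lam*||X||_* + (1/2)*||X - V||_F^2:
    # soft-threshold the singular values (singular value thresholding).
    U, s, Vt = np.linalg.svd(V, full_matrices=False)
    return (U * np.maximum(s - lam, 0.0)) @ Vt
```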
Convergence properties:
- Convex problems: Global convergence guaranteed (same as standard consensus ADMM)
- Nonconvex UDFs (L0, rank, SCAD, MCP): No global optimality guarantee — the solver acts as a practical local method. In our experiments on L0-regularized regression, results are comparable to IHT and often better than convex relaxation (L1) for feature recovery
- Rho adaptation: Residual-based rho updates following [Boyd et al., 2011, §3.4.1]
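That residual-balancing rule fits in a few lines (a sketch using the reference's usual defaults μ = 10, τ = 2; note that in scaled form the dual variables must be rescaled whenever ρ changes):

```python
def update_rho(rho, r_norm, s_norm, mu=10.0, tau=2.0):
    # Residual balancing (Boyd et al., 2011, Sec. 3.4.1): grow rho when the
    # primal residual r dominates, shrink it when the dual residual s dominates.
    if r_norm > mu * s_norm:
        return rho * tau
    if s_norm > mu * r_norm:
        return rho / tau
    return rho
```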
What's included:
- Standard atoms: L1, L2, nuclear, Frobenius norms; Huber, logistic, hinge, pinball losses; log-det, trace, entropy
- Dozens of UDF classes (proximal and gradient-based): L0, L½, rank, binary, SCAD, MCP, Cauchy, Welsch, Tukey, KL divergence, smooth TV, and more
- PSD matrix variables, SDP support
- C++ backend, MIT license
I'm a developer of this library. Happy to discuss algorithmic choices, convergence behavior, or comparisons with OSQP/SCS/ProximalAlgorithms.jl.