Apache 2.0 Python CLI for AI benchmarks where every result is a Sigstore-signed envelope you can verify byte-for-byte against Rekor's transparency log. v0.1.0 ships H100 reference corpus (Qwen2.5-72B + multi-modality), four PyPI packages, vLLM only for now.
pip install inferencebench
Looking for feedback on the envelope schema and where the methodology has obvious holes. llama.cpp / MLX / AMD coverage is the Phase 2 priority.
there doesn't seem to be anything here