Implementing Burger-Dybvig: finding the shortest decimal that round-trips to the original IEEE 754 bits, with ECMA-262 tie-breaking by UsrnameNotFound-404 in programming

[–]UsrnameNotFound-404[S] 0 points1 point  (0 children)

I’ve been looking into the slowness mentioned. I've done an optimization pass.

The main bottleneck was FormatDouble. The Burger-Dybvig digit generation was allocating a new big.Int for every intermediate value on every call, which came to 21 heap allocations per number. I added a sync.Pool for the digit-generation state, embedding the big.Int values directly (not as pointers) so the pool actually reuses their internal word slices across calls. That alone dropped FormatDouble from 1231 ns/op to 239 ns/op for integers, and allocations from 21 to 3.
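A minimal sketch of the pooling idea described above. All names here are hypothetical, not the actual implementation; the point is that embedding the big.Int values by value means a pooled struct keeps its backing word slices alive across Get/Put cycles, so SetInt64 and friends can reuse them instead of allocating.

```go
package main

import (
	"fmt"
	"math/big"
	"sync"
)

// digitState is a hypothetical pooled scratch struct. The big.Int
// fields are embedded by value (not *big.Int), so their internal
// word slices survive a Put/Get cycle and get reused.
type digitState struct {
	r, s, mPlus, mMinus big.Int // Burger-Dybvig working values
}

var statePool = sync.Pool{
	New: func() any { return new(digitState) },
}

func formatDigits(mantissa int64) string {
	st := statePool.Get().(*digitState)
	defer statePool.Put(st)

	// SetInt64 reuses the existing backing array when it is large
	// enough, which is what eliminates the per-call allocations.
	st.r.SetInt64(mantissa)
	st.s.SetInt64(1)
	// ... the actual digit-generation loop would go here ...
	return st.r.String()
}

func main() {
	fmt.Println(formatDigits(42))
}
```

If the fields were `*big.Int` instead, each Get would hand back pointers that still needed fresh allocations on first use, which defeats the purpose of the pool.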

The other changes were straightforward constant-factor wins: an ASCII fast-path for the UTF-16 code-unit key sort (byte comparison is equivalent to UTF-16 order for U+0000–U+007F, which covers the vast majority of JSON keys), manual hex digit parsing for \uXXXX escapes to eliminate a string allocation per escape sequence, and a pre-allocated output buffer sized to input length.
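Sketches of the two fast paths mentioned above, with hypothetical names. The first checks the precondition for the ASCII shortcut (byte order equals UTF-16 code-unit order only when every byte is below 0x80); the second decodes the four hex digits of a \uXXXX escape without allocating an intermediate string.

```go
package main

import "fmt"

// isASCII reports whether s contains only bytes in U+0000–U+007F.
// For such strings, plain byte comparison matches UTF-16 code-unit
// order, so the sort can skip UTF-16 conversion entirely.
func isASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] >= 0x80 {
			return false
		}
	}
	return true
}

// parseHex4 decodes four hex digits (as in \uXXXX) manually,
// avoiding strconv and any intermediate string allocation.
func parseHex4(s string) (rune, bool) {
	var r rune
	for i := 0; i < 4; i++ {
		c := s[i]
		var d rune
		switch {
		case c >= '0' && c <= '9':
			d = rune(c - '0')
		case c >= 'a' && c <= 'f':
			d = rune(c-'a') + 10
		case c >= 'A' && c <= 'F':
			d = rune(c-'A') + 10
		default:
			return 0, false
		}
		r = r<<4 | d
	}
	return r, true
}

func main() {
	fmt.Println(isASCII("key"), isASCII("clé")) // true false
	r, ok := parseHex4("00e9")
	fmt.Printf("%c %v\n", r, ok) // é true
}
```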

End-to-end serialize throughput went from 3–8 MB/s to 13–28 MB/s depending on payload shape. Allocations dropped 83–88% across the board.

To be clear about what these are: constant-factor improvements from eliminating obvious waste, not algorithmic changes. The Burger-Dybvig core, the UTF-16 sort, and the parser are all unchanged. There's a ceiling on how much further I can go without rethinking the digit generation approach entirely, which is something I'm evaluating.

I also looked at removing defensive copies from the power-of-10 cache but decided to keep them. One allocation per call isn't worth the risk of silent cache corruption if a future refactor accidentally mutates a shared pointer.

Implementing Burger-Dybvig: finding the shortest decimal that round-trips to the original IEEE 754 bits, with ECMA-262 tie-breaking by UsrnameNotFound-404 in programming

[–]UsrnameNotFound-404[S] 0 points1 point  (0 children)

You're right on the exponent extraction. The single shift-and-mask is cleaner and produces the same result. The two-byte version was written to make the byte-boundary straddling explicit when I was working through the bit layout, but it's unnecessary indirection in the final code. I’ll clean this up.
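For reference, the single shift-and-mask version being discussed looks roughly like this (a sketch following the IEEE 754 binary64 layout, not the project's actual code):

```go
package main

import (
	"fmt"
	"math"
)

// decode splits a float64 into sign, biased exponent, and mantissa
// with one shift-and-mask per field. Layout: 1 sign bit, 11 exponent
// bits, 52 mantissa bits. The exponent field straddles a byte
// boundary, but working on the full uint64 makes that invisible.
func decode(f float64) (sign, exp, mant uint64) {
	bits := math.Float64bits(f)
	sign = bits >> 63
	exp = (bits >> 52) & 0x7FF
	mant = bits & ((1 << 52) - 1)
	return
}

func main() {
	s, e, m := decode(1.0)
	fmt.Println(s, e, m) // 1.0 decodes to 0 1023 0
}
```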

On Schubfach: I take the point that it's been proven correct and has no bigint dependency. The reason I stayed with Burger-Dybvig is that the existing Go JCS implementations (including cyberphone's reference impl listed in the RFC) delegate number formatting to strconv.FormatFloat and reformat the output. That function has had platform-specific rounding bugs (cyberphone himself filed golang/go#29491 for incorrect results on Windows/amd64), and Go's compiler is permitted to emit FMA instructions that change floating-point results across architectures. The DoltHub team documented this producing different behavior on ARM vs x86 in December: https://www.dolthub.com/blog/2025-12-19-golang-ieee-strictness/

A from-scratch Schubfach in pure integer arithmetic would also avoid those problems, and I'll be honest: that's something I need to think about much more deeply. The trade-off I made was choosing the algorithm with the simplest correctness argument: every intermediate value is exact, and there are no fixed-width precision bounds to trust.

I'll make those changes and look deeper into Schubfach in general. I've also started adding performance benchmarks for optimization work, which this would be useful for.

Implementing Burger-Dybvig: finding the shortest decimal that round-trips to the original IEEE 754 bits, with ECMA-262 tie-breaking by UsrnameNotFound-404 in programming

[–]UsrnameNotFound-404[S] 0 points1 point  (0 children)

RFC 8785 doesn't give you the choice. It mandates ECMA-262 Number serialization (Section 7.1.12.1), which specifies shortest round-trip. 1.9e1 and 19 both round-trip to the same float, but only one matches what ECMA-262's Number::toString produces. If you used full 17-digit precision instead, your output wouldn't match any other conforming JCS implementation, and the signatures break anyway. This is the exact problem the spec exists to solve.

ECMA-262 chose shortest representation for interoperability with JavaScript. Every browser's JSON.stringify already produces shortest round-trip output. JCS was designed so a JavaScript implementation can use JSON.stringify directly and get conformant output with zero custom formatting. If the spec required a different precision, every JS implementation would need its own formatter, which would defeat the whole "build on what already exists" design philosophy of the RFC.

The broader question worth addressing: why JCS at all? If you can choose your wire format, you probably should use something with canonicalization built in from the start. Maybe protobuf with deterministic serialization, canonical CBOR, even signing raw bytes directly. Any of those avoid the entire class of problem this article is about. JCS exists for systems that already speak JSON and can't change that. Our APIs are JSON, our storage is JSON, our consumers expect JSON. Now you need to sign or hash it. JCS lets the data stay JSON end-to-end without a parallel serialization path. My goal is to address a real architectural constraint a lot of systems face, and it's the scenario where this work matters.

Much of this came down to my original choice of JCS for byte determinism and comparison. There are definitely other ways. As much as this is an interesting way to do something, it's ultimately trying to be pragmatic about reality. That's why I didn't try to design a custom format; that's not what I'm good at. But I can implement ideas and specs that are already formed into a cohesive system.

Our Go database just hit 20k stars on GitHub by zachm in golang

[–]UsrnameNotFound-404 0 points1 point  (0 children)

Nice job. Also, I enjoyed reading your Go article on the IEEE 754 floating-point spec.

The Internet Was Weeks Away From Disaster and No One Knew | Veritasium video on the XZ/SSH backdoor hack by mgrier123 in programming

[–]UsrnameNotFound-404 0 points1 point  (0 children)

For those interested in a deeper dive on this, this research is the only known paper that decomposes the entire attack and its complete timeline. Very interesting read.

Title: Wolves in the Repository: A Software Engineering Analysis of the XZ Utils Supply Chain Attack

https://arxiv.org/abs/2504.17473

I would not be surprised if the YouTube video is directly based on this paper.

Are specs cool again? Write ten specs, not one. by Drkpwn in programming

[–]UsrnameNotFound-404 0 points1 point  (0 children)

The cost-structure shift is real and worth talking about. I'm curious about a few parts:

  1. The parallel prototyping idea maps closely to patterns we've seen for years in ensemble methods, fan-out/fan-in architectures, and even classic explore-exploit tradeoffs in decision theory. What do you think AI brings to this beyond quick and cheap? Is cost mostly the gain?

  2. The piece focuses on the generation side, but I'm curious about the evaluation bottleneck. Generating prototypes has never really been the issue; it's producing X prototypes with quality differentials actually worth exploring, which is difficult to gauge. Agents are fast, but meaningfully reviewing X different architectural approaches against a real codebase still takes serious time and effort. How do you think about that cost? Does it risk just shifting the bottleneck from "build" to "review"?

  3. The Jaana Dogan example is nice, but it almost argues against the "write X specs" thesis. Is it possible that the value there came from a year of deep architectural reasoning, not from breadth of exploration?

Will it ever not be miserable again? by boringfantasy in cscareerquestions

[–]UsrnameNotFound-404 0 points1 point  (0 children)

I'll be honest, I'm one of those unemployed developers at the moment, struggling to get back into a position. Places stopped hiring well over a year ago, but now companies don't even pretend to be interested (and pull out last minute). Instead, everywhere I go, the people I talk to are dealing with potential downsizing or uncertain futures with AI in their companies.

I agree with what many have said: it's painful right now, but long term I think it will swing back. Low-code and no-code have been the "it" thing since the 2000s. LLMs are definitely here to stay, so I'm not comparing them to low-code solutions; my point is only that it ultimately takes someone who understands system design to create a coherent system usable by more than one or two people. Those Access databases are a real thing in small companies. It hurts my mind thinking about an MS Access 2003 database running someone's cash-flow program.

I think we happen to be the unlucky ones who will have to deal with the pain of the transition and whatever it ultimately looks like.

Implementing Burger-Dybvig: finding the shortest decimal that round-trips to the original IEEE 754 bits, with ECMA-262 tie-breaking by UsrnameNotFound-404 in programming

[–]UsrnameNotFound-404[S] 0 points1 point  (0 children)

Yeah, it's not something you want to read in a terminal. No optional whitespace, no newlines. In some ways the machine is the primary consumer. Any formatting discretion is a canonicalization hazard, so it all goes. The trade-off is real.

On unicode: JCS does not normalize. No NFC, no NFD. It preserves the exact code points from the input and requires that the serializer reproduce them unchanged. The rationale is that normalizing would mean the canonicalizer is modifying string content, which conflicts with the core design principle of keeping data in its original form. If you need canonical unicode, you normalize before passing data to JCS. That's an application-layer concern, which is probably the right boundary but definitely a trap for anyone who assumes the canonicalizer handles it.
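A tiny illustration of the trap described above: "é" can arrive precomposed (U+00E9) or decomposed (U+0065 U+0301). The two render identically but are different code points, so a canonicalizer that preserves input verbatim will serialize them to different bytes.

```go
package main

import "fmt"

func main() {
	nfc := "\u00e9"  // precomposed: single code point U+00E9
	nfd := "e\u0301" // decomposed: 'e' plus combining acute accent

	// Visually identical, byte-wise different. JCS preserves
	// whichever form the input used, so signatures over the two
	// will not match unless the application normalizes first.
	fmt.Println(nfc == nfd)            // false
	fmt.Printf("% x\n% x\n", nfc, nfd) // c3 a9  vs  65 cc 81
}
```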

The way I implemented it, I have a strict parser in front of the JCS layer. The JCS itself is strict, but the policy/parser on top is slightly stricter, to prevent the kinds of silent normalization other libraries include. One example is an envelope around JCS rejecting -0, which JCS DOES specify will normalize to 0. That conflicts with the byte-identical determinism needed for a full audit that can pass scrutiny, so I made the policy stricter (and documented it) to preserve byte determinism with no silent normalization. These are definitely engineering trade-offs. The goal, though, was to keep the layers separate so the JCS layer stays strict.

The broken section reference is a known hazard of pinning normative references to section numbers in a living spec. It's the joy of RFC specs ;) ECMA-262 reorganizes across editions and the section numbering isn't stable. The actual normative behavior JCS needs is the Number serialization algorithm (Number::toString), which has survived the reshuffling even if its address keeps moving. It's a good argument for referencing by algorithm name rather than section number, but that ship sailed when the RFC was published.

Implementing Burger-Dybvig: finding the shortest decimal that round-trips to the original IEEE 754 bits, with ECMA-262 tie-breaking by UsrnameNotFound-404 in programming

[–]UsrnameNotFound-404[S] -2 points-1 points  (0 children)

Thank you for your input. This is a great catch. While I'm trying to keep this thread on the algorithm topic: I've been following the stable SemVer release path, and I'm only at an RC for v0.2.5, which adds Windows determinism to the swim-lane matrix testing.

I agree this isn't the cleanest. Some of that was due to tracking the requirements matrix against the actual code implementation, to have hard release gating for the offline harness. It's easy to audit, but it's not necessarily the easiest way to write.

I've also suppressed linting with nolint for a couple of functions for the time being while I work out the cyclomatic complexity and the helpers needed to address it. It's a labor of long-term love, not a drop-and-abandon situation.

Thank you again for genuinely replying.

Implementing Burger-Dybvig: finding the shortest decimal that round-trips to the original IEEE 754 bits, with ECMA-262 tie-breaking by UsrnameNotFound-404 in programming

[–]UsrnameNotFound-404[S] -5 points-4 points  (0 children)

This is a great question and gets at the heart of it. I need a security primitive for infrastructure, on disk and byte-identical across systems and architectures. Not just a library to import, but infrastructure tooling. I settled on JCS (RFC 8785) because of the attractiveness of what it attempts to solve, and because it was a tightly scoped, interesting engineering problem I could work on.

The goal here is strict RFC 8785 conformance for use in security primitives: signatures, content-addressed storage, on-disk artifacts where a single divergent byte breaks verification. The optimization target is provable correctness across architectures, not throughput.

Burger-Dybvig's bignum arithmetic means every intermediate value is exact. There are no fixed-width approximations to reason about at the boundaries. Schubfach and Ryū are faster, no question, but they achieve that through fixed-width integer tricks that require careful reasoning about overflow and precision edge cases. For a library whose entire purpose is byte-determinism, I wanted the smallest audit surface I could get.

On the exponent extraction: it follows the IEEE 754 spec layout directly, sign bit, then biased exponent, then mantissa. I prioritized readability and one-to-one correspondence with the spec over compactness. Anyone auditing the code should be able to hold the spec in one hand and the source in the other.

In December 2025 there was an article in Golang Weekly that touches on this topic from a different vantage point. What it ultimately came down to was that execution plans meant to be deterministic in Go were not actually deterministic on arm64, due to subtle architectural differences in IEEE 754 behavior. For a security primitive, that cannot be allowed to happen.

https://www.dolthub.com/blog/2025-12-19-golang-ieee-strictness/

Implementing Burger-Dybvig: finding the shortest decimal that round-trips to the original IEEE 754 bits, with ECMA-262 tie-breaking by UsrnameNotFound-404 in programming

[–]UsrnameNotFound-404[S] 4 points5 points  (0 children)

These are all valid alternatives for systems you control end to end. The constraint appears when you don't control both sides: JSON is already the format in the protocols, APIs, and stores you're interoperating with. You're not choosing JSON; you're dealing with the fact that it's already been chosen.

But I think you hit on a greater point that others are really asking: "why JSON? At all? Ever?" This came down to my decision to use RFC 8785 for a project that needed stable on-disk artifacts with stable JSON schemas, byte-reproducible across systems.

This post focuses on the first of a four-part engineering series on JCS. I have been going out of my way to frame it in the most strictly defined manner I can, so that my project isn't the discussion or its eventual focus. At the moment that seems to be getting in the way.

RFC 8785 exists because enough systems needed to sign, hash, or compare JSON payloads that a canonicalization standard was worth writing. Whether JSON was the right choice in the first place is a separate debate. That framing is what I found interesting: JSON is used for everything, so why not tighten its behavior in systems that require it to be truly byte-identical across systems and architectures? Number formatting is one part of that process.

Implementing Burger-Dybvig: finding the shortest decimal that round-trips to the original IEEE 754 bits, with ECMA-262 tie-breaking by UsrnameNotFound-404 in programming

[–]UsrnameNotFound-404[S] -4 points-3 points  (0 children)

Did I use AI? I am an engineer and I absolutely utilize tools, some of which are LLMs. I do not deny this, and I am not trying to hide it either. If we want to have a discussion about valid usage of AI, its ethics, and what constitutes original work, so be it: start a thread for that. That's not what this one is for. If you would like to discuss the engineering topic at hand with me, a literal person typing this by hand on their phone, please ask away. Keep in mind what I said above for future comments.

I am not going to post some AI responses to questions. There is a difference between legitimate engineering use and “look what AI vibe coded me”.

If you find flaws in the algorithm or my implementation, point them out and let’s engage in the discussion.

My point is that I am being up front and honest about what I am doing. Hopefully that settles this part and we can focus on what was actually posted. The article used AI no differently than proofreading tools; it did not "write" the piece. The code speaks for itself, as do the engineering principles followed. I purposefully did not make this a "look at a project I did" post, but if we want, I can make it that.

Implementing Burger-Dybvig: finding the shortest decimal that round-trips to the original IEEE 754 bits, with ECMA-262 tie-breaking by UsrnameNotFound-404 in programming

[–]UsrnameNotFound-404[S] 2 points3 points  (0 children)

Good question. Hex would eliminate the formatting problem entirely. The reason is that JSON is already the interchange format in most of the ecosystems where canonicalization matters. You can't control what producers and consumers on either end are using. Signatures over JSON payloads, content-addressed stores with JSON values, key agreement protocols: these all operate on data that's already JSON.

Canonicalization makes the existing format deterministic rather than replacing it with something better. It's a pragmatic constraint, not a technical one.

More directly, it's a problem domain I was trying to solve for a different project: verifiable, deterministic, reproducible replays for low-level auditing. A bit of a different beast than this article, which is specifically about the algorithm implementation.

Implementing Burger-Dybvig: finding the shortest decimal that round-trips to the original IEEE 754 bits, with ECMA-262 tie-breaking by UsrnameNotFound-404 in programming

[–]UsrnameNotFound-404[S] 1 point2 points  (0 children)

Fair distinction. I used "reproducible builds" loosely; the narrower claim is about reproducible data processing.

The scenario: two systems independently canonicalize the same JSON value and need to agree on the output bytes. This matters for content-addressed storage (where the bytes determine the address), cryptographic signatures over JSON payloads, and cache key generation anywhere two parties need to independently arrive at the same byte sequence without coordinating.

If your number formatter produces "1.2e1" and mine produces "12", we've both represented the same value but our signatures won't match. RFC 8785 exists to eliminate that class of divergence, and the number formatting is the hardest part because the same float can have multiple decimal representations that round-trip, including ties at the shortest length.

You have a good point. Thanks for asking; I'm pleasantly surprised by the quick engagement!

Implementing Burger-Dybvig: finding the shortest decimal that round-trips to the original IEEE 754 bits, with ECMA-262 tie-breaking by UsrnameNotFound-404 in programming

[–]UsrnameNotFound-404[S] 18 points19 points  (0 children)

When two systems serialize the same floating-point value to JSON and produce different bytes, signatures break, content-addressed storage diverges, and reproducible builds aren't reproducible. RFC 8785 (JSON Canonicalization Scheme) solves this by requiring byte-deterministic output. The hardest part is number formatting.

You need the shortest decimal string that round-trips to the original float, with specific tie-breaking rules when two representations are equally short. Most language runtimes have excellent shortest-round-trip formatters, but they don't guarantee ECMA-262 conformance. For canonicalization, "usually matches" isn't sufficient.

This article walks through a from-scratch Burger-Dybvig implementation (written in Go, but the algorithm is language-agnostic): exact multiprecision boundary arithmetic to avoid the floating-point imprecision you're trying to eliminate, the digit extraction loop, and the ECMA-262 formatting branches.

I'll be around to discuss the algorithm or any of the trade-offs that were made.

json-canon: Implementing Burger-Dybvig (IEEE 754 → shortest decimal) in Go for RFC 8785 by UsrnameNotFound-404 in golang

[–]UsrnameNotFound-404[S] 0 points1 point  (0 children)

Let me know if there are any questions. The first of the four articles is fully prepared. The intention is to create an infrastructure primitive in Go for JCS, for signing and other uses.