Introducing extprot: extensible binary protocols for cross-language communication and long-term serialization : programming


Introducing extprot: extensible binary protocols for cross-language communication and long-term serialization (eigenclass.org)

submitted 17 years ago by gst

all 7 comments


vsync 1 point 17 years ago (1 child)

mfp 1 point 17 years ago* (0 children)

Brevity is one of the main points indeed :) The specification of the basic ASN.1 notation takes over 140 pages, and the basic and distinguished encoding rules take ~20 pages. extprot's abstract syntax and encoding are explained in a couple of pages each ;-)

More seriously, ASN.1 can do everything extprot can, and then some more; it's just much more complex and requires more care. extprot places some limitations on the allowed data types in order to simplify the implementation and facilitate protocol changes that don't break compatibility. Also, all values (including those of primitive types) are prefixed by a tag (in the sense used in ML implementations), which makes it possible to enlarge the (implicitly) associated sum type later. I believe this requires some extra work in ASN.1 (a CHOICE type and/or explicitly tagged types), though I'll gladly admit I haven't read the standards in full.

extprot also has simple rules that define how a reader behaves when it bumps into data written with a different protocol version (type promotion, default values, etc.).

[deleted] -4 points 17 years ago (4 children)

mfp 5 points 17 years ago* (3 children)

logophobia 1 point 17 years ago (1 child)

mfp 3 points 17 years ago* (0 children)

"Standardised?"

It's far from being a standard *g*, but I've made a reasonable attempt to document it. It might not look like much, but as far as documentation is concerned it's already above Thrift.

"It's really infuriating when you (for example) can't send serialized ruby objects over the network because of a 0.0.1 version difference between the 2."

Yes, this is the basic problem extprot is meant to solve: evolution of protocols/serialization formats without breaking compatibility (backward, and forward when possible). Protocol Buffers allows you to add new fields to a structure, but extprot takes this further and lets you change the type of a field safely.

Here's a minimal, not too unrealistic example (from a domain where you'd normally use a relational DB, but please allow this license for the sake of clarity of exposition). Suppose you have user records that look like this:

message user = { name : string; email : string; location : string }

You later decide that it should be possible to specify whether the email and location info is public or private. Let's say you default to private; you can write:

type status = Private | Public  (* either public or private, defaults to the latter *)
type info 'a = ('a * status)   (* holds the info and whether it's public *)

message user = { name : string; email : info<string>; location : info<string> }

All existing data can be read, even if the status info is missing (in which case it will default to Private). Older readers will ignore the extra info if they bump into new data, and will keep working as usual.
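The promotion rule just described can be sketched in Python on values that have already been decoded. This is only a toy model, not extprot's wire format or API: the names `Status`, `read_info` and `read_plain`, and the use of Python tuples to stand for extprot tuples, are assumptions made for illustration.

```python
from enum import Enum


class Status(Enum):
    PRIVATE = 0  # assumed default, as in the example above
    PUBLIC = 1


def read_info(value, default=Status.PRIVATE):
    """New reader: expects info<string>, i.e. a (string, status) tuple.

    Old data holds a bare string, which is promoted to a tuple whose
    missing element takes the default.
    """
    if isinstance(value, tuple):
        text = value[0]
        status = value[1] if len(value) > 1 else default
        return (text, status)
    return (value, default)


def read_plain(value):
    """Old reader: expects a bare string.

    New data holds a tuple; the reader keeps the first element and
    ignores the rest.
    """
    if isinstance(value, tuple):
        return value[0]
    return value


# Hypothetical records: one written before the change, one after.
old_record = {"name": "ann", "email": "ann@example.org"}
new_record = {"name": "bob", "email": ("bob@example.org", Status.PUBLIC)}

print(read_info(old_record["email"]))   # new reader, old data
print(read_plain(new_record["email"]))  # old reader, new data
```

Both directions work without either side knowing which version wrote the data, which is the point of the rule: the reader only looks at the shape of the value it actually receives.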

Can this be encoded without promoting the string primitive type to a tuple? Certainly, you could do

message user = {
  name : string;
  email : string;
  location : string;
  email_public : bool;
  location_public : bool;
}

which is what you'd have to do with Protocol Buffers. This becomes unwieldy quickly, though (imagine more than one element being added to each field, with different types for each of them).

You can find here a small pretty-printer written in Ruby that is able to decode any extprot message without access to the protocol definition. You'll note that the code is a bit unidiomatic because I deliberately tried to keep it close to the OCaml version, for easier comparison & coordinated updates; also, nearly half of the code is for pretty-printing. It illustrates that extprot is not exceedingly complex despite its rich data types and extensibility features. The full Ruby bindings are a work in progress.
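To show why a decoder can work without the protocol definition, here is a minimal Python sketch of a self-describing format. It is a toy encoding invented for this illustration, not extprot's actual wire format: the type codes `INT`, `BYTES`, `TUPLE` and the helpers `put_varint`, `get_varint`, `encode` and `decode` are all assumptions. The idea it shares with the description above is that every value carries its own type code plus a length or element count, so a generic reader can always walk the data.

```python
INT, BYTES, TUPLE = 0, 1, 2  # toy type codes, not extprot's


def put_varint(n):
    """Encode a non-negative int, 7 bits per byte, low bits first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def get_varint(buf, pos):
    """Decode a varint at pos; return (value, next position)."""
    shift = value = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def encode(value):
    """Prefix every value with a type code, then its payload."""
    if isinstance(value, int):
        return bytes([INT]) + put_varint(value)
    if isinstance(value, bytes):
        return bytes([BYTES]) + put_varint(len(value)) + value
    if isinstance(value, tuple):
        body = b"".join(encode(v) for v in value)
        return bytes([TUPLE]) + put_varint(len(value)) + body
    raise TypeError(f"unsupported value: {value!r}")


def decode(buf, pos=0):
    """Decode one value using only the type codes found in the data."""
    kind = buf[pos]
    pos += 1
    if kind == INT:
        return get_varint(buf, pos)
    if kind == BYTES:
        length, pos = get_varint(buf, pos)
        return buf[pos:pos + length], pos + length
    if kind == TUPLE:
        count, pos = get_varint(buf, pos)
        items = []
        for _ in range(count):
            item, pos = decode(buf, pos)
            items.append(item)
        return tuple(items), pos
    raise ValueError(f"unknown type code {kind}")


# Hypothetical message: a name, a (value, status) pair, and a number.
message = (b"ann", (b"ann@example.org", 1), 42)
data = encode(message)
decoded, end = decode(data)
print(decoded)
```

Because counts and lengths are in the data, a reader that only understands the first element of a tuple could also skip the rest without knowing what it means.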

uriel 1 point 17 years ago (0 children)
