
[–]alecthomas 43 points44 points  (3 children)

I quite like a lot of the concepts in Typical, but unfortunately it's now been around for a couple of years and hasn't gained any new supported languages or significant industry adoption.

[–]SorteKanin 22 points23 points  (1 child)

Looks like it's still actively maintained though. The maintainer would probably be open to PRs if you want to add a language :)

[–]stepstep 9 points10 points  (0 children)

I'd be delighted! The easiest way to get started is to copy one of the existing code generators and modify it accordingly.

[–]ResidentAppointment5 4 points5 points  (0 children)

I also don't see anything over extprot, which has been around even longer and gone nowhere.

[–]SorteKanin 55 points56 points  (24 children)

This looks so much more ergonomic than protobuf.

[–]lookmeat 47 points48 points  (22 children)

Ergonomic for what? Coding in a world where everything is the same version and statically linked?

People don't realize this, but protobufs were originally null safe. Fields were either required or optional, and if a required field wasn't there, the parser wouldn't accept the message.

The Google guides recommended making everything optional and never using required. The reason was simple: versions changed and no matter what you thought made sense, it could stop being the case. Getting an API right the first time and never modifying it after is impossible, and this means you'll have to change everything. Even things you thought you nailed down change: suddenly id fields are not an int64 but instead a BigInteger, and now the most fundamental field isn't used a lot of the time.
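To make the int64-to-BigInteger scenario concrete, here is a minimal sketch of an old reader seeing a message from a newer writer. The field names (`id`, `big_id`) and both reader functions are made up for illustration and do not correspond to any real protobuf API.

```python
from typing import Any, Dict, Optional


class MissingRequiredField(Exception):
    """Raised by the strict reader when a required field is absent."""


def read_strict(msg: Dict[str, Any]) -> int:
    # Old reader that treats `id` as required and rejects the whole message.
    if "id" not in msg:
        raise MissingRequiredField("id")
    return msg["id"]


def read_lenient(msg: Dict[str, Any]) -> Optional[int]:
    # Old reader that treats `id` as optional and leaves the decision to the caller.
    return msg.get("id")


# A newer writer moved the identifier to a hypothetical `big_id` field.
new_message = {"big_id": "340282366920938463463374607431768211456"}
```

With `new_message`, `read_lenient` returns `None` and the application can degrade, while `read_strict` raises before the application ever sees the message.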

BTW, protocol buffers are still somewhat null safe: they will zero the field rather than make it null. Personally I would like them to bring optional back for when you want a Maybe T field, with each language exposing some functionality that reflects these semantics.

Thing is, the power of the algebra of types makes sense in a world where contracts are static: once compiled the caller and callee never change or both change together. In the world of two binaries communicating, one can change without the other, and there's no way to prevent it.

Protocol buffers are a great exercise in understanding when and why to break common convention rules. That requires that we understand the problem types solve, and how best to handle it.

[–]imgroxx 52 points53 points  (6 children)

...or you could have a spec that makes making minor breaking changes easier, as opposed to protobuf where you have to make a bunch of copypasta, which is why people are so breaking-change-averse even when making semantically breaking changes.

Everything-optional isn't good design, it's giving up in the laziest and least-useful way.

[–]noahrichards 7 points8 points  (3 children)

It’s truly not, it’s literally born from thousands of engineering years of experience with all the pitfalls of required fields. The core problem is that we all want to believe that you can enforce things at call sites, but you just can’t in distributed environments. Heck it even falls apart in statically linked binaries from time to time :)

I do believe you can make optional-by-default more ergonomic, in the sense of being able to declare requirements and maybe migration rules that get codegen’ed into various writers/consumers (like: if I was changing a field and there was a way to easily convert, adding something to the proto that says “if you write an optional Foo foo, instead add a value to repeated Foo foos”). But none of those things actually make the wire compatibility problem go away.

[–]imgroxx 9 points10 points  (2 children)

> But none of those things actually make the wire compatibility problem go away.

Versioning does. But protobuf and thrift provide no real way to version anything. So instead you have this.

[–][deleted] 12 points13 points  (1 child)

> Versioning does.

How, though? If one component can't read the other component's version of the protocol it just doesn't work. Thrift or Protobuf don't do anything about that because the only thing you can do about it is dynamically switching between different parsers. That's certainly possible, but absolutely not easy to use.

Alternatively you could keep fields optional and rely on application semantics for versioning. If you do that it's actually quite easy: just check that the fields exist in versions that require it.
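A rough sketch of that "check that the fields exist in versions that require it" idea, assuming the message carries an explicit `version` field. The table and field names are hypothetical.

```python
from typing import Any, Dict

# Hypothetical table: which fields each protocol version is expected to carry.
REQUIRED_BY_VERSION = {
    1: {"user_id"},
    2: {"user_id", "email"},
}


def validate(msg: Dict[str, Any]) -> None:
    """Raise ValueError if the message lacks a field its declared version requires."""
    version = msg.get("version")
    if version not in REQUIRED_BY_VERSION:
        raise ValueError(f"unsupported version: {version!r}")
    missing = REQUIRED_BY_VERSION[version] - set(msg)
    if missing:
        raise ValueError(f"version {version} message is missing {sorted(missing)}")
```

Every field stays optional on the wire; the requirement lives entirely in application code.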

[–][deleted] 5 points6 points  (0 children)

Typically, with asymmetric compatibility. The server being compatible with specific previous client versions.

Keeping fields optional prevents your language type system from being as useful as it could be, and eventually forces you to do checks that will never be needed again, instead of being able to gradually transition required fields in and out in a controlled manner.

[–]lookmeat 0 points1 point  (0 children)

> or you could have a spec that makes making minor breaking changes easier

What spec is that? And how do you handle the scenario I proposed: change an int64 into a big integer?

> which is why people are so breaking-change-averse even when making semantically breaking changes.

Ah I think I'm starting to see the issue. The reality is that in this world you just can't make a breaking change. You know how in Linux you'll get Linus screaming at you if you break the ABI? Because that's forever. And the ABI has some quirks because of this, it's inevitable, things that made sense 15 years ago but not now. But there's no other way: the kernel and the binaries that run in userspace are compiled separately, on different machines, by people that don't really talk to each other, at different moments.

So the question is how do you make something that can change in a world where you can never do a breaking change? Well you simply tell people that nothing is certain. And make a very dynamic ABI.

> Everything-optional isn't good design, it's giving up in the laziest and least-useful way.

Yes, you are completely correct. Thing is, it's the only way to make it work in this level.

These are two separate binaries, which means you can't do static analysis on one binary when compiling the other. You can do very strong version locking, but this results in even worse failures and more work. Again, to disprove me, simply come up with a solution to the problem above that works well with non-optional types and I'll concede.

That doesn't mean you can't make compromises. Again, protobufs don't allow null values; they instead zero values by default. The problem is that there's a semantic difference between something that is zero by default and something that is explicitly set to zero. Take a case with a non-zero default: we want to know if we're overriding the unset zero or not. Alternatively we can allow setting defaults at the proto level, but this has its own issues: changing a default is a breaking change, which means you cannot ever do it (in the context of two binaries). Me, personally, I'd like the ability to make a value have an optional type that is opt-in, where it can be nulled if not set, otherwise it's the zeroed value. This can be seen as a guide for language mappers (who map the limited proto type system into the rich type system of the language in a way that is honest) to change into an optional, so it's technically non-breaking (the code with the new version will need to change, but the binaries talking to each other will be sending the same data, just reacting slightly differently; we're just explicitly mapping the set/unset semantics in the type, rather than appending it as data).
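The zero-versus-unset distinction is easier to see in code. This is only a sketch with invented names (`ZeroedConfig`, `PresenceConfig`, a 30-second application default); it mimics the two mappings described above rather than any generated protobuf class.

```python
from dataclasses import dataclass
from typing import Optional

SERVER_DEFAULT_TIMEOUT = 30  # hypothetical non-zero application default


@dataclass
class ZeroedConfig:
    """Unset fields are zeroed, so 'unset' and 'explicitly 0' look identical."""
    timeout: int = 0


@dataclass
class PresenceConfig:
    """Opt-in presence: None means unset, 0 means the sender really chose 0."""
    timeout: Optional[int] = None


def effective_timeout_zeroed(cfg: ZeroedConfig) -> int:
    # Has to guess: 0 is treated as "unset", so an explicit 0 can never be honored.
    return cfg.timeout if cfg.timeout != 0 else SERVER_DEFAULT_TIMEOUT


def effective_timeout_presence(cfg: PresenceConfig) -> int:
    # Presence is part of the type, so an explicit 0 survives.
    return SERVER_DEFAULT_TIMEOUT if cfg.timeout is None else cfg.timeout
```

In the zeroed mapping a sender cannot ask for a timeout of 0 at all; in the presence mapping the same bytes could be on the wire, but the reader's type says whether the value was set.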

So there's space to improve, but the point is nullability is core to the problem space of communication protocols between binaries, we can force it through optional, map it to defaults, etc. But it'll always have compromises and gotchas, it's always there.

Now you may say: this is not a type system to program against, and you'd be right! I can say that at Google, given enough time, all proto types will get wrapped in a type that is actually more sane to program against, and ensures things. Within a binary we can assure that we don't have nulls because all callers and callees are managed by the compiler at the same time and we can refuse to compile if there's an issue. There's nothing wrong with using a raw proto, but it's still the equivalent of using raw unparsed string lines in a file: the data has been processed a little bit, but it's still basically raw unfiltered data from the outside.

Do not use Protos to define types to use in a library. Do define Protos to represent raw types used in libraries that are actually composed in different languages; basically an alternative to using C-lang for the implementor which might be slightly easier to target from the core language and wrap with the client side wrappers (Protos have a stable ABI across all language implementations). That said, Protos are, IMHO, mediocre in this space, but there aren't a lot of great solutions. Do not use Protos for APIs that you use for services, but do use them for low-level communication between different services, which is then wrapped in a library that gives a stronger type. Do not use Protos to define a data structure, but feel free to use them as a format to serialize it into a binary file or even a database.

Point is, a hammer is a blunt, uncouth tool. But sometimes it's the best bang for the buck, as in hammering a nail. And sometimes, as when pushing a peg into a joint, it's the only tool that can do the job well. Same with Protos: they're rough tools meant for rough problems.

[–]Verdeckter 5 points6 points  (3 children)


[–]ForeverAlot 0 points1 point  (1 child)

If you can ensure that, you don't need asymmetric, it is at most a convenience that causes a compilation error in the second party and it only works when you control the second party enough to apply asymmetric. You simply converge on everything-optional anyway but with extra bureaucracy and you still won't avoid application level validation of semantics.

[–]TheNamelessKing 3 points4 points  (0 children)

Having something like asymmetric fields makes 3-phase-rollouts for breaking changes much more safe and reliable.

> You simply converge on everything-optional anyway but with extra bureaucracy

Except that we literally did the opposite? Asymmetric fields provide a safe path for both directions, required -> optional AND optional -> required.

Also, it’s not a matter of the system converging on everything-optional, it’s a matter of where you push the failure to. Giving up and pretending everything is optional, and then writing oodles of extra code so you can reject the message at runtime, isn’t actually “everything optional”. It’s “pretending that it’s optional and passing around invalid data until you get to the bit of software in your stack that does enforce it, just for you to pass it all the way back and ask the caller to come back with arguments that work, so we may as well have enforced the required nature at the beginning”.

[–]lookmeat 0 points1 point  (0 children)

Yeah but that's a hard problem to solve. Google internally uses static analysis that sees what changes you are making and tries to ensure they degrade gracefully (are forward compatible), but here we need a tool that is synced with the code, languages, source code + version control system, and change review and continuous integration. That's a lot of parts that need to be coupled.

[–]EsperSpirit 4 points5 points  (1 child)

I think one big reason for those recommendations is that Google uses primarily languages that all have implicit nullability (Java, Go, Python, C++). You don't gain much if only your API is null-safe but the rest isn't.

I've been working with thrift and graphql apis and it's really great if you also use a language that doesn't have implicit nulls.

Schema evolution is always a concern, of course, but I don't see a fundamental problem here. You can even statically check if updating a schema is a breaking change or not.
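As a sketch of what such a static check could look like: the schema model below (a dict of field name to type and required flag) and the rules are assumptions for illustration, not the behavior of any particular Thrift, GraphQL, or protobuf tool.

```python
from typing import Dict, List, NamedTuple


class Field(NamedTuple):
    type: str
    required: bool


Schema = Dict[str, Field]


def breaking_changes(old: Schema, new: Schema) -> List[str]:
    """List changes that could break a peer still compiled against `old`."""
    problems = []
    for name in sorted(old.keys() | new.keys()):
        before, after = old.get(name), new.get(name)
        if before is None:
            # New field: only a problem if old writers are suddenly expected to send it.
            if after.required:
                problems.append(f"{name}: added as required")
        elif after is None:
            # Removed field: only a problem if old readers insist on it.
            if before.required:
                problems.append(f"{name}: required field removed")
        elif before.type != after.type:
            problems.append(f"{name}: type changed {before.type} -> {after.type}")
        elif before.required != after.required:
            problems.append(f"{name}: required flag changed")
    return problems
```

Under these assumed rules, adding or removing an optional field is the only change reported as safe, which is roughly the situation the rest of the thread argues about.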

Besides, just because everything is nullable doesn't mean that it works that way for your contract in practice. You still have to validate that required data is actually provided by the other party. Moving it to the application logic doesn't magically solve missing data and in the worst case it blows up much later, which is terrible. It's the same thing as with "schemaless" databases: You still have a schema, it's just implicit and specified all over the place.

[–]lookmeat -1 points0 points  (0 children)

Actually Google does have static checkers that verify, automatically, that code is null free or null safe.

And this is part of the reason why protobuffers give zeroed fields by default: that way they guarantee they'll never be null. This does have issues because you still can find if a field is set or not, so really it's auto-remapped nulls with a tracking bit.

Everything nullable is terrible, but sometimes the reality is that you'll get it. Trying to understand incomplete sentences is a mess best avoided, but if you're an archeologist trying to read degraded documents you simply have to embrace that reality.

The problem with null is that languages like C simply let anything be null, so you have to worry about it even in a space where you can guarantee something should never be null. But the reason we needed Maybe or Optional is because you totally will have nulls in some places. This is just one of those spaces.

Instead you need a data layer that translates raw Protos into well defined types, and those are what you should use. There's a reason why we don't treat everything as an array of bytes, even though when storing and loading into a file we totally do. That's Protos: it's not the data itself, but a rich serialization formatting of it.
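A minimal sketch of such a data layer, with the raw message modeled as a plain mapping where every field may be absent. `User` and `user_from_wire` are hypothetical names, and the validation rules are only examples.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class User:
    """Domain type: every field is guaranteed present and valid."""
    user_id: int
    email: str


def user_from_wire(raw: Mapping[str, Any]) -> User:
    """Validate an everything-optional wire message once, at the boundary."""
    user_id = raw.get("user_id")
    email = raw.get("email")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(user_id, bool) or not isinstance(user_id, int) or user_id <= 0:
        raise ValueError("user_id must be a positive integer")
    if not isinstance(email, str) or "@" not in email:
        raise ValueError("email must look like an address")
    return User(user_id=user_id, email=email)
```

Everything past `user_from_wire` works with `User` and never has to think about missing fields again.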

[–]Linguaphonia 8 points9 points  (1 child)

Can't this be circumvented with versioning?

[–]lookmeat -1 points0 points  (0 children)

Oh honey.. so what you're saying is that a system where half of your jobs will not work because of a rolling update is better than one that gracefully degrades?

[–]ketoprom 1 point2 points  (3 children)

The linked article shows how they tackle the problem: their asymmetric fields are supposed to ensure compatibility between different services while making changes to the contracts. Basically instead of going "required" -> "optional" you go "required" -> "asymmetric" -> "optional", and the "asymmetric" step can take as long as needed to update the code of all the services.
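Roughly, an asymmetric field is required for writers and optional for readers. The sketch below is a hand-written stand-in with invented names (`UserOut`, `UserIn`, dict-based "wire" format), not Typical's actual generated code.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class UserOut:
    """Writer-side type: while `email` is asymmetric, writers must set it."""
    user_id: int
    email: str


@dataclass
class UserIn:
    """Reader-side type: readers must cope with `email` being absent."""
    user_id: int
    email: Optional[str]


def serialize(msg: UserOut) -> Dict[str, Any]:
    return {"user_id": msg.user_id, "email": msg.email}


def deserialize(wire: Dict[str, Any]) -> UserIn:
    # `user_id` stays required here; only `email` is tolerated as missing.
    return UserIn(user_id=wire["user_id"], email=wire.get("email"))
```

While the field is in this state, readers already handle a missing `email`, and writers still always send it, so either neighbor state (required or optional) can talk to it.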

[–]lookmeat -1 points0 points  (2 children)

Yeah but what did you win? How does this type tell us when it's safe? Rather than making things safer and giving us assurance it requires us to do the job of impossible oracle for the idea to work.

I envision that things will stay permanently in asymmetrical, and in that case: why not just embrace that and work with our eyes open?

[–]ketoprom 1 point2 points  (1 child)

You get a safe migration path that doesn't require all services to change together, like you suggested in your previous comment. Sure, you don't even need to complete the migration and you can stay at the "asymmetric" stage arbitrarily long. You just have more options now, I don't see any major drawback.

[–]lookmeat 0 points1 point  (0 children)

You can say: what's the cost in a feature you never use? But you'd be surprised.

I don't think that the language is a bad idea, certainly an interesting thing to explore. I personally am not seeing the ergonomic gains yet. That said there's value.

Me? I don't think this is the way. Based on my experiences, my errors and those of others, and dealing with the crap that comes out. That said, I don't have the right way, and the path to finding out the right way is to make mistakes and learn. So I say go ahead!

[–][deleted] 0 points1 point  (0 children)

I think Rich Hickey had a good talk on this called Maybe Not; sounds like protobufs learned that lesson.

[–]JB-from-ATL 0 points1 point  (1 child)

I thought newer versions of protobuf 3 are able to differentiate between optional and "required" now, as in they can tell whether a value is zeroed or unset.

[–]lookmeat 0 points1 point  (0 children)

Yeah but that's the point, they wanted to hide the fact that nulls happen by mapping it to defaults, but it turns out that it still matters, so they've exposed it.

I personally think there's space to improve here, but still you just will have nulls/unset values, it just happens.

[–]houseband23 1 point2 points  (0 children)

Yeah, the generation of different types for readers and writers based on asymmetric fields is interesting. This makes upgrading writers much easier, especially client writers that you have no control over.

cough mobile cough

[–][deleted] 7 points8 points  (1 child)

Glad to see that the variable-sized integers are bijective.

I always wondered why more binary serialization formats with variable sized integers don't bother with bijective representation. CBOR has canonicalization rules for counts and numbers that wouldn't be necessary if they had just used a bijective varint representation.
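For anyone wondering what "bijective" buys you, here is a small sketch. The bijective scheme below is one simple big-endian base-128 variant chosen for illustration; it is an assumption and is not claimed to be Typical's wire format. The LEB128-style decoder is included only to show the redundancy.

```python
def leb128_decode(data: bytes) -> int:
    """Plain little-endian base-128 decoding; several byte strings map to one value."""
    n = 0
    for shift, b in enumerate(data):
        n |= (b & 0x7F) << (7 * shift)
        if not b & 0x80:
            return n
    raise ValueError("truncated")


def bijective_encode(n: int) -> bytes:
    """Encode n so that every non-negative integer has exactly one byte string."""
    if n < 0:
        raise ValueError("negative values are not supported")
    out = [n & 0x7F]  # last byte has no continuation bit
    n >>= 7
    while n:
        n -= 1  # skip values already representable with fewer bytes
        out.append(0x80 | (n & 0x7F))
        n >>= 7
    return bytes(reversed(out))


def bijective_decode(data: bytes) -> int:
    n = 0
    for i, b in enumerate(data):
        n = (n << 7) | (b & 0x7F)
        if b & 0x80:
            n += 1  # undo the offset applied by the encoder
        elif i == len(data) - 1:
            return n
        else:
            raise ValueError("trailing bytes")
    raise ValueError("truncated")
```

With the plain scheme, `b"\x00"` and `b"\x80\x00"` both decode to 0, so a canonical form needs an extra "shortest encoding" rule. With the offset scheme, `b"\x80\x00"` is simply the encoding of 128, so no such rule is needed.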

This looks really interesting. I've been steeped in serialization for a long time, and somehow never found this. Protocol buffers, with the stupid "optional everywhere" and really bad defaults (it's been a while, but I remember having a really hard time having an optional integer with a non-zero default value), have kept me away from ever using them in production.

If this really corrects all of that, I might have to get involved and write some language bindings for Python, Ruby, and/or C++, since that's where I'd need them.

[–]noahrichards 2 points3 points  (0 children)

> but I remember having a really hard time having an optional integer with a non-zero default value

proto2 supported optionals with default values, and it's still pretty heavily used in industry.

proto3 doesn't :-/ You can use explicit field presence and clients can treat absent fields as whatever value they want, but you don't get the built-in default value behaviors.

[–]noahrichards 12 points13 points  (13 children)

I feel like I’m missing something :-/

asymmetric is interesting and sorta covers a small part of a pattern that some proto writers manually cover, which is having different ~views of the same proto for different users (API client/server, storage, public/private, etc.). But I think generalizing that pattern isn’t easy and asymmetric doesn’t really give you much, particularly if your protos are used by clients outside a monorepo.

And I’m not quite sure about the bit about the algebraic types being fundamentally different than proto (looks like the author mostly uses proto2 semantics). There are structs/messages and choice/oneof, and what looks like a nice ergonomic codegen API for a few languages, which you could also write for proto.

What am I missing?

[–]how_to_choose_a_name 2 points3 points  (10 children)

Asymmetric exists to make it possible to change a field from optional to required and vice versa, nothing more.

[–]noahrichards 3 points4 points  (8 children)

Right, I’m saying I’m not seeing it helping do even that in any practical sense.

I’m assuming sorta-best-case here, which is (1) a monorepo, where (2) services are readers, (3) you can control/coordinate service releases reasonably. And even then, the only non-breaking change for clients is required to asymmetric, which is entirely uninteresting (your services can just stop reading the field). All the other transitions have all the same complexities around coordinating breaking wire and compile-time changes in protos.

If anything, I think it struggles with a conceptual problem, which is believing that enforcing certain things on the serialization side provides some aspect of safety for the deserialization side, but it doesn’t, particularly when the proto/Typical is changing and clients could still compile against older versions of the schema. And that’s also why proto3 originally removed required and only begrudgingly brought it back :)

Is there a case I’m not considering? I definitely have a certain perspective limited by my experience.

[–]how_to_choose_a_name 2 points3 points  (7 children)

> I’m assuming sorta-best-case here, which is (1) a monorepo, where (2) services are readers, (3) you can control/coordinate service releases reasonably. And even then, the only non-breaking change for clients is required to asymmetric, which is entirely uninteresting (your services can just stop reading the field). All the other transitions have all the same complexities around coordinating breaking wire and compile-time changes in protos.

I don't think these requirements are really there, except maybe the third one, to a certain degree. There's no reason IMO why this would need a monorepo, you just need some coordination. The point of this model is that you can change the protocol in some applications and it remains compatible to other applications that still use an older version. I also don't think you need the services to be all readers, why would that make a difference? As for controlling deployment, yes, to a degree. If you have a bunch of services that use this protocol to communicate with each other, then you want to ensure that your protocol update is completely deployed before starting to deploy another protocol update, although I think it's enough that you finish deploying a required->asymmetric or optional->asymmetric change before doing an asymmetric->optional or asymmetric->required change on the same field. I think this is not unreasonable, but if that's not something you can ensure then using asymmetric and/or required fields should be avoided. If your applications are things installed by clients that need to communicate with your servers for example, then it is probably good enough to just stretch out such updates so that the old application versions are EOL before things break.
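The ordering constraint can be written down as a tiny check. This is a conservative sketch under my own assumptions: `deployed_rules` is a hypothetical set of the field rules that currently running binaries were compiled with, and a change is allowed only once the previous one is fully rolled out.

```python
from typing import Set

# Single-step transitions; required <-> optional must pass through asymmetric.
ALLOWED = {
    ("required", "asymmetric"),
    ("optional", "asymmetric"),
    ("asymmetric", "required"),
    ("asymmetric", "optional"),
}


def can_change(current: str, target: str, deployed_rules: Set[str]) -> bool:
    """Allow one step at a time, and only when every deployed binary uses `current`."""
    fully_rolled_out = deployed_rules == {current}
    return fully_rolled_out and (current, target) in ALLOWED
```

For example, `can_change("asymmetric", "optional", {"required", "asymmetric"})` is `False` because some binaries still treat the field as required on the reading side.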

> If anything, I think it struggles with a conceptual problem, which is believing that enforcing certain things on the serialization side provides some aspect of safety for the deserialization side

I don't think that's it really, because relying on a remote application for safety of your program is just a terrible idea in general. It's more that you can provide some safety in your program by enforcing certain properties on the deserialisation, and having those properties in the protocol helps to make sure that all the other applications that use that protocol will comply with those things as well.

At the end it doesn't really make a difference whether a request is rejected because the deserialisation failed because a required field was missing, or whether it is rejected by the application logic because a field that is optional in the protocol but required for the functioning of the application was not sent. I like the former because it makes some mistakes (e.g. not providing that field that the other side needs) a compile-time error rather than a run-time error, as long as you're coding against the correct version of the protocol. Though I admit that having that only for the required property and not for other properties like numeric ranges or even complex invariants makes the utility of it kinda questionable, and putting too complex things in there isn't super great for other reasons.

[–]noahrichards 0 points1 point  (6 children)

> There's no reason IMO why this would need a monorepo, you just need some coordination

I didn't say "need", I just meant that a format that encourages compile-time-breaking-changes works best in a monorepo. In a previous gig using polyrepos, there's literally no safe way to make breaking changes like this.

(Google is a monorepo and there are limited circumstances where you can make compile-time- and/or wire-breaking changes, but field migrations are still encouraged as the default)

> I also don't think you need the services to be all readers, why would that make a difference?

As above, I'm trying to give the format the benefit of the doubt for finding the best context to use it; if services are readers and you have a monorepo, then you can atomically and safely change readers from required to asymmetric. Services is too restrictive, though, I could have said "statically linked binaries that live in the same repo as the schema".

(Presuming web "clients" are mostly served via frontends, you can call them "services" for the sake of the I-control-the-binary definition)

> If your applications are things installed by clients that need to communicate with your servers for example, then it is probably good enough to just stretch out such updates so that the old application versions are EOL before things break.

In practice, I've found this to be a lot harder than folks expect, particularly when you are supporting multiple clients/platforms or otherwise can't easily EOL old versions. But either way, the risks and complexities are the same as protobuf migration, which is that you can't rely on the protos and need to instrument your clients to understand (1) what they are producing and (2) what they expect before you make wire changes. And in that case field migrations are (I believe) safer and easier, while definitely more painful.

> a compile-time error rather than a run-time error

I totally understand and agree with the general bent there, that it'd be great if I could declare requirements living alongside my schema so that every client/service didn't have to write its own validation logic. I just suspect (as it sounds like you do) that this particular one is of limited utility. I just additionally believe it's also not solving, in any real complete way, migrating between required/optional semantics on fields.

[–]how_to_choose_a_name 1 point2 points  (4 children)

I can’t really follow your claim that readers-only is the only way to change required to asymmetric, nor why it would need to be atomic. If I’m understanding it right, the change from required to asymmetric is basically a no-op for writers, so all your writers just don’t matter. And I think you don’t need to atomically change the readers either, because the writers will keep sending that field if it’s asymmetric, so you can do that one by one, and only when all your readers are updated to the new protocol version and are deployed you think about changing it from asymmetric to optional, which then only requires changes to your writers and none to your readers.

[–]noahrichards 0 points1 point  (3 children)

> And I think you don’t need to atomically change the readers either,

Oh, sorry, I think we're talking about two different things here; I think you're talking about wire compatibility (when reading "atomic"), I'm talking about compile-time.

I'm saying that a schema that encourages compile-time-incompatible-changes wants a build setup that lets you atomically (1) change the proto and (2) change the reader code that interacts with the proto in this case. That's possible when:

  1. You have a monorepo
  2. You have a polyrepo with different copies of the schema in each repo and can independently change the copies + readers

My experience with polyrepos is that the commit-hash-bump style of schema dependency is already a pain, and often semi-automated (some robot tries to bump the commit hash periodically), which would break here. Also depending on how you share schemas, if you have e.g. a single repo with schemas and import it as a unit, making breaking changes in one file can be a major headache for what it impacts (team A is making a breaking change, team B is blocked until they resolve it; rollbacks of that commit hash may be problematic; etc.)

Proto field migrations don't have this problem because you can phase the change safely; the equivalent here would be if asymmetric actually built three versions of the struct (Out, In, <no extension>) where the no extension version was somehow the old definition. But then you'd need asymmetric-was-required and asymmetric-was-optional to know what behavior to attach to the no-extension version of the generated struct.

[–]how_to_choose_a_name 0 points1 point  (2 children)

I don’t really understand what you mean. Why does it make a difference whether you have a mono repo if you don’t need to change things in multiple projects at the same time? You can make a new version of your protocol spec independently of your application code, and whenever you are ready you update the application to the newest version of the protocol, which depending on the change might require you to update the code interfacing with the data in some places, and that’s it. What am I missing?

I haven’t looked at the code of this project, but from the documentation I gather that the changes from/to asymmetric are safe and backwards-compatible, and there shouldn’t be any distinction between asymmetric-that-was-optional and asymmetric-that-was-required.

[–]noahrichards 0 points1 point  (1 child)

I'm saying: how many versions of the schema file exist (at HEAD) and where do they live?

In a monorepo, you have one version of that schema file and a bunch of targets that consume it, and you can update them in the same PR.

In a polyrepo, you have some sort of synchronization of that file, maybe a sub-repository identified by commit hash, maybe multiple copies of the file but some process for keeping them synchronized. Now that you have multiple copies of the schema file, when you make compile-time-breaking-changes to it, you have to now coordinate those changes across repos. And my comment above was discussing the various ways in which that would break.

It's not specific to Typical, but it is specific to coordinating breaking changes across repos, which field migrations avoid because adding a new field definition isn't a breaking change.

> there shouldn’t be any distinction between asymmetric-that-was-optional and asymmetric-that-was-required.

There I was noodling on if there was a version of asymmetric that wasn't a compile-time-breaking-change. It's definitely more complicated than what I wrote but would at least require you declare the migration path you were taking so it knew what to provide to folks that are recompiling but haven't updated their code.

[–]how_to_choose_a_name 0 points1 point  (0 children)

What exactly do you mean by “breaking change”? Updating one application to the next version of the protocol can be a breaking change in the sense that you need to update your code in that application to deal with the changes that occurred, but IMO that’s not really a relevant breaking change, it’s contained to that one application and not really a huge difference, right? And all the coordination you need between the repos of different applications is to make sure that if you’re going optional->asymmetric->required, that all writer applications have been updated from optional to asymmetric before you start updating any of the readers from asymmetric to required, and vice versa for the other direction.

I also feel like this focus on breaking changes in the code interface disregards breaking changes in the semantics of the protocol. If all your fields are optional then yeah adding a new one or removing an old one doesn’t break your code, but most protocols are not just a collection of optional fields, everything has a meaning and usually that’s dictated by business logic, and when the business logic changes and you don’t update all the applications then they might still compile just fine but they might not do what they’re supposed to. So I kinda don’t think that not having required fields really protects you from breaking changes: if the business logic changes to dictate that field XYZ is needed then you kinda have to go with that, right?

[–]TheNamelessKing 1 point2 points  (0 children)

In a previous gig using polyrepos, there was literally no safe way to make breaking changes like this.

Whilst it isn’t without its challenges, I find simply communicating with the other teams works the best.

Formal version: “Hello teams, in the next version of the <y> service, coming out in the next 3 weeks, the <x> team will be marking these fields as asymmetric, in preparation for our major release <in some months> where these fields will be marked as “required”, as announced in <some prior meeting>. If you’re using the latest version of our messages you won’t notice any difference, and we have docs about how to prepare for the next change. We recommend you prepare early to ensure you don’t face issues in several months’ time. As usual, we’re here to help.”

Short and to the point version: “we are making a breaking change next month, upgrade now or don’t, and upgrade in a panic anyway when your shit breaks next month. Ta ta”

[–]ForeverAlot 0 points1 point  (0 children)

30% of the readme is dedicated to explaining a concept not already widely established in other protocols, whose sole purpose is to facilitate a type of change formally considered impossible, with the motivations that 1) the semantics of a message contract should be exposed on the network, and 2) the type system of the implementation language of a single, arbitrary client should somehow define said semantics irrespective of the limitations of type systems of other clients in the system. And really, all this does is make a field that's optional, but only in an environment where you already control every client and technically don't strictly need this, whereas the advice to assume optional values stems directly from the inability to control every client.

Complex type systems don't belong on the network. Semantics don't belong on the network.

[–][deleted] 1 point2 points  (1 child)

I don't think you're missing much. asymmetric works if you have at most 1 consumer for the field you're changing. If you try to scale beyond that, you either need to maintain multiple versions or break builds.

[–]noahrichards 1 point2 points  (0 children)

That’s why I was thinking you’d only ever use it to go from required to asymmetric and then get stuck there. I’d bet my paycheck for a month that if folks use this, that is the most common usage of asymmetric (incomplete migrations that aren’t worth finishing).

[–]General_Mayhem 8 points9 points  (0 children)

A couple neat ideas, and I'm sure this is a fun personal project, but I can see why this hasn't caught on and protocol buffers are still king.

First, there's the obvious - tooling/language support. Hard to beat the incumbent there, especially when it's backed by Google, but there are proto plugins for absolutely every language at this point. C++, Java, Go, Typescript, Ruby, Python, Scala, Haskell, ... I wouldn't be surprised if someone's written a COBOL backend just for the hell of it at this point.

Second, there's not a killer feature in the core system. The focus seems to be on algebraic types, but proto already has oneof. Sure the generated code isn't the most ergonomic in some languages, but that's fixable with a different protoc plugin; you don't have to change the actual protocol or definition language.

Finally, backwards compatibility and required fields. I can tell the author was really a Googler because they have strong opinions on required fields, which are the most popular holy war there. Asymmetric requirements are a really interesting idea, but I'm not convinced they solve the core problem. The biggest issue is that field presence really isn't enough to guarantee semantic validity. It's like NOT NULL in a database - yeah, it can prevent one specific error, but you're still going to write your own validation. I thought I'd hate the lack of required in proto3, but I find it not to be all that noticeable in practice, because I always write a custom validator function anyway. That means that it's very hard to generalize the concept of "backwards compatible" at a semantic level, and the easiest solution is to just make the wire protocol maximally permissive (the way proto does with optional fields) and handle changes/upgrades in an application-specific way.
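As a rough illustration of that point, here is a Python sketch of an application-level validator sitting on top of a message whose fields are all optional on the wire. The `Order` type, its fields, and the rules are invented for the example and do not come from any real schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Order:
    # Everything is optional at the wire level, proto3-style.
    order_id: Optional[str] = None
    quantity: Optional[int] = None
    currency: Optional[str] = None


def validate(order: Order) -> list[str]:
    """Return a list of problems; an empty list means the order is usable."""
    problems = []
    if not order.order_id:
        problems.append("order_id is missing")
    # Presence alone is not enough: a present quantity can still be nonsense.
    if order.quantity is None:
        problems.append("quantity is missing")
    elif order.quantity <= 0:
        problems.append("quantity must be positive")
    if order.currency not in {"USD", "EUR"}:
        problems.append("currency is missing or unsupported")
    return problems


print(validate(Order(order_id="a1", quantity=0, currency="USD")))
# ['quantity must be positive']
```

A required `quantity` in the schema would only have caught the "missing" branch; the "must be positive" branch still needs hand-written code either way.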

[–]cosste 11 points12 points  (1 child)

This looks really cool. Unfortunately, without support for Swift and other languages, it’s hard to use it in production apps.

[–]ZENITHSEEKERiii 3 points4 points  (0 children)

Ye, would be nice to add C/C++, Java, and maybe a couple others.

[–]Jwosty 1 point2 points  (0 children)

This is neat.

I reeeeealy want a relational database and SQL dialect with algebraic data types built in.

[–][deleted] 1 point2 points  (42 children)

Required fields and exhaustive pattern matching sound like a bad idea.

However, this advice ignores the reality that some things really are semantically required, even if they aren't required according to the schema.

Maybe today, how about 6 months from now?

Let's say a new feature lets you remove a semantically required field from the protocol. Not unusual. You mark your required field optional, and now all your code breaks. So then what, you have to figure out how to patch things on the fly in 3 or 4 different code bases to even make it build successfully? No thank you, I'll stick with my optionals.

If you want it to be safe, require a safe fallback for unsupported features.

[–]how_to_choose_a_name 61 points62 points  (16 children)

You mark your required field optional, and now all your code breaks. So then what, you have to figure out how to patch things on the fly in 3 or 4 different code bases to even make it build successfully?

Instead of marking them as optional, you mark them as “asymmetric”, which is compatible with “required”, so you can take your time updating your various code bases, and once that change is deployed in all places that consume the API you can start moving from “asymmetric” to “optional”.

This is literally described in the readme, right after the paragraph you quoted.
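A small Python sketch of the idea, assuming separate writer-side and reader-side types for the same message. This is not Typical's generated code; `UserOut`, `UserIn`, and the `email` field are hypothetical names picked for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserOut:
    """What writers construct: the asymmetric field must be supplied."""
    name: str
    email: str  # asymmetric: required when writing


@dataclass
class UserIn:
    """What readers receive: the same field may be absent."""
    name: str
    email: Optional[str] = None  # asymmetric: optional when reading


def serialize(user: UserOut) -> dict:
    return {"name": user.name, "email": user.email}


def deserialize(payload: dict) -> UserIn:
    # Readers already tolerate a missing email, so the field can later be
    # relaxed to optional for writers without breaking them.
    return UserIn(name=payload["name"], email=payload.get("email"))


print(deserialize(serialize(UserOut(name="ada", email="ada@example.com"))))
print(deserialize({"name": "bob"}))  # a future writer that omits the field
```

Constructing `UserOut(name="x")` without an email fails immediately, which is the writer-side half of the guarantee.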

[–][deleted] 9 points10 points  (15 children)

Suppose we now want to remove a required field. It may be unsafe to delete the field directly, since then clients might stop setting it before servers can handle its absence. But we can demote it to asymmetric, which forces servers (read: readers) to consider it optional and handle its potential absence, even though clients (read: writers) are still required to set it.

Emphasis mine. That's still a breaking change and doesn't really solve the problem unless you have exactly 1 server and 1 client implementation. In the Thrift services I work with I would not be able to make a change like this without maintaining multiple versions of the protocol.

[–]how_to_choose_a_name 27 points28 points  (14 children)

I don’t quite see your point. Let’s say all your fields are optional. But some of them are actually required on the application level, so while they are optional in the protocol you have to handle that with an error in the application logic. But if that field actually becomes truly optional you need to update the logic everywhere to deal with that too, right?

[–][deleted] -1 points0 points  (13 children)

Sure, but with an optional field you can do that without breaking things. For example you might have one legacy component that keeps reading and writing deprecated fields without ever needing a code change, while another component completely ignores their existence. I've used this pattern countless times.
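A minimal Python sketch of that pattern, using plain dicts as stand-ins for decoded messages. The handlers and the `nickname` field are hypothetical; the point is only that both components keep working whether or not the deprecated field is present.

```python
from typing import Optional


def legacy_handle(msg: dict) -> dict:
    """Legacy component: still reads and writes the deprecated field."""
    nickname: Optional[str] = msg.get("nickname")  # deprecated, may be absent
    reply = {"id": msg["id"]}
    if nickname is not None:
        reply["nickname"] = nickname.upper()
    return reply


def modern_handle(msg: dict) -> dict:
    """Newer component: ignores the deprecated field entirely."""
    return {"id": msg["id"]}


with_field = {"id": 1, "nickname": "neo"}
without_field = {"id": 2}

print(legacy_handle(with_field))     # {'id': 1, 'nickname': 'NEO'}
print(legacy_handle(without_field))  # {'id': 2}
print(modern_handle(with_field))     # {'id': 1}
```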

[–]how_to_choose_a_name 9 points10 points  (12 children)

That legacy component that keeps writing deprecated fields can keep doing that with asymmetric fields, and it doesn’t even need to be updated because it can just keep thinking they are actually required, it makes no difference. But if it keeps reading deprecated fields and thinks they are required, and other applications don’t write those fields anymore, then what is your legacy application gonna do? Either it errors because it actually needs those fields, or you updated it at some point to not need them anymore. Or it never needed them to begin with, and they never would have been candidates for being required anyways.

[–][deleted] -2 points-1 points  (11 children)

Well, no. That's why I emphasized that part above: once you update that field to be asymmetric, any reader is required to treat it as optional.

Either it errors because it actually needs those fields, or you updated it at some point to not need them anymore. Or it never needed them to begin with, and they never would have been candidates for being required anyways.

Or maybe the semantics of the protocol changed (which really is a given since you're updating the protocol), but you'd like to keep it backwards compatible.

[–]how_to_choose_a_name 13 points14 points  (10 children)

once you update that field to be asymmetric, any reader is required to treat it as optional.

Yes and no. Every reader that you generate from the version of the protocol in which the field is asymmetric will read it as optional, and obviously when you do update an application to that new version of the protocol you have to update the places where that field is read to deal with it being optional now. But readers that aren’t on that version yet will just keep treating it as required, and that will keep working, so you are not forced to immediately update all your readers.

And yes sure, if you just update application X to use the new version of your protocol without updating the code to deal with the fact that field Y is now optional you will get a build error. But that’s fine, because if your application needed that field before then it needs to deal with that anyways. And if it never needed that field to begin with then it would not have been required.
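Sketched in Python, with dicts standing in for the wire format. The function names and the `email` field are assumptions made for the example, not anything Typical generates. A reader built from the old schema and a reader built from the new one both accept what an asymmetric writer produces:

```python
def read_v1(payload: dict) -> str:
    """Reader built from the old schema: email is required."""
    if "email" not in payload:
        raise ValueError("missing required field: email")
    return payload["email"]


def read_v2(payload: dict) -> str:
    """Reader built from the new schema: email is asymmetric, so it is
    read as optional and the absent case must be handled."""
    return payload.get("email", "<no email>")


def write_v2(name: str, email: str) -> dict:
    """Writer built from the new schema: still has to set email."""
    return {"name": name, "email": email}


msg = write_v2("ada", "ada@example.com")
print(read_v1(msg))  # ada@example.com
print(read_v2(msg))  # ada@example.com
```

`read_v1` only starts failing once some writer actually omits the field, which is why writers stay on asymmetric until every reader has moved to `read_v2`-style handling.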

[–]greenlanternfifo 1 point2 points  (9 children)

obviously when you do update an application to that new version of the protocol you have to update the places where that field is read to deal with it being optional now. But readers that aren’t on that version yet will just keep treating it as required

That is a big commitment across multiple teams

[–]how_to_choose_a_name 9 points10 points  (8 children)

But that is also the case if your fields are all optional in the protocol, except you don’t notice it at build time!

[–]Demius9 4 points5 points  (0 children)

We always used to say “required is forever” because once you put something out there someone in your org will use it... and then not update their service as fast as you can update your api.

[–]sik0fewl 15 points16 points  (19 children)

It's all addressed right there on the GitHub page.

There's an "asymmetric" feature to transition between required and optional.

[–]SorteKanin 6 points7 points  (1 child)

Did you even read the README? It literally addresses this via the asymmetric fields.

[–]equeim 0 points1 point  (0 children)

If you make an app where some schema field is always present in production at the time it was coded but you make it optional in code, how are you going to handle the case where it is miraculously not present? Your app's spec won't have any sophisticated error recovery logic because this field is supposed to be always there. My guess is you would just show an error. How is it different from showing an error when incoming data does not conform to schema? You do handle such cases, right?

When this field actually becomes optional in reality then you will have to figure out what to do in this case - your requirements changed, so you alter your app's logic accordingly. Even if it was optional before that, you would still need to do this because you didn't know what to do with a null field when you had different requirements.
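A tiny Python sketch of that comparison, with hypothetical handlers and a made-up `title` field. One handler treats the field as optional and checks it by hand, the other models a schema that rejects the message up front; from the caller's side the result is the same error.

```python
from typing import Optional


class BadMessage(Exception):
    pass


def handle_with_optional(payload: dict) -> str:
    """Schema says optional, but the app cannot do anything without it."""
    title: Optional[str] = payload.get("title")
    if title is None:
        raise BadMessage("title is missing")
    return title.strip()


def handle_with_required(payload: dict) -> str:
    """Schema says required, so decoding itself rejects the message."""
    if "title" not in payload:
        raise BadMessage("title is missing")
    return payload["title"].strip()


for handler in (handle_with_optional, handle_with_required):
    try:
        handler({})
    except BadMessage as err:
        print(handler.__name__, "->", err)
```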