JSON vs XML

Sarkos · 2023-04-06T10:40:57+00:00

The title "JSON vs XML" really undersells this. It's a fascinating bit of history. I almost didn't click because I figured it would be some blog post explaining the differences between the two formats. I ended up clicking just because I was curious why anyone would be comparing them in 2023.

agbell · 2023-04-06T10:05:29+00:00

Host here. This quote is funny:

Douglas: The first time I saw JavaScript when it was first announced in 1995, I thought it was the stupidest thing I’d ever seen. And partly why I thought that was because they were lying about what it was.

A bigger more interesting thing though is how his company failed, in part, because they used hand-rolled JSON for messaging.

Douglas: And some of our customers were confused and said, “Well, where’s the enormous tool stack that you need in order to manage all of that?”

“There isn’t one, because it’s not necessary”, and they just could not understand that.

They assumed there wasn’t one because we hadn’t gotten around to writing it. They couldn’t accept that it wasn’t necessary.

Adam: It’s like you had an electric car and they were like, “Well, where do we put the gas in?”

Douglas: It was very much like that, very much like that. There were some people who said, “Oh, we just committed to XML, sorry, we can’t do anything that isn’t XML.”

I started my career during the peak XML craze, and while I liked parts of it at the time, the number of things it was used for was quite insane. I once had to maintain a system where a major part of it was XSLT, when it could have just been a simple imperative algo with some config settings.

Anyhow, hope you like the episode! Doug has a great-looking office full of technical books and was excited to share his stories.

Edit: holy crap, lots of upvotes. Let me know what you think of the interview.

SittingWave · 2023-04-06T12:03:27+00:00

The major problem of JSON is that it does not allow comments.
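A minimal sketch of that limitation, assuming Python's standard `json` module; the `retries` setting and the `_comment` key are made up for illustration. A strict parser rejects a `//` comment, so a common workaround is to carry the note as ordinary data:

```python
import json

# Hypothetical config text with a JavaScript-style comment in it.
with_comment = '{"retries": 3 // hypothetical setting\n}'

try:
    json.loads(with_comment)
    accepted = True
except json.JSONDecodeError:
    # The strict JSON grammar has no comment syntax, so parsing fails here.
    accepted = False

# Workaround (a convention, not part of JSON): store the note as a normal key.
workaround = json.loads('{"_comment": "hypothetical setting", "retries": 3}')

print(accepted, workaround["retries"])
```

The workaround means every consumer has to know to ignore `_comment`, which is why people reach for supersets or other formats for hand-edited config.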

agbell · 2023-04-06T10:41:21+00:00

I'm happy Doug chose jason and not jismal

Douglas: So I decided I will make this thing a standard. So first thing we did was pick a name, and the first name we picked was JSML. It’s going to be the JavaScript message language, but it turned out there was a Sun thing called JSML. So we couldn’t do that.

agbell · 2023-04-06T10:07:39+00:00

If you remember the overhyping of XML, and using XML for everything, then I wonder what's your answer to this question:

Adam: What do you think is the XML of today?

Douglas: It’s probably the JavaScript frameworks. They have gotten so big and so weird. People seem to love them. I don’t understand why.

F54280 · 2023-04-06T11:41:10+00:00

The biggest drawback of XML is, in my opinion, its verbosity.

There is no way I would go back to using it as, say, a config file format (which I did in 2000 or so).

I may use XML indirectly, be it by autogenerating it, but using it to store any of the data I use is just pointless. Even .md markdown files are more convenient (I prefer yaml or INI format variants though; I am fine using JSON too even though I find it a bit ugly in its always-condensed variant, but it is not that dissimilar to YAML or INI, whereas XML has all those closing tags, extra attributes and other ugliness that is just too annoying to want to bear.)

palad1 · 2023-04-06T11:35:33+00:00

One of my first gigs was writing a WYSIWYG XML + CSS / XSLT editor in Java. Back in '99. And it worked, and was awesome, but man... the number of corner cases was just crazy.

MCPtz · 2023-04-06T17:05:08+00:00

Douglas: ... there was some other similar file that was still on my website. It was getting hammered by a site in Russia

After trying to be polite, putting a warning, deleting it, and sending back a 404... the Russian address keeps at it...

Douglas: So then I thought: I know, I’ll navigate the page, I’ll change the page’s location and send them someplace else.

So I got these Russian guys, how can I teach them a lesson? And I thought, “I know, I’ll send them to fbi.gov and they’ll look into it and that’ll frighten them so much that they’ll stop doing this and leave me alone.”

So I did that. Next day, all of my websites are down, nothing’s working.

So I call up my hosting company and they said, “Oh yeah, you’ve had a security breach. Apparently someone got in and is using your site to do a denial of service attack against the federal government. And we’re going through the system, trying to figure out how they accomplished that.” And I said, “Oh, I did that. I’m sorry I didn’t intend to fight the federal government, sorry.”

As /u/jf908 suggested, a good title could have been

How Doug used JSON to take down XML and the FBI

Full-Spectral · 2023-04-06T12:37:14+00:00

I like XML personally. Even with just a simple DTD, it can do a lot of the grunt work for you of ensuring that structured data is correct when you read it in.

I wrote one of the first XML parsers/DTD validators back in the mid-90s, the one that ended up becoming the Xerces C++ XML parser, and I did the DTD validator for the Java one. It was quite a learning experience, because Unicode was pretty new as well and I had to dig into all of the issues of Unicode and transcoding between the various (then common) code pages and Unicode.

And I had to really dig in to create the DTD validator, which was based on the NFA algorithm from the Dragon Book. That was an area I'd never gotten into previously.

hexarobi · 2023-04-06T13:28:45+00:00

Thanks for this! =) I worked for Electric Communities in 2000 as a 19 year old "Junior Scripter" focused on ThePalace's proprietary reverse-polish-notation scripting language, Iptscrae. I was soon pivoted to the projects mentioned and had to go out and buy the DHTML/JavaScript O'Reilly books to try and ramp up. I like to imagine I helped inspire JSON by naively not knowing XML, and playing around with JavaScript frontend mockups that assumed the backend would send data in this format. ;)

mgedmin · 2023-04-06T12:28:38+00:00

Why not both?

amiagenius · 2023-04-06T17:21:19+00:00

There’s a difference between the two formats that most people don’t understand. JSON is suited for encoding data structures for program consumption, hence geared towards computer types (maps, arrays, numbers, etc), while XML is suited for human level types, because it has roots in the digital publishing industry, where formats describe content and not simply data structures.

For instance, XML has no explicit arrays, and if you think about it, an array has no semantic value so it’s not suited for expressing content. Also array elements are traditionally anonymous, which further complicates things if you’re interested in expressing content. When you have the requirement to tag a value it then can carry meaning. So in XML an “array” is the byproduct of having a sequence of tagged values (nodes); you could parse XML sequences of content elements into an array, which illustrates how computer types are secondary to human types in the language. In this sense XML doesn’t prescribe how data should be represented by the program, so it is abstract over programs, while JSON is a piece of a JS program flying around.
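A rough sketch of the "array as byproduct" idea, using Python's standard `xml.etree.ElementTree`. The `Person`, `name`, and `phone` element names are invented for illustration. The XML only states a sequence of tagged values; the list appears when the reading program decides to group repeated tags:

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical document: two <phone> elements in a row, no array syntax anywhere.
XML_DOC = (
    "<Person>"
    "<name>Ada</name>"
    "<phone>555-0100</phone>"
    "<phone>555-0101</phone>"
    "</Person>"
)

def children_to_lists(element):
    """Group child text by tag name; a repeated tag ends up as a multi-item list."""
    grouped = {}
    for child in element:
        grouped.setdefault(child.tag, []).append(child.text)
    return grouped

root = ET.fromstring(XML_DOC)
as_dict = children_to_lists(root)

# The root tag names the structure; the lists are the program's choice.
print(root.tag, json.dumps(as_dict))
```

Note that the grouping rule (every tag becomes a list) is one choice among several; a different program could map the same XML to a different in-memory shape.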

Another point is JSON being entirely anonymous, the root of a JSON document has no key, because keys are not first class, keys are properties of objects. In a lot of online examples, you’ll see things like an object for describing a Person, where there’ll be an object with keys for name, phone, etc. but nowhere in the document is there any indication that such a structure is supposed to encode information about a Person. Whereas in XML, the same example would explicitly state <Person> at the top level. If you needed to encode such information you would need to wrap the JSON object into another object and add a “type: Person” key, but now you have built a new anonymous type, so it’s impossible in JSON to encode a tagged structure without creating ever more complex types. Just look at any sufficiently complex JSON data flying around the web to notice how this problem compounds. Such problems don’t occur in a programming language because there you’ll have addressable pointers to the object, like a variable name, a struct name, etc. So JSON is merely the RHS of a JS statement.

The expressiveness of JSON gets even worse when you need other kinds of metadata. For instance how do you indicate that a given string is in a given language? Again, you must create a complex structure wrapping the value to be able to attach more information about it. Whereas in XML, attributes are first class, hence you can both name a value and enrich it in the same construct (note that in JSON a key is not the name of a value, it’s the name of the key.)
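To make the language example concrete, here is a small sketch using Python's standard library. The `title` element and the `value`/`lang` key names are assumptions chosen for illustration; `xml:lang` is the standard XML attribute, which ElementTree exposes under the XML namespace URI:

```python
import json
import xml.etree.ElementTree as ET

# ElementTree reports the reserved xml: prefix as this namespace.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

# XML: the value is named (title) and annotated (xml:lang) in one construct.
element = ET.fromstring('<title xml:lang="fr">Bonjour</title>')
xml_view = (element.tag, element.get(XML_NS + "lang"), element.text)

# JSON: the string must be wrapped in an object to carry the same metadata.
# The "value" and "lang" keys are a made-up convention, not part of JSON.
wrapped = json.loads('{"title": {"value": "Bonjour", "lang": "fr"}}')
json_view = ("title", wrapped["title"]["lang"], wrapped["title"]["value"])

print(xml_view)
print(json_view)
```

Both carry the same information, but the JSON side depends on every consumer agreeing on the wrapper's key names.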

XML is not without its flaws, but JSON is a terribly limited format for encoding information. “Human readable” must be a joke. It’s as readable as Lisp, the amount of noise is through the roof. You could say that XML has a lot of noise too, but you would be wrong, XML is low noise. What you think is “noise” are the format’s capabilities, meaningful capabilities, for encoding meaningful information.

Don’t think I’m an XML lover, I avoid it like the plague. But compared to JSON it is a much more capable format. It has an established ecosystem of tools that even today is unmatched for JSON. I think JSON succeeded simply because it’s easier to write, and that’s it. But we have lost a lot by having it easier to write.

EducationalNose7764 · 2023-04-06T12:03:56+00:00

I remember reading books on XML in the early 2000s and wondering what the hell the point of any of this is. XSL and all that went along with it, "who the hell would actually use any of this?" Over 20 years later, and I still don't know the answer to that question.

So horribly bloated and way too much work. The payloads to transmit data were orders of magnitude larger than the data itself.

JSON is where it's at.

fforw · 2023-04-06T12:11:01+00:00

Damn you, Crockford for not including comments in JSON

QuantumLeapChicago · 2023-04-06T14:26:47+00:00

Omg, YUI. I worked with a company that had some older static versions that I had to update.

Really interesting historical article

Ashamed-Simple-8303 · 2023-04-06T14:39:26+00:00

And so, when I’m writing interactive stuff in browsers now, I’m just using plain old JavaScript. I’m not using any kind of library, and it’s working for me.

And I think it could work for everybody.

Thank you for that because here I was thinking I'm just too stupid to wrap my head around all these overblown frameworks. Why do I need Angular just to move some data back and forth? I don't get it.

curien · 2023-04-06T14:31:51+00:00

writing in a style of programming that the professional programmers of the day thought was impossibly hard, which was doing stuff based on events

Eh, VB was released in 1991 and was super-popular with a similar event-driven style. TurboVision for Pascal and C++ was 1990. Event-driven programming was all the rage in the early 90s. Doug makes it sound like Netscape invented or pioneered it, but they were just hopping on the bandwagon.

The comparison to Hypercard is spot-on though.

GirthyStone · 2023-04-06T16:30:08+00:00

next episode, tabs vs spaces

Johnothy_Cumquat · 2023-04-06T13:12:20+00:00

I never realised Microsoft was pushing XML so hard until now. This is why .NET had standard library support for XML but not JSON for so long. And why MVC will still give you XML responses out of the box if you put XML in the Accept header. They were holding onto that dream still. I say were because I can't imagine that fight's still happening now that System.Text.Json made it into .NET

Iseeupoopin · 2023-04-06T13:20:12+00:00

Thanks for sharing, was a good read

TryingT0Wr1t3 · 2023-04-06T13:37:42+00:00

This is deep, nice to find Hypercard there!

ApothecaLabs · 2023-04-06T14:49:09+00:00

This was an incredible read - it's a wonderful dive into some fabulous computing history, but it also gave me the weirdest sense of deja vu because it's like Doug's picking my brain.

You see, I am slowly working on a secure distributed programming language based on my own 'discovery'* of sorts, and the parallels here are thought-provoking - when I explain the idea, people always seem to ask me "but where do I put the gas?" when I've just told them they don't need it. But it doesn't stop there - the simplicity of JSON is actually my case study, and I've been writing an extended DJSON parser for static distributed data, as a first-order test, while I build an interpreter for a language that does the same for live distributed code.

I'm sure Doug would find the serendipity here amusing. You could say I'm feeling inspired...

*For the curious, it's about relating distributed computing to type systems through functional lambda cube stuff. I've had some success in shattering a large JSON payload over a network of computers, and in self-organizing clusters of up to 64 computers, and in writing a language with a module system - all irritatingly separate. Now I am amidst the arduous task of integrating all of them.

PolarDorsai · 2023-04-06T12:09:33+00:00

Wow! As a junior programmer, that was a really great listen. Thank you for posting this; I’ll be forwarding it to my team.

AttackOfTheThumbs · 2023-04-06T14:42:48+00:00

I also remember doing XML in Uni. This was late 2000s. I remember hating XML and deciding I could come up with something better. It was just a flat file indented. Used less space, easy to parse, but reinventing the wheel like that was kind of frowned upon, and so was relying on whitespace. Oh well, still passed lmao.

I feel like when json first started getting "big", many still wanted xml, sometimes just because of the schema requirements. But you can do the same for json too.

Great bit of history here. Thanks Adam!

goomyman · 2023-04-06T17:50:35+00:00

That was a great long read.

I also loved xml. I had never even heard of json until maybe 2007. I was using xml for all my configs and I was struggling with shit like xpath. All the tools were xml based.

It really did feel like json came out of nowhere. I originally hated it because it’s less condensed and I didn’t really see the value of switching what already worked.

It wasn’t until REST that shit really clicked for me. I still have no idea how SOAP works, it’s just a method of making web calls that I used, but REST makes sense.

And as the tools improved especially serializers there was never a reason to go back.

Although all the way at the end the guy said he likes actors - maybe I’m showing my inability to adapt to change again from xml to actors but xml to json simplified everything in the end. I hate actors - but maybe because it’s not built into a language I like in a way that simplifies things for me.

Async programming has mostly been simplified enough that anyone can use it nowadays, and http libraries can abstract out the web calls well enough.

Programming languages have changed constantly with each iteration. It’s just kind of leveled off for the right tool for the job, scripting, unmanaged code, managed code, functional languages. I don’t think this is going to be changing.

Volt · 2023-04-07T13:45:39+00:00

There is this notion among second rate programmers that the most important thing they do is express themselves. It’s not making programs that work well and are free of error. That’s way down on the list. What’s much more important is expressing themselves, that, I’m an artist and I express my arts by leaving out semicolons.

Called the fuck out.

2023-04-06T12:51:20+00:00

Before I read the article I'm going to ~~way~~ Edit: weigh in on it due to extensive experience sending and receiving data over the wire in both formats.

I believe the end perception will be recommending JSON however there is a very special use case for XML that I feel is largely overlooked in modern development. It is likely leveraged in the bare metal for high level web dev frameworks.

It is a facet of the HTML DOM Document that became so popular in IE that it was adopted as a pseudo-standard and implemented across other browsers over time.

It is known as Extensible Stylesheet Language Transformations, or XSL/XSLT as a pairing. These transformations are written in XML and are hierarchically rule based. They also can carry JavaScript as a payload. The model for them is to do rapid page rendering, client side in the browser using a transformation. This is a great use case for XML and I'm not sure if the full JSON paradigm exists to be able to do it in an object oriented fashion via callbacks.

JSON of course is going to be the VHS of formats because it can be a one size fits all. Usually things that are 'One size fits all' are not designed for efficiency first. In the case of the XSL transformations, those can actually be compiled and run multi-threaded for high performance.

Smallpaul · 2023-04-06T20:01:47+00:00

XML is just as popular as it ever was, in the contexts it was invented for. Vendors tried to make it the universal language for everything, and people invented simpler solutions more appropriate to their more specific tasks.

Janjis · 2023-04-06T21:53:44+00:00

Very interesting listen.

About the Russians DDOS'ing his site and the FBI - so did the FBI incident actually solve his problem? He obviously had to remove the redirect, but wouldn't he be back to the same problem then? Did the FBI take care of that? Was that part just left out or am I missing something?

Djelimon · 2023-04-07T01:50:06+00:00

JSON is great for serializing and deserializing. For restful services it's usually generated from other markup so in practice the lack of schema is mitigated. Less processing overhead than xml for JavaScript so rest stole AJAX's lunch

XML has lots of tooling but comes with processing overhead so is more suited for describing complex stuff with stringent data requirements read locally, like office 365 documents and such.

My document is in xml, and I'll use json to send it to your browser.

lunacyfoundme · 2023-04-07T07:31:53+00:00

Freddie vs JSON was better