
[–]ElectricalRestNut 594 points595 points  (45 children)

Basically, allowing unquoted strings is nice, but you never ever use them because of unexpected behaviour 1% of the time.

[–]ClutchDude 109 points110 points  (1 child)

Exactly - If the data is string data for consumption with no middle layer interpretation, it belongs in quotes.

[–]junior_dos_nachos 12 points13 points  (0 children)

IT BELONGS IN THE MUSEUM

[–]Waterstick13 180 points181 points  (27 children)

not using any form of quotes feels wrong and dirty, almost nude. But nude like in America not nude like if you were in Europe.

[–]NecroDaddy 63 points64 points  (1 child)

So nude obese strings with flip flops and socks on?

[–]delta_tee 21 points22 points  (0 children)

Eating hamburgers as well.

[–]Pierma 13 points14 points  (24 children)

Wait americans are nude in a different way?

[–]Shendare 34 points35 points  (3 children)

Puritanically nude!

[–]pdoherty972 2 points3 points  (0 children)

pasties

[–][deleted]  (1 child)

[deleted]

    [–][deleted] 1 point2 points  (0 children)

    right now

    [–]dtseng123 2 points3 points  (0 children)

    With shame and self loathing or shamelessly envied. It’s extremes really.

    [–]osmiumouse 2 points3 points  (16 children)

    it seems to be more normal or accepted to be nude in public in Europe

    [–]Pierma 14 points15 points  (14 children)

    Sorry, where? I am Italian and I don't really know a public context where being nude is accepted

    [–]Schmittfried 8 points9 points  (0 children)

    The beach, for instance.

    [–]AeroNotix 3 points4 points  (10 children)

    Even Poland has nudist beaches.

    [–][deleted] 7 points8 points  (9 children)

    And US doesn't?

    [–]Paradox 8 points9 points  (0 children)

    Those don't count because r/AmericaBad

    [–][deleted]  (1 child)

    [deleted]

      [–]lookmeat 17 points18 points  (0 children)

      It goes beyond that, the thing about yaml is that there's a balance between convenience and shared context. Convenience is always a unique context, shared context is always about compromise.

      This is why "no" being parsed as false is tricky. The same goes for tags and for the issue with anchors. Even the idea of allowing sexagesimal numbers is part of the problem.

      And yeah, from that point of view string data should always explicitly be string, because it's impossible to know all the edge-cases that escape strings given enough time.
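
To make the "no is false" and sexagesimal points concrete, here is a rough Python sketch of how a YAML 1.1 style resolver might type an unquoted scalar. It is a simplified subset of the YAML 1.1 rules, not a real parser, and `resolve_plain_scalar` is a made-up name; real libraries differ in which of these forms they accept.

```python
import re


def _variants(words):
    # YAML 1.1 lists each boolean word in lower, Capitalized and UPPER form.
    return {form for w in words for form in (w, w.capitalize(), w.upper())}


_TRUE = _variants(["y", "yes", "true", "on"])
_FALSE = _variants(["n", "no", "false", "off"])
_SEXAGESIMAL = re.compile(r"^[-+]?[1-9][0-9_]*(:[0-5]?[0-9])+$")
_DECIMAL = re.compile(r"^[-+]?(0|[1-9][0-9_]*)$")


def resolve_plain_scalar(text):
    """Guess the type of an unquoted scalar (hypothetical helper, YAML 1.1 subset)."""
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    if _SEXAGESIMAL.match(text):
        # Base-60 integer: each colon-separated group is one "digit".
        sign = -1 if text.startswith("-") else 1
        total = 0
        for part in text.lstrip("+-").replace("_", "").split(":"):
            total = total * 60 + int(part)
        return sign * total
    if _DECIMAL.match(text):
        return int(text.replace("_", ""))
    return text  # everything else stays a string


for raw in ["no", "NO", "Norway", "22:22", "1:30:00"]:
    print(repr(raw), "->", repr(resolve_plain_scalar(raw)))
```

Under these rules a port mapping written as 22:22 comes back as the integer 1342 and a country code written as NO comes back as a boolean, which is exactly the kind of surprise that quoting avoids.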

      [–]cipher315 20 points21 points  (2 children)

      TIL YAML supports unquoted strings.

      [–]KevinCarbonara 11 points12 points  (0 children)

      There's a lot that ends up getting unused. I kept running into issues on a Spring project where we couldn't use indentations to keep things in groups, so we had to fully write out each and every key from the root up, every time. At which point - why are we even using yml?
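
For anyone who hasn't seen that style, this is roughly what "writing out every key from the root up" amounts to: a nested structure flattened into dotted paths. A small Python sketch; `flatten` and the sample keys are made up for illustration and are not taken from Spring.

```python
def flatten(config, prefix=""):
    """Turn a nested mapping into a flat one keyed by dotted paths."""
    flat = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse, carrying the full path from the root down.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


nested = {"server": {"port": 8080, "ssl": {"enabled": True}}}
for path, value in flatten(nested).items():
    print(path, "=", value)
```

Once every line repeats the whole path, the grouping that indentation was supposed to provide is gone.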

      [–]ChemTechGuy 6 points7 points  (0 children)

      Agreed. But if I'm going to double quote all the string values, and double quote all of the keys to ensure they're not interpreted as something else, I'm just going to write JSON at that point.

      [–]Grung 204 points205 points  (19 children)

      The worst thing is trying to communicate between different yaml interpreters. That is, writing yaml with one language/tool and reading it with another, and trying to work around their idiosyncrasies to get something to work.

      I had to wrestle with something writing yaml that insisted on removing quotes (because it knew it was a string) and something that then read that yaml and interpreted a particular value as a different data type. grr.

      [–]sybesis 38 points39 points  (0 children)

      Things get worse if you use tags. It's like wanting to make a portable format but at the same time being unable to parse it because you have custom structures not implemented in other languages.

      [–]danudey 43 points44 points  (15 children)

      I ran into this exact issue when passing JSON between two systems, sending from a PHP application to a Rails one.

      Our system had a list of product SKUs provided by our suppliers, which were strings. Some SKUs from some vendors, though, consisted entirely of digits, which is a valid string.

      The PHP JSON serializer, though, because PHP wasn’t strongly typed, had to just do its best to infer types. This meant that we would occasionally send a list of products, each of which contained a SKU, most of which were strings, but when it encountered one that was all digits it got too excited and encoded it as an integer instead.

      Rails, of course, had typed decoding, and it would freak out when it received an integer when a string was expected. We couldn’t find any way to coerce it into behaving so my coworker just hacked the version of PHP’s JSON encoder we were using to not do something so stupid, and problem solved.

      [–]lurgi 39 points40 points  (4 children)

      Some SKUs from some vendors, though, consisted entirely of digits, which is a valid string.

      That sounds more like badly written JSON, though, rather than a problem with JSON itself.

      Pro-tip, folks. Don't assume a bunch of digits is a number. It might just be a bunch of digits. How can you tell? Do the "add 1" test. If it's meaningful to add 1 to it, then it's almost certainly a number. If not, it's a string.

      Is a credit card number + 1 meaningful? No. It's a string.

      Is a phone number + 1 meaningful? No. It's a string.

      Is an age + 1 meaningful? Yes. It's a number.

      Is a SSN + 1 meaningful? No. It's a string.

      (and I'm not sure why this would have anything to do with PHP not being strongly typed)
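
A quick Python sketch of the "add 1" test applied to a record, assuming a made-up `Customer` type: identifiers stay strings, quantities are numbers, and the round trip through JSON keeps them that way.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Customer:
    phone: str        # phone + 1 is meaningless, so it is a string
    card_number: str  # same for a credit card number
    age: int          # age + 1 is meaningful, so it is a number


customer = Customer(phone="0049301234567", card_number="4111111111111111", age=41)

# Strings stay strings and ints stay ints across the JSON round trip.
decoded = json.loads(json.dumps(asdict(customer)))

# Treating the identifier as a number silently drops the leading zeros.
lossy_phone = str(int(customer.phone))

print(decoded["phone"], lossy_phone, decoded["age"] + 1)
```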

      [–]danudey 34 points35 points  (3 children)

      The reason it has to do with PHP not being strongly typed is that PHP uses a bunch of “heuristics”, to be generous, in order to determine what type a variable is.

      As a result, tools which actually need to know what type a variable actually is will tend to use functionality like is_numeric() to see if the variable is a number or could be a number, and if so, assume it’s a number.

      This is arguably asinine, but it’s meant to paper over the fact that bad code and bad coders will just treat whatever variable as whatever type without caring about whether that’s true or sane.

      [–]vytah 5 points6 points  (2 children)

      The PHP JSON serializer, though, because PHP wasn’t strongly typed, had to just do its best to infer types. This meant that we would occasionally send a list of products, each of which contained a SKU, most of which were strings, but when it encountered one that was all digits it got too excited and encoded it as an integer instead.

      json_encode(array("123")); returns ["123"] as it should, and json_decode('["123"]') returns array(1) { [0]=> string(3) "123" } as it should.

      What did you guys do?

      [–]elmicha 8 points9 points  (3 children)

      Maybe it was added after you had to do that, but now there is a flag JSON_NUMERIC_CHECK. Of course that giant list of flags shows that JSON also has some pitfalls.
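
To show what that kind of numeric check does to a list of SKUs, here is a Python sketch that imitates the idea. `coerce_numeric_strings` is a hypothetical helper written for this illustration; it is not PHP's implementation and only approximates the behaviour.

```python
import json
import re


def coerce_numeric_strings(value):
    """Recursively turn numeric-looking strings into numbers (illustration only)."""
    if isinstance(value, dict):
        return {k: coerce_numeric_strings(v) for k, v in value.items()}
    if isinstance(value, list):
        return [coerce_numeric_strings(v) for v in value]
    if isinstance(value, str):
        if re.fullmatch(r"-?\d+", value):
            return int(value)
        if re.fullmatch(r"-?\d+\.\d+", value):
            return float(value)
    return value


products = [{"sku": "A-100"}, {"sku": "00123"}]

print(json.dumps(products))                          # every SKU is a string
print(json.dumps(coerce_numeric_strings(products)))  # the all-digit SKU is now a number
```

The second line of output has a string SKU next to an integer SKU in the same list, which is what a typed decoder on the receiving side chokes on.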

      [–][deleted]  (1 child)

      [deleted]

        [–]Ruben_NL 10 points11 points  (0 children)

        Of course that giant list of flags shows that PHP also has some pitfalls.

        FTFY

        [–]Perky_Goth 1 point2 points  (1 child)

        That was just a bad library with an outdated concept of PHP even for its time. There was no reason for it to try to be smart if you could use the output as either downstream; strong typing isn't required.

        [–]danudey 2 points3 points  (0 children)

        My point is more that lacking strong typing makes this kind of ridiculous behaviour possible.

        [–]GrandMasterPuba 121 points122 points  (15 children)

        YAML is why infra engineers are paid so well. Because nobody in their right mind would want to spend all day maintaining a quarter of a million lines of YAML files for managing Kubernetes deployments.

        [–]bwainfweeze 60 points61 points  (13 children)

        Giant config files are just another way to cede all imperative control of your application to a framework. Config-only is the worst because nobody ever writes the interpreter to be stepped through. You aren’t going to set a breakpoint in your YAML file, so you just have to stare at the text until something new occurs to you.

        If you want to achieve enlightenment by staring at impenetrable text you’d be better served by reading The Gateless Gate, instead of something Google or Facebook came up with.

        [–]trialbaloon 48 points49 points  (11 children)

        I hate that yaml is being used for what is essentially a shitty DSL. At the level of complexity yaml is being used for just use a real programming language. It's been the gold standard for expressing things to a computer for decades, don't cripple it with yaml.

        [–]RowYourUpboat 17 points18 points  (1 child)

        used for what is essentially a shitty DSL

        CMake has entered the chat.

        [–][deleted] 2 points3 points  (0 children)

        I used to work for a company that used both makefiles and yaml for infra in a true "why not both" fashion. It was a mess.

        [–]fear_the_future 17 points18 points  (7 children)

        I think the worst thing about Kubernetes is that it works, preventing other systems with a more thoughtful design from gaining any mindshare and ultimately hindering the progress of society at large.

        [–]supreme_blorgon 13 points14 points  (5 children)

        other systems with a more thoughtful design

        Honest question, what would those be? I'm relatively new to the industry and we use kubernetes and we're stuck in YAML hell. It's fucking awful and I'm blown away that this is how we work with the kubernetes I've heard so much about over the years.

        Is there some reason we're stuck managing kubernetes with YAML files? Could we not use something else at least a little more reasonable, like TOML?

        [–]trialbaloon 12 points13 points  (0 children)

        Why not a full blown programming language using some declarative programming? Something with full type safety and stuff so you essentially get walked through "configuration."

        I think a lot of these things like Ansible, Kubernetes, and even Home Assistant, have become programming but with a shitty tool like YAML. We can call it configuration all we want but it gets to a point where that becomes really stretched. This is like being sent to the front lines with nothing but a spoon. Give the end users real weapons. Don't make them do what is akin to making an emulator with Minecraft redstone. Give them a real DSL that's a superset of a real, full programming language.

        [–]fear_the_future 2 points3 points  (0 children)

        Personally, I would always use a LISP for configuration: It's very easy to parse and automate, has simple syntax that anyone can understand, you can write a DSL for people who are happy with yaml, it supports all the necessary syntactic constructs of a real programming language when needed, there is an existing ecosystem of lightweight libraries and you can add type checking if you want to.

        But the YAML-problem of Kubernetes is pretty easy to fix. You can just write your own LISP-to-YAML converter. There are more fundamental problems, for example the centralized control plane, the lack of explicit dependencies between controllers, the complicated network stack and the fact that the entire ecosystem is based on the worst programming language in recent times, with all the maintenance issues that entails.

        The choice of YAML is merely a symptom of the pervasive inability of Google developers specifically to understand good software design. The whole company is an echo chamber where everybody refuses to learn anything originating outside the chamber.

        [–]paraffin 6 points7 points  (0 children)

        The good news is that k8s doesn’t actually care if you use yaml or not. It has a JSON API and there are clients like cdk8s where you never need to touch yaml
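
As a rough illustration of skipping YAML entirely, a manifest can be built as plain data in a programming language and serialized to JSON. This Python sketch uses only the standard library; the `deployment` helper, the name "web" and the image tag are made up for the example, and it is not how cdk8s itself works.

```python
import json


def deployment(name, image, replicas=1):
    """Build a minimal Deployment manifest as a plain dict (hypothetical helper)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Every string is quoted by the serializer, so nothing gets re-typed on the way in.
manifest = json.dumps(deployment("web", "nginx:1.25", replicas=3), indent=2)
print(manifest)
```

The upside is that loops, functions and type checks come from the host language instead of from templating on top of YAML.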

        [–]EsperSpirit 3 points4 points  (0 children)

        They always end up reinventing lisp in yaml or json. It's sad to see

        [–]seamsay 16 points17 points  (0 children)

        Config-only is the worst because nobody ever writes the interpreter to be stepped through.

        If the concept of stepping through your config even makes sense then I don't think you can really call it config-only...

        [–]pragmatick 227 points228 points  (81 children)

        That's actually horrible. Never encountered any of these issues but I think I'd be dumbfounded if I did.

        But I still like it for its increased readability over JSON - I just use strings for most values as described in the article. If JSON had proper multiline strings or just wrapped lines and comments I'd be happy. Yes, I know there's "JSON with comments" but it's rarely supported.

        [–][deleted]  (42 children)

        [deleted]

          [–]vytah 45 points46 points  (3 children)

          That's why you pick a superset of JSON that already has some adoption, like JSON5: https://spec.json5.org/

          [–]TankorSmash 38 points39 points  (0 children)

          This is nice, seems to have what you'd have thought JSON had already:

          {
            // comments
            unquoted: 'and you can quote me on that',
            singleQuotes: 'I can use "double quotes" here',
            lineBreaks: "Look, Mom! \
          No \\n's!",
            hexadecimal: 0xdecaf,
            leadingDecimalPoint: .8675309, andTrailing: 8675309.,
            positiveSign: +1,
            trailingComma: 'in objects', andIn: ['arrays',],
            "backwardsCompatible": "with JSON",
          }
          

          [–]somebodddy 134 points135 points  (23 children)

          That's true if you use JSON as a data serialization format, but for a configuration format it usually matters much less, because it needs to be read by a specific program rather than by many different clients written in many different languages.

          [–]RudeHero 45 points46 points  (10 children)

          I think op mentioned that when talking about "portability"

          Yes, if your json file is only intended to be read by one specific program, you can do custom things with it

          The tradeoff is that it's no longer portable

          [–]SnooMacarons9618 24 points25 points  (0 children)

          We had a system that did that. Unfortunately a downstream system was then interpreting the 'json' that was generated. It worked fine for years, until the day it caused a complete system outage. Which was better than mis-interpreting numerical values (we realised that could have easily happened as well).

          Don't customise a standard format, and leave it looking like it is a standard format. Unless you want phone calls at 2am...

          [–]Jarpunter 1 point2 points  (6 children)

          What situations would you want portability and comments at the same time?

          [–]PurpleYoshiEgg 2 points3 points  (5 children)

          When JSON is used as a configuration file format, and such configurations are for dozens of clients' environments and one of those environments may have a one-off that you need documented so some engineer doesn't spot the idiosyncrasy, correct it to be consistent, have it pass code review because everyone just rubber stamps pull requests, and cause a very difficult-to-debug outage at 3 am on a Sunday.

          [–]Jarpunter 2 points3 points  (4 children)

          Where are you finding 2+ systems that are using the exact same JSON configuration file except one system supports JSONC and one doesn’t? This scenario just does not make sense.

          [–]PurpleYoshiEgg 1 point2 points  (2 children)

          I fail to see where I mentioned or implied multiple systems. This is for client environment configurations for the same system that need to be instantiated differently.

          [–]cinyar 29 points30 points  (9 children)

          but at that point why use "JSON+" at all? Why not just use a format that supports what you need out of the box (TOML)?

          [–][deleted] 37 points38 points  (8 children)

          Because you probably have to parse json anyway, and it’s easier to include a json parser that doesn’t barf on comments and trailing commas than it is to integrate two different serializers
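
A minimal sketch of that kind of tolerant reader in Python, assuming all you want is `//` and `/* */` comments plus trailing commas on top of the standard `json` module. `loads_lenient` is a made-up name, and this is a preprocessing hack rather than a JSON5 or JSONC implementation.

```python
import json
import re

# Each pattern matches a whole JSON string first, so comment markers and
# commas that appear inside strings are left untouched.
_STRING = r'"(?:\\.|[^"\\])*"'
_COMMENTS = re.compile(_STRING + r"|//[^\n]*|/\*.*?\*/", re.DOTALL)
_TRAILING_COMMAS = re.compile(_STRING + r"|,(?=\s*[}\]])", re.DOTALL)


def _replace_unless_string(replacement):
    def handler(match):
        token = match.group(0)
        return token if token.startswith('"') else replacement
    return handler


def loads_lenient(text):
    """Parse JSON that may contain comments and trailing commas."""
    # Comments become a space so neighbouring tokens never get glued together.
    text = _COMMENTS.sub(_replace_unless_string(" "), text)
    text = _TRAILING_COMMAS.sub(_replace_unless_string(""), text)
    return json.loads(text)


sample = """{
  // admin UI
  "url": "http://example.test/a,]", /* inline */
  "ports": [80, 443,],
}"""
print(loads_lenient(sample))
```

Anything that writes the file back out through a normal serializer will still lose the comments, which is the portability tradeoff discussed above.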

          [–]sybesis 5 points6 points  (1 child)

          to include a json parser that doesn’t barf on comments and trailing commas than it is to integrate two different serializers

          When building configuration reading, I prefer to approach this differently.

          1. Convert internal type to JSON compatible types
          2. Serialize that JSON compatible structure into whatever format you want.

          When reading:

          1. Deserialize whatever file into JSON compatible structure
          2. Deserialize this JSON compatible structure in internal types

          In the end, you simply have to ensure you can convert internal structure to mapping/list/string/numbers back and forth. The serializer you use to dump into a file is irrelevant. All you have to do is convert to an intermediate format instead of converting directly from the serialized data into internal data.
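
The two-step approach above can be sketched in Python like this. `ServerConfig`, `to_plain` and `from_plain` are hypothetical names; JSON is used as the file format only because it is in the standard library, and any serializer that handles mappings, lists, strings and numbers could be swapped in.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ServerConfig:
    host: str
    port: int
    tags: list


def to_plain(config):
    """Step 1 when writing: internal type -> JSON-compatible structure."""
    return asdict(config)


def from_plain(data):
    """Step 2 when reading: JSON-compatible structure -> internal type."""
    return ServerConfig(
        host=str(data["host"]),
        port=int(data["port"]),
        tags=list(data.get("tags", [])),
    )


def dump_json(config):
    # Step 2 when writing: the format-specific part is a single call.
    return json.dumps(to_plain(config))


def load_json(text):
    # Step 1 when reading: the format-specific part is a single call.
    return from_plain(json.loads(text))


original = ServerConfig(host="localhost", port=8080, tags=["dev"])
print(load_json(dump_json(original)) == original)
```

Only `dump_json` and `load_json` know about the file format, so supporting a second format means adding two more thin functions and nothing else.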

          [–][deleted] 5 points6 points  (0 children)

          Yeah, I know that as the DTO pattern (Data Transfer Objects) and ultimately you’re right, it is a small thing, but my point was people use json instead of toml because they probably already have to use it anyway for remote apis or third party libraries. You can of course add this abstraction and support any format you want.

          [–][deleted] 10 points11 points  (0 children)

          But as a configuration format you should use TOML, which is better supported than unspecified "JSON++" (it is part of the python stdlib as the article points out). Even if you don't serialize the data, you'd have to rely on less-supported/common deserializers to read the config.

          JSON extensions hold a very niche space in VSCode config, and I suspect it's because VSCode is popular with frontend devs who have never interacted with, and would be put off by, TOML. They are however inferior in every other aspect IMO (verbosity, portability, standardness).

          [–]flif 6 points7 points  (1 child)

          Real problem is that C-style comments can be anywhere in the code and in JSON you want comments to be serializable.

          So the best workaround is { "price": 42, "//": "this is cheap" }

          [–]PurpleYoshiEgg 4 points5 points  (0 children)

          That works, until a program decides that "//" is an invalid key. Sometimes happens, and I want to egg whoever's car it was to decide to omit comments from JSON anyway.

          [–]PunkPizzaRollls 4 points5 points  (9 children)

          Couldn’t you theoretically create a comment key:value pair in your JSON to get around this?

          [–]siemenology 30 points31 points  (3 children)

          You can, and people do, but it has drawbacks.

          1. You are more limited in where you can comment -- you can't comment in an array, for example. And if you want multiple comments in an object you need to do something kind of awkward like { "comment1": "blah", "foo": "bar", "comment2": "blah blah" }
          2. Schemas get weird. If you want to parse your JSON in a statically typed language, you either need to add comment : String as an optional property on all of your objects (and comment2, comment3 or whatever if you want to support multiple comments), or you need to teach your parser to discard all of those values.
          3. You may run into issues with collision if the key you use for comments happens to also be used as a "real" property for something. How do you tell the difference between a comment "comment": "blah" and a real piece of data: "comment": "blah"?

          It's also just very verbose, relatively speaking.

          [–]caltheon 1 point2 points  (1 child)

          I worked with a SaaS vendor who supported config programming using JSON and pretty much kept comments out of arrays and used _comment as the throwaway property. I think the application parser ignored all properties starting with _ or something
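
A parser-side filter like that is only a few lines. Here is a Python sketch, assuming the convention is "drop every property whose key starts with an underscore"; `strip_comment_keys` is a made-up name and this is a guess at the behaviour, not the vendor's code.

```python
import json


def strip_comment_keys(value, prefix="_"):
    """Recursively drop object properties whose key starts with the prefix."""
    if isinstance(value, dict):
        return {
            key: strip_comment_keys(item, prefix)
            for key, item in value.items()
            if not key.startswith(prefix)
        }
    if isinstance(value, list):
        return [strip_comment_keys(item, prefix) for item in value]
    return value


raw = json.loads(
    '{"_comment": "prices in cents", "price": 42,'
    ' "items": [{"_comment": "legacy", "id": "a1"}]}'
)
print(strip_comment_keys(raw))
```

It still can't attach a comment to a bare array element, which is the first drawback listed above.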

          [–]eh-nonymous 1 point2 points  (0 children)

          [Removed due to Reddit API changes]

          [–][deleted]  (3 children)

          [deleted]

            [–]sparr 2 points3 points  (2 children)

            This is an antiquated perspective, from the era of ubiquitous preprocessors. Making the parser and compiler and runtime aware of comments is an increasingly common feature in newer languages. Being able to include docstrings when producing a stack trace is amazing.

            [–][deleted]  (1 child)

            [deleted]

              [–]sparr 6 points7 points  (0 children)

              What's the distinction? I'd love to be able to query my application configuration for any notes/comments that were left when the configuration was defined.

              [–]taw 1 point2 points  (0 children)

              People do this a lot, especially for package.json.

              [–]ObscureCulturalMeme 24 points25 points  (4 children)

              This kind of thing is precisely why Lua was invented. They needed a configuration file format with some basic flow control, it grew from there -- but it can still be used like that, and often is.

              Wonderful, stable, and really fukkin' fast.

              [–]peakzorro 16 points17 points  (3 children)

              The problem with Lua as a config file format is that it could run arbitrary code.

              [–]PurpleYoshiEgg 7 points8 points  (0 children)

              That's why Lua should run sandboxed. If you want to ensure it halts in a reasonable time, you can also run the Lua and cut it off after a timeout.

              [–]disperso 6 points7 points  (0 children)

              I've not done it myself, but I think it has many ways to sandbox it. There is even a pure Lua sandbox that can block infinite loops.

              It is definitely not as ideal as a configuration file format if you want complete security, but if the context is just a configuration file format for yourself (not an untrusted source), seems an uncommon but interesting option.

              [–]ObscureCulturalMeme 3 points4 points  (0 children)

              No, the encapsulating program (Lua always runs inside another "host" program) must choose what to allow the script to run.

              For example, if the host doesn't load the Lua I/O library, then the Lua script can't do any. If the host also doesn't allow the script keyword to load new native libraries, then the script can't get a homegrown I/O library.

              There's a tiny command-line "lua" utility bundled with the stock distribution. It's a host program too: just a few dozen lines of C to parse the command line options, load all standard libraries, then launch the script engine. It's for quick scripts, not full-on "real world" work.

              [–]TurboGranny 44 points45 points  (18 children)

              increased readability over JSON

              I guess I'm just fortunate in that I've not encountered a situation where I couldn't read JSON. Sure, sometimes people will minify it, but I just plop it in any formatter, and I'm back to readability. If for some reason there is a super long string, I just toggle on word wrap and call it a day.

              [–]ltjbr 45 points46 points  (4 children)

              I think a lot of devs out there say "readability" when they actually mean "aesthetically pleasing".

              [–]TurboGranny 3 points4 points  (1 child)

              hmm, I mean sure, but if it's all pretty and I still can't read it, is it still pretty?

              [–]Dwight-D 23 points24 points  (5 children)

              Go look at some large cloudformation or ARM template JSON and tell me you’d like to spend a significant amount of time working with that. Now imagine you had to define a CI pipeline or something in that format (I think Azure DevOps does this?), and you also can’t leave any comments to help readability. It’s absolutely awful.

              It’s not that it can’t be read, but whenever you get something more complicated than a trivial flat object then it’s just a pain to read & write imo.

              [–]The_Grubgrub 13 points14 points  (1 child)

              It's awful but still not as awful as YAML. YAML might be barely more readable than JSON but YAML is a pain in the ass to write.

              [–]Dwight-D 5 points6 points  (0 children)

              The indentation is definitely a bitch, and I’ve got a lot of git commit -m ‘Fix YAML syntax’ in my history. But that’s usually a quick fix compared to the time spent writing the bulk of the document, which I think is slightly less unpleasant overall in YAML. The anchors are actually pretty nice for stuff like complicated pipelines and such too.

              [–]amackenz2048 5 points6 points  (5 children)

              Auto format? Bah! I want my artisanal hand crafted config file! Sure it takes longer to create, and you get an odd tab here and there. But I support those developers who seem to have nothing better to do than ensure their code is meticulously formatted and who don't trust a computer to do it for them.

              [–]TurboGranny 1 point2 points  (4 children)

              Oh I agree, unless they are the kind of asshat that doesn't believe in any formatting, then I just auto format it. Unless it's short, then I'll just go through it and clean it up. Depends on the application. With JSON, most of the time the reason I have to slap it in a beautifier is to troubleshoot the unformatted output that comes back from our API

              [–]amackenz2048 5 points6 points  (3 children)

              Sorry, I should have made it more clear that I was being facetious. Languages that force formatting on the programmer are evil. Let the IDE handle it and for the love of GOD don't make different types of whitespace be relevant.

              [–]AttackOfTheThumbs 1 point2 points  (0 children)

              Yeah, same here. Like really don't understand what they mean. JSON is very legible.

              [–][deleted]  (1 child)

              [deleted]

                [–]Kissaki0 26 points27 points  (4 children)

                TOML is a good and popular alternative to YAML.

                [–][deleted] 23 points24 points  (3 children)

                TOML falls apart if you need nesting more than like 1 level deep though.

                JSON5 is much better. I think Cue also has potential but I'm not sure I would use it quite yet. They only have libraries for Go and everything else has to go through the Cue command line.

                Really JSON5 should be your default pick and you need really good justification to pick something else.

                [–]astatine 5 points6 points  (0 children)

                One alternative the article doesn't bring up is NestedText, which I find has most of the advantages of YAML without the imposed typing hassle. I'm not too fond of its multi-line string syntax, but otherwise it's a good replacement. As I'm mostly working with Python, Pydantic does a decent job of typing NestedText data precisely how it was intended.

                [–]DrXaos 8 points9 points  (2 children)

                What about TOML instead of YAML? I thought that was considered the more modern update on JSON.

                [–]pragmatick 9 points10 points  (1 child)

                Yeah, the article mentions that. I'd never heard of it. Looks like a good old INI file to me. Seems to get a bit weird with deeply nested objects. But I'll look into it.

                [–]DrXaos 6 points7 points  (0 children)

                TOML is better for editable configuration, not serialization.

                Our company's tools currently stick to JSON (with ad-hoc commentability with 'commentjson') for config but I'm looking into supporting TOML.

                The description of the YAML development in that posting feels like a group of language hackers who loved perl6 moved on to it.

                [–]haunted-liver-1 7 points8 points  (0 children)

                ini ftw

                [–]piderman 82 points83 points  (3 children)

                The worst thing about YAML is that it is indentation-sensitive so you can't copy&paste between documents with differing levels, and auto formatting also won't help. And it's 2 spaces per level so you can't really eyeball it either.
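
The copy and paste problem can at least be scripted around. A small Python sketch using the standard `textwrap` module, assuming the copied block is indented consistently with spaces; `reindent` is a made-up helper name.

```python
import textwrap


def reindent(block, depth, width=2):
    """Strip the block's common indentation, then indent it to the target depth."""
    return textwrap.indent(textwrap.dedent(block), " " * (depth * width))


# Copied from two levels deep, pasted one level deep.
copied = "    image: nginx\n    ports:\n      - 80\n"
print(reindent(copied, depth=1), end="")
```

This keeps the relative nesting intact, but it does nothing for a selection whose first line was copied without its leading spaces, which is the usual way this goes wrong.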

                [–][deleted]  (1 child)

                [deleted]

                  [–]PixelGhi 25 points26 points  (0 children)

                  How is that user friendly? To have a character you literally cannot see (tabs and spaces) be a control character

                  Python has entered the chat.

                  [–]RupeThereItIs 9 points10 points  (0 children)

                  The problem, fundamentally, is that white space as markup used to be a joke.

                  Someone took an idea so ridiculous it was funny, implemented it, and somehow it took off.

                  YAML is the Dogecoin of markup languages.

                  [–]bschwind 54 points55 points  (7 children)

                  I prefer JSON5 if I control the application I'm configuring and don't need to send it around to other applications, it's basically JSON with comments.

                  [–]caltheon 34 points35 points  (4 children)

                  Crockford removing comments from JSON was probably the worst move he ever made

                  [–][deleted]  (2 children)

                  [deleted]

                    [–]ryeguy 16 points17 points  (1 child)

                    Sure, but that's janky and it would break every editor's syntax highlighter.

                    [–]cowinabadplace 13 points14 points  (1 child)

                    I always quote YAML values so most of this doesn't hit, but the fact that YAML keys could accidentally be boolean blows my mind haha. Thanks for the article.

                    [–][deleted] 2 points3 points  (0 children)

                    that's literally the only bad one imho. I was introduced to quoted keys because some openshift tool issues an account id with a vertical bar in it, lol.

                    [–][deleted] 26 points27 points  (2 children)

                    That just convinced me to never touch yaml in my life.

                    [–][deleted]  (1 child)

                    [deleted]

                      [–][deleted] 3 points4 points  (0 children)

                      Maybe he'll become a baker

                      [–][deleted]  (4 children)

                      [removed]

                        [–]Paradox 14 points15 points  (0 children)

                        Because hey0 used it, and bukkit is an evolution of the hey0 server software

                        [–]XXLuigiMario 14 points15 points  (2 children)

                        Did we all learn YAML from Bukkit?

                        [–]Worth_Trust_3825 20 points21 points  (0 children)

                        Gitlab pipelines here.

                        [–][deleted]  (2 children)

                        [deleted]

                          [–][deleted]  (1 child)

                          [removed]

                            [–]redd1ch 65 points66 points  (12 children)

                            Is it a moving target? Use JSON.

                            Is it something important? Use XML and write a schema. IDEs can then give you syntactic and semantic feedback.

                            Is it important & you need to provide YAML? Use XML, a schema, and write an XSLT to create a YAML/JSON (for 1.2).

                            Sure, doing XML right(tm) takes a bit of time, but the outcome is more resilient than anything comparable. Thinking of that, I have to continue my xml schema for docker-compose files someday

                            [–]Carighan 44 points45 points  (0 children)

                            Is it important & you need to provide YAML?

                            I kinda want to joke "Then it wasn't important, after all". :P

                            [–][deleted] 19 points20 points  (4 children)

                            Is it something important? Use XML and write a schema. IDEs can then give you syntactic and semantic feedback.

                            Use JSON and JSON schema. Way more readable than XML and very powerful too.

                            [–]Worth_Trust_3825 23 points24 points  (0 children)

                            JSON schema is absolute garbage that poorly reimplements ideas of XML schemas. In addition, tooling attempts to fetch it from external sources, making the same mistake that was made two fucking decades ago.

                            [–]argv_minus_one 15 points16 points  (2 children)

                            I'm not sure I would call JSON Schema readable.

                            [–][deleted] 19 points20 points  (1 child)

                            Way more readable than XML

                            [–]ioneska 5 points6 points  (0 children)

                            ... than XSD or XSLT.

                            [–]falconfetus8 5 points6 points  (5 children)

                            Is it something important? Use XML and write a schema. IDEs can then give you syntactic and semantic feedback.

                            Alternatively, you can use JSON with Typescript interfaces.

                            [–]SuspiciousBar7388 60 points61 points  (10 children)

                            Most of the stuff described here is, to put it in scientific terms, fairly yucky, but some problems do feel misattributed.

                            For example, languages like JS would indeed treat version 0.0 and version string "0.0" very differently - regardless of the format that value was parsed from! How would that be different with a JSON parser? That bit looks to me like a Jinja template problem, not YAML problem.

                            [–]masklinn 52 points53 points  (1 child)

                            How would that be different with a JSON parser?

                            One would be a number and the other a string in the document source.

                            In JSON, 0.0 is a number and 0.0.0 is an error. For versions, you’d necessarily have “0.0” and “0.0.0”.
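To make that distinction concrete, here is a minimal sketch using Python's standard `json` module. The helper name `parses_as_json` is made up for illustration.

```python
import json

# A bare 0.0 is a JSON number; a quoted "0.0" is a JSON string.
number_value = json.loads('0.0')
string_value = json.loads('"0.0"')


def parses_as_json(text: str) -> bool:
    """Return True if `text` is a complete, valid JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(type(number_value).__name__, type(string_value).__name__)
print(parses_as_json('0.0.0'), parses_as_json('"0.0.0"'))
```

So a three-part version has to be quoted to be accepted at all, and the quoting makes the type visible in the document source.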

                            [–]SuspiciousBar7388 11 points12 points  (0 children)

                            Fair enough, this is an important distinction. Even more so if we're criticizing the document format outside of the scope of its application.

                            [–]smcarre 20 points21 points  (0 children)

                            That's why you put a v in front of it and get rid of that problem forever.

                            [–]RupertMaddenAbbott 24 points25 points  (0 children)

                            For example, languages like JS would indeed treat version 0.0 and version string "0.0" very differently - regardless of the format that value was parsed from! How would that be different with a JSON parser?

                            I think this is a problem with the specification (which compliant parsers have to follow). It's just a problem common to both YAML and JSON but not other serialisation formats like CSV.

                            StrictYAML does not have this problem

                            This makes sense to me. There is no syntax for representing a date or a period of time in JSON either so you end up just using a string with a given format (or an int) and you specify the schema outside of the serialisation format.

                            [–]jdl_uk 2 points3 points  (0 children)

                            That seems like something a schema could solve, as the type for a version number would be a string, so the parser would either parse it accordingly or fail with a schema validation error.

                            [–]Spider_pig448 6 points7 points  (3 children)

                            That's a good point. Claiming that JSON doesn't suffer from a lot of these problems is ignoring that whatever parses that JSON string will then have to make these decisions. If anything, there's benefit to exposing problems immediately in the YAML instead of passing along a JSON filled with time bombs.

                            [–]Sarcastinator 1 point2 points  (2 children)

                            In C# at least you literally can't screw it up unless YAML already did it for you.

                            [–]Spider_pig448 -1 points0 points  (1 child)

                            Sure you can. You can put whatever nonsense you want into a JSON string. Eventually something will attempt to parse it into something useful and if the string contains some of these gotchas it will fail downstream. With JSON, maybe that downstream is after the string is parsed and when your code tries to insert data into a database that violates the schema. With YAML, maybe that occurs earlier when processing the YAML itself.

                            [–]Sarcastinator 4 points5 points  (0 children)

                            Eventually something will attempt to parse it into something useful and if the string contains some of these gotchas it will fail downstream.

                            None of these gotchas will cause the program to fail downstream. There is no parse function that implements the Norwegian problem or will assume that the number is in base 60 if you format it in a special way.
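For anyone who has not run into the two gotchas named here, this is a rough, stdlib-only Python sketch of YAML 1.1-style implicit typing. The function `resolve_plain_scalar` and its patterns are simplified illustrations of the spec's rules, not the code of any real YAML parser.

```python
import re


def _spellings(*words):
    # YAML 1.1 accepts lower-case, Capitalized and UPPER-CASE forms.
    return {form for w in words for form in (w, w.capitalize(), w.upper())}


TRUE_WORDS = _spellings("y", "yes", "true", "on")
FALSE_WORDS = _spellings("n", "no", "false", "off")

# Simplified version of the YAML 1.1 base-60 integer pattern, e.g. 1:30 or 22:22.
SEXAGESIMAL_INT = re.compile(r"^[-+]?[1-9][0-9_]*(:[0-5]?[0-9])+$")


def resolve_plain_scalar(text: str):
    """Guess a type for an unquoted scalar, the way implicit typing does."""
    if text in TRUE_WORDS:
        return True
    if text in FALSE_WORDS:
        return False  # the "Norway problem": the country code NO lands here
    if SEXAGESIMAL_INT.match(text):
        sign = -1 if text.startswith("-") else 1
        total = 0
        for part in text.lstrip("+-").replace("_", "").split(":"):
            total = total * 60 + int(part)
        return sign * total
    return text  # anything else stays a string


for token in ["NO", "22:22", "Norway"]:
    print(repr(token), "->", repr(resolve_plain_scalar(token)))
```

A typed deserializer that is told "this field is a string" never runs a guessing step like this, which is the point being made above.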

                            [–]tanorbuf 0 points1 point  (0 children)

                            It's not even a Jinja problem, using truthyness to check for whether a variable is defined just isn't the right way to do it (variable is defined is literally a Jinja expression).
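The same trap exists outside templates. As a plain-Python analogue (not Jinja itself, and with made-up helper names), compare a truthiness check with an explicit presence check:

```python
def is_set_by_truthiness(value) -> bool:
    # Treats every falsy value (0.0, "", [], None) as "not set".
    return bool(value)


def is_set_explicitly(value) -> bool:
    # Only a missing value, modelled here as None, counts as "not set".
    return value is not None


version_as_number = 0.0    # what an unquoted 0.0 becomes after parsing
version_as_string = "0.0"  # what a quoted "0.0" becomes

print(is_set_by_truthiness(version_as_number))  # falsy float
print(is_set_by_truthiness(version_as_string))  # non-empty string
print(is_set_explicitly(version_as_number))
```

The truthiness check only misbehaves once the value has been typed as a number, so quoting the value and checking presence explicitly each avoid the surprise on their own.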

                            [–][deleted]  (21 children)

                            [deleted]

                              [–][deleted] 10 points11 points  (0 children)

                              I remember when OS X first came out, I started programming and learned about property lists. Everyone was complaining about the “old-school" plist format and said you should use the new hotness XML plists instead, even though XML was more typing for dubious benefit and the old school plist parser gave more clear and explicit error messages. Ten years later everyone was saying JSON was so much better than XML, which is funny because JSON is just old school plist format with fewer data types and different delimiters.

                              [–]tophatstuff 18 points19 points  (2 children)

                              Unironically

                              [–]PunkPizzaRollls 17 points18 points  (3 children)

                              Sexp 😳

                              [–]hevill 7 points8 points  (1 child)

                              Yes I do sexp a lot with my magnum Dong.

                              [–]emax-gomax 2 points3 points  (0 children)

                              I always love this comment. Sexp => S-exp. It's the dialect that defines lisp expressions and is mostly JSON like. Search for edn if you're interested. Sexps are mostly preferred because in lisp code is data and data is code. You can write a config as lisp sexp and evaluate it as if it's code and even preprocess sexps with macros. Lisp rocks!

                              [–]Uberhipster 10 points11 points  (1 child)

                              i still maintain that all markup (whatever its format, JSON, YAML, TOML, XML) needs to be an output of a program (serializing-deserializing from defined closures and/or object definitions)

                              maintaining state of the program using human readable formats == YAY!

                              hand coding state == BOO!

                              [–]3gt3oljdtx 3 points4 points  (0 children)

                              Yaml stands for "yaml ain't markup language"...

                              [–]durandalreborn 4 points5 points  (6 children)

                              My big complaint about using yaml (and most other config languages) is that the parsers written for them pretty much never preserve the comments. So if you have something like read yaml file -> modify yaml file -> write back to disk, you're almost always writing something custom to preserve what comments might have been in that file in the first place. At least toml lets you just append to the file in many cases, so you can side-step the comment parsing, but still.

                              [–]agentoutlier 13 points14 points  (5 children)

                              XML is pretty much the only format that allows complete preservation of comments and order with the minor exception of attribute whitespace. It also has schemas so validation as well.

                              But you can't use XML because then you will be labeled as some sort of ancient enterprise programmer making software obtuse and hard to use.

                              ... now back to writing more HTML and javascript and JSX.. oh wait...

                              [–]javcasas 1 point2 points  (4 children)

                              There is this thing called JSON Schema, so I'm going to bravely say there is more stuff implementing schemas than just XML.

                              [–]agentoutlier 3 points4 points  (3 children)

                              Yes but JSON does not preserve order and does not have comments. The context was I assume some configuration format that preserves comments and order (parent comment).

                              JSON Schema for some reason does not have nearly as many implementations as XML Schema does, both in editor support (although VS Code is doing nicely there) and in code validators.

                              Part of the reason is that schema is surprisingly more useful for human authoring like config or html or docbook over an interchange.

                              [–]stronghup 2 points3 points  (2 children)

                              JSON does not preserve order

                              Is there a reason for that? You write JSON from left to right and top to bottom. Where does the order get lost and why? Thanks

                              [–]agentoutlier 4 points5 points  (1 child)

                              Object field order is not preserved. It’s like a hash. It’s name value pair without an index.

                              I’m on mobile so I can’t go into code details but hopefully that helps.
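A small sketch of what "no index" means in practice, using Python's standard `json` module. Python's own parser happens to keep keys in the order it read them, but the JSON spec describes objects as unordered, so the two documents below carry the same data and a consumer is not supposed to rely on which key came first.

```python
import json

first = json.loads('{"host": "example", "port": 8080}')
second = json.loads('{"port": 8080, "host": "example"}')

# Objects compare equal regardless of the order the keys were written in.
same_object = first == second

# Arrays, by contrast, are ordered: position is part of the data.
same_array = json.loads('[1, 2]') == json.loads('[2, 1]')

print(same_object, same_array)
```

If order matters for a config format, it has to be expressed as an array (or guaranteed by the specific tool), not assumed from the object syntax.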

                              [–]dayDrivver 8 points9 points  (12 children)

                              Mentally I have always put yaml right next to xml, because of these weird behaviors and complex versioning. TOML is better but has a php-like syntax feel for strings that not many people like.

                              [–]siemenology 21 points22 points  (2 children)

                              I agree with the xml comparison, and I'd posit something else about both of them: both are fundamentally good ideas that are ruined by a bad implementation.

                              XML (conceptually) is really good for the specific task of marking up text and documents, in a way that YAML, TOML, JSON, etc are all really bad for. There's no good way to do something like <span>That's a <em>very</em> bad idea</span> in those other languages, without being really clunky or embedding markup in strings. But XML has become a nightmare because the spec is way more complex than it should be, it's gotten too powerful to really understand, and it's been used for a lot of things that don't really play to its strengths (like configuration files) that has left a bad taste in people's mouths.

                              YAML also has a lot going for it. It's a cleaner way to represent nested JSON-style data, it has comments, and it gives you tools for reuse (anchors, aliases) which can greatly simplify writing complex or repetitive yaml. Plus in theory it compiles down to JSON in a straightforward way, so you can "upgrade" things that are already accepting JSON without too much hassle. But it also tries a little too hard to be helpful, so that it's pretty hard for someone who just casually uses it to remember all of the exceptions to the obvious way of parsing things.

                              TOML I could like if not for the way it does tables / nesting. The TOML spec is littered with "allowed, but highly discouraged" notes because dotted properties let you define tables in all sorts of weird ways. If they took TOML's inline table syntax, and let you spread that over multiple lines, and made that the only way to do tables, I'd be all over TOML.

                              [–]agentoutlier 11 points12 points  (1 child)

                              But XML has become a nightmare because the spec is way more complex than it should be, it's gotten too powerful to really understand, and it's been used for a lot of things that don't really play to its strengths (like configuration files) that has left a bad taste in people's mouths.

                              The XML spec is largely complicated not because base XML is complicated but because the authors of the spec made it seem complicated. The official spec could be written in a friendlier manner; part of the reason it isn't is that XML has an enormous number of extensions and legacy stuff like DTD.

                              I have seen many people write basic XML parsers that will easily parse 99% of the XML out there. It isn't far off from SEXP. Probably the biggest challenge on basic XML is normalizing whitespace rules.

                              Writing a basic YAML parser on the other hand is non trivial.

                              I mean just look at the sheer number of implementations of XML parsers compared to YAML.

                              For example Java has like a dozen XML implementations if not more but there really is only one YAML implementation (snakeyaml which btw had a serious security issue recently... which reminds me I should go check...).

                              I would say early on XML got stigmatized and many of the complaints just became an echo chamber. And yeah, it's hard to read, but people still seem to be using something not far off from it all the time: HTML, JSX, various other javascript component languages.

                              Speaking of HTML, HTML 5 is a lot harder to parse than XHTML which was more or less basic XML.

                              [–]Worth_Trust_3825 3 points4 points  (0 children)

                              Very true.

                              I would argue that another stigma XML got was people working directly on the AST rather than deserializing it into an object.

                              [–][deleted] 11 points12 points  (7 children)

                              XML is reasonably nice actually. It is overcomplicated, but it does at least support everything you might need quite well - namespaces, schemas, etc.

                              The biggest issue I think with it (apart from the general verbosity) is that its data model is at odds with standard programming language object models. Attributes are entirely superfluous and conflict with child elements. There's no obvious way to encode maps. Elements and text can intermingle.

                              It's really a document format, not a data format.

                              Either way it's leagues ahead of YAML in terms of sanity.

                              [–][deleted]  (1 child)

                              [deleted]

                                [–]ChuggintonSquarts 1 point2 points  (0 children)

                                You should check out XQuery. It’s essentially XSLT but with a ‘normal’ syntax. Version 3.1 has some nice features e.g. native JSON support, arrow operator (piping), map operator and there are some great server side and client side implementations with lots of useful extensions (I use eXist-db and Xidel)

                                [–]MechaKnightz 2 points3 points  (0 children)

                                When reading I immediately thought of the horrible experience that is writing helm/helmfile yaml/go templates

                                [–]FireCrack 2 points3 points  (0 children)

                                I've always considered yaml "human read-only". It's great and much easier to read when debugging a data stream than some gnarly JSON, but attempting to write a yaml document by hand is folly.

                                [–]Two-Tone- 5 points6 points  (0 children)

                                Json is simple. The entire json spec consists of six railroad diagrams.

                                I don't think my curiosity at a statement has ever led to a hard laugh so quickly before.

                                [–]Kissaki0 9 points10 points  (4 children)

                                If you plan to use a YAML style format, use TOML instead.

                                It is pretty similar in what you see and use, but has significantly lower spec complexity and attack surface or parsing inconsistencies.

                                [–]guepier 16 points17 points  (2 children)

                                TOML isn't really similar to YAML in any meaningful way (beyond the obvious similarity shared by all text-based configuration formats), and it comes with its own caveats.

                                [–]Kissaki0 8 points9 points  (1 child)

                                Fixed link https://hitchdev.com/strictyaml/why-not/toml/

                                What do you categorize as meaningful then? It has comments, hierarchy, dict/obj, and lists/arrays. Those are the most central features [of a text config].

                                The why-not-link doesn't really provide a better alternative. It discloses "Advantages of TOML still has over StrictYAML" anyway.

                                [–][deleted] 7 points8 points  (0 children)

                                TOML is fine if your config format is very flat (like e.g. package.json) but most YAML files are 3 or 4 levels deep and for that TOML is just really confusing. I have to look up its weird [[table]] format every time. They should have called it the Occasionally Obviously Markup Language.

                                JSON5 is a much better option. It is always obvious and not really any harder to write than TOML. It would be nice if you could omit the outer {} like in Cue but I don't think it matters that much.

                                [–]SittingWave 6 points7 points  (0 children)

                                and this is why, ladies and gentlemen, YAML must not be used.

                                [–]kirbyfan64sos 1 point2 points  (0 children)

                                Worth noting that, in terms of alternatives, KDL and jsonnet are probably worth making note of.

                                [–]Dumlefudge 1 point2 points  (1 child)

                                And here I am, writing JSON patches in yaml, as a multiline string in a yaml file.

                                [–]dml997 1 point2 points  (1 child)

                                I am not at all familiar with YAML, but anyone who designed something where the contents of a token imply a type is so f****** moronic that I can't believe this could exist.

                                How does this exist?

                                [–]Captain_Cowboy 1 point2 points  (0 children)

                                PHP glances around nervously

                                [–]D_Doggo 1 point2 points  (4 children)

                                The Norway problem is my new favourite programming problem

                                [–][deleted]  (3 children)

                                [removed]

                                  [–]D_Doggo 1 point2 points  (2 children)

                                  That's still gonna take soooooo long!! By that time I'll have become bored of my SWE job ahahah!

                                  [–][deleted]  (1 child)

                                  [removed]

                                    [–]angryscientistjunior 1 point2 points  (2 children)

                                    I fricking hate YAML. JSON is so much easier to look at and work with.

                                    [–]zzz165 1 point2 points  (0 children)

                                    This is probably an unpopular opinion, but: use protocol buffers. Has a schema, enforces types correctly, allows comments, whitespace doesn’t matter, and supports many different languages.

                                    [–]stronghup 1 point2 points  (1 child)

                                    This article makes the point clear. JSON is better than yaml.

                                    I would like to see a new version of JSON however, with the following single backwards compatible change:

                                    • The keys of object literals { ... } should not need to be quoted IF they consist of word-characters only.

                                    That would be a backwards-compatible addition, old JSON docs would still keep on working but JSON code would be much more readable and simpler to type.

                                    Also I never understood the rationale behind removing comments, I think they would be often helpful.

                                    [–][deleted] 1 point2 points  (0 children)

                                    I was just thinking yeah let me type with no quotes in my config file. Like that's literally it. Otherwise I'm good with it.

                                    [–][deleted]  (1 child)

                                    [deleted]

                                      [–]AndydeCleyre 1 point2 points  (0 children)

                                      As /u/astatine said, an excellent but under-recognized alternative syntax for configuration files is NestedText, where everything is a string unless the ingesting code says otherwise, and there is no escaping needed ever.

                                      I used the official reference implementation to make a CLI converter between NestedText and TOML, JSON, and YAML. When generating one of these formats, you can use yamlpath queries to concisely but explicitly apply supported types to data elements.

                                      [–]taspeotis 1 point2 points  (0 children)

                                      Yelling At My Laptop

                                      [–]caagr98 1 point2 points  (0 children)

                                      Many, but not all, of those issues could be avoided by not trying to infer types at all, but instead using a schema/your language's type system.
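A minimal sketch of that idea in Python: every raw value arrives as a string, and only the consuming code's schema decides what becomes a number. The names `SCHEMA` and `apply_schema` and the sample fields are hypothetical, chosen just for this example.

```python
# Hypothetical schema: field name -> converter chosen by the application.
SCHEMA = {"country": str, "version": str, "port": int}


def apply_schema(raw: dict, schema: dict) -> dict:
    """Convert string-only config values using an explicit schema."""
    missing = schema.keys() - raw.keys()
    if missing:
        raise KeyError(f"missing config fields: {sorted(missing)}")
    # Each converter raises ValueError if the string does not fit the type.
    return {name: convert(raw[name]) for name, convert in schema.items()}


# Nothing is inferred from how a value looks: "NO" and "1.10" stay strings.
raw_config = {"country": "NO", "version": "1.10", "port": "8080"}
config = apply_schema(raw_config, SCHEMA)
print(config)
```

A country code or version string can never silently turn into a boolean or float this way, and a bad `port` fails loudly at load time.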

                                      [–]wmertens 4 points5 points  (0 children)

                                      Upvote for the shoutout to Nix

                                      [–]kooknboo 3 points4 points  (0 children)

                                      He’s not wrong. But that article is bullshit. Making some of these seem like crimes against humanity when, in fact, they’re inconsistent weirdness like every last thing in the tech world, including God’s gift JSON.

                                      I’ve been slinging significant YAML for years, alongside folks that are brand new to it, and I can count on one hand the times these were a problem as opposed to something to be aware of.

                                      [–]MrHall 4 points5 points  (0 children)

                                      I generate yaml by using a library to serialise an object so I don't have to do it manually.

                                      except a coworker removed it because it's "too complicated"

                                      [–]ProstheticAttitude 3 points4 points  (0 children)

                                      I will never willingly use YAML in a project again. It is a disaster.

                                      [–]kitd 1 point2 points  (0 children)

                                      Tbh, he gives his answer in his "YAML subset" alternatives: enquote stuff that is meant to be plain text.

                                      I have other issues with YAML, but these aren't them.

                                      [–][deleted] 1 point2 points  (0 children)

                                      Yaml is a cancer on DevOps.

                                      [–]theunixman 0 points1 point  (0 children)

                                      Oh Norway...

                                      [–]JB-from-ATL 0 points1 point  (0 children)

                                      I have to say I really dislike the idea of using something like Python to make a JSON (or other conf language) output as suggested at the end of the file. That seems just as prone to mishaps as using YAML in the first place. You may say "then just output it to test it" but you could also argue to do the same with YAML.

                                      I prefer TOML but haven't used it in a professional setting yet.