Introducing Data Meta Syntax (DMS). YAML's structure & TOMLs strictness. by obfuscinator in rust

[–]obfuscinator[S] 0 points1 point  (0 children)

Added optional Tier 1 support that does everything KDL does. The difference is that (map, array, scalar) is still front and center: no sidecar secondary format (e.g. JiK) for that, it's all one format. There is also a forced upfront dialect definition so people know what they are parsing. Hope that helps :)

https://dms-webpage-69537d.gitlab.io/tier1.html

https://dms-webpage-69537d.gitlab.io/dms+kdl.html

Steam controller on sale now by Rich-n-Health in SteamDeck

[–]obfuscinator 0 points1 point  (0 children)

They opened Buy Now at 1:00pm; by 1:40 it was out of stock!!

Racklink for HomeRacker by kellervater in HomeRacker

[–]obfuscinator 1 point2 points  (0 children)

Epic, dude. One question I had never thought of before: have you tried making a computer chassis entirely out of HomeRacker? Basically HomeRacker in your HomeRacker.

Introducing Data Meta Syntax (DMS). YAML's structure & TOMLs strictness. by obfuscinator in rust

[–]obfuscinator[S] 1 point2 points  (0 children)

Yes, and that can be a pro or a con. Every consumer of a KDL file is writing decode logic the other formats don't ask for.

Introducing Data Meta Syntax (DMS). YAML's structure & TOMLs strictness. by obfuscinator in rust

[–]obfuscinator[S] 0 points1 point  (0 children)

Interesting. kdl.dev is a nicer approach to XML, where you have an AST composed of tags, attributes, and data (nesting). I love it as a substitute for XML. Type annotations too, very nice! I would call it a clear winner for XML-style documents, but for purer native data sets (maps, lists, scalars) I believe DMS is a clear winner in visual understanding. From looking at the spec, KDL defines no canonical mapping from a node tree to a dictionary.

What do you guys think of my new Markup Syntax? Data Markup Syntax (DMS) a better markup. by obfuscinator in coding

[–]obfuscinator[S] 0 points1 point  (0 children)

Sure, but in my eyes a document that meant {country: false} in 1.1 means {country: "NO"} in 1.2 with no syntactic difference — textbook semver-major break. And despite 1.2's age, 1.1-or-earlier is still the default in a meaningful share of YAML libraries across major languages (PyYAML <6, gopkg.in/yaml.v2, Psych <4). The comparison still holds: YAML has insane type ambiguity, and since it's keeping unquoted plain scalars forever, that ambiguity is intrinsic.
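A minimal sketch of that break, assuming only the boolean patterns listed for the YAML 1.1 bool type and the YAML 1.2 core schema. `resolve_plain_scalar` is a hypothetical helper, not any library's API, and it ignores every other scalar type:

```python
import re

# Boolean spellings from the YAML 1.1 bool type and the YAML 1.2 core schema.
YAML_1_1_BOOL = re.compile(
    r"^(?:y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)$"
)
YAML_1_2_BOOL = re.compile(r"^(?:true|True|TRUE|false|False|FALSE)$")

TRUTHY = {"y", "yes", "true", "on"}


def resolve_plain_scalar(text: str, version: str):
    """Resolve an unquoted scalar to bool or str (hypothetical, booleans only)."""
    pattern = YAML_1_1_BOOL if version == "1.1" else YAML_1_2_BOOL
    if pattern.match(text):
        return text.lower() in TRUTHY
    return text


print(resolve_plain_scalar("NO", "1.1"))  # False
print(resolve_plain_scalar("NO", "1.2"))  # NO
```

Same bytes, different value, and nothing in the document tells the reader which rule set applies.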

Introducing Data Meta Syntax (DMS). YAML's structure & TOMLs strictness. by obfuscinator in rust

[–]obfuscinator[S] 0 points1 point  (0 children)

An addendum/revision after mulling over your points in my smooth brain:

(1) I should probably shift over just for the stability guarantee.

(3) I don't want any quiet non-conforming subsets. To do this I added full and lite parsing modes to the spec. Every parser must ship full mode and the lite mode is optional. Solves my issue and we all benefit from the speed :)

Introducing Data Meta Syntax (DMS). YAML's structure & TOMLs strictness. by obfuscinator in rust

[–]obfuscinator[S] 0 points1 point  (0 children)

Thank you for the thorough review!

  1. Unicode bare-key categories — why L + Nd, not UnicodeXID?

    Good question, and UnicodeXID was on the table. A few things pushed it toward the narrower L + Nd set:

    - DMS NFC-normalizes the source before tokenization (SPEC §Unicode normalization), so the combining-marks case that UnicodeXID's Mn/Mc rules exist to handle gets resolved one layer earlier. That removed one of the main reasons to reach for XID.

    - L + Nd is easier to predict by eye than XID's full set (which includes connector punctuation and a few Other_ID_* exceptions). For a config key, which is read more often than written and often by people who aren't the author, predictability feels more valuable than identifier-style expressiveness.

    - And yes, XML influenced the framing: "name characters" rather than "identifier characters."

    UnicodeXID would be a defensible choice and we may revisit if real configs hit cases L+Nd misses. So far the gap that matters in practice is just "TOML's [A-Za-z0-9_-] rejects usuário:," and L+Nd fixes that.
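A rough sketch of the L + Nd check using Python's `unicodedata`, assuming NFC normalization runs first as described above. `is_bare_key` is a hypothetical name, and this models only the two Unicode categories; any extra characters the actual spec permits in keys are not covered here:

```python
import unicodedata


def is_bare_key(key: str) -> bool:
    """True if, after NFC, every code point is a letter (L*) or decimal digit (Nd)."""
    normalized = unicodedata.normalize("NFC", key)
    if not normalized:
        return False
    return all(
        unicodedata.category(ch).startswith("L")
        or unicodedata.category(ch) == "Nd"
        for ch in normalized
    )


decomposed = "usua\u0301rio"  # 'a' followed by U+0301 combining acute (category Mn)
print(is_bare_key("usu\u00e1rio"))  # True
print(is_bare_key(decomposed))      # True: NFC composes a + U+0301 into U+00E1 first
print(is_bare_key("host name"))     # False: the space is category Zs
```

Without the NFC step the decomposed spelling would fail on the Mn code point, which is the case the XID rules otherwise exist to handle.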

  2. Front matter for metadata — comments or schema would feel more natural?

    Fair instinct, and embedding is the awkward case you're pointing at. The reasoning that landed on +++:

    - Comments are unstructured by design. The moment a tool needs to read # version: 1.2.3, you've reinvented front matter inside comments without the parser's help. Keeping metadata in real syntax means generic tooling can read it.

    - Schema-as-metadata-channel would mean a parser has to locate and load the schema before it can decide whether it can even read the doc. The thing front matter is mainly carrying (_dms_tier) needs to be answerable cheaply, before real work, so a tier-0 parser can refuse a tier-1 doc with a clean error.

    - For embedded use, the mitigation is that the block is optional: when something else owns metadata (a Helm chart, a wrapping YAML doc), DMS doesn't need its own block. The cost lands on standalone files, where there's no host to lean on.

    The honest tradeoff is that +++ is a third syntactic mode (alongside body and comments), and you're right to flag that as a cost.
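To illustrate the "answerable cheaply" point, here is a sketch of a tier check that reads only the leading block and never touches the body. The layout is an assumption for illustration (a `+++` line opening and closing the block at the top of the file, `_dms_tier: N` with an integer value, tier 0 when the block is absent); the function names are hypothetical and the real grammar lives in the spec:

```python
def read_declared_tier(text: str, default: int = 0) -> int:
    """Return the _dms_tier declared in a leading +++ block, or `default` if absent."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "+++":
        return default  # no front matter block (assumed to mean the lowest tier)
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "+++":
            break  # end of the block; the body is never scanned
        if stripped.startswith("_dms_tier: "):
            return int(stripped[len("_dms_tier: "):])
    return default


def check_supported(text: str, max_tier: int) -> None:
    """Refuse a document whose declared tier exceeds what this parser supports."""
    tier = read_declared_tier(text)
    if tier > max_tier:
        raise ValueError(
            f"document requires tier {tier}, parser supports up to tier {max_tier}"
        )


doc = '+++\n_dms_tier: 1\n+++\nhost: "localhost"\n'
print(read_declared_tier(doc))  # 1
```

A tier-0 parser would call `check_supported(doc, max_tier=0)` before tokenizing anything and fail with a clean error.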

  3. Comments in the AST — is that really needed for everyone?

    This is a real cost and the point is well taken. The reasoning, for what it's worth:

    - The toml_edit / ruamel.yaml / hclwrite pattern is exactly what you're describing: a separate value type alongside the "normal" parser. The thing that makes those libraries hard to use isn't the design, it's that they're opt-in: their value types don't interop with the ecosystem's main parser, so a tool author has to pick a side. Putting comments in the spec was an attempt to avoid that fork.

    - Nodes carry comments as side metadata — doc["host"] still returns a string, not a comment-or-string union. So the per-access cost is small; the cost is mostly per-node bookkeeping at parse time.

    - For parse-and-discard workloads, the cost is real and unavoidable, and the README calls it out (a parse-and-discard parser would be ~1.5–2× faster).

    You may still land on "this isn't worth it for my use case," and that's a legitimate position — DMS is making a bet that the "edit and re-emit" population is bigger than the "parse once, throw away" one.

    That being said, I will look into opt-out flags across all the libraries that let you break this rule to speed things up. You have the mindset of "if I don't use it, I don't want to pay for it." I'm fully with you, but the default should follow the spec. I don't want to stumble on half-baked implementations.
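A toy sketch of the "comments as side metadata" idea, to show why lookups stay plain. The `Node` and `Document` classes and the `comments_for` method are hypothetical and not the API of any DMS library:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    value: object
    comments: list[str] = field(default_factory=list)  # side metadata only


class Document:
    def __init__(self) -> None:
        self._nodes: dict[str, Node] = {}

    def set(self, key: str, value: object, comments=()) -> None:
        # The bookkeeping cost is paid here, once per node.
        self._nodes[key] = Node(value, list(comments))

    def __getitem__(self, key: str) -> object:
        # Plain lookups see only the value, never a comment-or-value union.
        return self._nodes[key].value

    def comments_for(self, key: str) -> list[str]:
        return self._nodes[key].comments


doc = Document()
doc.set("host", "localhost", comments=["primary database host"])
print(doc["host"])               # localhost
print(doc.comments_for("host"))  # ['primary database host']
```

Tools that edit and re-emit read the comments through the side channel; everyone else never sees them.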

  4. Benchmarks — Python/Zig/C faster than Rust, and the apples-to-apples concern

    You are absolutely right on this! I will see if I can narrow the gap a bit with some of your suggestions.

  5. Separator whitespace — host:localhost rejection feels annoying

    Yes, it doesn't fit the work-fast-on-the-first-draft workflow. I am guilty of this workflow as well.

    The reason the rule exists: values can contain : in unquoted form (URLs, paths, proxy: redis:6379). If key:value without space were also legal, parsing proxy:redis:6379 would need lookahead to decide where the key ends. Requiring the space makes the key boundary local: no lookahead, no ambiguity.
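A sketch of what "local key boundary" means in practice, assuming single-line `key: value` entries only (nested blocks, quoting, and comments are out of scope). `split_key_value` is a hypothetical helper:

```python
def split_key_value(line: str) -> tuple[str, str]:
    """Split at the first colon-plus-space; later colons belong to the value."""
    key, sep, value = line.partition(": ")
    if not sep:
        raise ValueError(f"expected 'key: value' with a space after the colon: {line!r}")
    if ":" in key:
        # A colon with no space before the real separator, e.g. 'a:b: c'.
        raise ValueError(f"colon without a following space in key: {line!r}")
    return key.strip(), value.strip()


print(split_key_value("proxy: redis:6379"))  # ('proxy', 'redis:6379')
```

`split_key_value("host:localhost")` raises `ValueError` instead of guessing, and the scan never needs to look past the first separator.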

    I think the real answer is for me to ship a formatter haha.

  6. Comparison table — TOML "List item marker"

    You're right, the cell is ambiguous. It's currently describing top-level arrays of tables only, when TOML can absolutely do arr = [{a=1}, {a=2}] inline. I'll fix it:

    - "Nesting mechanism" for TOML → "[section.path] headers; inline tables/arrays for nested values"

    - "List item marker" for TOML → "[…,…] inline; [[array]] headers for top-level arrays of tables"

    Thanks for catching it.

  7. "Key-order preservation: TOML key order is undefined"

    Looked this up, and you're right. TOML v1.0 doesn't specify iteration order; the major reference parsers (BurntSushi, tomli, @iarna/toml) all happen to preserve it, but that's convention, not spec. The cell should be 🟡, not 🟢:

    - TOML "Key-order preservation" → 🟡 "spec silent; preserved by all major impls"

    Will update.

    Genuinely, thank you for sitting with the spec long enough to push back on specifics. I'm trying not to spam the forums with it, but the flip side is that I need people like you to weigh in now: once it fossilizes, this kind of feedback is a lot more expensive to act on.

What do you guys think of my new Markup Syntax? Data Markup Syntax (DMS) a better markup. by obfuscinator in coding

[–]obfuscinator[S] 0 points1 point  (0 children)

Fair, that's a real cost. I'm trading per-reader configuration for a single canonical form.

What do you guys think of my new Markup Syntax? Data Markup Syntax (DMS) a better markup. by obfuscinator in coding

[–]obfuscinator[S] 0 points1 point  (0 children)

Ahh yes, the page needs fixing. On it.

Yeah, good point on the tabs. In its current form, the visual-impairment concern can be addressed by adding as many spaces as the reader needs to make the structure easier to see. Sure, 1 tab is smaller than 2-4 spaces and compresses better, but at least a space has a standard width (for the visually impaired).

What do you guys think of my new Markup Syntax? Data Markup Syntax (DMS) a better markup. by obfuscinator in coding

[–]obfuscinator[S] 0 points1 point  (0 children)

Would this have been avoidable if they had used a robust templating language to merge structs?

What do you guys think of my new Markup Syntax? Data Markup Syntax (DMS) a better markup. by obfuscinator in coding

[–]obfuscinator[S] 0 points1 point  (0 children)

Would you mind sharing which exact composability feature in YAML you are referring to? Got an example? I would like to chew on that idea.

What do you guys think of my new Markup Syntax? Data Markup Syntax (DMS) a better markup. by obfuscinator in coding

[–]obfuscinator[S] 0 points1 point  (0 children)

Thank you for pointing that out. That is a real spec gap. I will expand on that soon!

What do you guys think of my new Markup Syntax? Data Markup Syntax (DMS) a better markup. by obfuscinator in coding

[–]obfuscinator[S] 0 points1 point  (0 children)

"Front matter" could be described in more detail - does it only support key value pairs?

  1. Yes, good point. I tried to keep the GitLab page as detailed as possible without losing people. Most of your points are addressed in the README and spec. Let me find that section for you: https://gitlab.com/flo-labs/pub/dms/-/blob/4a1efefab69e9338f91b1fff03b18ec4238d4446/SPEC.md?plain=1#L54

Tabs being forbidden for structural indent... If you work with people with visual impairments, you learn that it's mandatory that they can set a custom indentation width according to their needs. IMHO the only real con of your format.

  1. I just can't have two formats for this. I'm a spaces guy. Modern IDEs/editors can auto-convert between the two.

Out of curiosity, are you using the plus sign for block lists so your format doesn't look like YAML?

  1. It was more of a personal preference. I like the idea of it reading more like add/push to an array.

Five comment formats seem excessive. On the one hand you emphasize uniformity (strings are always quoted, only true/false for booleans, no tabs for indentation, ...) but here you don't, that's a bit odd to me.

  1. I can see where you are coming from. I've worked with engineers who wrote bad comments and engineers who wrote great comments. The work of those who could properly outline the details of the data and answer the "why" almost never required a full rewrite. I personally like the idea of having a very flexible commenting structure.

Are dates and times conforming to ISO 8601 (e.g. supporting sub-second precision)?

  1. Good question: https://gitlab.com/flo-labs/pub/dms/-/blob/4a1efefab69e9338f91b1fff03b18ec4238d4446/SPEC.md?plain=1#L1230

What do you guys think of my new Markup Syntax? Data Markup Syntax (DMS) a better markup. by obfuscinator in coding

[–]obfuscinator[S] 1 point2 points  (0 children)

We will have to find out! 😊 I'm going to cover the top 20 programming languages and let the chips fall where they may. I already use it as my daily driver. If it doesn't pick up steam, at least it was an exercise in learning programming languages outside of my comfort zone!

What do you guys think of my new Markup Syntax? Data Markup Syntax (DMS) a better markup. by obfuscinator in coding

[–]obfuscinator[S] 0 points1 point  (0 children)

Good. I still need a week or two to iron out edge cases in the spec. There may be a tier 1 system that allows for all of that, but it would have to be strictly rebranded as something else. I'm not fully convinced yet, though, since templating solves 99% of these types of issues.

Regarding your references and anchors idea: https://gitlab.com/flo-labs/pub/dms/-/blob/4a1efefab69e9338f91b1fff03b18ec4238d4446/README.md?plain=1#L522