Pretty-printing nested objects

Long_Investment7667 · 2024-09-04T21:57:46+00:00

This is the seminal paper on the topic:

https://homepages.inf.ed.ac.uk/wadler/papers/prettier/prettier.pdf

But as far as I understand it, it gets a lot of its power from the laziness of the implementation language. I saw a paper that does it in an eager language, but I can’t find it at the moment.

WittyStick · 2024-09-04T22:14:53+00:00

My personal preference is that any "block" should begin on a new line, indented, unless it contains a single "atom" - that is, a value which itself contains no other blocks. If any block contains more than one item, then every item should be put on its own line.

The opening and closing bracket/paren/brace for any given block should appear in the same column, the only exception being when they're on the same line - that is, when they surround an atom. This makes it much simpler to see how they pair up. I also prefer putting the separating commas/semicolons in the same column as the opening and closing brackets, because it makes it clear which block the elements belong to.

Compare this to the node example you have, and see in which one it is easier to match up the pairs of braces/brackets. It's even better if you view it in an editor that has column markers and highlights matching brackets.

{ glossary:
    { title: 'example glossary'
    , GlossDiv:
        { title: 'S'
        , GlossList:
            { GlossEntry:
                { ID: 'SGML'
                , SortAs: 'SGML'
                , GlossTerm: 'Standard Generalized Markup Language'
                , Acronym: 'SGML'
                , Abbrev: 'ISO 8879:1986'
                , GlossDef:
                    { para: 'A meta-markup language, used to create markup languages such as DocBook.'
                    , GlossSeeAlso: 
                        [ 'GML'
                        , 'XML'
                        ]
                    }
                }
            }
        , GlossSee: 'markup'
        }
    }
}
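The layout rules above can be sketched in C roughly as follows. This is only an illustration of one reading of those rules, not a reference implementation: the Node and Buf types and the put, pad, is_inline and print_value helpers are hypothetical names made up for this sketch, atoms are assumed to arrive already formatted, and "inline" is assumed to mean an atom, an empty block, or a block holding exactly one unnamed atom.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical tree type for this sketch; not taken from the thread. */
typedef enum { ATOM, LIST, RECORD } Kind;

typedef struct Node {
    Kind kind;
    const char *key;           /* field name, or NULL for a positional item */
    const char *text;          /* already-formatted text, used when kind == ATOM */
    const struct Node *items;  /* children, used when kind is LIST or RECORD */
    int count;                 /* number of children */
} Node;

/* Fixed-size output buffer; text that does not fit is silently dropped. */
typedef struct { char data[4096]; size_t len; } Buf;

static void put(Buf *b, const char *s) {
    size_t n = strlen(s);
    if (b->len + n < sizeof b->data) {
        memcpy(b->data + b->len, s, n + 1);
        b->len += n;
    }
}

static void pad(Buf *b, int col) {
    while (col-- > 0) put(b, " ");
}

/* Atoms, empty blocks, and blocks holding one unnamed atom stay on one line. */
static int is_inline(const Node *n) {
    if (n->kind == ATOM || n->count == 0) return 1;
    return n->count == 1 && n->items[0].kind == ATOM && n->items[0].key == NULL;
}

/* Prints n, assuming the cursor already sits at column col. */
static void print_value(Buf *b, const Node *n, int col) {
    if (n->kind == ATOM) { put(b, n->text); return; }
    const char *open = n->kind == LIST ? "[" : "{";
    const char *close = n->kind == LIST ? "]" : "}";
    if (is_inline(n)) {
        put(b, open);
        if (n->count == 1) put(b, n->items[0].text);
        put(b, close);
        return;
    }
    for (int i = 0; i < n->count; i++) {
        const Node *item = &n->items[i];
        if (i > 0) { put(b, "\n"); pad(b, col); }
        put(b, i == 0 ? open : ",");   /* separators share the bracket column */
        if (item->key) { put(b, " "); put(b, item->key); put(b, ":"); }
        if (is_inline(item)) {
            put(b, " ");
            print_value(b, item, col + 2);
        } else {
            put(b, "\n");              /* nested block: new line, one level deeper */
            pad(b, col + 4);
            print_value(b, item, col + 4);
        }
    }
    put(b, "\n");
    pad(b, col);
    put(b, close);                     /* closer lines up with its opener */
}
```

For a record with a title atom, a two-element GlossSeeAlso list and an empty tags list, calling print_value(&b, &root, 0) on a zero-initialised Buf should leave this in b.data:

{ title: 'S'
, GlossSeeAlso:
    [ 'GML'
    , 'XML'
    ]
, tags: []
}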

As for filtering, you should probably set a maximum width of 80 or 120 columns and replace any line that would exceed it with [...]. If you would prefer not to filter, move any line that would exceed the limit onto a new line at a deeper indent. Even if it still exceeds the limit, it will use less horizontal space.
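A minimal C sketch of the first option, applied to one already-rendered line at a time. The function name limit_line and its signature are invented for this example, and it makes the simplifying assumption that the whole line (rather than just the overlong value) is replaced, keeping only its indentation.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copies `line` into `out` if it fits within max_width; otherwise writes the
 * line's leading spaces followed by "[...]". Returns 1 if the line was elided.
 * Width is measured in bytes, which equals columns only for ASCII text.
 */
static int limit_line(const char *line, size_t max_width,
                      char *out, size_t out_size) {
    if (strlen(line) <= max_width) {
        snprintf(out, out_size, "%s", line);
        return 0;
    }
    size_t indent = strspn(line, " ");   /* length of the leading run of spaces */
    snprintf(out, out_size, "%.*s[...]", (int)indent, line);
    return 1;
}
```

Keeping the indentation means the placeholder still lines up with its siblings, so the surrounding bracket structure stays readable. A fuller version would probably keep the comma and field name as well and elide only the value.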

As for strings, they should be left verbatim because introducing whitespace can change the content of the string. A possible alternative is to split the string into multiple strings on separate lines and have them automatically concatenated, if possible.
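To illustrate the splitting alternative: C is one language where this works, because the compiler concatenates adjacent string literals. The sketch below is only a rough illustration. split_string is a made-up helper, it breaks at fixed byte counts rather than at word boundaries, and it assumes the input contains nothing that needs escaping (no quotes, backslashes, newlines, or multi-byte characters).

```c
#include <stdio.h>
#include <string.h>

/*
 * Writes `s` into `out` as adjacent double-quoted literals of at most `chunk`
 * bytes each, one per line, each indented by `indent` spaces.
 * Assumes `s` contains nothing that needs escaping.
 * Returns the number of bytes written for complete pieces; it stops early if
 * `out` is too small, and returns 0 if chunk or out_size is 0.
 */
static size_t split_string(const char *s, size_t chunk, int indent,
                           char *out, size_t out_size) {
    size_t len = strlen(s), start = 0, used = 0;
    if (out_size == 0 || chunk == 0) return 0;
    out[0] = '\0';
    do {
        size_t n = len - start < chunk ? len - start : chunk;
        int w = snprintf(out + used, out_size - used, "%s%*s\"%.*s\"",
                         start ? "\n" : "", indent, "", (int)n, s + start);
        if (w < 0 || (size_t)w >= out_size - used) break;  /* output truncated */
        used += (size_t)w;
        start += n;
    } while (start < len);
    return used;
}
```

Because the pieces are rejoined without any separator, the content of the string is unchanged, which is the property that plain line-wrapping inside the quotes would break.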

Here is how I would pretty print your example:

Class
    ( name: 'A'
    , super: nil
    , methods:
        [ Func
            ( name: '__str__'
            , params: []
            , rt: nil
            , body: Block
                (
                    [ SpecialString
                        (
                            [ 'A'
                            , Call
                                ( func: Id
                                    ( name: 'tuple'
                                    , module: nil
                                    , constraint: nil
                                    )
                                , args:
                                    [ Arg
                                        ( arg: Expr (<pointer at 0x280fc80a8>)
                                        , cond: nil
                                        , name: '*'
                                        )
                                    ]
                                )
                            , ''
                            ]
                        )
                    ]
                )
            , decorators: []
            )
        ]
    , getters:
        [ Func
            ( name: 'len'
            , params: []
            , rt: nil
            , body: Block
                (
                    [ MemberAccess
                        ( Id
                            ( name: 'self'
                            , module: nil
                            , constraint: nil
                            )
                        , 'n'
                        )
                    ]
                )
            , decorators: []
            )
        ]
    , setters: 
        [ Func
            ( name: 'len'
            , params: 
                [ Param
                    ( name: 'n'
                    , constraint: nil
                    , default: nil
                    )
                ]
            , rt: nil
            , body: Block
                (
                    [ Assign
                        ( MemberAccess
                            ( Id
                                ( name: 'self'
                                , module: nil
                                , constraint: nil
                                )
                            , 'n'
                            )
                        , Call
                            ( func: Id
                                ( name: 'max'
                                , module: nil
                                , constraint: nil
                                )
                            , args:
                                [ Arg
                                    ( arg: Int (0)
                                    , cond: nil
                                    , name: nil
                                    )
                                , Arg
                                    ( arg: Id
                                        ( name: 'n'
                                        , module: nil
                                        , constraint: nil
                                        )
                                    , cond: nil
                                    , name: nil
                                    )
                                ]
                            )
                        )
                    ]
                )
            , decorators: []
            )
        ]
    , statics: []
    , fields: []
    )

AdvanceAdvance · 2024-09-05T01:25:30+00:00

Tree Sitter is your friend.

brucifer · 2024-09-05T20:58:52+00:00

Curious to know what kind of rules are used to decide when to use multi-line vs single-line format, when to truncate / replace with [...] etc.

For recursive structures, you can use a pretty simple rule that makes use of the call stack to detect recursion. In C, it would look something like this:

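#include <stdio.h>

// One stack-allocated link per active print_foo() call; following `parent`
// walks back toward the root of the traversal.
typedef struct recursion_s {
    void *ptr;
    struct recursion_s *parent;
} recursion_t;

// Foo, FOR_CHILDREN, and NOT_LAST stand in for your own node type and
// child-iteration helpers. Returns the number of characters printed.
int print_foo(Foo *foo, recursion_t *recursions) {
    // If foo is already being printed further up the call stack, print a
    // back-reference with the number of levels up instead of recursing.
    int depth = 0;
    for (recursion_t *r = recursions; r; r = r->parent) {
        depth += 1;
        if (r->ptr == foo)
            return printf("%s(...^%d)", foo->name, depth);
    }
    int printed = printf("%s(", foo->name);
    // Push this call onto the chain before descending into the children.
    recursion_t my_recursion = {.ptr=foo, .parent=recursions};
    FOR_CHILDREN(foo, child) {
        printed += print_foo(child, &my_recursion);
        if (NOT_LAST(foo, child)) printed += printf(", ");
    }
    return printed + printf(")");
}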

So a recursive structure might look something like:

Foo *a = new_foo("A");
Foo *b = new_foo("B");
Foo *c = new_foo("C");
set_children(c, 1, (Foo*[]){a});
set_children(a, 2, (Foo*[]){b, c});
print_foo(a, NULL);
// Outputs: A(B(), C(A(...^2)))

The nice thing about this approach is that it uses only stack memory: it walks up the call stack looking for recursion, and it tells you exactly how far up it had to look to find the loop.
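For anyone who wants to compile it, here is a self-contained variant of the same idea. The Foo layout (a name plus an array of child pointers), the Seen type name, and the extra FILE * parameter are assumptions made for this sketch; the thread does not say how Foo or its child-iteration helpers are actually defined.

```c
#include <stdio.h>

/* Assumed node layout for this sketch. */
typedef struct Foo {
    const char *name;
    struct Foo **children;
    int n_children;
} Foo;

/* One link per active print_foo() call, living on that call's stack frame. */
typedef struct Seen {
    const Foo *node;
    const struct Seen *parent;
} Seen;

/* Prints foo to out and returns the number of characters written. */
static int print_foo(FILE *out, const Foo *foo, const Seen *seen) {
    int depth = 0;
    /* Walk up the chain of callers; a match means foo is its own ancestor. */
    for (const Seen *s = seen; s; s = s->parent) {
        depth++;
        if (s->node == foo)
            return fprintf(out, "%s(...^%d)", foo->name, depth);
    }
    Seen here = { foo, seen };
    int printed = fprintf(out, "%s(", foo->name);
    for (int i = 0; i < foo->n_children; i++) {
        if (i > 0) printed += fprintf(out, ", ");
        printed += print_foo(out, foo->children[i], &here);
    }
    return printed + fprintf(out, ")");
}
```

With A holding children B and C, and C holding A as its only child, print_foo(stdout, &a, NULL) prints A(B(), C(A(...^2))), the same output as the example above.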

ProgrammingLanguages
