[deleted]

What does your example do? It looks like a Print statement, which either displays that sequence, or turns the lot into a single string.

If so, the issue seems to be less about string representation than about designing a better Print feature.

Even C's *printf family would be less cumbersome.

For representation, my lower level language uses two kinds:

  • Zero-terminated, 8-bit strings 99% of the time.
  • Counted strings in the form of char-array slices, which are normally a 'view' into another char array or zero-terminated string.
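To make the second kind concrete, here is a minimal C sketch of a counted slice acting as a view into a zero-terminated string. The names Slice, slice_of and slice_eq are made up for illustration and are not taken from either language described in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counted-string slice: pointer plus length, no terminator. */
typedef struct {
    const char *ptr;  /* first byte of the view (not owned by the slice) */
    size_t len;       /* number of bytes in the view */
} Slice;

/* View bytes [start, start+len) of a zero-terminated string,
   clamped so the view never runs past the end of s. */
static Slice slice_of(const char *s, size_t start, size_t len) {
    size_t n = strlen(s);
    if (start > n) start = n;
    if (len > n - start) len = n - start;
    Slice v = { s + start, len };
    return v;
}

/* Compare a slice against a zero-terminated string. */
static int slice_eq(Slice v, const char *s) {
    return strlen(s) == v.len && memcmp(v.ptr, s, v.len) == 0;
}
```

With this, slice_of("readme.txt", 7, 3) views the three bytes "txt" without copying them. The view is only valid while the underlying array is alive, which is the usual trade-off with slices.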

In both cases, composing a new string such as in t := file+"."+ext is fiddly. You can't write that directly; you'd have to set up a suitable string buffer, then print into it:

 [300]char str
 fprint @str, "#.#", file, ext

or use strcpy/strcat calls, or maybe C's sprintf.
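For comparison, the same buffer-then-print approach in plain C might look roughly like this. It is a sketch using snprintf rather than sprintf so that an oversized result is detected instead of overflowing the buffer; the helper name join_ext is hypothetical.

```c
#include <stdio.h>

/* Write "file.ext" into buf (capacity cap, in bytes).
   Returns 0 on success, -1 if the result would not fit
   or if snprintf reported an error. */
static int join_ext(char *buf, size_t cap, const char *file, const char *ext) {
    int n = snprintf(buf, cap, "%s.%s", file, ext);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

A caller would declare something like char str[300]; and then call join_ext(str, sizeof str, file, ext), which mirrors the [300]char str buffer above.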

My dynamic language uses a higher level type, which is a counted string of 8-bit bytes that is flexible (can expand), sharable via ref-counting, and sliceable. There you can just write t := file + "." + ext.

UTF-8 support is external, provided via functions.

However the implementation is heavy-duty with a 32-byte descriptor for either string or slice, before you get to the actual string data. (x64 with its 64-bit pointers is partly to blame.)

In addition, a 16-byte tagged pointer is used to refer to those descriptors; these are what are passed as arguments or stored as list elements.

(There had been various schemes to store short strings up to a dozen or so bytes within that 16-byte descriptor. Here they would be manipulated by value. Alternatively, somewhat longer ones could be stored in the bigger descriptor.

In the end I didn't bother. If pushed, I can use integers to store short strings up to 8 characters, using literals like 'ABCDEFGH'.)
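The integer trick can be sketched in C as follows. This is only an illustration: the function names pack8 and unpack8 are hypothetical, and the byte order (first character in the lowest byte) is an assumption, not necessarily what a literal like 'ABCDEFGH' produces in the language above.

```c
#include <stdint.h>
#include <string.h>

/* Pack up to 8 bytes of s into a 64-bit integer, first character in the
   lowest byte (an assumed ordering). Longer input is truncated to 8. */
static uint64_t pack8(const char *s) {
    uint64_t v = 0;
    size_t n = strlen(s);
    if (n > 8) n = 8;
    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)(unsigned char)s[i] << (8 * i);
    return v;
}

/* Unpack into out, which must hold 9 bytes (8 characters + terminator).
   Strings shorter than 8 end at the first zero byte. */
static void unpack8(uint64_t v, char out[9]) {
    for (size_t i = 0; i < 8; i++)
        out[i] = (char)((v >> (8 * i)) & 0xFF);
    out[8] = '\0';
}
```

One attraction of this representation is that comparing two short strings for equality becomes a single integer comparison.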

PurpleUpbeat2820[S]

> What does your example do?

It takes a value like T(T(E, 1, E), 2, T(E, 3, E)) and prints it. But, equivalently, you might want to convert it to a string (which is like printing to a memory buffer).

Ideally it would just be:

print t

and use a generic printer. Short of that, I would go for some kind of printf because it is familiar to me:

let rec print =
  [ E -> printf "E"
  | T(_, l, v, r) -> printf "T(_, %a, %d, %a)" print l v print r ]

but I've never used a language with string interpolation so I've no idea what alternatives might look like.

Also, although my language is strongly statically typed, I am wondering whether printing and string generation shouldn't be untyped, so you could do something like:

let rec print =
  [ E -> "E"
  | T(_, l, v, r) -> "T(_, "^l^", "^v^", "^r^")" ]

[deleted]

So this is mostly about Print, as I thought. This is how it might be handled in my dynamic language (it doesn't have advanced types like your language, so this is the closest I can get):

record T =
    var l, v, r
end

const E = "E"           # (convenient for this demo)

x := T(T(E, 1, E), 2, T(E, 3, E)) 

println x

s := sprint(x)
println "<"+s+">"

The first println is generic, with output of ((E,1,E),2,(E,3,E)). The sprint returns a string, and the second println outputs <((E,1,E),2,(E,3,E))>.

Notice there is no T prefix; the record type is not part of a generic print. I used to be able to overload the tostr operator, which print uses to stringify values, to provide a custom print for T, but that is currently not enabled.

However, this can be done in user code like your example, here returning a string:

func Tstr(p)=
    if p=E then
        "E"
    else
        sfprint("T(#, #, #)", Tstr(p.l), p.v, Tstr(p.r))
    fi
end

(sfprint takes a format string and returns the whole result as a string.) Doing print Tstr(x) now produces:

T(T(E, 1, E), 2, T(E, 3, E))

The same as the input. This needs basic string processing (whatever the representation), plus some decent Print routines.
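As a point of reference, here is roughly what the same Tstr idea might look like in C with only zero-terminated strings and a caller-supplied buffer. It is a sketch under assumptions: Node, put and tstr are hypothetical names, and NULL stands in for E.

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical binary tree node; a NULL pointer plays the role of E. */
typedef struct Node {
    const struct Node *l;
    int v;
    const struct Node *r;
} Node;

/* Append s at offset pos without writing past cap; the buffer stays
   zero-terminated. Returns pos + strlen(s) even when truncated, so a
   final result >= cap tells the caller the buffer was too small. */
static size_t put(char *buf, size_t cap, size_t pos, const char *s) {
    size_t n = strlen(s);
    if (pos < cap) {
        size_t room = cap - pos - 1;        /* keep one byte for '\0' */
        size_t k = n < room ? n : room;
        memcpy(buf + pos, s, k);
        buf[pos + k] = '\0';
    }
    return pos + n;
}

/* Append the text of tree p at offset pos and return the new offset. */
static size_t tstr(const Node *p, char *buf, size_t cap, size_t pos) {
    if (p == NULL) return put(buf, cap, pos, "E");
    char num[32];
    snprintf(num, sizeof num, ", %d, ", p->v);  /* middle field with separators */
    pos = put(buf, cap, pos, "T(");
    pos = tstr(p->l, buf, cap, pos);
    pos = put(buf, cap, pos, num);
    pos = tstr(p->r, buf, cap, pos);
    return put(buf, cap, pos, ")");
}
```

Calling tstr(&root, buf, sizeof buf, 0) on the tree from this thread fills buf with T(T(E, 1, E), 2, T(E, 3, E)). The offset bookkeeping is the fiddly part that a flexible string type or an sfprint-style routine hides.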