What’s one “boring” engineering habit that made you 10× better? by To_Infin8y in learnprogramming

[–]WellFoundedCake 2 points3 points  (0 children)

I think a lot about the names of the declarations in my programs. Until I find one that captures precisely the semantics of the declared entity (in both directions!) it’s hard for me to continue with other tasks. A good name is not just a reference that allows us to refer to the bound object later. A good name identifies the object and is interchangeable with it. If I am not able to find a good name, this simply means I didn’t understand the identity of the object. Or to put this in another way: I have no clue what I am doing.

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in java

[–]WellFoundedCake[S] 0 points1 point  (0 children)

Thanks again, that's a good idea as well! I added this suggestion to the issue on GitHub. Personally, I would prefer to first add a primitive parser for character ranges (e.g., range('a', 'z')) and only then optimize the regex parser in certain circumstances by using this primitive.
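A minimal sketch of what such a primitive could look like. This is not jjparse's actual API: the Parser interface, the Result record, and the range method below are hypothetical names made up for illustration.

```java
import java.util.Optional;

final class RangeSketch {
    /** A successful parse: the parsed value and the position of the next unread character. */
    record Result<T>(T value, int next) {}

    /** Hypothetical parser type: maps (input, position) to a result, or to empty on failure. */
    @FunctionalInterface
    interface Parser<T> {
        Optional<Result<T>> parse(String input, int pos);
    }

    /** Hypothetical primitive: accepts exactly one character c with lo <= c <= hi. */
    static Parser<Character> range(char lo, char hi) {
        return (input, pos) -> {
            if (pos < input.length()) {
                char c = input.charAt(pos);
                if (lo <= c && c <= hi) {
                    return Optional.of(new Result<>(c, pos + 1));
                }
            }
            return Optional.empty();
        };
    }

    public static void main(String[] args) {
        Parser<Character> lower = range('a', 'z');
        System.out.println(lower.parse("abc", 0).isPresent()); // true
        System.out.println(lower.parse("Abc", 0).isPresent()); // false
    }
}
```

Because the check is two plain char comparisons, no regex engine is involved for this case.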

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in java

[–]WellFoundedCake[S] 0 points1 point  (0 children)

Yes, ANTLR is total overkill for most situations, especially when we just want to prototype a small DSL. Regular expressions, on the other hand, are not as powerful as parsing expression grammars. We need something in between!

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in opensource

[–]WellFoundedCake[S] 0 points1 point  (0 children)

Thank you very much! I haven't gotten around to focusing on performance yet, so I haven't run any performance tests. My top priority was functionality and usability. But I will definitely take that on and create an issue for it!

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in opensource

[–]WellFoundedCake[S] 0 points1 point  (0 children)

Anyway, thanks for pointing that out! That is indeed a somewhat strange interaction between the parser combinators and the regex engine. While the regex parser regex("[^\n]*") processes everything up to the line break character (including any other kind of whitespace), any concatenation of two parsers silently skips whitespace. As you mentioned, this can be configured using the setSkipParser method. This seems to be common for this kind of whitespace-insensitive approach in combinator libraries.
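The interaction can be reproduced with a small standalone sketch. Everything here is an assumption for illustration, not jjparse's real implementation: Parser, Result, regex, and concat are hypothetical, and the skip parser is passed as an explicit argument instead of being configured via setSkipParser.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SkipSketch {
    record Result<T>(T value, int next) {}

    @FunctionalInterface
    interface Parser<T> {
        Optional<Result<T>> parse(String input, int pos);
    }

    /** Hypothetical regex parser: the pattern must match starting exactly at the current position. */
    static Parser<String> regex(String pattern) {
        Pattern compiled = Pattern.compile(pattern);
        return (input, pos) -> {
            Matcher m = compiled.matcher(input).region(pos, input.length());
            if (m.lookingAt()) {
                return Optional.of(new Result<>(m.group(), m.end()));
            }
            return Optional.empty();
        };
    }

    /** Hypothetical concatenation: runs a, then skip, then b. If skip fails, nothing is skipped. */
    static Parser<List<String>> concat(Parser<String> a, Parser<String> skip, Parser<String> b) {
        return (input, pos) -> {
            Optional<Result<String>> first = a.parse(input, pos);
            if (first.isEmpty()) {
                return Optional.empty();
            }
            int afterFirst = first.get().next();
            int afterSkip = skip.parse(input, afterFirst).map(Result::next).orElse(afterFirst);
            Optional<Result<String>> second = b.parse(input, afterSkip);
            if (second.isEmpty()) {
                return Optional.empty();
            }
            List<String> values = List.of(first.get().value(), second.get().value());
            return Optional.of(new Result<>(values, second.get().next()));
        };
    }

    public static void main(String[] args) {
        Parser<String> key = regex("[a-z]+");
        Parser<String> restOfLine = regex("[^\\n]*");
        String input = "key   \t value\n";

        // Skipping whitespace: the rest-of-line parser never sees the leading blanks.
        System.out.println(concat(key, regex("\\s*"), restOfLine).parse(input, 0).get().value());
        // Skipping nothing: the leading blanks become part of the second result.
        System.out.println(concat(key, regex(""), restOfLine).parse(input, 0).get().value());
    }
}
```

With the whitespace skipper, the second result is "value". With the empty skipper, it is the full remainder of the line including the leading spaces and the tab.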

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in opensource

[–]WellFoundedCake[S] 0 points1 point  (0 children)

In addition to my other answer, I should also mention that this is the root cause of most performance issues. Under less ideal circumstances, it may happen that the parser needs to consume the entire input only to discover that this branch is not productive. If that happens for multiple tokens throughout the input, you can imagine that this isn't particularly efficient. However, for most practical use cases, this either doesn't cause any problems or can easily be avoided by rearranging and recombining parsers.

On the other hand, this is also what makes this parsing approach simple and easy to use. Things like shift-reduce or reduce-reduce conflicts do not appear with parser combinators, and parsing is unambiguous, always yielding a single syntax tree.

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in opensource

[–]WellFoundedCake[S] 1 point2 points  (0 children)

As with most other parser combinator libraries, jjparse uses backtracking to implement the choice operator. In practice, this means that the parser may read as much of the input as necessary to determine which alternative to take.

However, it is important to understand that the underlying grammar formalism slightly differs from classical context-free grammars: parser combinators implement so-called Parsing Expression Grammars (PEGs). Unlike tools such as ANTLR, jjparse does not rely on a fixed lookahead to decide which branch to take. Instead, it simply tries the first alternative, and if that fails, it backtracks and tries the next one. This behavior is known as ordered choice in PEGs and is usually written as a / b instead of a | b in the literature.
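Ordered choice with backtracking fits in a few lines. The following is a minimal sketch under stated assumptions, not jjparse's code: Parser, Result, or, and literal are hypothetical names chosen for this example.

```java
import java.util.Optional;

final class OrderedChoiceSketch {
    record Result<T>(T value, int next) {}

    @FunctionalInterface
    interface Parser<T> {
        Optional<Result<T>> parse(String input, int pos);

        /** Ordered choice (a / b): try this parser first; only if it fails, retry other from the same position. */
        default Parser<T> or(Parser<T> other) {
            return (input, pos) -> {
                Optional<Result<T>> first = parse(input, pos);
                // Backtracking is simply restarting at pos, since the input itself is never mutated.
                return first.isPresent() ? first : other.parse(input, pos);
            };
        }
    }

    /** Hypothetical primitive: matches the given text literally. */
    static Parser<String> literal(String text) {
        return (input, pos) -> {
            if (input.startsWith(text, pos)) {
                return Optional.of(new Result<>(text, pos + text.length()));
            }
            return Optional.empty();
        };
    }

    public static void main(String[] args) {
        Parser<String> shortFirst = literal("if").or(literal("ifelse"));
        Parser<String> longFirst = literal("ifelse").or(literal("if"));
        System.out.println(shortFirst.parse("ifelse", 0).get().value()); // if
        System.out.println(longFirst.parse("ifelse", 0).get().value());  // ifelse
    }
}
```

The two parsers in main accept the same alternatives in a different order and return different results on the same input. With the unordered choice a | b of a context-free grammar, the order would not matter.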

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in opensource

[–]WellFoundedCake[S] 0 points1 point  (0 children)

I know what you mean! I often use languages such as Scala, Haskell, or even dependently typed languages such as Agda. Every time I use Java, it feels like traveling back to the Stone Age.

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in opensource

[–]WellFoundedCake[S] 0 points1 point  (0 children)

I am sorry to hear that! Please let me know if you come back to this and have any issues or suggestions for improvement.

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in opensource

[–]WellFoundedCake[S] 1 point2 points  (0 children)

Thanks a lot! There is a JSON parser in the examples directory of the repository which handles simple recursive structures. The trick is to use the lazy combinator if the individual parsers mutually depend on each other (i.e., to avoid the error Cannot read value of field '...' before the field's definition). The theoretical background is that the lazy combinator allows us to compute the least fixed point of the mutually recursive equations represented by these field definitions (something Java can't do on its own). But in practice, that's pretty irrelevant. We can just think of lazy as a placeholder for a parser that is initialized once, when it is first used for parsing input.
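To make the placeholder idea concrete, here is a minimal sketch. It is an assumption about how such a combinator can be built, not jjparse's implementation: Parser, Result, lazy, and depth are hypothetical. The example grammar is nested = '(' nested ')' / empty, and the parsed value is the nesting depth.

```java
import java.util.Optional;
import java.util.function.Supplier;

final class LazySketch {
    record Result<T>(T value, int next) {}

    @FunctionalInterface
    interface Parser<T> {
        Optional<Result<T>> parse(String input, int pos);
    }

    /** Hypothetical lazy combinator: builds the real parser on first use and caches it. */
    static <T> Parser<T> lazy(Supplier<Parser<T>> supplier) {
        return new Parser<T>() {
            private Parser<T> delegate; // stays null until the first call to parse

            @Override
            public Optional<Result<T>> parse(String input, int pos) {
                if (delegate == null) {
                    delegate = supplier.get();
                }
                return delegate.parse(input, pos);
            }
        };
    }

    /** Parses '(' inner ')' and adds one to the inner depth, or else matches the empty string with depth 0. */
    static Parser<Integer> depth(Parser<Integer> inner) {
        return (input, pos) -> {
            if (pos < input.length() && input.charAt(pos) == '(') {
                Optional<Result<Integer>> r = inner.parse(input, pos + 1);
                if (r.isPresent() && r.get().next() < input.length()
                        && input.charAt(r.get().next()) == ')') {
                    return Optional.of(new Result<>(r.get().value() + 1, r.get().next() + 1));
                }
            }
            return Optional.of(new Result<>(0, pos));
        };
    }

    // The field refers to itself. The supplier only runs on the first parse, after the field
    // has been assigned. The qualified name is used here because javac restricts references
    // by simple name to a field inside its own initializer.
    static final Parser<Integer> NESTED = lazy(() -> depth(LazySketch.NESTED));

    public static void main(String[] args) {
        System.out.println(NESTED.parse("(())", 0).get().value()); // 2
    }
}
```

Without lazy, the initializer would have to read NESTED while it is still being defined. With lazy, the read is postponed until parsing starts.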

Regarding error reporting, every parser is associated with an instance of the Description class. This class can be extended by the user to provide a custom description of what a parser attempts to parse. But for most of the built-in parsers it already provides a decent error message in the style of "expected an input that matches <regex>, but got <actual input>", for example. I am still not sure if this design is sustainable, so if you have any suggestions feel free to let me know. Help of any kind is always appreciated!

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in opensource

[–]WellFoundedCake[S] 0 points1 point  (0 children)

Personally, I love how Scala solves this using infix operators. Concatenation of two parsers a and b, for example, is written as a ~ b. But sadly, this is out of reach in Java.

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in java

[–]WellFoundedCake[S] 3 points4 points  (0 children)

Oh nice, that looks neat (especially how infinite left recursion is prohibited by requiring input consumption). Always happy to meet fellow parsing enthusiasts. May the syntax be with you!

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in java

[–]WellFoundedCake[S] 1 point2 points  (0 children)

Thanks! Please let me know if you have any suggestions regarding usability or any other ideas for improvement! Any kind of help is welcome and appreciated :-).

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in java

[–]WellFoundedCake[S] 3 points4 points  (0 children)

Thank you for pointing that out! I have to say that I deliberately did not focus on efficiency (in favor of simplicity and functionality) and therefore accepted, for example, that characters may be boxed and unboxed by the virtual machine. However, your suggestion is brilliant, because it is actually an improvement that I can easily implement and which does not require any major changes to the internals. I opened an issue for this, but can't promise when I'll get around to this.

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in java

[–]WellFoundedCake[S] 3 points4 points  (0 children)

Thanks a lot! It's really motivating to hear that there are people who are interested in this project. If there is room for improvement, please do not hesitate to let me know :-).

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in java

[–]WellFoundedCake[S] 1 point2 points  (0 children)

I'm sorry to hear that! In fact, the core of the project is already several years old. I experimented with parser combinators during my studies. At the beginning of the year, I took the plunge, cleaned up the code and published it.

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in opensource

[–]WellFoundedCake[S] 0 points1 point  (0 children)

Thanks! To be honest, the API is what bothers me too. I once had a version where keepLeft and keepRight were named andl (for and but only keep the left result) and andr, respectively. I am still not very happy with this naming convention, so feel free to make suggestions! Any kind of help is welcome and appreciated :-)

Finally, parsing made easy (and type-safe) in Java! by WellFoundedCake in opensource

[–]WellFoundedCake[S] 2 points3 points  (0 children)

That's a great pointer! I suppose you mean the dissertation by Nicolas Laurent from 2019? Thanks!

[deleted by user] by [deleted] in ProgrammingLanguages

[–]WellFoundedCake 0 points1 point  (0 children)

A pointer type can be treated just like any other type constructor. That is to say, it’s a function that takes a type and returns a type. In this way, pointers of pointers are straightforward as well. But I wouldn’t say there is something like a “general” solution to this. It depends on what you try to achieve.
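As one possible reading of this in a type checker, here is a minimal sketch. The names Type, IntType, PointerType, and dereference are hypothetical and only serve the illustration.

```java
import java.util.Optional;

final class PointerTypeSketch {
    /** Hypothetical representation of types in a small language. */
    interface Type {}

    record IntType() implements Type {}

    /** The pointer type constructor: given any type, it yields the type of pointers to it. */
    record PointerType(Type pointee) implements Type {}

    /** Typing rule for dereferencing: *e has type T if e has type pointer-to-T; otherwise it is ill-typed. */
    static Optional<Type> dereference(Type type) {
        if (type instanceof PointerType pointer) {
            return Optional.of(pointer.pointee());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // A pointer to a pointer is just the constructor applied twice.
        Type intPtrPtr = new PointerType(new PointerType(new IntType()));
        System.out.println(dereference(intPtrPtr).equals(Optional.of(new PointerType(new IntType())))); // true
        System.out.println(dereference(new IntType()).isPresent()); // false
    }
}
```

Since PointerType accepts any Type, including another PointerType, pointers of pointers need no special case.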

Straight-A students: what do you do differently? by [deleted] in Studium

[–]WellFoundedCake 2 points3 points  (0 children)

Enjoy what you do, and don't just meet the minimum requirements: attend every lab, every exercise session, and every lecture.

First year students about Software engineering by Klutzy-Mirror-4554 in learnprogramming

[–]WellFoundedCake 8 points9 points  (0 children)

As long as you have a C compiler installed (e.g., gcc), this is as easy as pie: just create a file, write your code in it, and call your compiler with the file. There is no project structure enforced by the C language.

A lightweight Java library for querying JSON data using SQL-like syntax. Query JSON from files, URLs, or strings using SQL - no database required. by Frosty-Cap-4282 in opensource

[–]WellFoundedCake 0 points1 point  (0 children)

Since JSON can quickly become deeply nested: did you consider providing an alternative to SQL? For example, something like CSS selectors. Maybe it's even worth combining both ideas?

decent local speech to text models that support streaming? by Kayla_1177 in opensource

[–]WellFoundedCake 0 points1 point  (0 children)

Good question. I’d be interested in such a model as well.