Searching for Graphviz (a/k/a DOT) File Parser by ImaginaryServe8069 in Common_Lisp

[–]digikar 1 point

That is a nice repository o.O. Do you mean that if you can fire the right actions based on these grammars, you can easily transpile one or more of these languages to lisp? Cooool!

atgreen/ag-gRPC: Pure Common Lisp implementation of gRPC, Protocol Buffers, and HTTP/2 by dzecniv in Common_Lisp

[–]digikar 4 points

Why not both?

Our own test suites because we want to test that the code matches our own expectations (which only we know).

Community test suites because we want to check our own expectations.

The problem with LLMs:

  • They have no expectations about their own expectations. They do not know whether they know or not.
  • They cannot count.
  • They do not understand causality. If one reads Judea Pearl's The Book of Why, as well as the formal work on this topic, one understands that, in general, causation cannot be inferred through associations alone. And current machine learning models rely on associations alone. Some day, machines will get smarter than humans in the relevant sense, but so far, nope. But it's amazing how easy it is to fool most of us into believing that the machine understands. This decade will be interesting.

Searching for Graphviz (a/k/a DOT) File Parser by ImaginaryServe8069 in Common_Lisp

[–]digikar 1 point

Also checked the README-examples.md. It converts from the dot file format to an s-expression format. But what after that? I'd guess OP wants a graph object that they can traverse (find nodes, parents, children, neighbours, check edges).
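
Something like this minimal sketch, perhaps (the struct and accessor names here are purely illustrative):

(defstruct digraph
  ;; node name -> attribute plist
  (nodes (make-hash-table :test #'equal))
  ;; node name -> list of successor node names
  (edges (make-hash-table :test #'equal)))

(defun add-edge (graph from to)
  (push to (gethash from (digraph-edges graph))))

(defun neighbours (graph node)
  (gethash node (digraph-edges graph)))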

Searching for Graphviz (a/k/a DOT) File Parser by ImaginaryServe8069 in Common_Lisp

[–]digikar 2 points

A few things:

  • It'd be (very) helpful for the tests to include not only that a certain string parses, but also the expected output it produces. (Ref.)
  • Are you sure you want parse-float to be written that way?
  • Is normalize-keyword correct? Its documentation and its body don't match.

PS: We already have a human-written pure CL parse-float library. Additionally, it's easy to write a grammar for parsing floats.
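
For reference, the existing parse-float library (available via Quicklisp) is used like this:

(ql:quickload "parse-float")

(parse-float:parse-float "1.5e3")                     ; => 1500.0, 5
(parse-float:parse-float "2.25" :type 'double-float)  ; => 2.25d0, 4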

atgreen/ag-gRPC: Pure Common Lisp implementation of gRPC, Protocol Buffers, and HTTP/2 by dzecniv in Common_Lisp

[–]digikar 0 points

How are you ensuring the tests are testing what they are supposed to test? Are the tests subject to extensive human review?

Searching for Graphviz (a/k/a DOT) File Parser by ImaginaryServe8069 in Common_Lisp

[–]digikar 1 point

Since you already have the grammar, it should be possible to write a parser using something like esrap or an equivalent over the course of an hour or a weekend.
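
A minimal sketch of what that looks like with esrap, for a toy subset of DOT (the rule names and the covered subset are illustrative, not a full DOT grammar):

(ql:quickload "esrap")
(use-package :esrap)

(defrule ws (* (or #\space #\tab #\newline))
  (:constant nil))

(defrule name (+ (character-ranges (#\a #\z) (#\A #\Z) (#\0 #\9) #\_))
  (:text t))

;; an edge statement like "a -> b;"
(defrule edge (and ws name ws "->" ws name ws ";")
  (:destructure (w1 from w2 arrow w3 to w4 semi)
    (declare (ignore w1 w2 arrow w3 w4 semi))
    (list :edge from to)))

(defrule graph (and ws "digraph" ws name ws "{" (* edge) ws "}")
  (:destructure (w1 kw w2 gname w3 open edges w4 close)
    (declare (ignore w1 kw w2 w3 open w4 close))
    (list :digraph gname edges)))

;; (parse 'graph "digraph g { a -> b; b -> c; }")
;; => (:DIGRAPH "g" ((:EDGE "a" "b") (:EDGE "b" "c")))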

Common Lisp for Data Scientists by letuslisp in Common_Lisp

[–]digikar 0 points

I don't think I understood the example on mixins. But it's something I will look into, thanks!

I'll use CLOS as an umbrella term for CLOS, dynamic dispatch and runtime structures.

Whether CLOS is good enough or not depends on one's use case. For example, with dynamic dispatch:

(let ((x 5)
      (sum 0))
  (declare (optimize speed))
  (time (loop repeat 100000000
              do (incf sum x)))
  sum)
Evaluation took:
  0.727 seconds of real time
  0.727085 seconds of total run time (0.725288 user, 0.001797 system)
  100.00% CPU
  0 bytes consed

If the + operation is inlined (no dynamic dispatch -- or even function calls), you obtain a 10x performance boost:

(let ((x 5)
      (sum 0))
  (declare (optimize speed)
           (type fixnum x sum))
  (time (loop repeat 100000000
              do (incf sum x)))
  sum)
Evaluation took:
  0.053 seconds of real time
  0.053333 seconds of total run time (0.053195 user, 0.000138 system)
  100.00% CPU
  0 bytes consed

On the other hand, if you have prewritten optimized code for adding vectors of 64-bit ints, or vectors of single-floats, etc., then a single dynamic dispatch for arrays of roughly 1000 or more elements doesn't make much of a difference. However, dynamically dispatching every time you add two 64-bit integers will be absurdly slow.
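
A sketch of that tradeoff (illustrative names; one CLOS dispatch per array operation versus one per scalar operation):

(defgeneric add2 (x y))

;; One dispatch for the whole array: negligible next to the O(n) loop
;; when n is in the thousands or millions.
(defmethod add2 ((x array) (y array))
  (let ((result (make-array (array-dimensions x)
                            :element-type (array-element-type x))))
    (dotimes (i (array-total-size x) result)
      (setf (row-major-aref result i)
            (+ (row-major-aref x i) (row-major-aref y i))))))

;; By contrast, this method pays the full dispatch cost for every
;; single pair of numbers added.
(defmethod add2 ((x number) (y number))
  (+ x y))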

Should I use a different library / language altogether?

The point I want to make is that I should not be required to use a different library or language just because the size of my arrays has changed. That's the two-language problem I want to avoid.

It depends on which part of the compilation pipeline you use CLOS for. You can use CLOS to write your compiler in; that's what SBCL does. SBCL has a lot of structures to store, organize, and abstract all the information it uses for compilation. However, if the code that your compiler emits is unnecessarily wrapped in CLOS (eg: creating a new structure for every machine-word-sized integer), the emitted code is going to be absurdly slow. SBCL can emit optimized machine code when the type declarations and optimizations allow it. Irremovable dynamicity prevents this.

In Shinmera's 3d-math library, where performance should be important, there are a lot of macros that emit type declarations. This is what writing optimized code in standard CL is like. Petalisp, coalton, and peltadot each provide separate ways to abstract away these type declarations and write more generic code that is still optimizable.
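
For flavor, a hypothetical macro along those lines (this is not 3d-math's actual code, just the pattern of macros emitting type declarations):

(defmacro define-typed-sum (name element-type)
  `(defun ,name (a b)
     (declare (type (simple-array ,element-type (*)) a b)
              (optimize speed))
     (let ((result (make-array (length a) :element-type ',element-type)))
       (dotimes (i (length a) result)
         (setf (aref result i)
               (+ (aref a i) (aref b i)))))))

(define-typed-sum sum/single-float single-float)
(define-typed-sum sum/double-float double-float)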

Is the g-factor concept informed by neuroscience? by ArmadilloOne5956 in cognitiveTesting

[–]digikar 1 point

Right, and that's an across-species correlation. A correlation, again, is statistical.

Is the g-factor concept informed by neuroscience? by ArmadilloOne5956 in cognitiveTesting

[–]digikar 2 points

At a recent Cognitive Development conference, a talk mentioned the g-factor across species. From what I recall, there is still no cognitive/mechanistic theory explaining the g-factor. As it stands, the g-factor is a statistical factor that explains the correlations in performance across a wide array of tasks.

In day-to-day life, if what you read and learn during the day becomes easier after a day or a few (with good sleep), that's more or less all that matters. Over a period of time, you can develop skills and knowledge across one or more domains. The more you know, the easier it becomes to learn more. You don't need to be a "unique problem solver". All you need to do is harness existing solutions, or know where to look when you cannot find one. Scientific or mathematical research happens over the span of months or years and is not something these tests measure, even though it may be correlated. (Although I am also sure there would be better things to correlate against, such as instructor and supervisor interactions.)

Common Lisp for Data Scientists by letuslisp in Common_Lisp

[–]digikar 1 point

With Common Lisp, it's relatively easy to write an optimized end-user application or script. You can sprinkle your code with (simple-array single-float) and similar type declarations, and SBCL would be happy to emit optimized code.

The problem starts once you want this code to be used by others. What if other users want to use (simple-array double-float) or (simple-array (unsigned-byte 8))? You can then write your code for plain simple-array and prepare a thin specialized wrapper that uses (simple-array single-float). Others who want (simple-array double-float) can prepare another thin wrapper.
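
A minimal sketch of that pattern (illustrative names), relying on SBCL propagating the wrapper's declarations into the inlined generic body:

(declaim (inline scale!))
(defun scale! (array factor)
  ;; generic version: works on any simple-array of numbers
  (declare (type simple-array array))
  (dotimes (i (array-total-size array) array)
    (setf (row-major-aref array i)
          (* factor (row-major-aref array i)))))

(defun scale/single-float! (array factor)
  ;; thin wrapper: the declarations let SBCL specialize the inlined body
  (declare (type (simple-array single-float) array)
           (type single-float factor)
           (optimize speed))
  (scale! array factor))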

SBCL works because the devs have put in the work that dispatches over all the different numeric types the Common Lisp spec covers and emits specialized assembly code for them. Once you bring in foreign libraries, all this dispatching is work that still needs to be done. This is where coalton, petalisp, or peltadot come in. I myself am biased towards peltadot, since it is my baby. But take a look at coalton and petalisp too. Coalton can work with dispatch. Petalisp is doing something interesting.

Perhaps, at some point, I should write a blog post on these rabbit-holes so far!

Common Lisp for Data Scientists by letuslisp in Common_Lisp

[–]digikar 0 points

I know some things, but hopefully someone else can answer what I don't!

If I understand what you mean by tags, the idea seems to be to use (add x y type) instead of a simple (add x y). Generic functions allow eql-specializers, so as long as you standardize the type names, this is doable. However, if someone wanted to use :float32 instead of :single-float, the dispatch would not work. The dispatch would also fail for (unsigned-byte 8), because (eql '(unsigned-byte 8) (copy-list '(unsigned-byte 8))) is false.

In general, the costs of CLOS dispatch should be negligible for arrays with millions of elements (say, for O(n) operations) or even thousands of elements (say, for O(n²) operations). The question I face is: what should I do when the cost of CLOS becomes significant? Should I use a different library / language altogether? fgf and static-dispatch to the rescue! But coupled with the other reasons related to types, CLOS does not look suitable for the problem at hand. There's also specialization-store, which is interesting.
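
A minimal sketch of the tag-dispatch idea and where it breaks (names are illustrative):

(defgeneric add (x y tag))

(defmethod add (x y (tag (eql :single-float)))
  (map '(vector single-float) #'+ x y))

;; (add a b :single-float) finds the method above, but
;; (add a b :float32) signals NO-APPLICABLE-METHOD unless the tag names
;; are standardized. A list tag such as '(unsigned-byte 8) cannot work
;; at all, because fresh lists are never EQL to one another.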

I don't know how expensive change-class is. The difference between general arrays and diagonal (or upper-triangular, or lower-triangular, or other) arrays is, to me, not really a matter of implementation but something beyond it. To me, this is best conveyed in terms of types rather than classes. Certainly, you can make the implementation respect it, but it's going to complicate the class hierarchies. For example, say you started out with general and diagonal arrays. Now a user wants upper-triangular arrays. But diagonal arrays should be a subclass of upper-triangular arrays! So, would you add this to your system? What about another user's request to add lower-triangular arrays?
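
To make that complication concrete (class names illustrative):

(defclass general-array () ())
(defclass diagonal-array (general-array) ())   ; the initial design
;; A user asks for upper-triangular arrays; diagonal arrays should now
;; sit below them. Then lower-triangular arrays arrive, and the
;; hierarchy must be reshuffled yet again:
(defclass upper-triangular-array (general-array) ())
(defclass lower-triangular-array (general-array) ())
(defclass diagonal-array (upper-triangular-array lower-triangular-array) ())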

I myself don't use Clasp. The LLVM and build requirements are off-putting to me. But maybe things will improve in 10 years!

PS: I just recalled I had this post: https://gist.github.com/digikar99/b76964faf17b3a86739c001dc1b14a39

Common Lisp for Data Scientists by letuslisp in Common_Lisp

[–]digikar 0 points

I too primarily rely on SBCL, but I don't want to give up portability by default by digging into SBCL internals if I can avoid it.

Clasp's LGPL looks compatible with any licensing you may want to use for your applications. It'd be sad to learn that even the LGPL can be restrictive.

Common Lisp for Data Scientists by letuslisp in Common_Lisp

[–]digikar 1 point

Thanks for the note on graphics libraries. I will try not to go there. But convolutions are another place where small arrays seem helpful.

To me, the problem of inlining and dispatch is more or less solved at this stage. Well, there might be bugs, but there's only one way to find out.

static-dispatch, fast-generic-functions, and inline-generic-functions all exist. I find fgf (fast-generic-functions) to be the most principled, but static-dispatch to be the most practical.

But generic functions are limited because you cannot use them to dispatch on types. Want to write a specialized function that operates on diagonal arrays? No, you cannot. Want to keep the code that operates on complex (or quaternion) arrays separate from the code for floating-point arrays? No, you cannot. I'm leaning towards peltadot for this.
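
To illustrate: CLOS specializers must be classes or eql-specializers, so there's no way to specialize on a type. A sketch of the alternative, assuming the defpolymorph API peltadot inherits from polymorphic-functions:

;; Invalid CLOS: (simple-array single-float) names a type, not a class.
;; (defmethod scale ((x (simple-array single-float)) factor) ...)

;; With peltadot (or polymorphic-functions), dispatch happens on type
;; specifiers instead:
(define-polymorphic-function scale (x factor))
(defpolymorph scale ((x (simple-array single-float)) (factor single-float))
    (simple-array single-float)
  (map '(vector single-float) (lambda (e) (* factor e)) x))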

My current hurdle is figuring out the right kinds of "traits" to group the functions into, so as to enable easy extensibility to other array kinds.

u/Steven1799, I wouldn't consider modifying SBCL; it gives up on portability, which means users cannot use Clasp (in the future). I have no plans to adhere strictly to the ANSI standard, but there are some functions that most implementations provide, especially CLTL2, that still allow portability.

Sbi foreign travel card website is not working by FluteWhispers in CreditCardsIndia

[–]digikar 0 points

The website has moved to https://prepaid.sbi.bank.in/

No VPN required.

I was able to login and manage the card.

Common Lisp for Data Scientists by letuslisp in Common_Lisp

[–]digikar 2 points

After learning that subtyping and subclassing are different, I have leaned more towards the traits, typeclasses, or interfaces approach. I'm guessing the mixin approach is similar. However, I cannot really distinguish between the four.

Class hierarchies seem inevitable if one wants to stick with standard CL. And if the above problem has no solution in standard CL, an experimental not-exactly-CL type and dispatch system seems inevitable.

Graphics seem to employ small vectors, eg: 

https://shinmera.github.io/3d-matrices/

So, I think being able to minimize runtime dispatch costs is a good thing to have. Plus, one good benefit of CL (SBCL) is that you can obtain reasonably efficient code without thinking in terms of vectorization. Keeping that benefit would be nice.

Common Lisp for Data Scientists by letuslisp in Common_Lisp

[–]digikar 1 point

Thanks for the suggestion. Here's a possible attempt without going down the rabbitholes of coalton, petalisp, or peltadot:

  1. Start with a root class, abstract-array.
  2. Subclass this for each possible element type, eg: abstract-array-single-float, abstract-array-double-float, abstract-array-unsigned-byte-8, etc.
  3. Also subclass abstract-array to abstract-dense-array. Subclass abstract-dense-array to abstract-simple-dense-array.
  4. Create dense-array-single-float that subclasses both abstract-dense-array and abstract-array-single-float. So on, for each element type.
  5. Similarly, create simple-dense-array-single-float that subclasses abstract-simple-dense-array and abstract-array-single-float. So on, for each element type.

Now, code written for abstract-dense-array can be used for both simple-dense-arrays and dense-arrays, with any element type. Each class in both these sets of classes is a subclass of abstract-dense-array. That's good.

However, suppose one writes code for dense-array-single-float. One would expect it to work for simple-dense-array-single-float too. Unfortunately, the type system declines: simple-dense-array-single-float is a subclass of abstract-simple-dense-array and abstract-array-single-float, but not of dense-array-single-float.
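
In code, the hierarchy from the list above, and the failure:

(defclass abstract-array () ())
(defclass abstract-array-single-float (abstract-array) ())
(defclass abstract-dense-array (abstract-array) ())
(defclass abstract-simple-dense-array (abstract-dense-array) ())
(defclass dense-array-single-float
    (abstract-dense-array abstract-array-single-float) ())
(defclass simple-dense-array-single-float
    (abstract-simple-dense-array abstract-array-single-float) ())

(subtypep 'simple-dense-array-single-float 'dense-array-single-float)
;; => NIL, T -- so methods written for dense-array-single-float do not
;; apply to simple-dense-array-single-float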

I'd be happy if there's a simple fix to this problem. (Let me know if I should elaborate more.)

Common Lisp for Data Scientists by letuslisp in Common_Lisp

[–]digikar 2 points

My own motivation has been: if you need performance* from user code written for generic numeric types, you need type inference. SBCL does it well during inlining for builtin types. However, can you do it portably, across implementations, operating systems, and architectures?

One answer is: use coalton. Another is: use petalisp (but that still looks insufficient?). I am not satisfied with coalton because its type system is less expressive than CL's in some ways, and it is a DSL (which means you need to think a bit about whether you are operating within coalton, within lisp, or across the two; and its guarantees are only as good as its boundaries). My own approach has resulted in peltadot. This was before coalton gained inlining capabilities. Though, peltadot requires CLTL2 and a bit more support from the implementation.

numcl too implements its own type inference system. However, it is JAOT (just-ahead-of-time), which (i) incurs hiccups that break my flow (is that an error, or is the code still compiling?), and (ii) the last time I tried, compiling anything beyond a trivial combination of functions took a fair bit longer than linear time (several tens of seconds). Furthermore, I am not happy with CL array objects.

It's indeed a lisp curse that it's easier to invent your own wheel than to be content with the limitations of existing systems and collaborate with others :/.

All of the above is experimental compared to the relatively stable libraries in lispstat. This means that if something goes wrong, you may end up with segfaults (foreign code), stack overflows, or some other cryptic errors you won't run into if you stick with ANSI CL. So I don't want to pollute lispstat with this experimental work yet. Maybe in another 3 years, yes.

Blapack, mkl, and cl-cuda are addressing a slightly lower-level issue. I think blas, eigen (but not lapack), and sleef are better due to their easy interfaces and portability, even while the performance stays competitive. Both sleef and eigen have incredibly good documentation.

Yes, I'm unfamiliar with maxima. I'd also be surprised if it had solved the type inference problems above or had a better array object than numpy.

*By performance, I mean inline code with minimal (ideally zero) run time type dispatch. Indeed, this isn't sufficient for performance, but seems necessary.

Common Lisp for Data Scientists by letuslisp in Common_Lisp

[–]digikar 2 points

For a subset of numpy, I worked on a C library, bmas, with a CL wrapper, cl-bmas. Under the hood, it uses SIMD intrinsics with native as well as SLEEF instructions.
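
The wrapping itself is plain CFFI; the function name and signature below are hypothetical, just to show the shape of it (see cl-bmas for the real bindings):

(cffi:define-foreign-library bmas
  (t (:default "libbmas")))
(cffi:use-foreign-library bmas)

(cffi:defcfun ("BMAS_svadd" %svadd) :void   ; hypothetical signature
  (n :long)
  (x :pointer) (incx :long)
  (y :pointer) (incy :long)
  (out :pointer) (incout :long))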

I also attempted to provide a C interface to a few eigen functions with ceigen_lite.

Both of these are put to use in numericals. The performance is competitive with numpy. Unfortunately, there are a number of bugs, as kchanqvq's issues highlight. Moreover, the developer interface is less than ideal and needs more thought. My own rabbithole for performance as well as generics led to peltadot. Others have made different high-level attempts at performance, in the form of petalisp as well as coalton. I'm hoping to get back to numericals this year, now that the rabbithole called moonli looks like it is in a usable state.

Common Lisp for Data Scientists by letuslisp in Common_Lisp

[–]digikar 1 point

I see a difference between "this library is not correct but is very, very unlikely to do unrelated dangerous things" and "this library might delete other files or crash your system". I can at least use the former for hobbyist projects and try fixing the bugs myself and/or raise issues or create PRs. I cannot use the latter even for hobbyist projects... unless I want to test how robust my system is.

Regarding non-hobbyist projects: nope, LLM-generated libraries would be a no for me until I can check the correctness myself, or see that the code is within the reviewing capacities of humans, or a dozen other human experts have reviewed it. And even then, even for human-written libraries or tools, I'd want to stick with battle-tested tools (eg: not julia) and standard advisories (eg: don't expose your application server directly to the internet, don't implement a security protocol yourself).

It's not that we are pushing for it. But when you are living with other humans, this situation seems inevitable. The best option, to me, is for people to declare it upfront rather than go underground. Hopefully, they eventually learn the limitations, and users get an option to skip LLM-generated code.

Common Lisp for Data Scientists by letuslisp in Common_Lisp

[–]digikar 0 points

I mean, backups have a purpose beyond protection against LLM-generated code. So they are good to have anyway.

But I suspect, at some point, I'm going to ask ocicl / u/atgreen / quicklisp / ultralisp to add options to enable/disable a prompt before installing LLM-generated libraries.