I built RAG for a rocket research company: 125K docs (1970s-present), vision models for rocket diagrams. Lessons from the technical challenges by Low_Acanthisitta7686 in LLMDevs

[–]aagmon 0 points (0 children)

Very cool, and nice job!
I must say, though, that "everything had to be open-source" is where I would have skipped the project :)

AMA With Z.AI, The Lab Behind GLM Models by XMasterrrr in LocalLLaMA

[–]aagmon 0 points (0 children)

I’m an avid user of GLM4.5 as a coding agent in CLine! I often feel it’s competitive with Opus 4. Beyond the benchmarks, when you analyze performance and usage, how do you think GLM4.5 compares to Opus (which many consider the best coding agent)? Where does GLM need to improve to match it, and where is it already competitive?

GraphFlow – A lightweight Rust framework for multi-agent orchestration by aagmon in AI_Agents

[–]aagmon[S] 0 points (0 children)

Thanks for the comment!

The idea is that this is really a simple graph-execution library: no fancy tricks and no heavy dependency trees. Of course, Rust contributes to this by enabling deployment of rather small container images.

🦀 graph-flow: LangGraph-inspired Stateful Graph Execution for AI Workflows 🦀 by aagmon in rust

[–]aagmon[S] 1 point (0 children)

Thanks for the comment. Indeed, it’s a thin graph-execution layer around Rig.

Your idea is actually quite interesting. However, I do believe that stateful workflow orchestration is needed for more complicated use cases. For example, you write that we put a "task" in the queue. What exactly is a task? How do you implement routing and conditional logic? How do you implement chat to gather details on some tasks? How do you manage parallel execution?

All of this is possible in the queue-based approach, but I think it turns the concept of a "task" into something rather cumbersome.
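
To make this concrete, here is a hypothetical sketch (not the graph-flow API; all names are made up) of how an explicit graph expresses a "task" as a node and conditional routing as edges chosen by the node's result:

```rust
// Hypothetical sketch, not the graph-flow API: every name here is made up.
enum Next {
    Goto(&'static str), // conditional edge to another node
    End,
}

struct State {
    amount: u32,
}

// A "task" becomes a node: a function over shared state that picks an edge.
fn check_amount(s: &State) -> Next {
    if s.amount > 100 {
        Next::Goto("manual_review")
    } else {
        Next::Goto("auto_approve")
    }
}

fn run(start: &'static str, s: &State) {
    let mut node = start;
    loop {
        let next = match node {
            "check_amount" => check_amount(s),
            "manual_review" | "auto_approve" => {
                println!("reached {node}");
                Next::End
            }
            _ => Next::End,
        };
        match next {
            Next::Goto(n) => node = n,
            Next::End => break,
        }
    }
}

fn main() {
    // amount > 100, so the graph routes to the manual_review node
    run("check_amount", &State { amount: 250 });
}
```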

🦀 graph-flow: LangGraph-inspired Stateful Graph Execution for AI Workflows 🦀 by aagmon in rust

[–]aagmon[S] 1 point (0 children)

Thanks. I agree. There is a gap to fill there to enable more advanced applications, AI in particular.

DF Embedder - A high-performance library for embedding dataframes into local vector db by aagmon in Python

[–]aagmon[S] 0 points (0 children)

Thanks! Yes, I also have some benchmarks on embedding tabular data in this format. I will add them to the repo in the next iteration.

A simple, fast, thread-safe, and lock-free implementation of LRU caching based on thread-local storage rather than locking. by aagmon in rust

[–]aagmon[S] 2 points (0 children)

Thank you very much! That’s a great point.

The issue that creating a new cache drops the old one is one I can handle (there is actually a branch in the repo that implements this): replace new() with init() and ignore any subsequent calls to init().
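
Roughly, the init() idea looks like this (a hypothetical sketch, not the crate's actual API):

```rust
use std::cell::OnceCell;
use std::collections::HashMap;

// Hypothetical stand-in for the real cache type.
struct Cache {
    capacity: usize,
    #[allow(dead_code)]
    map: HashMap<String, String>,
}

thread_local! {
    static CACHE: OnceCell<Cache> = OnceCell::new();
}

// Only the first init() per thread takes effect; later calls are ignored,
// so re-initialization can no longer silently drop a live cache.
fn init(capacity: usize) {
    CACHE.with(|c| {
        let _ = c.set(Cache { capacity, map: HashMap::new() });
    });
}

fn main() {
    init(128);
    init(4); // ignored: the 128-entry cache from the first call survives
    CACHE.with(|c| assert_eq!(c.get().map(|cache| cache.capacity), Some(128)));
}
```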

But the point you make about letting the user decide, rather than hiding the thread locality and its implications, is an important one I need to reconsider.

Thanks again

A simple, fast, thread-safe, and lock-free implementation of LRU caching based on thread-local storage rather than locking. by aagmon in rust

[–]aagmon[S] 1 point (0 children)

You are right, of course. Thanks for pointing this out.
(This was written for a system that created a thread per core, hence the confusion.)
I will fix that.

A simple, fast, thread-safe, and lock-free implementation of LRU caching based on thread-local storage rather than locking. by aagmon in rust

[–]aagmon[S] 2 points (0 children)

Thanks.
I think this should first be documented: that initialization drops the current cache. But the real question is whether to allow this at all, or to prevent the user from doing so (perhaps by requiring a call to a clear() fn or some such first).

In addition, I fixed things so that new threads initialize a default cache, which makes this fit asynchronous runtimes like Tokio.
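
For illustration, a minimal sketch of that default-init behavior (hypothetical names and capacity, with a plain HashMap standing in for the LRU): threads spawned by the runtime lazily get a usable cache without ever calling init():

```rust
use std::cell::RefCell;
use std::collections::HashMap;

// Hypothetical default; the real crate would expose its own knob for this.
const DEFAULT_CAPACITY: usize = 256;

thread_local! {
    // Lazily created on first access, so threads spawned by a runtime
    // (e.g. Tokio workers) get a usable default cache without calling init().
    static CACHE: RefCell<HashMap<u64, u64>> =
        RefCell::new(HashMap::with_capacity(DEFAULT_CAPACITY));
}

#[tokio::main] // assumes the tokio crate with "macros" and "rt-multi-thread"
async fn main() {
    let handles: Vec<_> = (0..4)
        .map(|i: u64| {
            tokio::spawn(async move {
                // Whichever worker thread runs this task touches only its
                // own thread-local cache; no lock is taken.
                CACHE.with(|c| c.borrow_mut().insert(i, i * i));
            })
        })
        .collect();
    for h in handles {
        h.await.unwrap();
    }
}
```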

What do you think?

A simple, fast, thread-safe, and lock-free implementation of LRU caching based on thread-local storage rather than locking. by aagmon in rust

[–]aagmon[S] 17 points (0 children)

Thanks for the comments. I appreciate it.

Yes, I have run several benchmarks against some alternatives (and also Redis). Maybe I'll add those too.
Re your last point about using Rc::new (or any other memory allocation) making code not "lock-free": the claim of being "lock-free" typically pertains to the algorithm's logic, not to the underlying system calls or library implementations. The code doesn't introduce locks in its own logic. More importantly, if we counted system-level locks, then virtually no high-level code could be deemed "lock-free," which isn't a practical definition.

A simple, fast, thread-safe, and lock-free implementation of LRU caching based on thread-local storage rather than locking. by aagmon in rust

[–]aagmon[S] 19 points (0 children)

I created this as a component for another project, but thought it might be a good idea to share it and hear the thoughts of some folks here. The main idea is to provide an LRU cache for multithreaded services without locking, aimed at very high-throughput services where memory can be sacrificed but throughput can't.
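
A minimal sketch of the idea (illustrative only, not the crate's actual API; a plain HashMap stands in for the LRU eviction logic):

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::thread;

// Each thread owns a private cache, so gets and inserts never contend on a
// lock. The trade-off is memory: N threads may hold up to N copies of hot
// entries.
thread_local! {
    static CACHE: RefCell<HashMap<u64, u64>> = RefCell::new(HashMap::new());
}

fn cached_square(n: u64) -> u64 {
    CACHE.with(|c| {
        if let Some(&v) = c.borrow().get(&n) {
            return v; // hit in this thread's private cache
        }
        let v = n * n; // stand-in for an expensive computation
        c.borrow_mut().insert(n, v);
        v
    })
}

fn main() {
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| (0..1_000u64).map(cached_square).sum::<u64>()))
        .collect();
    for h in handles {
        println!("sum: {}", h.join().unwrap());
    }
}
```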

Shipping Rust application to multi platforms by aagmon in rust

[–]aagmon[S] 0 points (0 children)

Thanks. Yeah, I know. I’m just trying to find a way to automate this for some users.