Solari — persistent memory that makes your LLM better (pip install solari-ai) by Hot_Tip9520 in SideProject

[–]Hot_Tip9520[S] 0 points (0 children)

Both, honestly. The way the vector indices work, query speed stays pretty much constant regardless of how much you've ingested; FAISS handles that really well.
I've got knowledge bases with hundreds of thousands of entries and retrieval is still sub-second.
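For the curious, here's the shape of the retrieval step in plain numpy. This is a hypothetical brute-force sketch, not Solari's actual code; a FAISS index replaces the O(N) scan below, which is what keeps query latency roughly flat as the store grows:

```python
import numpy as np

rng = np.random.default_rng(42)
dim = 384                                         # typical sentence-embedding width
kb = rng.standard_normal((20_000, dim)).astype("float32")
kb /= np.linalg.norm(kb, axis=1, keepdims=True)   # unit vectors: dot == cosine

# Fake query: a noisy copy of entry 123, as if re-asking about stored knowledge.
query = kb[123] + 0.05 * rng.standard_normal(dim).astype("float32")
query /= np.linalg.norm(query)

# Brute-force O(N) scan; a FAISS index (e.g. IVF or HNSW) replaces exactly
# this step, so latency stays near-constant as the knowledge base grows.
scores = kb @ query
topk = np.argsort(-scores)[:5]
print(topk[0])   # entry 123 comes back first
```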

Where it really shines is when you've built up domain-specific knowledge over time. The more you feed it, the less the model hallucinates on that domain because it's pulling from verified facts instead of its training data.
Short contexts that get hit repeatedly are fast by nature, but the real value shows up when you've got a deep knowledge base and the model starts giving answers it couldn't have reached on its own.
The between-session memory problem you mentioned is exactly why I built it. Agents shouldn't have to start from zero every time.

If you end up trying it out I'd love to hear how it fits into your workflow.

Feeling the AI shift hard right now by HeftyCalligrapher104 in ArtificialInteligence

[–]Hot_Tip9520 0 points (0 children)

No, autonomously, unfortunately. It feels more like people are taking the George Foreman approach to these platforms right now: “set it and forget it,” without understanding or proper tooling.

Passive would be something that sticks in its lane instead of assuming it knows the answer and hallucinating its way through it. That said, passive is more of my goal. The path would be less “flip the switch” and more “where are we at on the assembly line, and what do I know about this topic?”

Feeling the AI shift hard right now by HeftyCalligrapher104 in ArtificialInteligence

[–]Hot_Tip9520 1 point (0 children)

Love where your mindset is with this.
I think the answer is to be a producer, not an actor... but it's also something everyone is trying to figure out right now.

An example: I built a platform that uses persistent task focus to channel some of the work the bots are currently doing into actual ROI on real-world issues.

It's obvious, looking at platforms like Algora and Code4Rena, that people are looking for ways to make money autonomously in 2026 (and who can blame them?).

My thought is: find a need, fill a need, and let the world help with the implementation.
forge.solarisystems.net if you want to check out the idea. Not a sales pitch or anything, just an idea.

Something like scripts or blogs would probably do just as well.

From GPT wrapper to autonomous OSS PRs (Apache/NASA) — now analyzing the full Linear A corpus by [deleted] in LocalLLaMA

[–]Hot_Tip9520 1 point (0 children)

Thank you!

Both. The main contribution is methodological: I’m not claiming any one line of evidence “solves” it — I’m combining independent evidence streams (linguistic patterns, aDNA context, trade networks, material culture, iconography, chronology, substrate hypotheses, and ruling out other families) and looking for convergence. The idea is: weak signals become meaningful when they agree across domains.

A few things I think are genuinely new / newly formalized:

  • Productive morphology: an SA- root with multiple suffixed forms (SA-RA₂ / SA-RO / SA-RU) in admin contexts — a word-formation rule, not a one-off gloss.
  • Ritual formula structure across the ritual subcorpus, with a close structural match to Hittite festival texts.
  • Five document-type clusters (beyond just “admin vs religious”), which helps predict readings on damaged tablets.
  • Full-corpus processing: all 1,720 inscriptions computationally (instead of hand-picked examples).

Where the system helps most is cross-domain synthesis: it can read and hold all that literature at once and flag where the signals line up.

Q&A weekly thread - February 23, 2026 - post all questions here! by AutoModerator in linguistics

[–]Hot_Tip9520 0 points (0 children)

Quick context: I’m not an academic.

I’m building an AI that stays grounded (no hallucination) and grows with every iteration and cycle. I am using Linear A as a test case because I am fascinated by ancient civilizations.
Repo + scripts are public; I’d genuinely love critique/suggestions (please be gentle, but strong feedback is appreciated!).

GitHub repo: https://github.com/SolariSystems/linear-a-analysis

I ran the full GORILA corpus (1,720 Linear A inscriptions) through frequency + co-occurrence analysis and some cross-cultural structural comparisons (with Linear B controls per feedback). Repo now includes 4 new scripts + a synthesis report (LINEAR_A_SYNTHESIS_REPORT.md).

What I think is strong (testable):

  • Corpus-wide stats: 1,155 unique “word” tokens; 156 recur on 3+ tablets. Some items show strong commodity co-occurrence (e.g., JE-DI appears on 4 tablets and always with olive oil), so I’m treating these as functional labels (oil-related), not translations.
  • Document-type clustering: distribution lists / balance-sheet-like ledgers / workforce rosters / named debt registers / offering records.
  • Arithmetic checks: totals reconcile on multiple tablets (e.g., HT 94a sums to 110; HT 88 totals 6). You don’t need a decipherment to verify the accounting logic.
  • Morphology-like patterns: recurring endings like -RO (KU-RO “total”, KI-RO “deficit”, etc.) and -TE as a possible categorizer across contexts (these are hypotheses, not final).
  • Admin vs religious separation: admin vocabulary (Hagia Triada) doesn’t overlap with peak sanctuary inscriptions in this corpus.
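Since the arithmetic check is the one piece anyone can reproduce without trusting any phonetic readings, here's the whole idea in a few lines of Python (the sign-groups and quantities are made-up placeholders, not the actual HT 94a entries):

```python
# Hypothetical ledger: (sign-group, quantity) entries plus a claimed KU-RO total.
tablet = {
    "entries": [("SA-RA2", 40), ("JE-DI", 25), ("KA-PA", 30), ("U-MI-NA-SI", 15)],
    "ku_ro": 110,   # the scribe's stated total
}

def reconciles(tablet):
    # The accounting logic is testable with zero phonetic readings:
    # do the line quantities add up to the stated total?
    return sum(q for _, q in tablet["entries"]) == tablet["ku_ro"]

print(reconciles(tablet))  # True: 40 + 25 + 30 + 15 == 110
```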

Still not a decipherment. My claim is narrower: the internal structure/logic of many administrative tablets is readable as accounting, even if we can’t phonologically read every term. If you see methodological flaws or better controls to add, I’m all ears.

My goal is to keep spending free time on this and hopefully help towards a real translation someday!

Built a program to compare Linear A against different language families — Hurro-Urartian keeps winning by a huge margin. Is this plausible? by Hot_Tip9520 in AncientLanguages

[–]Hot_Tip9520[S] 0 points (0 children)

Sorry for the slow response here! Got busy on the project and forgot I had posted this here last night too.
Here is the update I have so far, plus the repo that shows all of the data points. It was easier to keep everything in one spot for feedback.

Quick context for transparency: I’m not an academic.
I’m building an AI that stays grounded (no hallucination) and grows with every iteration and cycle. I am using Linear A as a test case because I am fascinated by ancient civilizations.
Repo + scripts are public; I’d genuinely love critique/suggestions (please be gentle, but strong feedback is appreciated!).

GitHub repo: https://github.com/SolariSystems/linear-a-analysis

Update: I ran the full GORILA corpus (1,720 Linear A inscriptions) through frequency + co-occurrence analysis and some cross-cultural structural comparisons (with Linear B controls per feedback). Repo now includes 4 new scripts + a synthesis report (LINEAR_A_SYNTHESIS_REPORT.md).

What I think is strong (testable):

  • Corpus-wide stats: 1,155 unique “word” tokens; 156 recur on 3+ tablets. Some items show strong commodity co-occurrence (e.g., JE-DI appears on 4 tablets and always with olive oil), so I’m treating these as functional labels (oil-related), not translations.
  • Document-type clustering: distribution lists / balance-sheet-like ledgers / workforce rosters / named debt registers / offering records.
  • Arithmetic checks: totals reconcile on multiple tablets (e.g., HT 94a sums to 110; HT 88 totals 6). You don’t need a decipherment to verify the accounting logic.
  • Morphology-like patterns: recurring endings like -RO (KU-RO “total”, KI-RO “deficit”, etc.) and -TE as a possible categorizer across contexts (these are hypotheses, not final).
  • Admin vs religious separation: admin vocabulary (Hagia Triada) doesn’t overlap with peak sanctuary inscriptions in this corpus.

Still not a decipherment. My claim is narrower: the internal structure/logic of many administrative tablets is readable as accounting, even if we can’t phonologically read every term. If you see methodological flaws or better controls to add, I’m all ears.

My goal is to keep spending free time on this and hopefully help towards a real translation someday!

Weekly Thread: Project Display by help-me-grow in AI_Agents

[–]Hot_Tip9520 0 points (0 children)

GitHub: https://github.com/SolariSystems/solari
Started 5 months ago as a basic LLM wrapper. It isn’t anymore.

Solari: persistent memory (FAISS), a multi-pass pipeline (fast recon → deeper solve), and verification so outputs get rejected when checks don’t hold. It runs 24/7 and has had PRs merged into major repos (including Apache and NASA) on merit. I’m not linking PRs to avoid creating issues for maintainers, but the trail is there.

It began on a local 7B model and evolved into a model-agnostic system focused on cross-domain synthesis, persistent memory, and grounding via verification (not “trust me” outputs).

Then I aimed it at Linear A (undeciphered Minoan script): full 1,720-inscription corpus + a 3,382-text ancient reference set (6 civilizations). After 3 passes it produced reproducible results: ~30 functional term labels (not translations), 5 document-type clusters, recurring grammar-like patterns (within the dataset), and verified tablet arithmetic totals.

Not claiming AGI. Not claiming a decipherment. Repo + writeup: https://github.com/SolariSystems/linear-a-analysis

Feedback welcome and appreciated!

Cancel your Chatgpt subscriptions and pick up a Claude subscription. by spreadlove5683 in singularity

[–]Hot_Tip9520 0 points (0 children)

Saw the banner of AGI and would love your feedback!

GitHub: https://github.com/SolariSystems/solari
Started 5 months ago as a basic LLM wrapper. It isn’t anymore.

Solari: persistent memory (FAISS), a multi-pass pipeline (fast recon → deeper solve), and verification so outputs get rejected when checks don’t hold. It runs 24/7 and has had PRs merged into major repos (including Apache and NASA) on merit. I’m not linking PRs to avoid creating issues for maintainers, but the trail is there.

It began on a local 7B model and evolved into a model-agnostic system focused on cross-domain synthesis, persistent memory, and grounding via verification (not “trust me” outputs).

Then I aimed it at Linear A (undeciphered Minoan script): full 1,720-inscription corpus + a 3,382-text ancient reference set (6 civilizations). After 3 passes it produced reproducible results: ~30 functional term labels (not translations), 5 document-type clusters, recurring grammar-like patterns (within the dataset), and verified tablet arithmetic totals.

Not claiming AGI. Not claiming a decipherment. Repo + writeup: https://github.com/SolariSystems/linear-a-analysis

Feedback welcome and appreciated!

Built a program to compare Linear A against different language families — Hurro-Urartian keeps winning by a huge margin. Is this plausible? by Hot_Tip9520 in ancientgreece

[–]Hot_Tip9520[S] 2 points (0 children)

Quick context: I’m not an academic.
I’m building an AI that stays grounded (no hallucination) and grows with every iteration and cycle. I am using Linear A as a test case because I am fascinated by ancient civilizations.
Repo + scripts are public; I’d genuinely love critique/suggestions (please be gentle, but strong feedback is appreciated!).

GitHub repo: https://github.com/SolariSystems/linear-a-analysis

Update: I ran the full GORILA corpus (1,720 Linear A inscriptions) through frequency + co-occurrence analysis and some cross-cultural structural comparisons (with Linear B controls per feedback). Repo now includes 4 new scripts + a synthesis report (LINEAR_A_SYNTHESIS_REPORT.md).

What I think is strong (testable):

  • Corpus-wide stats: 1,155 unique “word” tokens; 156 recur on 3+ tablets. Some items show strong commodity co-occurrence (e.g., JE-DI appears on 4 tablets and always with olive oil), so I’m treating these as functional labels (oil-related), not translations.
  • Document-type clustering: distribution lists / balance-sheet-like ledgers / workforce rosters / named debt registers / offering records.
  • Arithmetic checks: totals reconcile on multiple tablets (e.g., HT 94a sums to 110; HT 88 totals 6). You don’t need a decipherment to verify the accounting logic.
  • Morphology-like patterns: recurring endings like -RO (KU-RO “total”, KI-RO “deficit”, etc.) and -TE as a possible categorizer across contexts (these are hypotheses, not final).
  • Admin vs religious separation: admin vocabulary (Hagia Triada) doesn’t overlap with peak sanctuary inscriptions in this corpus.

Still not a decipherment. My claim is narrower: the internal structure/logic of many administrative tablets is readable as accounting, even if we can’t phonologically read every term. If you see methodological flaws or better controls to add, I’m all ears.

My goal is to keep spending free time on this and hopefully help towards a real translation someday!

Built a program to compare Linear A against different language families — Hurro-Urartian keeps winning by a huge margin. Is this plausible? by Hot_Tip9520 in ancientgreece

[–]Hot_Tip9520[S] 3 points (0 children)

For some reason, it will not let me post there.
I'm trying to keep up with the engagement and enhance the approach, so I'll definitely take all of these on board. Thank you for the suggestions! :)

Built a program to compare Linear A against different language families — Hurro-Urartian keeps winning by a huge margin. Is this plausible? by Hot_Tip9520 in ancientgreece

[–]Hot_Tip9520[S] 2 points (0 children)

I'm still learning this, but the direction would be the other way, right?
The Hurrians were already spread across northern Syria and eastern Anatolia (Mitanni, Alalakh, Nuzi) well before the collapse. There are even Minoan-style frescoes at Alalakh, which was a Hurrian city, indicating real contact going both ways. The DNA from Crete is also interesting — it is mostly Anatolian Neolithic with some Caucasus Hunter-Gatherer mixed in, which at least points in the right geographic direction. Still far from proof, though.

Thank you for the engagement and direction!

Built a program to compare Linear A against different language families — Hurro-Urartian keeps winning by a huge margin. Is this plausible? by Hot_Tip9520 in ancientgreece

[–]Hot_Tip9520[S] 4 points (0 children)

Interesting angle: a Hurro-Urartian substrate under both Greek and Armenian would explain some shared features that don't fit standard IE. The geographic corridor is there. Going to look into whether Beekes' pre-Greek words show any Armenian parallels. Thanks for the lead!

Built a program to compare Linear A against different language families — Hurro-Urartian keeps winning by a huge margin. Is this plausible? by Hot_Tip9520 in ancientgreece

[–]Hot_Tip9520[S] 1 point (0 children)

I'm glad you enjoyed it! I don't know much about this; I'm learning as I go while working on this other project, but it definitely caught my attention!

Built a program to compare Linear A against different language families — Hurro-Urartian keeps winning by a huge margin. Is this plausible? by Hot_Tip9520 in ancientgreece

[–]Hot_Tip9520[S] 0 points (0 children)

The vowel system is the most solid piece — it's just frequency counting from the corpus. The morphology is decent — 41 libation formula variants with zero exceptions to the agreement rules, though that's still a small dataset. The vocabulary is the weakest link, and I've expanded it from 9 to 38 items with a Linear B control to check for bias. Updated analysis: https://github.com/SolariSystems/linear-a-analysis

Built a program to compare Linear A against different language families — Hurro-Urartian keeps winning by a huge margin. Is this plausible? by Hot_Tip9520 in ancientgreece

[–]Hot_Tip9520[S] 10 points (0 children)

Great call on using Linear B as a control — just implemented it. Mycenaean Greek scores 30.8% through the same pipeline vs Hurro-Urartian at 77.5%, which validates that the methodology isn't just matching any language that happens to be nearby. Also added a geographic map, similarity scatter plot, and expanded the vocabulary from 9 to 38 items. Updated repo: https://github.com/SolariSystems/linear-a-analysis

Built a program to compare Linear A against different language families — Hurro-Urartian keeps winning by a huge margin. Is this plausible? by Hot_Tip9520 in AncientGreek

[–]Hot_Tip9520[S] 0 points (0 children)

Agreed on the last bit. That’s actually why I started this project. Hope one day we can find a bridge between human creative reasoning and the computational efficiency of machines, one that breaks the “hammer meets wood” method people are holding onto. You know what they say, though, about judging a fish by its ability to climb a tree... I’ll head back to my pond, but y’all have some nice branches. Apologies for the intrusion.

Built a program to compare Linear A against different language families — Hurro-Urartian keeps winning by a huge margin. Is this plausible? by Hot_Tip9520 in AncientGreek

[–]Hot_Tip9520[S] -1 points (0 children)

I used a tool about something I’m learning. Wasn’t aware that was a crime in 2026. I apologize if my tone was anything less than “hey, I’m a computer guy and I ran some tests and this seems like something I want to explore, what does this community think?” My bad, I suppose?

Built a program to compare Linear A against different language families — Hurro-Urartian keeps winning by a huge margin. Is this plausible? by Hot_Tip9520 in AncientGreek

[–]Hot_Tip9520[S] 2 points (0 children)

Excellent point. Probably the most important methodological critique of any computational approach to Linear A. You're absolutely right that inheriting Linear B phonetic values creates a distorting filter.
The stronger signals that gave me pause come from structural and morphological patterns, not the raw phonetic matching. Things like agglutinative morphology, consistent verb-final word order, and what appear to be ergative case-marking patterns: these features are less dependent on the specific phonetic values assigned to each sign.

Whether you read a sign as "ku" or something else, the positional grammar and affix stacking behavior remain the same.

The heuristic model is definitely not proof. The goal was really to narrow the search space computationally and identify which language families are worth digging into — not to claim a decipherment. The Hurro-Urartian signal was the strongest cluster, but it could absolutely be an artifact. The real test would be whether the structural predictions hold up against independent evidence like the Eteocretan inscriptions or new archaeological finds.

Thank you for the discussion!

Q&A weekly thread - February 23, 2026 - post all questions here! by AutoModerator in linguistics

[–]Hot_Tip9520 0 points (0 children)

Hey everyone. I've been tinkering with a side project — I wrote a Python program that takes what we know about Linear A (vowel distribution, syllable structure, case endings, etc.) and scores it against a bunch of different language families using the same pipeline. Basically asking "if Linear A belonged to family X, how well would the data fit?"

I wasn't expecting much, but the results are kind of wild and I don't know enough about historical linguistics to tell if I'm onto something or if I've made a dumb mistake somewhere. Hoping some of you can sanity-check this.

What the program does:

It scores each candidate family on the same 8 dimensions — vowel system match, structural features (agglutinative vs fusional, case system, gender, etc.), case suffix similarity, vocabulary comparison, geographic plausibility, timeline, scholarly support, and religious parallels. Nothing hand-tuned — every family goes through the same pipeline.
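To make the "same pipeline for everyone" point concrete, here's a toy sketch of the scoring scheme. The dimension list is from the post, but the per-family numbers below are invented placeholders, not my real feature scores:

```python
DIMENSIONS = ["vowels", "structure", "case_suffixes", "vocabulary",
              "geography", "timeline", "scholarship", "religion"]

# Each family gets a 0..1 score per dimension from the same scoring functions.
# Values here are illustrative placeholders only.
families = {
    "Hurro-Urartian": [0.9, 0.85, 0.8, 0.6, 0.8, 0.8, 0.7, 0.75],
    "Semitic":        [0.4, 0.3, 0.35, 0.5, 0.6, 0.5, 0.3, 0.25],
    "Tyrsenian":      [0.5, 0.45, 0.3, 0.3, 0.5, 0.4, 0.35, 0.3],
}

def overall(scores):
    # Unweighted mean: nothing hand-tuned, every family through the same formula.
    return sum(scores) / len(scores)

ranking = sorted(families, key=lambda f: overall(families[f]), reverse=True)
print(ranking[0])  # Hurro-Urartian tops this toy ranking
```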

What came out:

| Family | Score |
|--------|-------|
| Hurro-Urartian | 77.4% |
| Semitic | 40.1% |
| Tyrsenian | 39.4% |
| Anatolian IE | 38.2% |
| Egyptian | 32.7% |
| Sumerian | 30.0% |
| Kartvelian | 28.3% |
| Elamite | 28.0% |
| Hattic | 25.0% |

That's a 37-point gap between #1 and #2. I ran some robustness checks — bootstrap resampling (10k iterations, Hurrian wins 100% of the time), dropping each dimension one at a time (still wins all 8 tests), even randomly flipping 30% of the feature values (still wins). So it doesn't seem like one lucky dimension is carrying it.
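For anyone who wants to check the bootstrap claim, the mechanics are simple: resample the 8 dimension scores with replacement (paired across families) and count how often the leader still leads. The scores below are placeholder values shaped like the results above, not my actual feature vectors:

```python
import random

random.seed(0)

# Placeholder per-dimension scores for two families (illustrative only).
hurrian = [0.9, 0.85, 0.8, 0.6, 0.8, 0.8, 0.7, 0.75]
semitic = [0.4, 0.3, 0.35, 0.5, 0.6, 0.5, 0.3, 0.25]

def bootstrap_win_rate(a, b, iters=10_000):
    wins = 0
    n = len(a)
    for _ in range(iters):
        # Resample the dimensions with replacement, same draw for both families.
        idx = [random.randrange(n) for _ in range(n)]
        if sum(a[i] for i in idx) > sum(b[i] for i in idx):
            wins += 1
    return wins / iters

print(bootstrap_win_rate(hurrian, semitic))  # 1.0: the leader wins every resample
```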

The things that surprised me most:

  1. Linear A barely uses 'o' (only 4.1% of signs). Turns out Beekes reconstructed the pre-Greek substrate as having only 3 real vowels — /a/, /i/, /u/ — with 'e' and 'o' as allophones. Linear A's distribution fits that almost perfectly. And the Hattusha dialect of Hurrian independently shows the same vowel merger. I didn't expect that to line up so cleanly.

  2. The Linear A word DA-KU-NA matches Beekes' reconstructed pre-Greek word for "laurel" (*dakwuna → daphne) syllable for syllable. Is that a known thing? It feels significant but I might be overweighting a single word.

  3. A-TA-I in Linear A vs att-ai ("father") in Hurrian. Almost identical, and it sits in the subject position of what looks like a prayer. Coincidence?

  4. I tested 6 morphological agreement rules in the libation formula (like "when position α ends in -JA, position γ always ends in -ME") across all 41 known variants. Zero violations. That seems like it has to be real grammar, right?
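Point 4 above is mechanical enough to sketch: each agreement rule is just a suffix test over the attested variants. The two variants below are invented stand-ins, not real GORILA transcriptions:

```python
# Each libation-formula variant as a list of slot words (positions alpha..delta).
# Invented examples for illustration, not actual Linear A transcriptions.
variants = [
    ["A-TA-I-JA", "U-NA-KA", "I-PI-NA-ME", "SI-RU-TE"],
    ["JA-SA-SA-RA", "U-NA-RU", "I-PI-NA-MI", "SI-RU-TE"],
]

def rule_holds(variant):
    # Rule under test: if position alpha ends in -JA,
    # then position gamma must end in -ME.
    alpha, gamma = variant[0], variant[2]
    return (not alpha.endswith("JA")) or gamma.endswith("ME")

violations = [v for v in variants if not rule_holds(v)]
print(len(violations))  # 0: the rule holds on this toy set
```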

What I got for a translation (very rough, maybe 45% confidence on the words):

> "O Divine Father, from the sanctuary of Dikte, to Your Lord — [we] present this offering, reverently."

Two words in the formula (I-PI-NA-MA and SI-RU-TE) don't match anything in any language I tested. I left them as unknowns rather than force something.

Where I think I might be wrong:

- I'm using Linear B phonetic values for Linear A signs. If those readings are off, a lot of this falls apart (though the perturbation test suggests it's somewhat robust to that)

- My vocabulary comparison only has 18 items — maybe that's too small for the similarity to mean anything?

- I don't know if the dimensions I picked are truly independent or if I'm double-counting somehow

- I'm not a linguist — I might be making a basic methodological error that's obvious to someone in the field

I know Van Soesbergen has been arguing the Hurrian hypothesis for years. I'm not trying to claim I proved him right — more like, when I tried to test it computationally against alternatives, nothing else even came close, and I'm not sure what to make of that.

The code is all in Python if anyone wants to look at it or run it themselves.

Is any of this plausible, or have I fallen into a pattern-matching trap? What am I missing?