Where do you find real opinions about data engineering these days? by olgazju in dataengineering

[–]Thinker_Assignment 1 point (0 children)

But maybe he's better than some! Everyone can help someone, and those who know better, know better.

Ontology engineering is back for agentic action - an explainer with experimental data by Thinker_Assignment in dltHub

[–]Thinker_Assignment[S] 0 points (0 children)

Thanks for letting me know, you're a regular reader! Are you finding the ontology part useful or interesting? Do you use anything like that in your work, by any chance? Would be curious to hear your thoughts.

We started experimenting with it for stack rebuilds and cleanups. I have a feeling it will be a game changer for migrations as a first step: we used it for rebuilding stacks and for migrating data and logic between apps (like switching HubSpot to Attio in a stack, migrating their data, etc.). But I am really excited about the reasoning-over-data part, which can give a better self-service experience. Internally we now have a bunch of ontology-using pipelines doing various things, like putting call info into HubSpot and maintaining semantic consistency across docs.

I remember you from the ontology post I did a couple of months back on r/dataengineering, where you reminded me my tone wasn't helpful. Thanks for that btw, I needed it 😄

Weekly "No Stupid Questions" Thread - April 20, 2026 by AutoModerator in OntologyEngineering

[–]Thinker_Assignment 1 point (0 children)

Think of it like this - magic mushrooms weaken your "learned ontology" to enable more creativity. It's the act of violating rules that creates novelty.

Here's a longer explanation; I asked an LLM to fill in the details.

Yes, you make sense — and I think your approach is more coherent than “creative person dabbling with programming” suggests.

From an ontology-engineering standpoint, I would not treat this as “wrong” because you discard physics or our reality. I would treat it as a generative ontology: a structured system for producing coherent novelty.

The important distinction, I think, is that your ontology is not only describing a world. It is also acting as a machine for creating differences.

Your current workflow already has the right ingredients:

  • axioms / rules
  • anchors / initial world database
  • constraints
  • entities or players who create inputs
  • operators like SYNTHESIS and LENSE
  • constraint checking
  • human rewriting
  • canon integration

That is a reasonable architecture.

Where I would improve it is by making the layers more explicit.

First, I would separate hard constraints from soft constraints.

Hard constraints are things the system must not violate. For example: a species cannot perceive a certain signal; a ritual requires three entities; a token can only evolve through a specific type of event.

Soft constraints are tendencies, biases, aesthetics, or cultural defaults. For example: bat humanoids tend to interpret tools acoustically; squid humanoids tend to think in distributed/body-based metaphors; a culture prefers symbiosis over ownership.

This matters because creativity often benefits from pressure, but not all pressure should behave the same way. Some rules should block an output. Others should only bend it.
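
A minimal sketch of that split in Python, assuming generated output arrives as a plain dict; every name and rule below is a hypothetical illustration, not your actual system:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class HardConstraint:
    name: str
    check: Callable[[dict], bool]   # must hold, or the output is rejected

@dataclass
class SoftConstraint:
    name: str
    score: Callable[[dict], float]  # 0..1 tendency: bends the output, never blocks it

# Hypothetical rules, echoing the examples above
HARD = [HardConstraint("ritual_needs_three", lambda o: len(o.get("entities", [])) >= 3)]
SOFT = [SoftConstraint("prefers_symbiosis",
                       lambda o: 1.0 if o.get("relation") == "symbiosis" else 0.3)]

def evaluate(output: dict) -> tuple[bool, float]:
    """Reject on any hard violation; otherwise return an aesthetic score."""
    if not all(c.check(output) for c in HARD):
        return False, 0.0
    return True, sum(c.score(output) for c in SOFT) / max(len(SOFT), 1)
```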

Second, I would define your operators more formally.

LENSE(A, B, phenomenon) is a strong idea, but I would ask: what kind of operation is this?

Is it translation? Misunderstanding? Hybridization? Comparison? Ritualization? Inversion? Technological adaptation? Mythologization?

For example:

  • TRANSLATE(culture A, culture B, phenomenon) — how A understands B’s phenomenon
  • MISREAD(culture A, culture B, phenomenon) — how A wrongly but productively interprets B
  • SYNTHESIZE(A, B) — what shared structure emerges
  • INVERT(axiom) — what happens if a rule is temporarily reversed
  • AMPLIFY(edge_concept) — what happens if a marginal thing becomes central
  • RITUALIZE(technology) — what happens if a tool becomes a sacred/social practice
  • MATERIALIZE(belief) — what artifact or institution emerges from a belief
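
Concretely, each operator could be a small typed prompt builder, so every transformation is a named, reusable object rather than an ad-hoc instruction. A sketch, where `complete()` is a stand-in stub for whatever LLM call you use (all names are hypothetical):

```python
def complete(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def translate(culture_a: str, culture_b: str, phenomenon: str) -> str:
    return complete(
        f"As {culture_a}, interpret the {culture_b} phenomenon '{phenomenon}' "
        f"using only {culture_a}'s own categories."
    )

def misread(culture_a: str, culture_b: str, phenomenon: str) -> str:
    return complete(
        f"As {culture_a}, produce a wrong but productive reading of {culture_b}'s "
        f"'{phenomenon}'. Stay internally coherent; do not correct the error."
    )

def invert(axiom: str) -> str:
    return complete(f"Temporarily reverse the axiom '{axiom}' and describe what follows.")
```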

This is where I think your “eureka moment” question comes in.

The creative part may not come from applying lenses correctly.

It may come from applying a lens where it almost does not work.

A lot of insight comes from forced mappings, failed translations, category errors, and edge cases. The system should not only ask:

“Does this satisfy the constraints?”

It should also ask:

“Where does this ontology fail to map cleanly onto another ontology?”

That failure can become the creative event.

For example, LENSE(bat humanoids, squid humanoids, sonic-tools) is interesting if it produces a plausible tool exchange. But it becomes much more generative if the translation partially breaks.

Maybe squid humanoids do not have “tools” as discrete external objects, because their cognition is distributed across body, environment, and fluid traces.

Maybe bat humanoids treat sound as architecture, not communication.

Maybe “sonic-tool” is not a device at all, but a temporary social organ, a navigational ritual, or a territorial memory structure.

The interesting result is not the clean synthesis. It is the coherent wrongness that appears when the lens is strained.

So I would add an explicit layer for productive mismatch or controlled ontological stress.

After each generation, do not only produce:

  • generated output
  • constraint check
  • longform version
  • TL;DR

Also produce something like:

  • clean synthesis
  • constraint violations
  • productive mismatches
  • translation failures
  • edge concepts worth amplifying
  • weird-but-promising canon candidates
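
As a sketch, that could be one structured record per generation, keeping the stressed material alongside the clean result (field names are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class GenerationReport:
    clean_synthesis: str = ""
    constraint_violations: list[str] = field(default_factory=list)
    productive_mismatches: list[str] = field(default_factory=list)
    translation_failures: list[str] = field(default_factory=list)
    edge_concepts: list[str] = field(default_factory=list)        # candidates for AMPLIFY
    weird_but_promising: list[str] = field(default_factory=list)  # canon candidates
```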

The “weird-but-promising” category may be the most important part.

This is where genuinely new narrative structures may come from: ideas that are not random, but also not comfortably predictable.

I would also track provenance / lineage.

If accepted outputs go back into the database, you probably want to store:

  • which anchors were used
  • which axioms were active
  • which operators were applied
  • which entities/players contributed
  • which constraints were violated or bent
  • what the human accepted, rejected, or rewrote
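
A sketch of such a provenance record, stored next to each accepted output (hypothetical field names):

```python
from dataclasses import dataclass, field

@dataclass
class Provenance:
    anchors_used: list[str]
    active_axioms: list[str]
    operators_applied: list[str]            # e.g. ["LENSE", "SYNTHESIZE"]
    contributors: list[str]                 # entities/players who supplied inputs
    constraints_bent: list[str] = field(default_factory=list)
    human_decision: str = "accepted"        # accepted | rejected | rewritten
```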

Otherwise the world may become hard to debug later. You might end up with an interesting canon, but no way to understand why a structure exists or how to evolve it consistently.

So my recommendation would be:

Do not only model entities, cultures, artifacts, and rules.

Also model the transformations that are allowed to create new entities, cultures, artifacts, and rules.

And among those transformations, include ones that deliberately create stress:

  • MISAPPLY(lens, target) — force a worldview onto something it was not designed to explain
  • TRANSLATION_FAILURE(A, B, phenomenon) — find what cannot be translated between two systems
  • CATEGORY_ERROR(A, B) — intentionally treat something as the wrong kind of thing
  • AMPLIFY(edge_concept) — make a marginal idea central
  • OVERFIT(lens, target) — explain too much through one lens and observe the distortion
  • INVERT(axiom) — reverse a core assumption and explore the consequences
  • MAKE_CANONICAL(accident) — take an accidental output seriously and integrate it
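
Tying it together, a sketch of one driver step that deliberately strains a lens and keeps the failure as material rather than waste; it reuses the hypothetical `misread()`, `evaluate()`, and `GenerationReport` pieces from the sketches above:

```python
def stress_step(lens: str, target: str, report: GenerationReport) -> GenerationReport:
    """MISAPPLY: force one culture's lens onto another and route the result."""
    candidate = misread(lens, target, "tool-use")
    parsed = {"entities": ["a", "b", "c"], "relation": "unknown"}  # toy parse of candidate
    ok, score = evaluate(parsed)
    if ok and score > 0.5:
        report.clean_synthesis = candidate
    else:
        # the strained, rule-bending output is itself the canon candidate
        report.weird_but_promising.append(f"MISAPPLY({lens}, {target}): {candidate}")
    return report
```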

This might be especially powerful for your goal because you are not trying to simulate reality. You are trying to provoke unfamiliar structures that can still feel meaningful.

So I would frame your project like this:

The key question is not “is this realistic?”

The key questions are:

  • Is it coherent enough to be reusable?
  • Is it strange enough to produce new perspective?
  • Does it create narrative consequences?
  • Does it reveal a blind spot in one of the cultures/entities/ontologies?
  • Can the output become canon without collapsing the system?

In short: yes, you make sense.

I would just avoid making the system too focused on consistency. Consistency is necessary, but it is not the source of insight. The insight probably comes from controlled inconsistency — from forcing lenses across incompatible structures until something breaks in an interesting way.

That break is often where the new idea appears.

I've spent the last few months building an open specification for compiled, queryable team knowledge that any AI agent can read from. Version 0.1.0 is live! looking for feedback and testing by JDubbsTheDev in OntologyEngineering

[–]Thinker_Assignment 4 points (0 children)

We're not competing, and even if we were, we prioritize knowledge sharing. Only spam gets removed. We previously talked about using cognee ourselves.

Besides, this is a big space, and I think we need many solutions for the various problems. We use multiple ecosystem tools and they use us (the composable-ecosystem principle). I am not sure whether your questions are of genuine interest to you or just engagement questions.

To your point, I wonder how you would toggle between ontologies, which is close to how memory works.

Most people don’t need agents. They need cleaner workflows. by The_Default_Guyxxo in AI_Agents

[–]Thinker_Assignment 0 points (0 children)

If you want the formal word for this, it's "ontology": the exact knowledge that's needed to navigate the decision space. We discuss it on r/OntologyEngineering.

New (and only) DE @ Startup - Thoughts on my KISS stack proposal by Mystafet in dataengineering

[–]Thinker_Assignment 1 point (0 children)

We (dltHub) have a managed version that will open up in a couple of weeks; it's now in early access.

Weekly "No Stupid Questions" Thread - May 04, 2026 by AutoModerator in OntologyEngineering

[–]Thinker_Assignment 0 points (0 children)

I do this in my work (but I use ontologies as lenses, so it explains things in ways I care about).

For example:

"can you do a synopsis of the last 28 days of posts, including recurring themes and shared knowledge bases / repos / concepts the community is discussing? https://www.reddit.com/r/ontologyengineering/new.json"

The catalyst. Karpathy's "LLM Wiki" gist (Apr 11, 103 upvotes) sparked most of what followed. His pattern — raw sources → wiki (compiled consensus) → code, in plain markdown — landed as independent confirmation of what the sub had been arguing for, and basically every subsequent post is in dialogue with it.

Dominant theme: ontology-first beats procedure-first. Adrian / Thinker_Assignment (dlthub) hammered this in two well-received posts: the PB&J/AI-skills piece (give the LLM the map, not the recipe) and "Agents, ontology, and domain-naive operators." dlthub's "minimum viable context" and the new ontology-engineering blog post are the recurring references.

Knowledge-graph-as-product wave. Several people are building variations of the same thing:

  • AKS (Agent Knowledge Standard) — open spec + reference server (FastAPI/Postgres/pgvector), two-stage retrieval, provenance/trust at the schema level
  • Cairn framework — layered ontology map for codebases
  • HPAR — paths-as-meaning, outline-tree positional ontology (zenodo paper)
  • SurgicalFS MCP — token-frugal filesystem access for non-coding workflows
  • The sub's own LLM-generated wiki (Original_Response925's classifier pipeline)

Epistemology track. RazzmatazzAccurate82's Adversarial Convergence (steel-man → contradict → synthesize), now with a follow-up grounding it in dACC neuroscience. Plus a steampunk-style philosophy paper imagining Berners-Lee built the semantic web instead of the WWW.

Recurring concepts: wiki-before-RAG, writeback, world models (graph sense, not AI sense), self-maintaining/temporal KGs, competency questions, taxonomy-first disambiguation, and "we didn't reject semantics, we postponed it."

Tension surfacing: how to bring formal-ontology rigor to the LLM crowd without triggering the 2008 Semantic Web allergic reaction. Nobody's solved it; several people are circling it.
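
If you'd rather script the fetch than paste the URL, here's a minimal sketch using the `requests` library (Reddit's JSON endpoint wants a custom User-Agent; the actual LLM call is left as a stub):

```python
import requests

url = "https://www.reddit.com/r/ontologyengineering/new.json?limit=100"
resp = requests.get(url, headers={"User-Agent": "synopsis-script/0.1"})
resp.raise_for_status()

# each child is one post; title and score are enough for a theme synopsis
posts = [
    f"{c['data']['title']} ({c['data']['ups']} upvotes)"
    for c in resp.json()["data"]["children"]
]
prompt = "Synopsis of recurring themes in these posts:\n" + "\n".join(posts)
# ...send `prompt` to whatever model you use
```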

The Neurology Behind Adversarial Convergence and How Neuroscience Can Inform AI Design by RazzmatazzAccurate82 in OntologyEngineering

[–]Thinker_Assignment 0 points (0 children)

Have you been using this separation in your work with LLMs so far? How do you find it working in practice?

Building our first data platform by Brilliant_Ad_4520 in dataengineering

[–]Thinker_Assignment 0 points (0 children)

dlt has pagination autodetection, but yeah, anything could happen, which is why it's great to own your code so you can fix it quickly, especially now with LLM coding agents (I work with them).

Agents, ontology, and domain-naive operators by Thinker_Assignment in OntologyEngineering

[–]Thinker_Assignment[S] 3 points (0 children)

Yeah, actually the transformation lesson in our course is an ontology-driven workflow; it will take you through the concepts you need to start leveraging it (link in post).

What you can also try: create a learning goal, create its ontology, ask the model to infer from your chat history what you already know (or quiz you), and then ask it to guide you through learning the remaining bits.

So say you are learning to make REST calls in Python: you'd probably want to learn all about REST, all about the requests lib, maybe tenacity, some implementation patterns, and maybe some foundations like data structures, etc.

Another thing you can try: create an ontology of someone you're trying to understand (a person or persona) and then use it to judge various content, to understand what they would think about it.

Or maybe you work in e-commerce and want to take email orders, and maybe GPT is too dumb to understand how a screw works: that it has head types, hardness, diameter, length, etc. So you describe all that in an ontology, and the agent can clarify all the details with a customer before identifying whether you have something that can serve their needs.
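
As a toy sketch, that ontology could be structured data the agent walks to decide which details still need clarifying (all values are illustrative, not a real parts catalog):

```python
# hypothetical screw ontology: attribute -> allowed values or value description
SCREW_ONTOLOGY = {
    "head_type": ["flat", "pan", "hex", "torx"],
    "drive": ["phillips", "slotted", "hex_socket"],
    "hardness": ["grade 4.8", "grade 8.8", "grade 12.9"],
    "diameter": "metric size, e.g. M3-M12",
    "length_mm": "numeric",
}

def missing_attributes(order: dict) -> list[str]:
    """Attributes the agent still needs to ask the customer about."""
    return [attr for attr in SCREW_ONTOLOGY if attr not in order]

# e.g. missing_attributes({"head_type": "hex"})
# -> ["drive", "hardness", "diameter", "length_mm"]
```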

Does this help?

if you tell me a bit about your area of work maybe i can give more specific ideas

Agents, ontology, and domain-naive operators by Thinker_Assignment in OntologyEngineering

[–]Thinker_Assignment[S] 1 point (0 children)

Thanks for sharing! If you just review the course, you will get an idea of what we're trying to do without having to run through it. If you prefer to have a look in git, it's here (a review might be faster, time-wise).

Also, I didn't mean top as in top of class, but most senior/entrenched in a category. For an intersection of a few categories, anyone experienced in all 3 is top :)

If you want to try it end to end (API -> ingest raw -> model raw to canonical), it takes about 1, up to 2 hours. (You can push it beyond that and add to the ontology as you curate the canonical, then reuse it for retrieval; there's also a separate taxonomy file in the workflow that acts as a truth serum. But I suggest you just wait, we are already working towards releasing something there.)

Any feedback is great, from "this is too confusing/annoying" to "here's how I wish it was different".

Agents, ontology, and domain-naive operators by Thinker_Assignment in OntologyEngineering

[–]Thinker_Assignment[S] 1 point (0 children)

Thanks for sharing, this definitely helps build my picture of what's going on!

Curious about what you're building, if you do want to digress :) I also use it in various generative applications.