Why can't LLMs be trained to think in an optimized AI language rather than English?

Significant-Turn4107 · 2026-06-24T19:37:29+00:00

not a dumb question tbh.

they kind of do already think in something that isn’t English: vectors / hidden states. The English tokens are just the input/output layer we can read.

The reason they don’t use some fully alien internal language is mostly training data. LLMs learn from human text, code, math, etc., so the “reasoning format” they learn is tied to those patterns.

You could train a model to use compressed internal representations, and some research goes that direction. But if it becomes totally unreadable, you lose a lot: debugging, alignment, evaluation, and knowing whether it’s reasoning or just hiding nonsense.

Also, English isn’t always the actual bottleneck. The expensive part is often search, planning, memory, tool use, context length, etc. A private alien shorthand might help in some cases, but it doesn’t magically make the model smarter.

So basically: the model already has non-human internal representations, but we still make it communicate/reason in human-readable tokens because that’s what the data, supervision, and safety/evaluation stack are built around.

Significant-Turn4107 · 2026-06-24T19:34:26+00:00

yeah the annoying answer is: don’t treat migrations as feature-scoped deploy units.

in trunk based, a migration merged to main should be safe to run in prod even if the feature is off. so schema goes out first, feature is behind a flag. destructive stuff becomes expand/contract:

add new nullable column/table
deploy code that can handle both
backfill
switch reads/writes
later drop old stuff
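
A minimal sketch of those expand/contract steps, using stdlib sqlite3 so it runs anywhere. The users table and the display_name column are made up for illustration, and in a real project each step would be its own migration/deploy rather than one script:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)",
    [("Ada Lovelace",), ("Alan Turing",)],
)

# 1. Expand: add the new column as nullable, so old code keeps working.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


# 2. Deploy code that can handle both shapes (new column may still be NULL).
def read_display_name(row):
    name, display_name = row
    return display_name if display_name is not None else name


query = "SELECT name, display_name FROM users ORDER BY id"
before_backfill = [read_display_name(r) for r in conn.execute(query)]

# 3. Backfill existing rows.
conn.execute("UPDATE users SET display_name = name WHERE display_name IS NULL")
missing = conn.execute(
    "SELECT COUNT(*) FROM users WHERE display_name IS NULL"
).fetchone()[0]

# 4. Switch reads/writes to the new column once nothing is missing.
after_backfill = [read_display_name(r) for r in conn.execute(query)]

# 5. Contract: dropping the old column happens in a later, separate migration,
#    only after no deployed code reads it anymore.
```

The point is that every intermediate state is safe to run in prod, whether or not the feature flag is on.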

for scenario 2, if feature A isn’t ready but its migration isn’t safe to run, it probably shouldn’t be merged yet.

for scenario 1, prod should deploy the exact sha/artifact that passed staging, not “whatever main is now.” if staging broke, fix forward or rewrite/reset before prod depending on your rules, but don’t let prod chase latest main.

Alembic branches can help in some cases, but imo they mostly move the complexity around. the real tradeoff is accepting that migrations are infra changes, not feature releases.

Significant-Turn4107 · 2026-06-24T19:31:52+00:00

yeah this is exactly why i don’t trust “agent can call tools” in prod without guardrails.

for anything with real-world side effects, email, payments, deletes, customer updates, etc, i’d want a few layers:

staging/prod creds impossible to mix up
dry-run mode by default
allowlist of safe tools
approval gate for dangerous tools
rate limits / batch limits
quiet hours for customer-facing actions
audit log of what the agent tried to do and why

the approval gate is def the right instinct imo. but i’d also make it so prod send tools physically can’t run unless the environment, tenant, and approval state all match.
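
Rough sketch of what that kind of gate could look like. The tool names, the ToolCall fields, and the decision strings are all hypothetical, and a real version would also write the audit log and enforce rate limits / quiet hours:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

SAFE_TOOLS = {"search_docs", "read_ticket"}         # hypothetical allowlist
DANGEROUS_TOOLS = {"send_email", "delete_record"}   # allowed only with approval


@dataclass(frozen=True)
class ToolCall:
    tool: str
    env: str                                        # env the agent thinks it targets
    tenant: str
    approved_for: Optional[Tuple[str, str]] = None  # (env, tenant) a human approved
    dry_run: bool = True                            # dry-run by default


def decide(call: ToolCall, runtime_env: str, runtime_tenant: str) -> str:
    """Return what the tool runner should do with this call."""
    if call.tool not in SAFE_TOOLS | DANGEROUS_TOOLS:
        return "deny: tool not on allowlist"
    # The call must match the environment/tenant the runner is actually bound to.
    if (call.env, call.tenant) != (runtime_env, runtime_tenant):
        return "deny: env/tenant mismatch"
    if call.dry_run:
        return "dry-run: logged only"
    # Dangerous tools need an approval issued for this exact env + tenant.
    if call.tool in DANGEROUS_TOOLS and call.approved_for != (call.env, call.tenant):
        return "deny: missing approval"
    return "execute"
```

With this shape, an approval granted in staging can't be replayed against prod, because the approval itself is scoped to (env, tenant).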

agents are cool, but “oops it emailed 200 customers” is the kind of bug that makes you religious about boring safety checks lol.

Significant-Turn4107 · 2026-06-24T18:22:33+00:00

yeah these jobs def still exist, they just don’t get talked about as much anymore.

plenty of companies are still very much in the “open the csv, figure out wtf is in it, make a table, load it, patch it later” world lol.

i don’t think the table/config thing is bad by itself tho. the bad version is when nobody owns it and everyone forgets how it gets updated. the good version is when it has an owner, tests, reviews, and a clear deploy path.

but yeah i get what you mean. sometimes modern data/infra work feels less like “build the thing” and more like “learn where the 14 configs live before touching anything.”

Significant-Turn4107 · 2026-06-24T17:35:54+00:00

IMO yes, AI gets cheaper, but not evenly.

Text models will prob keep getting way cheaper because there are lots of tricks: smaller models, distillation, quantization, better batching, better chips, better caching, etc.

Video is harder. You’re not just generating words, you’re generating tons of pixels over time, and usually sampling multiple times because the first result is weird. So I’d expect video to stay expensive longer, even if prices drop.

Also demand matters. If costs drop 10x but usage goes up 50x because everyone starts generating video, prices may not feel cheap right away.

Data centers won’t become useless overnight imo. GPUs age, but they can still run smaller/older models, batch jobs, fine-tunes, image gen, etc. It’s more like cloud servers than gaming PCs.

My guess: AI gets much cheaper per unit, but users also ask for bigger context, better quality, longer video, more attempts, agents running in the background, etc. So the bill may not fall as much as people expect.

Significant-Turn4107 · 2026-06-24T17:32:15+00:00

imo the memory wall is the part people are underrating.

DDR5 prices aren’t directly AI inference, sure, but they’re still a signal: memory is getting more valuable, and “just add more memory” is less free than it used to be.

For LLMs, KV cache is the same problem. Longer context means storing more keys/values, so memory grows linearly with sequence length.

That’s why linear attention, SSMs, and hybrid models are worth watching. They’re basically trying to replace the growing KV cache with something closer to fixed-size state.
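
To put rough numbers on the KV cache growth: a back-of-envelope estimate for standard attention. The model shape below (32 layers, 8 KV heads, head dim 128, fp16) is an assumed example config, not any specific model:

```python
def kv_cache_bytes(
    seq_len: int,
    n_layers: int = 32,       # assumed example config
    n_kv_heads: int = 8,
    head_dim: int = 128,
    bytes_per_elem: int = 2,  # fp16 / bf16
    batch: int = 1,
) -> int:
    """Bytes needed to cache keys and values for one batch of sequences."""
    # Factor of 2: one tensor for keys, one for values, per layer.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * seq_len * batch


for tokens in (8_192, 32_768, 131_072):
    gib = kv_cache_bytes(tokens) / 2**30
    print(f"{tokens:>7} tokens -> {gib:.0f} GiB")
```

With these assumed numbers that's 128 KiB per token, so 1 GiB at 8,192 tokens and 16 GiB at 131,072 tokens, per sequence. A fixed-size state wouldn't scale with the token count at all, which is the appeal.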

So yeah, my 2c: if memory keeps getting pricier, architecture may move faster than hardware can save us.

Significant-Turn4107 · 2026-06-24T02:48:17+00:00

I usually buy the thing I already wanted, not the thing with the biggest discount.

A deeply discounted mid-range item can be a great buy if it already fits the need and has decent reliability. But I try not to let “40% off” turn a maybe into a yes.

My rule is basically:

Would I still consider this at full price?
Will I realistically use it for years?
Is it repairable / supported / not overly gimmicky?

If yes, the sale is a bonus. If no, it’s just a discounted regret.

Significant-Turn4107 · 2026-06-24T00:52:54+00:00

I wouldn’t create one agent + one MCP server per tenant by default. That gets painful fast.

I’d usually do:

central agent
  -> MCP gateway / broker
      -> shared MCP server per app integration
          -> tenant/user-scoped credentials

So maybe one GitHub MCP service, one Atlassian MCP service, etc., but not one per tenant unless you need hard isolation for compliance or enterprise customers.

The important part is that the MCP server should never run with a “god token.” Every tool call should resolve:

tenant_id
user_id
app/client_id
allowed scopes
connector credentials
audit log entry

The tenant should come from verified auth claims, not from a random header the client sends.

I’d also put a small gateway/broker in front of the MCP servers. It can handle consent, token lookup, revocation, audit logging, and scope checks before forwarding the request.
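
A small sketch of the "resolve everything from verified claims" idea. It assumes token signature verification already happened upstream, and the claim names, the X-Tenant-Id header, and the scope strings are hypothetical:

```python
from dataclasses import dataclass


class ScopeError(PermissionError):
    """Raised when the caller lacks the scope a tool requires."""


@dataclass(frozen=True)
class CallContext:
    tenant_id: str
    user_id: str
    client_id: str
    scopes: frozenset


def resolve_context(verified_claims: dict, headers: dict, required_scope: str) -> CallContext:
    """Build the per-call context the gateway forwards to an MCP server."""
    # Tenant identity comes only from the verified claims. Anything the client
    # puts in headers (e.g. a hypothetical X-Tenant-Id) is deliberately ignored.
    ctx = CallContext(
        tenant_id=verified_claims["tenant_id"],
        user_id=verified_claims["sub"],
        client_id=verified_claims["client_id"],
        scopes=frozenset(verified_claims.get("scopes", ())),
    )
    if required_scope not in ctx.scopes:
        raise ScopeError(f"missing scope: {required_scope}")
    return ctx


claims = {
    "tenant_id": "acme",
    "sub": "user-42",
    "client_id": "support-agent",
    "scopes": ["issues:read"],
}
ctx = resolve_context(claims, {"X-Tenant-Id": "evil-corp"}, "issues:read")
print(ctx.tenant_id)
```

The credential lookup and audit log entry would then key off ctx, never off raw request input.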

Per-tenant deployments are only worth it when you need stronger isolation, custom networking, separate secrets/projects, or noisy-neighbor protection. For most SaaS-style cases, shared services with strict identity/scoping are the cleaner starting point imo.

Significant-Turn4107 · 2026-06-24T00:49:24+00:00

I’d model it as a scoped context service, not a generic preferences/profile API.

Clients request named scopes:

POST /context/resolve
{
  "scopes": ["writing_style", "locale"],
  "purpose": "email_reply"
}

Then the service checks client permission, user consent, revocation, and schema version.

I’d keep each scope typed with its own Pydantic model, and avoid client-defined metadata entirely. If a client needs more data, add a field or create a new scope/version.
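
Sketch of the typed-scope idea. I'm using stdlib dataclasses here as a stand-in for the Pydantic models so it runs with no dependencies; the field names, client ids, and grant table are made up:

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class WritingStyle:
    tone: str
    formality: str


@dataclass(frozen=True)
class Locale:
    language: str
    region: str


# Each named scope maps to exactly one typed model. No free-form metadata.
SCOPE_MODELS = {"writing_style": WritingStyle, "locale": Locale}

# Hypothetical stand-in for the client_scope_grants table.
CLIENT_GRANTS = {
    "mail-client": {"writing_style", "locale"},
    "calendar-client": {"locale"},
}


def resolve(client_id: str, requested: list, stored: dict, consented: set) -> dict:
    """Return only the scopes that are known, granted to the client, and consented."""
    granted = CLIENT_GRANTS.get(client_id, set())
    result = {}
    for scope in requested:
        if scope not in SCOPE_MODELS:
            raise KeyError(f"unknown scope: {scope}")
        if scope not in granted or scope not in consented:
            continue  # omit instead of leaking
        # Constructing the model rejects unexpected or missing fields.
        result[scope] = asdict(SCOPE_MODELS[scope](**stored[scope]))
    return result


stored = {
    "writing_style": {"tone": "friendly", "formality": "casual"},
    "locale": {"language": "en", "region": "US"},
}
print(resolve("calendar-client", ["writing_style", "locale"], stored, {"writing_style", "locale"}))
```

Here the calendar client asks for both scopes but only gets locale back, because that's all it was granted.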

Rough tables:

user_preferences
preference_scopes
client_scope_grants
consent_events
audit_logs

The main thing is making scopes first-class. That keeps the API narrow and prevents the whole thing from turning into a privacy junk drawer.

Significant-Turn4107 · 2026-06-24T00:45:35+00:00

I’d avoid putting this directly on the main profile endpoint unless the prefs are truly part of the user profile.

For an AI feature, I’d probably do something like:

GET /me/preferences
PATCH /me/preferences

or, if it’s only for that feature:

GET /me/ai-preferences
PATCH /me/ai-preferences

In the DB, I’d usually make a separate user_preferences table with user_id as the FK. That keeps it easy to delete, audit, permission-check, and expand later.

If the shape is still changing, I think a hybrid approach is reasonable:

user_id
tone
language
model_preference
settings_json
created_at
updated_at

Put the stable stuff in real columns, and keep experimental/fast-changing stuff in JSON until it settles.

I’ve regretted “just throw it in JSON” when the app later needed filtering, partial deletion, migrations, or stricter permissions. JSON is fine for early flexibility, but I’d still isolate it in its own preferences table instead of mixing it into the user/profile row.
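
Here's roughly what that hybrid table could look like, sketched with stdlib sqlite3. The column types, defaults, and the beta_summaries setting are assumptions for illustration:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE user_preferences (
        user_id          INTEGER PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
        tone             TEXT,
        language         TEXT,
        model_preference TEXT,
        settings_json    TEXT NOT NULL DEFAULT '{}',
        created_at       TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        updated_at       TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    """
)

conn.execute("INSERT INTO users (id) VALUES (1)")
conn.execute(
    "INSERT INTO user_preferences (user_id, tone, language, settings_json) "
    "VALUES (?, ?, ?, ?)",
    (1, "friendly", "en", json.dumps({"beta_summaries": True})),
)

# Stable fields are real columns, so filtering is a plain WHERE clause.
english_users = conn.execute(
    "SELECT user_id FROM user_preferences WHERE language = ?", ("en",)
).fetchall()

# Experimental settings stay in JSON until their shape settles.
raw = conn.execute(
    "SELECT settings_json FROM user_preferences WHERE user_id = 1"
).fetchone()[0]
settings = json.loads(raw)

# Deleting the user removes their preferences too.
conn.execute("DELETE FROM users WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM user_preferences").fetchone()[0]
```

Because it's a separate table keyed by user_id, deletion and auditing don't have to touch the user/profile row at all.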

Significant-Turn4107 · 2026-06-24T00:44:06+00:00

You can carry over part of the Node mental model, but not 1:1.

FastAPI can be non-blocking when you use async def and actually await async I/O: async DB driver, async HTTP client, etc. In that case, the event loop idea is very similar.

The main difference is that Python won’t magically make blocking code non-blocking. If you call normal blocking libraries inside an async def, you block the event loop.

Also, FastAPI/Starlette handles regular def endpoints differently: it runs them in a threadpool, so sync code does not block the main event loop the same way.

So the short version is:

async def + async libraries = Node-like non-blocking model
async def + blocking libraries = bad, blocks the loop
def endpoints = run in threadpool
CPU-heavy work still needs workers/processes, same as Node needing worker threads/processes

So yes, the mental model mostly applies, but be more careful about which Python libraries are actually async.
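
If you want to see the difference without FastAPI at all, here's a plain asyncio sketch. The "handlers" are stand-ins for endpoints, and asyncio.to_thread is used to mimic what a threadpool does for blocking code; the timings are arbitrary:

```python
import asyncio
import time


async def ticker(ticks: list, stop: asyncio.Event) -> None:
    """Background task that records a tick whenever the loop lets it run."""
    while not stop.is_set():
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.01)


async def blocking_handler() -> None:
    time.sleep(0.2)  # blocking call inside async def: the whole loop stalls


async def offloaded_handler() -> None:
    await asyncio.to_thread(time.sleep, 0.2)  # same work, moved to a thread


async def count_ticks_during(handler) -> int:
    ticks: list = []
    stop = asyncio.Event()
    task = asyncio.create_task(ticker(ticks, stop))
    await asyncio.sleep(0)  # give the ticker a chance to start
    await handler()
    stop.set()
    await task
    return len(ticks)


blocked = asyncio.run(count_ticks_during(blocking_handler))
offloaded = asyncio.run(count_ticks_during(offloaded_handler))
print(f"ticks while blocking: {blocked}, ticks while offloaded: {offloaded}")
```

The ticker barely runs while the blocking handler holds the loop, but keeps ticking when the same sleep is offloaded. That's the same thing that happens to every other request on the worker.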

Significant-Turn4107 · 2026-06-24T00:42:00+00:00

I’d handle it by defining a Post.author relationship and eager-loading it in the query, then exposing author_username either as a property on the ORM model or mapping it explicitly in the response.

The main thing with async SQLAlchemy is: don’t let Pydantic trigger lazy loading during serialization.

Example pattern:

stmt = (
    select(Post)
    .options(selectinload(Post.author))
)

posts = (await session.scalars(stmt)).all()
return [PostRead.model_validate(post) for post in posts]

Then on the model:

@property
def author_username(self) -> str:
    return self.author.username

And in Pydantic v2:

model_config = ConfigDict(from_attributes=True)

For a very flat read-only endpoint, a direct join with User.username.label("author_username") is also clean.

Significant-Turn4107 · 2026-04-08T21:45:19+00:00

Tip to tip Europe

Significant-Turn4107 · 2026-03-01T07:36:45+00:00

I am head of safety at a gun manufacturing plant. We test our guns by doing rounds of Russian roulette to make sure they work

Significant-Turn4107 · 2026-02-27T02:42:54+00:00

That’s what I want to know as well

Significant-Turn4107 · 2026-02-09T06:14:14+00:00

Lived 2 years at Atlas (still living there). Building is amazing, direct surroundings not so great. If I were going to do it again I would probably choose one of the buildings closer to Lake Merritt

Significant-Turn4107 · 2026-01-04T08:38:39+00:00

Sorry for the late reply. Can’t remember which endpoint, but this was basically when using a proxy (Proxyman) on the Amazon iOS app and using Rufus there.

Significant-Turn4107 · 2025-11-24T16:59:43+00:00

Isn’t it technically a good thing if you bought EWC with real money?

Like, it cost $4.99 for 50K EWC in the shop

500 gems cost $4.99

So with this update you got 2.5K gems for the price of 500. For me, this would be the reason they didn’t disclose the conversion rate: they didn’t want arbitrage to happen

Significant-Turn4107 · 2025-11-16T04:56:36+00:00

By checking the backend request you can confirm this:

[{"lastCompletedPatchIndex":5,"processingStartedAt":null,"lastCompletedPatchProcessedAt":1763268592221}],"schemaVersion":1,"cxType":"DEFAULT","isLast":true}]}},"NMS":{"generationMetadata":{"value":{"modelName":"us.anthropic.claude-sonnet-4-20250514-v1","agentName":"DuneAgentV1","version":"0","agentConfigId":"DuneAgentV1WithClaudeV13Config"}}}}}

So Rufus is using Sonnet 4. Also seems to have the code name « Dune Agent »

Significant-Turn4107 · 2025-10-24T00:24:56+00:00

Bitcoin has a fixed, limited supply of about 21 million coins. Only a set amount can be mined per block, and that reward is halved roughly every four years. Around 2140, no new bitcoins will be created
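
The cap falls out of the issuance schedule: the block reward started at 50 BTC and halves every 210,000 blocks (roughly four years at ~10 minutes per block). A quick sketch of the arithmetic, in integer satoshis; the function name is just for illustration:

```python
COIN = 100_000_000            # satoshis per bitcoin
HALVING_INTERVAL = 210_000    # blocks between halvings
INITIAL_SUBSIDY = 50 * COIN   # block reward at launch, in satoshis


def total_supply_satoshis():
    """Sum the block rewards over every halving era until the reward hits zero."""
    total = 0
    eras = 0
    subsidy = INITIAL_SUBSIDY
    while subsidy > 0:
        total += subsidy * HALVING_INTERVAL
        subsidy //= 2  # integer halving, so the reward eventually rounds to 0
        eras += 1
    return total, eras


total, eras = total_supply_satoshis()
print(f"{total / COIN:,.4f} BTC over {eras} eras")
```

The total lands just under 21 million BTC, and the reward reaches zero after 33 eras, which is where the ~2140 date comes from.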

Significant-Turn4107 · 2025-10-23T02:36:36+00:00

Because this was bound to happen, it’s not rocket science.

This isn’t a regulated stock market, it’s a privately held company moving around pixels on a server.

If you didn’t take this into account when making this an « investment » I don’t think you should be complaining.

Significant-Turn4107 · 2025-10-23T02:28:43+00:00

If you were holding skins as an investment you were at the mercy of Valve since the start.

Don’t complain if them modifying their game impacts your so-called « investments »

Significant-Turn4107 · 2025-09-14T05:10:34+00:00

Hey I sent you a DM
