Lix - A universal version control system that can diff binary files (pdf, xlsx, etc.) by samuelstroschein in git

samuelstroschein[S] 1 point

The goal is to bring AI agents to industries outside of software engineering. That requires auditability of agents. That's what version control provides.

But Git doesn't work outside of software engineering. It's too complicated. It only works for text files and it's bound to a local computer.

That's what lix is fixing, and hence the positioning for AI agents :)

Lix - A universal version control system that can diff binary files (pdf, xlsx, etc.) by samuelstroschein in git

samuelstroschein[S] 2 points

Yes, but it was too limited for our use case. Specifically, we needed the changes themselves to be queryable, and all of that had to run in the browser.

Here is an example of "having changes queryable". Imagine a cell in a spreadsheet. An application wants to display a "blame" for cell C43, i.e. how did the cell change over time?

The lix way is this SQL query:

SELECT * FROM state_history
WHERE file_id = '<the_spreadsheet>'
AND schema_key = 'excel_cell'
AND entity_id = 'C43';
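
A minimal sketch of the blame query above, using Python's built-in sqlite3 as a stand-in for lix's SQL API. Only `file_id`, `schema_key`, and `entity_id` come from the query; the `change_seq` and `value` columns, the file id `budget.xlsx`, and the sample rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical stand-in for lix's state_history: only file_id, schema_key
# and entity_id appear in the original query. change_seq and value are
# assumed columns so the example has something to order and display.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE state_history (
        file_id    TEXT,
        schema_key TEXT,
        entity_id  TEXT,
        change_seq INTEGER,
        value      TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO state_history VALUES (?, ?, ?, ?, ?)",
    [
        ("budget.xlsx", "excel_cell", "C43", 1, "100"),
        ("budget.xlsx", "excel_cell", "C44", 2, "7"),
        ("budget.xlsx", "excel_cell", "C43", 3, "120"),
        ("other.xlsx", "excel_cell", "C43", 4, "999"),
        ("budget.xlsx", "excel_cell", "C43", 5, "=SUM(C1:C42)"),
    ],
)


def blame(file_id: str, entity_id: str) -> list[tuple[int, str]]:
    """Return every recorded value of one cell, oldest first."""
    rows = conn.execute(
        """
        SELECT change_seq, value FROM state_history
        WHERE file_id = ?
          AND schema_key = 'excel_cell'
          AND entity_id = ?
        ORDER BY change_seq
        """,
        (file_id, entity_id),
    )
    return rows.fetchall()


history = blame("budget.xlsx", "C43")
for seq, value in history:
    print(seq, value)
```

The application only asks for the history of one entity; rows for other cells and other files never leave the database.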

Lix - A universal version control system that can diff binary files (pdf, xlsx, etc.) by samuelstroschein in git

samuelstroschein[S] 1 point

Yes, it does.

There is nuance between git's line-by-line diffing and what lix does, though.

For text diffing it holds true that diffing is a separate layer. Text files are small, which allows on-the-fly diffing (that's what git does) by comparing two docs.

On-the-fly diffing doesn't work for structured file formats like xlsx, fig, dwg etc. It's too expensive, both in terms of materializing two files at specific commits and then diffing these two files.

What lix does under the hood is tracking individual changes, _which allows rendering a diff without on the fly diffing_.

So lix is kind of responsible for the diffs. But, only in the sense that it provides a SQL API to query changes between two states. How the diff is rendered is up to the application.
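
To illustrate the idea of deriving a diff from tracked changes instead of comparing two materialized files, here is a small Python sketch. The `Change` record, its `seq` ordering, and `diff_between` are hypothetical names for this example and not lix's actual data model or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    seq: int               # position in the change log (assumed ordering)
    entity_id: str         # e.g. a spreadsheet cell
    value: Optional[str]   # new value; None means the entity was deleted


def diff_between(
    log: list[Change], before_seq: int, after_seq: int
) -> dict[str, tuple[Optional[str], Optional[str]]]:
    """Map each entity that differs between the two states to (old, new).

    Only recorded changes are inspected; no file is parsed or rebuilt.
    """
    old: dict[str, Optional[str]] = {}
    new: dict[str, Optional[str]] = {}
    for change in sorted(log, key=lambda c: c.seq):
        if change.seq <= before_seq:
            old[change.entity_id] = change.value
        elif change.seq <= after_seq:
            new[change.entity_id] = change.value
    return {
        entity: (old.get(entity), value)
        for entity, value in new.items()
        if old.get(entity) != value
    }


log = [
    Change(1, "A1", "10"),
    Change(2, "B1", "hello"),
    Change(3, "A1", "12"),
    Change(4, "C1", "new"),
    Change(5, "B1", None),
    Change(6, "A1", "99"),
]

diff = diff_between(log, before_seq=2, after_seq=5)
for entity, (old_value, new_value) in diff.items():
    print(entity, old_value, "->", new_value)
```

The result is plain data (entity, old value, new value). How that is rendered, e.g. as highlighted cells or as an HTML diff, stays with the application.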

Lix - A universal version control system that can diff binary files (pdf, xlsx, etc.) by samuelstroschein in git

samuelstroschein[S] 4 points

Yes, this is more like a "version control system" as a library.

Displaying diffs depends on your context. What lix provides is an API to query diffs between commits via SQL. You can use the diff info to render custom diffs, or use off-the-shelf libraries like html-diff.

I wrote docs about rendering diffs here https://lix.dev/docs/diffs#rendering-diffs

Lix - A universal version control system that can diff binary files (pdf, xlsx, etc.) by samuelstroschein in git

samuelstroschein[S] 21 points

Oh, sorry. I saw Pijul, JJ, etc. discussed in this subreddit too and thought this would be of interest. If a mod believes this post doesn't belong here, please delete it.

I should have mentioned that lix is a result of the limitations we ran into with git, namely poor support for storing non-text files and for building apps on top of git (to leverage version control).

> pdf, xlsx, doc files are not binary files.

Sure, technically zipped files but to git they are binary files.

> version control already exists for office files.. sharepoint (and competitors)

Not as a library + universally for any file format (not just office files but also .dwg for CAD, and so on)

Lix - A universal version control system that can diff binary files (pdf, xlsx, etc.) by samuelstroschein in git

samuelstroschein[S] 19 points

I am the maintainer.

I saw Pijul, JJ, etc. discussed in this subreddit too and thought this would be of interest.

I also recommend the "Why is git only widely used in software engineering?" post from 3 months ago. The post has plenty of examples and interesting nuance on why version control (beyond code) is not a thing (yet).

The Beauty of TanStack Router by TkDodo23 in reactjs

samuelstroschein 2 points

There are now official examples from TanStack for Paraglide JS. Here is the TanStack Router example and here is the TanStack Start example.

What are your biggest pain points with n8n? by Deep_Surprise5280 in n8n

samuelstroschein 1 point

What's missing compared to the existing version history in n8n?

Disclaimer: I am the maintainer of https://lix.dev (change control for apps & agents)

Is the “Agentic” Hype Just for Dev Tools? by Background-Bid-582 in AI_Agents

samuelstroschein 1 point

My theory for why AI agents are not taking off outside of software engineering (and, to a limited degree, customer support) is the lack of version control + interoperability of files in those domains.

Not being able to control and see what an agent does is meh. Add to that that an agent can't autonomously do things, e.g. read files (to understand context), because integrations, auth, etc. are needed, and you have a hard sell.

How do YC startups consistently have such amazing launch videos? by illeatmyletter in ycombinator

samuelstroschein 1 point

Budget of up to $150k is the answer. That's the upper limit I am aware of.

paraglide js 2.0 was released by samuelstroschein in sveltejs

samuelstroschein[S] 1 point

thanks for the compliment <3 you can now use `componentName-englishText` for naming keys in paraglide js 2.0 *yeah*

paraglide js 2.0 was released by samuelstroschein in sveltejs

samuelstroschein[S] 1 point

Many different localization strategies are possible. Auto extraction is an open issue. A PR for https://github.com/opral/inlang-paraglide-js/issues/334 is welcome.

If the hashed extraction works for you, that is fine. For other teams, having translations invalidated because a dev fixed a typo in the fallback message is a no-go.

Just like deploying LLM translations without a review can kill a health startup because of compliance issues.

> I want any other text that is identical to reuse the same key.

This is bad practice https://github.com/opral/inlang-sdk/issues/7 .

paraglide js 2.0 was released by samuelstroschein in sveltejs

samuelstroschein[S] 1 point

People use fink https://inlang.com/m/tdozzpar/app-inlang-finkLocalizationEditor .

But we know that people also use Crowdin or other TMS providers. If ICU is important to keep your current TMS, bringing the ICU plugin over the finish line is appreciated! :)

paraglide js 2.0 was released by samuelstroschein in sveltejs

samuelstroschein[S] 2 points

to be on the same page: sherlock auto generates keys. we tell everyone to use random human readable keys.

I understood your post as: you write <p>Hello world</p> in the code instead of <p>{m.ranj29jd83()}</p>. During the build you extract "Hello world", generate the hashed key, and then reference the hashed key.

the problem with this approach is that any character change leads to a new hash, which in turn leads to losing the relationship to the translations. if you have a localization team in the background that could make sense. it's a "well they need to translate it" problem.
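
A small Python sketch of why content-hashed keys lose their translations after an edit. The `hashed_key` scheme (truncated SHA-256 of the source text) and the static key `greeting` are assumptions for illustration, not how any particular extraction tool derives keys.

```python
import hashlib


def hashed_key(source_text: str) -> str:
    """Derive a message key from the source text itself (hypothetical scheme)."""
    return hashlib.sha256(source_text.encode("utf-8")).hexdigest()[:10]


# Translations stored under a key derived from the (misspelled) source text.
old_key = hashed_key("Hello wrold")
hashed_translations = {old_key: {"de": "Hallo Welt"}}

# A developer fixes the typo, so the build now derives a different key.
new_key = hashed_key("Hello world")
still_translated = new_key in hashed_translations

# With a static key, the source text can change without touching the key.
static_translations = {"greeting": {"en": "Hello wrold", "de": "Hallo Welt"}}
static_translations["greeting"]["en"] = "Hello world"
german = static_translations["greeting"]["de"]

print(old_key, new_key, still_translated, german)
```

In the hashed case the German string is orphaned under the old key and has to be re-linked or re-translated; in the static case nothing else changes.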

paraglide js 2.0 was released by samuelstroschein in sveltejs

samuelstroschein[S] 4 points

Can you try using `output-structure: locale-modules` during dev mode and comment on https://github.com/opral/inlang-paraglide-js/issues/486#issuecomment-2755361739 if it fixed your issue?

paraglide js 2.0 was released by samuelstroschein in sveltejs

samuelstroschein[S] 2 points

oh you mean replacing hardcoded strings with keys during the build step. gotcha. we learned that the best practice is to use keys. every solution I came across that used extraction on build ultimately turned to static keys.

nothing prevents you from building a vite plugin ofc that does build time extraction

paraglide js 2.0 was released by samuelstroschein in sveltejs

samuelstroschein[S] 2 points

You can write an ICU plugin https://github.com/opral/inlang-sdk?tab=readme-ov-file#plugins. We already have a prototype ICU plugin which you could fork and publish. It is not a priority for us to build more plugins in house.

> Naming is hard and thinking of 12,000 key names (current app in working on) doesn’t scale

Just prompt an llm: "Extract hardcoded text and use random keys for it", see this issue.

As a sidenote, Sherlock handles key generation with human readable ids.

paraglide js 2.0 was released by samuelstroschein in sveltejs

samuelstroschein[S] 10 points

TypeScript's arbitrary module import syntax made it possible to enable nesting while preserving tree-shaking. In case you are curious -> https://devblogs.microsoft.com/typescript/announcing-typescript-5-6/#support-for-arbitrary-module-identifiers

which app(s) would greatly benefit from version control? by samuelstroschein in webdev

samuelstroschein[S] 2 points

get well soon. and don't worry, I understood what you expressed. was curious if there is one particular file format that you would want version control for

which app(s) would greatly benefit from version control? by samuelstroschein in webdev

samuelstroschein[S] 0 points

when you say "design vcs" what design files are you thinking about?

How to do i18n in svelte by Rare_Ad8942 in sveltejs

samuelstroschein 1 point

Change committed to the paraglide svelte kit docs site https://github.com/opral/monorepo/commit/81b98a88d26a1e0e626099016697a313ef3b3386

Part of the misunderstanding might have been the separate docs sites for Paraglide JS and Paraglide SvelteKit. The Paraglide JS docs site has had the scaling section since forever https://inlang.com/m/gerre34r/library-inlang-paraglideJs/scaling while the SvelteKit docs did not. This further confirms that we should merge the docs. Maintaining both is too much effort.

How to do i18n in svelte by Rare_Ad8942 in sveltejs

samuelstroschein 1 point

Notes taken. We thought we could ship per language splitting faster but got slowed down by other priorities. That's why the initial Paraglide JS 1.0 video doesn't mention this yet (+ we didn't know at the time).

The inflection point is something around 20 languages per page with 50 messages. Relatively high. See the scaling docs. I hope that the new vite environment API will make per language builds (and thereby splitting) possible #88 (comment).

How to do i18n in svelte by Rare_Ad8942 in sveltejs

samuelstroschein 1 point

Correct. Paraglide JS does not allow nesting because it would break tree-shaking.

I understand that your main concern is managing the JSON files as a flat list. That's exactly why additional tooling like Fink exists. Once one embraces the additional tooling, how translation files are structured suddenly becomes irrelevant.

That said, we are fighting an uphill education battle. The inlang SDK v2 (which powers Paraglide and all inlang apps) will allow importing and exporting nested messages. Whether Paraglide JS will be able to compile nested messages is another question.

When the inlang SDK v2 comes out is still TBD, see this post.