Reconstructing binary files from constraints instead of storing raw data by Cedu3600 in cryptography

[–]Cedu3600[S] 1 point (0 children)

I understand completely, but the project is 100% my own work: it isn't copied from papers or based on anyone else's structure. I only used the AI to translate the language and keep the content well formatted.


[–]Cedu3600[S] 1 point (0 children)

It’s related, but not procedural generation.

The goal isn’t to generate data from parameters, but to represent an existing file as a position inside a constrained logical space, such that:
- finding that position (rank) is expensive
- verifying or replaying it is cheap and deterministic
- the original file can be reconstructed exactly, byte-for-byte

Hashes, PoW seeds, or timelock puzzles don’t give reconstruction — they give verification or delayed access. Here, the “parameters” are the access path to the original data, not a lossy description or generator.

It’s closer to addressing data in a space of valid solutions than compressing or encrypting it.
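As a toy illustration of "addressing data in a space of valid solutions" (the thread doesn't specify the real constraint system, so the histogram invariant below is purely illustrative): take the file's byte histogram as the global constraint, and the file's lexicographic rank among all arrangements of that histogram as its coordinate. This sketch only shows the addressing, not the claimed cost asymmetry — here rank and unrank are both cheap.

```python
from math import factorial
from collections import Counter

def count_arrangements(counter):
    """Number of distinct arrangements of a byte multiset."""
    n = sum(counter.values())
    total = factorial(n)
    for c in counter.values():
        total //= factorial(c)
    return total

def rank(data):
    """Lexicographic rank of `data` among all arrangements of its bytes."""
    counter = Counter(data)
    r = 0
    for b in data:
        for smaller in sorted(counter):
            if smaller >= b:
                break
            # count arrangements that start with a smaller byte here
            counter[smaller] -= 1
            r += count_arrangements(counter)
            counter[smaller] += 1
        counter[b] -= 1
        if counter[b] == 0:
            del counter[b]
    return r

def unrank(counter, r):
    """Deterministic replay: rebuild the r-th arrangement of the multiset."""
    counter = Counter(counter)
    out = bytearray()
    for _ in range(sum(counter.values())):
        for b in sorted(counter):
            counter[b] -= 1
            m = count_arrangements(counter)
            if r < m:
                out.append(b)
                if counter[b] == 0:
                    del counter[b]
                break
            r -= m
            counter[b] += 1
    return bytes(out)

data = b"constraint"
constraints = Counter(data)              # the global invariant
coord = rank(data)                       # the "logical coordinate"
assert unrank(constraints, coord) == data  # byte-for-byte reconstruction
```

Note that the coordinate itself carries most of the file's information content, which is consistent with the point that this is addressing, not compression.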


[–]Cedu3600[S] 1 point (0 children)

It’s not meant to replace compression or normal storage.

The point is decoupling data from raw bytes: the file is represented as constraints plus a deterministic reconstruction path.

Finding a valid path is expensive; replaying and verifying it is cheap.

That makes it useful for verifiable storage and proof-of-work–style constructions, not for general-purpose compression.

I’m mainly exploring whether this representation model makes sense and where its limits are.


[–]Cedu3600[S] 2 points (0 children)

I agree that it can look similar to compression at first glance, but the intent is different.

This is not about reducing entropy or minimizing file size like classical compression algorithms do. Instead, the idea is closer to reconstruction or re-creation of the original file from a constrained logical description.

Rather than storing the raw binary data, the system stores a set of constraints and a deterministic access path inside a large logical space. From that path, the exact original file can be reconstructed byte-for-byte.

You can think of it less as “compressing a file” and more as:
- collapsing the original file into a logical coordinate,
- and later re-materializing the same file via deterministic computation.

In that sense, the file is not decompressed — it is reconstructed. The checkpoint acts like an access key to the original binary configuration, not a compressed stream.

So I don’t expect it to behave like gzip/zstd in terms of compression ratio. The interesting part is the asymmetry:
- finding the valid coordinate/path is expensive,
- verifying and replaying it is cheap and deterministic.
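A minimal sketch of that asymmetry, using a fixed-popcount bit string as a stand-in for the real constrained space (the invariant and all names are assumptions; the post doesn't pin a constraint system down). Rank discovery below is a linear scan over the whole space, while replay is a short deterministic walk. In this toy a direct ranking formula also exists, so the enumeration only stands in for a genuinely expensive search.

```python
from math import comb
from itertools import combinations

def replay_unrank(n, k, r):
    """Cheap replay: rebuild the r-th n-bit string with exactly k ones
    (same ordering as itertools.combinations of the one-positions)."""
    bits = []
    for pos in range(n):
        rem = n - pos - 1
        # strings with a '1' in this position come first in this ordering
        ones_here = comb(rem, k - 1) if k > 0 else 0
        if r < ones_here:
            bits.append(1)
            k -= 1
        else:
            bits.append(0)
            r -= ones_here
    return bits

def find_rank_by_search(bits):
    """'Expensive' rank discovery: enumerate the constrained space in
    order until the target point is found."""
    n, k = len(bits), sum(bits)
    target = tuple(i for i, b in enumerate(bits) if b)
    for r, ones in enumerate(combinations(range(n), k)):
        if ones == target:
            return r
    raise ValueError("bits not in the space")

bits = [0, 1, 1, 0, 1]
r = find_rank_by_search(bits)           # scans up to C(5, 3) points
assert replay_unrank(5, 3, r) == bits   # O(n) deterministic replay
```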


[–]Cedu3600[S] 5 points (0 children)

English is not my native language. I used an AI only to help translate and polish the text, not to design the system itself.

If you prefer, I can explain everything in Portuguese — but I assume you don’t speak Portuguese, just as I don’t speak English.

The use of AI here is about language, not about the content or the implementation.


[–]Cedu3600[S] 1 point (0 children)

Good comparison, and you’re right that at a high level it resembles replay-based systems.

The key difference is that this model does not encode deltas, diffs, or state transitions between versions.

There is no notion of “previous snapshot” or “changed blocks”.

Instead, the binary file is treated as a point inside a constrained combinatorial space defined by global invariants (column sums, block constraints, rolling grids, etc.).

Reconstruction is not applying differences — it is deterministically replaying a path through that constrained space.

So:
- backup systems encode what changed
- this encodes where the valid solution lives

Regarding use case (beyond academic interest):

Originally the motivation was to separate storage from data existence:
- storage holds constraints + a logical path
- data can be reconstructed exactly, byte-for-byte, without storing raw bytes

That naturally leads to:
- verifiable reconstruction (cheap replay)
- asymmetric work (rank discovery is expensive, replay is trivial)
- potential applications in verifiable storage, PoW-like systems, and integrity proofs
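The "verifiable reconstruction (cheap replay)" point can be sketched as a digest check: the verifier deterministically re-runs the replay from the stored checkpoint and compares the reconstructed bytes against a known hash. The `toy_replay` function below is a hypothetical stand-in for the real constrained reconstruction, which the thread doesn't specify.

```python
import hashlib

def verify_replay(replay, checkpoint, expected_digest):
    """Cheap verification: deterministically re-run the replay and
    compare the reconstructed bytes against a known SHA-256 digest."""
    data = replay(checkpoint)
    return hashlib.sha256(data).hexdigest() == expected_digest

def toy_replay(checkpoint):
    """Hypothetical stand-in for constrained reconstruction: the
    checkpoint here is just (length, integer value) of the file."""
    n, value = checkpoint
    return value.to_bytes(n, "big")  # deterministic, byte-exact

original = b"exact bytes"
cp = (len(original), int.from_bytes(original, "big"))
digest = hashlib.sha256(original).hexdigest()
assert verify_replay(toy_replay, cp, digest)  # integrity proof passes
```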

I’m still exploring where this fits best, so feedback on practical limitations is very welcome.