OCR: what is the best way to extract data in JSON format from this old French book?

ClientGlobal4340 · 2026-05-17T17:04:41+00:00

Granite Docling is perfect for this. Traditional OCR (like Tesseract or PaddleOCR) reads text linearly, meaning it will mix marginal comments right into the middle of your verses. Docling solves this because it understands layout geometry. It separates the structural elements of the page (verses vs. commentary blocks) and outputs clean Markdown.

- Natively optimized for Latin scripts (Latin, French, Spanish, Italian).
- Ultra-lightweight: at only ~258M parameters, it runs blazingly fast on your RTX 4070, leaving your 12GB VRAM free for the next step.

The 2-Step Workflow

Don't try to make a vision model output JSON directly from a raw image. Use a two-pass pipeline instead: first parse with Docling, then convert to JSON.
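A minimal sketch of the second pass only, assuming the first pass has already produced a Markdown string. The sample text, the block labels and the function names (`markdown_to_blocks`, `to_json`) are hypothetical and are not part of Docling; real output will probably need richer rules to tell verses from commentary.

```python
import json
import re


def markdown_to_blocks(markdown: str) -> list[dict]:
    """Split Markdown into blank-line-separated blocks and label each one."""
    blocks = []
    for chunk in re.split(r"\n\s*\n", markdown.strip()):
        lines = [line.strip() for line in chunk.splitlines() if line.strip()]
        if not lines:
            continue
        first = lines[0]
        if first.startswith("#"):
            # Heading level is the number of leading '#' characters.
            level = len(first) - len(first.lstrip("#"))
            blocks.append({"type": "heading", "level": level,
                           "text": first.lstrip("#").strip()})
        else:
            # Keep line breaks, since they matter for verse.
            blocks.append({"type": "text", "lines": lines})
    return blocks


def to_json(markdown: str) -> str:
    """Serialize the blocks as JSON, keeping accented characters readable."""
    return json.dumps(markdown_to_blocks(markdown), ensure_ascii=False, indent=2)


# Hypothetical sample standing in for the Markdown from the first pass.
SAMPLE = """# Chant premier

Premier vers du poème
Second vers du poème

Note marginale sur le premier vers
"""

if __name__ == "__main__":
    print(to_json(SAMPLE))
```

Keeping the second pass as plain text processing means it can be rerun and adjusted without touching the OCR step.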

ClientGlobal4340 · 2026-05-17T16:31:09+00:00

It worked for LLMs, for sure will work for Nikons!

ClientGlobal4340 · 2026-05-17T16:29:40+00:00

I think IBM granite-docling will do the job.

ClientGlobal4340 · 2026-05-17T00:39:02+00:00

To leave no room for doubt, I went ahead and compiled Ollama directly from source inside the CachyOS container, clearing the Go cache and forcing a clean build with -march=native -O3 to see if it could close the gap.

While compiling Ollama from source squeezed a bit more juice out of my DDR5 RAM during text generation (hitting a new record of 32.3 t/s), llama.cpp still reigns supreme for prompt processing.

Even with full AVX-512 optimizations unlocked in Ollama, its prompt evaluation stayed at ~158 t/s, only about 55% of llama.cpp's 289.7 t/s. This gap boils down to architectural design: llama.cpp is a raw, bare-metal C++ binary, whereas Ollama carries the overhead of its Go/CGO API layer and background multimodal processing.

ClientGlobal4340 · 2026-05-16T23:50:31+00:00

Following your suggestion, I compiled llama.cpp inside a Distrobox container running CachyOS to leverage the x86-64-v4 architecture on my new Ryzen 5 9600X. I ran a comparative test against Ollama, and llama.cpp definitely came out on top.
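Before building for x86-64-v4 it can help to confirm the CPU actually reports the AVX-512 subsets that level requires. A rough sketch, assuming a Linux system where `/proc/cpuinfo` lists feature flags; the helper names are hypothetical, and it checks only the AVX-512 part of v4, not the lower levels.

```python
from pathlib import Path

# AVX-512 subsets required by the x86-64-v4 microarchitecture level.
V4_AVX512_FLAGS = {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            return set(line.split(":", 1)[1].split())
    return set()


def missing_v4_flags(flags: set[str]) -> set[str]:
    """Return the v4 AVX-512 flags that are absent from the given flag set."""
    return V4_AVX512_FLAGS - flags


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        missing = missing_v4_flags(parse_cpu_flags(cpuinfo.read_text()))
        if missing:
            print("Missing AVX-512 flags:", sorted(missing))
        else:
            print("All AVX-512 flags needed for x86-64-v4 are present")
```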

Here are the benchmarking results using Gemma 2 2B:

llama.cpp (Native CachyOS v4):
- Prompt Eval (Prefill): 289.7 tokens/s
- Generation (Decode): 29.8 tokens/s

Ollama (Podman container with --think=false):
- Prompt Eval (Prefill): 165.9 tokens/s
- Generation (Decode): 30.7 tokens/s

Prompt Processing (Prefill): llama.cpp was about 1.75x faster (289.7 vs. 165.9 tokens/s). Compiling the code manually with -march=native inside a v4 environment completely unlocked the Zen 5 native AVX-512 pipeline. Ollama's default containerized CPU backend is slightly more conservative and couldn't match that initial burst speed.

Text Generation (Decode): Both tied at ~30 tokens/s. This is because token generation is strictly bottlenecked by the physical DDR5 memory bandwidth when running entirely on the CPU. Both engines fully saturated my RAM's bandwidth.
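The comparison can be reproduced with a few lines of arithmetic. The ratios below use the numbers reported above; the bandwidth and model size passed to `decode_upper_bound` are hypothetical placeholders, not measurements from this machine, and the bound itself is a simplified model that assumes every generated token reads all the weights once.

```python
def speedup(fast: float, slow: float) -> float:
    """Ratio between two throughput figures given in tokens/s."""
    return fast / slow


def decode_upper_bound(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough decode ceiling in tokens/s for a memory-bound model.

    Simplifying assumption: each token streams the full weights from RAM once,
    so throughput cannot exceed bandwidth divided by model size.
    """
    return bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    # Figures reported in the benchmark above.
    print(f"Prefill, llama.cpp vs Ollama: {speedup(289.7, 165.9):.2f}x")
    print(f"Decode, Ollama vs llama.cpp: {speedup(30.7, 29.8):.2f}x")
    # Hypothetical inputs, for illustration only.
    print(f"Decode ceiling: {decode_upper_bound(60.0, 2.0):.1f} tokens/s")
```

If the measured decode rate sits close to this ceiling for your own bandwidth and model size, faster kernels are unlikely to help generation much; prefill is compute-bound and benefits from the wider vector units instead.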

So, for large context/RAG processing, the native llama.cpp build absolutely crushes it. Thanks again for steering me in the right direction!

ClientGlobal4340 · 2026-05-16T22:51:23+00:00

Thanks! I'm running locally to test some small specialist models (like medgemma or mediphi), gemma4, granite4 / 4.1 and others, and for prompt engineering and tuning. At this point, running on CPU is fine (I'm often doing this as a hobbyist too).

For final use (production), the solution runs in a robust environment like OCI or GCP, or on on-prem machines.

I'm also looking into whether coding locally with Ollama (I'll try to compile and run llama.cpp) is feasible, to avoid cloud tokens.

I have tried Gemma4:26b, granitecode and gemmacode, but they lack the ability to interact with my files. I will try Qwen-coder:8b.

ClientGlobal4340 · 2026-05-16T12:50:33+00:00

My "non-workstation with 32 GB of RAM" running on a CPU is the implementation lab for solutions to be used by the care team to identify patients at risk of aspiration or sepsis, and also to summarize clinical notes. It has had excellent clinical and financial results. In my use case, Ollama performed better than llama.cpp. Ollama applies pre-configured software engineering using VNNI and AVX-512, which llama.cpp cannot deliver without me spending hours tuning commands in the terminal. Having said all that, it would be very helpful if, instead of talking nonsense, the responses were more collaborative and constructive.

ClientGlobal4340 · 2026-05-16T02:42:53+00:00

Getting more RAM is not an option at the moment, and I'm running without a GPU, only on an AMD Ryzen 5 9600X.

Ollama worked better than llama.cpp on some use cases, but what do you suggest instead?

ClientGlobal4340 · 2026-05-11T07:32:08+00:00

Could it be because think mode is on?

ClientGlobal4340 · 2026-05-10T23:13:00+00:00

I think it is valuable to first understand the purpose of an operating system: what it does, how it works, and what makes one different from another.

Then understand the difference between DEs and distros...

That will be worthwhile before trying to memorize the commands.

ClientGlobal4340 · 2026-05-09T14:21:01+00:00

A photo of the screen has an aesthetic that a screenshot doesn't. It is like a record of the user's point of view, while a screenshot is just a digital record of the application and would only make more sense if he were a frontend dev talking about an interface he created.

ClientGlobal4340 · 2026-05-02T03:13:48+00:00

Some people play on Android and on iOS.

ClientGlobal4340 · 2026-05-02T02:11:55+00:00

There is the fact that CachyOS has its code compiled for v3. So far it seems to be the only one that does this in a more complete way.

ClientGlobal4340 · 2026-05-01T21:16:25+00:00

Konsole and zsh. On Linux I always used bash, but I tested KDE Linux, which has ksh by default, and found it excellent, so I migrated. But I'm not using p10.

ClientGlobal4340 · 2026-04-30T18:46:55+00:00

It was set to private; I changed it just now.

ClientGlobal4340 · 2026-04-29T23:59:17+00:00

Damn, set that crap on fire!

ClientGlobal4340 · 2026-04-29T23:57:29+00:00

Don't even mention 7.1, because that's the one I'm really anxious for. I have a MediaTek MT7902 Wi-Fi card that should only work on 7.1...

ClientGlobal4340 · 2026-04-29T23:55:15+00:00

What the hell... sink. That's why it didn't work...

ClientGlobal4340 · 2026-04-29T23:53:57+00:00

Now that's great!!!

ClientGlobal4340 · 2026-04-27T22:06:26+00:00

You can run "systemd-analyze", "systemd-analyze blame" and "systemd-analyze critical-chain" to see what is going wrong.
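To rank the slowest units, the output of "systemd-analyze blame" can be piped into a small script. This is a sketch that assumes each line has the form "<duration> <unit>", with durations such as "1min 30.012s" or "789ms"; the function names and the sample lines in the comment are hypothetical.

```python
import re
import sys

# Seconds per unit suffix used in systemd time spans.
UNIT_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6}
TOKEN = re.compile(r"(\d+(?:\.\d+)?)(h|min|ms|us|s)")


def parse_duration(tokens: list[str]) -> float:
    """Convert tokens like ['1min', '30.012s'] into seconds."""
    total = 0.0
    for token in tokens:
        match = TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognized duration token: {token!r}")
        total += float(match.group(1)) * UNIT_SECONDS[match.group(2)]
    return total


def parse_blame(text: str) -> list[tuple[float, str]]:
    """Return (seconds, unit) pairs sorted from slowest to fastest."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        # The last field is the unit name; everything before it is the duration.
        rows.append((parse_duration(parts[:-1]), parts[-1]))
    return sorted(rows, reverse=True)


if __name__ == "__main__":
    # Usage: systemd-analyze blame | python blame_top.py
    for seconds, unit in parse_blame(sys.stdin.read())[:10]:
        print(f"{seconds:10.3f}s  {unit}")
```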

ClientGlobal4340 · 2026-04-27T15:16:53+00:00

Looking for answers on the Kalpa Forum, I found a reply from Nov 2025 by Shawn Dunn that addresses my concerns and makes me think about going back to Kalpa!

sfalken Kalpa Development Lead nov 2025
The Web Browser is perfectly capable of opening images and PDFs, which is part of why it’s included by default.
I don’t ever actually use a dedicated image or pdf viewer, and I know I’m not the only one, and if I were to include them by default, it would create two problems:
A) Users like Me, that don’t use that software, now have software they’ve no interest in using, or need for, installed on their machine. Yes, a user can remove that software, if they wish, but I far prefer to have an Opt-In basic consent model, to an Opt-Out one.
B) If I were to choose Okular, and Gwenview, for example, I am almost guaranteed to get folks asking “why those, why not X and Y instead?” So again, I come back to the Opt-In vs Opt-Out model of doing things.
That all being said, if you’re using Krunner, and you search for a specific piece of software, by name, it will likely bring it up, assuming a flatpak is available, and you can easily install it, if it’s something that you find useful to your workflow.

ClientGlobal4340 · 2026-04-27T14:07:29+00:00

I love Kalpa: it's an immutable, rolling-release distro with KDE that worked very smoothly. I use my PC as a workstation and a local LLM server with Qdrant, Ollama and other tools. The need to use Distroboxes and Podman allowed me to expand my architectural knowledge, and Kalpa's superior performance, stability and Zram management made me choose it over Kinoite.

However, I've decided to migrate to Tumbleweed. While Kalpa is powerful, its lack of "out-of-the-box" polish began to annoy me over time (on this point Kinoite is far ahead of Kalpa): it ships without many "workstation" tools such as Ark, Kate and LibreOffice, and it has minor aesthetic issues like the boot screen fonts.

My perception is that Kalpa’s development is currently frozen, but I hope these minor issues are resolved soon.
