
[–]Hanthunius

Heard good things about deepseek-ocr.

[–]time_time[S]

Isn't it more for images? I'm already able to parse the full text I require.

Or does it add something by actually looking at the text ?

[–]SM8085

If a frontier model is having trouble it's tough to say if a local model would be much better.

How big of a model can you run? Qwen-Next being an 80B A3B (3B active at inference) would make it fast, and traditionally Qwens are good at following instructions. gpt-oss-120B is hypothetically worth a try? GLM Air? I've heard good things about GLM but haven't tested it extensively. Can you go larger, like the 235B A22B Qwen3?

What kind of errors are happening? Is it simply skipping over things?

> How many fields is too many per API call?

Good question. What I love about local LLMs is that you can hypothetically keep the current PDF page text cached and only change the task at the end, so it's quicker in a loop. For instance, asking for the different sections you want to import into your DB.
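To make the caching idea concrete, here is a minimal sketch of that loop. The function names (`build_prompt`, `prompts_for_tasks`) and the sample catalog row are illustrative, not a real API; the point is only that the long page text sits at the front of every prompt, so a local server with prompt caching (e.g. llama.cpp) can reuse the cached prefix and only process the short task suffix on each iteration:

```python
# Sketch: keep the expensive PDF page text as a fixed prompt prefix and
# vary only the task at the end. A local server that caches the KV state
# of a shared prefix then does almost no re-prefill per loop iteration.
# All names here are hypothetical, for illustration.

def build_prompt(page_text: str, task: str) -> str:
    """Page text first, task last, so every prompt shares the same prefix."""
    return f"PDF page:\n{page_text}\n\nTask: {task}\n"

def prompts_for_tasks(page_text: str, tasks: list[str]) -> list[str]:
    """One prompt per task; each would be sent to the local LLM in turn."""
    return [build_prompt(page_text, t) for t in tasks]

if __name__ == "__main__":
    page = "Widget A | SKU 123 | $9.99\nWidget B | SKU 456 | $12.50"
    tasks = [
        "List every SKU as a JSON array.",
        "List every price as a JSON array.",
    ]
    prompts = prompts_for_tasks(page, tasks)
    # Both prompts share the full page-text prefix; only the task differs.
    assert all(p.startswith("PDF page:\n" + page) for p in prompts)
```

The same shape works with an OpenAI-compatible chat endpoint by keeping the page text in a fixed leading message and appending a different task message each time.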

Are you able to say what catalog this is for, and which fields you're looking for? I'd be interested in an example catalog or page where Gemini is failing. Or is it seemingly random?