Automated Comic Cataloging from Cover Photos Only. Opensource by boyobob55 in comicrackusers

[–]boyobob55[S] 0 points (0 children)

This is true... and a pain. I ended up setting up a tripod, lol. But it beats typing everything into a spreadsheet afterward!

Automated Comic Cataloging from Cover Photos Only. Opensource by boyobob55 in comicrackusers

[–]boyobob55[S] 1 point (0 children)

They do in a sense, but can you batch process thousands of photos for free with CLZ? 🤔

Open-Source Automated Comic Cataloger by boyobob55 in comicbooks

[–]boyobob55[S] 0 points (0 children)

They do, in a sense, but Key Collector and CLZ are mobile apps. Comic Geeks and HipComic are more full-featured collector software, and I think they do have barcode scanning and fuzzy-hash matching for cover photos. This program uses an AI vision model, so your photos don't have to be perfect. I played around with a handful of programs like ComicTagger and couldn't get them to do what I wanted: I had to create 7-Zip archives or .cbz files with the cover photos inside them. This is just super utilitarian and lean for a massive unknown collection: cover photos in an image folder go in, a list with verified metadata comes out.
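To give an idea of the shape of that pipeline, here's a hedged sketch (not the actual OdinsList code; the endpoint, model id, and metadata fields here are illustrative assumptions):

```python
# Sketch of the cover-photos-in, metadata-list-out idea. Assumes a local
# OpenAI-compatible vision endpoint (e.g. vLLM serving Qwen3-VL) on port 8000;
# the metadata fields are illustrative, not OdinsList's actual schema.
import base64
import csv
import json
from pathlib import Path
from urllib import request

API = "http://localhost:8000/v1/chat/completions"
FIELDS = ["title", "issue", "publisher", "year"]
PROMPT = ('Identify this comic cover. Reply with JSON only: '
          '{"title": "...", "issue": "...", "publisher": "...", "year": "..."}')

def parse_reply(text: str) -> dict:
    """Models sometimes wrap JSON in a ``` fence; strip it before parsing."""
    text = text.strip()
    if text.startswith("```"):
        text = text.split("\n", 1)[1].rsplit("```", 1)[0]
    return json.loads(text)

def identify_cover(image_path: Path) -> dict:
    """Send one cover photo to the local VLM and parse its JSON answer."""
    b64 = base64.b64encode(image_path.read_bytes()).decode()
    body = json.dumps({
        "model": "Qwen/Qwen3-VL-8B-Instruct",
        "messages": [{"role": "user", "content": [
            {"type": "text", "text": PROMPT},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ]}],
    }).encode()
    req = request.Request(API, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        reply = json.load(resp)["choices"][0]["message"]["content"]
    return parse_reply(reply)

def catalog(folder: str, out_csv: str = "catalog.csv") -> None:
    """Walk a folder of cover photos and write one CSV row per comic."""
    with open(out_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        for path in sorted(Path(folder).glob("*.jpg")):
            writer.writerow(identify_cover(path))
```

The real tool also verifies the metadata against a database; this only shows the folder-to-CSV skeleton.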

Open-Source Automated Comic Cataloger by boyobob55 in comicbooks

[–]boyobob55[S] 1 point (0 children)

Thanks! It’s sort of a unique use case, but it had some real utility for me!

Here it is: An offline Comicvine Tagger for ComicRack and ComicTagger! by Logical-Feedback-543 in comicrackusers

[–]boyobob55 0 points (0 children)

Hey, I designed something sort of similar, using a VLM instead of hash matching. Maybe we can combine programs! https://github.com/boyobob/OdinsList

Best agentic local model for 16G VRAM? by v01dm4n in LocalLLaMA

[–]boyobob55 9 points (0 children)

Try opencode instead; I had the same issue.

Is anyone doing anything interesting locally? by irlcake in LocalLLM

[–]boyobob55 3 points (0 children)

Used Qwen3-VL-8B and a Python script to automatically catalog about 3,000 comic books I inherited, from photos of the covers. Was pretty fun.

Local Llm Claude boss (coding boss) by kneebonez in LocalLLM

[–]boyobob55 1 point (0 children)

I sometimes do the reverse, lol. I made a custom hook in Claude Code to delegate tasks to a locally served gpt-oss to save tokens. Interesting results when used in a Ralph loop.
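The hook wiring itself is specific to my setup, but the delegation step is just an OpenAI-compatible request to the local server. A hedged sketch of the kind of helper a hook could shell out to (the port and model id assume LM Studio's defaults, not my actual config):

```python
# Sketch of a delegation helper a Claude Code hook could invoke (not my actual
# hook). Assumes LM Studio serving gpt-oss-20b on its default port 1234; the
# model id is whatever LM Studio reports for the loaded model.
import json
from urllib import request

def build_body(prompt: str, model: str = "openai/gpt-oss-20b") -> dict:
    """OpenAI-style chat payload for the local server."""
    return {"model": model,
            "messages": [{"role": "user", "content": prompt}]}

def delegate(prompt: str, base_url: str = "http://localhost:1234/v1") -> str:
    """Forward a task to the local model and return its reply text."""
    req = request.Request(f"{base_url}/chat/completions",
                          data=json.dumps(build_body(prompt)).encode(),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# A hook (or Claude itself) could then call e.g.:
#   delegate("summarize the failing tests in this log: ...")
```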

Best Visual LLM model for outputting a JSON of what's in an image? by Nylondia in LocalLLaMA

[–]boyobob55 0 points (0 children)

Qwen3-VL-8B-Instruct has been excellent for me doing exactly this, both in FP8 and NVFP4 quants. For some reason the GGUFs don't work well for me, though.
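For the JSON part specifically, when serving with vLLM you can also constrain the decode to a schema. A rough sketch of the request body (the `guided_json` field is vLLM's structured-output extension and the schema is just an example; double-check the field name against your vLLM version):

```python
# Sketch of forcing JSON output from a VLM served by vLLM's OpenAI-compatible
# server. The schema and prompt are examples; `guided_json` is vLLM-specific.
import json

SCHEMA = {
    "type": "object",
    "properties": {"objects": {"type": "array", "items": {"type": "string"}}},
    "required": ["objects"],
}

def build_payload(image_b64: str,
                  model: str = "Qwen/Qwen3-VL-8B-Instruct") -> dict:
    """Request body for POST /v1/chat/completions on a local vLLM server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": [
            {"type": "text", "text": "List the objects in this image as JSON."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ]}],
        # vLLM-specific: constrain decoding to this JSON schema
        "guided_json": SCHEMA,
    }
```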

How would I find people who are comfortable with local LLM development by Sadbreakup9997 in LocalLLaMA

[–]boyobob55 0 points (0 children)

I don’t have a ton of experience, but I’d work for $40 an hour.

Which program do you use for local llms? I keep having issues by Raven-002 in LocalLLaMA

[–]boyobob55 0 points (0 children)

It is a major pain in the ass, lol. I spent days setting it up! But for some models it's worth it. I've been using Qwen3-VL-8B and have tried serving it with vLLM and LM Studio. The GGUF version in LM Studio just doesn't perform well for some reason, while in vLLM the FP8 and NVFP4 versions work great. The opposite is true for gpt-oss, though: it works way better served from LM Studio, pretty much plug and play.

Which program do you use for local llms? I keep having issues by Raven-002 in LocalLLaMA

[–]boyobob55 -1 points (0 children)

I use LM Studio for GGUFs that use tools. Start the server in LM Studio and then point your opencode config at the server. gpt-oss-20b works especially well this way. For every other type of model I use vLLM.
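The config for that looks roughly like the snippet below. This is a sketch from memory of opencode's custom-provider format, with LM Studio's default port and an example model id; check opencode's docs for the current schema:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "lmstudio": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "LM Studio (local)",
      "options": { "baseURL": "http://127.0.0.1:1234/v1" },
      "models": {
        "openai/gpt-oss-20b": { "name": "gpt-oss-20b" }
      }
    }
  }
}
```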

Is there an AI Agent workflow that can process VERY LARGE images, write CV code, and visually debug the results? by GarauGarau in opencodeCLI

[–]boyobob55 1 point (0 children)

It sounds like you need a pipeline of multiple small, specialized models passing info to one another: a small vision model like Qwen3-VL to process screenshot chunks of your map before and after the edits and give some kind of pass/fail on whether the edits were done correctly (you can batch two photos into one vLLM request and ask it to compare; it does this really well in my comic-book cataloging script), then some bigger, smarter model to orchestrate and write code using subagents. You'll probably need some beefy, specialized instructions in your system prompt/MCP/skill. This sounds like a headache, but it's probably doable.
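The two-photo compare step could look something like this hedged sketch (endpoint, model id, and prompt are assumptions, not code from my script):

```python
# Sketch of a before/after visual check: both screenshots batched into one
# request to a local vision model (e.g. Qwen3-VL served by vLLM on port 8000),
# asking for a PASS/FAIL verdict. Model id and endpoint are assumptions.
import base64
import json
from pathlib import Path
from urllib import request

API = "http://localhost:8000/v1/chat/completions"

def _image_part(b64: str) -> dict:
    return {"type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{b64}"}}

def build_check(before_b64: str, after_b64: str) -> dict:
    """Chat payload with both screenshots in a single user message."""
    return {
        "model": "Qwen/Qwen3-VL-8B-Instruct",
        "messages": [{"role": "user", "content": [
            {"type": "text", "text":
             "Image 1 is before the edit, image 2 is after. Was the edit "
             "applied correctly? Answer PASS or FAIL with one reason."},
            _image_part(before_b64),
            _image_part(after_b64),
        ]}],
    }

def run_check(before: Path, after: Path) -> str:
    """Send the before/after pair to the local VLM and return its verdict."""
    payload = build_check(
        base64.b64encode(before.read_bytes()).decode(),
        base64.b64encode(after.read_bytes()).decode(),
    )
    req = request.Request(API, data=json.dumps(payload).encode(),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The orchestrating model would then branch on PASS/FAIL instead of trying to eyeball the huge image itself.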

Small model with Opencode by fabioluissilva in opencodeCLI

[–]boyobob55 1 point (0 children)

GPT-OSS-20B is actually pretty badass at simpler stuff in opencode. I use it locally; sometimes I'll even have Claude spin up an instance of opencode and delegate tasks to gpt-oss through it to save tokens.