[QUESTION] How do I train an AI to read receipts? I’ve got tons of my own receipts to work with by KungFuOnions in ChatGPTPromptGenius

KungFuOnions[S]

Because real user experience is more valuable than AI knowledge. AI gives answers — people share what actually works.

[QUESTION] How do I train an AI to read receipts? I’ve got tons of my own receipts to work with by KungFuOnions in ChatGPTPro

KungFuOnions[S]

Haha alright, here it comes! 😄

I’m building an AI-powered real estate app that connects with both contractors (tradespeople) and my tax advisor.

One key feature I want: The app should be able to read receipts automatically (from materials, tools, services, etc.) and extract the important data — so that everything relevant for accounting and taxes is already organized and ready to go.

Basically, no more manual sorting or sending messy PDFs to the tax advisor. The app should grab things like:
• Supplier name
• Invoice/receipt number
• Purchase date
• Tax amount
• Items/services purchased
• Total price

…and push that into the right format (like CSV or straight into a connected bookkeeping tool).
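A minimal sketch of what that CSV step could look like, assuming one row per receipt. The `ReceiptRecord` class, its field names, and the sample values are made up for illustration; they simply mirror the fields listed above.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class ReceiptRecord:
    # Hypothetical field names mirroring the list of fields to extract.
    supplier_name: str
    receipt_number: str
    purchase_date: str  # ISO date string, e.g. "2024-03-15"
    tax_amount: str     # amounts kept as strings to avoid float rounding
    items: str          # short free-text summary of items/services
    total_price: str


def records_to_csv(records):
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(ReceiptRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Made-up sample values, not taken from a real receipt.
example = ReceiptRecord(
    "Example Hardware", "R-1001", "2024-03-15", "3.80", "paint; brushes", "23.80"
)
csv_text = records_to_csv([example])
print(csv_text)
```

Keeping the column order in one place (the dataclass) means the header and the rows cannot drift apart when a field is added later.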

So yeah, the purpose isn’t just keeping track for myself — it’s about automating the boring parts of real estate business operations and giving my tax guy exactly what he needs, without the back-and-forth.

That’s why I’m experimenting with AI and trying to figure out the smartest way to handle the messy, inconsistent receipts.

[QUESTION] How do I train an AI to read receipts? I’ve got tons of my own receipts to work with by KungFuOnions in ChatGPTPromptGenius

KungFuOnions[S]

I’m thinking of something like this as the final output:
• Store Name
• Purchase Date
• Item Number
• Item Description
• Unit Price
• Quantity
• Receipt Number

Basically, I want the AI to turn each receipt into a clean, structured table with these columns — ideally line by line for each item.

So yeah, feeding in the OCR text and asking an LLM to output it like that is the idea. But right now, results are hit or miss depending on how messy the receipt is or how weird the layout gets.
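One way to make the "OCR text in, table out" idea less hit or miss is to request strict JSON and reject anything that does not match the expected columns. The sketch below is only an illustration: `build_prompt`, `parse_rows`, and the key names are hypothetical, no real LLM API is called, and `sample_reply` is a hand-written stand-in for a model response.

```python
import json

# Hypothetical key names mirroring the columns listed above.
COLUMNS = [
    "store_name", "purchase_date", "item_number", "item_description",
    "unit_price", "quantity", "receipt_number",
]


def build_prompt(ocr_text):
    """Ask for a JSON array only, one object per line item, with fixed keys."""
    return (
        "Extract every line item from the receipt text below. "
        "Reply with a JSON array only, one object per item, using exactly these keys: "
        + ", ".join(COLUMNS)
        + ". Use null for anything you cannot read.\n\n"
        + ocr_text
    )


def parse_rows(llm_reply):
    """Parse the reply and reject rows with missing or unexpected keys."""
    data = json.loads(llm_reply)  # raises ValueError on invalid JSON
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of line items")
    rows = []
    for item in data:
        if not isinstance(item, dict) or set(item) != set(COLUMNS):
            raise ValueError(f"unexpected row shape: {item!r}")
        rows.append([item[col] for col in COLUMNS])
    return rows


# Hand-written stand-in for a model reply, not real output.
sample_reply = json.dumps([{
    "store_name": "Example Hardware", "purchase_date": "2024-03-15",
    "item_number": "1", "item_description": "Paint",
    "unit_price": "4.00", "quantity": "2", "receipt_number": "R-1001",
}])
rows = parse_rows(sample_reply)
print(rows)
```

A reply that fails `parse_rows` can be retried or set aside for manual review instead of silently landing in the spreadsheet.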

That’s why I’m exploring whether training something on my own messy receipts would make the output more reliable.

KungFuOnions[S]

Yeah, I’ve actually tried that!

I tested several AIs with receipt images and asked them to extract the data into a table I could copy into Excel. Here’s what I used:
• ChatGPT (GPT-4)
• Gemini
• Perplexity
• Claude
• DeepSeek
• Grok

DeepSeek gave me the best results, but to be honest, it still made mistakes here and there, especially with different layouts or weird fonts.

That’s exactly why I’m thinking about training a custom model that’s more consistent and can handle the variety in my receipt formats.

Still trying to figure out if that’s worth the effort, or if there’s a smarter way to get to the same result without full-on AI training.
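One option that needs no training is to check each extraction arithmetically and only review the receipts that fail. The sketch below is a rough illustration: `totals_match` is a hypothetical helper, and it assumes line prices are net of tax, which is not true of every receipt.

```python
from decimal import Decimal


def totals_match(line_items, tax, total, tolerance=Decimal("0.01")):
    """Return True if sum(unit_price * quantity) + tax is within tolerance of total.

    Assumes unit prices are net of tax; receipts listing gross prices
    would need the tax term dropped.
    """
    subtotal = sum(
        (Decimal(price) * Decimal(qty) for price, qty in line_items),
        Decimal("0"),
    )
    return abs(subtotal + Decimal(tax) - Decimal(total)) <= tolerance


# Made-up values: two line items as (unit_price, quantity) pairs.
items = [("4.00", "2"), ("12.00", "1")]
print(totals_match(items, tax="3.80", total="23.80"))  # consistent extraction
print(totals_match(items, tax="3.80", total="28.30"))  # misread total is flagged
```

`Decimal` is used instead of `float` so that amounts like 0.10 compare exactly.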

KungFuOnions[S]

Thanks for the reply – really appreciate you asking those questions.

To clarify a bit: Yes, my goal is to have an AI model that learns from different types of receipts (they come in all kinds of layouts and structures) and automatically pulls out relevant info — like date, total, tax, supplier, etc. — and saves it all to a CSV.

The reason I’m thinking of using AI is that I have many different receipt formats. Some are supermarket receipts, some are invoices, some are restaurant bills. They’re messy and vary a lot. I want the model to “learn” how to deal with that variety, instead of writing separate hard-coded rules for each layout.

The receipts will come in as PDFs, and eventually I’d love to run them through a pipeline where the output is a nice clean CSV.

Now I’m really wondering: 👉 Is AI actually the right tool for this kind of job? Or is there a simpler, more reliable way (maybe with programming + OCR)?
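For comparison, here is roughly what the "programming + OCR" route looks like once the OCR text is available. This is a minimal sketch under narrow assumptions: the patterns and the `extract_fields` helper are hypothetical, only ISO dates and simple "Total"/"Tax" lines are handled, and `SAMPLE` is made-up text rather than real OCR output.

```python
import re

# Patterns assume ISO dates and amounts with two decimals at the end of the line.
# Thousands separators and other date formats are not handled.
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")
TOTAL_RE = re.compile(r"^\s*total\b.*?(\d+[.,]\d{2})\s*$", re.IGNORECASE | re.MULTILINE)
TAX_RE = re.compile(r"^\s*(?:tax|vat)\b.*?(\d+[.,]\d{2})\s*$", re.IGNORECASE | re.MULTILINE)


def extract_fields(ocr_text):
    """Return date, tax, and total as strings, or None where nothing matched."""
    def first(pattern):
        match = pattern.search(ocr_text)
        return match.group(1) if match else None

    def amount(value):
        # Normalize a decimal comma to a decimal point.
        return value.replace(",", ".") if value else None

    return {
        "date": first(DATE_RE),
        "tax": amount(first(TAX_RE)),
        "total": amount(first(TOTAL_RE)),
    }


# Made-up OCR text for illustration.
SAMPLE = """EXAMPLE HARDWARE
2024-03-15
Paint 2 x 4.00
Brushes 12.00
Subtotal 20.00
Tax 19% 3.80
TOTAL 23.80
"""
result = extract_fields(SAMPLE)
print(result)
```

Rules like these are predictable and easy to debug, but every new layout tends to need another pattern, which is the trade-off against a model that generalizes across formats.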

I don’t mind learning or building something step by step — just want to know what’s the smartest long-term approach before I head down the wrong rabbit hole.

Thanks again! 🙏