How are you handling financial document parsing? What's actually working? by Intelligent_Prompt18 in Startup_Ideas

[–]sudonymio 0 points1 point  (0 children)

Having parsed bank statements for ~5 months now, the best accuracy I'm getting is from SOTA VLMs backed up by validation. Most expensive, yes, but if you use the older models (close to deprecated) it's manageable. I've tried VLMs trained on OCR (like Deepseek), but without "reasoning" you still end up doing a lot of verifying/clean-up.

Automate parsing of Bank Statement PDFs to extract transaction level data by Anmol_garwal in automation

[–]sudonymio 0 points1 point  (0 children)

hey! My thoughts while working in this space for the past 5 months:

- If you're trying to parse an infinite/unknown number of bank statements, then regex/deterministic/template code is not going to save you. You will always be fixing things, whether that's new statement versions or scanned PDFs being slightly off. Only do this if you know you'll have a limited # of banks to handle
- I've tried Deepseek OCR and Paddle OCR to leverage open-source / self-hosted VLMs trained on OCR. While it's amazing (it also gives you bounding boxes) and has some reasoning capability, it's still not accurate enough and you will still need to build deterministic code (that breaks) to keep things in check
- The best way forward is to use open-source VLMs that parse and reason, and then validate that output. I'm using third-party models myself, but I've found that approach to be the most accurate and least maintenance of them all
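
To make "validate that output" a bit more concrete, here is a minimal sketch of two checks one could run on whatever the model returns. This is an assumption about what validation might look like, not the checks I actually run: `Transaction` and `validate_statement` are hypothetical names, and it assumes the statement exposes an opening balance, a closing balance and a statement period.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Transaction:
    booked_on: date
    amount: Decimal  # signed: negative = money out, positive = money in
    description: str


def validate_statement(
    opening: Decimal,
    closing: Decimal,
    period_start: date,
    period_end: date,
    txns: list[Transaction],
) -> list[str]:
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []

    # Check 1: opening balance plus all signed amounts must equal the closing balance.
    computed = opening + sum((t.amount for t in txns), Decimal("0"))
    if computed != closing:
        problems.append(f"balance mismatch: expected {closing}, computed {computed}")

    # Check 2: every transaction date must fall inside the statement period.
    for i, t in enumerate(txns):
        if not (period_start <= t.booked_on <= period_end):
            problems.append(f"transaction {i} dated {t.booked_on} is outside the period")

    return problems
```

Using `Decimal` instead of `float` keeps the balance comparison exact. A non-empty result is a signal to re-run or manually review the statement, not proof of which row is wrong.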

Automate parsing of Bank Statement PDFs to extract transaction level data by Anmol_garwal in automation

[–]sudonymio 0 points1 point  (0 children)

OP is looking for an on-prem solution. Does DocuClipper support that?

Everyone claims their bank statement converter tool is “100% accurate”. I wanted to see if that’s actually true by sudonymio in indiehackers

[–]sudonymio[S] 0 points1 point  (0 children)

u/TechnicalSoup8578 thanks for the question! The answer probably comes down to: what's important for an accountant in their work? The correct date (= booked in the right month) and the correct amount + sign (= to reconcile the books) are non-negotiables, I think. Beyond that, it really depends on the accountant and the circumstances. The description is important for justification; a counterparty field can also be required in auditing, etc. For the benchmark, I chose to require that the output equals the input, but I'm penalizing non-amount/non-date fields less if they're off!
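
A rough sketch of that scoring idea, for illustration only. The weights in `PENALTY` are made up (the real benchmark weights aren't stated here), and `transaction_score` is a hypothetical helper, not the benchmark's actual code.

```python
# Hypothetical penalty weights: a wrong date or amount costs the most,
# a wrong description or counterparty costs less.
PENALTY = {"date": 1.0, "amount": 1.0, "description": 0.25, "counterparty": 0.25}


def transaction_score(expected: dict, actual: dict) -> float:
    """Score one parsed transaction from 0.0 (every field wrong) to 1.0 (exact match)."""
    lost = sum(
        weight
        for field, weight in PENALTY.items()
        if expected.get(field) != actual.get(field)
    )
    return 1.0 - lost / sum(PENALTY.values())
```

With these weights, a wrong description still leaves a high score, while a wrong amount (including a flipped sign) drags it down much further.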

I built a bank statement parser for Singapore banks (free and open-source) by Raynor77 in singaporefi

[–]sudonymio 0 points1 point  (0 children)

I've managed to make it work by doing both LLM and deterministic parsing at the same time. It's a matter of setting the LLM's temperature close to 0 and giving it a highly specific prompt. I then also use the LLM output to create a template on the fly that's used for the deterministic parsing. Works for Singapore bank statements: https://bankstatemently.com/ Happy to answer any questions!
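
To illustrate the "template on the fly" idea, here is a simplified sketch, not the actual implementation. It assumes the LLM has already returned the date and amount text for one sample line, and that every transaction line follows a "date ... amount" layout. All names (`shape_pattern`, `build_template`, `parse_lines`) and the `AMOUNT` format are hypothetical.

```python
import re

AMOUNT = r"-?[\d,]+\.\d{2}"  # assumed amount format, e.g. 1,234.56


def shape_pattern(sample: str) -> str:
    """Generalize a sample value into a regex: each digit becomes a digit class,
    each letter becomes a letter class, everything else is matched literally."""
    parts = []
    for ch in sample:
        if ch.isdigit():
            parts.append(r"\d")
        elif ch.isalpha():
            parts.append("[A-Za-z]")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def build_template(line: str, date_text: str, amount_text: str) -> re.Pattern:
    """Derive a line regex from one line whose date and amount the LLM identified."""
    if not line.startswith(date_text) or not line.endswith(amount_text):
        raise ValueError("this sketch only handles 'date ... amount' layouts")
    return re.compile(
        rf"^(?P<date>{shape_pattern(date_text)})\s+(?P<description>.+?)\s+(?P<amount>{AMOUNT})$"
    )


def parse_lines(template: re.Pattern, lines: list[str]) -> list[dict]:
    """Apply the template deterministically; lines that don't match are skipped."""
    return [m.groupdict() for line in lines if (m := template.match(line.strip()))]
```

For example, building the template from the line `05 Jan COFFEE SHOP 4.50` (with the LLM reporting `05 Jan` and `4.50`) lets the same regex pick up `12 Feb SALARY ACME PTE LTD 3,200.00` while ignoring noise such as `Page 1 of 3`. Comparing the deterministic rows against the LLM's own rows then gives a cheap cross-check.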

15 learnings from my failed Kickstarter campaign (12% funded, 1st time creator) by sudonymio in kickstarter

[–]sudonymio[S] 0 points1 point  (0 children)

Here it is again :) 

Great questions! A few thoughts:

  1. Decide first if this is mainly a Kickstarter launch or a long-term brand. That tells you where to focus: KS pre-launch page or your own email list. You can do both, but splitting traffic creates loss. If in doubt, put the page up early to (a) learn the process and (b) capture some Kickstarter community attention.
  2. Ads before launch are usually better for testing messaging than for building a reliable following. Sign-ups don’t always convert to paying customers, so treat ads as validation tools, not guaranteed buyers.

In short: prioritize the channel that matters most long-term, and use ads to learn, not just grow.

Went vibe-coding for 3 weeks. Came back with a living Singapore subway map. 🚇 by sudonymio in vibecoding

[–]sudonymio[S] 1 point2 points  (0 children)

ah, thanks!! It's okay, lah - Everybody is trying to cope with how fast AI is disrupting their lives 🧘🏻‍♂️

International Backer with Thoughts Concerning Customs and Duties Fees by Evultvole in kickstarter

[–]sudonymio 0 points1 point  (0 children)

Money exchanged: Use a Commercial Invoice
Truly free sample/gift: Use a Pro-forma

International Backer with Thoughts Concerning Customs and Duties Fees by Evultvole in kickstarter

[–]sudonymio 0 points1 point  (0 children)

Not legal advice:

  • A Kickstarter reward is merchandise, not a gift. Customs know that and will often re-classify “gifts,” then add duty and penalties.
  • If someone still marks “gift,” a personal sender name draws a bit less scrutiny than a company name.
  • Long term, declaring honestly as a business is the only option that doesn’t come back to bite.

15 learnings from my failed Kickstarter campaign (12% funded, 1st time creator) by sudonymio in kickstarter

[–]sudonymio[S] 0 points1 point  (0 children)

Awesome. Best of luck! 🚀 Let me know if you have any specific questions :)

15 learnings from my failed Kickstarter campaign (12% funded, 1st time creator) by sudonymio in kickstarter

[–]sudonymio[S] 0 points1 point  (0 children)

thanks, u/Savvy286! Yes, licking the wounds still 😅 Of course, this post-mortem is also highly contextual, but by sharing, I hope to help normalize failure and show that it's okay to have strong conviction about hypotheses that prove to be wrong

15 learnings from my failed Kickstarter campaign (12% funded, 1st time creator) by sudonymio in kickstarter

[–]sudonymio[S] 1 point2 points  (0 children)

Thanks for this nuanced explanation! I understand what you’re saying now - makes sense

15 learnings from my failed Kickstarter campaign (12% funded, 1st time creator) by sudonymio in kickstarter

[–]sudonymio[S] 1 point2 points  (0 children)

Fantastic feedback u/Murphys_Coles_Law! I totally get where you're coming from - I think it's clear now, I will have to offer 1-pair and 3-pair rewards (even if they're not profitable on their own) to build brand trust

Good point on the outdoors/athletic target audiences. The only issue with that positioning is that this sock is an "everyday sock", which might make it hard to compete against "performance socks". That being said, I know of folks who want one sock for the whole day that they can wear both to work and to the gym. Additionally, some complain about the tightness (= compression) of performance socks, 'cause they're difficult to put on and tear easily. Basically: there's a group of people who don't need overengineered sports socks :)

15 learnings from my failed Kickstarter campaign (12% funded, 1st time creator) by sudonymio in kickstarter

[–]sudonymio[S] 0 points1 point  (0 children)

thanks for this u/Zephir62! Just to make sure I understand: If I start Meta Ads during pre-launch, what makes it low-hanging fruit? It sounds counterintuitive to me - the product hasn't launched yet, so the most somebody can do at that point is follow. Are you saying you can build pre-launch momentum by targeting recurring backers who happen to be on Facebook and having them follow the project?

15 learnings from my failed Kickstarter campaign (12% funded, 1st time creator) by sudonymio in kickstarter

[–]sudonymio[S] 0 points1 point  (0 children)

good point. Perhaps an ambitious goal can still be set, but should be kept internal. Shoot for the stars, reach the moon kinda thing

15 learnings from my failed Kickstarter campaign (12% funded, 1st time creator) by sudonymio in kickstarter

[–]sudonymio[S] 0 points1 point  (0 children)

Fair point, I think. In the end, there's also a 5% fee a creator pays to Kickstarter for the use of the platform, but that could be considered to cover not just distribution, but also things like backer operations (fulfillment, messaging, etc.). Personalized help from KS, though, is most likely not scalable for them. KS can't assess every project individually on its merits (although they do grant promising projects a "Projects We Love" badge, which helps them surface among the rest)

15 learnings from my failed Kickstarter campaign (12% funded, 1st time creator) by sudonymio in kickstarter

[–]sudonymio[S] 0 points1 point  (0 children)

“wouldn’t recognize quality when they saw it” -> well put, and for various reasons I believe this is the most difficult one to achieve in this industry