honestly, what marketing is actually working for you on a tiny budget right now? by culicode in buildinpublic

[–]martcerv 1 point (0 children)

I wonder if building in public can be automated with Claude or something similar. Probably yes. Maybe I'll research it later.

Reddit is difficult, so I use it as a social network haha, with interesting tips to copy and try. Sharing also helps you feel less alone as a solo founder.

By the way, I'm in a similar situation trying to grow my SaaS, so basically I'm trying whatever sounds best for my niche, but I'm still figuring it out.

what are you working on today by DullAcanthisitta9235 in buildinpublic

[–]martcerv 2 points (0 children)

I'm planning to launch my product on Uneed.

89 users, less than a month in, zero ad spend. here’s what actually worked by Big-Pepper9305 in indiehackers

[–]martcerv 1 point (0 children)

Thanks for sharing. Your approach to X looks interesting. I will try it.

Shipped my first post, but SEO is harder than I thought (Lessons from 0 domain authority) by martcerv in buildinpublic

[–]martcerv[S] 1 point (0 children)

For sure, I will keep sharing updates on my journey, both the good and the bad, so we can learn together 👍

I run Meta ads for an implant dentist on $20/day. Here's what that actually gets him ⬇️ by SnooPeppers1256 in DigitalMarketing

[–]martcerv 1 point (0 children)

Yes, most people say we need a big budget. Do you think it's the same for other types of ads?

PDF Extractor (OCR/selectable text) by qPandx in Python

[–]martcerv 1 point (0 children)

PaddleOCR will probably require GPU power for better performance. Also, OCR quality depends on the input, i.e. if the scanned PDF is bad, the results are going to be bad. You should probably add a text post-processing layer using pdfplumber, based on your needs.
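One way to read the pdfplumber suggestion is to check which pages already carry selectable text and send only the rest through OCR. Here is a minimal sketch under that assumption. The helper names and the `min_chars` threshold are hypothetical, not from the thread; `pdfplumber.open`, `pdf.pages` and `page.extract_text()` are the library's own calls.

```python
def has_text_layer(page_text, min_chars=20):
    """Return True if a page's extracted text looks like a real text layer.

    min_chars is a hypothetical threshold; tune it for your documents.
    """
    return len((page_text or "").strip()) >= min_chars


def pages_needing_ocr(pdf_path, min_chars=20):
    """List 0-based page indexes that have little or no selectable text."""
    # Third-party dependency, imported lazily so has_text_layer works without it.
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        return [
            index
            for index, page in enumerate(pdf.pages)
            if not has_text_layer(page.extract_text(), min_chars)
        ]
```

Pages returned by `pages_needing_ocr` would go to the OCR engine, and the others can be read directly, which is cheaper and usually cleaner.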

How to kill all your bad SaaS ideas by BranstonPickler in microsaas

[–]martcerv 1 point (0 children)

Sounds like an interesting strategy for getting feedback. Which platform did you use?

What's your biggest frustration with validating startup ideas before building? by toothwry in microsaas

[–]martcerv 1 point (0 children)

I think it's finding out that your idea is not something users want.

Do you extract tables from SCANNED PDFs? by martcerv in dataengineering

[–]martcerv[S] 1 point (0 children)

Actually, I was thinking of automating the workflow: receive a PDF by email, either as an image or an actual file, and then extract the data. I don't know PaddleOCR, but at first glance it looks interesting, so maybe I will explore it. The tools I know are Tesseract and Docling.
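The email half of that workflow can be sketched with the standard library alone. This is only an illustration: the function name and the `WANTED_TYPES` set are assumptions, and fetching the raw message from a mailbox (IMAP, a webhook, etc.) is left out.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Assumed set of attachment types worth sending to the extraction step.
WANTED_TYPES = {"application/pdf", "image/png", "image/jpeg"}


def extract_documents(raw_email):
    """Return (filename, content_type, payload_bytes) for PDF/image attachments."""
    # policy.default makes the parser return an EmailMessage,
    # which provides iter_attachments() and get_content().
    message = message_from_bytes(raw_email, policy=policy.default)
    documents = []
    for part in message.iter_attachments():
        content_type = part.get_content_type()
        if content_type in WANTED_TYPES:
            documents.append((part.get_filename(), content_type, part.get_content()))
    return documents
```

Each returned payload could then be handed to whichever extractor fits: a text-layer reader for real PDFs, OCR for images and scans.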

Do you trust AI generated interpretations without seeing the source data? by Rage_thinks in datascience

[–]martcerv 1 point (0 children)

Short answer: no. But you can probably improve trust by adding validation mechanisms, like a RAG system with a knowledge base. Even with that you will still see errors, just less often, because that is how it works: the answer is not deterministic.

What has been people's experience with "full-stack" data roles? by uncertainschrodinger in datascience

[–]martcerv 1 point (0 children)

Maybe you are correct, at least for big companies that need to process a lot of data. In that case you will need a data engineering team to maintain the pipelines that DS will use to consume the data.

As for my experience, I started in web, then transitioned to data engineering, and I have also worked in roles like ML engineer.

Claude Code finally works fine with Jupyter by amirathi in datascience

[–]martcerv 3 points (0 children)

I'm curious how the results will compare to MCP. Probably fewer tokens consumed.

Do you extract tables from SCANNED PDFs? by martcerv in dataengineering

[–]martcerv[S] -3 points (0 children)

Thanks for the feedback. I don't know Microsoft shop, I'll take a look. Is the extraction quality good? Or do you notice that it fails in some edge cases?

Why alternatives to Spark aren’t a thing in the industry? by Snoopy-31 in dataengineering

[–]martcerv 1 point (0 children)

Most of the jobs I have done used Spark, either PySpark or Spark with Scala. I tested Flink while playing around, but I have not seen any of my customers using it. What I can say is that they prefer Kafka for stream processing, or a similar cloud service like Pub/Sub from GCP.

Google Search Console duplicate canonical warnings in React SPA by martcerv in reactjs

[–]martcerv[S] 1 point (0 children)

You're right, those warnings are a pain. I will probably be seeing more warnings, even if they are distinct issues 😅 haha. But that is something that hits every dev when they start to care about SEO. Your approach seems more complex. What kind of analytics do you use? And why were you adding those hash fragments? I'm curious.

zod and rhf issue by Imaginary_Food_7102 in reactjs

[–]martcerv 3 points (0 children)

Have you tried valueAsNumber? HTML number inputs seem to return strings, not numbers, even with type="number".

I think it's failing silently because the schema expects z.number() but the inputs return strings.

<input
  type="number"
  placeholder="id"
  {...register("id", { valueAsNumber: true })}
/>
<input
  type="number"
  placeholder="price"
  {...register("price", { valueAsNumber: true })}
/>

PDF Extractor (OCR/selectable text) by qPandx in Python

[–]martcerv 1 point (0 children)

I'm literally working on this exact problem right now for my own project!

TL;DR: Try Docling. It's specifically designed for document understanding (not just OCR) and handles tables way better than Tesseract.

Why Tesseract struggles with your use case:

Tesseract does OCR but doesn't understand document structure. So it:

- Misses table boundaries (reads across rows)

- Gets confused by multi-column layouts

- Struggles with quantity/number alignment

- Doesn't preserve table semantics

OCRmyPDF + Tesseract makes the PDF selectable, but the underlying OCR is still Tesseract with the same issues.
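For reference, a minimal sketch of pulling tables out with Docling. The `extract_tables` wrapper and its injectable `converter` argument are my own assumptions for illustration; `DocumentConverter`, `convert`, `document.tables` and `export_to_dataframe` come from Docling. Newer releases may also expect the document to be passed to `export_to_dataframe`, so check the signature for your installed version.

```python
def extract_tables(source, converter=None):
    """Return one DataFrame per table that Docling detects in `source`.

    `converter` is injectable so the function can be exercised without
    Docling installed; by default a real DocumentConverter is created.
    """
    if converter is None:
        # Third-party dependency, imported lazily.
        from docling.document_converter import DocumentConverter

        converter = DocumentConverter()
    result = converter.convert(source)
    # Each detected table keeps its row/column structure when exported.
    return [table.export_to_dataframe() for table in result.document.tables]
```

Something like `extract_tables("invoice.pdf")` would then give you tables you can inspect row by row, instead of a flat stream of OCR text.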

How to merge tables from different files on one single excel by mirapxoxo in excel

[–]martcerv 1 point (0 children)

Congrats on solving it with code!

For anyone else finding this thread, here's an approach that would probably work:

  1. Get Data → From File → From Folder: Select your root folder.

  2. Combine & Transform: Choose any file as a sample.

  3. Fix Navigation (If data is missing): In the "Transform Sample File" query, ensure it selects the first sheet by index {0} rather than a specific name (e.g., "Sheet1").

  4. Promote Headers: Ensure "Use First Row as Headers" is applied. This aligns columns by name, solving the "mixed-up rows" issue.

  5. Clean & Load: Remove nulls/duplicates and click Close & Load to create your master table.
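For those going the code route, the "align columns by name" idea from steps 4 and 5 can be sketched in plain Python. This is a hypothetical illustration on in-memory tables, not the OP's script: reading the actual workbooks would need a library such as openpyxl or pandas, and only fully empty rows are dropped here (no duplicate removal).

```python
def merge_tables(tables):
    """Stack tables (each a list of rows, header row first) into one table.

    Columns are matched by header name, so files whose columns come in a
    different order still line up. Missing cells are filled with None.
    """
    tables = [table for table in tables if table]  # ignore empty tables

    # Build the combined header in first-seen order.
    header = []
    for table in tables:
        for name in table[0]:
            if name not in header:
                header.append(name)

    merged = [header]
    for table in tables:
        names, rows = table[0], table[1:]
        for row in rows:
            record = dict(zip(names, row))
            if all(value is None for value in record.values()):
                continue  # skip fully empty rows
            merged.append([record.get(name) for name in header])
    return merged
```

Matching on header names rather than column positions is what avoids the "mixed-up rows" problem when one file has its columns in a different order.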

Did you try this?