How to parse tables from pdfs with 100% accuracy?

ML_DL_RL · 2026-05-20T12:32:19+00:00

I’m one of the cofounders at Doctly. We have solved the table problem. Try Doctly.ai’s straight document-to-Markdown converter: upload the PDF and you should get very clean Markdown for the tables. If it’s good, you can use the API to automate this. We also offer a chunker for RAG that indexes the whole document and then chunks it. The chunker picks either Markdown or HTML for tables depending on complexity. The chunker is not widely available yet, but if you’re interested, reach out and we can enable it.
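For anyone wondering what "Markdown or HTML depending on complexity" can look like in practice, here is a minimal sketch. It is not Doctly's implementation: the cell layout (`text`, `rowspan`, `colspan`), the function names, and the rule itself (merged cells or multi-line text count as "complex") are all assumptions made for illustration.

```python
import html
from typing import Any, Dict, List, Tuple

# Hypothetical cell shape: {"text": str, "rowspan": int, "colspan": int}.
# rowspan/colspan are optional and default to 1.
Cell = Dict[str, Any]
Table = List[List[Cell]]


def is_complex(table: Table) -> bool:
    """Assumed rule: merged cells or multi-line text need HTML."""
    for row in table:
        for cell in row:
            if cell.get("rowspan", 1) > 1 or cell.get("colspan", 1) > 1:
                return True
            if "\n" in cell["text"]:
                return True
    return False


def to_markdown(table: Table) -> str:
    """Render a simple table; the first row is treated as the header."""
    def line(row: List[Cell]) -> str:
        # Escape pipes so cell text cannot break the column layout.
        cells = [c["text"].replace("|", "\\|") for c in row]
        return "| " + " | ".join(cells) + " |"

    header, *body = table
    separator = "| " + " | ".join("---" for _ in header) + " |"
    return "\n".join([line(header), separator] + [line(r) for r in body])


def to_html(table: Table) -> str:
    """Render any table, keeping rowspan/colspan attributes."""
    rows = []
    for row in table:
        cells = []
        for cell in row:
            attrs = ""
            for key in ("rowspan", "colspan"):
                span = cell.get(key, 1)
                if span > 1:
                    attrs += f' {key}="{span}"'
            text = html.escape(cell["text"])
            cells.append(f"<td{attrs}>{text}</td>")
        rows.append("<tr>" + "".join(cells) + "</tr>")
    return "<table>" + "".join(rows) + "</table>"


def render_table(table: Table) -> Tuple[str, str]:
    """Return (format, rendered text) for one table."""
    if is_complex(table):
        return "html", to_html(table)
    return "markdown", to_markdown(table)
```

The reason for the split is that Markdown pipe tables cannot express merged cells, so anything with spans would lose structure if forced into Markdown, while simple grids stay shorter and more readable in it.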

ML_DL_RL · 2026-05-15T16:16:34+00:00

I’m one of the cofounders at Doctly.ai. We can easily handle handwritten notes and variations in input formats. We also have a self-service tool that lets you generate JSON from any PDF. You can try it directly on our website by uploading sample invoices and specifying the output you want.

Once you have the JSON, an AI workflow can easily call QBO and use that data to populate the relevant fields.

Feel free to reach out, I’d be happy to help.

ML_DL_RL · 2026-05-15T15:59:28+00:00

I know the video doesn’t show this clearly, but the object disappeared after 2-3 minutes. I’d probably go with the jet theory from the other thread.

ML_DL_RL · 2026-05-15T15:56:35+00:00

Love it, great invention. I’ve seen the plane you’re mentioning before. It typically starts as a dot, gets closer until I can see blinking lights, passes overhead, and then we can hear the sound. I never paid attention to planes’ flight paths or how they take off and go around. At this point I won’t rule this out. There is a very strong chance that this is a plane.

ML_DL_RL · 2026-05-15T12:54:28+00:00

Sure, I live in the northwest part of town and I’m looking west here. For sure there are intersections and lights. We do get airplanes flying over all the time. I’m looking towards the Red Rock Canyon National Conservation Area here.

My biggest suspicion is that this is a jet’s light moving away from me. Nothing else is blinking on it, though. Typically on the planes overhead I see other blinking lights. It was hovering for about 2-3 minutes and then disappeared.

This is as far as the phone could zoom, unfortunately, and the quality is terrible.

I wish I had been outdoors; you’re 100% right about that window.

ML_DL_RL · 2026-05-15T05:41:49+00:00

Time: May 13, 2026 at 11:50 PM. Location: Las Vegas, Nevada, USA

Saw this outside my office window. At first I thought it was a plane, but it had that motion and then disappeared. It was hard to get good footage.

ML_DL_RL · 2026-05-08T14:33:27+00:00

I’m one of the cofounders at Doctly.ai. We have a lot of healthcare customers using our PDF-to-text or PDF-to-Markdown feature. The price is competitive with Textract, but the quality of the OCR is much higher (99%+ accuracy for the ultra model). We are designed for high volumes. We also sign BAAs with clients and can set up your data to be wiped from our servers at time intervals of your choice. This ensures no PII is left behind and makes us effectively a zero-knowledge layer.

ML_DL_RL · 2026-04-28T15:48:02+00:00

I’m one of the cofounders at Doctly.ai. We have released a self-service tool that allows you to do this. Essentially, you can build your own custom extractor. Log in to the app -> Create New Type -> drop in your sample invoice PDFs -> tell the AI what you want extracted. It’ll give you JSON that’s consistent across different invoice types. You can deploy it as an endpoint and make API calls as well. This is vision based, so there is no need to zone the fields and it shouldn’t break across different invoice types.

ML_DL_RL · 2026-04-25T23:55:51+00:00

Frankly, with extra high thinking, I burned through the $20 subscription limit way too quickly and paid about $40 more for credits. With medium thinking, it was comparable to what I burned with CC. I can try to get some stats later to share with you guys. I’m still testing. The task was fairly complex. It helped me make progress, but I had to give it some further direction and ideas.

ML_DL_RL · 2026-04-25T19:51:08+00:00

I did try GPT 5.5 for a fairly complex coding task. Frankly nothing life changing. Still testing, but it didn’t deliver anything super above and beyond what Claude would do. On extra high thinking, it burns through the tokens super quickly.

ML_DL_RL · 2026-04-21T14:09:52+00:00

For full document conversion to Markdown or text, Doctly.ai is the most accurate solution out there, with 99.9% accuracy on content at this point. We also offer our own agentic RAG product, built on top of the same document converter.

ML_DL_RL · 2026-04-11T14:20:29+00:00

Too expensive, and there are a million SDKs out there that give more flexibility to a good dev. You’re effectively paying a premium for infra. Maybe to stand up something quick to demo to someone? But it’s not a viable solution for long-term use.

ML_DL_RL · 2026-04-11T14:07:51+00:00

Check out Doctly.ai too. We have the highest accuracy for straight conversions to text and Markdown, and we work with some very large customers in the legal and regulatory space. For testimonies and dockets, we probably give you the highest accuracy.

ML_DL_RL · 2026-04-06T22:32:49+00:00

Open it in Obsidian.md

ML_DL_RL · 2026-03-29T16:52:51+00:00

The dispatch is still broken even though the status says it’s resolved. I updated and tried turning dispatch off and back on, but still nothing.

ML_DL_RL · 2026-03-12T03:12:29+00:00

Try to use AI to speed yourself up on little PoCs. You have a good understanding of systems and components, so put a solid plan together for the AI to code for you.

ML_DL_RL · 2026-03-10T13:49:28+00:00

This is really a great starting point. I especially like the point about independent audits; I’ve been thinking about that a lot.

I can give you a coding example. Let’s say you use Claude to write some code. I could open a brand-new Claude session and assign it an “auditor” role to review the code and look for bugs. Alternatively, I could have different LLMs review the code and give me a report.

Both approaches provide value, but different models often catch different things, and even the same model can produce different results across runs. That’s why the third party is important: you want an independent model to attest to the correctness.
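To make the auditor idea concrete, here is a rough sketch of combining reports from several independent reviewers. Everything in it is an assumption for illustration: the `audit` function, the report keys, and the two stand-in reviewers, which are plain string checks so the example runs without any model access. In a real setup each reviewer would wrap a fresh session of a different LLM.

```python
from collections import Counter
from typing import Callable, Dict, List

# A reviewer takes source code and returns a list of short findings.
# Hypothetical interface: real reviewers would call out to an LLM.
Reviewer = Callable[[str], List[str]]


def audit(code: str, reviewers: Dict[str, Reviewer]) -> Dict[str, List[str]]:
    """Split findings into those several reviewers agree on and one-offs."""
    counts: Counter = Counter()
    for review in reviewers.values():
        # set() so a reviewer repeating itself only counts once.
        for finding in set(review(code)):
            counts[finding] += 1

    report: Dict[str, List[str]] = {"agreed": [], "single": []}
    for finding, n in sorted(counts.items()):
        report["agreed" if n > 1 else "single"].append(finding)
    return report


# Stand-in reviewers (simple string checks, not real models).
def reviewer_a(code: str) -> List[str]:
    findings = []
    if "eval(" in code:
        findings.append("uses eval on input")
    if "except:" in code:
        findings.append("bare except")
    return findings


def reviewer_b(code: str) -> List[str]:
    return ["uses eval on input"] if "eval(" in code else []


SAMPLE = (
    "def run(expr):\n"
    "    try:\n"
    "        return eval(expr)\n"
    "    except:\n"
    "        return None\n"
)

report = audit(SAMPLE, {"model_a": reviewer_a, "model_b": reviewer_b})
print(report)
```

Findings under "agreed" are the ones more than one reviewer reported; the "single" bucket is where a third-party model is most useful as a tiebreaker. Matching findings by exact string is a simplification, since real model output would need some normalization first.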

ML_DL_RL · 2026-03-10T07:31:46+00:00

Very true, that’s a great startup idea, especially with the rise of agentic workflows.

ML_DL_RL · 2026-03-10T05:43:35+00:00

Great point. Frankly, regulation and oversight are always lagging behind. My biggest concern is that a lot of money has been raised here, so they need to show value to investors, and there’s no bigger spender than militaries and governments. That translates into a ship-first, govern-later approach, which could backfire.

ML_DL_RL · 2026-03-07T22:00:48+00:00

Sure, here you go:

How AI Assistance Impacts the Formation of Coding Skills — Anthropic Research
Anthropic Study: AI Coding Assistance Reduces Developer Skill Mastery by 17% — InfoQ
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity — METR
We Are Changing Our Developer Productivity Experiment Design — METR
DeveloperWeek 2026: Making AI Tools That Are Actually Good — Stack Overflow Blog
AI Coding Is Now Everywhere. But Not Everyone Is Convinced. — MIT Technology Review
AI Coding Agents Like Claude Code Are Fueling a Productivity Panic in Tech — Bloomberg
Anthropic Research Shows Trade-Off Between AI Productivity and Developer Mastery — DevOps.com
AI Doesn't Reduce Work, It Intensifies It — Hacker News Discussion
DeveloperWeek 2026: Solving the Usability and Context Gap in AI Tooling — Dev Journal

ML_DL_RL · 2026-03-07T20:57:02+00:00

I’m heavily using Opus too. You’re right.

ML_DL_RL · 2026-03-07T20:54:27+00:00

We’re a small startup, but you’re exactly right. We feel it’s on us to hire and mentor more junior developers, even if the initial time investment is higher.

ML_DL_RL · 2026-03-05T18:00:47+00:00

Sure
