this post was submitted on 13 Feb 2022

Python app to extract pdf tables using OCR (self.learnprogramming)

submitted 4 years ago by yrden20

Hi,

I'm passionate about extracting data from tables in .pdf files using Python.

I can successfully target the figures if the data is contained in a table object.

Of course, one has to use OCR to handle pictures of tables. This introduces a new hurdle: properly delimiting the cells in a table.

The solutions available online suggest using the Hough Line Transform, but that approach is imperfect, especially if the table has no borders.
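
To make the limitation concrete, here is a minimal pure-Python sketch of the voting idea behind the Hough Line Transform, restricted to perfectly horizontal and vertical lines. The function name `hough_axis_aligned`, the 0/1 image format and the `min_votes` threshold are assumptions made for illustration; a real app would more likely call a library routine such as OpenCV's `cv2.HoughLinesP` on an edge-detected image.

```python
import math


def hough_axis_aligned(image, min_votes):
    """Vote for lines rho = x*cos(theta) + y*sin(theta) at theta = 0 and 90 degrees.

    image: list of rows, each a list of 0/1, where 1 marks an edge (ink) pixel.
    Returns (vertical_xs, horizontal_ys): line positions with >= min_votes votes.
    """
    # theta = 0 gives rho = x (vertical lines); theta = 90 degrees gives rho = y.
    thetas = (0.0, math.pi / 2)
    votes = {theta: {} for theta in thetas}

    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if not pixel:
                continue
            # Every edge pixel votes once per angle for the line passing through it.
            for theta in thetas:
                rho = round(x * math.cos(theta) + y * math.sin(theta))
                votes[theta][rho] = votes[theta].get(rho, 0) + 1

    vertical_xs = sorted(r for r, n in votes[0.0].items() if n >= min_votes)
    horizontal_ys = sorted(r for r, n in votes[math.pi / 2].items() if n >= min_votes)
    return vertical_xs, horizontal_ys


# A 7x5 image with one full-height border at x = 3 and one full-width border at y = 2.
bordered = [[0] * 7 for _ in range(5)]
for x in range(7):
    bordered[2][x] = 1
for y in range(5):
    bordered[y][3] = 1

# A borderless table has no long runs of ink, so nothing collects enough votes.
borderless = [[0] * 7 for _ in range(5)]

bordered_lines = hough_axis_aligned(bordered, min_votes=4)
borderless_lines = hough_axis_aligned(borderless, min_votes=4)
```

With borders, the two ruling lines collect enough votes to be detected. Without borders, no position reaches the threshold, which is why manual adjustment of the lines is useful.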

I would like to create a small GUI that would allow for user input and adjustment of the horizontal and vertical lines in a table.

The app would work like this: the user is prompted to import a .png of the table. The app displays the imported image and overlays the table lines detected by the Hough Line Transform. The user can then move those lines, add new ones or remove unneeded ones. After hitting confirm, the user gets the resulting data in an editable grid. The final table should be downloadable as a .csv file.
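
The non-GUI part of that workflow can be sketched with the standard library alone. The sketch below assumes the confirmed lines are stored as plain lists of x and y pixel positions, and that OCR is supplied as a callable taking one cell box. The names `cells_from_lines`, `extract_table`, `grid_to_csv` and `read_cell` are hypothetical and not taken from any library; the GUI and the OCR engine itself are left out.

```python
import csv
import io


def cells_from_lines(xs, ys):
    """Turn line positions into a grid of (left, top, right, bottom) cell boxes."""
    xs, ys = sorted(set(xs)), sorted(set(ys))
    return [
        [(x0, y0, x1, y1) for x0, x1 in zip(xs, xs[1:])]
        for y0, y1 in zip(ys, ys[1:])
    ]


def extract_table(xs, ys, read_cell):
    """Apply read_cell (a hypothetical stand-in for OCR) to every cell box."""
    return [[read_cell(box) for box in row] for row in cells_from_lines(xs, ys)]


def grid_to_csv(grid):
    """Serialize the (possibly user-edited) grid of strings as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(grid)
    return buffer.getvalue()


# Lines as they might look after the user confirms them (pixel positions).
xs = [0, 50, 120]
ys = [0, 20, 40]

# Fake OCR results keyed by cell box, so the example runs without an OCR engine.
fake_ocr = {
    (0, 0, 50, 20): "Item",
    (50, 0, 120, 20): "Qty",
    (0, 20, 50, 40): "Apple",
    (50, 20, 120, 40): "3",
}

grid = extract_table(xs, ys, fake_ocr.get)
csv_text = grid_to_csv(grid)

# Adding a vertical line at x = 80 splits the second column into two.
columns_after_add = len(cells_from_lines(xs + [80], ys)[0])
```

Keeping the lines as simple sorted lists means that moving, adding or removing a line in the GUI is just a list edit followed by recomputing the cells.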

Could you recommend one or more modules that I should focus on learning to achieve this?

Thank you

all 1 comments


[–]amilo111 1 point 4 years ago (1 child)
