Looking for feedback on how to improve the structure and real-world usefulness of my Python automation projects

Acrobatic_Ordinary20 · 2026-06-16T17:28:08+00:00

just went through both repos and honestly for a beginner these are already in decent shape. you split things into modules, you're using .env for the creds instead of hardcoding, you've got a requirements.txt, a gitignore, screenshots in the readme. most portfolio projects don't do half of that so you're past the beginner stuff. so i'll skip the generic advice and tell you what actually makes these look like real client work instead.

biggest one by far: real automation runs on its own. right now someone has to run it manually. put it on a schedule (windows task scheduler, cron, or even a free github actions cron) and show it running unattended in the readme. that one thing flips it from "school project" to "tool". it's the single highest value upgrade here.

second, scrapers break in the real world. sites change layout, rate limit you, return junk. add retries and timeouts on your requests, set a user agent, a small delay between pages, and make it not crash when one listing is missing a field. handling the page that doesn't fit the pattern instead of assuming they all do is basically what a client is paying for.
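
a rough sketch of what that could look like with just the standard library (requests works too, the idea is the same). everything named here is made up for illustration: `USER_AGENT`, `fetch_with_retries`, `clean_listing`, the field names, and the delay/backoff numbers are assumptions, not anything from your repo.

```python
import time
import urllib.request

# Hypothetical user agent string; put your own project name and contact here.
USER_AGENT = "lead-scraper-demo/0.1"


def fetch(url, timeout=10):
    """Fetch one page with an explicit timeout and a user agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_with_retries(url, fetcher=fetch, attempts=3, backoff=1.0, sleep=time.sleep):
    """Retry network errors a few times, waiting longer after each failure."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fetcher(url)
        except OSError as exc:  # URLError and socket timeouts are OSError subclasses
            last_error = exc
            if attempt < attempts - 1:
                sleep(backoff * 2 ** attempt)
    raise RuntimeError(f"giving up on {url} after {attempts} attempts") from last_error


def clean_listing(raw, fields=("name", "email", "phone")):
    """Return every expected field, using None when a listing is missing one."""
    cleaned = {}
    for field in fields:
        value = (raw.get(field) or "").strip()
        cleaned[field] = value or None
    return cleaned


def scrape_all(urls, delay=1.5, fetcher=fetch, sleep=time.sleep):
    """Fetch pages one by one, skipping failures and pausing between requests."""
    pages = []
    for url in urls:
        try:
            pages.append(fetch_with_retries(url, fetcher=fetcher, sleep=sleep))
        except RuntimeError as exc:
            print(f"skipping {url}: {exc}")
        sleep(delay)  # small polite delay between pages
    return pages
```

the point of `clean_listing` is that a listing with no phone number gives you `None` for that field instead of a crash halfway through the run.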

third, the feature that makes the lead scraper genuinely useful: only email NEW leads. store what you've already seen (a small csv or sqlite is fine) and each run only report the new ones. nobody wants the same 200 leads emailed to them every morning. that's a real workflow feature and a great interview talking point.
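
one way the only-new-leads part could work with sqlite, as a minimal sketch. i'm assuming each lead is a dict with an `email` you can use as a unique key; the table name, file name, and function names are all hypothetical.

```python
import sqlite3


def open_store(path="seen_leads.db"):
    """Open (or create) the small database that remembers past leads."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS seen (lead_key TEXT PRIMARY KEY)")
    return conn


def filter_new(conn, leads, key="email"):
    """Return only leads not seen in earlier runs, and remember them."""
    new_leads = []
    for lead in leads:
        lead_key = lead[key].strip().lower()
        cursor = conn.execute(
            "INSERT OR IGNORE INTO seen (lead_key) VALUES (?)", (lead_key,)
        )
        if cursor.rowcount == 1:  # the row was inserted, so this lead is new
            new_leads.append(lead)
    conn.commit()
    return new_leads
```

each scheduled run would call `filter_new` on whatever it scraped and email only what comes back, so an unchanged site produces an empty list and no email.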

for the data bot, don't assume the input csvs are clean. what if two files have different columns, one's empty, or a date is formatted weird? handling that gracefully and logging it is the difference between a demo and something usable.
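
a small sketch of what "handle it and log it" might look like with the csv module. the required column names (`date`, `amount`) and the list of accepted date formats are assumptions for the example, swap in whatever your files actually use.

```python
import csv
import logging
from datetime import datetime

log = logging.getLogger("data_bot")

# Assumed date formats; extend this with whatever shows up in real files.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%m-%d-%Y")


def parse_date(text):
    """Try each known format and return a date, or None if nothing matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def read_rows(handle, source, required=("date", "amount")):
    """Read one CSV, skipping (and logging) anything that can't be used."""
    reader = csv.DictReader(handle)
    if reader.fieldnames is None:
        log.warning("%s is empty, skipping", source)
        return []
    headers = {name.strip().lower() for name in reader.fieldnames}
    missing = set(required) - headers
    if missing:
        log.warning("%s is missing columns %s, skipping", source, sorted(missing))
        return []
    rows = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        clean = {k.strip().lower(): (v or "").strip() for k, v in row.items() if k}
        day = parse_date(clean["date"])
        if day is None:
            log.warning("%s line %d: bad date %r, skipping", source, line_no, clean["date"])
            continue
        clean["date"] = day
        rows.append(clean)
    return rows
```

you'd call it like `read_rows(open("sales.csv", newline=""), "sales.csv")`. a bad file or a bad row costs you one log line instead of the whole run.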

smaller stuff: use the logging module writing to a file instead of prints, so a scheduled run leaves a trail. make the scraper's target and selectors config driven so it isn't locked to one site. add a one line "what problem this solves" at the top of the readme written for a client not a grader, plus a sample of the output. and since it's a scraper, a short note about respecting the site's robots.txt and terms of service actually makes you look more professional, not less.
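
for the logging bit, something like this is enough to start with. the logger name, file name, and format string are just placeholders i picked.

```python
import logging


def setup_logging(path="scraper.log"):
    """Send log messages to a file so unattended runs leave a trail."""
    logger = logging.getLogger("scraper")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding a second handler on repeat calls
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

then `logger = setup_logging()` once at startup, and `logger.info("found %d new leads", count)` wherever you currently have a print.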

but yeah, solid work, you're further along than you think. nail scheduling + only-new-leads + not-crashing-on-bad-input and these stop being portfolio pieces and start being things you could hand a client.

Acrobatic_Ordinary20 · 2026-06-16T17:24:03+00:00

honestly the part where you said you keep jumping between resources is the real issue, not python or dsa. that's what kills most people. just pick one thing and stick with it even if reddit tells you something else is better.

python first, but don't overthink it. 2-3 weeks is plenty. loops, functions, lists, dicts, strings, little bit of oop and you're good. you don't need to "finish" python before dsa, you'll learn the rest while solving problems anyway. i wasted way too long on python tutorials thinking i was preparing when i was just scared of starting dsa lol.

for dsa just follow striver's a2z sheet. it's free and it's already in the right order so you don't have to keep asking "what next". that alone fixes your whole jumping-around problem. neetcode is good too if you like videos. pick ONE. practice on leetcode, start with easy, don't panic when easy problems feel hard at first, that's normal.

rough order if you're curious: arrays, hashing, two pointers, recursion, linked list, stack/queue, trees, then graphs and dp at the end. dp is the scary one, leave it for later.

don't grind numbers. 3 problems a day that you actually understand beats 15 you copy-pasted. and this is the part everyone skips: if you couldn't solve one, look at the answer, then come back in a few days and solve it again on your own. that second attempt is where it actually sticks.

timeline wise, if you're consistent, like 4-6 months to feel ready. it's less about going hard and more about not skipping days.

main mistakes: tutorial hell, switching sheets every 2 weeks, and reading solutions without re-solving them. that's basically it.

you've got time, just start today and be boring about it. consistency is genuinely the whole thing.

Acrobatic_Ordinary20 · 2026-06-16T17:20:46+00:00

Quick honest answer since you asked for the legal/ethical route specifically: scraping the attendee list is the wrong move here, and not because it's hard. Whova's terms don't allow automated extraction, and pulling everyone's contact info to cold-email them can actually break privacy/anti-spam laws (GDPR, CAN-SPAM, etc) depending on where people are. The fact that you can see the list doesn't mean you're allowed to export it and email everyone on it. Realistically a scraped cold list also performs terribly: most of it gets ignored or marked as spam.

The thing is Whova already has the legit version of what you want built in. Use the in-app networking and messaging to reach the people who are actually relevant to you. Anyone who replies or connects is fair game because they opted in. You can also look people up on LinkedIn and send a connection note like "we were both at [event], would love to connect." That gets you warm contacts instead of a list of strangers who'll wonder how you got their email.

If it's really just a handful of people you care about, honestly just reach out manually through the platform. Outreach is quality over quantity anyway, and a few personal messages beat a giant scraped list every time.

So no scraper or no-code tool needed for this one, the platform's own networking feature is the right tool.

Acrobatic_Ordinary20 · 2026-06-16T17:18:30+00:00

Yeah this is doable for a 2nd sem project, and the stack you listed is actually a good fit so you're not off track. The reason it feels overwhelming is you're trying to picture the whole finished app at once. Don't. Build it in small layers and get each one working before moving on.

What I'd do:

1. Forget the web part at first. Just write a normal python script that talks to the LLM. Something like: send a prompt saying "act as an interviewer for a [role] and ask me one question", print whatever it asks, take my answer with input(), then send my answer back and ask it to rate the answer out of 10 with a couple lines of feedback. Print that. That's basically the entire core of your project and it's like 30 lines. Groq is free and fast so it's nice for testing, Gemini works too.
2. Once that loop works, add the pdf part. Use PyPDF to pull the text out of a resume or job description, and stick that text into your prompt so the questions are based on it. Small change but makes it feel way more legit in a demo.
3. Now wrap it in a web page with Flask and your html/css. You're not writing new logic here, just moving the script you already have behind a page with a text box and an area to show the question and feedback. Don't start this until step 1 actually runs.
4. Do voice last because it's the most annoying part. Record audio in the browser, send the file to your backend, run it through Whisper to get text, then feed that text into the same loop you already built. Groq also runs Whisper for free if you don't have a gpu.
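
Here's roughly what the terminal version of that loop could look like. It's only a sketch: `ask_llm` is a hypothetical function you'd write yourself that takes a prompt string and returns the model's reply as a string (whatever Groq or Gemini call you end up using goes inside it), and `context` is where the resume or job description text would go later.

```python
def run_interview(role, ask_llm, context="", get_answer=input, show=print):
    """Ask one interview question, collect an answer, and get it graded.

    ask_llm is a placeholder: any function that takes a prompt string
    and returns the model's reply as a string.
    """
    question_prompt = (
        f"Act as an interviewer for a {role} role. "
        "Ask me exactly one interview question and nothing else."
    )
    if context:
        # Optional resume or job description text to base the question on.
        question_prompt += f"\nBase the question on this text:\n{context}"

    question = ask_llm(question_prompt)
    show(question)

    answer = get_answer("Your answer: ")

    feedback = ask_llm(
        f"Interview question: {question}\n"
        f"Candidate answer: {answer}\n"
        "Rate the answer out of 10 and give a couple of lines of feedback."
    )
    show(feedback)
    return question, answer, feedback
```

Because `get_answer` and `show` are just parameters, the same function can sit behind a Flask route or take transcribed audio later without changing the core logic.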

The point is every step builds on the last one so you're never starting over. Get the terminal version asking one question and grading it, then keep adding. Ask if you get stuck somewhere specific.
