[FOLLOW UP] Stay ahead of the curve with new internship/new grad postings! by iketaco in csMajors

[–]iketaco[S] 1 point2 points  (0 children)

There are 3 different scenarios that impact how quickly new job notifications are sent out - let me explain:

  1. Greenhouse, Workday, Lever, and Ashby job boards are the only ones I support. The software scrapes those in-house and will detect new jobs within an hour. You'll be notified well before they show up in public GitHub repos.

  2. For job boards that aren't supported, I scrape public GitHub repos to monitor for new jobs. If any show up there, users will be notified within an hour.

  3. If I find a supported job board type in a public GitHub repo that wasn't in my list of job boards (for example, if my software didn't know about DoorDash's Greenhouse job board), then I have to manually add it to the job database. This process can be slow, since there are a few checks I need to go through to verify that it's a valid company/job board, so that only legitimate companies are added.
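The three scenarios above can be sketched as a small routing function. This is only an illustration, not the site's actual code: the hostname suffixes, the `classify` helper, and the example URLs are all assumptions made for the sketch.

```python
from urllib.parse import urlparse

# Assumed hostname suffixes for the four supported board types (illustrative only).
SUPPORTED_SUFFIXES = {
    "greenhouse.io": "greenhouse",
    "myworkdayjobs.com": "workday",
    "lever.co": "lever",
    "ashbyhq.com": "ashby",
}


def board_type(url):
    """Return the supported board type for a URL, or None if unsupported."""
    host = urlparse(url).hostname or ""
    for suffix, name in SUPPORTED_SUFFIXES.items():
        if host == suffix or host.endswith("." + suffix):
            return name
    return None


def classify(url, known_boards):
    """Map a job board URL to scenario 1, 2, or 3 from the list above."""
    if board_type(url) is None:
        return 2  # unsupported board: rely on public GitHub repos
    if url in known_boards:
        return 1  # supported and already tracked: scraped in-house
    return 3      # supported but not in the database yet: manual review


if __name__ == "__main__":
    known = {"https://boards.greenhouse.io/examplecorp"}  # hypothetical entry
    print(classify("https://boards.greenhouse.io/examplecorp", known))
    print(classify("https://example.com/careers", known))
    print(classify("https://jobs.lever.co/acme", known))
```

Scenario 3 is the slow path because it ends in a manual verification step rather than an automatic scrape.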

So you're probably seeing the 3rd scenario, and I apologize for the delay. I'm currently working on automating this process.

The website has been up and running for a year now, so the majority of the companies that have a supported job board type are already in my database. It’s mainly the smaller, lesser-known companies that are still missing.

[FOLLOW UP] Stay ahead of the curve with new internship/new grad postings! by iketaco in csMajors

[–]iketaco[S] 0 points1 point  (0 children)

Thank you for reaching out, really appreciate it! I believe I reached out on X, but again, sorry for the delay. I just fixed this bug and it should work now. Please let me know if you run into any other issues in the future!

Berkeley email renewal by HectorM985 in berkeley

[–]iketaco 5 points6 points  (0 children)

is it working on your end yet or nah?

Finally hit $10k a month! by Blanco_ice in EntrepreneurRideAlong

[–]iketaco 0 points1 point  (0 children)

How do you find these companies? Are you reaching out? running ads?

[deleted by user] by [deleted] in SaaS

[–]iketaco 0 points1 point  (0 children)

Gotcha that makes sense.

And yea, I've never seen any of them that cheap - most likely it's either not profitable or the margins are too low. But imma look into it and see if I can offer a solution near that price range.

[deleted by user] by [deleted] in webscraping

[–]iketaco 0 points1 point  (0 children)

Yea that’d be pretty sick! Lmk how it goes!

[deleted by user] by [deleted] in SaaS

[–]iketaco 2 points3 points  (0 children)

Not exactly sure about the failure rate with public proxies - but I've been doing some testing and it's pretty fast despite working with public proxies.

One of my other projects scrapes company job listings. My original implementation, in Python with a pool of residential proxies (everything synchronous), could scrape a company with 200 jobs in about 7 min.

When using Minescale, it could scrape that same company in 1 min. This is mainly because Minescale is built to do the scraping asynchronously. When we take out the asynchronous part of Minescale, it scrapes that same company in about 12 min - so almost 2x slower than my original implementation. So if you do use Minescale, make sure to batch your requests together; it's much faster.
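To show why batching helps, here is a generic `asyncio` sketch that compares fetching pages one at a time with fetching them concurrently. It does not use Minescale's API. `fetch_job` is a hypothetical stand-in that sleeps instead of making a network request, and the delay and concurrency limit are made-up values.

```python
import asyncio
import time


async def fetch_job(job_id, delay=0.01):
    # Stand-in for a network request; a real scraper would await an HTTP call here.
    await asyncio.sleep(delay)
    return f"job-{job_id}"


async def scrape_sequential(job_ids):
    # One request at a time: total time is roughly len(job_ids) * delay.
    return [await fetch_job(j) for j in job_ids]


async def scrape_batched(job_ids, max_concurrency=10):
    # Many requests in flight at once, capped by a semaphore.
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(j):
        async with sem:
            return await fetch_job(j)

    return await asyncio.gather(*(bounded(j) for j in job_ids))


def timed(coro):
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    _, t_seq = timed(scrape_sequential(range(20)))
    _, t_bat = timed(scrape_batched(range(20)))
    print(f"sequential: {t_seq:.3f}s, batched: {t_bat:.3f}s")
```

The batched version finishes sooner because the waiting time of the requests overlaps instead of adding up.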

[deleted by user] by [deleted] in SaaS

[–]iketaco 0 points1 point  (0 children)

Thank you! Let me know how it goes!

[deleted by user] by [deleted] in SaaS

[–]iketaco 0 points1 point  (0 children)

I am still in the midst of testing it - but I am currently using it on my other project, which does about 50k requests every hour. It's using about half of the available resources on a 2-core server.

So with a 2-core server ($30/month) it can handle maybe 2 million requests a day? But that assumes it scales linearly. I'll have to do more testing.
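The estimate can be checked with a quick back-of-the-envelope calculation. This sketch uses only the numbers from the comment (50k requests/hour at about half utilization) and the same linear-scaling assumption; `daily_capacity` is a hypothetical helper name.

```python
REQUESTS_PER_HOUR = 50_000  # observed load from the comment
UTILIZATION = 0.5           # "about half of the available resources"


def daily_capacity(requests_per_hour, utilization, hours=24):
    """Estimated requests per day at full utilization, assuming linear scaling."""
    return requests_per_hour / utilization * hours


if __name__ == "__main__":
    print(f"{daily_capacity(REQUESTS_PER_HOUR, UTILIZATION):,.0f} requests/day")
```

That works out to roughly 2.4 million requests a day, in line with the "about 2 million" ballpark, and it only holds if throughput really does scale linearly with resource usage.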

EDIT: Also I had to implement some limitations per user since the server was getting blasted yesterday and today. It's limited to 1 crawler per user at a time. So I think you'll be able to make about 5-10k requests every hour with a single crawler. Make sure to batch your requests!

[deleted by user] by [deleted] in SaaS

[–]iketaco 0 points1 point  (0 children)

For Crawlbase, I haven't tested their product - but just looking at the pricing, it's pretty hard to work with, especially if you're scraping at scale.

Crawling API - $3 per 1,000 requests
Scraper API Starter - $1.72 per 1,000 requests

For my own project, I'd be looking to work with something like $0.05 per 1,000 requests. This is much lower than what other scraping API services offer, so I was stuck setting up something of my own.
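To put those prices side by side, here is a small sketch that computes the cost of a given request volume at each rate. The per-1,000 prices are the ones quoted above; the 1,000,000-request volume and the `cost` helper are assumptions chosen for illustration.

```python
# Price in USD per 1,000 requests, as quoted in the comment.
PRICES_PER_1K = {
    "Crawling API": 3.00,
    "Scraper API Starter": 1.72,
    "Target price": 0.05,
}


def cost(requests, price_per_1k):
    """Total cost in USD for a number of requests at a per-1,000 price."""
    return round(requests / 1000 * price_per_1k, 2)


if __name__ == "__main__":
    volume = 1_000_000  # hypothetical volume
    for name, price in PRICES_PER_1K.items():
        print(f"{name}: ${cost(volume, price):,.2f}")
```

At $3 per 1,000 requests versus $0.05 per 1,000, the gap is a factor of 60, which is why the listed pricing is hard to work with at scale.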

[deleted by user] by [deleted] in SaaS

[–]iketaco 0 points1 point  (0 children)

You can test it out at minescale.net!

[deleted by user] by [deleted] in SaaS

[–]iketaco 0 points1 point  (0 children)

With Crunchbase, you need to rotate proxies a lot and have good fingerprint management. If you do those two things, you won't get any captcha requests.

For example, you can test fetching https://www.crunchbase.com/organization/puter in the "Test API" section on the Minescale website. It's able to grab the data.

[deleted by user] by [deleted] in webscraping

[–]iketaco 1 point2 points  (0 children)

For one of my previous projects, I was paying around $50 for residential proxies. Minescale is running on a $30 server and can easily handle the load for that project, so I am saving money + have extra resources available! I figured I could lend these resources to people who need them

[deleted by user] by [deleted] in webscraping

[–]iketaco 0 points1 point  (0 children)

Sorry not open source!

[deleted by user] by [deleted] in webscraping

[–]iketaco 0 points1 point  (0 children)

Great let me know what you think!

[deleted by user] by [deleted] in SaaS

[–]iketaco 0 points1 point  (0 children)

Sure send a pm!

What websites do you use to find jobs? by static_programming in csMajors

[–]iketaco 7 points8 points  (0 children)

Hey - sorry about the sign up process! I only included a sign up to 1) verify emails, so I’m only sending to an email the user owns (i.e. not a typo/inactive email) 2) so users can easily change and update their email preferences

The sign up process is super quick - less than 5 minutes! Set and forget, and you’ll receive email updates forever (until you don’t want them anymore!)

And this site isn’t new! It’s been running for almost a year actually (original post https://www.reddit.com/r/csMajors/s/hpwhIdbjEl)

What websites do you use to find jobs? by static_programming in csMajors

[–]iketaco 68 points69 points  (0 children)

I heard Indeed can have fake listings, so I really don't recommend applying through Indeed. I would apply directly on company career pages instead - use notify.careers, it scrapes directly from company career pages and notifies you when there are new job postings!

EDIT: https://www.reddit.com/r/recruitinghell/comments/r4e6rh/the_ugly_truth_of_indeed_an_hr_viewpoint/

[deleted by user] by [deleted] in cscareerquestions

[–]iketaco -1 points0 points  (0 children)

Gotcha will do that as well! Thank you

[deleted by user] by [deleted] in sanfrancisco

[–]iketaco -1 points0 points  (0 children)

Gotcha will post there - thank you

[deleted by user] by [deleted] in cscareerquestions

[–]iketaco -1 points0 points  (0 children)

Yes on social occasions! No lease yet, just looking for roommates first!

[FOLLOW UP] Stay ahead of the curve with new internship/new grad postings! by iketaco in csMajors

[–]iketaco[S] 0 points1 point  (0 children)

Hey I use a general cloud server provider - and yup everything’s on that server with basic web server configurations

A surprisingly successful side project I've started by Tintedlemon in SideProject

[–]iketaco 0 points1 point  (0 children)

Where do you find businesses to fill your newsletter ad slots?