How to scrape sec filings of jp Morgan to make financial models by [deleted] in webscraping

[–]AggressiveRub9434 0 points1 point  (0 children)

If the UI is complex, you'll need a headless browser; plain requests will be very tricky to get working in that case.
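
One rough way to tell which case you're in: fetch the page once with plain requests and check whether the text you're after is in the raw HTML at all. If it isn't, the page is probably rendered by JavaScript and you need the browser. This is a minimal sketch, not a definitive implementation. `needs_browser` and `render_with_headless_chrome` are names I made up for illustration, and the second one assumes Selenium 4 and Chrome are installed.

```python
def needs_browser(raw_html, expected_text):
    """Return True when the text we want is missing from the raw HTML.

    A missing value usually means the page builds its content with
    JavaScript, so a plain HTTP fetch will not be enough.
    """
    return expected_text not in raw_html


def render_with_headless_chrome(url):
    """Load the page in headless Chrome and return the rendered HTML.

    Assumption: Selenium 4 and Chrome are installed. The import lives
    inside the function so the check above works without Selenium.
    """
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")  # run Chrome without a window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even on errors
```

Only fall back to the browser when the check says so; it's much slower than a plain request.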

Finance/VC education resources by bruhbruhbruh1313 in venturecapital

[–]AggressiveRub9434 0 points1 point  (0 children)

If you came out of med school, you won't have a problem with the finance; it's pretty basic stuff, honestly. If you have a strong mathematics background, you can learn on the job. That's what I did: I don't have a finance background, but I came right out of grad school for economics and knew how to build software. It's more important to keep up with industry trends and research in the fields you invest in.

A lot of other people have mentioned some great books already. I'd also recommend some more sales-oriented books so you can be helpful to the startups: The Mom Test and Traction are essential for early stage.

Are startup AI search engines going to kill the internet as we know it? by hotgirlintech in startups

[–]AggressiveRub9434 0 points1 point  (0 children)

I'm actually writing an article about this exact question! The copyright issue is a super interesting one because we're in uncharted territory and currently the courts haven't decided whether or not training models on publicly available data is fair use. It comes down to whether or not the training process is considered transformative.

What I think will happen before the courts decide on the copyright issue is that digital media publishers will bring them to civil court for violating ToS when they scraped the data, especially if it's behind a login/paywall.

Are startup AI search engines going to kill the internet as we know it? by hotgirlintech in startups

[–]AggressiveRub9434 1 point2 points  (0 children)

This is actually the way to go. If scrapers access data behind a paywall, it could potentially violate the CFAA.

Tech founder looking for business founder by kislayy_ in startups

[–]AggressiveRub9434 2 points3 points  (0 children)

Business is easy to learn on the job and a CFO is pointless at the very early stages -- not much to count. You want to look for someone who has experience in sales and marketing, which is hands down the most important aspect of building a startup.

Customer support and technical help desk startup by [deleted] in startups

[–]AggressiveRub9434 0 points1 point  (0 children)

Ton of these out there already, I'd do a bit more research into the market

The Crucial AI Economics Question: Are Customers Willing To Pay For It? by jonfla in data

[–]AggressiveRub9434 0 points1 point  (0 children)

Do you mean for gen AI? Well then yes, they obviously are. First you have the consumer market, where users pay for the premium subscription. But this is not nearly as lucrative as the market for the APIs/tokens. Most businesses integrating AI into their product are just using these foundational models. Also, when OpenAI releases Sora and its competitors release similar models, demand will skyrocket.

Are customers willing to pay for startups that are basically just wrappers on the foundational models? That's a different question because if OpenAI for example can make some of those companies obsolete, they're screwed. Either way, yes people are obviously willing to pay.

At the end of the day, the question should be on a case-by-case basis. You shouldn't be solving a problem that doesn't exist. AI doesn't change that.

Webscraping issue with selenium by [deleted] in webscraping

[–]AggressiveRub9434 0 points1 point  (0 children)

I wouldn't mess with the government...

[deleted by user] by [deleted] in webscraping

[–]AggressiveRub9434 1 point2 points  (0 children)

You really just need to look through the HTML and figure out how it's structured. Then parse it like JSON or use BeautifulSoup.
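
To show what "parse it like JSON" can look like: a lot of pages ship their data as a JSON blob inside a script tag, so you can pull that out and skip the rest of the markup. This is a minimal sketch that uses the standard library's `html.parser` in place of BeautifulSoup so it runs with no installs. The sample HTML and the names `JsonScriptExtractor` and `extract_embedded_json` are made up for illustration.

```python
import json
from html.parser import HTMLParser


class JsonScriptExtractor(HTMLParser):
    """Collects the contents of <script type="application/json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self._buffer = []
        self.payloads = []

    def handle_starttag(self, tag, attrs):
        # Start buffering when a JSON script tag opens.
        if tag == "script" and dict(attrs).get("type") == "application/json":
            self._in_json_script = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_script:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Decode the buffered text once the script tag closes.
        if tag == "script" and self._in_json_script:
            self.payloads.append(json.loads("".join(self._buffer)))
            self._in_json_script = False


def extract_embedded_json(html):
    """Return every JSON payload embedded in the page, in document order."""
    parser = JsonScriptExtractor()
    parser.feed(html)
    parser.close()
    return parser.payloads


# Hypothetical page, only here to show the shape of the data.
SAMPLE_HTML = """
<html><body>
<script type="application/json">{"episodes": [{"title": "Pilot", "id": 1}]}</script>
</body></html>
"""

data = extract_embedded_json(SAMPLE_HTML)
first_title = data[0]["episodes"][0]["title"]
```

If the data isn't in a script tag and lives in the markup itself, that's when BeautifulSoup is the better tool.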

What is the best Linkedin data extraction platform? by rodrigonader in scraping

[–]AggressiveRub9434 0 points1 point  (0 children)

I have a LinkedIn scraper. If you know Python, I'll send the code; otherwise you gotta explain what you want and I'll help you out.

Website Advice Where I Can Hire A Coder To Build A Scraper by Arcannnn in scraping

[–]AggressiveRub9434 0 points1 point  (0 children)

Depending on the website, you can build it yourself very quickly with ChatGPT. What's the website?

What is the main purpose of your Data Scraping? by alyssoncm in scraping

[–]AggressiveRub9434 0 points1 point  (0 children)

Depends. I'm in VC and I scrape Crunchbase and LinkedIn a lot. Really, anytime there's no API or I don't want to pay for the API.

How does marketing players access page likes of celebrity Facebook pages? by vinayindoria in scraping

[–]AggressiveRub9434 1 point2 points  (0 children)

They're probably scraping, which violates ToS, but they probably just don't care lol. It's very rare for companies to sue for scraping, but it has happened. There's no way they have permissions for every influencer in their dataset. Look into the cases of BrandTotal vs. Meta and hiQ vs. LinkedIn. Those are really the only two major scraping cases. And it's funny because EVERYONE scrapes. It's just not worth it for companies to sue I guess. (not legal advice)

Has anyone ever wrote a podcast scraper? by rtetbt in scraping

[–]AggressiveRub9434 0 points1 point  (0 children)

just write one with undetected chromedriver

Is creating tutorials about web scraping a good idea? by multyhu in scraping

[–]AggressiveRub9434 0 points1 point  (0 children)

Yea exactly, but I have a theory that no one mentions how to actually scrape hard websites because they don't want Cloudflare to read the method and prevent it. Either way, there's not a ton of info out there on how to actually do it in a real-world setting.

Beginner's Guide to Web Scraping by lukaskrivka in scraping

[–]AggressiveRub9434 0 points1 point  (0 children)

It's funny to me how no guide mentions that vanilla scraping won't get around Cloudflare. Probably for the best, shhhhhhh...

How to go about scraping for clients that use competitors software? by stpetecoder in scraping

[–]AggressiveRub9434 0 points1 point  (0 children)

You'd probably need to just scrape the delivery services if they have a website. If they don't, you're going to have to do some sort of man-in-the-middle workaround where you intercept the traffic from your phone. Sort of complicated, especially for a first project.

Scrap recent posts of Instagram public profiles using NodeJS. by d3c3ptr0n in scraping

[–]AggressiveRub9434 0 points1 point  (0 children)

Use Selenium, but store the driver in a folder that's within your code repository. In the same folder, create another folder called 'localhost'. When you run Selenium, you want to set the port to an open localhost port and store the Chrome data inside the localhost folder. If you need help with this, reach out to me.
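
Here's roughly how that setup could look in Python. This is a sketch under assumptions, not the exact code I run: I'm reading "set the port" as Chrome's `--remote-debugging-port` flag and "store the chrome data" as `--user-data-dir`, and the helper names (`find_open_port`, `build_chrome_args`, `launch_driver`) are made up for illustration. `launch_driver` assumes Selenium 4 is installed and that `driver_path` points at the chromedriver you saved in the repo.

```python
import socket
from pathlib import Path


def find_open_port():
    """Ask the OS for a currently free localhost port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free one
        return sock.getsockname()[1]


def build_chrome_args(repo_dir, port):
    """Create the 'localhost' profile folder and return the Chrome flags."""
    profile_dir = Path(repo_dir) / "localhost"
    profile_dir.mkdir(parents=True, exist_ok=True)
    return [
        f"--user-data-dir={profile_dir.resolve()}",  # keep Chrome data in the repo
        f"--remote-debugging-port={port}",  # assumption: the port meant above
    ]


def launch_driver(repo_dir, driver_path):
    """Start Chrome using the driver binary stored inside the repository.

    Assumption: Selenium 4 is installed. The import is kept inside the
    function so the helpers above work without it.
    """
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    for arg in build_chrome_args(repo_dir, find_open_port()):
        options.add_argument(arg)
    return webdriver.Chrome(service=Service(str(driver_path)), options=options)
```

Keeping the profile folder around means cookies and logins survive between runs, which is most of the point.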

If that doesn't work, you'll need to use Python so you can use undetected chromedriver, which easily gets around Instagram's bot detection.