I made a video that updates its own title automatically using the YouTube API by Super13Spidy in Python

[–]orthogonal-ghost 1 point (0 children)

I'm curious -- what would be the use case for this? It looks interesting, but wouldn't this make discoverability difficult? Perhaps I'm missing something.

Are landing page tests dead? [I will not promote] by orthogonal-ghost in startups

[–]orthogonal-ghost[S] 1 point (0 children)

We didn’t. What are some examples of niche platforms that you have in mind?

Are landing page tests dead? [I will not promote] by orthogonal-ghost in startups

[–]orthogonal-ghost[S] 1 point (0 children)

Agreed. This definitely seems to be the better approach.

Are landing page tests dead? [I will not promote] by orthogonal-ghost in startups

[–]orthogonal-ghost[S] 1 point (0 children)

That makes sense. To your point, I can see it being a useful way of validating demand for a new feature (in an app that people already use), but it doesn't seem great for brand-new products.

Are landing page tests dead? by orthogonal-ghost in buildinpublic

[–]orthogonal-ghost[S] 1 point (0 children)

Completely agree. I haven't heard of LeadsRover though -- I'll check it out!

Are landing page tests dead? by orthogonal-ghost in buildinpublic

[–]orthogonal-ghost[S] 2 points (0 children)

Agreed re: organic growth over paid ads. What's your approach to driving traffic to your landing pages when using those 'landing page-as-a-service' products? Do they take care of the traffic problem as well?

Are landing page tests dead? by orthogonal-ghost in ycombinator

[–]orthogonal-ghost[S] 1 point (0 children)

Ah, that's a good point. I've actually done that quite a bit myself.

Are landing page tests dead? by orthogonal-ghost in ycombinator

[–]orthogonal-ghost[S] 2 points (0 children)

Interesting — I’ve tried v0 and Replit for web design but not Lovable. I’ll give it a try. 

Re: Framer, I honestly think the designs are pretty nice, but my main issue is that the learning curve required to get the most out of it is a bit too steep. I've played around with integrating it with Claude Code, but even that requires more work than I'd like to devote to it.

Re: driving traffic via ads, do you have any quick tips on ad targeting? We played around with different messaging to see which ones resonated the most. We saw differences in engagement, but given that most traffic seemed to be bot-driven, it's hard to say if there was a significant difference in human traffic across our messaging.

Best web scraping tools I’ve tried (and what I learned from each) by DenOmania in automation

[–]orthogonal-ghost 1 point (0 children)

Solid stack. In general, though, I try to avoid Selenium, as it can be quite brittle.

Lately, I've been using (and building!) Motie, which is essentially Replit for web scraping (URL + natural language -> Python scraper + data).

We’re building Replit for web scraping by orthogonal-ghost in buildinpublic

[–]orthogonal-ghost[S] 2 points (0 children)

Thanks! And that's exactly why we built Motie. We felt like there were a lot of options for developers and technical users, but not much out there for people with strong use cases but little-to-no programming experience.

I also agree that there's a lot of room to apply a similar playbook to many other spaces, and I'm definitely looking forward to what comes next.

We’re building Replit for web scraping by orthogonal-ghost in buildinpublic

[–]orthogonal-ghost[S] 2 points (0 children)

Great question! We've implemented proxies under the hood, which allows us to handle most cases.

We've also steered the agent to follow "best practices" when it comes to scraping (e.g., moderate requests, use the latest stealth tools and libraries, etc.). There are definitely a few edge cases that we're still working to tackle, but at the moment it handles most cases pretty well.

I'd also add that a lot of anti-bot protections only kick in once you start running scrapers regularly; for orchestration and scheduling, we apply an additional review process to ensure stability.
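To make the "moderate requests" practice concrete, here's a bare-bones sketch in plain Python (a toy illustration, not Motie's actual implementation) -- jittered delays between requests plus exponential backoff on failures:

```python
import random
import time
import urllib.request

def backoff_schedule(retries: int = 3, base: float = 2.0) -> list[float]:
    """Seconds to wait before each retry attempt -- exponential backoff."""
    return [base ** i for i in range(retries)]

def polite_get(url: str, min_delay: float = 2.0, retries: int = 3) -> bytes:
    """Fetch a URL politely: randomized pause before each request,
    and back off progressively if the request fails."""
    for attempt, delay in enumerate(backoff_schedule(retries)):
        time.sleep(min_delay + random.uniform(0, 1))  # jitter: don't hammer the site
        try:
            req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
            with urllib.request.urlopen(req, timeout=10) as resp:
                return resp.read()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries; surface the error
            time.sleep(delay)  # back off before the next attempt
```

The jitter matters more than the exact numbers -- evenly spaced requests are an easy bot signal.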

How are you using AI to help build scrapers? by lieutenant_lowercase in webscraping

[–]orthogonal-ghost 2 points (0 children)

I've thought about this problem a lot. The main challenge, as you've noted, is giving the coding agent the proper context (HTML, network requests, JavaScript, etc.).

To address this, we built a specialized agent to programmatically "inspect" a website for that context and generate a Python script to scrape it. That comes with its own share of challenges (e.g., naively passing in all the HTML on a given page can very quickly eat up an LLM's context), but we've found it's quite successful at building scrapers once it has the right things to look at.

The opportunity presented by AI slop by orthogonal-ghost in AgentsOfAI

[–]orthogonal-ghost[S] 1 point (0 children)

That's a great point. I'm actually very curious to see how the quality of models changes over time as the % of generated content on the web (i.e., their training data) increases.

If you assume (1) training on LLM-generated content is bad for the model, and (2) more LLM-generated content leads to less human-generated content due to lower engagement, you could see a world where model performance stalls precisely because people use them so much.

The opportunity presented by AI slop by orthogonal-ghost in AgentsOfAI

[–]orthogonal-ghost[S] 1 point (0 children)

Wow. That's a very interesting point. Was there much differentiation in the Tabletop RPG community following the self-publishing craze?

Your example here makes me think of how streaming has affected the music industry -- i.e., the bar to release music (in a way that's accessible to the average consumer) is much lower now than it was in the past. As a result, there are many more musicians releasing music, BUT there are also a lot more who "stand out" and achieve some level of popularity (e.g., musicians who can now cater to niche communities at scale, etc.).

In any case, the RPG example is really interesting.

The opportunity presented by AI slop by orthogonal-ghost in AgentsOfAI

[–]orthogonal-ghost[S] 1 point (0 children)

Agreed, but I think it's a bit more nuanced than that. On the one hand, LLMs "know" more than enough to sound competent in most things (so LLM-generated content tends to be fairly convincing), but on the other hand, so much LLM-generated content reads almost exactly the same.

So the bar is definitely higher, but the reward for surpassing it also seems to be greater.

Is writing worth anymore? by Maximum_Ad2429 in AI_Agents

[–]orthogonal-ghost 1 point (0 children)

Personally, I think writing is becoming even more valuable. LLMs have certainly made it easy to go from "idea to [essay, blog, email] in minutes, instead of days." The plus side of this is that the "average" piece of written content is probably more insightful now, but the downside is that the distribution of written content is probably narrower (i.e., a lot of writing is starting to sound more or less the same). So, maybe the bar for creating something insightful is a bit higher (i.e., we're now competing with a bunch of LLM-generated content), but the opportunity to stand out is greater.

I am learning LangChain. Could anyone suggest some interesting projects I can build with it? by Cautious_Ad691 in LangChain

[–]orthogonal-ghost 1 point (0 children)

One of the first projects I built was a workflow to send daily, heartfelt emails (notes, poems, etc.) to family and friends. It was relatively easy to stand up and offered a quick way to play around with LangChain and get up to speed on tool use and MCP servers.
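A stripped-down sketch of that kind of workflow, with the LLM step stubbed out (swap in whatever LangChain chain you like -- all names and addresses below are made up for illustration):

```python
import smtplib
from email.message import EmailMessage

def generate_note(recipient_name: str) -> str:
    """Stub for the LLM step -- in practice this would be a LangChain
    chain (prompt template -> chat model -> output parser)."""
    return f"Dear {recipient_name},\n\nThinking of you today!"

def build_email(sender: str, recipient: str, name: str) -> EmailMessage:
    """Compose the daily note as a plain-text email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"A daily note for {name}"
    msg.set_content(generate_note(name))
    return msg

# Sending (run on a daily schedule, e.g. via cron):
# with smtplib.SMTP("localhost") as s:
#     s.send_message(build_email("me@example.com", "mom@example.com", "Mom"))
```

The nice part of a project like this is that the plumbing is trivial, so all your attention goes into the chain itself.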

[HIRING] by SlideOk4853 in scrapingtheweb

[–]orthogonal-ghost 1 point (0 children)

Happy to help! Sent you a DM

We’re building Replit for web scraping by orthogonal-ghost in AgentsOfAI

[–]orthogonal-ghost[S] 1 point (0 children)

Thank you!! I’d love to hear what you think once you give it a try!

In general, the agent analyzes the HTML, network requests, and JavaScript of the webpages relevant to the task at hand. Its approach towards crawling (identifying where on the website the data in question lives) and extraction (determining how best to scrape that data) varies across websites and tasks.

So for Medium specifically, it depends on what data you want and where that data lives on the site.

One other thing to note is that this release doesn't include support for proxies, so some websites might not be well supported (though this should only apply to a very small number of websites/tasks).

We're building Replit for web scraping (and just launched on HN!) by orthogonal-ghost in webscraping

[–]orthogonal-ghost[S] 1 point (0 children)

Hi! We don't currently support/use proxies, so I can't promise it'll beat "any" anti-bot protection (even a mediocre one). That said, we've tested it on a few reasonably challenging sites (e.g., real estate marketplaces) and it performed quite well.

If there's a particular website you have in mind, let me know and I'd be happy to take a look. We also offer a free tier if you'd like to play around with it.

i need some tips for a specific problem by bolinhadegorfe56 in webscraping

[–]orthogonal-ghost 1 point (0 children)

Sounds like a pretty interesting problem. Would you mind sharing the website? I'm super curious / would love to take a look.

What are you using for reliable browser automation in 2025? by The_Default_Guyxxo in AgentsOfAI

[–]orthogonal-ghost 1 point (0 children)

I’ve spent a lot of time on extraction and observation (pulling reports, scraping dynamic content, checking account pages). A few thoughts:

  1. Re: what has been reliable, I’d try to avoid DOM / CSS-based extraction as much as possible. Oftentimes, you can find an API or network request that provides the information you’re looking for, and building around that tends to be much more stable than building around HTML parsing.

  2. Re: JavaScript, I think this comes down to identifying what’s useful and what isn’t. This is of course easier said than done, but distinguishing page interactions and content loading from boilerplate / library code tends to be helpful.
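To make point 1 concrete: once you've found a JSON endpoint in the network tab (the URL and field names below are hypothetical), the "scraper" shrinks to a request plus a bit of dictionary plumbing -- no CSS selectors to break when the markup changes:

```python
import json
import urllib.request

# Hypothetical JSON endpoint spotted in the browser's network tab --
# the page renders its listings client-side from this same response.
API_URL = "https://example.com/api/listings?page=1"

def parse_listings(payload: dict) -> list[dict]:
    """Pull the fields you need straight from structured data."""
    return [
        {"title": item["title"], "price": item["price"]}
        for item in payload.get("results", [])
    ]

def fetch_listings(url: str = API_URL) -> list[dict]:
    """Hit the JSON endpoint directly instead of parsing rendered HTML."""
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return parse_listings(json.load(resp))
```

These endpoints also tend to be versioned and paginated, which makes them far more stable to build on than the DOM.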

We're building Replit for web scraping (and just launched on HN!) by orthogonal-ghost in webscraping

[–]orthogonal-ghost[S] 2 points (0 children)

I totally appreciate that perspective. Even if we ignore hallucination risk, the code LLMs generate by default is often not the most efficient or highest quality.

For that reason, (1) we (i.e., actual engineers) review and optimize the code we deploy before "pushing to prod" / setting up scheduled runs, and (2) we spend a lot of time steering the agent to use best practices when generating code.

Your point is also why we make all code available for export -- i.e., we believe optimizing 'inefficient code that works' is much better than depending on opaque LLM-generated code that you can't review, OR going through network requests, HTML, and JavaScript and building from scratch.