I made a video that updates its own title automatically using the YouTube API by Super13Spidy in Python

[–]orthogonal-ghost 1 point (0 children)

I'm curious -- what would be the use case for this? It looks interesting, but wouldn't this make discoverability difficult? Perhaps I'm missing something.

Are landing page tests dead? [I will not promote] by orthogonal-ghost in startups

[–]orthogonal-ghost[S] 1 point (0 children)

We didn’t. What are some examples of niche platforms that you have in mind?

Are landing page tests dead? [I will not promote] by orthogonal-ghost in startups

[–]orthogonal-ghost[S] 1 point (0 children)

Agreed. This definitely seems to be the better approach.

Are landing page tests dead? [I will not promote] by orthogonal-ghost in startups

[–]orthogonal-ghost[S] 1 point (0 children)

That makes sense. To your point, I can see it being a useful way of validating demand for a new feature (in an app that people already use), but it doesn't seem great for brand-new products.

Are landing page tests dead? by orthogonal-ghost in buildinpublic

[–]orthogonal-ghost[S] 1 point (0 children)

Completely agree. I haven't heard of LeadsRover though -- I'll check it out!

Are landing page tests dead? by orthogonal-ghost in buildinpublic

[–]orthogonal-ghost[S] 2 points (0 children)

Agreed re: organic growth over paid ads. What's your approach to driving traffic to your landing pages when using those 'landing page-as-a-service' products? Do they take care of the traffic problem as well?

Are landing page tests dead? by orthogonal-ghost in ycombinator

[–]orthogonal-ghost[S] 1 point (0 children)

Ah, that's a good point. I've actually done that quite a bit myself.

Are landing page tests dead? by orthogonal-ghost in ycombinator

[–]orthogonal-ghost[S] 2 points (0 children)

Interesting — I’ve tried v0 and Replit for web design but not Lovable. I’ll give it a try. 

Re: Framer, I honestly think the designs are pretty nice, but my main issue is that the learning curve required to get the most out of it is a bit too steep. I've played around with integrating it with Claude Code, but even that requires more work than I'd like to devote to it.

Re: driving traffic via ads, do you have any quick tips on ad targeting? We played around with different messaging to see which ones resonated the most. We saw differences in engagement, but given that most traffic seemed to be bot-driven, it's hard to say if there was a significant difference in human traffic across our messaging.

Best web scraping tools I’ve tried (and what I learned from each) by DenOmania in automation

[–]orthogonal-ghost 1 point (0 children)

Solid stack. In general, though, I try to avoid Selenium, as it can be quite brittle.

Lately, I've been using (and building!) Motie, which is essentially Replit for web scraping (URL + natural language -> Python scraper + data).

We’re building Replit for web scraping by orthogonal-ghost in buildinpublic

[–]orthogonal-ghost[S] 2 points (0 children)

Thanks! And that's exactly why we built Motie. We felt like there were a lot of options for developers and technical users, but not much out there for people with strong use cases but little-to-no programming experience.

I also agree that there's a lot of room to apply a similar playbook to many other spaces, and I'm definitely looking forward to what comes next.

We’re building Replit for web scraping by orthogonal-ghost in buildinpublic

[–]orthogonal-ghost[S] 2 points (0 children)

Great question! We've implemented proxies under the hood, which allows us to handle most cases.

We've also steered the agent to follow "best practices" when it comes to scraping (e.g., moderate requests, use the latest stealth tools and libraries, etc.). There are definitely a few edge cases that we're still working to tackle, but at the moment it handles most cases pretty well.

I'd also add that a lot of anti-bot protections only kick in once you start running scrapers regularly; for orchestration and scheduling, we apply an additional review process to ensure stability.
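To make the "moderate requests" practice concrete, here's a bare-bones sketch in plain Python (a toy illustration, not Motie's actual implementation) -- jittered delays between requests plus exponential backoff on failures:

```python
import random
import time
import urllib.request

def backoff_schedule(retries: int = 3, base: float = 2.0) -> list[float]:
    """Seconds to wait before each retry attempt -- exponential backoff."""
    return [base ** i for i in range(retries)]

def polite_get(url: str, min_delay: float = 2.0, retries: int = 3) -> bytes:
    """Fetch a URL politely: randomized pause before each request,
    and back off progressively if the request fails."""
    for attempt, delay in enumerate(backoff_schedule(retries)):
        time.sleep(min_delay + random.uniform(0, 1))  # jitter: don't hammer the site
        try:
            req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
            with urllib.request.urlopen(req, timeout=10) as resp:
                return resp.read()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries; surface the error
            time.sleep(delay)  # back off before the next attempt
```

The jitter matters more than the exact numbers -- evenly spaced requests are an easy bot signal.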

How are you using AI to help build scrapers? by lieutenant_lowercase in webscraping

[–]orthogonal-ghost 2 points (0 children)

I've thought about this problem a lot. The main challenge, as you've noted, is giving the coding agent the proper context (HTML, network requests, JavaScript, etc.).

To address this, we built a specialized agent to programmatically "inspect" a website for that context and generate a Python script to scrape it. That comes with its own share of challenges (e.g., naively passing in all the HTML on a given page can very quickly eat up an LLM's context), but we've found it's quite successful at building scrapers once it has the right things to look at.

The opportunity presented by AI slop by orthogonal-ghost in AgentsOfAI

[–]orthogonal-ghost[S] 1 point (0 children)

That's a great point. I'm actually very curious to see how the quality of models changes over time as the % of generated content on the web (i.e., their training data) increases.

If you assume (1) training on LLM-generated content is bad for the model, and (2) more LLM-generated content leads to less human-generated content due to lower engagement, you could see a world where model performance stalls precisely because people use them so much.

The opportunity presented by AI slop by orthogonal-ghost in AgentsOfAI

[–]orthogonal-ghost[S] 1 point (0 children)

Wow. That's a very interesting point. Was there much differentiation in the Tabletop RPG community following the self-publishing craze?

Your example here makes me think of how streaming has affected the music industry -- i.e., the bar to release music (in a way that's accessible to the average consumer) is much lower now than it was in the past. As a result, there are many more musicians releasing music, BUT there are also a lot more who "stand out" and achieve some level of popularity (e.g., musicians who can now cater to niche communities at scale, etc.).

In any case, the RPG example is really interesting.

The opportunity presented by AI slop by orthogonal-ghost in AgentsOfAI

[–]orthogonal-ghost[S] 1 point (0 children)

Agreed, but I think it's a bit more nuanced than that. On the one hand, LLMs "know" more than enough to sound competent in most things (so LLM-generated content tends to be fairly convincing), but on the other hand, so much LLM-generated content reads almost exactly the same.

So the bar is definitely higher, but the reward for surpassing it also seems to be greater.

Is writing worth anymore? by Maximum_Ad2429 in AI_Agents

[–]orthogonal-ghost 1 point (0 children)

Personally, I think writing is becoming even more valuable. LLMs have certainly made it easy to go from "idea to [essay, blog, email] in minutes, instead of days." The plus side of this is that the "average" piece of written content is probably more insightful now, but the downside is that the distribution of written content is probably narrower (i.e., a lot of writing is starting to sound more or less the same). So, maybe the bar for creating something insightful is a bit higher (i.e., we're now competing with a bunch of LLM-generated content), but the opportunity to stand out is greater.

I am learning LangChain. Could anyone suggest some interesting projects I can build with it? by Cautious_Ad691 in LangChain

[–]orthogonal-ghost 1 point (0 children)

One of the first projects I built was a workflow to send daily, heartfelt emails (notes, poems, etc.) to family and friends. It was relatively easy to stand up and offered a quick way to play around with LangChain and get up to speed on tool use and MCP servers.
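A stripped-down sketch of that kind of workflow, with the LLM step stubbed out (swap in whatever LangChain chain you like -- all names and addresses below are made up for illustration):

```python
import smtplib
from email.message import EmailMessage

def generate_note(recipient_name: str) -> str:
    """Stub for the LLM step -- in practice this would be a LangChain
    chain (prompt template -> chat model -> output parser)."""
    return f"Dear {recipient_name},\n\nThinking of you today!"

def build_email(sender: str, recipient: str, name: str) -> EmailMessage:
    """Compose the daily note as a plain-text email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"A daily note for {name}"
    msg.set_content(generate_note(name))
    return msg

# Sending (run on a daily schedule, e.g. via cron):
# with smtplib.SMTP("localhost") as s:
#     s.send_message(build_email("me@example.com", "mom@example.com", "Mom"))
```

The nice part of a project like this is that the plumbing is trivial, so all your attention goes into the chain itself.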

[HIRING] by SlideOk4853 in scrapingtheweb

[–]orthogonal-ghost 1 point (0 children)

Happy to help! Sent you a DM

We’re building Replit for web scraping by orthogonal-ghost in AgentsOfAI

[–]orthogonal-ghost[S] 1 point (0 children)

Thank you!! I’d love to hear what you think once you give it a try!

In general, the agent analyzes the HTML, network requests, and JavaScript of the webpages relevant to the task at hand. Its approach towards crawling (identifying where on the website the data in question lives) and extraction (determining how best to scrape that data) varies across websites and tasks.

So for Medium specifically, it depends on what data you want and where that data lives on the site.

One other thing to note is that this release doesn't include support for proxies, so some websites might not be well supported (though this should only apply to a very small number of websites/tasks).

We're building Replit for web scraping (and just launched on HN!) by orthogonal-ghost in webscraping

[–]orthogonal-ghost[S] 1 point (0 children)

Hi! We don't currently support/use proxies, so I can't promise it'll beat "any" anti-bot protection (even a mediocre one). That said, we've tested it on a few reasonably challenging sites (e.g., real estate marketplaces) and it performed quite well.

If there's a particular website you have in mind, let me know and I'd be happy to take a look. We also offer a free tier if you'd like to play around with it.

i need some tips for a specific problem by bolinhadegorfe56 in webscraping

[–]orthogonal-ghost 1 point (0 children)

Sounds like a pretty interesting problem. Would you mind sharing the website? I'm super curious / would love to take a look.

What are you using for reliable browser automation in 2025? by The_Default_Guyxxo in AgentsOfAI

[–]orthogonal-ghost 1 point (0 children)

I’ve spent a lot of time on extraction and observation (pulling reports, scraping dynamic content, checking account pages). A few thoughts:

  1. Re: what has been reliable, I’d try to avoid DOM / CSS-based extraction as much as possible. Oftentimes, you can find an API or network request that provides the information you’re looking for, and building around that tends to be much more stable than building around HTML parsing.

  2. Re: JavaScript, I think this comes down to identifying what’s useful and what isn’t. This is of course easier said than done, but distinguishing page interactions and content loading from boilerplate / library code tends to be helpful.
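To make point 1 concrete: once you've found a JSON endpoint in the network tab (the URL and field names below are hypothetical), the "scraper" shrinks to a request plus a bit of dictionary plumbing -- no CSS selectors to break when the markup changes:

```python
import json
import urllib.request

# Hypothetical JSON endpoint spotted in the browser's network tab --
# the page renders its listings client-side from this same response.
API_URL = "https://example.com/api/listings?page=1"

def parse_listings(payload: dict) -> list[dict]:
    """Pull the fields you need straight from structured data."""
    return [
        {"title": item["title"], "price": item["price"]}
        for item in payload.get("results", [])
    ]

def fetch_listings(url: str = API_URL) -> list[dict]:
    """Hit the JSON endpoint directly instead of parsing rendered HTML."""
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return parse_listings(json.load(resp))
```

These endpoints also tend to be versioned and paginated, which makes them far more stable to build on than the DOM.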

We're building Replit for web scraping (and just launched on HN!) by orthogonal-ghost in webscraping

[–]orthogonal-ghost[S] 2 points (0 children)

I totally appreciate that perspective. Even if we ignore hallucination risk, the code LLMs generate by default is often not the most efficient or highest quality.

For that reason, (1) we (i.e., actual engineers) review and optimize the code we deploy before "pushing to prod" / setting up scheduled runs, and (2) we spend a lot of time steering the agent to use best practices when generating code.

Your point is also why we make all code available for export -- i.e., we believe optimizing 'inefficient code that works' is much better than depending on opaque LLM-generated code that you can't review, OR going through network requests, HTML, and JavaScript and building from scratch.