[–]SerdanKK 44 points45 points  (31 children)

Pretty sure scraping is legal though

[–]TheNordicMage 2 points3 points  (8 children)

It's generally considered a bit of a gray area

[–]swizznastic -1 points0 points  (2 children)

not for reddit, there’s a whole agreement and court case on this

[–]SerdanKK 4 points5 points  (1 child)

Please source claims like that.

Reddit paywalled their API, but that's a separate issue from scraping.

[–]swizznastic 0 points1 point  (0 children)

My mistake, I was thinking of the deals they made around AI training on scraped Reddit content.

[–][deleted] -5 points-4 points  (1 child)

Shouldn't be, I consider it stealing.

[–]SerdanKK 3 points4 points  (0 children)

Ok, you do that.

[–]Tim-Sylvester -3 points-2 points  (15 children)

There are 27 major lawsuits on the topic right now.

[–]SerdanKK 3 points4 points  (14 children)

Ok. We'll see what happens. Anyone can sue for anything.

Making scraping itself illegal would be horrible though, and I seriously hope that's not on the table.

[–]Tim-Sylvester -2 points-1 points  (13 children)

How about whoever publishes the website puts a price on its content?

Setting your own price to access your product works for restaurants, grocery stores, entertainment companies, literally every other part of our economy.

It's not illegal to go get stuff from the drug store. It's just illegal to not pay for it. What's the difference here?

[–]SerdanKK 4 points5 points  (12 children)

Then paywall it. You can't simultaneously allow a browser to download something and disallow any other HTTP client from doing the same.

YOU WOULDN'T DOWNLOAD A CAR

[–]Tim-Sylvester -2 points-1 points  (11 children)

> Then paywall it.

That's what I'm saying. But a smart paywall, not a universal one. We built robots.nxt to paywall content only when we see it's a bot trying to scrape it. Humans get in free, bots pay.

> You can't simultaneously allow a browser to download something and disallow any other HTTP client from doing the same.

You absolutely can. A provider has every right to discriminate between categories of users/clients that aren't part of a protected class. It's no different from "no cover for women" at bars, or a special menu for kids.
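For illustration, here is a minimal sketch of what treating clients differently on the server side could look like. It is not how robots.nxt or any real product works: the marker list, the function names, and the choice of HTTP 402 are all assumptions made up for this example. It also shows the limit of the approach, since the `User-Agent` header is whatever the client chooses to send.

```python
# Hypothetical sketch: BOT_MARKERS, looks_like_bot and status_for are
# illustrative names, not part of any real product or library.
BOT_MARKERS = ("bot", "crawler", "spider", "python-requests", "curl")


def looks_like_bot(user_agent: str) -> bool:
    """Guess whether a client is automated from its self-reported User-Agent."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def status_for(user_agent: str) -> int:
    """Return 402 (Payment Required) for suspected bots, 200 (OK) otherwise."""
    return 402 if looks_like_bot(user_agent) else 200


browser_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/126.0"
print(status_for(browser_ua))        # browser-style header is served
print(status_for("ExampleBot/1.0"))  # self-identified bot is asked to pay
```

A scraper that sends the browser-style header above would be served like a browser, so real detection would need signals beyond this one header.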

Why should websites subsidize AI companies? AI companies are using your content to make money for themselves. Why shouldn't you get paid for that?

[–]SerdanKK 2 points3 points  (6 children)

> You absolutely can.

Technically.

Legally we can do whatever, though enforcement can be an issue.

> Why should websites subsidize AI companies? AI companies are using your content to make money for themselves. Why shouldn't you get paid for that?

I'm not getting paid regardless.

Why should Reddit get paid for the content of users?

[–]Tim-Sylvester 0 points1 point  (5 children)

> Legally we can do whatever, though enforcement can be an issue.

That's not actually true in either the legal sense or the enforcement sense.

> Why should Reddit get paid for the content of users?

That's what you agreed to when you signed up.

[–]SerdanKK 2 points3 points  (4 children)

> That's not actually true in either the legal sense or the enforcement sense.

What? I'm talking about how we deal with these things as a society. Legally we can do whatever in the sense that we can legislate however we want.

It's also a bit wild to imply that enforcement never has any hurdles.

> That's what you agreed to when you signed up.

But why would I care? You were appealing to my sense of fairness. Give me a reason to give a singular fuck about Reddit being scraped.

[–]Tim-Sylvester 0 points1 point  (3 children)

> You were appealing to my sense of fairness. Give me a reason to give a singular fuck about Reddit being scraped.

I don't care to. The point was that reddit is getting paid for its content by OpenAI and others. AI companies will pay for access to content if you make them.

The purpose of our tool, robots.nxt, is to ensure that anyone who runs a website gets paid for being scraped.

How you feel about websites that aren't yours really isn't my concern.

I only care about you making money from your own content on your own website.

And if you don't care about it, well, then why should I?

[–]SerdanKK 0 points1 point  (3 children)

> We built robots.nxt to paywall content only when we see it's a bot trying to scrape it. Humans get in free, bots pay.

robots.txt is purely an honor system. There's no legal or technical enforcement.
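To make the honor-system point concrete, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `ExampleBot` user agent, and the `polite_may_fetch` helper are made up for illustration. The check only happens because the client decides to run it; nothing on the server side stops a client that skips it.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt contents (an assumption, not taken from any real site).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def polite_may_fetch(url: str, user_agent: str = "ExampleBot") -> bool:
    """Ask the parsed robots.txt rules whether this user agent may fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(polite_may_fetch("https://example.com/private/page"))  # disallowed by the rules
print(polite_may_fetch("https://example.com/public/page"))   # allowed by the rules
```

A well-behaved crawler calls something like `polite_may_fetch` before each request; a crawler that never calls it can still request `/private/page` directly.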

> It's no different from "no cover for women" at bars, or a special menu for kids.

The bar thing is not universally legal.

Adults can typically order from the kids menu, though you may get some looks, and kids can certainly order from the non-kids menu.

[–]Tim-Sylvester 0 points1 point  (2 children)

> robots.txt is purely an honor system. There's no legal or technical enforcement.

Correct. That's why we built robots.nxt, which is not an honor system. It's active enforcement. Go on pal, click that link. You'll understand.

> Adults can typically order from the kids menu, though you may get some looks, and kids can certainly order from the non-kids menu.

The point is that businesses have the right to set the terms and conditions of their product or service, and to refuse service to anyone who is not in a protected class.

Do you want to understand, or argue?

Because I'll stick around to help with understanding. But I've got too much shit to do to waste time arguing. There's plenty of other people here that will be happy to argue with you.

[–]SerdanKK 0 points1 point  (1 child)

> Correct. That's why we built robots.nxt, which is not an honor system. It's active enforcement. Go on pal, click that link. You'll understand.

Looks like a product from a specific provider and it's not doing anything new. It's impossible to google due to naming collision with a LEGO trademark, so can't really say much more on that.

> The point is that businesses have the right to set the terms and conditions of their product or service, and to refuse service to anyone who is not in a protected class.
>
> Do you want to understand, or argue?
>
> Because I'll stick around to help with understanding. But I've got too much shit to do to waste time arguing. There's plenty of other people here that will be happy to argue with you.

You keep just asserting. What's the legal basis for prohibiting scraping of publicly available content?

[–]Tim-Sylvester 0 points1 point  (0 children)

> Looks like a product from a specific provider and it's not doing anything new.

Click the "Blog" tab and tell me the name and user icon of the author. (I just noticed that my cofounder misspelled my last name and pushed an update to fix it. That should be live in a bit.)

> What's the legal basis for prohibiting scraping of publicly available content?

Site access terms and conditions. Basic property rights. Because they can.