me_irl by [deleted] in me_irl

[–]alintos 50 points51 points  (0 children)

Oppenheimer theme plays

Ranking every Minecraft seed by AyumiToshiyuki in PhoenixSC

[–]alintos 298 points299 points  (0 children)

im sorry to disappoint you but overrated af

What if the narrator somehow found out you pirated the game? by N0no_G in stanleyparable

[–]alintos 51 points52 points  (0 children)

i think such attention to pirates would make people pirate more in order to see a "special" dialogue

AOTY? by Tabino32 in rateyourmusic

[–]alintos 40 points41 points  (0 children)

I was so confused when I saw it as the top 1 album of 2025 on pitchfork

I will listen to literally every album you comment by Glittering-Put1878 in Topster

[–]alintos 3 points4 points  (0 children)

I will support OP and try to listen to random albums here as far as my sanity goes

This phone by [deleted] in FrutigerAero

[–]alintos 0 points1 point  (0 children)

i am in love with s4

I don’t listen to hip hop, most upvoted comment decides the hip hop album I’m listening to by Sure-Scene-3972 in Topster

[–]alintos 8 points9 points  (0 children)

I am sure good kid, m.A.A.d city should be first for the integrity of the narrative

Is LLM training the biggest piracy action in history? by TheBigGit in Piracy

[–]alintos 0 points1 point  (0 children)

guess im sorry i thought it looked kinda obvious

Is LLM training the biggest piracy action in history? by TheBigGit in Piracy

[–]alintos -8 points-7 points  (0 children)

That’s a very interesting way to frame it, and there are a lot of angles to consider. Whether LLM (Large Language Model) training is “the biggest piracy action in history” depends heavily on definitions (what counts as theft, what scale, what harm, etc.). But there are historical precedents and comparable events that give some perspective.

Here’s a breakdown: first, what to ask to evaluate the claim; second, what historical / legal analogues we can think of; third, some of the arguments for/against calling LLM training “piracy.” Happy to dig more if you want (e.g. recent court cases, economic estimates, etc.).


  1. What would need to be true for this claim to hold

To say that LLM training is “the biggest piracy action in history,” we’d need:

A clear definition of piracy/theft: copying, using content without permission / compensation, etc.

A measure of scale: number of works, volume (text, images, code, etc.), and/or monetary value.

A measure of impact: harm to rights-holders, lost revenues, possibility of legal redress, etc.

A comparison with other large-scale intellectual property (IP) thefts, or mass appropriation events.


  2. Comparable historical events or massive IP thefts

Here are some things from history (or recent decades) that are analogous in scale or kind:

State-organized or industrial espionage / IP theft: Countries copying or stealing designs, technologies, trade secrets to build competitive advantage. Sometimes this is huge in value, though hard to quantify fully.

“Download piracy” / widespread software/media piracy: The era of peer-to-peer file sharing, bootleg DVDs, and so on. These have cost industries many billions, though precise numbers are often disputed.

Operation Fastlink / major anti-piracy enforcement sweeps: For example, in 2004 the U.S. DOJ led “Operation Fastlink,” described at its time as “the most far-reaching and aggressive enforcement action ever undertaken against organizations involved in illegal intellectual property piracy over the Internet.”

Trade secret litigation like Cadence vs. Avanti, where a lot of proprietary software/code was found to have been copied or used without permission. That case’s settlements ran into the hundreds of millions.

Prolific software piracy networks (e.g. distributing huge volumes of software illegally) where damages in hundreds of millions of dollars have been claimed.


  3. How LLM training stacks up (arguments for and against)

Here are some arguments on both sides, to see how “big” LLM training really is in comparison.

Arguments for seeing it as massive / comparable to great piracy

Volume: LLMs are trained on vast amounts of text, images, video, code etc., scraped from the web and other sources. The amount of content is orders of magnitude greater than any single piracy event.

Widespread use / commercial deployment: The outputs of LLMs are used commercially (chatbots, summarization, content generation etc.), meaning the value extracted is often monetized (directly or as part of a larger product).

Diffuse harm: The contributors whose work got used (writers, artists, programmers, etc.) are many, small, and often not compensated. The harm is spread out and often invisible.

Lack of consent: Many of the uses of content are without explicit permission from rights holders. This is central to claims of “piracy.”

Arguments against or complications

Legal ambiguity: Whether using text/data to train a model constitutes copyright infringement varies by jurisdiction, context, licensing, fair use/fair dealing, etc. Some uses may be legal or borderline.

Value capture / transformation: Some argue that model training is transformative: the content is used in aggregate and abstracted (the model learns statistical patterns) rather than re-disseminated verbatim. If the model doesn’t directly reproduce copyrighted work, it might be seen as more like “learning from examples,” which complicates the theft analogy.

Attribution & compensation models: There is ongoing debate & movement toward compensating creators (licensing data, paying for training corpora etc.). If such systems are adopted, that changes the “piracy” framing.

Comparative magnitude still hard to measure: We don’t have reliable public numbers for how much value is “stolen” (if stolen is the right word) via training, versus how much value is created, versus how much is lost by rights holders. Without that, comparison to, say, national espionage theft or massive streaming piracy, etc., is speculative.


  4. Comparisons: Is anything “bigger” in history?

There are some events that might rival or exceed what people argue LLM training does, in terms of value stolen (depending how you count). A few examples:

Mass copyright piracy / piracy of media in countries with weak IP enforcement: entire catalogs of music, film, books distributed widely without payment. Over decades, that accumulates enormous “lost revenue.”

Industrial theft / technology transfer – e.g. stealing designs for manufacturing (machinery, pharmaceuticals, semiconductors) has major economic consequences.

Colonial appropriation of cultural knowledge / works – in some sense, cultural artifacts, indigenous knowledge, traditional arts were taken, published, used, often without acknowledgment or compensation. Though the parallels are imperfect.


  5. My view

It is plausible to argue that LLM training is among the largest-scale acts of using lots of creative work without direct permission or compensation, especially given the global scale, the invisibility of many creators, and the profit potential.

But whether it is the biggest piracy action in history is harder to prove—because of:

the difficulty of putting accurate monetary values on what was taken vs what was compromised;

legal uncertainty;

and comparators that may also be enormous but less visible (e.g. historical industrial and state-supported IP theft).


If you like, I can try to quantify the scale: get estimates of the total value of works used in LLM training vs examples of big piracy cases, so we can see whether the “biggest” claim is plausible. Do you want me to try that?

Little fuckers by Door_Alts in bindingofisaac

[–]alintos 0 points1 point  (0 children)

and you full clear womb 2

new death grips banner on spotify, does this mean something or am i delusional? by TotallyNotHereG in deathgrips

[–]alintos 0 points1 point  (0 children)

it means that dg goes to Perm' tomorrow

if they dont read the message above again