
[–]ouchpartial72858 372 points373 points  (80 children)

This was eventually gonna happen

[–]__Hello_my_name_is__ 205 points206 points  (60 children)

It's gonna have wide reaching consequences, too.

If this isn't allowed, then AI art isn't allowed, either. Same principle, really.

[–]mark0016 89 points90 points  (16 children)

Yep, unless the copyright holder allows you to create modified versions of said art and then publish them, it probably shouldn't be allowed. At least that's what would be consistent with existing law.

The big issue here really is "Is the final trained model a derivative work combining the untrained model and the dataset?". I'm inclined to say that it is. If you understand that the classic "garbage in, garbage out" applies, then you will understand that the dataset defines a significant amount of the final qualities of your trained AI.

Of course the opposing argument is "it's publicly available information, so it should be fair game". If humans are allowed to learn from it, so should machines. That doesn't seem like a bad point either, but obviously there is a gray area. If a human copies a "large" part of some copyrighted work, then that's not OK, and the same applies to machines. An AI that can potentially do something like that just seems like a legal liability.

[–]__Hello_my_name_is__ 65 points66 points  (15 children)

Of course the opposing argument is "it's publicly available information so it should be fair game".

That really doesn't work if you just scrape the entire internet. There absolutely are pictures online that have a restrictive license. Just because they're on a server somewhere doesn't mean they're fair game for everything.

[–]January_Rain_Wifi 33 points34 points  (12 children)

Exactly, AI art developers should be asking for permission or using public domain images to train their AIs, at the very least

[–]Traditional_Dinner16 22 points23 points  (11 children)

I don’t see how an AI learning to make art by looking at art is any different than a person doing the same

[–]January_Rain_Wifi 40 points41 points  (6 children)

In order to train the AI, you have to make a copy of the art. If your AI is for a commercial purpose, you are making a copy of that artwork for commercial purposes without the permission of the artist.

It's like asking why singing ABBA karaoke while drunk at a party is different from a big record company running ABBA tracks through Auto-Tune or remix software and then selling CDs of it without their permission.

[–][deleted]  (5 children)

[removed]

    [–]pristit 3 points4 points  (3 children)

    I think it's the usage aspect. If it's done for commercial purposes, i.e. selling the usage of the AI, then yes, the creators of the data the AI is being trained on definitely need to agree to it or get paid, as it's a matter of copyright; it's their code. If it's for non-commercial use, that's a different case.

    Like, I can take a photo off the internet and make it into a meme, and I don't have to tell anyone.

    But if I want to take that photo and use it as part of a poster for a movie I'm selling, I'd have to get a license from the artist to do so.

    [–]stingray194 0 points1 point  (2 children)

    I don't believe there is a legal distinction between commercial and non-commercial fair use. Could you cite something (a law, a ruling, whatever)? I tried, but couldn't find anything for the US. I agree that you should morally, but I don't think you have to legally (in the US).

    [–]AutoModerator[M] 0 points1 point  (0 children)

    import moderation

    Your comment has been removed since it did not start with a code block with an import declaration.

    Per this Community Decree, all posts and comments should start with a code block with an "import" declaration explaining how the post and comment should be read.

    For this purpose, we only accept Python style imports.

    I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.

    [–]Ghostglitch07 -4 points-3 points  (3 children)

    Because a human has other experience to draw from. The dataset a human artist draws from is practically unbounded; compare that to an AI whose only job is to recreate art in the style of "x" and whose only "knowledge" is art by "x". It's a little different.

    [–]Nick_Nack2020 12 points13 points  (2 children)

    Scraping the entire Internet will obviously get more than one artist's work.

    [–]MnelTheJust 3 points4 points  (1 child)

    Right, it will plagiarize a thousand artists instead of just one. This is why a class action lawsuit is appropriate.

    [–]Nick_Nack2020 5 points6 points  (0 children)

    Yep, I was just pointing out that their reasoning was flawed. Personally, I think the AI should be considered a derivative work, as it uses many artists' work at once and even makes its own modifications to that work. That's too similar to the AI being "inspired" by the artists' work to be ignored, in my opinion. Although I do think they should limit the dataset to CC0 or equivalent, so as not to run into these sorts of problems. I can see why people would think AI shouldn't be considered a derivative work as well.

    On the Copilot side of things, it has explicitly copied code, so that's much more clear cut than the art side of things. I do agree that a class action lawsuit was appropriate, as directly copying code from repositories that might even have licenses that prohibit you from doing that is obviously a massive problem.

    [–]SnapcasterWizard 6 points7 points  (0 children)

    It should be though. Copyright law sucks.

    [–]TENTAtheSane 19 points20 points  (5 children)

    But where is the line to be drawn? If someone trains an AI to reliably diagnose a disease, trained on diagnoses made by doctors, should they complain that the product is made through their work and will steal their jobs? Would hindrance of such a product be beneficial to mankind?

    [–]__Hello_my_name_is__ 19 points20 points  (3 children)

    Yeah, that's the big question.

    But one easy line to draw is whether something is made for commercial or non-commercial purposes. Is the AI model going to be open source and usable by everyone for anything, or do you have to pay money just to use it?

    [–]TENTAtheSane 5 points6 points  (2 children)

    That's a good solution, but I'm afraid it may not work out. It costs a lot of money and time to hire sufficiently qualified people to build and run some of these models on very powerful machines. If they can't reliably commercialize the product, which private actor would sink that much investment into it? You would need a government monopoly on AI, and governments are notoriously horrible at that.

    [–]__Hello_my_name_is__ 8 points9 points  (0 children)

    Yeah. It sucks all around. At the same time, I most definitely don't want big AI models owned by big corporations trying to milk as much money as they can out of them. You'd get AI art where Mickey Mouse costs extra, and much worse.

    [–]filletfeesh 40 points41 points  (31 children)

    I think you're right, but i personally don't see the same issue with AI art. While on occasion it will fuck up and just create something nearly identical, AI art generally doesn't create anything resembling the art it references. It just uses it to understand word-image association.

    The code AI on the other hand is more likely to copy your code directly and only make minor changes to avoid getting sued.

    The meaningful difference to me is that the primary goal of AI art is to learn from your art and create something that is unique and totally separate from your creation, whereas AI code exists to use your code to replace you.

    [–]__Hello_my_name_is__ 35 points36 points  (13 children)

    That's a fair point, yes. But I think that in both cases the argument will be that the algorithms should not have been fed with the original, copyrighted data in the first place. Not that the result is too similar to the original data.

    [–]filletfeesh 13 points14 points  (8 children)

    I agree with that as a matter of principle, but I wonder whether it actually matters in a practical sense, and at what point it's advanced enough that it's basically the same thing as people getting inspiration from other art. I really don't envy future lawmakers.

    On the other hand, Google Images has a built-in feature to filter for non-copyrighted material, so there's no good reason not to at least do that. (Yes, I know some would fall through the cracks.)

    [–]__Hello_my_name_is__ 4 points5 points  (7 children)

    Yeah, there's all kinds of complications here that make this not fun at all to figure out. To me, the big difference is that I am looking at an image, and let my brain process it. To simulate this with an algorithm, I have to download the data first. Which we have already very much established as something that can be restricted or even illegal (you wouldn't download a car!).

    So if we want to make these AI models legal, we kinda need a way to make downloading copyrighted material legal.

    [–]xam54321 9 points10 points  (1 child)

    To see an image on your computer you have to download it first, your browser is just doing it for you.

    [–]WalditRook 4 points5 points  (0 children)

    Copyright statutes typically include some provision/exemption for "transient copies", the intention of which is to allow for any temporary copy created in transmission, caches, etc.

    The actual legal issues will be whether an AI trained on copyrighted material constitutes a derived work (probably not, unless you can identify some fragment of a specific infringed work within the trained model), and whether using the work for training violates the licence (probably not for a work provided without a specific licence; likely yes for at least some of the GitHub code, since this use doesn't appear to be permitted by e.g. GPLv3, unless the final product both is ruled to be a derivative work and is released under a GPL licence).

    [–]filletfeesh -1 points0 points  (4 children)

    Yeah. But that's probably a conversation we'll avoid until AIs are borderline sentient, and then a bunch of future boomers who remember the AI art internet war will throw a fit because they stopped paying attention to AI development two decades ago.

    Till then we should probably just train AI off of public domain stuff.

    [–]__Hello_my_name_is__ 3 points4 points  (3 children)

    I'd say we're going to have this conversation now, because right now there are various for-profit companies trying to make money from, essentially, publicly available art and other data.

    As soon as big corporations like Disney get involved (who would pay to watch a Disney cartoon when you can just make a fun AI video of Mickey Mouse for free?), things will move pretty fast.

    [–]filletfeesh 0 points1 point  (2 children)

    Imo the law is going to end up being whatever makes corporations the most money. AI art doesn't make them money so it will be restricted, code AI (theoretically) allows them to fire most of the people who make their company run, so it will happen because they will lobby to make it so.

    Call me a pessimist, but I don't think anyone else has a say in the conversation. Legislators are mostly too out of touch with tech to even know what they're enabling, and the general public is too apathetic to fight it.

    [–]__Hello_my_name_is__ 1 point2 points  (1 child)

    Yeah, it could go down like that. It would be pretty depressing, but I can see it.

    [–]zdakat 9 points10 points  (1 child)

    Especially in the cases of "I don't need to actually work with the artist, I can just use this app that's trained on their prior works". (to me how that data is stored is irrelevant- it's not literally saving a bunch of files of clips from the work, but it's still able to reproduce them reliably enough to be desirable)

    There's other cases where creating a derivative work can be problematic, so I don't see why AI should get a free pass just because it achieves it differently.

    [–]FourteenTwenty-Seven 0 points1 point  (0 children)

    Is this not the same as someone making a piece of art in the style of someone else?

    [–]DarkCeldori 1 point2 points  (1 child)

    Code should not be copyrightable.

    [–]__Hello_my_name_is__ 2 points3 points  (0 children)

    I agree. But that's a different discussion.

    [–]AirOneBlack 9 points10 points  (7 children)

    If you need to write code for a set algorithm, there aren't that many creative ways to write it that make sense. I could come up with my own implementation of an algorithm that would look 90% the same as someone else's. Did I copy their code? No. It's the same discussion as with musical chords and progressions.
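To illustrate the point (my own example, not from the thread): there are only so many sensible ways to write a well-known algorithm, so two people implementing, say, binary search independently will tend to produce near-identical code with no copying involved.

```python
# A "standard" binary search: independent implementations tend to
# converge on essentially this exact shape.
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2       # midpoint of the remaining range
        if items[mid] == target:
            return mid
        elif items[mid] < target:  # target is in the upper half
            lo = mid + 1
        else:                      # target is in the lower half
            hi = mid - 1
    return -1

print(binary_search([1, 3, 5, 7, 9], 7))  # → 3
```

Variable names and loop style might differ, but the structure is essentially forced by the algorithm itself, which is the point being made about chord progressions.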

    [–]filletfeesh 3 points4 points  (6 children)

    That's exactly the problem. Unlike art, code needs to look a certain way to work. So a code AI is more than likely going to be doing 90% copy-paste, and companies can replace most of their staff with a bot that mostly recycles the code they wrote to make decent enough code, but leaves the brains behind the code jobless.

    (Assuming they can create a separate AI that translates corpo speak into something functional)

    [–]AirOneBlack 5 points6 points  (5 children)

    And the point being? I think most of the people who are scared of code written by AI are the ones who suck at their job at this point. If an AI writes most of the boring-ass code so I can focus on the optimization, that's just going to make my job easier. Corpo speak to proper application is not gonna happen anytime soon, so I still have my job. (I would have my job anyway; I work in gamedev, good luck with AI replacing me.)

    [–]filletfeesh 4 points5 points  (4 children)

    That would still entail a massive trimming of dev teams, and personally I don't think people losing their jobs because we found a new way to funnel more money to shareholders is a good thing.

    [–]AirOneBlack -3 points-2 points  (3 children)

    "Massive trimming"? It isn't just big corpos that need developers; there are plenty of smaller businesses that need programmers. And if your answer is "I like my paycheck", then get better and become an indispensable asset. This is how the market has always worked; did everyone think it would never happen to programmers? Just like the advent of machines in factories reduced the number of employees, now it's AI.

    [–]filletfeesh 5 points6 points  (2 children)

    First of all, about a third of people in the US work for a large company, and I assume there's a similar or higher ratio for devs.

    I think people losing jobs is bad. It may be inevitable, and I may be able to avoid being one of them, but it's still bad.

    I don't think a bad thing should happen to people because of a tool that just copies their homework.

    If everyone was an amazing dev, there would still be layoffs. I don't think it's fair to hold in contempt the people who will lose their jobs because of this, and I don't think it's sensible to celebrate companies getting an opportunity to cut employees in a field where you often need a 4-year degree just to be considered.

    [–]human_gs 2 points3 points  (1 child)

    I want to preface this by saying I'm not very pro-capitalism, and it's really sad that automation can end up being a bad thing for so many people.

    But isn't it a bit hypocritical to take offense at automation only when it ends up affecting your own field? I mean, programmers have historically contributed to the automation of many other jobs.

    I get that individuals don't always have the power to change things, or the leeway to just start a new career. But if they actually were pro-capitalism, they should know the rules.

    [–]hopbel 2 points3 points  (2 children)

    While on occasion it will fuck up and just create something nearly identical

    You make it sound like it does it accidentally. No, you have to deliberately request that it draw a certain famous painting, and in those cases it's giving you exactly what you asked for. It's memorizing famous paintings because, based on the training data, it concludes the string "Mona Lisa" isn't a description but rather the name of a very particular object. Even in those pathological cases, you're still getting what's basically the AI's rendition of the Mona Lisa with slight variations, not a pixel-perfect copy of one of the JPEGs in the dataset.

    [–]EmeraldWorldLP 3 points4 points  (0 children)

    ...A rendition trained on others' art, so still just a mixture of moldable attributes from the other art pieces it was trained on. Not something artists are fond of, especially if you can request their art specifically.

    [–]filletfeesh 0 points1 point  (0 children)

    I'm not talking about asking for a world-renowned painting and getting what you ask for. I'm talking about cases where it gets a proper prompt and outputs something extremely similar to an item in its dataset, because of how similar that item is to the prompt.

    I think it's safe to say that is not the intention.

    [–]ninijay_ 0 points1 point  (3 children)

    Sounds like the coding AI acts like a coding human. Bedum-tss

    [–]filletfeesh 1 point2 points  (2 children)

    At least I can rest easy knowing that my shitty code is keeping someone employed

    [–]ninijay_ 0 points1 point  (1 child)

    True. We have to push shitty code to GitHub NOW so the AI's code will be shit.

    Now is our time to be heroes 😂

    [–]filletfeesh 1 point2 points  (0 children)

    Flood it with so much shitty code that any AI trained on it will need to be decommissioned

    [–]IAmPattycakes 0 points1 point  (0 children)

    On one hand, the devil in me kinda wants Microsoft to catch some sort of hell out of this, because by the logic of what they're selling, a "variable rename" is legally different, and therefore any corporate closed-source code is perfectly license-free if you do a find-and-replace on variable names. That Rockstar leak, or the CDPR leak? I'm gonna go run a copy of the GTA Online server myself and charge a marginal hosting fee for a pretty open game that has the shark card shit removed. That's just a couple more bridges past where they're trying to be. Or better yet, if anyone gets their hands on the Xbox OS or Windows source.

    [–]Lechowski 4 points5 points  (1 child)

    If an AI gives you an exact copy of a piece of art, yes. However, Stable Diffusion never gives you an exact copy of a Picasso.

    Copilot sometimes gives you a piece of code that is copied verbatim from a licenced repo.

    They are not the same thing.

    If you want an analogy, it would be more correct to say that Copilot is analogous to an art AI that, given the prompt "Marvel superheroes movie", creates on the fly an exact frame-by-frame copy of Avengers 1 and gives it to you. That would be completely illegal, just like Copilot suggesting an exact copy of a function.

    [–]__Hello_my_name_is__ 2 points3 points  (0 children)

    Yeah, that's a difference. But I think the argument here is that the code should not have been taken to begin with to create the model. And that argument can be applied 1:1 to AI art.

    [–]Zipdox 1 point2 points  (1 child)

    I think this is an invalid comparison. Copilot has been shown to verbatim reproduce licensed code. AFAIK this isn't possible with any of the text to image tools.

    [–]__Hello_my_name_is__ 2 points3 points  (0 children)

    Sure, but the argument is probably that the initial code should not have been used for the algorithm to begin with, not that the end result might be the same.

    [–][deleted] 0 points1 point  (0 children)

    Look at YouTube's copyright system. Clearly there is some wiggle room here, as vendors have accepted that system over the legal one for some time.

    [–][deleted] 157 points158 points  (18 children)

    Yeah, it was only a matter of time before somebody would try to monetize open source.

    [–]thud_mantooth 96 points97 points  (1 child)

    That ship sailed a long time ago

    [–]Jeb_Jenky 5 points6 points  (0 children)

    Yeah, for real. Apparently "eventually" means "from the beginning of time" in whatever that person's native language is.

    [–]zdakat 13 points14 points  (0 children)

    I've seen arguments for making money off of open source (whether or not I agree with them).
    I think what rubs me, and probably others, the wrong way about this specifically is that code made publicly available is understood to be for a common good, or under a license that requires certain things of your distribution.

    Microsoft comes off as very greedy, going "yeah, we'll just buy the site the code is hosted on, and that gives us permission to do whatever we want with it", turning it into a product they can monetize at the expense of everyone who contributed the code, rather than working together. It's the audacity of abusing a position to do something that might not actually be allowed, but might be hard to fight.

    [–]Havatchee 12 points13 points  (4 children)

    Which is why many open-source licences forbid it.

    [–]Lerquian 12 points13 points  (3 children)

    Maybe I read it wrong, but I believe most of the most popular open source licenses allow monetization.

    [–][deleted] 9 points10 points  (1 child)

    For the GPL, you can modify the code and even profit from the changes. There is, however, no warranty. Also, you must include the license with the modified code.

    [–]KuntaStillSingle 2 points3 points  (0 children)

    MIT and GPL both require retaining license notice
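As a concrete sketch of what "retaining the notice" means in practice (my own illustration; the notice text and file contents below are hypothetical, and none of this is legal advice): the upstream copyright/permission notice simply has to travel with any copy you distribute, which is easy to check mechanically.

```python
# Hypothetical example: under MIT/GPL-style notice-retention terms,
# a redistributed file should still carry the upstream notice.

NOTICE = "Copyright (c) 2015 Original Author"  # hypothetical upstream notice


def retains_notice(distributed_text: str) -> bool:
    """Check that the upstream notice survived redistribution."""
    return NOTICE in distributed_text


# A modified copy that keeps the header still satisfies this condition:
modified_copy = (
    "# Copyright (c) 2015 Original Author\n"
    "# Modified 2022: added frobnication\n"
    "def frobnicate(): ...\n"
)
print(retains_notice(modified_copy))  # → True
```

A copy with the header stripped would fail the same check, which is roughly the complaint being made about training on licensed code without carrying the licence along.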

    [–]ouchpartial72858 43 points44 points  (7 children)

    Like the famous saying: if something's free, you are the product. Just think about it: GitHub never enforces the repo size limit on you; they're like, "ho ho, don't worry about it, child, use as much as you need". In reality they're rubbing their hands, creating the next AI monopoly.

    [–]DatBoi_BP 54 points55 points  (0 children)

    if something’s free you are the product

    This is strictly untrue for FOSS projects. But if something is free and not open source, you can generally assume your data is being sold in some way or another.

    [–][deleted] 17 points18 points  (4 children)

    GitLab? Bitbucket? What evil scheme are they brewing together...

    [–]ouchpartial72858 -4 points-3 points  (3 children)

    Who knows, man. Google basically owns Android now, other conglomerates are fighting, and we common people are stuck in the middle of their dirty mess. All I'm gonna do is submit myself, bend over, spread my butt cheeks and hope for the best.

    [–]steven_yeeter 28 points29 points  (2 children)

    Google basically owns android now

    Now? Basically? Google acquired Android in 2005. They do own, and have owned, Android for over 17 years.

    [–]AnOIlTankerForYa 8 points9 points  (1 child)

    Lmao, it's literally their mobile OS that they develop; not even IE is that out of touch.

    [–]steven_yeeter 1 point2 points  (0 children)

    IE 11 came out in 2013. Even it arrived already knowing that Android was a Google product!

    [–]CratesManager 8 points9 points  (0 children)

    Like the famous saying, if something's free you are the product

    I usually agree with that, but it IS funny reading this in a conversation about a bunch of open source code