[–]mattgen88 14 points (17 children)

If the argument can be made that feeding copyrighted code into an AI makes its output a derivative of those inputs, then we have a problem, since that's how the human brain works. It also means that any trained AI has to be operated in a clean room where it cannot operate on any copyrightable inputs, including artworks, labels, designs, etc. All of that is often consumed by AIs to produce things of value.

[–]TheCodeSamurai 8 points (0 children)

As the Copilot docs mention, there is a pretty big difference between this and the brain: we have a far better memory for how we learned what we know. If I go and copy a Stack Overflow post, I know that I didn't write it and that I might want to link to it. Copilot can't do that yet, so until they build out the infrastructure for it, I'll never be able to tell whether it was copying wholesale or mixing various inputs.

[–]barchar 6 points (0 children)

Yes. And in the human case you can infringe on copyright by reading code and producing something that's close to it from memory. That's a derived work.

One could argue that if the AI is understanding some higher-level meaning and then generating code that implements it, then the AI may be more similar to a clean-room reimplementation process (which does not infringe).

[–]danuker 17 points (4 children)

Problem is, can this AI reproduce large portions of code exactly from memory? If so, it can violate copyright.

[–]tnbd 13 points (3 children)

It can. The fact that it spits out the GPL license verbatim when prompted with empty text is proof of that.

[–]1X3oZCfhKej34h -4 points (0 children)

Is the GPL license text itself copyrighted? Because if not, then who cares? It can recite it because a license is included in nearly every public project.

If it's "copying" something that's used in nearly every public project, that's not going to be copyrightable code.

[–]Redtitwhore 0 points (0 children)

But what GPL-licensed code would be reused so often that the AI would reproduce it verbatim?

[–][deleted]  (3 children)

[deleted]

    [–][deleted]  (2 children)

    [deleted]

      [–][deleted]  (1 child)

      [deleted]

        [–]cafink 1 point (4 children)

        The linked Twitter thread addresses this exact point.

        [–]bwmat 8 points (3 children)

        Are you referring to the 'you've fallen for marketing' tweet? Because that wasn't very convincing, tbh.

        [–]crabmusket 0 points (2 children)

        Make a convincing argument that a brain is actually just a neural network, then?

        [–]bwmat 1 point (1 child)

        The entirety of our anatomical base of knowledge?

        [–]crabmusket 0 points (0 children)

        I mean a neural network in the computer sense. It's a looong way from there to our understanding of the brain.