The public is not learning from it. A person or corporation is creating a derivative work of it. Training a model is deriving a function from the training data. It is not "a human learning something by reading it".
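To make "deriving a function from the training data" concrete, here is a deliberately toy sketch (ordinary least squares on four made-up points, nothing like a real model's training loop): what gets kept is a derived parameter, not the data itself.

```python
# Toy "training data": (x, y) pairs sampled from some underlying relationship.
# All values here are made up for illustration.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

def train(pairs):
    """Derive the slope w of y = w * x by least squares.

    The output is a single parameter distilled from all the pairs;
    none of the individual data points survive in it.
    """
    num = sum(x * y for x, y in pairs)
    den = sum(x * x for x, _ in pairs)
    return num / den

w = train(data)
model = lambda x: w * x  # the function derived from the data

print(round(w, 2))  # → 1.99
```

Whether deriving parameters this way legally constitutes making a derivative work is exactly what the thread is arguing about; the sketch only shows what "deriving a function" means mechanically.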
It's no more of a stretch than saying that re-encoding a PNG as a JPEG produces a derivative work, even though the process is lossy and the resulting bits look nothing alike.
You think that a model that's capable of being prodded into producing an infringing output, in addition to all the other non-infringing outputs it could produce, is no different from a compression algorithm?
If I "process data" by doing a word count of a book, and then I publish the number of words in that book (not the words themselves! Just a word count!), I haven't created a derivative work.
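For concreteness, that kind of "processing" looks like this (a trivial sketch using a public-domain opening line; the published output is a single number, not any of the expression):

```python
# Counting words in a text: the derived output is a statistic,
# not the words themselves.
book = "It was the best of times, it was the worst of times."

def word_count(text):
    """Return the number of whitespace-separated words in the text."""
    return len(text.split())

print(word_count(book))  # → 12
```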