Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The public is not learning from it. A person or corporation is creating a derivative work of it. Training a model is deriving a function from the training data. It is not "a human learning something by reading it".


It's an extreme stretch to say that the model weights are a derivative work of the training data given the legal definition of "derivative work".


It's not more a stretch than saying that re-encoding a PNG as a JPEG is a derivative work even though the process is lossy and the resulting bits look nothing alike.


I'm not sure you're being intellectually honest.

You think that a model that's capable of being prodded into producing an infringing output in addition to all the other non-infringing outputs it could produce is no different than a compression algorithm?


It is processed data at the end of the day. And no it is not like human reading. You can't read whole Github.


That doesn't make it a derivative work.

If I "process data" by doing a word count of a book, and then I publish the number of words in that book (not the words themself! Just a word count!) I haven't created a derivative work.

Processing data isn't automatically infringement.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: