
Ah, I assumed that the clauses regarding use in training an LLM were printed somewhere inside the book.


It would still be unenforceable because there's no consideration.

There is nothing of value that the license gives me that I wouldn't already have if the contract didn't exist. I can already read the book, merely by having it in front of me.


How does that give you the right to train an LLM on it?

Or are we talking about training an LLM on it and never releasing that LLM to anyone ever? Then I guess it wouldn't matter. But if that LLM is released to anyone, shouldn't the author of the book have a say on it?


> How does that give you the right to train an LLM on it?

Fair use gives me that right, not a contract or license.


Whether that falls under fair use is highly debatable.


It's going through the courts right now. We'll probably have an answer in a year or two.


I felt for a long time that it should be fair use. If an LLM can abstract what it learns from the copyrighted work, then that seems "fair" because that's what humans do.

But ... as I've thought about it more, it doesn't really feel just to me. The kind of value reaped from the works seems to suggest that the creator is due some portion of that value. Also, in practice, there's an absolutely enormous amount of knowledge that can be consumed from the public domain. Even if Meta, OpenAI and friends decided to license only a small handful of the long-term archives of some globally read newspapers, they could get very broad and deep knowledge about the events, trends, and terms of the last century to fill in a lot of gaps.



