Hacker News
yurlungur on Feb 7, 2025 | on: Meta torrented & seeded 81.7 TB dataset containing...
I think the difference may be that LLMs won't be laundered clean of copyrighted data anytime soon. Even if ChatGPT gets big and profitable, it's not clear that it won't contain copyrighted data, since such data may simply be necessary to train the best models.
cma on Feb 8, 2025
Most of the web is copyrighted