
Well, yes, but actually no, I guess: the benefit of transformers at the time was that they were more stable during training, which made it possible to train larger and larger networks on larger and larger datasets.

