
A rigorous approach to predicting future text was proposed by Li et al. (2024), "Evaluating Large Language Models for Generalization and Robustness via Data Compression" (https://ar5iv.labs.arxiv.org/html//2402.00861), and I think that work deserves more recognition.

They measure compression (i.e., perplexity) on Wikipedia articles, news, code, arXiv papers, and multimodal data created after each model's training cutoff, which by construction cannot appear in the training set. Data compression is intimately connected with robustness and generalization: compressing genuinely unseen data well requires having learned regularities that transfer.

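For intuition, here is a minimal sketch of the perplexity-to-compression correspondence, assuming a Hugging Face-style causal LM ("gpt2" and the sample text are placeholders, not the paper's actual setup). A model's average next-token cross-entropy on a text is, by the arithmetic-coding bound, the number of bits per token an ideal coder driven by that model would need:

  import math
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  # Placeholder model; any causal LM works the same way.
  name = "gpt2"
  tok = AutoTokenizer.from_pretrained(name)
  model = AutoModelForCausalLM.from_pretrained(name).eval()

  text = "Sample text, ideally written after the model's training cutoff."
  ids = tok(text, return_tensors="pt").input_ids

  with torch.no_grad():
      # With labels=input_ids, the model returns the mean next-token
      # cross-entropy in nats, averaged over the len-1 predicted positions.
      nll = model(ids, labels=ids).loss.item()

  n_pred = ids.numel() - 1
  total_bits = nll * n_pred / math.log(2)        # ideal code length in bits
  bpb = total_bits / len(text.encode("utf-8"))   # bits per byte of raw text
  print(f"perplexity = {math.exp(nll):.2f}")
  print(f"bits/byte  = {bpb:.3f} (vs. 8 for raw UTF-8, i.e. {8 / bpb:.2f}x)")

Lower bits/byte on held-out future text means better compression, which is the quantity the paper tracks over time.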

Thanks for the paper; I just read it and loved the approach. I hope the concept of using data compression as a benchmark takes off. In a sense it is similar to the maxim "If you cannot explain something in simple terms, you do not understand it fully."


