Paper ID: 2307.15002
Gzip versus bag-of-words for text classification
Juri Opitz
The effectiveness of compression-based text classification ('gzip') has recently attracted considerable attention. In this note we show that 'bag-of-words' approaches can achieve similar or better results while being more efficient.
Submitted: Jul 27, 2023
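
To make the comparison in the abstract concrete, the following is a minimal sketch of the two kinds of text distance it contrasts, each plugged into a 1-nearest-neighbour classifier. It rests on several assumptions that are not taken from this listing: the compression distance is the standard normalized compression distance computed with Python's `gzip` module, the bag-of-words distance is a simple Jaccard distance over whitespace-separated tokens, and the function names (`ncd`, `bow_distance`, `nearest_label`) and the toy training data are hypothetical. The note itself may define its bag-of-words distance differently.

```python
import gzip


def compressed_len(text: str) -> int:
    """Length in bytes of the gzip-compressed UTF-8 encoding of text."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance between two strings (assumed variant)."""
    ca, cb = compressed_len(a), compressed_len(b)
    cab = compressed_len(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def bow_distance(a: str, b: str) -> float:
    """Jaccard distance between the word sets of two strings (assumed variant)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 0.0  # two empty texts are treated as identical
    return 1.0 - len(wa & wb) / len(wa | wb)


def nearest_label(query, train, distance):
    """Return the label of the training text closest to query (1-NN)."""
    return min(train, key=lambda pair: distance(query, pair[0]))[1]


# Hypothetical toy data, for illustration only.
train = [
    ("the team won the football match", "sports"),
    ("the central bank raised interest rates", "finance"),
]
query = "the football team lost the match"

bow_prediction = nearest_label(query, train, bow_distance)
gzip_prediction = nearest_label(query, train, ncd)
```

Both distances fit the same nearest-neighbour interface, so they can be swapped without touching the classifier. The bag-of-words variant needs only set operations on tokens, whereas the compression variant runs gzip once per text and once per compared pair, which gives a rough intuition for the efficiency difference the abstract mentions.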