Researchers Develop New Audio Codec: Sparse, Interpretable Representation Challenges Block-Coding Methods

Toward a Sparse and Interpretable Audio Codec

Abstract: Most widely used modern audio codecs, such as Ogg Vorbis and MP3, as well as more recent "neural" codecs like Meta's Encodec or the Descript Audio Codec, are based on block-coding; audio is divided into overlapping, fixed-size "frames" which are then compressed. While they often yield excellent reproductions and can be used for downstream tasks such as text-to-audio, they do not produce an intuitive, directly interpretable representation. In this work, we int...
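
To make the block-coding idea in the abstract concrete, here is a minimal sketch of the framing step only: splitting a signal into overlapping, fixed-size frames. It is not taken from the paper or from any of the codecs named above. The function name `frame_signal`, the parameters `frame_size` and `hop_size`, and the zero-padding of the final frame are all assumptions made for illustration. Real codecs typically also apply a window and a transform to each frame before compressing it.

```python
def frame_signal(samples, frame_size, hop_size):
    """Split a sequence of samples into overlapping, fixed-size frames.

    Hypothetical helper for illustration only. Consecutive frames start
    hop_size samples apart, so they overlap by frame_size - hop_size samples.
    The last frame is zero-padded if the signal runs out (an assumption;
    other padding schemes are possible).
    """
    if not 0 < hop_size <= frame_size:
        raise ValueError("need 0 < hop_size <= frame_size")

    samples = list(samples)
    frames = []
    start = 0
    while True:
        frame = samples[start:start + frame_size]
        # Pad a short trailing frame with zeros so every frame has equal length.
        frame += [0.0] * (frame_size - len(frame))
        frames.append(frame)
        # Stop once this frame has covered the end of the signal.
        if start + frame_size >= len(samples):
            break
        start += hop_size
    return frames


# Ten samples, frames of 4 samples, advancing 2 samples each time (50% overlap).
frames = frame_signal(range(10), frame_size=4, hop_size=2)
print(frames)  # [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```

Each sample in the middle of the signal appears in two frames here, which shows why a frame-based representation describes the audio in fixed time slices rather than in terms of individual sound events.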