[music-dsp] Low bitrate audio compression
decoy at iki.fi
Mon Dec 30 07:38:01 EST 2002
On 2002-12-30, Roman Katzer uttered to Sampo Syreeni:
>Last number I heard was less than 60 _phonemes_ for all languages.
Really? Don't have access to the IPA handbook, but the last widely
repeated number I heard was 11 to 141 phonemes for any one language, and
nearer to a thousand phones for the whole inventory of sounds utilized by
currently known languages. If I'm not entirely mistaken, that's excluding
suprasegmental variation, like tone and stress. Cf.
http://www.angelfire.com/ego/pdf/ng/lng/how/how_sounds.html , and
http://www.wikipedia.org/wiki/Phoneme . IPA is capable of marking most (if
not all) of these sounds, but uses a horde of accents and modifiers to
accomplish the task.
>One professor of mine made an interesting calculation once to show that
>it's possible to intelligibly code speech with a data rate of about 56
>bits per second.
Double that, and I wouldn't be surprised. But 56bps does appear somewhat
tight -- a typical phoneme rate in speech is about 10 per second, and can
reach higher depending on language and speaker. With a typical palette of
30 phonemes in a language, a flat code needs log2(30), or about 4.9 bits,
per phoneme, so we'd get to around 50bps without any duration, framing,
stress or tone data. Even with intelligent coding and statistical
compression (phoneme, length and what-have-you distributions rarely
approach flat) 56bps feels a wee bit low.
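
For anyone who wants to check that back-of-the-envelope figure, here is a
minimal Python sketch. The function names and the 1/k weighting are
illustrative assumptions only, not measured phoneme statistics; the 10
phonemes per second and the 30-phoneme palette are the rough figures used
in the estimate.

```python
import math


def flat_bitrate(phonemes_per_second, inventory_size):
    """Bits per second when every phoneme gets an equal-length code."""
    return phonemes_per_second * math.log2(inventory_size)


def entropy_bits(probabilities):
    """Shannon entropy, in bits per symbol, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Figures from the estimate: ~10 phonemes/s, ~30 phonemes in the palette.
RATE = 10
INVENTORY = 30

flat = flat_bitrate(RATE, INVENTORY)

# Hypothetical skewed distribution (weights proportional to 1/k), used
# only to show that a non-flat distribution lowers the bound.
weights = [1.0 / k for k in range(1, INVENTORY + 1)]
total = sum(weights)
skewed = [w / total for w in weights]
skewed_rate = RATE * entropy_bits(skewed)

print(f"flat code:   {flat:.1f} bps")
print(f"skewed code: {skewed_rate:.1f} bps (made-up distribution)")
```

Neither figure includes duration, framing, stress or tone, so both are
lower bounds on the segmental information alone.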
Sampo Syreeni, aka decoy - mailto:decoy at iki.fi, tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2