On Thu, Sep 21, 2023 at 6:28 AM Tom Beecher <beecher@beecher.cc> wrote:
> My understanding has always been that 30ms was set based on human perceptibility. 30ms was the average point at which the average person could start to detect artifacts in the audio.
Hi Tom,

Jitter doesn't necessarily cause artifacts in the audio. Modern applications implement what's called a "jitter buffer." As the name implies, the buffer collects and delays audio for a brief time before playing it for the user. This allows time for the packets which have been delayed a little longer (jitter) to catch up with the earlier ones before they have to be played for the user. Smart implementations can adjust the size of the jitter buffer to match the observed variation in delay so that sound quality remains the same regardless of jitter.

Indeed, on Zoom I barely noticed audio artifacts for a friend who was experiencing 800ms jitter. Yes, really, 800ms. We had to quit our gaming session because it caused his character's actions to be utterly erratic, but his audio came through okay.

The problem, of course, is that instead of the audio delay being the average packet delay, it becomes the maximum packet delay. You start to have problems with people talking over each other because when they start speaking they can't yet hear the other person talking. "Sorry, go ahead. No, you go ahead."

Regards,
Bill Herrin

--
William Herrin
bill@herrin.us
https://bill.herrin.us/
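
[For illustration, a minimal Python sketch of the adaptive jitter-buffer idea described above. The class, constants, and playout policy are hypothetical, not taken from Zoom or any real VoIP stack; only the jitter estimator follows the well-known RFC 3550 smoothing approach.]

    import heapq

    class AdaptiveJitterBuffer:
        """Toy adaptive jitter buffer: hold packets briefly so late
        (jittered) packets can catch up before playout. Names and
        thresholds are illustrative only."""

        def __init__(self, initial_delay_ms=30.0):
            self.playout_delay_ms = initial_delay_ms  # how long audio is held
            self.jitter_estimate_ms = 0.0             # smoothed delay variation
            self.last_transit_ms = None               # previous (arrival - send) time
            self.buffer = []                          # min-heap keyed by sequence number

        def on_packet(self, seq, send_ts_ms, arrival_ts_ms, payload):
            # Estimate jitter the way RTP receivers commonly do (RFC 3550 style):
            # smooth the change in one-way transit time with a 1/16 gain.
            transit = arrival_ts_ms - send_ts_ms
            if self.last_transit_ms is not None:
                d = abs(transit - self.last_transit_ms)
                self.jitter_estimate_ms += (d - self.jitter_estimate_ms) / 16.0
            self.last_transit_ms = transit

            # Grow or shrink the playout delay to track observed jitter:
            # steadier audio at the cost of added latency. The 3x multiplier
            # and 20ms floor are arbitrary choices for this sketch.
            self.playout_delay_ms = max(20.0, 3.0 * self.jitter_estimate_ms)

            heapq.heappush(self.buffer, (seq, arrival_ts_ms, payload))

        def packets_ready(self, now_ms):
            """Pop packets that have waited at least the current playout delay."""
            ready = []
            while self.buffer and now_ms - self.buffer[0][1] >= self.playout_delay_ms:
                ready.append(heapq.heappop(self.buffer)[2])
            return ready

[As the sketch suggests, the listener's delay ends up tracking roughly the worst-case packet delay rather than the average, which is why heavy jitter shows up as conversational lag rather than audible artifacts.]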