Hi Dave,

You did not say: is it interactive? Because if it is not, we could use big buffers and convert jitter into latency (some STBs have sub-second buffers).

Then jitter would effectively become zero (more precisely: not a problem), and we would deal only with the latency consequences.

Hence, your question is not about jitter; it is about latency.
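
A toy sketch of that idea (Python; the 40 ms buffer value and packet times are illustrative assumptions, not measurements): a playout buffer holds every packet until a fixed deadline, so the receiver sees one constant latency instead of jitter.

import heapq

class PlayoutBuffer:
    """Toy de-jitter buffer: hold each packet until its fixed playout deadline.

    Packets arrive with variable network delay (jitter); we release each one
    at send_time + buffer_ms, so the application sees a constant latency of
    buffer_ms instead of jitter, as long as buffer_ms exceeds the worst jitter.
    """
    def __init__(self, buffer_ms: float):
        self.buffer_ms = buffer_ms
        self._heap = []  # (playout_deadline_ms, payload)

    def on_packet(self, send_time_ms: float, payload: bytes) -> None:
        heapq.heappush(self._heap, (send_time_ms + self.buffer_ms, payload))

    def poll(self, now_ms: float) -> list:
        """Return all packets whose playout deadline has passed."""
        out = []
        while self._heap and self._heap[0][0] <= now_ms:
            out.append(heapq.heappop(self._heap)[1])
        return out

# 40 ms of buffer absorbs up to 40 ms of jitter, at the cost of 40 ms of latency.
buf = PlayoutBuffer(buffer_ms=40)
buf.on_packet(send_time_ms=0, payload=b"frame-0")   # arrived on time
buf.on_packet(send_time_ms=20, payload=b"frame-1")  # arrived late (jitter)
print(buf.poll(now_ms=41))  # [b'frame-0']; frame-1 is not due until t = 60 ms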

 

By all 5 (or 6?) senses, a human is a 25 ms resolution machine (limited by the animal part of our brain: the limbic system). Anything faster is “real-time”. Even echo cancellation is not needed – we hear the echo but cannot separate the signals.

A dog has 2x better resolution, a cat 3x better. They probably hate the picture on cheap monitors (PAL/SECAM was 50 Hz, NTSC was 60 Hz).

The 25 ms is for everything, round trip. 8 ms is spent just on visualization, even on the best screen (120 Hz).

The typical budget left for the networking part (speed of light in fiber) is about 5 ms one way (roughly 1000 km, or do you prefer miles?).

Maybe less, depending on the rendering in the GPU (3 ms?), processing in the application (3 ms?), the sensor capturing the initial signal (1 ms?), and so on.

The worst problem is that the jitter buffer is subtracted from the same 25 ms budget.

Hence, the jitter buffer can easily consume the ~10 ms that we typically have for networking, and then we are left with just 1 ms, which pushes us to install MEC (distributed servers in every municipality).

Accounting for the jitter buffer, it is pretty hard to be “real-time” for humans.
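
To make the budget concrete, a back-of-the-envelope sketch (Python; all the component values are the rough assumptions above, and fiber is taken at roughly 200,000 km/s – not measured numbers):

# Rough "real-time for humans" budget, using the assumed numbers from this mail.
HUMAN_BUDGET_MS = 25.0   # perceptual resolution, round trip
DISPLAY_MS      = 8.0    # ~1 frame on a 120 Hz screen
GPU_MS          = 3.0
APP_MS          = 3.0
SENSOR_MS       = 1.0

LIGHT_IN_FIBER_KM_PER_MS = 200.0  # ~c divided by the fiber refractive index

def network_budget_ms(jitter_buffer_ms: float = 0.0) -> float:
    """What is left for the network (round trip) after the local components."""
    return HUMAN_BUDGET_MS - DISPLAY_MS - GPU_MS - APP_MS - SENSOR_MS - jitter_buffer_ms

def max_one_way_km(jitter_buffer_ms: float = 0.0) -> float:
    """Maximum one-way fiber distance that still fits the budget."""
    one_way_ms = network_budget_ms(jitter_buffer_ms) / 2
    return max(one_way_ms, 0.0) * LIGHT_IN_FIBER_KM_PER_MS

print(network_budget_ms())     # 10.0 ms left for the network, round trip
print(max_one_way_km())        # ~1000 km one way
print(network_budget_ms(9.0))  # add a 9 ms jitter buffer -> 1.0 ms left
print(max_one_way_km(9.0))     # ~100 km -> servers in every municipality (MEC)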

Hint: “pacing” is the solution. The application should send packets at equal intervals. It is widely adopted by all the OTTs.

By the way, “pacing” has many other positive effects on networking.
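
A minimal sketch of the idea (Python; send_packet is a hypothetical callback, and in practice you would usually let the kernel pace, e.g. the Linux fq qdisc – this just shows the equal-interval principle):

import time

def paced_send(packets, bitrate_bps: float, send_packet) -> None:
    """Send packets at equal intervals instead of in bursts.

    Each packet is scheduled so the long-run rate equals bitrate_bps;
    the spacing for a packet of len(pkt) bytes is 8*len(pkt)/bitrate_bps.
    send_packet is a hypothetical callback (e.g. a UDP socket's sendto).
    """
    next_send = time.monotonic()
    for pkt in packets:
        now = time.monotonic()
        if next_send > now:
            time.sleep(next_send - now)            # wait for this packet's slot
        send_packet(pkt)
        next_send += (len(pkt) * 8) / bitrate_bps  # equal spacing at the target rate

# Example: 1200-byte packets at 10 Mbit/s -> one packet every ~0.96 ms.
paced_send([b"x" * 1200] * 5, 10_000_000, lambda p: None)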

 

The next level is our reaction time (the ability to click). That is about 150 ms for some people, 250 ms on average.

Hence, gaming is noticeably affected by 50 ms one-way latency, because 2*50 ms becomes comparable to 150 ms – it affects the gaming experience. In addition to seeing the delay, we lose time – the enemy would shoot us first.
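
The arithmetic, as a tiny sketch (Python; 150 ms is the fast-player reaction time from above):

# Toy check: how much of the reaction-time budget does the network round trip eat?
REACTION_MS = 150.0  # fast player

def latency_penalty(one_way_ms: float) -> float:
    """Share of the reaction budget consumed by the network round trip."""
    return (2 * one_way_ms) / REACTION_MS

print(latency_penalty(5.0))   # ~0.07 -> barely noticeable
print(latency_penalty(50.0))  # ~0.67 -> the enemy effectively reacts first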

 

The next level (for non-interactive applications) is limited only by the memory that you can devote to the jitter buffer.

A movie would be fine even with a 5 s jitter buffer. Except for the zapping (channel change) time, but that is a different story.
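
The memory cost is easy to estimate (Python sketch; the bitrates are typical assumed values for streaming video, not from any spec):

def jitter_buffer_bytes(seconds: float, bitrate_bps: float) -> int:
    """Memory needed to hold `seconds` of media at `bitrate_bps`."""
    return int(seconds * bitrate_bps / 8)

print(jitter_buffer_bytes(5, 5_000_000) / 1e6)   # 1080p at ~5 Mbit/s  -> ~3.1 MB
print(jitter_buffer_bytes(5, 25_000_000) / 1e6)  # 4K at ~25 Mbit/s    -> ~15.6 MB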

 

Eduard

From: NANOG [mailto:nanog-bounces+vasilenko.eduard=huawei.com@nanog.org] On Behalf Of Dave Taht
Sent: Wednesday, September 20, 2023 3:12 AM
To: NANOG <nanog@nanog.org>
Subject: what is acceptible jitter for voip and videoconferencing?

 

Dear nanog-ers:

 

I go back many, many years as to baseline numbers for managing voip networks, including things like CISCO LLQ, diffserv, fqm prioritizing vlans, and running voip networks entirely separately... I worked on codecs, such as oslec, and early sip stacks, but that was over 20 years ago.

 

The thing is, I have been unable to find much research (as yet) as to why my number exists. Over here I am taking a poll as to what number is most correct (10ms, 30ms, 100ms, 200ms),

 

https://www.linkedin.com/feed/update/urn:li:ugcPost:7110029608753713152/

 

but I am even more interested in finding cites to support various viewpoints, including mine, and learning how slas are met to deliver it.


--