On 1/Aug/20 02:17, Sabri Berisha wrote:
I'm not sure if you read their entire Mea Culpa, but they did indicate that the root cause of this issue was the provisioning of a legacy filter that they are no longer using. So effectively, that makes it a human error.
We're getting to a point where a single error is no longer causing outages, something very similar to my favorite analogy: aviation. Pretty much every major air disaster was caused by a combination of factors. Pretty much every major outage these days is caused by a combination of factors.
The manual provisioning of an inadequate filter, combined with an automation error on the side of a customer (which by itself was probably caused by a combination of factors), caused this issue.
We learn from every outage. And instead of radio silence, they fessed up and fixed the issue. Have a look at the ASRS program :)
What I meant by "TOTALLY avoidable" is that "this particular plane crash" has happened in the exact same way, for the exact same reasons, over and over again. Aviation learns from its mistakes, so they don't generally recur in the exact same way for the exact same reasons.
Telia and others have known about these issues from seeing them happen to other operators. When we see these issues, we go back and look at our own networks to implement the fixes that solved the problem the last time it happened. That's the idea.
The difference between us and aviation is that fundamental flaws or mistakes that impact safety are required to be fixed and checked if you want to keep operating in the industry. We don't have that, so...
Mark.