We use the BRIX active measurement system (BRIX is now owned by EXFO), which gathers round-trip time, packet loss, and jitter randomly every minute, 24x7x365, for our major backbone links to calculate SLAs.

"Network Availability" can be measured empirically using BRIX-calculated values of packet loss, and expressed as a number of 9's, which BRIX will also calculate over any time period for which BRIX historical data is being kept.

BRIX historical data is kept in an embedded Oracle database. BRIX usually runs on a Solaris SMP server.

-----Original Message-----
From: Bill Woodcock [mailto:woody@pch.net]
Sent: Tuesday, July 28, 2009 9:34 PM
To: nanog
Subject: Ahoy, SLA boffins!

So I've embarked on the no-doubt-futile task of trying to interpret SLAs as empirically-verifiable technical specifications, rather than as marketing blather. And there's something that I'm finding particularly puzzling:

In most SLAs, there seem to be two separate guarantees proffered: one concerning "network availability" and one concerning "packet loss."

Now, if I were to put my engineer hat on, and try to _imagine_ what the difference might be, I might imagine "network availability" to have something to do with layer-2 link status being presented as "up," while packet loss would be the percentage of packets dropped.

But when I actually read SLAs, "network availability" is generally defined as the portion of the month that the path from the customer's local loop to the transit or peering routers was "available" to transmit packets. Packet loss, on the other hand, is generally defined as the portion of packets which are lost while crossing that exact same piece of network.

Now, what am I missing here? Is this one of those Heisenberg things, where "network availability" is the time the network _could have_ delivered a packet _when you weren't actually doing so_, while "packet loss" is the time the network _couldn't_ deliver a packet when you _were_ actually doing so?
Is "network availability" inherently unmeasurable on a network that's less than 100% utilized? Am I over-thinking this?

Seriously, though, I know there are people who don't consider SLAs to be fantasy-fiction, and some of them must not be innumerate, and some subset of those must be on NANOG, and the intersection set might be equal to or greater than one, right? Can anybody explain this to me in a way I can translate into code, while still taking myself seriously?

-Bill
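
One way the two guarantees could be separated in code, as a minimal sketch only: treat each per-minute active probe result as a loss fraction between 0 and 1, count a minute as "unavailable" when its loss reaches some threshold, and report packet loss as the mean over all minutes. The function names, the default threshold of 100% loss, and the sample data are assumptions made for illustration; this is not how BRIX or any particular SLA defines these terms.

```python
import math


def availability(loss_samples, outage_threshold=1.0):
    """Fraction of sample intervals whose loss is below outage_threshold.

    loss_samples: per-interval loss fractions in [0, 1] (assumed input shape).
    outage_threshold: loss at or above this marks the interval as "down"
    (an assumed definition; a real SLA would have to state its own).
    """
    if not loss_samples:
        raise ValueError("need at least one sample")
    up = sum(1 for loss in loss_samples if loss < outage_threshold)
    return up / len(loss_samples)


def nines(avail):
    """Express availability as a count of nines, e.g. 0.999 is about 3."""
    if avail >= 1.0:
        return math.inf
    return -math.log10(1.0 - avail)


def mean_loss(loss_samples):
    """Average loss fraction across all sample intervals."""
    if not loss_samples:
        raise ValueError("need at least one sample")
    return sum(loss_samples) / len(loss_samples)


if __name__ == "__main__":
    # Hypothetical day of per-minute samples: mostly clean, a few lossy
    # minutes, and two minutes of total loss.
    day = [0.0] * 1430 + [0.05] * 8 + [1.0] * 2
    a = availability(day)
    print(f"availability: {a:.5f} ({nines(a):.2f} nines)")
    print(f"mean packet loss: {mean_loss(day):.5f}")
```

Under these assumptions the two numbers answer different questions: a path with a steady 2% loss is 100% "available" but fails a loss guarantee, while a path that is clean except for a short total outage can meet a loss guarantee averaged over the month and still lose nines of availability.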