And a very few population centers such as New York, London, Tokyo, and Cheyenne Mountain should probably have more than 5 paths.
I disagree. They may need that many spare paths beyond what is required to provide their services, but in my experience pretty much all infrastructure services are overloaded (or at least heavily loaded), and even if they have N+M "redundancy", eliminating just one or two of those "M" links will be enough to overwhelm the others and take everything down in a complete cascade failure.
Now you are talking about path capacity as well as the separacy issue. Let's consider the situation where you are designing a network to serve a city of over 1 million population. According to the rule of thumb, 5 paths are enough. What does this mean in practice? First, it probably means that you need to have 5 PoPs fully meshed within the city, or perhaps a larger number of PoPs in a partial mesh, so that you can get traffic from any point in the city to one of your 5 exits. However, the rule of thumb doesn't talk about this at all.

Let's further simplify and consider the city to be a node, with 5 paths radiating out from it. One path could fail and the traffic from that path would have to be distributed to the other 4. As you have pointed out, this doesn't work unless every path is running at no more than four fifths of its capacity. Presumably, since IP circuits cannot run at 100% capacity, the 5 circuits are at somewhat less than 80%, but for now, let's just assume that they are running at no more than 80%. This is an N+1 scenario where one link can fail and service continues. But if two links fail, then 3 links must carry all the traffic, therefore all links must be at no more than 60% of capacity. This would be an N+2 scenario.

In the rule of thumb, I didn't really consider what failover scenario was right because it isn't a rule of thumb if it goes into too much detail. But the numbers 1, 2, 3, 5 were chosen because I think that the sites which close up every evening can live with N+0, and the others can probably live with N+1, except when you get to population centers above 1 million, where connectivity to the rest of the world is important enough to have N+2 overall.

In the real world, at the city level, there is more than one company providing the connectivity, so it becomes tricky to analyze the true connectivity. Remember, paths are not equal to circuits. Therefore, the aggregate of all circuits from all companies which connect Chicago to St. Louis should count as one path.

I'd love to see someone do a serious academic analysis along these lines to see what kind of "rules of thumb" have EMERGED from the past decades of network building and consolidation. It would be interesting if this type of research compared the network's topology to the topology of villages, market towns and cities, which is remarkably uniform across continents and civilizations.

--Michael Dillon
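
For anyone who wants to play with the N+1 / N+2 arithmetic above, here is a rough Python sketch. It assumes all paths have equal capacity and that traffic from failed paths spreads evenly over the survivors, which real networks rarely manage. The function names max_safe_utilization and load_after_failures are made up for this illustration and are not from any library.

```python
def max_safe_utilization(paths: int, failures: int) -> float:
    """Highest per-path utilization (as a fraction of capacity) at which
    the surviving paths can still carry all traffic after `failures`
    paths are lost. Assumes equal-capacity paths and even redistribution."""
    if not 0 <= failures < paths:
        raise ValueError("failures must be >= 0 and less than paths")
    return (paths - failures) / paths


def load_after_failures(paths: int, failures: int, utilization: float) -> float:
    """Per-path load on the survivors, as a fraction of capacity, if every
    path was running at `utilization` before the failures.
    A result above 1.0 means the survivors are overloaded."""
    if not 0 <= failures < paths:
        raise ValueError("failures must be >= 0 and less than paths")
    return utilization * paths / (paths - failures)


if __name__ == "__main__":
    paths = 5  # the rule-of-thumb figure for a city of over 1 million
    for failures in (0, 1, 2):
        ceiling = max_safe_utilization(paths, failures)
        print(f"N+{failures}: each of {paths} paths must stay at or below "
              f"{ceiling:.0%} of capacity")
```

With 5 paths this gives the 80% ceiling for N+1 and the 60% ceiling for N+2 discussed above. Feeding load_after_failures a pre-failure utilization of 0.8 and two failures yields a value above 1.0, which is the cascade case raised in the earlier reply.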