I still fail to see how "peak bits" or "bursted bits" are more expensive than "regular bits". A 100Mbit FE port costs whatever it costs, and does not fluctuate with usage. This is true of almost all of your links within the network - excluding those where you have negotiated usage-based billing. An OC3, point to point, costs as much as it costs regardless of its usage. Therefore, every bit that crosses this circuit has a cost.
Why not simply pass this cost on to the customer bit for bit?
let's say that you've sold 1Gb/s of connectivity to each of 50 customers. if you don't do anything fancy in your backbone provisioning, two things will happen: (1) you will need 50Gb/s of peering connectivity to EVERY PEER and on EVERY BACKBONE LINK (just in case all fifty customers send full blast to the same destination at the same time, as in a DDoS or flash crowd scenario); (2) when you try to pass the costs of this provisioned capacity to all of your customers (equally), you will have noncompetitive rates.

what actually happens is that you provision as much capacity (including whatever headroom your business plan calls for) as is actually needed, and then try to upgrade it fast enough so that your customers always get their bits through. it is not actually possible to get 50Gb/s worth of peering connectivity to most peers, and if you could, they'd be dropping most of your traffic one hop inbound on their network anyway.

charging for 95th percentile (highest of in or out, not additive) is a way to get those customers whose traffic puts the most strain on the provisioning plan to pay more.

but consider another model for a moment. the power companies hereabouts use the monthly peak to set the rate (per kWh) for the month. so if you never hit them very hard then every kWh you buy from them is cheaper. this model correctly passes the provisioning (especially headroom) costs to the commercial customers who put the most stress on the system, but also correctly rewards customers with lower costs for keeping their actual usage down. i'm not suggesting that this would apply perfectly to the IP transit situation, but it takes more of the relevant factors into account.
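
to make the two billing models concrete, here is a rough sketch in python. everything in it is an assumption for illustration: the function names, the nearest-rank percentile method (providers differ on exactly how they sample and rank), and the made-up tier table with rates in cents. it is not any particular provider's formula.

```python
def percentile_95(samples_mbps):
    """Nearest-rank 95th percentile of a list of rate samples.

    Assumption: nearest-rank method; the top 5% of samples are ignored.
    """
    if not samples_mbps:
        raise ValueError("no samples")
    ordered = sorted(samples_mbps)
    # ceil(0.95 * n) using integer arithmetic to avoid float rounding
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def billable_95th(in_mbps, out_mbps):
    """Billable rate: the higher of inbound or outbound, not the sum."""
    return max(percentile_95(in_mbps), percentile_95(out_mbps))


def peak_rated_bill(usage_units, monthly_peak, rate_tiers):
    """Power-company style: the month's peak picks the per-unit rate,
    and that rate is applied to ALL usage for the month.

    rate_tiers is a hypothetical list of (peak_ceiling, cents_per_unit),
    sorted by ceiling ascending.
    """
    for ceiling, cents_per_unit in rate_tiers:
        if monthly_peak <= ceiling:
            return usage_units * cents_per_unit
    raise ValueError("peak exceeds the highest tier")


if __name__ == "__main__":
    # 100 samples each way; inbound bursts in 5 of them, outbound in 6.
    inbound = [10] * 95 + [900] * 5
    outbound = [20] * 94 + [400] * 6
    print(billable_95th(inbound, outbound))

    tiers = [(100, 10), (1000, 15)]  # made-up numbers, rates in cents
    print(peak_rated_bill(1000, 80, tiers))
    print(peak_rated_bill(1000, 500, tiers))
```

note what the sample data shows: a burst that stays inside the top 5% of samples does not move the 95th percentile at all, while one that spills just past it sets the billable rate. under the peak-rated model the same total usage costs more when the monthly peak lands in a higher tier.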