Owen DeLong <owen@delong.com> writes:
Like it or not (and I really don’t), the majority of modern CDNs are using TCP over Anycast.
It’s ugly and it’s prone to problems like this. It’s nice to see a customer with know-how actually publicizing and digging into the problem.
Thanks. I do plan to write this whole story up as a blog post, BTW. Apart from just being a nice "battle story" I also think it's important to get more visibility into these kinds of issues. I've mostly been interested in issues related to ECN in general, but its interaction with anycast is certainly... interesting :)
Until now, I believe an unknown number of customers have been suffering in silence or relegated to the ISP’s “We can’t reproduce your problem” bin without resolution.
I’ve had lots of discussions on the subject and the usual end result is “It’s too hard to measure or quantify and there’s no visible contingent of impacted users”.
Now we at least have one visible impacted user.
As I said, happy to be an exponent if it can help others resolve these kinds of problems.

Incidentally, in case you're not aware, there are currently two competing schemes being discussed at the IETF to re-purpose the ECT(1) code point in the IP header. One proposal[0] is to use it as an additional high-fidelity congestion indicator, while the other[1] is to use it as an identifier for a new type of traffic that should get special treatment (which almost, but not quite, amounts to priority queueing). So if either proposal gains traction, expect more ECN-marked traffic coming to a network near you in the maybe-not-so-distant future; with all the interesting issues that can bring with it.

If someone feels like introducing some operational considerations into the IETF discussions, I do believe both drafts will be discussed at the tsvwg working group meetings at the Singapore IETF next week.

-Toke

[0] https://datatracker.ietf.org/doc/draft-morton-tsvwg-sce/
[1] https://datatracker.ietf.org/doc/draft-ietf-tsvwg-ecn-l4s-id/
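
For anyone not used to staring at the ECN bits: the code points mentioned above live in the low two bits of the IPv4 TOS / IPv6 Traffic Class byte, with the upper six bits carrying the DSCP. A minimal sketch of decoding them, using the values from RFC 3168 (the function names are made up for illustration and not taken from any tool):

```python
# ECN code points as defined in RFC 3168. They occupy the two
# least-significant bits of the IPv4 TOS / IPv6 Traffic Class byte.
ECN_NAMES = {
    0b00: "Not-ECT",  # sender does not support ECN
    0b01: "ECT(1)",   # the code point both drafts want to re-purpose
    0b10: "ECT(0)",   # the usual "ECN-capable" marking
    0b11: "CE",       # congestion experienced, set by a router/AQM
}


def ecn_codepoint(tos: int) -> str:
    """Return the ECN code point name for a TOS / Traffic Class byte."""
    return ECN_NAMES[tos & 0b11]


def dscp(tos: int) -> int:
    """Return the DSCP value (the upper six bits of the same byte)."""
    return (tos & 0xFF) >> 2


if __name__ == "__main__":
    # Illustrative byte values only, not taken from a real capture.
    for tos in (0x00, 0x01, 0x02, 0x03, 0xB8):
        print(f"tos=0x{tos:02x} dscp={dscp(tos)} ecn={ecn_codepoint(tos)}")
```

Something like this is handy when eyeballing raw TOS values in a packet capture to see which flows are actually ECN-marked.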