On 13 November 2019 17:20:18 CET, Matt Corallo <netadmin@as397444.net> wrote:
This sounds like a bug on Cloudflare’s end (cause trying to do anycast TCP is... out of spec to say the least), not a bug in ECN/ECMP.
Even without anycast, an ECMP shouldn't hash on the ECN bits. Doing so will split the flow over multiple paths; avoiding that is the whole point of doing the flow-based hashing in the first place.
Anycast "only" turns a potential degradation of TCP performance into a hard failure... :)
-Toke
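To make the point about hashing concrete, here is a minimal sketch of flow-based next-hop selection, with and without the ECN field in the hash key. Everything in it is an assumption for illustration: the function names `ecmp_next_hop` and `paths_used`, the documentation-range addresses, and the use of SHA-256 as a stand-in for whatever hash a real router implements in hardware.

```python
import hashlib

def ecmp_next_hop(flow, ecn, n_paths, include_ecn):
    """Return a next-hop index for one packet.

    flow is a (src, dst, sport, dport, proto) tuple; ecn is the two-bit ECN field.
    """
    key = list(flow)
    if include_ecn:
        key.append(ecn)  # hashing on ECN: the behaviour criticised above
    # SHA-256 is only a stand-in for a router's hardware hash (assumption).
    digest = hashlib.sha256(repr(key).encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_paths

def paths_used(flow, n_paths, include_ecn):
    """Set of next hops a single flow touches across all four ECN codepoints."""
    return {ecmp_next_hop(flow, ecn, n_paths, include_ecn) for ecn in range(4)}

# 100 hypothetical flows that differ only in source port.
flows = [("192.0.2.1", "198.51.100.7", 40000 + i, 443, 6) for i in range(100)]
split_when_masked = sum(len(paths_used(f, 4, False)) > 1 for f in flows)
split_when_hashed = sum(len(paths_used(f, 4, True)) > 1 for f in flows)
print("flows split with ECN ignored:", split_when_masked)
print("flows split with ECN hashed: ", split_when_hashed)
```

With the ECN field left out of the key, every packet of a flow maps to the same next hop regardless of its ECN marking. With it included, a flow whose packets carry different ECN values can be spread over several paths, which is the splitting described above.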
Not ideal, sure, but if it’s only for the SYN (as you seem to indicate), splitting the flow shouldn’t have material performance degradation?
On Nov 13, 2019, at 11:51, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
On 13 November 2019 17:20:18 CET, Matt Corallo <netadmin@as397444.net> wrote: This sounds like a bug on Cloudflare’s end (cause trying to do anycast TCP is... out of spec to say the least), not a bug in ECN/ECMP.
Even without anycast, an ECMP shouldn't hash on the ECN bits. Doing so will split the flow over multiple paths; avoiding that is the whole point of doing the flow-based hashing in the first place.
Anycast "only" turns a potential degradation of TCP performance into a hard failure... :)
-Toke
Not to condone what cloudflare is doing, but...
An ECN connection will have different bits on various packets for the duration of the connection -- pure ACKs (ACKs not piggybacking on data) will have the ECN bits as 00b, while all other packets will have either 01b, 10b (when no congestion was experienced) or 11b (when congestion was experienced). So using the ECN bits as part of the hash would affect performance throughout the life of the connection.
On Wed, Nov 13, 2019 at 9:00 AM Matt Corallo <nanog@as397444.net> wrote:
Not ideal, sure, but if it’s only for the SYN (as you seem to indicate), splitting the flow shouldn’t have material performance degradation?
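For readers who want the two-bit values from the message about pure ACKs mapped to their RFC 3168 names, a small sketch follows. The helper names `ecn_codepoint` and `dscp` are made up for this example; the codepoint names themselves come from RFC 3168.

```python
# ECN field values (low two bits of the IPv4 ToS / IPv6 Traffic Class byte)
# and their RFC 3168 names.
ECN_CODEPOINTS = {
    0b00: "Not-ECT",  # not ECN-capable, e.g. the pure ACKs mentioned above
    0b01: "ECT(1)",   # ECN-capable transport, codepoint 1
    0b10: "ECT(0)",   # ECN-capable transport, codepoint 0
    0b11: "CE",       # congestion experienced
}

def ecn_codepoint(tos_byte):
    """Name the ECN codepoint carried in a ToS / Traffic Class byte."""
    return ECN_CODEPOINTS[tos_byte & 0b11]

def dscp(tos_byte):
    """Return the DSCP value, i.e. the upper six bits of the same byte."""
    return tos_byte >> 2

for tos in (0x00, 0x01, 0x02, 0x03):
    print(f"{tos & 0b11:02b}b -> {ecn_codepoint(tos)}")
```

A hash that takes the whole byte as input sees up to four different values over the life of one ECN connection, whereas one that masks off the low two bits (or ignores the byte entirely) sees a constant.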
Like it or not (and I really don’t), the majority of modern CDNs are using TCP over Anycast.
It’s ugly and it’s prone to problems like this. It’s nice to see a customer with know-how actually publicizing and digging into the problem.
Until now, I believe an unknown number of customers have been suffering in silence or relegated to the ISP’s “We can’t reproduce your problem” bin without resolution.
I’ve had lots of discussions on the subject and the usual end result is “It’s too hard to measure or quantify and there’s no visible contingent of impacted users”.
Now we at least have one visible impacted user.
Owen
On Nov 13, 2019, at 09:19 , Anoop Ghanwani <anoop@alumni.duke.edu> wrote:
Not to condone what cloudflare is doing, but...
An ECN connection will have different bits on various packets for the duration of the connection -- pure ACKs (ACKs not piggybacking on data) will have the ECN bits as 00b, while all other packets will have either 01b, 10b (when no congestion was experienced) or 11b (when congestion was experienced). So using the ECN bits as part of the hash would affect performance throughout the life of the connection.
Owen DeLong <owen@delong.com> writes:
Like it or not (and I really don’t), the majority of modern CDNs are using TCP over Anycast.
It’s ugly and it’s prone to problems like this. It’s nice to see a customer with know-how actually publicizing and digging into the problem.
Thanks. I do plan to write this whole story up as a blog post, BTW. Apart from just being a nice "battle story" I also think it's important to get more visibility into these kinds of issues. I've mostly been interested in issues related to ECN in general, but its interaction with anycast is certainly... interesting :)
Until now, I believe an unknown number of customers have been suffering in silence or relegated to the ISP’s “We can’t reproduce your problem” bin without resolution.
I’ve had lots of discussions on the subject and the usual end result is “It’s too hard to measure or quantify and there’s no visible contingent of impacted users”.
Now we at least have one visible impacted user.
As I said, happy to be an exponent if it can help others resolve these kinds of problems.
Incidentally, in case you're not aware, there are currently two competing schemes being discussed at the IETF to re-purpose the ECT(1) code point in the IP header. One proposal[0] is to use it as an additional high-fidelity congestion indicator, while the other[1] is to use it as an identifier for a new type of traffic that should get special treatment (which almost, but not quite, amounts to priority queueing).
So if either proposal gains traction, expect more ECN-marked traffic coming to a network near you in the maybe-not-so-distant future; with all the interesting issues that can bring with it.
If someone feels like introducing some operational considerations into the IETF discussions, I do believe both drafts will be discussed at the tsvwg working group meetings at the Singapore IETF next week.
-Toke
[0] https://datatracker.ietf.org/doc/draft-morton-tsvwg-sce/
[1] https://datatracker.ietf.org/doc/draft-ietf-tsvwg-ecn-l4s-id/
It does when the split flows land in different anycast origin POPs. Making a few assumptions from the traceroutes, the ECMP paths are sending some packets to Hamburg and some to Denmark. Each POP may be getting parts of what should be a single TCP stream, and I doubt they have anything to cope with that (another assumption).
On Wed, 13 Nov 2019, Matt Corallo wrote:
Not ideal, sure, but if it’s only for the SYN (as you seem to indicate), splitting the flow shouldn’t have material performance degradation?
----------------------------------------------------------------------
 Jon Lewis, MCP :)           |  I route
 StackPath, Sr. Neteng       |  therefore you are
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________
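The failure mode described in the message above (parts of one TCP stream arriving at two anycast sites) can be sketched as a toy model. This is an illustration under stated assumptions, not a description of any CDN's actual setup: the `Pop` class is hypothetical, which site receives the SYN is chosen arbitrarily, and each site is assumed to keep connection state purely locally, as the message itself assumes.

```python
class Pop:
    """A hypothetical anycast site that only knows connections it accepted itself."""

    def __init__(self, name):
        self.name = name
        self.connections = set()

    def receive(self, flow, is_syn):
        """Return the kind of segment this site would answer with."""
        if is_syn:
            self.connections.add(flow)
            return "SYN-ACK"
        if flow in self.connections:
            return "ACK"
        return "RST"  # standard TCP answer to a segment for an unknown connection

hamburg, denmark = Pop("Hamburg"), Pop("Denmark")
flow = ("192.0.2.1", 40000, "198.51.100.7", 443)  # documentation-range addresses

# SYN reaches one site; a later segment with different ECN bits reaches the other.
split_replies = [hamburg.receive(flow, is_syn=True),
                 denmark.receive(flow, is_syn=False)]
# The same later segment, had it followed the SYN's path.
unsplit_reply = hamburg.receive(flow, is_syn=False)
print(split_replies, unsplit_reply)
```

In this model the second site has no record of the handshake and resets the connection, which matches the "hard failure" characterization earlier in the thread. A deployment that shares or synchronizes connection state between sites would behave differently.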
participants (5)
- Anoop Ghanwani
- Jon Lewis
- Matt Corallo
- Owen DeLong
- Toke Høiland-Jørgensen