Hello
The ARP timeout on a router should be lower than the MAC address aging time on the switches, but the usual defaults are the other way around, which is extremely stupid. For those who do not know why, let me give a simple example:
Router R1 is connected to switch SW1, which in turn connects to server SRV: R1 <-> SW1 <-> SRV
Router R2 is connected to switch SW2, which also connects to server SRV: R2 <-> SW2 <-> SRV
The server is using R1 as default gateway. Traffic is arriving from the internet through R2 towards the server. The server will however send replies back through the default gateway at R1. This is a usual case with redundant routers - only one will be used as a default gateway but traffic may come from both.
Initially all will be good. But SW2 is only seeing unidirectional traffic from R2. No traffic goes from SRV to R2 and thus, after the MAC aging time has passed, SW2 will expire its MAC table entry for SRV. This has the unfortunate result that SW2 will start flooding traffic destined for SRV out through all ports.
Then after more time has passed, R2 will renew its ARP entry by sending out an ARP request for SRV. The server will send back an ARP reply to R2. This packet from SRV to R2 will pass through SW2 and thus have the effect of renewing the MAC entry at SW2 too. The flooding stops and all is well again, until the MAC entry expires and the story repeats.
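To make the cycle concrete, here is a rough Python sketch of the timeline as seen by SW2. The function name and parameters are made up for illustration. It assumes both the ARP entry and the MAC entry are fresh at t=0, that R2 refreshes ARP exactly when its timeout is reached, and that the ARP reply is the only frame SW2 ever sees from SRV. Real gear may behave a bit differently.

```python
def flooding_intervals(mac_timeout, arp_timeout, duration):
    """Return (start, end) pairs, in seconds, during which SW2 floods
    traffic destined for SRV. Hypothetical model, one-second steps."""
    intervals = []
    last_seen = 0        # last time SW2 saw a frame sourced from SRV
    flood_start = None   # set while SW2 has no MAC entry for SRV
    next_arp = arp_timeout
    for t in range(duration + 1):
        if t == next_arp:
            # R2 refreshes ARP; the reply from SRV passes through SW2
            # and SW2 relearns the MAC address, so flooding stops.
            if flood_start is not None:
                intervals.append((flood_start, t))
                flood_start = None
            last_seen = t
            next_arp = t + arp_timeout
        elif flood_start is None and t - last_seen >= mac_timeout:
            # MAC entry aged out: frames to SRV are now unknown unicast.
            flood_start = t
    if flood_start is not None:
        intervals.append((flood_start, duration))
    return intervals


print(flooding_intervals(mac_timeout=300, arp_timeout=1200, duration=2400))
```

With the MAC timeout set higher than the ARP timeout, the same function returns an empty list, because the ARP refresh always arrives before the MAC entry can expire.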
If the MAC timeout is 5 minutes and the ARP timeout is 20 minutes, which is very usual, you will have flooding for 15 minutes out of every 20-minute interval! Stupid!
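The arithmetic can be written down as a tiny Python helper. This is only a sketch under the same simplification as the example above (the ARP reply is the only traffic from SRV that SW2 sees), and the function name is made up:

```python
def flooding_share(mac_timeout, arp_timeout):
    """Return (flooded seconds per ARP cycle, fraction of the cycle).

    Flooding lasts from MAC expiry until the next ARP refresh, so it is
    zero whenever the MAC timeout is at least as long as the ARP timeout.
    """
    flooded = max(0, arp_timeout - mac_timeout)
    return flooded, flooded / arp_timeout


seconds, share = flooding_share(mac_timeout=5 * 60, arp_timeout=20 * 60)
print(seconds / 60, share)  # 15.0 minutes flooded, 0.75 of the time
```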
Why have vendors not fixed their defaults for this case?
Regards,
Baldur