Javier,

I have seen a few potential hang-ups, most of which affect the setup equally whether it sits within a single datacenter or spans several. The difference usually comes down to a greater chance of disconnects and “split-brain” scenarios when servers live in multiple datacenters. In that case sharding (a.k.a. cells, zones, etc.) is your friend, so that one site can operate autonomously while disconnected from the others.

Using DHCP servers in this way often shakes out bugs in the implementation, depending on which server you are using. Fortunately, I have seen several of those bugs get squashed in a couple of the open-source implementations after members of my team reported them to the maintainers, so you can be reasonably confident using one of the most common implementations (ISC, dnsmasq, a few others).

You also need to make sure your network routing infrastructure tends toward stability and stickiness, so that the same client talks to the same server throughout a flow. A failure in the middle of a flow will eventually lead to a failover, but anything in progress is unlikely to recover, given the limited error correction and sanity checking in the protocols mentioned. Plan for a number of retries on any failure (see the sketch below). Also test that all of your servers eventually reach consensus after you exercise failure scenarios, and have a plan to force synchronization if needed.
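As a rough sketch of the retry-on-failure point, something like the Python below is the shape I mean; the fetch_via_tftp callable is a hypothetical stand-in for whatever TFTP client your provisioning tooling actually uses.

    import time

    def fetch_with_retries(fetch_via_tftp, filename, attempts=5, base_delay=1.0):
        """Retry a TFTP fetch with exponential backoff.

        A mid-flow anycast failover usually kills the transfer in progress,
        so the client just starts over against whichever server the network
        now routes it to.
        """
        for attempt in range(1, attempts + 1):
            try:
                return fetch_via_tftp(filename)  # hypothetical client call
            except (OSError, TimeoutError):
                if attempt == attempts:
                    raise  # out of retries; surface the last error
                time.sleep(base_delay * 2 ** (attempt - 1))  # back off, then retry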

Also, with IPv6, if you are assigning multiple addresses to clients, make sure all of your servers will offer the same set of IPv6 addresses; a mismatch there can be a real headache to debug. Depending on your setup, you may not need a DHCPv6 server to issue addresses at all if your clients can use stateless autoconfiguration (which requires properly configured Router Advertisements on your routers).
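When you do need to chase down a mismatch, a cross-check along these lines is the easiest way I know to spot it. This sketch assumes each DHCPv6 server can export its per-client assignments as JSON mapping DUID to address list; that file layout is purely hypothetical, not any particular server’s native output.

    import json

    def load_assignments(path):
        """Load {client_duid: sorted [ipv6 addresses]} from a hypothetical JSON export."""
        with open(path) as fh:
            return {duid: sorted(addrs) for duid, addrs in json.load(fh).items()}

    def find_mismatches(paths):
        """Return clients whose offered IPv6 address sets differ between servers."""
        dumps = {path: load_assignments(path) for path in paths}
        clients = set().union(*(d.keys() for d in dumps.values()))
        mismatches = {}
        for duid in clients:
            seen = {path: d.get(duid) for path, d in dumps.items()}
            if len({tuple(v) if v else None for v in seen.values()}) > 1:
                mismatches[duid] = seen
        return mismatches

    if __name__ == "__main__":
        for duid, seen in find_mismatches(["server-a.json", "server-b.json"]).items():
            print(duid, seen)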

Best of luck; I suspect it will work “like magic.” It does work, but it flies in the face of past convention about how IP protocols are supposed to be used, and it requires control over areas that usually cross boundaries of responsibility (system admins vs. network admins vs. security admins).

-Dan Sneddon


On Feb 27, 2024, at 10:05 AM, Javier Gutierrez <GutierrezJ@westmancom.com> wrote:


Thanks to you all for your answers; they have helped me a lot already.

My design is very simplistic: I have two sets of firewalls that will advertise a /32 unicast to the network at each location, with a TFTP server behind each firewall.

I have no intention of making this part of the internet, as it will be used to serve internal customer devices that require TFTP.
For the setups where you are running anycast in a datacenter, are you running it inside a single datacenter only or across multiple datacenters? Other than having to replicate IPs and file services between datacenters, have you seen any other issues?

Kind regards,

 

Javier Gutierrez,

Network Architect – AS19016
https://www.peeringdb.com/net/4073

Westman Communications Group

1906 Park Ave.  Brandon, MB  R7B 0R9

204.720.1158
 gutierrezj@westmancom.com


This e-mail and any attachments contain confidential and privileged information. If you are not the intended recipient, please notify the sender immediately by return e-mail, delete this e-mail and destroy any copies. Any dissemination or use of this information by a person other than intended recipient is unauthorized and may be illegal.

 


From: NANOG <nanog-bounces+gutierrezj=westmancom.com@nanog.org> on behalf of Bill Woodcock <woody@pch.net>
Sent: Saturday, February 24, 2024 1:09 AM
To: Ask Bjørn Hansen <ask@develooper.com>
Cc: nanog@nanog.org <nanog@nanog.org>
Subject: Re: TFTP over anycast
 


The system Ask is describing is the traditional method of using anycast to geographically load-balance long-lived flows.  The first time I did that was with FTP servers in Berkeley and Santa Cruz, in 1989. 

A couple of years later I did a bigger system, also load-balancing FTP servers, for Oracle’s public-facing documentation stores, with servers in San Jose and Washington, DC. A couple of years further on, the World Wide Web was a thing, and everybody was doing it.
    
                -Bill


On Feb 24, 2024, at 7:38 AM, Ask Bjørn Hansen <ask@develooper.com> wrote:



On Feb 23, 2024, at 20:32, William Herrin <bill@herrin.us> wrote:

The relay server `dhcplb` could, maybe, help in that scenario
(dhcplb runs on the anycast IP, the “real” DHCP servers on
unicast IPs behind dhcplb).

Although they used the word "anycast", they're just load balancing.

The idea is to run the relays on an anycasted IP (so the load balancer / relay IP is anycasted).

[….] Relying on ECMP for anycasted DHCP would be a disaster
during any sort of failure. Add or remove a single route from an ECMP
set and the hashed path selection changes for most of the connections.

Consistent hashing (which I thought was widely supported now in ECMP implementations) and a bit of automation in how announcements are added can greatly mitigate this.
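To put a number on that, here is a small, self-contained Python sketch (not tied to any particular router’s ECMP implementation) comparing a consistent-hash ring against naive modulo hashing when one next-hop is withdrawn; only the flows that hashed to the withdrawn next-hop move on the ring, while modulo hashing reshuffles most of them.

    import hashlib

    def h(value):
        """Stable 64-bit hash of a string."""
        return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

    def ring(nexthops, replicas=100):
        """Build a consistent-hash ring: sorted (point, nexthop) pairs."""
        points = []
        for nh in nexthops:
            for i in range(replicas):
                points.append((h(f"{nh}#{i}"), nh))
        return sorted(points)

    def pick(ring_points, flow):
        """Map a flow key to the first ring point at or after its hash."""
        target = h(flow)
        for point, nh in ring_points:
            if point >= target:
                return nh
        return ring_points[0][1]  # wrap around to the start of the ring

    flows = [f"10.0.{i // 256}.{i % 256}:68" for i in range(2000)]
    before = ring(["nh-a", "nh-b", "nh-c", "nh-d"])
    after = ring(["nh-a", "nh-b", "nh-c"])  # one next-hop withdrawn

    moved_ring = sum(pick(before, f) != pick(after, f) for f in flows)
    moved_mod = sum(h(f) % 4 != h(f) % 3 for f in flows)  # naive modulo hashing
    print(f"consistent hashing moved {moved_ring}/{len(flows)} flows")
    print(f"modulo hashing moved {moved_mod}/{len(flows)} flows")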



Ask