Hello Joe - On 9 May 2006, at 01:23, Joe Shen wrote:
Can you indicate in more detail what the problems were with the L4 switch?
We separate our RADIUS servers into two farms, each with an L4 switch in front. As we understand it, the RADIUS authentication and accounting information for a PPPoE session should be processed by the same RADIUS server. So, although the L4 switch provides a single IP address for BRAS configuration, each BRAS is mapped to a specific real server IP in the L4 switch. This leads to the following problems:
Normal RADIUS does not require authentication and accounting for a single session to go to the same RADIUS server.
1) Load is balanced by human estimation rather than automatically; some servers carry twice the load of others.
You should use a loadbalancer that can distribute RADIUS requests on a per-request basis according to round-trip times, which will be a reasonable indication of server load, i.e. the fastest round-trip time will be from the least-loaded server.
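A minimal sketch of the round-trip-time idea, not the behaviour of any particular loadbalancer product. The class name, the smoothing factor `alpha` and the server labels are assumptions made for illustration; each request simply goes to the server with the lowest smoothed round-trip time seen so far.

```python
class RttBalancer:
    """Pick the server with the lowest smoothed round-trip time (hypothetical helper)."""

    def __init__(self, servers, alpha=0.2):
        self.alpha = alpha  # assumed smoothing factor for the moving average
        # None means "no round-trip sample recorded yet"
        self.srtt = {server: None for server in servers}

    def record(self, server, rtt_seconds):
        """Fold a measured request/reply round-trip time into the average."""
        old = self.srtt[server]
        if old is None:
            self.srtt[server] = rtt_seconds
        else:
            self.srtt[server] = (1 - self.alpha) * old + self.alpha * rtt_seconds

    def pick(self):
        """Return the server that should receive the next request."""
        # Try unmeasured servers first so that every server gets a sample.
        for server, value in self.srtt.items():
            if value is None:
                return server
        return min(self.srtt, key=self.srtt.get)


balancer = RttBalancer(["radius-1", "radius-2"])
balancer.record("radius-1", 0.050)
balancer.record("radius-2", 0.010)
next_server = balancer.pick()  # radius-2 has the lower round-trip time
```

A server that slows down under load sees its average rise and so attracts fewer new requests, without anyone having to estimate the split by hand.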
2) The L4 switch becomes a bottleneck for service availability. In past years, the L4 switch has caused several service failures. Just last Friday, the L4 switch stopped responding to any network packets while its Ethernet interface appeared to be OK.
I suggest you find a better loadbalancer. Contact me off list if you need suggestions.
3) As the L4 switch is the only entrance to a server farm, a DoS attack or a software bug will surely degrade the security level. A farm using ECMP, by contrast, relies on server groups to resist DoS attacks.
You should design your system with two loadbalancers, and configure your NAS equipment to use one as primary and the other as secondary. You should configure half of your NAS equipment to use loadbalancer A as primary, and the other half of your NAS equipment to use loadbalancer B as primary (and the converse for secondary).
4) Maintenance is somewhat costly. Any maintenance, whether on a RADIUS server or on the L4 switch, needs a scheduled time window.
A design as above will have no single point of failure.
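The primary/secondary split described above can be sketched as follows. This is only an illustration: the function name, the loadbalancer labels `lb-a` and `lb-b` and the NAS names are hypothetical, and real NAS equipment is configured through its own interface rather than from a script like this.

```python
def assign_balancers(nas_names, lb_a="lb-a", lb_b="lb-b"):
    """Alternate primary/secondary loadbalancers across the NAS equipment."""
    plan = {}
    for index, nas in enumerate(sorted(nas_names)):
        if index % 2 == 0:
            primary, secondary = lb_a, lb_b
        else:
            primary, secondary = lb_b, lb_a
        plan[nas] = {"primary": primary, "secondary": secondary}
    return plan


plan = assign_balancers(["bras-1", "bras-2", "bras-3", "bras-4"])
# bras-1 and bras-3 use lb-a as primary; bras-2 and bras-4 use lb-b.
```

With this layout, taking either loadbalancer out of service leaves every NAS with a working path through the other one.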
5) Service protection is hard (the 'cascade' problem you mentioned). As there are two server farms, if one farm fails it takes ten minutes or more to migrate its RADIUS traffic to the other farm. This is unacceptable.
If you set your RADIUS timeouts and retries on the NAS equipment sensibly, depending on what end-user devices are being used (PC modems, DSL modems, GPRS WAP phones, mail servers, web servers ...), any outage should have almost imperceptible impact.
So, we are looking for a more scalable, reliable, secure and automatic multi-farm RADIUS solution.
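As a rough worked example of why the timeout and retry settings matter: assume the NAS sends one initial request plus a fixed number of retransmissions to its primary, waits a fixed timeout for each, and only then tries the secondary. That retry model and the function name are assumptions for illustration; actual NAS behaviour varies by vendor.

```python
def failover_delay(timeout_seconds, retries):
    """Seconds spent on a dead primary before the secondary is tried.

    Assumes one initial attempt plus `retries` retransmissions,
    each waiting the same fixed timeout (hypothetical model).
    """
    return timeout_seconds * (1 + retries)


delay = failover_delay(timeout_seconds=3, retries=2)  # 3 s * 3 attempts = 9 s
```

Under this model, a 3 second timeout with 2 retries delays a login by 9 seconds when the primary is down, which is far from the ten minutes described in point 5.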
hope that helps
regards
Hugh
Joe
If the loadbalancing is done by source/destination IP address pairs, then you can have problems when a target goes down, as all of the source/destination IP address pairs will get switched to another target which then gets into difficulty and you end up with a cascading failure. It is generally preferable to have the loadbalancing done on a weighted per-packet basis, ideally distributed according to round-trip times.
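A small sketch of weighted per-packet distribution, assuming each target's weight is simply the inverse of its measured round-trip time. The function name, the weighting rule and the target labels are illustrative assumptions, not a description of any specific product.

```python
import random


def pick_weighted(rtts, rng=random):
    """Choose a target at random, weighted by the inverse of its round-trip time.

    `rtts` maps target name -> recent round-trip time in seconds (must be > 0).
    """
    targets = list(rtts)
    weights = [1.0 / rtts[target] for target in targets]
    return rng.choices(targets, weights=weights, k=1)[0]


rtts = {"radius-1": 0.010, "radius-2": 0.040, "radius-3": 0.040}
target = pick_weighted(rtts)

# If radius-1 goes down, drop it from the table: its share is then spread
# over the remaining targets packet by packet, instead of whole
# source/destination pairs moving to one neighbour.
del rtts["radius-1"]
target_after_failure = pick_weighted(rtts)
```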
Also note that you can only do per-packet loadbalancing with simple RADIUS; things like EAP that require multiple exchanges of RADIUS requests typically require state to be maintained in the single RADIUS server that is processing the entire EAP sequence.
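One way to picture the EAP constraint is a router that pins each EAP conversation to one server and keeps per-packet distribution for everything else. This is a hedged sketch: the class, the `conversation_key` (for example a NAS identifier combined with the calling station) and the `finish` hook are hypothetical, and a real proxy would decide these from the RADIUS attributes it actually sees.

```python
import itertools


class StickyRouter:
    """Pin multi-exchange (EAP) conversations to one server (hypothetical helper)."""

    def __init__(self, pick_server):
        self.pick_server = pick_server  # any per-packet selection function
        self.pinned = {}  # conversation key -> server handling that conversation

    def route(self, conversation_key, is_eap):
        if not is_eap:
            # Simple RADIUS: every request can go to a different server.
            return self.pick_server()
        if conversation_key not in self.pinned:
            self.pinned[conversation_key] = self.pick_server()
        return self.pinned[conversation_key]

    def finish(self, conversation_key):
        """Forget a conversation once it has been accepted or rejected."""
        self.pinned.pop(conversation_key, None)


servers = itertools.cycle(["radius-1", "radius-2"])
router = StickyRouter(lambda: next(servers))
first = router.route("nas1/station-a", is_eap=True)
second = router.route("nas1/station-a", is_eap=True)  # same server as `first`
```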
regards
Hugh
On 8 May 2006, at 14:07, Joe Shen wrote:
Hi,
We have a RADIUS server farm. There is an L4 switch installed behind all the servers. Incoming AAA packets are switched by the L4 switch to different servers.
Recently we ran into a couple of problems with the L4 switch which degraded our service a lot. Would it be possible to implement an IPv4 anycast architecture for the RADIUS server farm? Would that cause any problems with the AAA procedure?
Any advice will be highly appreciated
Joe
__________________________________ Do you Yahoo!? Yahoo! Movies - Search movie info and celeb profiles and photos.
NB:
Have you read the reference manual ("doc/ref.html")? Have you searched the mailing list archive (www.open.com.au/archives/radiator)? Have you had a quick look on Google (www.google.com)? Have you included a copy of your configuration file (no secrets), together with a trace 4 debug showing what is happening?
-- Radiator: the most portable, flexible and configurable RADIUS server anywhere. Available on *NIX, *BSD, Windows, MacOS X. - Nets: internetwork inventory and management - graphical, extensible, flexible with hardware, software, platform and database independence. - CATool: Private Certificate Authority for Unix and Unix-like systems.