RE: wifi for 600, alex

15 Feb 2007

...
-----Original Message-----
From: owner-nanog@merit.edu [mailto:owner-nanog@merit.edu] On 
Behalf Of Suresh Ramasubramanian
Sent: Wednesday, February 14, 2007 6:25 PM
To: Marshall Eubanks
Cc: Carl Karsten; NANOG
Subject: Re: wifi for 600, alex
[snip]
...
2. Plan the network, number of APs based on session capacity, 
signal coverage etc so that you don't have several dozen 
people associating to the same AP, at the same time, when 
they could easily find other APs ... I guess a laptop will 
latch onto the AP that has the strongest signal first.

Speaking from experiences at Nanog and abroad, this has proven difficult
(more like impossible) to achieve to the degree of success engineers
would expect. In an ideal world, client hardware makers would all
implement sane, rational, and scalable 'scanning' processes in their
products. However, we find this to be one market where the hardware is
far from ideal and there's little market pressure to change or improve
it. On many occasions I've detected client hardware which simply picks
the first 'good' response from an AP on a particular SSID to associate
with, and doesn't consider anything it detects afterward! If the first
"Good" response came from an AP on channel 4, it went there! 

Also incredibly annoying and troubling are cards that implement 'near
continuous' scanning once or say twice per second or cards that are
programmed to do so whenever 'signal quality' falls below a static
threshold. A mobile station would likely see very clean hand-over
between AP's and I'm sure the resulting user experience would be great.
However, this behavior is horrible when there are 200 stations all
within radio distance of each other...  you've just created a storm of
~400 frames/sec on _every_ channel, 1 on up! Remember, the scan
sequence is fast - dwell time on each channel listening for a
probe_response is on the order of a few milliseconds. If a card emits 22
frames per second across 11 channels, that 2 frames/sec per channel,
times 200 stations, becomes a deafening roar of worthless frames. It's
obvious that the CA
part of CSMA/CA doesn't scale to 200 stations when we consider these
sorts of issues.
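
To put numbers on the scan storm described above, here is a minimal Python sketch of the arithmetic. The station count, per-card frame rate, and channel count come from the text; the function names and the 500 microsecond per-frame airtime are my own illustrative assumptions (real airtime depends on frame size, preamble, and data rate), so treat the airtime figure as a rough feel for the scale, not a measurement.

```python
def probe_load_per_channel(stations, frames_per_sec_per_station, channels):
    """Aggregate scan frames/sec that land on each channel."""
    per_station_per_channel = frames_per_sec_per_station / channels
    return stations * per_station_per_channel


def airtime_fraction(frames_per_sec, airtime_us_per_frame):
    """Fraction of each second spent carrying those frames.

    airtime_us_per_frame is an ASSUMED figure, not taken from the text.
    """
    return frames_per_sec * airtime_us_per_frame / 1_000_000


if __name__ == "__main__":
    load = probe_load_per_channel(stations=200,
                                  frames_per_sec_per_station=22,
                                  channels=11)
    print(load)                           # 400.0 frames/sec on every channel
    print(airtime_fraction(load, 500.0))  # 0.2 under the assumed 500 us/frame
```

Under that assumed airtime, a fifth of every channel would be spent on scanning alone, before any contention or retries.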

I can think back to Nanogs years ago where folks tended to have junky
prism II radios which did this (type of scanning). Nanog 29 in
particular was quite rife with junky prism II user hardware. A lot of
the laptops were "sager" or something silvery-plastic-generic from far
overseas.

In my selfish, ideal world, a "wifi" network would behave more like a
CDMA system does. Unfortunately, wifi devices were not designed with
these goals in mind. If they had, the hardware would be horribly
expensive, no critical mass of users would have adopted the technology,
and it wouldn't be ubiquitous or cheap today. The good news is that
because it's gotten ubiquitous and popular, companies have added-in some
of the missing niceties to aid in scaling the deployments. 

We now see 'controller based' systems from cisco and Meru which have
implemented some of the core principles at work in larger mobile
networks. One of the important features gained with this centralized
controller concept is coordinated, directed association from AP to AP.
The controller can know the short-scale and long-scale loading of each
AP, the success/failure of delivering frames to each associated client,
and a wealth of other useful tidbits. Armed with these clues, a
centralized device would prove useful by directing specifically active
stations to lesser loaded (but still RF-ideal) APs. 
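
The directed-association idea above might look roughly like the following Python sketch. This is not how any particular vendor's controller works; the ApView record, the pick_ap function, and the -70 dBm "RF-ideal" cutoff are all hypothetical names and values chosen only to illustrate "least loaded among the APs that hear the client well".

```python
from dataclasses import dataclass


@dataclass
class ApView:
    name: str
    rssi_dbm: float  # how strongly this AP hears the client
    load: float      # fraction of airtime in use, 0.0 to 1.0


def pick_ap(candidates, min_rssi_dbm=-70.0):
    """Pick the least-loaded AP among those with acceptable signal.

    min_rssi_dbm is an assumed cutoff for "still RF-ideal".
    """
    if not candidates:
        raise ValueError("no candidate APs")
    usable = [ap for ap in candidates if ap.rssi_dbm >= min_rssi_dbm]
    if not usable:
        # Nothing meets the cutoff: fall back to the strongest signal.
        return max(candidates, key=lambda ap: ap.rssi_dbm)
    # Lowest load wins; ties go to the stronger signal.
    return min(usable, key=lambda ap: (ap.load, -ap.rssi_dbm))


if __name__ == "__main__":
    aps = [
        ApView("ap-1", -55.0, 0.80),  # strongest, but busy
        ApView("ap-2", -62.0, 0.30),  # good signal, lightly loaded
        ApView("ap-3", -80.0, 0.05),  # idle, but too weak to be useful
    ]
    print(pick_ap(aps).name)  # ap-2
```

A client choosing purely on signal strength would have landed on ap-1; the controller's wider view is what makes ap-2 the better answer.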

True, the CCX (cisco client extensions) support on some devices can
permit stuff like this to be shared with the clients (i.e. CCX exposes
AP loading data in the beacon frames, and can tell the client how to
limit its TX power) in the hopes that this can be used in the 'hybrid'
AP selection logic of the station card. What stinks for us is that very
few (generally fewer than 10% at Nanog) of the clients *support* CCX.
What's even more maddening is that about 35 to 40% of the MAC addresses
associated at the last Nanog could support CCX, but it's simply not
enabled for the ssid profile! Here we have one potential solution to
some of the troubles in scaling wireless networks that depends entirely
on the user doing the right thing. Failure, all around.

This gets back to the point of #2 here, in that only "some" of the
better-logic'd client hardware will play by the rules (or even do the
right thing). In a lot of these cases it's better to expertly control
where a client _can_ associate with a centralized authority (i.e.
controller with data from all active APs). We simply cannot depend on
the user doing the right thing, especially when the 'right thing' is
buried and obscured by software and/or hardware vendors.
...
3. Keep an eye on the conference network stats, netflow etc 
so that "bandwidth hogs" get routed elsewhere, isolate 
infected laptops (happens all the time, to people who 
routinely login to production routers with 'enable' - 
telneting to them sometimes ..), block p2p ports anyway (yea, 
at netops meetings too, you'll be surprised at how many 
people seem to think free fat pipes are a great way to update 
their collection of pr0n videos),

I would add that DSCP & CoS maps on the AP's can be used to great effect
here. What I've done at Nanog meetings is to watch what's going on
(recently with some application-aware netflow tracking) and continually
morph and adapt the policy map/class map on the router to set specific
DSCP bits on "p2p" or "hoggish users" traffic. These packets from the
"hog" can then be mapped in the AP (by virtue of their specific DSCP
coloring) to a CoS queue which has lower delivery priority than other
non-hoggish traffic. This way, the p2p/hog/etc bits act as a 'null fill'
and use the available space on the air, but they cannot overtake or
crowd out the queued data from higher priority applications. 
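
A toy Python model of the 'null fill' behavior described above: packets are sorted into queues by DSCP value, and the lowest-priority queue is served only when the others are empty. The specific code points (46, 0, 8) and the three-queue layout are assumptions picked for illustration, not the values used at Nanog, and real AP queueing (802.11e EDCF) is contention-based rather than strictly ordered, so this is a simplification.

```python
from collections import deque

# ASSUMED mapping of DSCP value -> queue index (0 is served first).
DSCP_TO_QUEUE = {
    46: 0,  # interactive traffic marked for priority treatment
    0: 1,   # unmarked / default traffic
    8: 2,   # traffic the router has colored as "hog" or p2p
}


class StrictPriorityScheduler:
    """Always serve the highest-priority non-empty queue first."""

    def __init__(self, levels=3):
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, packet, dscp):
        # Unknown markings fall into the default queue.
        level = DSCP_TO_QUEUE.get(dscp, 1)
        self.queues[level].append(packet)

    def dequeue(self):
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None  # nothing waiting


if __name__ == "__main__":
    sched = StrictPriorityScheduler()
    sched.enqueue("p2p-1", 8)
    sched.enqueue("p2p-2", 8)
    sched.enqueue("ssh-1", 46)
    sched.enqueue("web-1", 0)
    print([sched.dequeue() for _ in range(4)])
    # ['ssh-1', 'web-1', 'p2p-1', 'p2p-2']
```

The hog's packets still go out, so the air never sits idle, but they arrived first and left last.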

You'd have to ask other Nanog meeting attendees, but I think it's fairly
safe to say that the way we treat certain SSH, dns, and tcp-initial
packets over the wireless network has yielded happy, content, and
less-lagged-n-frustrated users.
...
...
...
How was the wifi at the recent Nanog meeting?

I liked it! (heh)
...
...
...
I have heard of some success stories 2nd hand.  one 
'trick' was to 
have "separate networks" which I think meant unique SSID's.  but 
like I said, 2nd hand info, so about all I can say is supposedly 
'something' was done.

The quick & dirty formula for Nanog wireless is as follows:

A) ssid for b/g (typically "Nanog" or "nanog-arin")

B) ssid for a-only (typically "nanoga" or "Nanog-a")

C) 1/4/8/11 channel plan for the 11b/g side, and tight groupings of
5.1/5.2 GHz and 5.7 GHz 11a. Many 11a client cards do not cleanly
scan/roam from 5.3 -> 5.7 or from 5.7 -> 5.3, so this apparent 'panacea
of 8 channels' really isn't. For each 'dwelling area' where users are
expected to sit, try to stick with 5.3 channels + reuse or 5.7 channels
+ reuse. If you have to, go ahead and mix them, but ensure that
'roaming' from this area to other areas features AP's along the way
which are in the same band, lest the 'hardest handoff' cause the card to
rescan both bands.

D) Ensure 1/4/8/11 channel re-use considers the "full" RF footprint. An
active, busy user at the 'edge' of an AP's coverage area might as well
be considered another AP on the same channel because the net effect is
to be another RF source which grows the footprint. Basically, the
effective "RF load" radius is twice as wide as the AP's own effective
coverage if the user's transmitter power and receiver sensitivity are the
same (or nearly the same) as the AP's. This is perhaps the most subtle
and ignored part of wifi planning. 
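
A back-of-envelope sketch of the 'twice as wide' point in D, in Python. It assumes idealized circular coverage with no walls, bodies, or antenna patterns, which no real venue has; the function names and the 30 m radius are hypothetical. The point is only that a busy client at the cell edge extends the channel's footprint by its own range.

```python
def rf_load_radius(ap_coverage_radius_m, client_range_m=None):
    """Radius over which a cell loads its channel, in metres.

    A client sitting at the edge of coverage radiates roughly as far as
    the AP does (ASSUMED: same TX power and RX sensitivity unless a
    different client_range_m is given), so the footprint is the sum.
    """
    if client_range_m is None:
        client_range_m = ap_coverage_radius_m
    return ap_coverage_radius_m + client_range_m


def footprint_area_ratio(ap_coverage_radius_m, client_range_m=None):
    """How many times larger the loaded area is than the coverage area."""
    load_radius = rf_load_radius(ap_coverage_radius_m, client_range_m)
    return (load_radius / ap_coverage_radius_m) ** 2


if __name__ == "__main__":
    print(rf_load_radius(30.0))        # 60.0
    print(footprint_area_ratio(30.0))  # 4.0
```

Twice the radius means four times the area, which is why reuse distances planned from AP coverage alone tend to come out too short.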

E) Help 'stupid' hardware do the right thing by ensuring an 'ideal'
channel is 'most attractive' to clients at every point in space where
capacity is the goal.

In areas where you are able to receive three AP's on channel 1, ensure
that you 'attract' the stupid hardware to a better AP that isn't
co-channel with others. A situation might be three AP's on channel 1
being heard at -72 dBm to -65 dBm by a client. You should place another
AP on channel 8 or 11 nearby and ensure its received level is
approximately 10 dB higher than the AP's on channel 1. This will tend to
attract even the dumbest of stupid hardware. 

Why do you want to attract this client to another channel other than 1
so badly? Because if you are in a point in space that can receive three
AP's worth of beacons on the same channel, those three AP's can all hear
your transmissions. Every transmission your hardware makes means you've
consumed airtime on not one, not two, but three AP's at once.

An especially bad situation which you should strive to avoid is one in
which you are able to hear AP's on channel 1, 4, 8, and 11 with levels
of, say, -75 to -70 dBm. There is no clear winner here (a 5 dB
difference wouldn't be large enough in many drivers to base the decision
on it alone). In this case, ensure that co-channel AP's are heard at
least 20 dB down from the strongest AP's so that users landing on
1/4/8/11 don't consume airtime from co-channel AP's and that the other
adjacent AP's transmissions fall below the threshold of 'clear channel'
assessment of the clients.
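
The two rules of thumb in E (a roughly 10 dB lead for the AP you want clients to pick, and co-channel AP's at least 20 dB down) can be checked against site-survey readings with something like the Python sketch below. The assess_point function and its tuple format are hypothetical, and the margins are simply the figures quoted above; actual driver behavior varies, so read the result as a planning hint only.

```python
def assess_point(readings, attract_margin_db=10.0, cochannel_margin_db=20.0):
    """Evaluate one survey location.

    readings: list of (ap_name, channel, rssi_dbm) heard at this spot.
    """
    if not readings:
        raise ValueError("no APs heard at this point")
    ranked = sorted(readings, key=lambda r: r[2], reverse=True)
    best_name, best_chan, best_rssi = ranked[0]
    others = ranked[1:]
    # Is one AP loud enough that even a naive client should choose it?
    clear_winner = all(best_rssi - rssi >= attract_margin_db
                       for _, _, rssi in others)
    # Are other APs on the winner's channel far enough down?
    cochannel_ok = all(best_rssi - rssi >= cochannel_margin_db
                       for _, chan, rssi in others if chan == best_chan)
    return {"best": best_name,
            "clear_winner": clear_winner,
            "cochannel_ok": cochannel_ok}


if __name__ == "__main__":
    # Three channel-1 APs plus a nearby, louder AP on channel 11.
    fixed = [("ap-a", 1, -72.0), ("ap-b", 1, -68.0),
             ("ap-c", 1, -65.0), ("ap-d", 11, -55.0)]
    print(assess_point(fixed))
    # {'best': 'ap-d', 'clear_winner': True, 'cochannel_ok': True}

    # The "no clear winner" case: four channels within 5 dB of each other.
    muddy = [("ap-1", 1, -75.0), ("ap-4", 4, -73.0),
             ("ap-8", 8, -72.0), ("ap-11", 11, -70.0)]
    print(assess_point(muddy)["clear_winner"])  # False
```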

F) Use DSCP, CoS, and .11e EDCF support to allow the wireless devices to
treat different types of packets over the air appropriately. As
mentioned weeks ago by Todd and others (but apparently missed or ignored
by the thread) there's a PDF and a video up covering some of the results
of this concept. 

Check it out at: 

http://www.nanog.org/mtg-0610/presenter-pdfs/kapela.pdf 

...and http://www.nanog.org/mtg-0610/real/delivered.ram

Perhaps this wasn't quick (little that's complicated is), but it's dirty
all the same! Try these ideas out at your next wireless event!

-Tk


Anton Kapela