RE: Let's talk about Distance Sniffing/Remote Visibility
On Thu, Mar 28, 2002 at 9:20AM -0700, Richard A Steenbergen wrote:
I'm imagining that even with a relatively speedy box, if you were trying to do analysis from multiple interfaces you'd at least choke the disk I/O. There's always stringent filters, I guess.
Disk I/O on a sniffer box? Sounds like you've been sniffing something other than packets my friend. :)
Why do you say that? In the 10/100 range, yes, no problems. But at the Gigabit range (say with two GbE cards or a single OC-48 card) on an x86 box with IDE disks (or even SCSI RAID0), doesn't disk I/O become a severe problem? Under Solaris or Linux, scaling disk seems relatively easy with Veritas Foundation Suite on Solaris or GFS under Linux:
http://www.sistina.com/products_gfs.htm

However, I don't think Linux or Solaris can handle the packet capture capabilities like FreeBSD and BPF can. I've heard things about the new LPF capabilities and turbopacket, but it's just hard to believe coming from such a joke/toy operating system.

Whether you are passively tapping a gigabit ethernet or SONET fiber, or even spanning an entire VLAN or mirroring a gigabit ethernet or SONET port on a router/switch -- you've got a lot of packets/frames to deal with, especially if you want to keep all of them for analyzing later. Sounds like a disk I/O problem to me.

Are you doing packet capture at these rates without running into disk problems? How did you deal with that? We are doing so right now, but only with the IP headers and some "top N" information. Getting full packets and keeping them for a while (a day or two even) is going to take a lot of I/O and disk space. I don't think it's worth it, really.
You can build your own box like that easily enough. If you're going for FastE sniffing I highly recommend the Adaptec Quartet 4-port cards. If you're going for GigE sniffing, I STILL highly recommend anything Alteon Tigon 2 based (NetGear GA620's were the cheapest if you can still find them, not the 621/622).
You can accomplish almost anything with the Tigon2's and FreeBSD, agreed. Another vendor I'm sort of looking at now is Endace (DAG cards):
http://www.endace.com/
http://www.endace.com/products/dag42ge.html
This only does IP headers, but that's the fun stuff anyways ;>
You don't even have to do anything fancy with the card firmware, there is a native command for receiving only part of the frame. Check out the programming manuals at http://people.freebsd.org/~wpaul/Alteon/, and I recommend you use FreeBSD for this of course. Just add in a PARTIAL_RX_CNT command, and the card will only DMA part of the packet (say 64 bytes for full headers) across the PCI bus. Combined with interrupt coalescing (or luigi's device polling and tuning the card to allocate all memory to RX and remove the TX functionality completely), you can sniff quite a few "gigabits" of traffic on a single cheap PC server. You can dump it through the BPF mechanism and still maintain support for all your favorite sniffer programs. Or if you're comfortable writing kernel code, I recommend you make a character device for sniffer device control, and use it to pass page-aligned malloc'd memory pointers from userland into the nic driver, which you then pass to the card as the RX ring buffers. This will let you DMA your packets directly into userland. If not, at least unhook ether_input(). :)
Can you post more details or catch up with me offline about this? I'm really very interested in your implementations and results.
Or you can buy these things commercially. My favorite was from a company called Tekelec, who sold a VERY expensive box which turned out to be a pentium 200ish box running solaris x86 and completely useless sniffing software, with a bunch of ISA ethernet cards hooked up by proprietary (and VERY expensive) cables, all in a box made out of what I swear was some kind of lead/neutron star material alloy. Of course that was a couple years ago, maybe they've upgraded to the current market's $50 processor. :)
We've been looking at NetVCR from Niksun, which sounds similar except that it actually is FreeBSD-based. Somebody needs to put together a list of all these companies and do some comparisons of the product offerings. Like you, I'd rather just build my own box and run with it ;>

-dre
-- Richard A Steenbergen <ras@e-gerbil.net> http://www.e-gerbil.net/ras PGP Key ID: 0x138EA177 (67 29 D7 BC E8 18 3E DA B2 46 B3 D8 14 36 FE B6)
On Thu, Mar 28, 2002 at 03:14:53PM -0800, Gironda, Andre wrote:
Why do you say that? In the 10/100 range, yes, no problems. But at the Gigabit range (say with two GbE cards or a single OC-48 card) on an x86 box with IDE disks (or even SCSI RAID0), doesn't disk I/O become a severe problem? Under Solaris or Linux, scaling disk seems relatively easy with Veritas Foundation Suite on Solaris or GFS under Linux http://www.sistina.com/products_gfs.htm
Capturing packets for realtime analysis is an attainable goal using cheap off the shelf hardware and a little bit of clue. Storing many Gbps of data on a hard drive is a much harder task. Even using 160 GB drives, 1 Gbps fills one in about 20 minutes (10 if you're recording full duplex). Unless you're the FBI, I really don't think you want to store that much data for any reason. Be smart in what you write to disk, and how you write it.
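For anyone who wants to check the fill-time figure above, here is a minimal sketch of the arithmetic in C. It assumes decimal gigabytes (160 GB = 160 * 10^9 bytes) and a link that is saturated the whole time; the function names fill_seconds and report are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Seconds needed to fill a disk of `disk_gb` decimal gigabytes when
 * writing a capture stream of `rate_gbps` gigabits per second.
 * Assumes the link is fully loaded and nothing is filtered out. */
double fill_seconds(double disk_gb, double rate_gbps)
{
    assert(rate_gbps > 0.0);
    return disk_gb * 8.0 / rate_gbps;   /* 8 bits per byte */
}

/* Print the fill time in minutes for one capture scenario. */
void report(const char *label, double disk_gb, double rate_gbps)
{
    printf("%-16s %6.1f minutes\n", label,
           fill_seconds(disk_gb, rate_gbps) / 60.0);
}
```

Calling report("one direction", 160.0, 1.0) gives about 21.3 minutes, and report("full duplex", 160.0, 2.0) gives about 10.7 minutes, which lines up with the rough 20 and 10 minute numbers.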
However, I don't think Linux or Solaris can handle the packet capture capabilities like FreeBSD and BPF can. I've heard things about the new LPF capabilities and turbopacket, but it's just hard to believe coming from such a joke/toy operating system.
The data capture mechanism of BPF is pretty simple (the filter language is what's complex), I doubt even Linux can get it too wrong. All you need is a buffer in the kernel (FreeBSD defaults to 4096 bytes, libpcap turns it up to 32768 I believe but doesn't expose the value to the user, you should probably turn that up a bit if you want to capture at high speed). Read data from the NIC, copy it into the buffer (or preferably have the NIC be responsible for transferring it into the buffer :P), and increment the offset. Then when someone comes along to read more data, copy out the buffer into the userland buffer, use the offset value to indicate the total length, and reset the offset to 0. If you need more than 20 lines to do that part, you're probably doing it wrong. :)

In normal use of bpf the data is copied 3 times: from the NIC to an mbuf, from the mbuf to the bpf buffer if there is a configured bpf reader, and then from the bpf buffer to the user supplied buffer when the user does a read() on the BPF descriptor. Fortunately multiple packets are buffered into a single copy in stages 2 and 3.

If you want to eliminate some of those copies, you have to make a dedicated reader mechanism. Malloc the memory in userland so you get a nice page-aligned chunk, allocate the counters in userland, and pass it all in via a character device similar to BPF. You probably want to go with a ring structure, using 2 counters as a producer and a consumer index. The kernel updates the producer index, and you update the consumer index as you process data. When both values are equal, the ring is empty. When the producer index is 1 below the consumer index (modulo the ring size), the ring is full. With an intelligent card, you pass the memory address of your userland allocated memory as where you want the RX data to be DMA'd. The kernel advances the producer index as slots fill, discarding any data the consumer hasn't made room for.
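The producer/consumer ring described above can be sketched roughly like this. It is a plain single-threaded userland model, not driver code: the names (pkt_ring, ring_produce, ring_consume) and the sizes are invented for illustration, memcpy stands in for the card's DMA, and a real ring shared between kernel and userland would also need memory barriers or atomics on the two indices.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RING_SLOTS 8    /* assumed ring size; one slot always stays unused */
#define SLOT_BYTES 64   /* assumed per-packet capture length (headers only) */

struct pkt_ring {
    unsigned prod;                  /* next slot the producer fills */
    unsigned cons;                  /* next slot the consumer reads */
    unsigned long dropped;          /* packets discarded while full */
    unsigned short len[RING_SLOTS];
    unsigned char slot[RING_SLOTS][SLOT_BYTES];
};

void ring_init(struct pkt_ring *r) { memset(r, 0, sizeof *r); }

/* Empty when both indices are equal. */
int ring_empty(const struct pkt_ring *r) { return r->prod == r->cons; }

/* Full when the producer is one slot below the consumer (mod ring size). */
int ring_full(const struct pkt_ring *r)
{
    return (r->prod + 1) % RING_SLOTS == r->cons;
}

/* Producer side (the kernel/NIC in the real thing).
 * Returns 1 if the packet was stored, 0 if it was dropped. */
int ring_produce(struct pkt_ring *r, const void *data, size_t n)
{
    assert(data != NULL);
    if (ring_full(r)) {
        r->dropped++;               /* consumer fell behind: discard */
        return 0;
    }
    if (n > SLOT_BYTES)
        n = SLOT_BYTES;             /* keep only the first SLOT_BYTES */
    memcpy(r->slot[r->prod], data, n);
    r->len[r->prod] = (unsigned short)n;
    r->prod = (r->prod + 1) % RING_SLOTS;
    return 1;
}

/* Consumer side (the userland sniffer). `out` must hold SLOT_BYTES.
 * Returns the stored length, or -1 if the ring is empty. */
int ring_consume(struct pkt_ring *r, void *out)
{
    int n;
    if (ring_empty(r))
        return -1;
    n = r->len[r->cons];
    memcpy(out, r->slot[r->cons], (size_t)n);
    r->cons = (r->cons + 1) % RING_SLOTS;
    return n;
}
```

Keeping one slot permanently unused is what lets "equal means empty" and "one below means full" coexist without a separate count; the price is that an 8-slot ring holds at most 7 packets.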
Then you just have your userland program constantly scanning the ring for new data, put a usleep(1); in there and you'll stay below 0.01% cpu.

Think there would be a benefit to writing this as an extension to BPF?

-- 
Richard A Steenbergen <ras@e-gerbil.net> http://www.e-gerbil.net/ras
PGP Key ID: 0x138EA177 (67 29 D7 BC E8 18 3E DA B2 46 B3 D8 14 36 FE B6)