Last notes of the day... Matt

2006.02.14 Tools BOF
Todd Underwood, panel moderator

A number of interesting tools were presented earlier today; all of them are good and interesting and solve a particular set of problems. None are in widespread use. There are a lot of possible reasons: they solve problems you don't have, in which case the authors can move on to something new; or they solve a problem similar to one you have, but not quite; or they solve a problem you can't quite implement yet. The plan is to discuss use cases and the problems the tools are trying to solve, and to give feedback, as interactively as people comfortably can.

3 tools: OpenBGPD, IRR powertools/webtools (to get feedback, and is the IRR even useful anymore?), and Flamingo as one of 2 netflow platforms.

Start with Henning, active in open source software development; he'll go into more depth on OpenBGPD.

OpenBGPD
Henning Brauer
henning at openbsd.org

3 process design
Principle of least privilege.
The RDE (route decision engine) does not need any special privileges at all, so it runs as _bgpd:_bgpd, chrooted to /var/empty.
The SE needs to bind to TCP/179.
The parent needs to modify the kernel routing table.

Session Engine (SE)
Needs to bind to 179/tcp.
We have the parent open the sockets and hand the fds to the SE; see recvmsg(2).
The parent needs to keep track of which fds the SE has open, so it doesn't bind() again to the same ip/port.
The SE can then drop all privileges.

SE 2
Since one process handles all BGP sessions, it needs nonblocking sockets.
On a blocking socket, you call write(2) and it won't return until it's done or gets an error.
On a nonblocking socket, it returns as soon as it can't proceed immediately.
So the SE has to handle buffer management itself.

SE 3
Designed an easy to use buffer API and message handling system.

Messaging
Internal messaging is a core component; it has been reused in OpenNTPD, OpenOSPFD, and some more.
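A minimal Python sketch of the buffer and message handling idea in these slides: queue framed messages, write whatever a nonblocking socket will accept, and keep the rest for a later attempt. The header layout (2-byte type, 2-byte total length), the MsgBuf class, and the parse helper are hypothetical and chosen only for illustration; this is not OpenBGPD's actual imsg API or wire format.

```python
import socket
import struct
from collections import deque

# Hypothetical header: message type and total length, both 16-bit, network order.
# An assumption for this sketch, not the real imsg header.
HEADER = struct.Struct("!HH")


class MsgBuf:
    """Queue of framed messages waiting to be written to a nonblocking socket."""

    def __init__(self, sock):
        sock.setblocking(False)
        self.sock = sock
        self.queue = deque()

    def compose(self, msg_type, payload=b""):
        """Frame a message and queue it; nothing is written yet."""
        header = HEADER.pack(msg_type, HEADER.size + len(payload))
        self.queue.append(header + payload)

    def flush(self):
        """Write as much as the kernel accepts; return the bytes still queued."""
        while self.queue:
            chunk = self.queue[0]
            try:
                sent = self.sock.send(chunk)
            except BlockingIOError:
                break  # socket buffer is full, try again when it is writable
            if sent < len(chunk):
                self.queue[0] = chunk[sent:]  # keep the unsent tail
                break
            self.queue.popleft()
        return sum(len(c) for c in self.queue)


def parse(buf):
    """Split received bytes into (type, payload) messages plus leftover bytes."""
    msgs = []
    while len(buf) >= HEADER.size:
        msg_type, length = HEADER.unpack_from(buf)
        if length < HEADER.size:
            raise ValueError("length field smaller than header")
        if len(buf) < length:
            break  # message not complete yet, wait for more data
        msgs.append((msg_type, buf[HEADER.size:length]))
        buf = buf[length:]
    return msgs, buf
```

In an event loop, flush() would be called whenever poll or select reports the socket writable, and parse() would be fed whatever the read side has accumulated so far.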
bgpd has more than 52 message types, more than OpenSSH.
bgpctl talks to bgpd using the same imsg socket.

tcp md5
There was some very old code in the kernel for tcp md5, from 4.4BSD; it never worked.
tcp md5 is somewhat similar to IPsec AH, so they implemented it within the IPsec maze.
Had to add a pfkey interface to bgpd; it's a committee-designed API.
That made IPsec that much easier; they extended the API so bgpd can request unused SPIs from the kernel, so they don't have to be configured manually.

tcp md5/ipsec
When you don't have tcp md5 or IPsec in place, big tcp windows are risky.
Stay at a 16k window unless you have tcp md5 or IPsec; then you get 64K. So IPsec improves performance.

Joel Jaeggli asks how big a tcp window you need for a BGP session at all. The initial connection gets faster with 64K, but thereafter it's similar.

looking glass
Just added an optional second control socket that is restricted to the "show" operations.
The regular bgpctl binary can be used with it.
cgi, yeah, that needs to be hacked into shape, but it's easy.

Juniper only does static IPsec setup, so it requires nasty setup. OpenBGPD is dynamic, but interoperates with Junipers.

So back to looking glass, security.
On OpenBSD, the httpd (an apache 1.3 variant) runs in a chroot jail by default.
The readonly socket can be placed inside that jail:
bgpd_flags="-r /var/www/bgpd.rsock" in rc.conf.local
Put a statically linked bgpctl binary in the chroot:
/path/to/bgpctl -s /bgp.rsock, $

impressions from the road to ipv6
Most heinous checkin message yet.
The lower 2 bytes of the scopeID overwrite part of the v6 address...ugly!

Performance
http://hasso.linux.ee/linux/openbgpd.php
It's quick.
OpenBGPD 3.6 port for Linux; can't communicate with the kernel, no v6, no md5; 8 times faster than quagga.

future plans and ideas
The biggest task waits outside bgpd itself: the kernel routing table.
We need to make use of the radix mpath capabilities added in 2004, and add route source markers (BGP, OSPF, etc).
bgpd and ospfd can then blindly install their routes; the kernel knows the precedence.
Hard to do; once it's done, routing will be easier.

Also need multiple routing tables, with pf acting as table selector, so the unholy route-to can die and the associated issues vanish. Might be useful with bgpd as well.

Ideas for quite radical changes to speed up packet forwarding dramatically.
Will have a fast path where all the easy cases can be handled on specialized PCI cards; multiple 10GE at wire speed within 2 years. The hardware exists and is on its way to him.

For route servers, reversing filter and best path selection would be good.

Filter generation from the RIPE DB or similar, but the IRR toolset sucks hairy moose balls. Should be solvable in perl; "someone" has to code it. (Maybe use IRR power tools for it instead!)

We can fail over IP addresses already, thanks to CARP.
We can have synchronized state tables on multiple machines, which gives HA firewall clusters.
Would be really cool to be able to fail over TCP sessions and BGP sessions; could make for BGP hitless failover.
Syncing the BGP stuff shouldn't be too hard.

Lots of work, not much time.
Money has to come from somewhere, obviously. Unfortunately, people forget about this and just go to the mirrors.
Vendors don't help. Never got anything for OpenSSH yet.
It comes down to you. Yes, YOU. Buy our CDs, donate cash, tee-shirts, etc., or sponsor events, plane tickets, etc.

Matt asks if they can be configured as active only, not binding on port 179. Currently, no. Can bind to loopbacks for the moment, or modify the code as appropriate.

RAS asks if they support the latest IETF draft for collision handling. They don't implement collision detection RFC-wise, no. And they're not active only, without passive.
He also asks about comparing a RIB-in; that's been in the code for 2 weeks now, and still has some bugs.
Add support for passing more than one best path, in case people are using this for a looking glass, i.e. the VRF style hack for getting multiple views of the same route.

Todd asks if they log dump output in MRT format. Short answer, yes.

Many thanks to Henning!

OK, over to RAS for IRR power tools.

One leading question is whether the IRRs are even worthy of supporting, or are they a dead horse we've been flagellating for ten years?

Todd kicks it off with a question--you developed the toolset because you think IRRs are a worthy effort to support, right?

RAS notes that manual filters aren't a reasonable means to configure routers. Our current system is broken; something needs to change. The tool was written for internal use initially; Chris Morrow convinced him to release it for others. Not really meant as a plug for the IRRs, it's more "if you're going to use the IRRs..."

Riverdomain, package base, if it can be a .rpm or .deb package, it would help; RAS doesn't do Linux, but others are welcome to package it up.

When people go to implement this, they find a lot of cruft lurking around. Verio is the primary forcer right now for using IRRs; Level3 proxy objects are a great pain point. There's no motivation to remove old information from the system.

Larry Blunk with Merit: if you send email to them, they will remove old stale objects. It may take a little while, but it will get done. They're starting to check for inconsistencies so you can do bulk deletions of blocks that aren't being announced.
Could they send out a monthly nag message? Send that initially.
They charge a yearly fee. When someone doesn't pay, companies will eventually be phased out.
About 1600 maintainers; some are proactive about keeping their objects up to date, others are pretty minimal or are laggards.
The website will show out-of-date info, but won't allow cleaning it up; the next step is to allow them to clean it up via the web.
They're looking to have a simple form vs a complex form coming up, for those who don't have a complete understanding of BGP. The report section works across all registries.
Could check to see if there's a less specific entry already with a different MAINT-ID.
What about an INET-num to block any more specifics from the block from being registered?
The challenge centers around the split between IP allocation and route registration.

Peter Shoenmaker, works for Verio; many people mirror other databases, so there's compounded stale data. There are certain companies that have large blocks, and they'll register all the more specifics under them as well. Generating those filter lists gets ugly.

RAS notes his tool auto-aggregates to the least specific block possible.

Samantha Billington, ISC, current maintainers of the current IRR ToolSet; would love to re-do it if there were some backing behind it. Currently trying to develop peering tools to make keeping track of peerings easier.

Randy Bush, IIJ, peers at the SIX using a perl script to generate filter lists around peval; it's a decade later, and we're still hacking more tools on top of the same broken data. It's time to do something new; doing the same thing and expecting a different outcome is the definition of insanity. He says let's just give up, and change.

Todd asks about SBGP/SO-BGP; they haven't caught on, and perhaps never will. But in the meantime, Randy notes it will let you build filters that are at least rigorously verifiable. And it'll be a backend-only change to drop it in for the IRR in the future. It'll be downhill, incremental work at that point.

Randy doesn't think it makes sense to continue forward at all. Let's stop for a year and redo things on a better course before another ten years goes past.

Randy offers to show Todd his ass...the audience groans, and we return to a more serious note.

This is quicksand; let's get out of the quicksand. It's just an issue of going in the wrong direction.
If you don't mind doing work twice, OK; but if you do, then save your time and energy, and let's focus on a PKI system. So, let's do PKI now, and we add this into the tools to jump start the process.

You get a list of signed objects, you traverse the list of signed attestations that you receive, you verify up the chain to IANA, and verify that the chain is intact. For addresses and ASes, IANA is the root. For the identities of the ISPs and RIRs, there may not be a single root.

IRRs aren't secure enough to use for the database system. It's not transport free? LDAP would very likely be the storage/retrieval system.

Let's start coding it; 2585 for queries, it's over HTTP, lets you talk to a PKI certificate abstraction.
Need to cover the changes to certs needed to cover address space and ASNs.
Can 2585 cover certificate modification, though? Needs...

Andy Putnams, Intelliden: the config challenge will exist regardless of the PKI; how do you ensure that you don't have multiple, overlapping entries?

RAS notes that if we could handle PKI inside the IRR, it would jumpstart the data.

Chris asks why we can't start doing verifications with the existing infrastructure. Randy's not saying throw away the existing infrastructure; just pause on chasing it down, and change direction for the future.

RPSL is huge overkill, 90% of it isn't used at all; would we want to put PKI in there?

Sandy Murphy, Sparta. Could we use the PKI to prevent people from adding or removing routes that we don't want touched? Could the people RUNNING the IRR check before you stick it in; then we also check on the way out.
RFC 2725--her name is on it, but it's really Curtis Villamizar's proposal, and it does allow for the web of trust between systems to start with.

Now onto Flamingo; Manish will walk through a live demo off his netflow data.
flamingo.merit.edu:4444
It's more for security than traffic engineering; for those of us doing traffic engineering, though, it might still be of some use.
Gives a large overview when we start it up; use the minimum threshold to filter out background noise.
The traffic volume based on source AS view is somewhat interesting. Ooops, neighbor AS, actually (or origin, depending on your router setting).
The quad tree works the same for AS number, it's just a 16 bit number, not 32 bit.
Archive collector is based off flow tools, uses a similar file format.
Can pause updates so you don't lose the current data you're about to zoom in on, for example.
The last view is combined source, dest, port, etc.: everything sampled by netflow, all in one window.
The sliders are a little crude; it would be nice to have text entry boxes as well as slider bars.

You can also list specific networks you care about that you want to see data for, so you can see traffic sourced from a given subnet. Every bar will then be a /32 host, and will show traffic volumes for each; can shift zoom to limit what is on the display.

Can do the same visualizations of darknet space. They generate the darkspace network off an unnumbered interface, and generate flow data from what is seen in the backscatter. Lots of scans from lots of people. Last 10 seconds or last 2 seconds, very dynamic.
So the vertical here is walking across ports: a single source port hitting lots of destination ports.

Can also play back feeds from netflow collector format, etc.

The platform is displaying pretty well at about 5k records/second, but scaling up for a large content provider would be tough.

Most of the code is written in-house, on Linux; may port to other platforms later.
Running on a 3.6GHz P4 right now, with a nice 256MB nvidia graphics card; no other real specs right now.
Heavy dependencies on OpenGL, so no licensing for that. Software tool is largely GPL free.

Availability is an issue right now. It's in trials, and they're happy to do more trials; that's where they'll leave it for now. Manpower issues, no QA, no support, so they're not sure if they can really release it to the public yet.
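As an illustration of the quad tree remark in the demo (same layout for a 16 bit AS number as for a 32 bit address), here is a small Python sketch of one common way to place a number in a quadtree: consume two bits per level, with one bit picking the horizontal half and the other the vertical half. The function name quadtree_xy and the bit convention are assumptions made for this sketch; the notes do not say how Flamingo itself lays out its display.

```python
def quadtree_xy(value, bits=16):
    """Map an integer to (x, y) grid coordinates, two bits per quadtree level.

    Assumed convention: in each 2-bit pair, the low bit selects the
    horizontal half (x) and the high bit selects the vertical half (y).
    With bits=16 the grid is 256 x 256; with bits=32 it is 65536 x 65536.
    """
    if bits <= 0 or bits % 2:
        raise ValueError("bits must be a positive even number")
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the given number of bits")
    x = y = 0
    # Walk from the most significant pair down to the least significant one.
    for shift in range(bits - 2, -1, -2):
        pair = (value >> shift) & 0b11
        x = (x << 1) | (pair & 1)
        y = (y << 1) | (pair >> 1)
    return x, y
```

With this layout, numbers that share their leading bits land in the same sub-square, which is why a prefix or a range of AS numbers shows up as a contiguous block rather than scattered points.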
Are there additional visualizations that might help? On the ops side, which of these ways of looking at this would help the poor folks who run NOCs?

What about saving profiles the NOC can compare against to quickly recognize "bad things"?

Mike Hughes (I think) says it's not really a first-alert type tool; it's a second or third level escalation tool, too hard to interpret for a front line NOC person.
They find it most useful for playback, rather than realtime 'spot the attack'.

Chris--scaling for netflow is always a challenge. He has 9 routers reporting 50,000 flows/second; he has 300 others to add. So the scaling part is a challenge. The SOC might find it useful.

It can use pre-existing netflow/flowtool archive files; the server can play those back.

Chris Morrow would just like to see some rough stats on what it could support.

Mike Hughes, runs a layer 2 environment: could it be sFlow and layer 2 aware?

Walt Prue asks if he could point at a line to see info about it. On the feature list.