NANOG36-NOTES 2006.02.14 Tools BOF Notes

15 Feb 2006

      Last notes of the day...

Matt

2006.02.14 Tools BOF
Todd Underwood, panel moderator

A number of interesting tools presented earlier today;
all of them are good and interesting and solve a
particular set of problems.  None are in widespread
use.  There are a lot of possible reasons: they may
solve problems you don't have, in which case they
can move on to something new; or they solve a problem
similar to one you have, but not quite.  Or they solve
a problem you can't quite implement yet.
Discuss use cases and the problems they're trying to
solve, and give feedback, as interactively as people
comfortably can.

3 tools: OpenBGPD, IRR powertools/webtools (to get
feedback, and to ask whether the IRR is even useful
anymore) and Flamingo as one of 2 netflow platforms.

Start with Henning, active in open source software
development; he'll go in more depth on openbgpd.

OpenBGPD
Henning Brauer henning at openbsd.org

3 process design

Principle of least privilege
the RDE (route decision engine) does not need any special
privileges at all, so it runs as _bgpd:_bgpd, chrooted to /var/empty

SE needs to bind to TCP/179

parent needs to modify kernel routing table.

Session Engine (SE)
needs to bind to 179/tcp

we have the parent open sockets
see recvmsg(2)

parent needs to keep track of which fds the SE has open,
so it doesn't bind() again to same ip/port

the SE can drop all privs, then.
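
A minimal sketch of the descriptor-passing idea above, in Python rather than bgpd's C. A parent creates and binds the listening socket, then hands the file descriptor to the other side over a Unix socket pair (the SCM_RIGHTS mechanism behind recvmsg(2)). The function names, the use of port 0, and running both ends in one process are assumptions for illustration; real code would fork and drop privileges in the child. Unix only, Python 3.9 or later.

```python
import socket

def parent_open_listener(host="127.0.0.1", port=0):
    """Privileged side: create, bind and listen (bgpd would use TCP/179)."""
    lsock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    lsock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    lsock.bind((host, port))   # port 0 = any free port, so no root needed here
    lsock.listen(8)
    return lsock

def pass_fd(chan, fd):
    """Send one descriptor over a Unix socket as SCM_RIGHTS ancillary data."""
    socket.send_fds(chan, [b"L"], [fd])

def receive_listener(chan):
    """Unprivileged side: receive the descriptor and wrap it as a socket."""
    _msg, fds, _flags, _addr = socket.recv_fds(chan, 1, 1)
    return socket.socket(fileno=fds[0])

parent_end, se_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
listener = parent_open_listener()
bound = {listener.getsockname()}          # parent remembers what is already bound
pass_fd(parent_end, listener.fileno())
se_listener = receive_listener(se_end)    # this end can accept() without having bound
```

The `bound` set stands in for the bookkeeping the notes mention: the parent checks it before calling bind() again for the same ip/port.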

SE 2
since one process handles all BGP sessions, need nonblocking sockets.

on a blocking socket, write(2) won't return until it's done
or it gets an error

on a nonblocking socket, it returns as soon as it can't proceed
immediately
So, have to handle buffer management
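
A minimal sketch of the buffer management that nonblocking sockets force on you, in Python rather than bgpd's C. The class and method names are hypothetical, not OpenBGPD's buffer API. Unsent bytes stay queued until the kernel will take them.

```python
import socket

class WriteBuffer:
    """Queue outgoing bytes and push them through a nonblocking socket."""

    def __init__(self, sock):
        sock.setblocking(False)
        self.sock = sock
        self.pending = bytearray()

    def queue(self, data):
        """Append data to the outgoing queue without touching the socket."""
        self.pending += data

    def flush(self):
        """Send as much as the kernel accepts now; return bytes still queued."""
        while self.pending:
            try:
                sent = self.sock.send(self.pending)
            except BlockingIOError:
                break              # socket buffer full: retry when writable
            del self.pending[:sent]   # a send may be partial; keep the rest
        return len(self.pending)
```

In an event loop, flush() would be called whenever select/poll reports the socket writable, and the socket would only be watched for writability while flush() returns non-zero.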

SE 3
designed an easy to use buffer API and message handling
system.

Messaging
internal messaging is a core component of bgpd;
reused for OpenNTPD, OpenOSPFD, and some more.
bgpd has more than 52 message types, more than OpenSSH
bgpctl talks to bgpd using same imsg socket
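
A minimal sketch of typed, length-prefixed internal messages, to illustrate the kind of framing a message-passing design needs. The header layout and the type code are hypothetical and are not the actual imsg header; only the general idea (a type field plus a length, read from a stream socket) is taken from the notes.

```python
import struct

HEADER = struct.Struct("!IHH")   # type, total length, flags (hypothetical layout)
MSG_SHOW_NEIGHBOR = 7            # made-up type code for illustration

def pack_msg(msg_type, payload=b"", flags=0):
    """Build one framed message: fixed header followed by the payload."""
    return HEADER.pack(msg_type, HEADER.size + len(payload), flags) + payload

def unpack_msgs(buf):
    """Split a receive buffer into complete (type, flags, payload) messages.

    Returns the parsed messages and the leftover bytes of any partial message.
    """
    msgs = []
    while len(buf) >= HEADER.size:
        msg_type, total, flags = HEADER.unpack_from(buf)
        if total < HEADER.size:
            raise ValueError("corrupt length field")
        if len(buf) < total:
            break                # wait for the rest of this message
        msgs.append((msg_type, flags, bytes(buf[HEADER.size:total])))
        buf = buf[total:]
    return msgs, bytes(buf)
```

Because the leftover bytes are handed back, the caller can keep appending whatever a nonblocking read returns and parse again.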

tcp md5
some very old code in kernel for tcp md5, from 4.4 BSD
never worked
tcp md5 is somewhat similar to IPSec AH, so implement
 it within the IPSec maze.
Had to add pfkey interface to bgpd; committee-designed
 API.
that made IPSec that much easier; extended the API so they
can request unused SPIs from kernel, don't have to be
configured manually.

tcp md5/ipsec
when you don't have tcp md5 or ipsec in place, big tcp
windows are risky

stay at 16k window unless you have tcp md5 or ipsec,
then you get 64K
so ipsec improves performance.

Joel Jaeggli asks how big a tcp window do you need for
a BGP session at all?  initial connection gets faster
with 64K, but thereafter, similar.

looking glass
just added an optional second control socket that is
restricted to the "show" operations
regular bgpctl binary can be used with it
cgi, yeah, that needs to be hacked into shape, but it's easy.

Juniper only does static IPSec setup, so requires nasty
setup.  OpenBGPD is dynamic, but interoperates with Junipers.

So back to looking glass, security
on OpenBSD, the httpd (an apache 1.3 variant)
runs in a chroot jail by default
the read-only socket can be placed inside that jail
bgpd_flags="-r /var/www/bgpd.rsock" in rc.conf.local

put a statically linked bgpctl binary in the chroot
/path/to/bgpctl -s /bgpd.rsock ...
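
A minimal sketch of the wrapper a looking-glass CGI might put in front of bgpctl. Everything here is an assumption for illustration: the bgpctl path is the placeholder from the notes, the socket path is the restricted socket as it would appear from inside the /var/www chroot, and the allow-list is an arbitrary subset of "show" targets. A CGI running in the chroot would also need its interpreter there; the same allow-list idea carries over to a shell or C wrapper.

```python
import subprocess

BGPCTL = "/path/to/bgpctl"      # placeholder path from the notes
RSOCK = "/bgpd.rsock"           # restricted socket as seen inside the chroot (assumed)
ALLOWED = {"summary", "neighbor", "rib"}   # illustrative subset of "show" targets

def build_command(what):
    """Return the argv for one permitted query; reject everything else."""
    if what not in ALLOWED:
        raise ValueError("query not permitted: %r" % (what,))
    return [BGPCTL, "-s", RSOCK, "show", what]

def run_query(what, timeout=10):
    """Run bgpctl without a shell, so user input is never interpreted."""
    result = subprocess.run(build_command(what), capture_output=True,
                            text=True, timeout=timeout)
    return result.stdout
```

The restricted socket already limits what bgpd will answer; the allow-list only keeps arbitrary user input out of the command line.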

impressions from road to ipv6
most heinous checkin message yet.  The lower 2 bytes
of the scopeID overwrite part of the v6 address...ugly!

Performance
http://hasso.linux.ee/linux/openbgpd.php

it's a quick OpenBGPD 3.6 port for Linux; can't communicate
with kernel, no v6, no md5; 8 times faster than quagga.

future plans and ideas
the biggest task waits outside bgpd itself; kernel routing
 table.

we need to make use of the radix mpath capabilities
added in 2004, and add route source markers (BGP,
OSPF, etc)
 bgpd and ospfd can blindly install their routes
 kernel then knows precedence
hard to do; once it's done, routing will be easier.
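
A minimal sketch of the route source marker idea: every daemon installs its routes without coordinating, and the table picks the winner by source. The class names and the precedence numbers are hypothetical, not the OpenBSD kernel's.

```python
from dataclasses import dataclass

# Lower value wins. The numbers are illustrative, not OpenBSD's actual priorities.
PRECEDENCE = {"connected": 0, "static": 1, "ospf": 2, "bgp": 3}

@dataclass(frozen=True)
class Route:
    prefix: str
    nexthop: str
    source: str

class RoutingTable:
    def __init__(self):
        self.routes = {}            # prefix -> list of candidate routes

    def install(self, route):
        """Daemons add routes blindly; no coordination between them."""
        self.routes.setdefault(route.prefix, []).append(route)

    def best(self, prefix):
        """The table, not the daemon, decides which source takes precedence."""
        return min(self.routes[prefix], key=lambda r: PRECEDENCE[r.source])
```

With this split, bgpd and ospfd never need to know about each other's routes.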

Also need multiple routing tables, with pf acting as
table selector
so unholy route-to can die, and associated issues
vanish.

might be useful with bgpd as well.

ideas for quite radical changes, to speed up packet
forwarding dramatically.
will have fast path where all easy cases can be handled
on specialized PCI cards
multiple 10GE at wire speed within 2 years.  hardware
exists, on way to him.

for route servers, reversing filter and best path selection
would be good.

filter generation from RIPE DB or similar
 but IRR toolset sucks hairy moose balls
 should be solvable in perl; "someone" has to code it.

(maybe use IRR power tools for it instead!)

we can fail over IP addresses already, thanks to CARP

we can have synchronized state tables on multiple machines,
 gives HA firewall clusters.

Would be really cool to be able to fail over TCP sessions
and bgp sessions.
could make for BGP hitless failover
syncing BGP stuff shouldn't be too hard
lots of work, not much time.

Money has to come from somewhere, obviously.
Unfortunately, people forget about this, just go to mirrors.
Vendors don't help
Never got anything for OpenSSH yet

it comes down to you.  yes YOU.
buy our CDs, donate cash, tee-shirts, etc.
or sponsor events, plane tickets, etc.

Matt asks if they can be configured actively only,
not binding on port 179.  Currently, no.  Can bind
to loopbacks for the moment, or modify the code
as appropriate.

RAS asks if they support the latest IETF draft
for collision handling.
They don't implement collision detection RFC-wise,
no.  And they're not active only, without passive.
He also asks about comparing a RIB-in; that's been
in the code for 2 weeks now, still has some bugs.

Add support for passing more than one best path
in case people are using this for a looking-glass.
ie the VRF style hack for getting multiple views
of the same route.

Todd asks if they log dump output in MRT format.
Short answer, yes.

Many thanks to Henning!

OK, over to RAS for IRR power tools.  One leading
question is whether the IRRs are even worthy of
supporting, or are they a dead horse we've been
flagellating for ten years?

Todd kicks it off with question--you developed
toolset because you think IRRs are a worthy effort
to support, right?
RAS notes that manual filters aren't a reasonable
means to configure routers.  Our current system is
broken, something needs to change.  Tool was written
for internal use initially, Chris Morrow convinced
him to release it for others.  Not really meant as
a plug for the IRRs, it's more "if you're going to
use the IRRs..."

Riverdomain, package base, if it can be a .rpm or .deb
package, it would help; RAS doesn't do Linux, but others
are welcome to package it up.

When people go to implement this, they find a lot of
cruft lurking around.  Verio is the primary forcer
right now for using IRRs; Level3 proxy objects are
a great pain point.

No motivation to remove old information out of the
system.

Larry Blunk with Merit; if you send email to them,
they will remove old stale objects.  It may take a
little while, but it will get done.
They're starting to check for inconsistencies so
you can do bulk deletions of blocks that aren't
being announced.
Could they send out a monthly nag message?  Start
with that.
They charge a yearly fee.  When someone doesn't
pay, companies will eventually be phased out.
About 1600 maintainers, some are proactive about
keeping them up to date, others are pretty minimal
or are laggards.
Website will show out-of-date info, but won't allow
cleaning it up; next step is to allow them to clean
it up via the web.

They're looking to have a simple form vs complex form
coming up, for those who don't have a complete
understanding of BGP.

Report section works across all registries.  Could
check to see if there's a less specific entry already
with a different MAINT-ID.
What about an inetnum object to block any more specifics
of the block from being registered?
Challenge centers around the split between IP
allocation and route registration.

Peter Schoenmaker, works for Verio; many people mirror
other databases, so there's compounded stale data.
There are certain companies that have large blocks,
and they'll register all the more specifics under it
as well.  Generating those filter lists gets ugly.
RAS notes his tool auto-aggregates to least specific
block possible.
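
A minimal sketch of aggregating a registered prefix list down to its least specific covering blocks, using Python's standard ipaddress module. This is not RAS's tool; the function name is hypothetical. Note that collapse_addresses also merges adjacent sibling prefixes into their parent, which may or may not be what a given filter policy wants.

```python
import ipaddress

def aggregate(prefixes):
    """Drop more-specifics covered by a registered block and merge siblings.

    All prefixes must be the same IP version.
    """
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]

registered = [
    "10.0.0.0/8", "10.1.0.0/16", "10.1.2.0/24",   # large block plus its more-specifics
    "192.0.2.0/25", "192.0.2.128/25",             # two halves of one /24
]
filter_list = aggregate(registered)   # ['10.0.0.0/8', '192.0.2.0/24']
```

The example prefixes are private and documentation ranges chosen for illustration.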

Samantha Billington, ISC, current maintainers of the
IRR ToolSet; would love to re-do it if there
were some backing behind it.  Currently trying to
develop peering tools to make keeping track of peerings
easier.

Randy Bush, IIJ, peers at the SIX using a perl script
to generate filter lists around peval; it's a decade
later, and we're still hacking more tools on top of
the same broken data.  It's time to do something new;
doing the same thing and expecting a different outcome
is the definition of insanity.
He says let's just give up, and change.
Todd asks about SBGP/SO-BGP; they haven't caught on,
and perhaps never will.
But in the meantime, Randy notes it will let you build
filters that are at least rigorously verifiable.  And
it'll be a backend-only change in the future to drop
it in for the IRR.  It'll be downhill,
incremental work at that point.
Randy doesn't think it makes sense to continue forward
at all.  Let's stop for a year and redo things on better
course before another ten years goes past.

Randy offers to show Todd his ass...the audience groans,
and we return to a more serious note.

This is quicksand, let's get out of the quicksand.
it's just an issue of going in the wrong direction.
If you don't mind doing work twice, OK; but if you do,
then save your time and energy, and let's focus on a
PKI system.

So, let's do PKI now, and we add this into the tools
to jump start the process.
You get a list of signed objects, you traverse the
list of signed attestations that you receive, you
verify up the chain to IANA, and verify that the
chain is intact.
For Addresses and ASes, IANA is the root.
For the identities of the ISPs and RIRs, may not
be a single root.
IRRs aren't secure enough to use for the database
system.  It's not transport free?
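
A minimal sketch of the chain walk described above: start from the holder of a prefix and follow issuer links up to IANA, checking at every hop that the prefix sits inside what that party was delegated. The holder names and prefixes are made up, the data structure is hypothetical, and signature checking is deliberately left out; a real system would verify a cryptographic signature at each hop as well.

```python
import ipaddress

# Toy attestations: holder -> (issuer, prefixes delegated to holder).
# Names and prefixes are made up; signature checking is deliberately omitted.
ATTESTATIONS = {
    "IANA": (None, ["0.0.0.0/0"]),
    "RIR-X": ("IANA", ["10.0.0.0/8"]),
    "ISP-A": ("RIR-X", ["10.20.0.0/16"]),
}

def covered(prefix, parents):
    """True if prefix is equal to or inside one of the parent prefixes."""
    net = ipaddress.ip_network(prefix)
    return any(net.subnet_of(ipaddress.ip_network(p)) for p in parents)

def verify_chain(holder, prefix, root="IANA", attestations=ATTESTATIONS):
    """Walk issuer links up to the root, checking the prefix at every hop."""
    seen = set()
    while True:
        if holder in seen or holder not in attestations:
            return False                 # loop or broken chain
        seen.add(holder)
        issuer, prefixes = attestations[holder]
        if not covered(prefix, prefixes):
            return False                 # holder was never delegated this space
        if issuer is None:
            return holder == root        # chain must end at the trusted root
        holder = issuer
```

A filter generator could then accept a route only when verify_chain() succeeds for the announcing party, which is what makes the resulting filters verifiable.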

LDAP would very likely be the storage/retrieval
system.  Let's start coding it; RFC 2585 for queries,
it's over HTTP, lets you talk to PKI certificate
abstraction.  Need to cover changes to certs needed
to cover address space and ASNs.  Can RFC 2585 cover
certificate modification, though?
Needs

Andy Putnams, Intelliden, config challenge will
exist regardless of the pki; how do you ensure that
you don't have multiple, overlapping entries?

RAS notes that if we could handle PKI inside the
IRR, it would jumpstart the data.

Chris asks why we can't start doing verifications
with the existing infrastructure?
Randy's not saying throw away the existing
infrastructure; just pause on chasing it down,
and change direction for the future.

RPSL is a huge overkill, 90% of it isn't used at
all, would we want to put PKI in there?

Sandy Murphy, Sparta.  Could we use the PKI to
prevent people from adding or removing routes
that we don't want touched?  Could the people
RUNNING the IRR check before you stick it in;
then we also check on the way out.

RFC 2725--her name is on it, but it's really
Curtis Villamizar's proposal; it does allow
for the web of trust between systems to start
with.

Now onto Flamingo; Manish will walk through a
live demo off his netflow data.
flamingo.merit.edu:4444

it's more for security than traffic engineering;
for those of us doing traffic engineering, though
might still be of some use.

Gives a large overview when we start it up, use the
minimum threshold to filter out background noise.
Traffic volume based on source AS view is somewhat
interesting.
Ooops, neighbor AS, actually.  (or origin, depending
on your router setting)
quad tree works the same for AS number, just a 16 bit
number, not a 32 bit address.
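
A minimal sketch of a quad-tree style mapping from a number to a position on a square: each pair of bits, most significant first, picks one of four quadrants, so the same function serves a 16 bit AS number or a 32 bit address. The function name and the quadrant order are assumptions; Flamingo's actual layout may differ.

```python
def quadtree_xy(value, bits):
    """Map a number to (x, y) on a 2^(bits/2) square grid.

    Each pair of bits, most significant first, picks one of four quadrants.
    The quadrant order (high bit of the pair -> y, low bit -> x) is an assumption.
    """
    if bits % 2 or not 0 <= value < (1 << bits):
        raise ValueError("need an even bit width and a value that fits in it")
    x = y = 0
    for shift in range(bits - 2, -1, -2):
        pair = (value >> shift) & 0b11
        x = (x << 1) | (pair & 1)    # low bit of the pair refines x
        y = (y << 1) | (pair >> 1)   # high bit of the pair refines y
    return x, y
```

A 16 bit AS number lands on a 256 x 256 grid and a 32 bit address on a 65536 x 65536 grid; numerically adjacent values share leading bits, so they end up in the same quadrants.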
collector is based off flow tools, uses similar
file format.
Can pause updates so you don't lose current data
you're about to zoom in on, for example.

last view is combined source, dest, port, etc.
everything sampled by netflow all in one window.
sliders are a little crude, would be nice to have
text entry boxes as well as slider bars.

You can also list specific networks you care about
that you want to see data for.
So you can see traffic sourced from a given subnet.
Every bar will then be a /32 hostname, will show
traffic volumes for each; can shift zoom to limit
what is on the display.

Can do same visualizations of darknet space.

They run a darkspace network off an unnumbered
interface, and generate flow data from the
backscatter seen there.
Lots of scans from lots of people.  Last 10 seconds
or last 2 seconds, very dynamic.

so the vertical here is walking across ports.
single source port hitting lots of destination
ports.

Can also playback feeds from netflow collector
format, etc.

The platform is displaying pretty well at about 5k
records/second, but scaling up for a large content
provider would be tough.

Most of the code is written in-house, on Linux; may
port to other platforms later.
Running on a 3.6GHz P4 right now, with a
256MB nvidia graphics card, no other real specs right
now.  Heavy dependencies on OpenGL, so no licensing
for that.  Software tool is largely GPL free.
Availability is an issue right now.  In trials, happy
to do more trials, that's where they'll leave it for
now.
Manpower issues, no QA, no support, so not sure if
they can really release it to the public yet.

Are there additional visualizations that might help?

On the ops side, which of these ways of looking at
this would help the poor folks who run NOCs?  What
about saving profiles the NOC can compare against
to recognize quickly "bad things".

Mike Hughes (I think) says it's not really a
first-alert type tool, it's a second or third level
escalation tool; too hard to interpret for a front
line NOC person.
They find it most useful for playback, rather than
realtime 'spot the attack'.

Chris--scaling for netflow is always a challenge.  He
has 9 routers reporting 50,000 flows/second; he has 300
others to add.  So the scaling part is a challenge.
The SOC might find it useful.
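
A rough worked estimate of the scaling gap, using only the figures quoted in these notes. It assumes the 50,000 flows/second is the total across the 9 routers and that the 300 additional routers would each export at a similar rate; neither assumption was stated in the session.

```python
routers_now, flows_now = 9, 50_000       # figures quoted in the notes
routers_to_add = 300
display_rate = 5_000                     # records/second the demo handled well

per_router = flows_now / routers_now                      # about 5,556 flows/s each
projected = per_router * (routers_now + routers_to_add)   # about 1.7 million flows/s
shortfall = projected / display_rate                      # about 343x the demo rate
```

Under those assumptions the full deployment would be a few hundred times the rate shown in the demo, so sampling or aggregation in front of the display would be needed.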

It can use pre-existing netflow/flow-tools archive files,
server can play those back.

Chris Morrow would just like to see some rough stats
on what it could support.

Mike Hughes, runs a layer 2 environment; could it be
sFlow and layer 2 aware?

Walt Prue asks if he could point at a line to see
info about it.  On the feature list.

Matthew Petach