RE: NANOG Digest, Vol 21, Issue 72

19 Oct 2009

      -----Original Message-----
From: nanog-request@nanog.org [mailto:nanog-request@nanog.org]
Sent: Monday, October 19, 2009 3:32 PM
To: nanog@nanog.org
Subject: NANOG Digest, Vol 21, Issue 72

Send NANOG mailing list submissions to
        nanog@nanog.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://mailman.nanog.org/mailman/listinfo/nanog
or, via email, send a message with subject or body 'help' to
        nanog-request@nanog.org

You can reach the person managing the list at
        nanog-owner@nanog.org

When replying, please edit your Subject line so it is more specific than "Re: Contents of NANOG digest..."

Today's Topics:

   1. UPDATE: NANOG 47 PGP signing party. (Joel Jaeggli)
   2. IPv6 Allocations (Esposito, Victor)
   3. Science vs. bullshit (Patrick W. Gilmore)
   4. Webcasts of NANOG47 (Justin Shore)
   5. Re: IPv6 Allocations (Simon Perreault)
   6. Re: IPv6 Allocations (Nathan Ward)
   7. Re: NetFlow analyzer software (Rubens Kuhl)
   8. RE: Cisco VSS-1440 migration query (Mishka, Jason)
   9. RE: NetFlow analyzer software (Jeffrey Negro)
  10. 2009.10.19 NANOG47 Monday notes, second half (Matthew Petach)

----------------------------------------------------------------------

Message: 1
Date: Mon, 19 Oct 2009 11:47:13 -0700
From: Joel Jaeggli <joelja@bogus.com>
Subject: UPDATE: NANOG 47 PGP signing party.
To: NANOG list <nanog@nanog.org>
Message-ID: <4ADCB431.1080502@bogus.com>
Content-Type: text/plain; charset=UTF-8

The second session of the NANOG 47 PGP key signing party will be during the Tuesday morning break (11:00 - 11:30) in the Desoto Foyer.

If you wish to participate in the pgp keysigning there is still time to add your key to the keyring at:

http://biglumber.com/x/web?ev=97301

Then come to the last session with some form of government issued photo ID.

If you have any further questions, feel free to contact me via email or corner one of the people with the pgp signing dots since they mostly know the score.

Thanks
Joel

------------------------------

Message: 2
Date: Mon, 19 Oct 2009 15:01:37 -0500
From: "Esposito, Victor" <Victor.Esposito@deltacom.com>
Subject: IPv6 Allocations
To: <nanog@nanog.org>
Message-ID:
        <F2F18D9E67CC624085F34A28E29D1B2807464A3F@EXCHANGEPOST01.its.local>
Content-Type: text/plain;       charset="US-ASCII"

Since there is a lot of conversation about IPv6 flying about, does anyone have a document or link to a good high level allocation structure for v6?

It seems there are 100 different ways to sub allocate the /32, and I am trying to find a simple but scalable method...

Thanks!

Victor Esposito

------------------------------

Message: 3
Date: Mon, 19 Oct 2009 16:05:15 -0400
From: "Patrick W. Gilmore" <patrick@ianai.net>
Subject: Science vs. bullshit
To: NANOG list <nanog@nanog.org>
Message-ID: <E47F5FB6-2FAB-4886-8A9D-1CF2B7A8961F@ianai.net>
Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes

Lightning talk followup because I want to make sure there was not a miscommunication.  A two sentence comment at the mic while 400+ of your not-so-close friends are watching does not a rational discussion make.

The talk in question:

    <http://nanog.org/meetings/nanog47/presentations/Lightning/Cowie_Recession_li...
...
The disagreement is whether Renesys can reliably find out how many transit providers an AS has.  Remember, we are discussing transit providers here, not peers.

My point is that if an AS has _transit_, then it must be visible in the global table (assuming a reasonably large set of vantage points), or it would not be transit.  Of course, this is not perfect, but it is a pretty close approximation for fitting curves over 10s of 1000s of ASes.  So cases like "I have two transit providers, and one buys transit from the other" are few in number and not relevant to fitting curves.  (It also means you are an idiot, or in a corner of the Internet where you should probably be considered as having only one provider.)

Majdi has pointed out other corner cases where transit is not viewable through systems like Renesys.  For instance, announcing prefixes to Provider 2 with a community to local-pref the announcement below peer routes.  That means only one transit is visible in BGP data.

There were several reasons some of us did not think edge cases like this were important.  For instance, Renesys keeps -every- update ever, so if Provider 1 ever flaps, Renesys will see Provider 2.  Also, when looking for the number of providers, a "backup path" may not be relevant since no packets take that path.

More importantly, I thought the point of the talk was to show that the table was growing during the recession and people were still getting more providers.  The result is a curve, not a hard-and-fast number.  Corner cases like the one above are barely noise, so the curve is still valid.

It is true that finding peering edges with things like route-views is problematic at best, so finding ASes with one transit plus peering might be problematic.  But since I do not think that was the point of the talk, I do not consider that a problem.

If anyone who still thinks the problems with finding transit edges somehow make the talk 'bullshit' could clarify their position, I would be grateful.

--
TTFN,
patrick

------------------------------

Message: 4
Date: Mon, 19 Oct 2009 15:06:06 -0500
From: Justin Shore <justin@justinshore.com>
Subject: Webcasts of NANOG47
To: NANOG <nanog@merit.edu>
Message-ID: <4ADCC6AE.7070107@justinshore.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Does anyone know if there will be video streams of the events from rooms
other than what's in the Grand room?  For example I would like to see
the ISP Security Track BOF or the one tomorrow on Peering.  I don't see
a way to select those specific feeds though.

Thanks
  Justin

------------------------------

Message: 5
Date: Mon, 19 Oct 2009 16:23:42 -0400
From: Simon Perreault <simon.perreault@viagenie.ca>
Subject: Re: IPv6 Allocations
To: nanog@nanog.org
Message-ID: <4ADCCACE.6060509@viagenie.ca>
Content-Type: text/plain; charset=ISO-8859-1

Esposito, Victor wrote, on 2009-10-19 16:01:
> Since there is a lot of conversation about IPv6 flying about, does
> anyone have a document or link to a good high level allocation structure
> for v6?

See RFC 3531 and here:

http://www.ipv6book.ca/allocation.html

Simon
--
DNS64 open-source   --> http://ecdysis.viagenie.ca
STUN/TURN server    --> http://numb.viagenie.ca
vCard 4.0           --> http://www.vcarddav.org

------------------------------

Message: 6
Date: Tue, 20 Oct 2009 09:25:22 +1300
From: Nathan Ward <nanog@daork.net>
Subject: Re: IPv6 Allocations
To: NANOG <nanog@nanog.org>
Message-ID: <9B2A035D-901A-442A-9DF3-506F76DD32CE@daork.net>
Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes

On 20/10/2009, at 9:01 AM, Esposito, Victor wrote:
> Since there is a lot of conversation about IPv6 flying about, does
> anyone have a document or link to a good high level allocation
> structure for v6?
> It seems there are 100 different ways to sub allocate the /32, and I
> am trying to find a simple but scalable method...

This discussion has been done a bunch of times.

Here is my scheme, which has been adopted (sometimes with small
modifications) by quite a few providers I have spoken with.
http://mailman.nanog.org/pipermail/nanog/2009-August/012681.html

Read the whole thread because there was a bit of confusion.

--
Nathan Ward

------------------------------

Message: 7
Date: Mon, 19 Oct 2009 18:37:43 -0200
From: Rubens Kuhl <rubensk@gmail.com>
Subject: Re: NetFlow analyzer software
To: Nanog <nanog@nanog.org>
Message-ID:
        <6bb5f5b10910191337x46f52447q6c43eabe5d4f49b@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Manage Engine flow receiver with no user sessions viewing statistics
runs at 100% CPU for 200+ Mbps unsampled traffic. It's suited to SMBs
only.

Rubens

On Mon, Oct 19, 2009 at 4:36 PM, Michael J McCafferty
<mike@m5computersecurity.com> wrote:
> ManageEngine's product is the one that kills browsers because you
> can't tell it to list the top X ASNs.
>
> ------Original Message------
> From: Brian R. Watters
> To: nanog@nanog.org
> Cc: nanog@nanog.org
> ReplyTo: Brian R. Watters
> Subject: RE: NetFlow analyzer software
> Sent: Oct 19, 2009 10:45 AM
>
> We have used this product with great success and it's reasonable in
> pricing and well supported.
>
> http://www.manageengine.com/products/netflow/index.html
>
> BRW
>
> -----Original Message-----
> From: mark jackson [mailto:markcciejackson@gmail.com]
> Sent: Monday, October 19, 2009 10:47 AM
> To: mike@m5computersecurity.com
> Cc: nanog@nanog.org
> Subject: Re: NetFlow analyzer software
>
> WhatsUp Gold
> Mrtpg
> Scrutinizer
> Nagios
> Riverbed Cascade
> Solarwinds
>
> Sent from my iPhone
> Please excuse spelling errors
>
> On Oct 19, 2009, at 10:43 AM, "Michael J McCafferty"
> <mike@m5computersecurity.com> wrote:
>
> All,
> I am looking for decent netflow analyzer and reporting software
> with good support for AS data.
> ManageEngine's product crashes or locks up my browser when I try to
> list/sort the AS info because it's too large of a list and there is
> no way to tell it to show just the top x results.
> Plixer's Scrutinizer, while it seems like it's a pretty decent
> product, is no longer supporting Linux... We are a Linux shop
> (servers, desktops, laptops).
> What else is there that I might want to look at?
> Thanks!
> Mike
> M5Hosting.com
>
> Sent from my Verizon Wireless BlackBerry

------------------------------

Message: 8
Date: Mon, 19 Oct 2009 17:04:14 -0400
From: "Mishka, Jason" <Jason.Mishka@UToledo.Edu>
Subject: RE: Cisco VSS-1440 migration query
To: <nanog@nanog.org>
Message-ID:
        <A3E7037859B34541B7271243CA11EF804E8021@MSG01CV02.utad.utoledo.edu>
Content-Type: text/plain;       charset="US-ASCII"

On Mon, 2009-10-19 at 13:06 -0400, Jason Giles wrote:
> From my test, all physical interface configs on switch 2 are factory
> defaulted and SVI interfaces deleted on switch 2 upon running the
> conversion commands.

When you convert to VSS mode, the interfaces are renamed.  The interface
in switch 2 that was g1/1 becomes 2/1/1.  Any configuration applied to
g1/1 will be rejected because that interface no longer exists.  If you
intended to keep interface configuration, you will need to reapply that
to the new interface name.
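
A minimal sketch of that renaming, for anyone scripting the reapplication of saved interface configuration. The function name and the assumption that old names look like a type prefix followed by slot/port (as in g1/1) are hypothetical; the sketch keeps the type prefix and only inserts the switch number in front of slot/port.

```python
import re


def vss_interface_name(old_name: str, switch_number: int) -> str:
    """Map a standalone slot/port name such as 'g1/1' to switch/slot/port form."""
    # Assumed name shape: optional letters (interface type), then slot/port.
    match = re.fullmatch(r"([A-Za-z]*)(\d+)/(\d+)", old_name.strip())
    if match is None:
        raise ValueError(f"unrecognized interface name: {old_name!r}")
    prefix, slot, port = match.groups()
    return f"{prefix}{switch_number}/{slot}/{port}"


if __name__ == "__main__":
    # Interface g1/1 on the chassis that becomes switch 2.
    print(vss_interface_name("g1/1", 2))
```

Saved configuration for the old name would then have to be reapplied under the new name, as described above.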

Jason

------------------------------

Message: 9
Date: Mon, 19 Oct 2009 17:07:30 -0400
From: "Jeffrey Negro" <jnegro@billtrust.com>
Subject: RE: NetFlow analyzer software
To: "Rubens Kuhl" <rubensk@gmail.com>,  "Nanog" <nanog@nanog.org>
Message-ID:
        <3C5B084431653D4A9C469A22AFCDB5B903F65284@LOGAN.billtrust.local>
Content-Type: text/plain;       charset="iso-8859-1"

Yes, my experience was the same with ManageEngine.  Although, they do have an article buried in their archives that shows how to tweak the MySQL and Java memory settings on start of the app.  We found that helped a bit.  We were successfully using it for netflows from more than 100Mbps, so I would say it can handle a bit more than typical SMB traffic.

I don't know if anyone mentioned it, but a good commercial product a former customer of mine used to use was Solarwinds Orion.

Jeffrey

------------------------------

Message: 10
Date: Mon, 19 Oct 2009 14:32:01 -0700
From: Matthew Petach <mpetach@netflight.com>
Subject: 2009.10.19 NANOG47 Monday notes, second half
To: nanog@nanog.org
Message-ID:
        <63ac96a50910191432s74c07549vc680202632de5239@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Here are my notes from the second half of NANOG today.

Now off to Beer and Gear.  :)

Matt

2009.10.19 NANOG 47 Monday notes, part 2

Mike Hughes starts things off after lunch at
1436 hours Eastern time.

Few bits of administrivia still.
If you want to submit a lightning talk, you can do
it up until 7pm today.

Please vote for the committee members!

PC nominations close this evening as well;
if you'd like to be on it, do that as well,
as much help as possible is needed.

3 lightning talks next up.

First up is Ernest McCracken
http://netlabs.cs.memphis.edu

NetViews: real time visualization of
Internet Path Dynamics for Network Management

Started doing this as part of his undergrad work.

Goal was to help researchers visualize network paths.

Topology mapping tools typically try to represent
internet architectures.
Scatter, skitter, Rocket Fuel, CAIDA,

why graph in realtime?
monitor realtime reachability
spot anomalous depeering
identify route hijacking and misconfigurations

developing next-gen routing monitor system.
BGPMon -- realtime lightweight BGP monitor
  with over 70 peers--allows for fast updates
NetViews - visualizes both control plane paths
  (via BGP updates) and forwarding paths (via
  active probing)

BGPMon is running, you connect to it, get the
routing updates; data broker sends BGP updates.
Prober probes target network from BGP peers to
get path updates.
GeoCoder and IP crawler get geographic info,
and traceable IPs for probing.

Slide showing data pathway

They probe during routing events; a timeline
showing BGP updates during the timeline.  They
keep probing until they see no additional updates.

Visualization filters to show networks based on the
number of ASes an AS connects to.

You can see the updates scroll in realtime on the
live map as the updates come into the system.

Blue is path additions/changes, Red for changes.

They can also visualize forwarding paths, but
there's challenges in inferring forwarding paths
based on traceroutes.

Future work:
correlate forwarding and routing dynamics to create
 a classification model for internet paths
add scalability by having clients run traceroute jobs
 in a P2P fashion
Give client users the ability to communicate with each
 other

Funded by NSF, and collaborating with UCLA, ColoState
and UofO on BGPMon system.

Any questions?

Q: Dave Meyer--can it be run internally?  What
infrastructure do you need?
A: Server portal runs in lab, clients can run on
any Java client.
Sync up with him afterwards if you have any
summer internships available.

Next talk
Jim Cowie from Renesys
The recession and the routing table
Reading the tea leaves

They dig into the routing tables to see what's happening.

Tough times, tough questions
We know that internet transit purchases are sensitive
to business conditions (2000 crash)
 is the 2008-2009 recession affecting growth in
  the global/regional routing tables

Should be some sign of pullback in the routing tables
like in 2000.

3 years of North American routing--it's still going up,
there's no depression visible.

Why did the table keep growing?
Enterprises don't cut costs by leaving the internet,
they cut costs by reducing diversity
cheap transit getting cheaper acts like "easy money"
prospect of v4 runout may result in "use it or lose
it" addition of routes into table.

Half the table is just hanging out with 1 provider.

Number of prefixes with 4 or more providers is going
up.

The 1 provider networks either go to no-longer-advertised
or shift to 2 or 3 providers.

More go to the "no-longer-seen" pool; fewer upgrade
to the next category up.
People postpone getting to multihoming.

Triple-homing seems to be sweet spot.

4 or more provider pool is getting larger
and more stable over time; you don't tend
to decrease over time.

Global recession might give more of a break
before v4 exhaustion
Cheap transit killed that theory
some evidence of single- and dual- homed customers
putting off the move to higher order multihoming
 in 2007 and 2008
"obviously practicing for IPv6 transition, after which
 apparently multihoming becomes unnecessary"
Otherwise, growth continues apace
Bring on the post-IPv4 marketplace!

Q: Randy Bush--BGP is a great data hiding system;
it doesn't tell you much about the real topology
of the internet.  How do you determine that a prefix
has a single upstream?
A: ask him afterwards.

Q: is this transit AS?
A: Yes
Q: You have to have seen the AS through another AS,
 that's how you can count the upstreams.
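
A rough sketch of the counting idea in that last comment, assuming AS paths as seen from a set of vantage points with the origin AS last. The function name and the sample AS numbers (taken from the documentation range) are hypothetical, and the AS just before the origin is only an approximation of a transit provider, since it could also be a peer.

```python
from collections import defaultdict


def upstreams_by_origin(as_paths):
    """Return {origin AS: set of ASes seen immediately before it in any path}."""
    upstreams = defaultdict(set)
    for path in as_paths:
        # Collapse AS-path prepending (consecutive repeats of the same AS).
        collapsed = [asn for i, asn in enumerate(path) if i == 0 or asn != path[i - 1]]
        if len(collapsed) >= 2:
            origin, neighbor = collapsed[-1], collapsed[-2]
            upstreams[origin].add(neighbor)
    return dict(upstreams)


if __name__ == "__main__":
    paths = [
        [64500, 64501, 64510],
        [64502, 64503, 64510, 64510],  # origin prepends itself once
        [64500, 64501, 64510],
    ]
    for origin, seen in upstreams_by_origin(paths).items():
        print(origin, "appears behind", len(seen), "upstream ASes:", sorted(seen))
```

An origin that is only ever seen behind one neighbor would fall into the "1 provider" bucket; a backup provider that never shows up in any path stays invisible to this kind of count.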

Joe Abley up to the front from ICANN
DNSSec for the Root Zone
Matt Larson, from VeriSign.

Info update for those who care about DNSSec
collaboration between ICANN and VeriSign with DoC

ICANN is IANA functions operator

Manages the Key Signing Key
Accepts DS records from TLD operators
Verifies and processes request
Sends update requests to DoC for authorization
 and to VeriSign for implementation

DoC NTIA
Authorizes changes to the root zone
 DS records
 Root key sets
 DNSSec updates

VeriSign
manages the zone signing key

Proposed Approach to protect the KSK
CPS--certificate practice statement
DPS, DNSSEC policy and practice statement
basically, to assure people the practices are
adequate to protect it.

community trust
proposal that community representatives have an active
role in management of the KSK
 as crypto officers needed to activate the KSK
 as backup key share holders protecting shares of the
  symmetric keys in case of disaster recovery

Auditing and Transparency
Third-party auditors check that ICANN...
webcast of sessions

KSK is 2048 bit RSA key
rolled every 2-5 years
RFC5011 for automatic key rollovers
propose using signatures based on SHA-256
 but there's no shipping code based on this

Zone signing Key (held by verisign)
ZSK is 1024-bit RSA
 rolled once a quarter
SHA-256 signature

Signature validity
RRSIG validity 15 days
 resign every 10 days
Other RRSIG validity 7 days
 resign twice a day

Key Ceremonies
Key Generation
 Generation of new KSK
 Every 2-5 years
Processing of ZSK signing request (KSR)
 signing ZSK for the next upcoming quarter

Root Trust Anchor
published on a web site by ICANN as
XML-wrapped and plain DS record
 to facilitate automatic processing
PKCS#10 certificate signing request

Roll Out
incremental roll out of the signed root
 groups of root server "letters" at a time
watch the query profile to all root servers
 as roll out progresses
Listen to community feedback for any issues

No validation
Real keys will be replaced by dummy keys
 while rolling out the signed root
signatures not valid during roll out
actual keys will be published at end of rollout

Timeline
December 1, 2009
 root zone signed
  initially signed zone stays internal to ICANN and Verisign
Jan-July 2010
 incremental roll out of signed root
July 1, 2010
 KSK rolled out
 root trust anchor

ISP Security BOF later today will talk about it.

Full architectural documents around the process will
be published in the next few weeks.

Next speaker is Paul Francis, talking about
Virtual Aggregation.

Reducing FIB Size with Virtual Aggregation (VA)
ISPs often want to extend the life of old routers
Routers that have inadequate FIB but otherwise are
 still useful

A common approach--use old routers as customer PE,
 default to core

Other FIB/RIB shrinking tips

Filter out more specific routes

For lower-tier ISPs, default to transit ISPs
 ie use 0/0 and load balance among transit ISPs
BUT
 leads to non-optimal routes
 lots of configuration (peer routes, "important" routes
  like Google)
 Can't be used by transit ISPs themselves

Mitigating non-optimal default routes
Use more-specific "semi-defaults"
AS3303 Swisscom IP-Plus
 point 62/8, 80/7, 21/7, etc. to EU transit ISP
 ARIN space to US transit
 class B 128/3, 160/5, 168/6 to US transit
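
A small sketch of how such semi-defaults behave under longest-prefix match. The table below reuses a few of the prefixes from the slide; the next-hop labels and the function name are hypothetical placeholders, not Swisscom's configuration.

```python
import ipaddress

# Hypothetical semi-default table: prefix -> transit label.
SEMI_DEFAULTS = {
    "0.0.0.0/0": "default-transit",
    "62.0.0.0/8": "eu-transit",
    "80.0.0.0/7": "eu-transit",
    "128.0.0.0/3": "us-transit",
}


def pick_next_hop(destination, routes=SEMI_DEFAULTS):
    """Return the next hop of the longest matching prefix, or None if no match."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in routes.items():
        network = ipaddress.ip_network(prefix)
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    return best[1] if best else None


if __name__ == "__main__":
    for dest in ("62.1.2.3", "81.0.0.1", "130.1.1.1", "8.8.8.8"):
        print(dest, "->", pick_next_hop(dest))
```

A handful of short prefixes like this steers most traffic toward a transit provider on the right continent without carrying the full table.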

IETF working on a more general solution: virtual agg
GROW working group
draft-ietf-grow-va-00
-va-gre-00
-va-mpls-00
-va-perf-00

VA is a way to control FIB size in routers
 DFZ FIB, not VPN tables
 does not shrink RIB size
Tight control of FIB size for any or all routers
 no coordination between ISPs
 works with legacy routers

Important today--possibly critical tomorrow?
looking forward, BGP RIB growth rate could increase
 substantially
 exhaustion of v4 erodes aggregation
 because of pressure to shrink default prefix size
 uptake of v6
VA can help ease these pressures

VA not perfect
Requires configuration of its own
Entails a traffic load/FIB size tradeoff
 which can be quite good
 academic study on large transit ISP
  10x fib reduction with negligible latency/load
   penalty
But in general we don't know how easy to achieve
 this--
  configuration...

Why this talk?
You can help us define VA
 certain protocols or configuration details
 alternative ways to deploy
 or tell us that VA is useless

encourage your vendor to implement VA
 current implementations from Huawei and ??

VA Basic Idea
Define "Virtual Prefixes" (VP)
 These are shorter (bigger) than real prefixes
 think of /6s, /7s, /8s
Assign different routers to be "responsible" for
 different virtual prefixes
  ie, they need to know how to route everything in the VP

FIB-suppression
BGP runs as normal
 all routers have full RIB
 important to not muck with BGP operation per se
suppress updates to FIB for more specifics of
 virtual prefixes

APR (aggregation point router) for 22/8
originate route to 22/8 with nexthop being itself
it FIB-installs all sub-prefixes within 22/8

other routers FIB-suppress all prefixes within 22/8
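
A toy model of the FIB-suppression step, assuming the RIB is a simple prefix-to-next-hop mapping. The function name, the argument layout and the "apr1" tunnel label are hypothetical; this only illustrates the idea from the talk and is not taken from the GROW drafts.

```python
import ipaddress


def build_fib(rib, virtual_prefixes, apr_for=()):
    """Build a FIB from a full RIB under virtual aggregation.

    rib:              {prefix: next_hop} for every route the router knows
    virtual_prefixes: {virtual_prefix: tunnel to the APR responsible for it}
    apr_for:          virtual prefixes this router itself is the APR for
    """
    vps = {ipaddress.ip_network(vp): hop for vp, hop in virtual_prefixes.items()}
    own = {ipaddress.ip_network(vp) for vp in apr_for}
    fib = {}
    # Non-APR routers keep one entry per virtual prefix, pointing at the APR.
    for vp, apr_hop in vps.items():
        if vp not in own:
            fib[str(vp)] = apr_hop
    for prefix, next_hop in rib.items():
        net = ipaddress.ip_network(prefix)
        covering = [vp for vp in vps if net.subnet_of(vp)]
        if covering and not any(vp in own for vp in covering):
            continue  # FIB-suppressed: the virtual prefix entry covers it
        fib[str(net)] = next_hop
    return fib


if __name__ == "__main__":
    rib = {"22.1.0.0/16": "peer-a", "22.2.0.0/16": "peer-b", "10.0.0.0/8": "peer-c"}
    vps = {"22.0.0.0/8": "apr1"}
    print("ordinary router:", build_fib(rib, vps))
    print("APR for 22/8:   ", build_fib(rib, vps, apr_for=["22.0.0.0/8"]))
```

Both routers still hold the full RIB; only the set of entries pushed down to the FIB differs.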

This just tunnel-maps from one router to another
out to the egress point.

The only router with the need to know how to route
that packet was APR1 (well, that, and the ingress
router)

The packet takes a bit of a longer path to do
this with simple aggregates.

You can add "popular prefixes" to routers to point
them along "better" paths.

Types of tunnels defined
MPLS (using LDP)
GRE
...

A deployment example
Robert Raszuk at Cisco
shows a POP site with 4 PE customer agg routers,
2 Rs, 2 RRs;
core can use tunnels between them already.

Use RRs as APRs -- can optionally
FIB-install routes for which PE is egress

If you do FIB suppression at the RR layer
Then need to install popular prefixes at the PE
 layer--GROW looking to automate that part.

VA from our point of view
Figure out where you need FIB reduction
Based on this, design your deployment
 select VPs
 assign routers as APRs, configure
 configure "VIP-list"

New IETF GROW WG work item for FIB suppression

Q: Patrick, Akamai--this seems very complex; couldn't
we just take prefixes out of the FIB that are covered
by a shorter prefix with same next-hop; wouldn't that
be much easier to do, and save FIB space?  Could we
maybe ask vendors to look into doing that?
A: Lixia may have done some looking into that; she
says that two people on her team found that you
can compress your FIB by between 10 and 50%
by simply suppressing more specifics with the same
next hops.
She was going to give a talk on this at GROW at
the next meeting.
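
A minimal sketch of the simpler approach from that question: drop any FIB entry whose nearest covering prefix already has the same next hop. The function name and sample routes are hypothetical, the example is IPv4-only, and this is not the method from the GROW work, just an illustration of the idea.

```python
import ipaddress


def compress_fib(fib):
    """Drop entries whose nearest covering prefix has the same next hop."""
    entries = sorted(
        ((ipaddress.ip_network(p), hop) for p, hop in fib.items()),
        key=lambda item: item[0].prefixlen,
    )
    kept = []  # (network, next_hop) pairs, shortest prefixes first
    for net, hop in entries:
        # Find the longest already-kept prefix that covers this one.
        parent = None
        for kept_net, kept_hop in kept:
            if net != kept_net and net.subnet_of(kept_net):
                if parent is None or kept_net.prefixlen > parent[0].prefixlen:
                    parent = (kept_net, kept_hop)
        if parent is not None and parent[1] == hop:
            continue  # forwarding result is identical without this entry
        kept.append((net, hop))
    return {str(net): hop for net, hop in kept}


if __name__ == "__main__":
    fib = {
        "10.0.0.0/8": "A",
        "10.1.0.0/16": "A",   # same next hop as 10/8, can go
        "10.2.0.0/16": "B",
        "10.2.3.0/24": "B",   # same next hop as 10.2/16, can go
        "10.1.1.0/24": "C",
    }
    print(compress_fib(fib))
```

Forwarding behaviour is unchanged because every removed entry resolved to the same next hop as the entry that now covers it.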

Q: Doni from PeakWeb, was asking vendors for this
around the 200,000 routes in the FIB; the vendors
were wanting to simply sell more hardware.
Which routers need the full FIB in the drawing?
A: None of them need full routes.  Generally got
about 10X saving in all of them.

Q: Owen deLong--if you already have all the
routers everywhere, it might make sense; if you
have just 2 routers in a POP, this looks like
a distributed CAM load, to have multiple routers
pretend to look like one router
A: yes, it's like that.

Q: RAS--remember the 8k Foundry boxes?  They had 8k
CAM table, and their solution was to either have
just default, or break it up into /12s; this is
similar, it just limits based on number of next-hops
they have.  Could we get benefits from doing more
simple aggregation like that?

Q: Igor notes you can probably just upgrade for
cheaper than transferring all sorts of routes
back and forth and paying for additional interconnect
ports.

Q: Anton Kapella, have they considered looking at
Auto-TE QoS stuff internally?

If packets are being redirected around internally,
it does mean something for link-loading; how will
this interact with QoS, since this will transport
packets along links not originally planned for it?
In what they saw, very few packets used the
non-optimal paths.

Next up.
We'll do coffee break at 1615, BOFs at 1645

BGP# - a system for dynamic route control
in data centers

tenants and landlords
one landlord
 owner and manager of the datacenter
many tenants
 internal users
  search, email, gaming
 external users
 utility computing customers
empower tenants to control routing decisions

Routing tensions
tenants have different goals

tenant goal--spread traffic
or migrate traffic from one server to another

current system, tenant submits tickets to get routing
changed.
whole ticket flow is shown

Tenants have limited control over routing

A better system
allow for automated route control
allow tenants independent and safe route control
ensure scalability
allow for maintenance changes

BGP#
simple speakers (multispeaker)
 peer with BGP routers
 send route announcements/withdrawals (ECMP capable)
Stateful controller (controller)
 controls and coordinates speakers
custom API ("applications")

Application runs on tenant box; speaks to
controller via API; controller speaks to
multispeaker which peers with router to
send the update

to spread traffic, similar thing;
application uses API to ask controller which
asks multispeaker; it has 2 sessions to router, with
2 next hops for prefix.

Automated route control
controller API allows for custom applications
Application can automatically manage routes

Independent and safe route control
only allow a tenant to change their own prefixes.
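
A minimal sketch of that safety check, assuming the controller holds a table of which address blocks each tenant owns. The class, its methods and the tenant names are hypothetical and do not reflect the real BGP# API; the prefixes come from the documentation ranges.

```python
import ipaddress


class RouteController:
    """Toy controller that only accepts route changes inside a tenant's own blocks."""

    def __init__(self, tenant_prefixes):
        self._allowed = {
            tenant: [ipaddress.ip_network(p) for p in prefixes]
            for tenant, prefixes in tenant_prefixes.items()
        }
        self.announced = {}  # prefix -> set of next hops (more than one means ECMP)

    def _authorized(self, tenant, network):
        return any(
            network.version == block.version and network.subnet_of(block)
            for block in self._allowed.get(tenant, [])
        )

    def announce(self, tenant, prefix, next_hop):
        network = ipaddress.ip_network(prefix)
        if not self._authorized(tenant, network):
            raise PermissionError(f"{tenant} may not announce {network}")
        self.announced.setdefault(str(network), set()).add(next_hop)

    def withdraw(self, tenant, prefix, next_hop):
        network = ipaddress.ip_network(prefix)
        if not self._authorized(tenant, network):
            raise PermissionError(f"{tenant} may not withdraw {network}")
        hops = self.announced.get(str(network), set())
        hops.discard(next_hop)
        if not hops:
            self.announced.pop(str(network), None)


if __name__ == "__main__":
    ctl = RouteController({"search": ["192.0.2.0/24"], "email": ["198.51.100.0/24"]})
    ctl.announce("search", "192.0.2.0/25", "10.0.0.1")
    ctl.announce("search", "192.0.2.0/25", "10.0.0.2")  # second next hop spreads traffic
    print(ctl.announced)
```

In the design described in the talk, the controller would then hand the validated change to a multispeaker, which sends the actual BGP update to the router.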

Scalability
Multispeaker and controller not placed in machines
handling user traffic
 eliminates need for one policy controller per machine
 reduces peering sessions to router
eliminate per-ticket manual intervention

Resiliency
ensure system continues operating
instantiate multiple multispeakers
 single multispeaker failure doesn't affect other MS ability
 separate multispeaker and controller
prefix resiliency -- ensure prefix stays available
 announce same prefix from multiple multispeakers
  router retains prefix even if one MS fails

Automation service could deploy a new multispeaker
with same config if one dies.

No inconsistency with multiple Multispeakers
suppose some multispeakers become unresponsive
 BGP# listening tool detects the lack of router
 readvertisement
suppose multispeaker reboots and is in different state?
 get config and state from persistent store

Alternate approach
each tenant sets up its own BGP instances
 needs one session per machine
landlord may need to deal with many BGP peers

Conclusions:
Tenants have more power
Landlord retains responsibility for validation of routes.
system achieves stability and resiliency

Q: Francis asks if BGP is an awfully coarse-grained
tool to use for something like this--what about
using MPLS for setting up flows?
A: BGP finite state machine is much simpler to follow
and update.

We'll go into coffee break now; BOFs start at 1645
hours Eastern time.

SC elections, JUST DO IT!!
PC nominations open for 3 more hours!

1800 hours in Regency for Beer and Gear.

BOFs, Mobile Data Track, ISP Security BOF,
and DNS BOF will be upstairs in DeSoto room.

Tuesday we start at 0830 again with breakfast.

For now, I head over to the DNS thingie BoF

IPv6 and resolvers; how do we make it less
painful?
For most people, rolling out IPv6 can't break
IPv4, and separate hostnames aren't scalable
long term.

Per Google, 0.078% of users are impacted by
enabling quad A on machines.
Assuming a user base of 600M that's 470K users that
get broken, which isn't acceptable.

Right now, in browsers, IPv4 fallback is on the
order of 21 seconds to 181 seconds, which is for
most SLA numbers considered "broken"

Options?
Don't roll out IPv6
prefer A over AAAA
accept the breakage
what about checking for working IPv6 connectivity
 before sending back AAAA record.

Only way to know if user has working v6 transport
is if the AAAA request came via IPv6 instead of IPv4

Recursive servers need to be set to return
AAAA *only* when request came in via IPv6; otherwise,
return A record only.

Now, auth DNS server only has to worry about IPv6
reachability to the recursive server.

We've asked if ISC can write this; ISC has done
this, it'll be in BIND 9.7; it'll be in a second
beta coming out in early November; if you want to
check it out, if you're on user list or beta list,
you'll get notification; otherwise, check ISC
site in early november and it should be there.

Feature will be a knob you have to turn on.

There's an additional check put in; if DNSSEC
is set, it won't forge DNS answers unless there's
a knob set "BREAKDNSSEC" that you can turn on,
the knob is going to be very well documented.
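
A rough sketch of the policy as described so far, assuming answers are plain (type, value) pairs. The function and parameter names are hypothetical stand-ins and are not the actual BIND option names.

```python
def filter_answers(records, query_over_ipv6, *, enabled=True,
                   dnssec_requested=False, break_dnssec=False):
    """Return the records a recursive server would hand back under this policy.

    records: list of (rtype, value) tuples, e.g. ("AAAA", "2001:db8::1")
    """
    if not enabled or query_over_ipv6:
        # Knob is off, or the client reached us over IPv6: answer untouched.
        return list(records)
    if dnssec_requested and not break_dnssec:
        # Do not alter answers for DNSSEC queries unless explicitly told to.
        return list(records)
    return [(rtype, value) for rtype, value in records if rtype != "AAAA"]


if __name__ == "__main__":
    answer = [("A", "192.0.2.1"), ("AAAA", "2001:db8::1")]
    print(filter_answers(answer, query_over_ipv6=True))
    print(filter_answers(answer, query_over_ipv6=False))
    print(filter_answers(answer, query_over_ipv6=False, dnssec_requested=True))
```

The only signal used is the transport the query arrived on, which is why this belongs on the recursive server facing the customer.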

But if you've gone through the work of setting
up DNSSEC, you should know how to troubleshoot
it yourself.

This should be set up for resolvers facing
customers, not for internal services that have
odd configurations.

What about having an ACL for controlling
behaviour for different subnets?
If they fit in a view statement, you already
have that capability.

Will this be available within a view?  Yes,
you can do it there.

But the ACL idea is interesting, and could be
better than pushing people towards views.

This really goes on the recursive side.
We need to try to convince ISPs to use these
options.

If a request comes from a 6to4 address (the
source is a 6to4 address), do you respond with
AAAA or not?

How about ACLs with flexibility to see if it's
over v4 but from a 6to4 address to send different
responses.

A simple default policy is good, but the flexibility
is good for more experienced sites.

Simple on-or-off knob is in 9.7b2; more granular
control will be needed for later versions...

what about 6rd?  They would get no AAAA results
in that scenario.  We might need a DNSv6 option
for DHCPv4 which would be able to give back
the v6 DNS servers.

Should we put together an information draft
for IETF; we can draft one up, so the three
of them should talk;
Igor Gashinsky, Yahoo,
Larissa Shapiro, ISC,
Alain Durand, Comcast

OARC meeting, Beijing, Jason Fesler will be there
to talk about it.

ISOC meeting in Paris next week, we'll be there
to talk about it as well.

Internet2 joint techs talked about it as well.

General consensus is that this is a necessary
evil hack.
If we can get it working with 6rd, it'll be an
interesting working solution to a common problem.

What about shoving JavaScript code into web pages
to report DNS lookups back via REST infrastructure
to get an idea of the types of breakage?
OS, browser, IP, and which test cases break or not.
We could do a series of test queries, and see which
ones break or not.
Give A
Give AAAA
Result of the query comes into the beacon server, so
we can see if they saw the reply or not.
It would be better for the javascript to reply back
with what *it* saw, as well as see what the server
log sees.

There's some javascript coding challenges with
collecting this data.

If we can at least break it down to 3 buckets
6to4
teredo
native
it would really help pin down where the breakage is.
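
A small sketch of that bucketing, assuming the beacon server logs the client's IPv6 source address. The function names are hypothetical; the 6to4 and Teredo checks rely on the well-known 2002::/16 and 2001:0000::/32 prefixes as exposed by Python's ipaddress module.

```python
import ipaddress
from collections import Counter


def classify_ipv6(address):
    """Bucket an IPv6 source address as '6to4', 'teredo' or 'native'."""
    addr = ipaddress.IPv6Address(address)
    if addr.sixtofour is not None:   # inside 2002::/16
        return "6to4"
    if addr.teredo is not None:      # inside 2001:0000::/32
        return "teredo"
    return "native"


def bucket_counts(addresses):
    """Count how many logged addresses fall into each bucket."""
    return Counter(classify_ipv6(a) for a in addresses)


if __name__ == "__main__":
    seen = [
        "2002:c000:204::1",
        "2001:0:4136:e378:8000:63bf:3fff:fdd2",
        "2001:db8::1",
    ]
    print(bucket_counts(seen))
```

Anything that is neither 6to4 nor Teredo lands in "native" here, which would also sweep in other tunnels, so a real experiment would probably want more buckets.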

Do note that the percentage shown wasn't Yahoo
data, that was Google's data, so we don't have
that breakdown ourselves.

Would going to AS level be too specific for people?
We'd need to consider carefully privacy issues
and anonymizing the data as per our privacy
rules.

What about running an experiment in partnerships
for specific ASNs?

Point is, this is coming out, do share the data,
this will be going into mainline code release.
It's opt-in, defaulted off.

The *actual* names for the config options are...

This will apply to just the RR set, not the
glue set; if glue returns AAAA, it'll still
come back intact.

The tests are really testing recursive lookup
server to the last proxy device in front of
the client.

But what if the recursive server to auth server
breakage happens?

On the recursive server lookup side, can we
turn the knob on in the other direction?

This is an interesting challenge; we'll have to
see how much additional work this will need, and
how much additional funding will be needed to
cover for it.

ISC will...look into the feasibility of doing that.

The IETF draft cutoff is tonight for Paris, so
maybe it'll be done for the Anaheim one, at which
point we'll have working code out there, and a bit
more time for writing the draft.

We wrap up the BOF at 1724 hours Eastern time.

(what about a switch for auth servers that allows
for turning off "don't send AAAA records to ZZZZZ"?)

------------------------------

End of NANOG Digest, Vol 21, Issue 72
*************************************

Sieber, Chris