For those following the regulatory / net neutrality debate, the Canadian
Radio-television and Telecommunications Commission (CRTC) released this
morning a decision requiring additional transparency with respect to the
traffic management practices of Canadian service providers.
News Release:
http://www.crtc.gc.ca/eng/NEWS/RELEASES/2009/r091021.htm
Policy Details:
http://www.crtc.gc.ca/eng/archive/2009/2009-657.htm
Jeff Gallagher
Network Engineering
jeff.gallagher(a)bellaliant.ca
And last, but not least, here are the notes from the morning
part of the NANOG meeting. I strongly, STRONGLY suggest
people read Aaron's IPv6 deployment in a nutshell slides;
while I differ from him on some of the thoughts around address
allocation schemes for very large networks, for small to midsized
networks, it's a very, very good cookbook to follow for getting
IPv6 rolled out:
http://nanog.org/meetings/nanog47/presentations/Wednesday/Hughes_Kosters_fu…
Thanks to everyone for participating, both locally and
remotely!
Subsequent ARIN notes will be posted to the ppml list.
Matt
2009.10.21 NANOG47 Wednesday morning notes
Don't forget to fill out your survey!!
http://tinyurl.com/nanog47
Dave Meyer welcomes everyone back from their
hangovers at 0904 hours Eastern Time.
John Curran isn't here, so he misses his 13
minutes of fame, and we go straight over to
Mark Kosters.
Mark Kosters, IPv6: Emerging stories of success.
A stellar panel of people to talk about
transitioning to v6.
IPv4 is running out; in 2 years or so, we'll
no longer have a flow of addresses from IANA
to the RIRs.
But why isn't more traffic moving onto IPv6,
given the imminent runout? Still less than
1% of the overall traffic.
What do you need to do to make the move to v6,
from the enterprise and ISP viewpoint?
John B, Comcast
Matt R, ARIN
Owen DeLong, HE.net
Aaron Hughes, 6connect
John B from Comcast is up first
Native, dual stack core and access networks
started as means to leverage device management;
then moved to subscriber access service.
Backoffice, where applicable, also dual stack
Cable modems (DOCSIS) single stack v4 or v6
eMTAs remain v4
eSTBs targeted to support v4 or v6 only
Native dual-stack subscriber services
Leverage well known transition technologies to enable
enterprise desktop IPv6 connectivity.
Some of the backoffice pieces, like DHCP, are
still evolving.
This is a team organizational effort, so it takes
many pieces working together.
Core concept--initial key piece was device management.
Core network, access network, and back office systems
all have to work together, or the program fails.
So, they iteratively extend those three elements to
offer services over IPv6
Native is preferred whenever possible over tunnels
and other techniques; but sometimes it's just not
possible. There's still much learning to happen
in the area to figure out how best to make the
deployment happen.
Lessons Learned
IPv6 must become business as usual for staff from
every area of business
lack of attention here will be problematic for v6 deployment
deferring or avoiding IPv6 will be problematic.
it's really, REALLY important to do large scale testing
of interoperability, especially when you have millions
of devices. You test the key interconnect points where
devices interact, especially with high levels of
diversity in your gear.
Also leverage technologies that newer releases, like
DOCSIS 3.0, provide.
Find opportunities like that in your own environment.
Challenges
Need to manage the deployment of v6 relative to other
business needs.
channel bonding vs v6, which gets business priority,
for example
Security on v6 is still a challenge
vendors often say "but you're the only one who has asked
for that"
backoffice and tool upgrades to support IPv6 are
non-trivial
best approach is to divide these efforts into smaller
activities.
Very substantial chunk of work; don't underestimate
the challenges of this!
IPv6 data services for subscribers.
preferred approach is to offer native dual-stack v6
service to customers; v4 continues unchanged, just
adds v6.
Directly connected device that supports v6, or home
gateway device that supports v6.
all the support systems must support both models for
the rollout to work.
Most people in the room use a gateway device at home.
most home gateway devices don't support v6 yet,
so pushing the retail type devices to support v6
natively, off the shelf is a challenge.
Challenges associated with routing for delegated v6
prefixes should be uniformly addressed.
Support for v6 in many products is still considered
'new' and isn't as mature as v4.
testing and interoperability are critical for
successful deployment
bugs and issues will arise
scale makes a difference!
deploying IPv6 must not impact existing services
(this is pretty much true for everyone--can't break
existing customers!)
Content and Services
availability of content and services over IPv6 to date
appears to be lacking
simply having v6 connectivity isn't sufficient
John_Brzozowski(a)cable.comcast.com
Matt Ryanczak, network ops manager at ARIN
History of IPv6 @ARIN
They're a small, 50 person multihomed customer
network.
Their network has been running IPv6 since 2003,
with a beta Sprint circuit.
it was a T1 line, appeared native, but was tunneled
inside Sprint.
v6 internet wasn't well connected.
2004 Worldcom circuit, similar issue.
Started connecting to exchange points, got transit
there, and is now starting to be able to serve large
volumes of traffic.
In 2003, T1 line from Sprint--very ad hoc and beta,
used Linux Router, and OpenBSD firewall.
Completely segregated network. Not dual stack, too
many security issues at the time; was a bit afraid
of it at the time.
Path MTU discovery issues, packets just dying;
server MTU issues, upstream issues, great learning
process. Sprint circuit finally being decomm'd.
Sprint support was always really good.
2004, Worldcom circuit, part of Vint Cerf's test v6
network; real router this time, but OpenBSD firewall
still used. T1 into 2800 router. Duplicated the
services that were on Sprint link, provided a second
path to verify issues, see if the problem could be
duplicated or not.
Similar issues, PMTU discovery issues due to tunnels,
problems reaching chunks of Europe (problem for
serving DNS, for example)--good learning exercise.
2006, joined Equi6IX--beta at the time, completely free,
100Mb ethernet, transit via OCCAID, things started
to look like production network; still had firewall,
same services, but the service level got a lot better,
still segregated network, but many routing issues
went away, PMTU issues started to disappear. started
to dual stack.
2008: NTT/TiNet IPv6;
built two networks, one west coast, one east coast,
would host all public services out there, separate
provisioning side from public side.
1000Mb links to NTT/TiNet using ASR 1000 routers
Foundry LBs, IPv6 support was Beta. They're very
responsive, been issuing patches for them.
Now it's a full dual-stack network end to end,
and Foundry is still working with them to figure
out how to best support the traffic.
Whois is out there, DNS out there, figuring out how
to expand the services.
Traffic:
Whois, about 0.12%
DNS about 0.55%
WWW IPv6 about 8% traffic in 2009
Most of that is internal ARIN traffic, since they're dual
stacked internally. :D
Lessons learned
Tunnels are not desirable.
he.net tunnel at home is fine for home use, but using
them for production services, PathMTU discovery problems
are just a pain.
Not all transit is equal!
Routing is not as reliable as v4; people are still
learning, backbones aren't as good.
Dual stack isn't so bad, no security issues they're
aware of, stacks have gotten a lot better.
Proxies are good; use 6to4 proxies for current rwhois
servers, older routing registry (moving to v6 someday)
People fear 4byte ASN. They have people who can't
peer with them due to 4byte ASN. More people need to
get 4byte ASN code.
Native support is better.
DHCPv6 is not well supported. This really needs fixing.
Reverse DNS is a pain. No wildcards. Can't use same
tricks as v4, and is very error-prone.
Windows XP is broken but usable; it can't do DNS
queries over v6, but is mostly usable.
Bugging vendors does actually work. Helps if they
recognize your name (being ARIN doesn't hurt!)
Today and the Future
standardizing dual stack, ipv6 enabled by default,
including push scripts, back office, etc.
v6 support a requirement for vendors
All RFPs list IPv6 as a requirement
Be prepared to do a lot of work tweaking your back
office scripts!
Patrick, Akamai -- do you do a Google-style whitelist, or
do you just break people who ask for AAAA records but
don't have v6 connectivity to you?
A: It probably happens, but they don't get too many
complaints. It does happen sometimes, but often they
can work with people to get them connected.
Kevin Oberman, ESnet
Recently, he had that issue: AAAA couldn't get there,
but he didn't open a ticket, he just went back to the
IPv4 address.
A: yeah, there's been some issues like that; they don't
make much money from website, so it's not as critical
for them as for some others, but they do work with
people to try to fix those cases.
Shift focus--what does it take to move enterprises
into v6?
Owen DeLong, what does it take to port systems from
v4 to v6?
Porting to Dual Stack -- not that hard
Why important? We've all seen the exhaustion point
graph.
code examples there
http://owend.corp.he.net/ipv6/
change variable names when changing types, to make
it easier to spot old variables.
compile-repair-recompile-test-debug-retest
AF_INET to AF_INET6
sockaddr_in to sockaddr_in6
sockaddr_storage (generic storage type)
check address scoping (link local vs global, and
interface scope for link locals)
Some gotchas not in sample code
IP addresses in log files
IP addresses stored in databases
Parsing routines for external data
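(Not Owen's actual sample code -- a minimal, hypothetical C sketch
of the getaddrinfo()-based pattern the slides describe; the function
name and error handling are invented for illustration:)

    /* Protocol-agnostic TCP connect: the core of the
     * AF_INET -> AF_INET6 port described above. */
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netdb.h>

    int connect_any(const char *host, const char *service)
    {
        struct addrinfo hints, *res, *ai;
        int fd = -1;

        memset(&hints, 0, sizeof(hints));
        hints.ai_family   = AF_UNSPEC;    /* v4 or v6, whichever resolves */
        hints.ai_socktype = SOCK_STREAM;

        if (getaddrinfo(host, service, &hints, &res) != 0)
            return -1;

        /* Try each returned address; note a fresh socket per result,
         * since the address family can change between results. */
        for (ai = res; ai != NULL; ai = ai->ai_next) {
            fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
            if (fd < 0)
                continue;
            if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
                break;                    /* connected */
            close(fd);
            fd = -1;
        }
        freeaddrinfo(res);
        return fd;                        /* -1 on failure */
    }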
PERL porting example
refer to source code examples
v4_* are v4 only code
Add socket6 as well as socket module to code
replace get*byname calls
change protocol and address families in socket
and bind calls
get*byname to getaddrinfo
If you pass in6addr_any to getaddrinfo, it returns
localhost, not what you were looking for!
Example of actual old way getservbyname
becomes getaddrinfo
socket and bind calls, AF_INET6, not too bad
PERL client migrations, similar tactics
inet_ntoa to inet_ntop
now getaddrinfo simplifies DNS on client side
You can't recycle a socket across calls anymore for
reads; you have to explicitly create it each time right
now, as you don't know which address family the
previous call returned.
Handy function replacement slide from the
website, with structure replacement slide
owend at he dot net
Bill Fenner
some kernels have changed to only bind to v6
sockets when available; has he found that to
be the case?
A: if your kernel does that, it's unfortunate;
what Owen has found is that on his boxes, it
binds to both.
Q: yes, some kernels behaved that way, which is
very unfortunate; there might be knobs that can
change the default behaviour.
A: Recent Linux stacks seem to behave just fine
with dual stack socket calls; get with him if you
have examples of bad kernels so he can post warnings.
Aaron Hughes, deploying dual stack on a network
successful implementation requires good
supporting policies!
We need to participate in decision-making at the
company level, to determine when and how to deploy
v6. Timing will be different for different companies.
Dispel the myths
obtaining v6 addresses is hard
transit providers don't support it
no BGP multihoming
Obtaining IPv6 address space is not really hard.
My provider doesn't support v6!
right now, talk to others, you can get free transit.
he.net, wvfiber, probably others out there you can
talk to. This is really valuable to people right
now!
Starting with IX locations--get IPv6 addresses from
your exchange point providers
Make a list of all relevant peering information
update peeringDB--let the rest of the world know
you have v6 addresses!
Follow your own company change processes for the
deployments!
configure IPv6
locate existing v4 peering interfaces
enable v6 (cisco)
configure v6 address
ping some peers (look their IPs up in peeringDB)
within a minute, you can pass some ICMPv6 packets.
Cisco
ipv6 unicast-routing
Juniper
enabled by default
v4 and v6 are configured almost identically.
At this point, your IX interfaces are dual
stacked. :)
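(A hypothetical IOS sketch of those steps; the interface name and
the 2001:DB8 documentation addresses are invented, not from the
slides:)

    ipv6 unicast-routing
    !
    interface GigabitEthernet0/1
     description existing v4 IX peering interface
     ipv6 address 2001:DB8:FFFF::10/64
    !
    ! then, from exec mode, ping a peer you found in peeringDB:
    ! ping ipv6 2001:DB8:FFFF::1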
Next up, the backbone.
Keeping track of peering interfaces in peeringDB
is great. For your backbone, you really want a
database to track them. Spreadsheets don't scale
terribly well. ^_^;
At least use a reverse DNS zone file
Come up with a good numbering plan for IPv6!!
If you take first /48 for infrastructure, take
first /64 for loopbacks,
You can take the opportunity to change your
architecture for v6 if you want, but it's easier
to keep it the same as v4 so you don't have to keep
track of two topologies.
Architecture choices:
simple one:
loopbacks and connected infrastructure only in IGP
rest of stuff in BGP
Configure your backbone
Numbering plan?
IPv4 4th octet/32 -> IPv6 ::X/128
enable OSPFv3 if you're running OSPF:
ipv6 router ospf 12456
enable v6 on interface
configure OSPF for interface
do same thing on next router, verify the link is
up, reachable, and routed.
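(Again a hypothetical IOS sketch with invented addresses; the
loopback follows the 4th-octet-to-::X mapping mentioned above,
e.g. a v4 loopback ending in .5 becomes ::5/128:)

    ipv6 router ospf 12456
     router-id 192.0.2.5
    !
    interface Loopback0
     ipv6 address 2001:DB8::5/128
     ipv6 ospf 12456 area 0
    !
    interface GigabitEthernet0/2
     ipv6 address 2001:DB8:0:1::1/64
     ipv6 ospf 12456 area 0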
Managing assignments with DNS zone
you can just increment /48s in your zone file
don't forget that after 9 comes a!
It shouldn't take very long to do this, even
for a midsized network. It's tedious, but not
hard.
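(A hypothetical zone-file comment fragment showing the increment,
with the 2001:db8::/32 documentation prefix standing in for real
space:)

    ; /48 assignments under 2001:db8::/32
    ; 2001:db8:0009::/48  customer9
    ; 2001:db8:000a::/48  customer10   <- after 9 comes a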
To reach the outside world, need some BGP
configure a new v6 peer group, you can mirror your
v4 peer group, but with route-maps and lists that
match v6 elements. Naming them -V6 makes it easier
to spot them later.
iBGP will be loopback to loopback, next-hop-self,
like with v4.
You can build a common config to be pushed out.
iBGP will handle connected interfaces (except
loopbacks)
route-maps use slightly different syntax for
v6 matching:
match ipv6 address matchall
Don't panic when you do
address-family ipv6
it will reformat your config, your next Rancid run will
look scary, but it really didn't break your whole
router.
You can build a common config chunk for all the IBGP
configs, and push it out.
Still doesn't let you reach outside world.
Next up, configure your external peers.
New peer group for v6 peers; you'll need new sanity
lists, there's not as many well-defined bogon filters;
but at least set filters on sizes
seq 5 permit ::/0 ge 16 le 48
Create a list of your ASN's IPv6 prefix(es) to allow out.
create route maps, use same communities and localprefs
to match what you use in v4
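(A hypothetical IOS sketch pulling the above together; the ASNs,
list names, and 2001:DB8 prefixes are invented, not from the
slides:)

    ipv6 prefix-list SANITY-V6 seq 5 permit ::/0 ge 16 le 48
    ipv6 prefix-list MINE-V6 seq 5 permit 2001:DB8::/32
    !
    route-map PEER-IN-V6 permit 10
     match ipv6 address prefix-list SANITY-V6
     set local-preference 100
    !
    router bgp 64496
     neighbor 2001:DB8:FFFF::1 remote-as 6939
     !
     address-family ipv6
      neighbor PEERS-V6 peer-group
      neighbor PEERS-V6 route-map PEER-IN-V6 in
      neighbor PEERS-V6 prefix-list MINE-V6 out
      neighbor 2001:DB8:FFFF::1 peer-group PEERS-V6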
Next, send email to peering(a)he.net to get BGP up;
send the peering info file you collected early on.
Next up, turn up a peer with he.net, you'll see the
neighbor come up, and you can reach the world now. :D
show bgp ipv6 unicast summary
show bgp ipv6 unicast neighbor X::Y adv
make sure you're sending and receiving the expected
routes; do some traceroutes and pings, make sure it
looks good.
Go ahead and continue turning up more peers.
Attaching a host to the v6 network.
use a *nonproduction* host to test with first!
Find a lab box, look at v4 routing and config;
allocate a /64 from your DNS zone file
(figure out your regional aggregation at some point)
Configure interface facing host, and depending on
the OS version, it may autoconfigure itself.
No more ARP, you can try to ping, you can look
at the neighbor table to see if your host is
there.
Check your iBGP, see if you see the subnet in
your table now (first connected non-IGP subnet)
look at http://ripe.net/ to see if you get there
via v6 or not.
Note about SLAAC--the moment you configure the
interface on your router, *every* host on the
subnet can get a v6 address! Make SURE you have
your security concerns squared away before you
do this!
Time to add nameservice
Add DNS
reverse...is ugly. Look at the slide.
forward:
ns0 IN AAAA blech
reload nameserver
Note that your machine is now on the global v6
internet with every port open; in fact, every
host on that subnet is now on the global v6
internet. you MUST make sure your security
policy is ready to handle IPv6 security similar
to IPv4!
Peering--just about everyone out there will peer
via v6 at the moment; it's the right time to
dive in and make it happen.
Start working with a good beta customer to start
developing customer route maps, customer neighbor
configs (most of which will be mirrors of your v4
configs and route maps, but with different address
families and different filters)
Most networks are allowing multihoming of /48s at
this point, so you can let your downstream customer
know it's OK for them to announce the /48 to their
other upstream as well.
Step 1 is pretty easy; the network side isn't that
scary.
Step 2, getting hosts and content up and running
with security policy in place, operations staff
comfortable on IPv6, etc is the harder part.
So, on the network side, getting IPv6 up and
running isn't hard; it's very, very similar to v4.
Leo Bicknell--thanks for a great presentation,
good summary. Few small items.
BGP change on IOS, it does reformat things;
there is a command "bgp upgrade-cli" will change
your config to new format ahead of time to let
you check the delta ahead of time.
Presentation is heavy on IOS classic configs.
IOS-XR and JunOS allow for common policies
for both, with different lines and different
terms for v4 and v6; makes configs even simpler.
Lastly, with IPv6 reverse, people forget that
$ORIGIN exists, so you can make the zone files
look considerably easier to read.
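(A hypothetical zone fragment showing the $ORIGIN trick, using the
2001:db8:0::/48 documentation prefix; without $ORIGIN each owner
name would be 32 nibble labels long:)

    $ORIGIN 0.0.0.0.8.b.d.0.1.0.0.2.ip6.arpa.
    ; 2001:db8:0::1 -- only the host's 20 nibbles are spelled out:
    1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0  IN PTR  ns0.example.net.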
Humans seem to work better when v6 host address
and v4 address map to each other statically,
rather than using SLAAC and having hosts
change when NICs change.
A: Very true, that's more of step 2, but this
is very very good information to know.
Arjin, AMS-IX: since autoconfig is on by default,
you might want to turn it off on exchange point
interfaces.
Cathy says this looks like the beginnings of
a WONDERFUL best current practices document;
let's turn it into one!
Next up is Betty with some results for us from
the elections.
196 people voted
Candidates:
Steve Feldman, Sylvie LaPierre, and
Duane Wessels are new SC members.
Austin, Texas, NANOG 48, see you Feb 21-24 2010.
Thanks to ARIN, Arbor, and Merit for this meeting!
There are new SC members; we're at the first point
since the restructuring where people have hit
term limits.
Josh, Joel, Ren, Todd, have been serving since the
revolution, and are aging out--a big round of applause
for them as well.
AND FILL OUT YOUR SURVEY!!!
http://tinyurl.com/nanog47
John Curran notes there is a break, and ARIN will start
at 11am. :)
BREAK TIME.
Hey folks,
Just a reminder that the NANOG Election polls will be closing at 09.15 EDT.
If you are listed here
http://www.nanog.org/governance/elections/2009elections/2009_voters.php
you can vote, no matter where in the world you are. Ballot is here:
https://nanog.merit.edu/election/
MLC nominations will remain open until 29 October:
http://www.nanog.org/governance/elections/2009elections/2009mlc_candidates.…
For those at the meeting *or* watching the streams, take the survey!
http://tinyurl.com/nanog47/
Cheers!
Joe
--
RSUC / GweepNet / Spunk / FnB / Usenix / SAGE
_______________________________________________
NANOG-announce mailing list
NANOG-announce(a)nanog.org
http://mailman.nanog.org/mailman/listinfo/nanog-announce
security(a)amazon.com
Little birdies from Amazon said that's the best contact point.
>> Date: Tue, 20 Oct 2009 17:40:39 -0400
>> From: "J. Oquendo" <sil(a)infiltrated.net>
>> Subject: Amazon's EC2 Security contact
>> To: NANOG list <nanog(a)nanog.org>
>> Message-ID: <4ADE2E57.9030608(a)infiltrated.net>
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>>
>> Hey all, apologies for shooting this on this list, but I've had greater
>> success here.
>>
>> Anyone have a SECURITY contact for "Amazon Web Services, Elastic Compute
>> Cloud, EC2" outside of the typical: whois -h whois.arin.net
>> $THEIRSPACE|grep "@"
>>
>> I'm looking at a delicate situation here and would appreciate any
>> OOB/non-tech-sup-spool-box contact.
>>
Here's my notes from the second half of the second day at
NANOG, including the peering BoF and the ARIN open
policy hour.
Now for some dinner. ^_^;
More on the morrow.
Matt
2009.10.20 NANOG day 2 notes, second half
FILL OUT YOUR SURVEY!!
http://tinyurl.com/nanog47
VOTE!!
http://bit.ly/nanogelect
Kevin Epperson welcomes everyone back from
lunch at 1432 hours Eastern time.
Don't forget the survey and the voting!
BJ Neal from FCC will talk about national
broadband plan.
Relax, he's not a lawyer, he's not here
to regulate, he's here to get information
from us.
Looking to get input from some of the
membership here at how they should look
at traffic engineering on backhaul and
shared portions of the network.
In early 2009, President and Congress
tasked FCC to put plan together for
broadband access for all americans,
focusing on rural and underserved
areas.
Feb 17, 2010 the plan is due before Congress.
3 basic components to BB team:
deployment team: technology purposes, funding
sources, policy levers
national purpose: healthcare, hospitals, schools,
adoption: what needs to be done to encourage adoption
of BB technology for the underserved and unserved
broadband plan is data driven.
FCC looking for as much real world data as possible
as inputs to the plan as possible.
Need to make sure the plan addresses all three
elements as much as possible, to be as complete
as possible.
On the adoption side, it may be that some people
may have inside wire issues, may need education,
may not be able to afford a computer to connect,
etc.
US has a complex mix of broadband assets
DSL, cable, wireless, FTTp, power line
networking on last mile
Second mile access is copper Xport, fiber
Xport, wireless point-to-point, hybrid copper/fiber
middle mile access is dedicated internet access,
ATM, frame relay, managed IP/MPLS/VPLS,
dedicated private line, DS3, OCn,
Traffic engineering on "shared" portions:
how should the BB team think about traffic
engineering, or sizing of the shared or "backhaul"
portions of the network.
Significant impact to the cost of a National BB
network architecture. Obviously it is very
important to size network properly.
Possibly need an equivalent "erlang" model for
public IP traffic mix; what is the formula for
this?
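(For reference -- the telephony analogue being invoked is the
Erlang B formula, which gives the blocking probability when E
erlangs of offered load share m circuits; whether any comparable
closed form exists for a bursty public-IP traffic mix is exactly
the open question:)

    B(E, m) = (E^m / m!) / sum_{k=0..m} (E^k / k!)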
Looking for NANOG members' views on traffic engineering
and network dimensioning.
Q: Randy Bush, IIJ; fun visiting a third world
country; in Tokyo, you just have fiber, and
the last 100 meters is 100baseT, and he can
get 80 Mbit/s from Tokyo to the Westin in Seattle.
He has a third-world connection in Seattle
from a house there; he can get almost 600
from Seattle to the Westin, and it costs 3x as much.
How to do traffic engineering? Provision it,
and you don't have to do traffic engineering.
A: overprovisioning comes with significant
cost, especially if that means deploying
new fiber.
Q: We said same thing about rural rollouts
of telco; now, farms get upset at busy signals.
Q: Igor from Yahoo notes that traffic engineering
is wrong term; he really means how to plan how
much to provision, not which pathways to send
traffic along.
A: He's looking for oversubscription ratio for
a given set of users with a given set of
applications along the backhaul to a public
interconnection point.
Q: In that case, pick a starting point, and
then build to what the customers use, and
bill them accordingly.
A: FCC doesn't have any traffic numbers to look
at, so they need a starting point.
Q: RAS notes the backhaul from lower tier city
to where you really want to go is really cheap
and easy; most of the cost is last mile. So
build backhaul with as much capacity as you
need; the government is going to get bad deals
no matter what anyhow.
The vast majority of the cost really is the last
mile.
Q: Jay Moran, AOL, they're looking at many
broadband proposals coming from various people.
Just need to submit the questions to people who
are sending out the VTOP applications.
BGPbotz
IM-based route view bot
Manish Karir
Route Views
Generally web or telnet based
provides a looking glass for BGP from different
vantage points (routers)
Disadvantages
Have to remember telnet/web addresses
route-server.ip.att.net
route-views.ab.bb.telus.com
"more" is a pain.
BGPbotz is up on iM, you can type in similar
commands to it.
It'll give links back if the output is too long;
output survives for an hour, then disappears.
Commands:
show ip bgp
show bgp view (router) ipv6
ping
traceroute
aslookup
whois
grep
Architecture
Python
chat protocols: AIM, Jabber(XMPP)
single thread per user
Results
400+ screen names from both protocols
Results: "show ip bgp" was most popular
command!
Talk to BGPbotz
AIM: bgpbotz
Jabber: bgpbotz(a)jabber.research.merit.edu
He shows a realtime demo of it
Still looking for more views
source available:
http://bigbotz.merit.edu
Q: Patrick, Akamai, would you do in-addr resolution
on traceroutes?
A: it really makes it slow; they wanted it to be
pretty quick. Could take a while with long
traceroutes.
Q: Matt, Yahoo, could he add other messenger
platforms like Yahoo or MSN?
A: yes, if the module can provide a python
interface, it can be plugged in.
Geoff Sisson, Duane Wessels
DNS OARC
Root Zone historical trends
NS records, A records, IPv6 AAAA records, TLDs.
In the past year, breaking away from the trend of
one AAAA per TLD.
Interesting times for the DNS root.
IPv6 Glue
DNSSec
New TLDs
IDNs
Also...
Continued Anycast deployment
continued increase in query rates.
This study of root zone changes
ICANN hired OARC to simulate changes to
the root zone and explore how they affect:
The size of the root zone
Server response latency
Server start and reload times
Hardware:
DNS OARC testbed
16 HP ProLiant DL140 G3 servers
4x 3GHz cores
Software
testing BIND 9.6.0P1, and NSD 3.2.1
Centos 5.3, FreeBSD 7.1
dnsperf, etc.
Zone file configs
5 types of zone content
unsigned, mostly v4 glue
unsigned, v4 and v6 glue
signed, v6 glue, 10% DS records
signed, v6 glue, 50% DS records
signed, v6 glue, 100% DS records
five zone sizes
1K, 10K, 100K, 1M, 10M
Memory usage
How do root zone changes affect zone size and
memory usage?
Process memory usage measured with pmap.
includes code memory and shared lib memory
BIND memory usage goes linear above 10K entries;
smaller zones have more overhead relative to entries.
NSD, 32G RAM couldn't load 10M signed TLD zones.
memory usage proportional to zone size
signed zone uses twice as much memory.
NSD needs more than 32G to hold 10M signed zone
response latency
how does latency of an L-root analog vary as
function of zone size?
BIND, unsigned zone, different sizes generally
about a quarter ms. Performance does not change
as zone size increases.
NSD, similar graph, very consistent; NSD is slightly
lower latency, .1ms
BIND with signed zone, not as consistent; histograms
drop consistently.
He did cumulative distribution graph
Only 70% of queries get answered within 4 seconds.
NSD on signed zones, more consistent, but can't do
10M zones.
CDF says similar cutoff for NSD; does a bit better.
BIND performance is stable for all sizes of signed
zones
Some degradation at larger zone sizes.
BIND performance issue;
only with NSEC; no problems with NSEC3
Only with a zone like the root which is likely to
have a large number of glue owner names that
get sorted between non-glue
Only for a larger (ie 100K TLD) root zone
Plenty of time until this fix will really be needed
in production.
shows example problematic zone...linear search that
wasn't optimized as well.
Task 3; start and reload times
how does nameserver startup and reload vary over
zone file size.
Above 100k, takes noticeable load time; same
for reload time.
With 32G, they couldn't do a reload for BIND on 10M
signed file
Again, proportional to zone file size.
bandwidth transfer size vs size on disk
the cases below the 100% mark are NSD; it does
name compression in transfers, whereas BIND doesn't.
For zone transfers of signed zones, takes much more
bandwidth than doing simple rsync.
zone transfers take much longer in face of packet loss.
TCP usage
to what extent will DNSSec increase TCP at roots?
Current root zone became signed, looked at number
of TCP retries; 0.3% of requests.
Then they re-ran it with 1M TLDs in it,
again rate of truncation increases
response sizes, truncated vs non-truncated;
EDNS512 vs larger EDNS size
650, 700, most was a bit more than 800.
Root servers can expect about an order of magnitude
increase in queries per second.
A.root will go from 5/sec to 50/sec
increasing the number of TLDs appears to increase TCP
traffic
https://www.dns-oarc.net/files/rzaia/rzaia_report.pdf
or email them at the addresses on the slides.
Manish Karir, PE-ARP--port enhanced ARP for
IPv4 address sharing.
Can we put ARP to more use?
Address the notion of what an endpoint identifier
is; is it IP address, or MAC address?
Looming IPv4 exhaustion
IPv6 adoption is ramping up, but how long is enough?
IPv4->CIDR->NAT-today->exhaustion->IPv6
moving from era of plenty to exhaustion
what can we squeeze more utility out of?
question long-standing ideas, scavenge more.
Range of valid source ports; 65k; largely unused.
at most, a few hundred used.
end stations have 2 unique IDs; a MAC address
and an IP. Can we use MAC as unique ID, and
share IPs?
It's the application/service that needs it, not
the host itself.
Can you hand single IP to multiple hosts if you
limit port ranges?
What needs to change on end host, on local
network, in ARP tables, and support to find
services on the network.
PE-ARP components are on end host.
port range management on end hosts.
DHCP modified to hand ranges back.
ARP protocol needs to be modified for getting
port numbers in requests/responses
Modified ARP table; in addition to MAC-to-IP,
need port info in it as well
Need DNS service for service location when port
number is part of reply.
SRV records are exactly what is needed in this
case.
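(Not from the talk -- a minimal, hypothetical C sketch of what the
port-extended neighbor table might look like; the struct layout and
names are invented for illustration:)

    #include <stddef.h>
    #include <stdint.h>

    struct pe_arp_entry {
        uint32_t ip;        /* shared IPv4 address (network order) */
        uint16_t port_lo;   /* start of this host's port range */
        uint16_t port_hi;   /* end of this host's port range */
        uint8_t  mac[6];    /* host that owns that range */
    };

    /* Classic ARP resolves on ip alone; PE-ARP must also match the
     * destination port against each entry's range. */
    const struct pe_arp_entry *
    pe_arp_lookup(const struct pe_arp_entry *tbl, size_t n,
                  uint32_t ip, uint16_t port)
    {
        for (size_t i = 0; i < n; i++)
            if (tbl[i].ip == ip &&
                port >= tbl[i].port_lo && port <= tbl[i].port_hi)
                return &tbl[i];
        return NULL;
    }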
testbed:
All 3 nodes share same IP, with limited port ranges.
outbound packets work, and internal cluster
communications also work.
Outbound is easy, it just has a smaller range of
source ports.
Inbound traffic does both IP address and port
range lookup;
running prototype on linux 2.6.29 kernel
Mostly arp changes, some routing, forwarding code,
to look at the port info coming in.
works well in both directions, both scenarios,
How to deploy this?
router on the network edge: modified ARP on linux
router kernel forwarding packets between interfaces
Bridge solution.
Don't rely on anyone else to get better utilization
out of your IP ranges.
Incremental deployment
E2E consistency--still there
Breaks one to one mapping of IP to ARP
Single IP address can be shared thousands of times.
Related work:
CIDR
NAT
CGN-NAT++
Port scavenging resolution:
conclusion: PE-ARP does break things based on old
assumptions.
It can buy more time for transitions
Packets are not modified once they leave the end
host.
Scott Liebrand, Internap
Sounds like an OpenWRT access point could be tweaked to
do this;
A: Yes, it's on the map to do it; currently, just for
Linux kernel, but it's definitely doable!
PEERING BOF:
We have virtual Martin in the room with us!
If you haven't registered for the peering forum,
register now--and if you've registered, you can
re-register again.
It's at the Ford House--follow the red light, just past
TGIFridays--might be buses here.
Your reasons for being here are unique, there are two
microphones here; try to get the mic before you ask
your question.
First peering meeting for him was 3 years ago,
it's amazing what can happen in 3 years.
Keep the discussion going--if we run over, we'll
impinge on our drinking time.
IPv4 peering personals (and IPv4 IPv6 dual-stack)
Airstream:
nmc(a)wins.net
CHE and MIN he's in peeringdb, AS11796
eyeballs.
Edgecast: AS 15133 peering(a)edgecast.com,
content/CDN network
100G network, open
Stacy Hughes
nuspeering(a)neustar.biz AS12008
DNS provider, small but mighty!
kloch(a)carpathiahost.com
Carpathia hosting AS 29748
about 500G today, looking for AP, LA peers
No backbone today; if you're in LA, ASH, it may
be good to talk to him.
Selective; would like to see a few hundred megs of
traffic between us.
Crossconnects questions and futures.
Do you pre-wire your gear to panels? Does it help?
Is there a need for LC crossconnect support?
What about MTP? (from greg's presentation this morning)
Is crossconnect delivery getting better?
is polarity correct?
do you get light readings when handed over
How well do you manage your inventory of crossconnects?
Most people don't prewire, it turns out. It doesn't
help that much, and often has to be re-done anyhow.
What about LC cross-connects; will people only do
SC handoffs, or are more and more LC cross connects
getting run?
You can't field-terminate LCs; small, difficult.
SC terminations can be done with c-core kits on site.
If you replace stock panel, you can do SC panels with
good density, and then you can get SC to LC patches.
Field techs, every time they touch LC panels, everything
around them gets messed up. :/
What about MTP?
You'll never get MTP field-terminated; so you'll never
see cross connects over MTP; too hard to do it on
custom length cables.
Look at how fast VSR cables died off.
You can't flip polarity on LC easily, either!
Mike Hughes notes that if you flood wire, it's
a short patch lead at each end; keeps the wiring
mess minimized.
At the end, it's up to you to patch it into your
gear; you provide the patch cord.
Is cross-connect delivery getting better?
Yes, but polarity is often wrong.
With no light, it's a 50/50 guess on the facility
people's end.
If you leave light on, they can give light readings
on the fiber.
They often can't see the fiber at either end.
In order to prevent the NOC from getting lots of
alerts during turnups, some people leave the port
down until the cross connect is done, to limit the
errors from flapping.
Photonic switch providers, can they monitor light
levels for customers? They can, but don't.
Different optics have different thresholds, though.
How well do you manage your inventory of cross
connects?
Nobody has the full on database.
Database that is mostly up to date.
Database has some data, but is not very complete.
If they have issues, they'll open a ticket
Know they have cross connects, but that's it.
Igor, peering(a)yahoo-inc.com AS10310
content heavy, peeringDB is updated with all v6
addresses.
Scott Liebrand, Internap, AS22212
similar to igor; get sessions up, build it and they
will come. PeeringDB is now up to date
Guy Tal, LimeLight, AS22822 peering(a)llnw.com
CDN, had v6 up for 4 months now; AMSIX for EU networks,
rolling out across all exchanges, and will do PNIs.
PeeringDB needs to be updated, but will be soon.
Owen DeLong HE.net, peering(a)he.net, AS6939
Yes, they do peer, it's an open peering policy,
even for Telia and Cogent.
And yes, they baked a cake for Cogent to try
to get them to turn peering up; there's pictures
to prove it, everyone in the room has seen it. :D
IX Updates and News
Terremark, Josh Snowhorn
Bogota is growing strong.
ISTIX, Istanbul IX, turkish traffic, ME traffic
come show up!
NOTA, 200G there, they have F10 switches, need to
upgrade. Looking for new switches there
Cara Mascini, AMS-IX, they have the new MPLS/VPLS platform
about 25 new members since last NANOG, 340 members,
about 600 ports, half 10G, Over 1000 ports active
Traffic is over 800G, 2G of IPv6 traffic.
New site, Interxion, is in build up, need interswitch
links, and then it'll be live.
New web site, new member portal going live in a few
weeks; new pricing for 2010 on website.
General meeting, Nov 18th, talk to her about specifics.
Mike Hughes, LINX, he waves to Martin
Also on Slough, not just London--diverse from docklands
now 10 sites total, 3 non-docklands sites. better
resilience.
THW facility will be 20,000 square meters, opening next year
Brocade and Extreme still; will be 15 in November.
They still have lots of 10/100 ports.
Equinix, Eric Bell--3 regions, 12 metros,
globally hit about 380G, traffic continues to grow;
Exascale deployed in DC, now switches in DC5 as well
as DC2.
MPLPP, their version of route server, could be a
solution for HE and cogent for v6. :D
Hong Kong, opened new exchange
Telehouse 2, 2Gigs there
Looking to build requirements to expand into certain
metros; how do they expand?
Greg, Switch and Data--he joined 20 days ago
Peter has H1N1 unfortunately.
Customer1 portal, all the way on right
aggregate traffic across sites
peer-finder tool (who you peer/don't peer with)
participant list
mailing lists for each metro site; portal has a
sign-up site, it's moderated list.
held advisory council meeting, got input, aiming
roadmap for next year; route servers and route
collectors getting rolled out; peer with them, helps
them figure out who is up, who is down.
Nov 23, Jan 3rd, network moratorium over holidays
Michael Lucking, Telx
releasing new peering portal next week; will have way
to interface and interact with other peers
Everyone on switch will have access to it.
About 20% of customers are doing IPv6, it's growing;
not much traffic there yet.
SIX, Patrick Gilmore, gone through changes as of late;
added Nexus 5020, have 8x10GE between main switch and
Nexus, with 7G on it; come peer, help fill it up.
Did renumbering, v6 was instant
v4 not so instant; 3 stragglers who still haven't
renumbered. Did not buy space off Bill Manning.
was a lot less painful than people thought; don't
fear renumbering.
133 AS active, 45G active. Price drop!
$5k for 10G port, plus optic, then free thereafter.
TOR-IX, Jon
120 peers now, 5 years ago at 2G, today it's at 20G
averages 15-20 peers a year joining.
72 FE ports, 34 gig ports, 10 multigig, and 6 10G
port, and 1 20G port.
IPv4 and IPv6 on route server
portal is great, works on mobile devices
Arnold Nipper from DECIX,
still an ethernet switch. IPv6 is 300 customers.
lots of eastern europe, biggest location for
eastern european traffic.
Changed pricing model, simplified it,
GE down to 500 EUR, 3rd Interxion site at Frankfurt 5
later this year.
Maurice Dean, France-IX--initiative in Paris, to try
to consolidate existing exchanges there; 11 or 12
active exchanges there.
Exchanges are free to use, but you get what you
pay for.
Trying to build metro fabric to extend to interesting
locations there.
There's a lot of fiber that drops near there, but not
much interconnection.
Started negotiations with existing ISPs
designed something, working to deploy it.
First effort was to get a logo; always the most
important bit.
Launch date, Dec 15th, French NOG in Paris
setting up legal structure, non profit interest group
and working to turn stuff up.
Slide showing first 8 nodes goes up. Fibers with DWDM.
Force10 switches.
info(a)france-ix.fr
It won't be free, though. Cost for ports will be designed to
create a surplus for re-investment into the future.
will track other exchanges, including 100Mb ports for
free
Do they want to consolidate all exchanges?
Long tail of locations, provide easy migration path out
of. Will launch with 120 participants.
Started as group of individuals,
Shintaro Kojima AS7521, JPNAP, IIJ and NTT primary
Also Tokyo major ICP and IXP investor.
70 v4 customers plus 30 v6 customers
traffic today is about 30Gb Tokyo and 50Gb in Osaka
Also Equinix in Tokyo in and Osaka
Customer ports available. Switches will look at
optic power for customers as well.
No route servers yet. So, nothing really new this
time, hope for better updates next NANOG.
Innovation@peeringDB from RAS.
he wrote it, people seem to be using it.
nothing to unveil yet. working on a wide variety of
things. Aiming for total re-write, but preserve
existing data.
Looking for a couple more admins; current workload
for peering DB is rising; Patrick is doing most of
it and is looking burned out. If you're into
masochism, and want to answer lots of email, let
him know, let people take a break.
Hopefully soon, will be cool innovations.
Maurice now has standing policy that in order to
peer with Google, you have to have a peeringDB
entry. ME and AP folks having trouble registering.
They feel they're not getting responded to. Could
be challenges getting approvals for accounts.
The interface is only in English, it might be something
that Google could help with translating for them.
Patrick was on vacation in China for 3 weeks, so he's
behind.
This project that we started as a freebie in the
community needs more support. There's challenges
to taking money for the project.
Translation--they don't get too many requests
from people about getting other languages
supported. If someone wants to volunteer to
regionalize the pages during the rewrite, then
they'll see if they can feed it in.
There's several people helping out; Randy Epstein
from host.net, Terry R, and some other admins like
Josh that are never around.
What's the criteria for being an admin? Patrick
and RAS have a religious stance of it being 100%
agnostic; must be objective store of data you put
in it. Has to be someone who has the same level
of belief and won't abuse the trust.
Send email to admin(a)peeringdb.com if you'd like to
help out with admin role.
What is workload like? Many hands make light work;
many emails a day. Right now, a couple of hours
a day.
Please be as clear as possible when doing clear
text requests; 30% is people leaving roles,
mergers, acquisitions, etc; no indication of
who represents a given company.
If you've got skills in detecting issues with
people's peering knowledge, you can help out.
Why do people have 13 peeringDB accounts?
Why not have role accounts? Will have a limit
on how many people can have accounts on the system.
If you send a request in for a change for ownership
in system, if you want someone else's record changed,
have documentation to back it up.
Closing moments; final Q&A?
Mail questions to nanog47(a)corbe.tt
AS20144--administrated by ICANN, DNS admin team,
for root server; live at 10G at 3 locations, v4 and v6,
looking for 2 more sites. Need to expand Europe
footprint, looking to see if he can get transport
services. DNSSec is coming, nobody knows how the
traffic will grow; before the root is signed, l.root will
be signed first, they want to make sure there's
enough capacity to make sure it'll work.
Open peering policy!
Joe notes everyone should vote; the whole voting
population is less than room count.
That's it, see you in Austin!
OK, back down to Grand ballroom for the ARIN
open policy hour.
They fire it off at 1805 hours Pacific time.
Preview of draft policies on agenda
policy experience report will get moved to Thursday
Policy Proposal BoF
your time
recent list discussions
Leslie Nobile, a few items from PPML list, NANOG
list, and other places, and will solicit some
feedback from the room on those.
Anything that we want to bring to the mics?
Nothing from anyone so far.
Preview first, then BoF proper.
Draft policy review
5 on agenda for this week.
not for discussion at this BoF
please read them, if you haven't already
have staff and legal review
draft policies have been on public list
will be presented at full meeting.
Don't talk about them tonight, save it for the
list or tomorrow!
Policy development process, flow chart are in it
as well.
2009-3: Global proposal
allocation of IPv4 blocks to RIRs
been submitted to all 5 RIRs; must be accepted
by all 5 RIRs, and then ICANN board will review
and adopt.
All 5 RIRs have it; this is the ARIN version.
Right now, RIR can go to IANA, show what they
use, and they get get more, usually 1 or 2 /8s.
Once there's no more IANA /8s in free pool.
At that point, RIRs return IPv4 space to IANA
when they get it back to build new free pool
Once every 6 months, RIR can ask for 1/10th of
free pool as allocation.
2009-5: multiple discrete networks
allow IPv6 initial and subsequent requests for discrete,
independent networks
/32 for ISPs, /48s for end users (and larger)
2009-6: Global: IANA policy for allocation of ASN blocks
to RIRs.
Right now, 2 pools; 16bit and 32bit
as one pool gets lower, they can go to IANA and request
more of that type.
After Jan 1, 2010, all RIRs will be locked into same
pool; will have to show usage of all ASNs before
getting more.
This would extend ability to get ASNs of each type
for one more year.
Submitted to all 5 RIRs.
2009-7: Open access to IPv6
removes two rules for initial allocation
removes requirement to advertise single aggregate
allocation
remove requirement to be a known ISP in ARIN region
or to have a plan to make 200 assignments in 5 years.
2009-8: equitable IPv4 runout
slows distribution of IPv4 space
ISPs that come to ARIN, and that have been members for
a year, can request a 12 month supply.
This would reduce supply period based on how many
available /8s IANA has left.
At 20 /8s, goes down to 6 months supply.
At 10 /8s, everyone stuck with 3 month need.
Sets maximum prefix size based on ARIN's free pool;
1/4 of ARIN's free pool, rounded down.
Read the summaries, draft policies, staff assessments,
etc, come to meetings prepared to discuss them.
Now, on to Policy BoF
Informal setting
presentation of ideas
discussion and feedback
(5 minutes total)
No commitments at this time!
Going forward
your choice:
do nothing
continue discussions informally
take the discussion to PPML
submit a policy proposal.
So...that's the rules--who has something to talk about?
Remote participation is allowed too...but nobody's
in the room.
Lee Howard, TWC, ARIN board of trustees; the trustees
aren't allowed to propose, so he's just breaking the ice.
Some discussion during NANOG portion of week;
routing considerations around ARIN policies.
Should ARIN policies take into consideration any
routing considerations?
Dani from PeakWeb
The precedent from IPv4 side is that ARIN doesn't
guarantee routing; it just does registration
services.
That's really where it needs to be.
We're smarter now, we need to take the language
out.
Not enough of us were really watching when the
2-byte to 4-byte ASN transition happened; we need
to start getting involved sooner, and speak up
earlier in the process.
We need to focus on proper sizing of allocations,
and let business determine usage.
In IPv4 world, we were trying to deaggregate
class Bs...it eventually worked its way out
in IPv4 world, it'll be able to work its way
out in IPv6 if we let it.
Jason Schiller, Verizon; ARIN is chartered to
shape policy; and policies will shape routing
decisions. If ARIN starts allocating /30s,
they may not guarantee routability, but once
ARIN starts giving them out, and one ISP
routes them, the pressure will be there for
everyone to route them.
It's useful to be able to take ARIN policy back
to help sell best practices inside your company.
Jon, Internet Society
If we're walking in the space of a policy that
will be discussed later...the transfer policy
was difficult for the panelists to understand;
they had to call in lawyers to try to interpret
it.
That kind of feedback from NANOG panelists doesn't
fit with the 3 goals of ARIN.
Clear, technically sound, and useful.
The routing policy question--obligation of ARIN
and other RIRs that they not just conserve scarce
resource, but conserve slots in the routing table
which is a shared commons, globally. There will
be more discussion of economics during the week.
The tragedy of the commons is well known, and is
well documented; there's economic incentive for
each, but if it happens unbounded, the commons
get destroyed, and they all die.
The global routing table is a global commons;
adding to it will be in the interests of every
individual network access provider.
There is an obligation to preserve slots in the
global routing table...
Aggregation is a goal in the number resource
model, but it's not a criteria.
Cathy Aronson--notes the irony of the statement.
The aggregate part in statement was to preserve
global routing table slots. That was the intent
at the time.
John Curran, president, CEO of ARIN.
The incorporation and bylaws are wide-reaching,
and talk about technical coordination, which is
very vast. There are things tied to number allocation
which are in NRPM, but talk about visibility of
information in whois.
the ability to abide by NRPM can be used to decide if
people get new resources, or get to keep existing
resources.
If this group wants to govern what goes into the
routing table, it can go in.
But the community needs to decide if that's a space
we get involved in adding and enforcing via the NRPM.
We can put constraints on routing in NRPM, like we
do with whois visibility. It's up to us.
Ed Kern--he'd love to have it in the policy to make
Jason to route all the /30s. :D The v6 allocation
was BCP in the policy strategy. It should be taken
out now, and moved to a BCP status. IETF and ITU
aren't the right forums for this.
Leo Bicknell, ARIN advisory council
we have the discussions repeatedly. The numbers
agency and network operators exist in symbiotic
relationship; the numbers are needed for routing,
and without routing, there's no need for number
resource registration.
As with any symbiotic relationship, both parties
need to understand the other's needs; both sides
need to keep the other healthy.
ARIN community needs to understand the limitations
of routers and policies that operators are using.
It is not useful for ARIN community to dictate to
operators how to configure their devices...in
general.
Operators need to understand implications of
policy on a 50 year span, not just next year.
Provide useful information on when routers are
likely to fall over back to policy team.
More information sharing, and less dictating
needs to happen.
Dave Farmer, ARIN AC.
Everyone needs to chill out just a little bit.
It's your routers, your policies, yes.
But you have to let ARIN know what policies
make routing policies possible.
It wouldn't be possible to be able to take /48s
for critical infrastructure if it didn't come
from one little corner.
"For this piece of stuff, this is what you can do"
ARIN needs to assign numbers in a reasonable fashion
to allow operators to make decisions around the
numbers.
The ability we have to write policies stems from
ARIN's allocations of addresses in a coherent fashion.
Cathy again.
She's super-excited that people noticed the IPv6
allocation policy, since it's been there for 10
years! Finally, people are looking!
When they went from /19 to /20, they put notes
in saying they were going to look at routing
tables, and retract if it caused too much pain.
Lee Howard--delighted with feedback to that
topic of conversation. People need to send
email indicating the words to arin-ppml(a)arin.net;
if you don't know how to write the words, they
will help you write the words. Their job is
to help you write clear, concise, useful words.
And vote for him on Friday!
New topic from Cathy
Something for ARIN to answer: v6 allocations are not
being made sparsely, they are being made consecutively.
Is this on purpose?
A: yes, it's on purpose. No sparse allocation in v6.
Only 1 RIR is doing sparse, that's APNIC.
They do need to discuss it, John is nodding, they
will discuss it but have not done it yet.
Dani again
Question about if ARIN wants to move from consecutive
to sparse, is that policy based, or can ARIN just
move to do that internally?
A: ARIN can do that internally; Dave Conrad notes that
the initial goal was to use sparse allocation, so it
is a goal, but also a work in progress.
Leah Roberts, ARIN AC
Increment between them could be bumped up before moving
to sparse allocations; could it be moved up a few bits
to a nibble boundary at least? /29 doesn't map very
well.
Anything else from community members, policy-wise?
Martin Hannigan
Recovery; should we revisit it today?
it's becoming apparent there will be 2 internets out
there; you'll need both addresses for quite some
time. There will be a market for v4 addresses;
it would be better to see it be rational and fair.
There's operators, policy, and there are shareholders
as well. Some want to be good, but others have to
keep the economy going, and get our paycheck.
We'll probably see /28s on the internet so v4 can still
'grow' while the move to v6 trundles along.
How do we manage /8s locally, not just under global
policy.
Scott Liebrand
There have been several policies to take baby steps
along the path--what suggestions does he have for
moving in that direction?
Martin replies:
Bite off the low hanging fruit--just define what it
is. Stragglers, things out there that aren't in
routing table, no valid contacts, etc.
We've had a mishmash, but no coherent plan.
This needs to not be tied into other policies.
Needs to strictly be about reclamation. Start
low, and then move to high stuff.
Lee Howard
you said recovery a few times; do you mean recovery
of IPv4 unused or underused space?
Unused is very different from underused.
We don't have any policy about underuse of space;
you need to have minimal use to get *more* space.
NRPM section 12--John Curran notes that you should
read the manual; he looks at it many times, and
comes away shaking. Policy provides ARIN the
necessary tools to do a few things:
it's up to the ARIN staff organization to use that
policy; they use it now for addresses that are
not legitimately held by anyone at all.
They can prevent unheld resources from being held
by a party.
If that's low hanging fruit, as it is brought
to ARIN, ARIN is attempting to make sure they don't
get legitimized by ARIN updating records. But that's
reactive based on suspicious requests coming in.
The other case is resources that are legitimately
held but aren't being used--never routed, etc.
Those unused or heavily underutilized resources
are *not* being touched right now. There is a
legacy RSA agreement that, once signed, prevents
ARIN from ever doing anything with that block.
So, no legitimate holder they can catch. But
for the ones with legitimate holders, they cannot
both offer a legacy RSA, and simultaneously
move against those resources.
To move against resources, we need to resolve that
against legacy RSA agreements.
300 RSA signers, but that covers more than 25% of
the legacy space; so there's more and more coverage
of that space; once signed, they're part of the
system, and by contract, there is no method to
do reclamation on that space.
If we're going to change the legacy RSA, he needs
to know now!
Chris Grundemann, TWC
Policy 2008-7 was enacted after the last meeting;
the intention was to help identify the fallow
space. The tool will be there to help
identify it for reclamation.
Owen DeLong, HE.net
section 12 paragraph 5 attempts to reconcile these
issues; ARIN can reclaim space allocated by ARIN for
underuse when a legacy RSA is signed.
Leo Bicknell--the legacy space is mired in issues.
For non-legacy space, the RSA states that if ARIN
believes the space is not being used for the original
purpose, you may need to re-justify it, and if it
cannot be justified, ARIN may reclaim it.
John: they go after such resources *when* they come
to ARIN and attention is drawn to them. They have
reviewed and revoked resources based on that, but
they are not going out and looking for space that
would fall under those terms.
Most reporting to ARIN is under the fraud reporting
process. It is only used if people feel that
fraudulent claims were made to ARIN in the application
process; any other legal issues are *not* moved on.
John, ISOC
Low hanging fruit may still be on the tree because
it is rotten. APNIC talks about audit trails for
space that is recovered. If you reallocate it to
someone, and it turns out it is ACL'd off or
blacklisted for places they need to reach, they are
not better off for getting the space.
When space is recovered, can its history go with the
block, so that potential recipients know what they
are getting?
Leslie Nobile notes that space reclaimed through
various means is held for 1 full year; they check
140 RBLs and lists, noting that it has been fallow
for a year; they attempt to ensure they are issuing
clean space as much as possible. They are very aware
of this; it's not a policy, but it's an internal
procedure.
Martin
policies and procedures are great; perhaps ARIN
could wave the flag and let us know about them.
There's some low-hanging fruit that isn't caught
by the policies; if you're the POC, and the company
went bankrupt, it's really easy for the POC to just
hold onto the space.
He thinks there's some low hanging fruit we may be
stepping on.
Also, legacy /8s getting returned need a local,
non-global policy to handle them.
XXX from Jamaica, covers issues around ICT,
learning a lot at ARIN meeting.
Reading mission statement up on wall, a question
to the staff and community.
How do you draw the line between being a watchdog
and dealing with issues, when one main activity is to
facilitate the advancement of the Internet while
outreach and education is a primary goal?
It seems the Internet is such a huge monster, it
needs this broad-based consensus at all times.
The issues are overwhelming, the v4 to v6 migration
needs even more education and outreach around it.
She's learning a lot, and hopes ARIN can help
educate even more about how these issues can be
addressed and handled in the Caribbean region.
We're out of time; it's a few minutes after seven.
Beer and pizza party up in rotunda, first elevator
on the left, runs from 7pm to 9pm.
Thanks to everyone who brought questions to the
microphone today!!
Hey all, apologies for shooting this on this list, but I've had greater
success here.
Anyone have a SECURITY contact for "Amazon Web Services, Elastic Compute
Cloud, EC2" outside of the typical: whois -h whois.arin.net
$THEIRSPACE|grep "@"
I'm looking at a delicate situation here and would appreciate any
OOB/non-tech-sup-spool-box contact.
--
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
J. Oquendo
SGFA, SGFE, C|EH, CNDA, CHFI, OSCP
"It takes 20 years to build a reputation and five minutes to
ruin it. If you think about that, you'll do things
differently." - Warren Buffett
227C 5D35 7DCB 0893 95AA 4771 1DCE 1FD1 5CCD 6B5E
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x5CCD6B5E
Or is it just me?
None seem to come up now.
Here's my notes from this morning's sessions. :)
Off to lunch now!
Matt
2009.10.20 NANOG day 2 notes, first half
Dave Meyer kicks things off at 0934 hours
Eastern time.
Survey! Fill it out!
http://tinyurl.com/nanog47
Cathy Aaronson will start off with a remembrance
of Abha Ahuja. She mentored, chaired working
groups, she helped found the net-grrls group;
she was always in motion, always writing software
to help other people. She always had a smile, always
had lots to share with people.
If you buy a tee shirt, Cathy will match the donation.
John Curran is up next, chairman of ARIN
Thanks to NANOG SC and Merit for the joint meeting;
Add your operator perspective!
Vote today in the NRO number council election!
You can vote with your nanog registration email.
https://www.arin.net/app/election
Join us tonight for open policy hour (this room)
and happy hour (rotunda)
Participate in tomorrow's IPv6 panel discussion
and the rest of the ARIN meeting.
You can also talk to the people at the election
help desk.
During the open policy hour, they'll discuss the
policies currently on the table.
And please join in the IPv6 panel tomorrow!
If you can, stay for the ARIN meeting, running
through Friday.
This includes policy for allocation of ASN blocks
to RIRs
Allocation of IPv4 blocks to RIRs
Open access to IPv6 (make barriers even lower)
IPv6 multiple discrete networks (if you have non
connected network nodes)
Equitable IPv4 run-out (what happens when the free
pool gets smaller and smaller!)
Tomorrow's Joint NANOG panel
IPv6--emerging success stories
Whois RESTful web service
Lame DNS testing
Use of ARIN templates
consultation process ongoing now; do we want to
maintain email-based access for all template types?
Greg Hankins is up next for 40GbE and 100GbE
standards update--IEEE P802.3ba
Lots of activity to finalize the new standards specs
many changes in 2006-2008 as objectives first developed
After draft 1.0, less news to report as task force
started comment resolution and began work towards the
final standard
Finished draft 2.2 in August, dotting Is, crossing Ts
Working towards sponsor ballot and draft 3.0
On schedule for delivery in June 2010
Copper interface moved from 10 meters to 7 meters.
100m on multimode,
added 125m on OM4 fiber, slightly better grade.
CFP is the module people are working towards as
a standard.
Timeline slide--shows the draft milestones that
IEEE must meet. It's actually hard to get hardware
out the door based around standards definitions.
If you do silicon development and you jump in too
fast, the standard can change under you; but if you
wait too long, you won't be ready when the standard
is fully ratified.
July 2009, Draft 2 (2.2), no more technical changes,
so MSAs have gotten together and started rolling
out pre-standard cards into market.
Draft 3.0 is big next goal, it goes to ballot for
approval for final standards track.
After Draft 3.0, you'll see people start ramping
up for volume production.
Draft 2.x will be technically complete for WG ballot
tech spec finalized
first gen pre-standard components have hit market
technology demonstrations and forums
New media modules:
QSFP modules
created for high density short reach interfaces
(came from Infiniband)
Used for 40GBASE-CR4 and 40GBASE-SR4
CXP modules
proposed for infiniband and 100GE
12 channels
100GbE uses 10 of 12 channels
used for 100GBASE-CR10
CFP Modules
long reach apps
big package
used for 40GBASE-SR4/LR4 and 100GBASE-SR10/LR4/ER4
about twice the size of a Xenpak
100G and 40G options for it.
MPO/MTP cable
multi-fiber push-on
high-density fiber option
40GBASE-SR4
12 fiber MPO uses 8 fibers
100GBASE-SR10
24 fiber MPO cable, uses 20 fibers
this will make cross connects a challenge
Switches and Routers
several vendors working on pre-standard cards,
you saw some at beer and gear last night.
Alcatel, Juniper
First gen tech will be somewhat expensive and
low density
geared for those who can afford it initially and
really need it.
Nx10G LAG may be more cost effective
higher speed interfaces will make 10GbE denser and
cheaper
Density improves as vendors develop higher capacity
systems to use these cards
density requires > 400Gbps/slot for 4x100GbE ports
Cost will decrease as new technology becomes feasible.
Future meetings
September 2009, Draft 2.2 comment resolution
Nov 2009 plenary
Nov 15-20, Atlanta
Draft 3.0 and sponsor ballot
http://grouper.ieee.org/groups/802/3/ba/index.html
You have to go to meeting to get password for the
draft, unfortunately.
Look at your roadmap for next few years
get timelines from your vendors
optical gear, switches, routers
server vendors
transport and IP transit providers, IXs
Others?
figure out what is missing and ask for it
will it work with your optical systems
what about your cabling infrastructure
40km 40GbE
Ethernet OAM
Jumbo frames?
There's no 40km offering now; if you need it,
start asking for it!
Demand for other interfaces
standard defines a flexible architecture, enables
many implementations as technology changes
Expect more MSAs as tech develops and becomes cost
effective
serial signalling spec
duplex MMF spec
25Gbps signalling for 100GbE backplane and copper
apps
Incorporation of Energy Efficient Ethernet (P802.3az)
to reduce energy consumption during idle times.
Traffic will continue to increase
Need for TbE is already being discussed by network
operators
Ethernet will continue to evolve as network requirements
change.
Question, interesting references.
Dani Roisman, PeakWeb
RSTP to MST spanning tree migration in a live datacenter
Had to migrate from a Per-vlan RSTP to MST on a
highly utilized network
So, minimal impact to a live production network
define best practices for MST deployment that will
yield maximal stability and future flexibility
Had minimal reference material to base this on
Focus on this is about real-world migration details
read white papers and vendor docs for specs on each
type.
The environment:
managed hosting facility
needed flexibility of any vlan to any server, any rack
each customer has own subnet, own vlan
Dual-uplinks from top-of-rack switches to core.
High number of STP logical port instances
using rapid pvst on core
VLAN count * interface count = logical port instances
Too many spanning tree instances for layer 3 core switch
concerns around CPU utilization, memory, other resource
exhaustion at the core.
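(Back-of-envelope, with assumed numbers rather than
theirs: 300 vlans trunked on 50 ports is
300 * 50 = 15,000 logical port instances under
per-vlan RSTP; mapping those vlans onto 2 MST
instances cuts it to 2 * 50 = 100.)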
Vendor support: per-vlan STP
Cisco: per-vlan is the default config, cannot switch
to single-instance STP
Foundry/Brocade offers per-vlan mode to interoperate
with Cisco
Juniper MX and EX offer VSTP to interoperate
Force10 FTOS
Are we too spoiled with per-vlan spanning tree?
don't need per-vlan spanning tree, don't want to
utilize alternate path during steady-state since
we want to guarantee 100% capacity during
failure scenario
options:
collapse from per-vlan to single-instance STP
Migrate to standards-based 802.1s MSTP
(multiple spanning tree--but really going to fewer
spanning trees!)
MST introduces new configuration complexity
all switches within region must have same
vlan-to-mst mapping
means any vlan or mst change must be done
universally to all devices in site.
issues with change control; all rack switches
must be touched when making single change.
Do they do one MST that covers all vlans?
Do they pre-create instances?
do all vendors support large instance numbers?
No, some only support instances 1-16
Had to do migration with zero downtime if possible
Used a lab environment with some L3 and L2 gear
Found a way to get it down to one STP cycle of 45secs
Know your roots! Set cores to "highest" STP priority
(lowest value)
Set rack switches to lower-than-default to ensure
they never become root.
Start from roots, then work your way down.
MSTP runs RSTP for backwards compatibility
choose VLAN groups carefully.
Instance numbering
some only support small number, 1-16
starting point
all devices running 802.1w
core 1 root at 8192
core 2 root at 16384
You can pre-config all the devices with spanning
tree mapping, but they don't go live until final
command is entered
Don't use vlan 1!
set mst priority for your cores and rack switches.
don't forget MST 0!
vlan 1 hangs out in MST 0!
First network hit; when you change core 1 to
spanning mode mst
step 2, core2 moves to mst mode; brief blocking
moment.
step 3; rack switches, one at a time, go into
brief blocking cycle.
Ongoing maintenance
all new devices must be pre-configured with identical
MST params
any vlan-to-instance mapping changes: apply to core 1
first
no protocol for MST config propagation
VTP follow-on?
MST adds config complexity
MST allows for great multi-vendor interoperability in
a layer 2 datacenter
only deployed a few times--more feedback would be
good.
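One way to keep the region parameters identical
everywhere, given the lack of a propagation protocol,
is to generate every switch's stanza from a single
vlan-to-instance map. A minimal sketch in Python --
the device names, vlan ranges, and IOS-style syntax
are all assumptions for illustration:

  # Emit an identical MST region stanza for every switch from one
  # shared vlan-to-instance map, so region name, revision, and
  # mappings never drift between devices. All values hypothetical.
  VLAN_TO_INSTANCE = {1: (100, 199), 2: (200, 299)}  # inst: (lo, hi)
  REGION, REVISION = "dc1", 42

  def mst_config(priority):
      lines = ["spanning-tree mode mst",
               "spanning-tree mst configuration",
               " name %s" % REGION,
               " revision %d" % REVISION]
      for inst, (lo, hi) in sorted(VLAN_TO_INSTANCE.items()):
          lines.append(" instance %d vlan %d-%d" % (inst, lo, hi))
      # Don't forget MST 0 (vlan 1 lives there) when setting priority.
      for inst in [0] + sorted(VLAN_TO_INSTANCE):
          lines.append("spanning-tree mst %d priority %d"
                       % (inst, priority))
      return "\n".join(lines)

  # Cores get the two lowest (best) priority values; rack switches
  # get higher-than-default so they can never become root.
  for device, prio in [("core1", 8192), ("core2", 16384),
                       ("rack1", 61440)]:
      print("! %s" % device)
      print(mst_config(prio))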
Q:
Leo Bicknell, ISC; he's done several; he points
half rack switches at one core, other half at
other core; that way in core failure, only half
of traffic sloshes; also, on that way with traffic
on both sides, failed links showed up much more
quickly.
Any device in any rack has to support any vlan
is a scaling problem. Most sites end up going
to Layer3 on rack switches, which scales much
better.
A: Running hot on both sides, 50/50 is good for
making sure both paths are working; active/
standby allows for hidden failures. But
since they set up and then leave, they
needed to make sure what they leave behind is
simple for the customer to operate.
The Layer3 move is harder for managed hosting,
you don't know how many servers will want in a
given rack switch.
Q: someone else comes to mic, ran into same
type of issue. They set up their network
to have no loops by design.
Each switch had 4x1G uplinks; but when they
had flapping, it tended to melt CPU.
Vendor pushed them towards Layer3, but they
needed flexibility for any to any.
They did pruning of vlans on trunk ports;
but they ended up with little "islands" of
MST where vlans weren't trunked up.
Left those as odd 'separate' root islands,
rather than trying to fix them.
A: So many services are built around broadcast
and multicast style topologies that it's hard
to move to Layer3, especially as virtualization
takes off; the ability to move instances around
the datacenter is really crucial for those
virtualized sites.
David Maltz, Microsoft Research
Datacenter challenges--building networks for agility
brief characterization of "mega" cloud datacenters
based on industry studies
costs
pain-points
traffic pattern characteristics in data centers
VL2--virtual layer 2
network virtualization
uniform high capacity
Cloud service datacenter
50k-200k servers
scale-out is paramount; some services have 10s of
servers, others 10s of 1000s.
servers divided up among hundreds of services
Cost of servers dominates datacenter cost:
servers 45%, power infrastructure 25%;
maximize useful work per dollar spent
ugly secret: 10-30% CPU utilization considered "good"
in datacenters
servers not doing anything at all
cause
servers are purchased rarely (quarterly)
reassigning servers is hard
every tenant hoards servers
solution: more agility: any server, any service
Network diagram showing L3/L2 datacenter model
higher in the datacenter, more expensive gear,
designed for 1+1 redundancy, scale-up model; higher
layers handle higher traffic levels.
Failures higher in the model are more impactful.
10G off rack level, rack level 1G
Generally about 4,000 servers per L2 domain
network pod model keeps us from dynamically
growing/shrinking capacity
VLANs used to isolate properties from each other
IP addresses topologically determined by ARs
Reconfig of IPs and vlan trunks is painful,
error-prone, and takes time.
No performance isolation (vlan is reachability
isolation only)
one service sending/receiving too much stomps on
other services
Less and less capacity available for each server
as you go to higher levels of network: 80:1 to 240:1
oversubscriptions
2 types of apps: inward facing (HPC) and outward
facing. 80% of traffic is internal traffic; data
mining, ad relevance, indexing, etc.
dynamic reassignment of servers and map/reduce
style computations means explicit TE is almost
impossible.
Did a detailed study of 1500 servers on 79 ToR
switches.
Look at every 5-tuple for every connection.
Most of the flows are 100 to 1000 bytes; lots
of bursty, small traffic.
But most bytes are part of flows that are 100MB
or larger. Huge dichotomy not seen on the internet
at large.
median of 10 flows per server to other servers.
how volatile is traffic? cluster the traffic
matrices together.
If you use 40-60 clusters, you cover a day's worth
of traffic. More clusters give a better fit.
traffic patterns change nearly constantly.
80th percentile is 100s; 99th percentile is 800s
server to server traffic matrix; most of the
traffic is diagonal; servers that need to
communicate tend to be grouped to same
top of rack switch.
but off-rack communications slow down the
whole set of server communications.
Faults in datacenter:
high reliability near top of tree, hard to accomplish
maintenance window, unpaired router failed.
0.3% of failure events knocked out all members of
a network redundancy group
typically at lower layers of network, but not always
objectives:
developers want network virtualization; want a model
where all their servers, and only their servers are
plugged into an ethernet switch.
Uniform high capacity
Performance isolation
Layer2 semantics
flat addressing; any server can use any IP address
broadcast transmissions
VL2: distinguishing design principles
randomize to cope with volatility
separate names from locations
leverage strengths of end systems
build on proven network technology
what enables a new solution now?
programmable switches with high port density
Fast, cheap, flexible (broadcom, fulcrum)
20 port 10G switch--one big chip with 240G
List price, $10k
small buffers (2MB or 4MB packet buffers)
small forwarding table; 10k FIB entries
flexible environment; general purpose network
processor you can control.
centralized coordination
scale-out datacenters are not like enterprise networks
centralized services already control/monitor health and
role of each server (Autopilot)
Centralized control of traffic
Clos network:
ToR connect to aggs, aggs connect to intermediate node
switches; no direct cross connects.
The bisection bandwidth between each layer is the same,
so there's no need for oversubscription
You only lose 1/n chunk of bandwidth for a single
box; so you can have automated reboot of a device
to try to bring it back if it wigs out.
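(The arithmetic, with an assumed core size: with n
intermediate switches, a single failed or rebooting
box costs only 1/n of core capacity--10% for n=10--
rather than half the core in a 1+1 scale-up design.)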
Use Valiant load balancing
every flow is bounced off a random intermediate switch
provably hotspot free for any admissible traffic matrix
works well in practice.
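A toy sketch of the flow-placement idea (switch names
are made up; the real system does this with
encapsulation and ECMP, as described below): hash the
5-tuple so each flow picks a pseudo-random
intermediate switch but stays on one path:

  import hashlib

  INTERMEDIATES = ["int-1", "int-2", "int-3", "int-4"]  # hypothetical

  def pick_intermediate(src, dst, proto, sport, dport):
      # Hashing the 5-tuple spreads flows uniformly over the core
      # (the Valiant step) while keeping all packets of one flow on
      # one path, so nothing gets reordered.
      key = ("%s|%s|%s|%s|%s" % (src, dst, proto, sport, dport)).encode()
      h = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
      return INTERMEDIATES[h % len(INTERMEDIATES)]

  print(pick_intermediate("10.0.1.5", "10.0.9.7", 6, 51515, 443))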
Use encapsulation on cheap dumb devices.
two headers; outer header is for intermediate switch,
intermediate switch pops outer header, inner header
directs packet to destination rack switch.
MAC-in-MAC works well.
leverage strengths of end systems
shim driver at NDIS layer, trap the ARP, bounce to
VL2 agent, look up central system, cache the lookup,
all communication to that dest no longer pays the
lookup penalty.
You add extra kernel drivers to network stack when
you build the VM anyhow, so it's not that crazy.
Applications work with application addresses
AAs are flat names; infrastructure addresses invisible
to apps
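A toy sketch of that name-location split (addresses
and locators invented): the first packet to an AA pays
a directory lookup, everything after hits the cache:

  # Central directory maps flat application addresses (AAs) to the
  # locator of the destination's ToR switch. Entries are made up.
  DIRECTORY = {"10.1.0.7": "tor-17", "10.1.0.8": "tor-42"}
  cache = {}

  def resolve(aa):
      # In the real system this is the shim driver trapping ARP and
      # asking the central directory service; here it's a dict.
      if aa not in cache:
          cache[aa] = DIRECTORY[aa]
      return cache[aa]

  print(resolve("10.1.0.7"))  # directory lookup
  print(resolve("10.1.0.7"))  # answered from cache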
How to implement VLB while avoiding need to update
state to every host on every topology change?
many switches are optimized for uplink passthrough;
so it seems to be better to bounce *all* traffic
through intermediate switches, rather than trying
to short-circuit locally.
The intermediate switches all have same IP address,
so they all send to the same intermediate IP, it
picks one switch.
You get anycast+ECMP to get fast failover and good
valiant load balancing.
They've been growing this, and found nearly perfect
load balancing.
All-to-all shuffle of 500MB among 75 servers;
gets within 94% of perfect balancing; they charge
the gap to the overhead of the extra headers.
NICs aren't entirely full duplex; about 1.8Gb, not
2Gb, bidirectional.
Provides good performance isolation as well; as one
service starts up, it has no impact on a service
already running in steady state.
VLB does as well as adaptive routing (TE using
oracle) on datacenter traffic
worst link is 20% busier with VLB; median is same.
And that's assuming perfect knowledge of future
traffic flows.
Related work:
OpenFlow
wow that went fast!
Key to datacenter economics is agility!
any server any service
network is largest blocker
right network model to create is virtual layer 2
per service
VL2 uses:
randomization
name-location separation
end systems
Q: Joe Provo--shim only applied to intra-datacenter
traffic; external traffic is *NOT* encapsulated?
A: Yes!
Q: This looks familiar to 802.1aq in IEEE; when you
did the test case, how many did you look at moving
across virtualized domains?
A: because they punt to centralized name system,
there is no limit to how often servers are switched,
or how many servers you use; you can have 10 servers
or 100,000 servers; they can move resources on 10ms
granularity.
Scalability is how many servers can go into VL2 "vlan"
and update the information.
In terms of number of virtual layer 2 environments,
it's looking like 100s to 1000s.
IEEE is looking at MAC-in-MAC for silicon based benefits;
vlans won't scale, so they use the 802.1ah header, gives
them 16M possibility, use IS-IS to replace spanning tree.
Did they look at moving entire topologies, or just servers?
They don't want to move whole topology, just movement in
the leaves.
Todd Underwood, Google; separate tenants, all work for
the same company, but they all have different stacks,
no coordination among them. this sounds like a
competing federation within the same company; why
does microsoft need this?
A: If you can handle this chaos, you can handle
anything!
And in addition to hosting their own services, they
also do hosting of other outsourced services like
exchange and sharepoint.
Microsoft has hundreds of internal properties
essentially.
Q: this punts on making the software side work
together, right? Makes the network handle it at
the many to many layer.
Q: Dani, Peakweb--how often is the shim lookup happening,
is it start of every flow?
A: Yes, start of every flow; that works out well; you
could aggregate, have a routing table, but doing it
per dest flow works well.
Q: Is it all L3, or is there any spanning tree involved?
A: No, network is all L3.
Q: Did you look at Woven at all?
A: Their solution works to about 4,000 servers, but it
doesn't scale beyond that.
Break for 25 minutes now,
11:40 start time. We'll pop in a few more lightning
talks.
Somebody left glasses at beer and gear, reg desk has
them. :)
Break now!
Vote for SC members!!
Next up, Mirjam Kuhne, RIPE NCC,
RIPE Labs, new initiative of RIPE NCC
First there was RIPE, the equivalent of NANOG,
then the NCC came into existence to handle the
meeting coordination, act as registrar, and handle
mailing lists, etc.
RIPE Labs is a website, and a platform and a tool
for the community
You can test and evaluate new tools and prototypes
contribute new ideas
why RIPE labs?
faster, tighter innovation cycle
provide useful prototypes to you earlier
adapt to the changing environment more quickly
closer involvement of the community
openness
make feedback and suggestions faster and more
effective
http://labs.ripe.net/
many of the talks here are perfect candidates for
material to post on labs, to get feedback from your
colleagues, get research results, post new findings.
How can it benefit you?
get involved, share information, discover others
working on similar issues, get more exposure.
Few rules:
free and civil discussion between individuals
anyone can read content
register before contributing
no service guarantee
content can disappear based on
community feedback
legal or abuse issues
too little resources
What's on RIPE Labs?
DNS Lameness measurement tool
REX, the resource explorer
Intro to internet number resource database
IP address usage movies
16-bit ASN exhaustion data
NetSent next gen information service
Please take a look and participate!
mir at ripe.net or labs at ripe.net
Q: Cathy Aaronson notes that ISP security
BOF is looking for place to disseminate
information; but they should probably get
in touch with you!
Kevin Oberman is up next, from ESnet
DNSSec Basics--don't fear the signer!
why you should sign your data sooner rather
than later
this is your one shot to experiment with signing
when you can screw up and nobody will care!
later, you screw up, you disappear from the net.
DNSSEC uses public crypto, similar to SSH
DNSSEC uses a trust anchor system, NOT PKI! No certs!
Starts at root, and traces down.
Root key is well known
Root knows net key
net knows es key
es key signs *.es.net
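A toy model of that walk (keys and zones are
placeholders; real validation checks DS/RRSIG records
cryptographically): a key is trusted only if an
already-trusted parent key vouches for it, bottoming
out at the well-known root key:

  # Parent key -> child keys it signs; only the root is axiomatic.
  ENDORSES = {("root", "key-root"): [("net", "key-net")],
              ("net", "key-net"): [("es.net", "key-es")]}

  def trusted(zone, key):
      if (zone, key) == ("root", "key-root"):
          return True
      return any((zone, key) in kids and trusted(pz, pk)
                 for (pz, pk), kids in ENDORSES.items())

  print(trusted("es.net", "key-es"))   # True: root -> net -> es.net
  print(trusted("es.net", "key-bad"))  # False: no chain to the root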
Perfect time to test and experiment without fear.
Once you publish keys, and people validate, you
don't want to experiment and fail--you will
disappear!
signing your information has no impact.
Only when you publish your keys will it have impact.
It is REALLY getting closer!
Root will be signed 2010
Org and Gov are signed now
com and net should be signed 2011
Multiple ccTLDs are signed; .se led the way,
and have lots of experience; only once did they
disappear, and that was due to a missing dot in a
config file; not at all DNSSEC related.
Registration issues still being worked on
transfers are of particular concern
an unhappy losing registrar could hurt you!
Implementation
Until your parent is ready
Develop signing policies and procedures
test, test, and test some more
key re-signing
key rolls
management tools
find out how to transfer the initial key to your parent
(when parent decides)
this is a trust issue--are you really "big-bank.com"
If you're brave
you can test validation (very few doing it--test on
internal server first!!) -- if this breaks, your
users will hurt (but not outside world)
You can give your public keys to the DLV (or ITARs)
this can hurt even more!
(DLV is automated, works with BIND out of box, it's
simpler, but you can choose which way to go)
What to sign?
Forward zone is big win
reverse zone has less value
may not want to sign some or all reverse or forward zones
signing involves 2 types of keys
ZSK and KSK: the zone data key, and the key for
sending keys to the parent
keys need to be rolled regularly
if all keys and signatures expire, you lose all access,
period.
use two active keys
data resigned by 2 newest keys
sign at short intervals compared to expiration to
allow time to fix things.
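Back-of-envelope on that margin (intervals assumed,
not ESnet's actual policy): with 30-day signatures
re-signed weekly, a broken signing run still leaves
about three weeks to fix things before records start
expiring:

  from datetime import date, timedelta

  VALIDITY = timedelta(days=30)      # assumed RRSIG lifetime
  RESIGN_EVERY = timedelta(days=7)   # assumed signing interval

  signed = date(2009, 10, 20)
  expires = signed + VALIDITY
  next_resign = signed + RESIGN_EVERY
  margin = expires - next_resign     # time to notice and fix a failure
  print("re-sign by %s; valid until %s; margin %d days"
        % (next_resign, expires, margin.days))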
new keys require parent to be notified.
KSKs are 'safe', kept off the network (rotate annually)
Wait for BIND 9.7, it'll make your life much
easier.
There are commercial shipping products out there.
Make sure there are at least 2 people who can
run it, in case one gets hit by a bus.
Read NIST SP800-81
SP800-81r1 is out for comment now
Read BIND admin reference manual.
Once in a lifetime opportunity!!
Arien Vijin, AMS-IX
an MPLS/VPLS based internet exchange
(started off as a coax cable between routers)
then became a Cisco 5500 switch, AMS-IX version 1,
then 2001 went to Foundry switches at gig, version 2,
version 3 has optical switching
AMS-IX version 3 vs AMS-IX version 4
June 2009 version 3
six sites, 2 with core switches in middle
two star networks
E, FE, GE, N*GE connections on BI-15K or RX8 switches
N*10GE connections resiliently connected on switching
platform (MLX16 or MLX32)
two separate networks, one active at any moment in
time.
selection of active network by VSRP
inactive network switch blocks ports to prevent loops
photonic switch basically flips from one network to the
other network.
Network had some scaling problems at the end.
Until now, they could always just buy bigger
boxes in the core to handle traffic.
Summer of 2009, they realized there was no sign of
a bigger switch on the horizon to replace the core.
core switches fully utilized with 10GE ports
limits ISL upgrade
no other switches on market
platform failover introduces short link flap on all
10GE customer ports--this leads to BGP flaps
with more 10G customers this becomes more of an issue
AMSIX version 4 requirements
scale to 2x port count
keep resilience in platform, but reduce impact on
failover (photonic switch layer)
increase amount of 10G customer ports on access switches
more local switching
migrate to single architecture platform
reduce management overhead
use future-proof platform that supports 40GE and 100GE
2010/2011 fully standardized
They moved to 4 central core switches, all meshed
together; every edge switch has 4 links, one to each
core.
Photonic switch for 10G members, to have redundancy
for customers.
MPLS/VPLS-based peering platform
scaling of core switches by adding extra switches in
parallel
4 LSPs between each pair of access switches
primary and secondary (backup) paths defined
OSPF
BFD for fast detection of link failures
RSVP-TE signalled LSPs over predefined paths
primary/secondary paths defined
VPLS instance per vlan
statically defined VPLS peers (LDP signalled)
load balanced over parallel LSPs over all core routers
Layer 2 ACLs instead of port security
manual adjustment for now
(people have to call with new MAC addresses)
Now they're P/PE routers, not core and access
switches. ^_^;
Resilience is handled by LSP switchover from
primary to secondary path; totally transparent
to access router.
If whole switch breaks down, photonic switch
is used to flip all customers to the secondary
switch.
So, they can only run switches at 50% to allow
for photonic failover of traffic.
How to migrate the platform without customer
impact?
Build new version of photonic switch control daemon (PSCD)
No VSRP traps, but LSP state in MPLS cloud
develop configuration automation
describe network in XML, generate configuration from this
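A minimal sketch of that generation step (the XML
schema and config syntax here are invented, not
AMS-IX's): describe each access-switch pair's LSPs
once, and emit the repetitive configuration from it:

  import xml.etree.ElementTree as ET

  topology = ET.fromstring("""
  <network>
    <lsp from="pe1" to="pe2" primary="core1" secondary="core3"/>
    <lsp from="pe1" to="pe3" primary="core2" secondary="core4"/>
  </network>""")

  for lsp in topology.iter("lsp"):
      a, b = lsp.get("from"), lsp.get("to")
      print("lsp %s-to-%s" % (a, b))
      print("  primary-path via %s" % lsp.get("primary"))
      print("  secondary-path via %s" % lsp.get("secondary"))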
Move non MPLS capable access switches behind MPLS
routers and PXC as a 10GE customer connection
Upgrade all non MPLS capable 10GE access switches to
Brocade MLX hardware
Define migration scenario with no customer impact
2 colocation sites only for simplicity
double L2 network
VSRP for master/slave selection and loop protection
Move GE access behind PXC
Migrate one half to MPLS/VPLS network
Use PXC to move traffic to MPLS/VPLS network, test
for several weeks.
After six weeks, did the second half of the network.
Now, two separate MPLS/VPLS networks.
Waited for traffic on all backbone links to drop
below 50%; split uplinks to hit all the core P
devices; at that point, traffic then began using
paths through all 4 P router cores.
Migration--Conclusion
Traffic load balancing over multiple core switches
solves scaling issues in the core
Increased stability of the platform
Backbone failures are handled in the MPLS cloud and
not seen at the access level
Access switch failures are handled by PXC for single
pair of switches only
Operational experience
BFD instability
High LC CPU load caused BFD timeouts
resolved by increasing timers
Bug: ghost tunnels
double "up" event for LSP path
results in unequal load balancing
should be fixed in next patch release
multicast replication
replication done on ingress PE, not in core
only uses 1st link of aggregate of 1st LSP
with PIM-SM snooping traffic is balanced over multiple
links but has serious bugs
bugfixes and load balancing fixes scheduled for future
code releases.
RIPE TTM boxes used to measure delay through the fabric,
GPS timestamps.
Enormous amounts of jitter in the fabric, delays up to
40ms in the fabric.
TTM test traffic sends 2 packets per minute, with some
entropy change (source port changes)
VPLS CAM entries age out after 60s
for 24-port aggregates, traffic often passes a port
without programming (CPU learning), high delay
does not affect real-world traffic, hopefully
will look to change CAM timing
packet is claustrophobic?
customer stack issue
increased stability
backbone failures handled by MPLS (not seen by customers)
access switch failures handled for a single pair of
switches now
easier debugging of customer ports
swap customers to a different switch using Glimmerglass
config generation
absolute necessity due to large size MPLS/VPLS configs
Scalability (future options)
bigger core
more ports
Some issues were found, but nothing that materially
impacted customer traffic
Traffic load-sharing over multiple links is good.
Q: did anything change for gigE access customers,
or are they still homed to one switch?
A: nothing changed for gigE customers; glimmerglass
is single-mode optical only, and they're too
expensive for cheap GigE ports.
no growth in 1G ports; no more FE ports; it's
really moving to a 10G only fabric.
RAS and Avi are up next
Future of Internet Exchange Points
Brief recap of history of exchange points
0th gen--throw cable over wall; PSI and Sprint
conspire to bypass ANS; third network wanted in,
MAE-East was born
1st commercial gen: FDDI, ethernet; multi-access,
had head of line blocking issues.
2nd gen: ATM exchange points, from AADS/PBNAP to
the MAEs, peermaker
3rd gen: GigE exchange points, mostly nonblocking
internal switches, PAIX, rise of Equinix, LINX,
AMS-iX
4th gen: 10G exchange points, upgrades, scale-out
of existing IXes through 2 or 3 revs of hardware
Modern exchange points are almost exclusively
ethernet based; cheap, no ATM headaches
10GE and Nx10GE have been primary growth for years.
Primarily flat L2 VLAN
IX has IP block (/24 or so)
each member router gets 1 IP
any member can talk to any other via L2
some broadcast (ARP) traffic is needed
well policed
Large IX topology (LINX), running 8x10G or 16x10G
trunks between locations
What's the problem?
L2 networks are easy to disrupt
forwarding loops easy to create
broadcast storms easy to create, no TTL
takes down not only exchange point, but overwhelms
peering router control plane as well
today we work around these issues by
locking down port to single MAC
hard coded, or learn single MAC only
single directly connected router port allowed
careful monitoring of member traffic with sniffers
good IXes have well trained staff for rapid responses
Accountability
most routers have poor L2 stat tracking
options in use:
Netflow from member router
no MAC layer info, can't do inbound traffic
some platforms can't do netflow well at all
SFlow from member routers or from IX operator
still sampled, off by 5% or more
MAC accounting from member router
not available on vast majority of platforms today
None integrate well with provider 95th percentile
billing systems
IXs are a poor choice for delivering billed services
If you can't bill, you can't sell services over the
platform.
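For reference, the convention those billing systems
expect (a generic sketch, not any vendor's
implementation): take 5-minute samples all month,
sort, and bill the value 95% of the way up, so the
top 5% of bursts ride free:

  def ninety_fifth_percentile(samples_mbps):
      # Sort the 5-minute samples and take the value below which
      # 95% of them fall; the top 5% of samples are discarded.
      s = sorted(samples_mbps)
      return s[max(0, int(len(s) * 0.95) - 1)]

  # A 30-day month has ~8640 five-minute samples; toy data here.
  print(ninety_fifth_percentile([10, 12, 11, 95, 14, 13, 12, 11, 10, 400]))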
Security
Anyone can talk to anyone else
vulnerable to traffic injection
poor accounting options make this hard to detect.
when detected, easy to excuse
less security available for selling paid transit
Vulnerable to Denial of Service attacks
can even be delivered from the outside world if
the IX IP block is announced (as is frequently the case)
Vulnerable to traffic interception, ARP/CAM manipulation
Scalability
difficult to scale and debug large layer 2 networks
redundancy provided through spanning-tree or similar
backup-path protocols
large portions of network placed into blocking mode to
provide redundancy.
Manageability
poor controls over traffic rates and/or QoS
difficult to manage multi-router redundancy
multiple routers see the same IX/24 in multiple places
creates an "anycast" effect to the peer next-hops
can result in blackholing if there is an IX segmentation
or if there is an outage which doesn't drop link state.
Other issues:
inter-network jumbo-frames support is difficult
no ability to negotiate per-peer MTU
almost impossible to find common acceptable MTU for
everyone
service is constrained to IP only between two routers
can't use for L2 transport handoff
Avi talks about shared broadcast domain architecture
on the exchange points today.
Alternative is to use point to point virtual circuits,
like the ATM exchanges.
Adds overhead to setup process
adds security, accountablity advantages
Under Ethernet, you can do vlans using 802.1q and
hand off multiple virtual circuit vlans.
Biggest issue is limited VLAN ID space
limited to 4096 possible IDs--12-bit ID space
vlan stacking can scale this in transport
but VLANs in this are global across the system
Means a 65 member exchange would completely
fill up the VLAN ID space with a full mesh.
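(Back-of-envelope, assuming one global ID per ordered
handoff: a 65 member full mesh is 65 * 64 = 4,160
IDs, just past the 4,096 a 12-bit space allows; even
sharing one ID per pair, 65 members burn
65 * 64 / 2 = 2,080, and the space tops out around
91 members.)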
Traditional VLAN rewrites don't help either.
Now, the exchange also has to be arbiter of all
the VLANs used on the exchange.
Many customers use layer3 switch/routers, so the
vlan may be global across the whole device.
To get away from broadcast domain without using
strict vlans, we need to look at something else.
MPLS as transport rather than Ethernet
solves vlan scaling problems
MPLS pseudowire VC IDs are 32 bits; 4 billion VCs
VLAN ID not carried with the packet, used only on handoffs
VLAN IDs not a shared resource anymore
Solves VLAN ID conflict problems
members choose vlan ID per VC handoff
no requirements for vlan IDs to match on each end
solves network scaling problems
using MPLS TE far more flexible than L2 protocols
allows the IX to build more complex topologies,
interconnect more locations, and more efficiently
utilize resources.
The idea is to move the exchange from L2 to L3 to
scale better, give more flexibility, and do better
debugging. You can get better stats, you can do
parallel traffic handling for scaling and redundancy,
and you see link errors when they happen, they aren't
masked by blocked ports.
Security
each virtual circuit would be isolated and secure
no mechanism for a third party to inject or sniff traffic
significantly reduced DOS potential
Accountability
Most provide SNMP measurement for vlan subints
Members can accurately measure traffic on each VC
without "guesstimation"
capable of integrating with most billing systems.
Now you can start thinking about selling transport
over exchange points, for example
Takes the exchange point out of the middle of the
traffic accounting process.
Services
with more accountability and security, you can offer
paid services
support for "bandwidth on demand" now possible
no longer constrained to IP-only or one-router-only
can be used to connect transport circuits, SANs, etc.
JumboFrame negotiation possible, since MTU is per
interconnect
Could interconnect with existing metro transport
Use Q-in-Q vlan stacking to extend the network onto
third party infrastructures
imagine a single IX platform to service thousands of
buildings!
Could auto-negotiate VC setup using a web portal
Current exchanges mostly work
with careful engineering to protect the L2 core
with limited locations and chassis
with significant redundancy overhead
for IP services only
A new kind of exchange point would be better
could transform a "peering only" platform into a
new "ecosystem" to buy and sell services on.
Q: Arien from AMS-IX asks about MTU--why does it matter?
A: it's for the peer ports on both sides.
Q: they offer private interconnects at AMS-IX, but nobody
wants to do that, they don't want to move to a tagged
port. They like having a single vlan, single IP to
talk to everyone.
A: The reason RAS doesn't do it is that it's limited in
scale, you have to negotiate the vlan IDs with each side;
there's a slow provisioning cycle for it; it needs to
have same level of speed as what we're used to on IRC.
Need to eliminate the fees associated with the VLAN
setup, to make it more attractive.
It'll burn IPs as well (though for v6, that's not so much
of an issue)
Having people peer with the route-server is also useful
for people who don't speak the language who use the
route servers to pass routes back and forth.
The question of going outside Amsterdam came up, but
the members forbade it, so that it wouldn't compete with
other transit and transport providers.
But within a metro location, it could open more locations
to participate on the exchange point.
The challenge in doing provisioning to many locations
is something that there is a business model for within
the metro region.
Anything else, fling your questions at lunch; return at
1430 hours!
LUNCH!! And Vote! And fill out your survey!!
Everyone:
Hope all at NANOG47 in person or remote are enjoying a great Program!!
A couple of reminders....
PC Nominations have closed. Merit is working to process the last-minute nominations and acceptances. As soon as we catch up, the information will be posted on the website.
MLC Nominations continue.
2009 Election process closes at 9:15 Wednesday am. Please do support the process, it is your community... so VOTE!
http://nanog.org/governance/elections/2009elections/
Lastly, we need your input, do take a moment and complete the survey!
http://www.surveymonkey.com/s.aspx?sm=OGYmCMKmi88ROAl_2fPAlEHw_3d_3d
All Best.
Betty
Merit and SC representative
Hi there,
A customer of mine is reporting that there are a large number of addresses
he cannot reach with his addresses in the 109/8 range. This was
declassified as a BOGON and assigned by IANA to RIPE in January 2009.
If you have a manually updated BOGON list, can I please ask that you review
it and update it as soon as possible? His addresses in 89/8 and 83/8
work just fine, hence the presumption of BOGON filtering.
Matthew Walster