I'm wondering what are the growing trends in connecting Data Centers for redundancy in DR/COOP environments. I imagine VPLS has a big play here, but I'm willing to bet there are all sorts of weirdness that such environments can create, such as the effect it may have on DR elections, etc. Also, what are your experiences in replication of storage over WAN links? Are people tending towards iSCSI or do trends indicate that FCoE or FCoIP may become the preferred mechanism? Any experience with WAN acceleration in such environments? Thanks in advance! -- Stefan Fouant
We are doing:
- Citrix XenServer environments at both sites, with NetApps for the SANs
- MPLS connections, with Riverbeds for WAN optimization

Let me know if you want to dig into this deeper.

Stefan Fouant wrote:
On Oct 28, 2009, at 8:26 PM, Stefan Fouant wrote:
I'm wondering what are the growing trends in connecting Data Centers for redundancy in DR/COOP environments.
'DR' is an obsolete 40-year-old mainframe concept; it never works, as funding/testing/scaling of the 'backup' systems is never adequate and/or allowed. Layer-2 between sites is evil, as well.

Layer-3 independence and active/active/etc. is where it's at in terms of high availability in the 21st century. GSLB, et al.

-----------------------------------------------------------------------
Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com>

Sorry, sometimes I mistake your existential crises for technical insights. -- xkcd #625
On Oct 28, 2009, at 10:38 AM, Roland Dobbins wrote:
On Oct 28, 2009, at 8:26 PM, Stefan Fouant wrote:
I'm wondering what are the growing trends in connecting Data Centers for redundancy in DR/COOP environments.
'DR' is an obsolete 40-year-old mainframe concept; it never works, as funding/testing/scaling of the 'backup' systems is never adequate and/or allowed.
Very true.
Layer-2 between sites is evil, as well.
Indeed. Now that VMware actually supports Layer 3 for vSphere, maybe we will start to see it go away. :)
Layer-3-independence and active/active/etc. is where it's at in terms of high availability in the 21st Century. GSLB, et. al.
Yep. That way all your environments get adequate(ish) funding, versus management saying "oh, it's just backup/DR, we will fund it next year".
Roland,

Could you elaborate on "GSLB" (Global Server Load Balancing)? Pardon if that question seems a bit noob-ish.

Thanks

Roland Dobbins wrote:
-- -"Prediction is very difficult, especially about the future." -Niels Bohr -- Ray Sanders Linux Administrator Village Voice Media Office: 602-744-6547 Cell: 602-300-4344
On Oct 29, 2009, at 12:42 AM, Ray Sanders wrote:
Could you elaborate on "GSLB" (Global Server Load Balancing)?
Architectural choices, implementation scenarios, DNS tricks to ensure optimal cleaving to and availability of distributed nodes within a given tier:

<http://www.backhand.org/mod_backhand/>
<http://www.backhand.org/wackamole/>
<http://www.spread.org/>
<http://www.dsn.jhu.edu/research/group/secure_spread/>
<http://wiki.blitzed.org/DNS_balancing>
<http://www.cisco.com/en/US/products/hw/contnetw/ps4162/>

-----------------------------------------------------------------------
Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com>
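[Editor's note: the "DNS tricks" mentioned above can be sketched in a few lines. This is a hypothetical illustration only; the site names, VIPs, and RTT figures are invented. A GSLB-style resolver answers each query with the VIP of the lowest-latency healthy data center.]

```python
# Hypothetical sketch of DNS-based GSLB: answer each query with the VIP
# of the nearest *healthy* data center. All data below is invented.

SITES = {
    "iad": {"vip": "198.51.100.10", "healthy": True,  "rtt_ms": 12},
    "sjc": {"vip": "198.51.100.20", "healthy": True,  "rtt_ms": 48},
    "ams": {"vip": "198.51.100.30", "healthy": False, "rtt_ms": 80},
}

def resolve(hostname):
    """Return the VIP of the lowest-RTT healthy site, or None if all are down."""
    healthy = [s for s in SITES.values() if s["healthy"]]
    if not healthy:
        return None
    return min(healthy, key=lambda s: s["rtt_ms"])["vip"]

print(resolve("www.example.com"))  # -> 198.51.100.10
```

Real GSLB products layer health checks, client geolocation, persistence, and short DNS TTLs on top of roughly this selection step.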
Props for mentioning mod_backhand. Excellent tool for GSLB.

On Wed, Oct 28, 2009 at 12:57 PM, Roland Dobbins <rdobbins@arbor.net> wrote:
-- Brandon Galbraith Mobile: 630.400.6992 FNAL: 630.840.2141
Also, commercial solutions from F5 (their GTM product and their old 3-DNS product).

Using CDNs is also a way of handling this, but you need to be prepared for all your traffic to come from their source IPs, or do creative things with X-Forwarded-For, etc.

Making an active/active datacenter design work (or preferably one with enough sites that more than one can be down without seriously impacting service) is a serious challenge. Lots of people will tell you about (and sell you solutions for) parts of the puzzle. My experience has been that the best case is when the architecture of the application/infrastructure has been designed with these challenges in mind from the get-go. I have seen that done on the network and server side, but never on the software side; that has always required significant effort when the time came.

The "drop-in" solutions for this (active/active database replication, middleware solutions, proxies) are always expensive in one way or another and frequently have major deployment challenges.

The network side of this can frequently be the easiest to resolve, in my experience. If you are serving up content that does not require synchronized data on the backend, that will make your life much easier, and GSLB, a CDN, or similar may help a great deal.

--D

On Wed, Oct 28, 2009 at 10:57 AM, Roland Dobbins <rdobbins@arbor.net> wrote:
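[Editor's note: the X-Forwarded-For point above can be illustrated with a minimal sketch; the CDN egress addresses are invented. Since every connection arrives from the CDN's source IPs, the origin has to walk the XFF chain right to left, past the proxies it trusts, to recover the real client address.]

```python
# Illustrative only: recovering the client IP behind a CDN. Each proxy
# appends the address it saw to X-Forwarded-For, so the rightmost entry
# NOT belonging to a trusted proxy is the best guess at the real client.
# (Leftmost entries are client-supplied and spoofable.)

TRUSTED_PROXIES = {"203.0.113.7", "203.0.113.8"}  # hypothetical CDN egress IPs

def client_ip(xff_header, peer_addr):
    """Walk X-Forwarded-For right-to-left, skipping trusted proxies."""
    hops = [h.strip() for h in xff_header.split(",") if h.strip()] + [peer_addr]
    for addr in reversed(hops):
        if addr not in TRUSTED_PROXIES:
            return addr
    return peer_addr

print(client_ip("198.51.100.99, 203.0.113.7", "203.0.113.8"))  # -> 198.51.100.99
```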
--
Darren Bolding
darren@bolding.org
-----Original Message-----
From: Darren Bolding [mailto:darren@bolding.org]
Sent: Wednesday, October 28, 2009 4:57 PM
To: Roland Dobbins
Cc: NANOG list
Subject: Re: Redundant Data Center Architectures
Thanks to everyone who has responded so far.

I should have clarified my intent a bit in the original email. I am definitely interested in architectures which support synchronized data between data center locations in as close to real time as possible. More specifically, I am interested in designs which support zero downtime during failures, or as close to zero downtime as possible. GSLB, Anycast, CDNs... those types of approaches certainly have their place, especially where the pull model is employed (DNS, Netflix, etc.). However, what types of solutions are being used for synchronized data and even network I/O on back-end systems? I've been looking at the VMware vSphere 4 Fault Tolerance stuff to synchronize the data storage and network I/O across distributed virtual machines, but I'm still worried about the consequences of doing this across WAN links with high degrees of latency, etc.

From the thread I get the feeling that L2 interconnects (VPLS, PWs) are generally considered a bad thing; I gathered as much, as I figured there would be lots of unintended consequences with regard to designated router elections and other weirdness. Besides connecting sites via L3 VPNs, what other approaches are others using? I would also appreciate any comments on the synchronization items above.

Thanks,

-- Stefan Fouant
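[Editor's note: on the latency worry above, a quick back-of-the-envelope sketch helps. A synchronous replicated write cannot acknowledge faster than one WAN round trip, so per-stream write throughput is bounded by 1/RTT. The figures below are propagation-only; real RTTs are higher.]

```python
# Why synchronous replication hurts over distance: each write must wait
# at least one round trip to the remote site before it can acknowledge.

SPEED_IN_FIBER_KM_PER_MS = 200  # light in fiber is roughly 200 km per ms

def wan_rtt_ms(distance_km):
    """Propagation-only round-trip time; queuing/serialization add more."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_PER_MS

def max_sync_writes_per_sec(distance_km, outstanding=1):
    """Upper bound on serialized synchronous replicated writes per second."""
    return outstanding * 1000 / wan_rtt_ms(distance_km)

for km in (100, 1000, 4000):
    print(f"{km:>5} km: RTT >= {wan_rtt_ms(km):.1f} ms, "
          f"<= {max_sync_writes_per_sec(km):.0f} serialized writes/s")
```

At 4000 km the floor is a 40 ms RTT, i.e. at most 25 dependent writes per second per stream, which is why long-haul replication is usually asynchronous or heavily pipelined.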
On 29/10/2009, at 8:39 AM, Stefan Fouant wrote:
Layer 2 interconnects (whether VPLS / PWE3 / or other CCC-based models) are not bad in their own right, but I think it's important to realize that extending a (sub)network across large geographical regions because applications are not building intelligence about locality or presence is a move without intelligent engineering. I hear it all the time: just extend Layer 2 between these two data centers so that we can have either (1) disaster recovery or (2) vMotion / heartbeats / etc.

The truth is we can do things better and smarter than just extending bridging domains across disparate geographical locations. Real-time storage should ideally be local, but there is no reason why it can't be "available" over the cloud to other networks. The key is to have a single namespace for all storage, to not be tied to a particular storage technology, but to simply be able to present the storage/disk/mount point to virtual machines.

Extending Layer 2 for iSCSI / SAN / and even FCoE is feasible. But let's think about the technology in detail: FCoE uses pause frames, and when there is significant geographical delay between sites, FCoE is not the right technology. It works great locally, and it should be just one technology to deliver storage locally in DCs.

Internally I would explore DNS (GSLB / anycast / etc.), and even ideas like mobile IPv4 / IPv6 mobility, before I started extending Layer 2 domains across the world.

Kind regards,
Truman
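[Editor's note: the pause-frame point above can be quantified with a rough sketch; figures are illustrative. A PAUSE frame takes one propagation delay to reach the sender, so the receiver must have enough buffer to absorb everything already in flight, and that amount grows linearly with distance.]

```python
# Why lossless-Ethernet flow control (FCoE PAUSE) breaks down over distance:
# by the time a PAUSE reaches the sender, one-way-delay's worth of data is
# already on the wire and must be buffered at the receiver.

LINK_GBPS = 10   # illustrative 10 GbE link
KM_PER_MS = 200  # propagation speed in fiber, roughly 200 km per ms

def in_flight_kb(distance_km, gbps=LINK_GBPS):
    """KB already on the wire when a PAUSE finally reaches the sender."""
    one_way_ms = distance_km / KM_PER_MS
    bits = gbps * 1e6 * one_way_ms  # Gb/s * ms -> bits
    return bits / 8 / 1024

for km in (1, 10, 100, 1000):
    print(f"{km:>4} km: buffer >= {in_flight_kb(km):,.0f} KB")
```

A few KB of buffering suffices inside a data center, but at metro or WAN distances the requirement runs into megabytes per link, which is why per-hop PAUSE-based storage transport stays local.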
Roland Dobbins wrote:
Layer-3-independence and active/active/etc. is where it's at in terms of high availability in the 21st Century. GSLB, et. al.
And that's about all you need to know. Never heard it put so succinctly - thanks, -Ryan Brooks
Layer-3-independence and active/active/etc. is where it's at in terms of high availability in the 21st Century. GSLB, et. al.
Somewhere on video.google.com is a Google I/O talk explaining the hell that is active/active redundancy and how hard it is to achieve at layers 4-7. I don't argue that it's the proper method for Layer 3, though.

-brandon

On Wed, Oct 28, 2009 at 12:38 PM, Roland Dobbins <rdobbins@arbor.net> wrote:
-- Brandon Galbraith Mobile: 630.400.6992 FNAL: 630.840.2141
On Oct 29, 2009, at 12:44 AM, Brandon Galbraith wrote:
Somewhere on video.google.com is a Google I/O talk explaining the hell that is active/active redundancy and how hard it is to achieve at layers 4-7.
Depends upon the type of apps, amount of required concurrency, etc. It's easy on the front end (which is where most of the drama tends to take place, anyways); it's the middle and back-end tiers which require some work, but it certainly can be and is accomplished daily, for both simple and more complex systems. The smart money makes use of various existing *aaS platforms to accomplish this without having to re-invent the wheel every time.

-----------------------------------------------------------------------
Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com>
participants (9)
- Brandon Galbraith
- Charles Wyble
- ChrisSerafin
- Darren Bolding
- Ray Sanders
- Roland Dobbins
- Ryan Brooks
- Stefan Fouant
- Truman Boyes