On Mon, Feb 27, 2012 at 11:19:27AM -0800, George Herbert wrote:
On Mon, Feb 27, 2012 at 7:28 AM, William Herrin <bill@herrin.us> wrote:
On Sun, Feb 26, 2012 at 7:02 PM, Randy Carpenter <rcarpen@network1.net> wrote:
On Feb 26, 2012, at 4:56 PM, Randy Carpenter wrote:
1. Full redundancy with instant failover to other hypervisor hosts upon hardware failure (I thought this was a given!)
This is actually a much harder problem to solve than it sounds, and gets progressively harder depending on what you mean by "failover".
At the very least, having two physical hosts capable of running your VM requires that your VM be stored on some kind of SAN (usually iSCSI based) storage system. Otherwise, two hosts have no way of accessing your VM's data if one were to die. This makes things an order of magnitude or higher more expensive.
This does not have to be true at all. Even having a fully fault-tolerant SAN in addition to spare servers should not cost much more than having separate RAID arrays inside each of the server, when you are talking about 1,000s of server (which Rackspace certainly has)
Randy,
You're kidding, right?
SAN storage costs the better part of an order of magnitude more than server storage, which itself is several times more expensive than workstation storage. That's before you duplicate the SAN and set up the replication process so that cabinet and room level failures don't take you out.
This is clearly becoming a not-NANOG-ish thread, however...
Failing to have central shared storage (iSCSI, NAS, SAN, whatever you prefer) fails the smell test on a local enterprise-grade virtualization cluster, much less a shared cloud service.
Some people have done tricks with distributing the data using one of the research-ish shared filesystems, rather than separate shared storage. That can be made to work if the host OS model and its available shared filesystems work for you. Doesn't work for Vmware Vcenter / Vmotion-ish stuff as far as I know.
There are plenty of people doing non-enterprise-grade virtualization. There's no mandate that you have the ability to migrate a virtual to another node in realtime or restart it immediately on another node if the first node dies suddenly. But anyone saying "we have a cloud" and not providing that type of service, is in marketing not engineering. From a systems architecture point of view, you can't do that.
Cloud is utterly meaningless drivel. Your idea of cloud is different from mine, which is different from my co-workers, bosses, people in marketing etc. etc. It's a vague useless term that could mean everything from a bog standard mail server through to full on 'deploy your app' things like Heroku. It would be more accurate to focus on IaaS, PaaS, SaaS et al For what little it's probably worth mentioning, Amazon provides a shared storage platform in the form of EBS, Elastic Block Storage, which you can choose to use as your root device on your server if you so wish (wouldn't advise you do, latency is unpredictable), or you can have it mounted wherever is relevant for your data (the most common route). That's their non-physical server dependent storage provision. If you pay extra it'll replicate, or even replicate between availability zones. You can also choose to have Amazon monitor and ensure sufficient numbers of your server are running through autoscale. Paul