On Wed, Feb 19, 2014 at 2:06 PM, Jay Ashworth <jra@baylink.com> wrote:
> ----- Original Message -----
> From: "Eugeniu Patrascu" <eugen@imacandi.net>
> [snip]
> My understanding of "cluster-aware filesystem" was "can be mounted at the
> physical block level by multiple operating system instances with complete
> safety". That seems to conflict with what you suggest, Eugeniu; am I
> missing something (as I often do)?
When one of the hosts has a virtual disk file open for write access on a VMFS cluster-aware filesystem, the file is locked to that particular host. A process on a different host is denied the ability to write to the file, or even to open it for read access. Another host cannot even read or write metadata about the file's directory entry; attempts to do so are rejected with an error.

So you don't really have to worry all that much about "as long you don't access the same files", although you certainly should not try to, either. Only the software in ESXi can access the VMFS; there is no ability to run arbitrary applications.

(Which is also why I like NFS more than shared block storage: you can conceptually use a storage array feature such as FlexClone to make a copy-on-write clone of a file, take a storage-level snapshot, and then do a granular restore of a specific VM, without having to restore the entire volume as a unit. You can't pull that off with a clustered filesystem on a block target!)

Also, the VMFS filesystem is cluster aware by method of exclusion (SCSI Reservations) and separate journaling. Metadata locks are global in the VMFS cluster-aware filesystem. Only one host at a time is allowed to write to any of the metadata on the entire volume, which results in a performance bottleneck, unless you have the VAAI VMFS extensions and your storage vendor supports ATS (atomic test and set).

For that reason, while VMFS is cluster aware, you cannot necessarily have a large number of cluster nodes, or more than a few dozen open files, before performance degrades due to the metadata bottleneck.

Another consideration: in the event of a power outage that simultaneously impacts your storage array and all your hosts, you may very well be unable to regain access to any given file until the specific host that had that file locked comes back up, or until you wait out a ~30 to ~60 minute timeout period.
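To make the locking behaviour described above a little more concrete, here is a toy Python model. It is only a sketch under loud assumptions: `ToyVolume` and all of its method names are made up for illustration, nothing here is a VMware API, and the real on-disk lock records, heartbeats, and timeouts are not modelled at all. It contrasts taking a whole-volume reservation for every metadata update with a per-file compare-and-swap in the spirit of ATS.

```python
# Toy model only: every name here is hypothetical, not a VMware interface.
class ToyVolume:
    def __init__(self):
        self.reserved_by = None   # host holding the whole-volume reservation
        self.lock_owner = {}      # file name -> host that owns the file lock

    def reserve(self, host):
        """Whole-volume reservation: only one host may hold it at a time."""
        if self.reserved_by not in (None, host):
            return False          # someone else is updating metadata
        self.reserved_by = host
        return True

    def release(self, host):
        if self.reserved_by == host:
            self.reserved_by = None

    def lock_file_reserved(self, host, name):
        """Non-ATS path: the caller must already hold the volume reservation."""
        if self.reserved_by != host:
            return False
        if self.lock_owner.get(name) not in (None, host):
            return False
        self.lock_owner[name] = host
        return True

    def lock_file_ats(self, host, name):
        """ATS-style path: test-and-set on one lock record, no volume reservation."""
        if self.lock_owner.get(name) is None:
            self.lock_owner[name] = host
            return True
        return self.lock_owner[name] == host

    def open_file(self, host, name):
        """Any access to a file locked by another host is rejected with an error."""
        owner = self.lock_owner.get(name)
        if owner is not None and owner != host:
            raise PermissionError(f"{name} is locked by {owner}")
        return f"{host} opened {name}"


vol = ToyVolume()
vol.reserve("esx1")
# While esx1 holds the reservation, esx2 cannot touch metadata for ANY file.
print(vol.lock_file_reserved("esx2", "vm2.vmdk"))
vol.lock_file_reserved("esx1", "vm1.vmdk")
vol.release("esx1")
# With the ATS-style path, esx2 only contends on the one record it wants.
print(vol.lock_file_ats("esx2", "vm2.vmdk"))
```

In this model the reservation path serializes every host on a single volume-wide flag, which is the bottleneck in question, while the ATS-style path only conflicts when two hosts want the same file.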
Cheers, -- jra
-- -JH