Scalable Mail solution with NAS
My company is in the process of evaluating several mail solutions, scalable to 150k to 200k mailboxes. One thing we'd like to do is run the message store over Gig-E on network attached storage. Two of the vendors we've been looking at claim performance issues running this solution over NFS. Does anyone know of a carrier-class mail solution that will run well on NAS? Patrick Hollowell Sr. Network Engineer CTC Internet Services phollowell@vnet.net 800-377-3282 x3527
You might want to check out CriticalPath. We are in the process of designing a mail system using an EMC Clariion and CriticalPath's software suite. Sounds like a similar setup. jas -----Original Message----- From: owner-nanog@merit.edu [mailto:owner-nanog@merit.edu]On Behalf Of Patrick Hollowell Sent: Tuesday, January 30, 2001 8:26 PM To: 'nanog@merit.edu' Subject: Scalable Mail solution with NAS My company is in the process of evaluating several mail solutions, scalable to 150k to 200k mailboxes. One thing we'd like to do is run the message store over Gig-E on network attached storage. Two of the vendors we've been looking at claim performance issues running this solution over NFS. Does anyone know of a carrier-class mail solution that will run well on NAS? Patrick Hollowell Sr. Network Engineer CTC Internet Services phollowell@vnet.net 800-377-3282 x3527
Patrick Hollowell [phollowell@vnet.net] wrote:
My company is in the process of evaluating several mail solutions, scalable to 150k to 200k mailboxes. One thing we'd like to do is run the message store over Gig-E on network attached storage. Two of the vendors we've been looking at claim performance issues running this solution over NFS. Does anyone know of a carrier-class mail solution that will run well on NAS?
Take a look at NetApp. My company (unfortunately) signed an NDA with NetApp, but they've posted that Yahoo uses NetApp for their e-mail: http://www.netapp.com/partners/catalog.cgi/company/28 Rumor has it (no, I'm not violating my NDA) that Hotmail also uses NetApp. -Plenty- of other large sites use 'em for e-mail. Just call a NetApp sales person and ask for the list. It's impressive. I do believe it contains some carrier-class implementations. Mike -- Mike Johnson Network Engineer / iSun Networks, Inc. Morrisville, NC All opinions are mine, not those of my employer
Ok, this is my beef with NetApp. We have a NetApp F720 with a single disk shelf. The F720 "brain box" has two power supply units that slide into the back on the right and left side, each having its own 48V DC connectors (about the size of LS1010 power supplies).
Now the disk shelf. It has two power supplies that slide into the front and connect into the backplane with no connectors on the front. A fixed connector assembly is located on the back. Just one. One! No A side. No B side. Just one +/-/GND.
We gave NetApp a call and their workaround was "you could use a diode and connect both A & B wires to the unit". Uhh... thanks. They also told us their design engineer had already been slapped on the hand, and they are working on their next version. It was an interesting gotcha for our server engineer, who wasn't too familiar with DC power plants.
Now, does anyone know of a diode that can do 10A at 48V? Any EE's out there?
Mike Johnson wrote:
Patrick Hollowell [phollowell@vnet.net] wrote:
My company is in the process of evaluating several mail solutions, scalable to 150k to 200k mailboxes. One thing we'd like to do is run the message store over Gig-E on network attached storage. Two of the vendors we've been looking at claim performance issues running this solution over NFS. Does anyone know of a carrier-class mail solution that will run well on NAS?
Take a look at NetApp. My company (unfortunately) signed an NDA with NetApp, but they've posted that Yahoo uses NetApp for their e-mail: http://www.netapp.com/partners/catalog.cgi/company/28
Rumor has it (no, I'm not violating my NDA) that Hotmail also uses NetApp. -Plenty- of other large sites use 'em for e-mail. Just call a NetApp sales person and ask for the list. It's impressive. I do believe it contains some carrier-class implementations.
Mike
Unnamed Administration sources reported that Kris S. Amundson said:
Now, does anyone know of a diode that can do 10A at 48V? Any EE's out there?
Aye... This is easy; many "icecube" bridge rectifiers will meet this spec. Digikey, ESO, etc. should have same for $5-10 max. Details upon request. -- A host is a host from coast to coast.................wb8foz@nrk.com & no one will talk to a host that's close........[v].(301) 56-LINUX Unless the host (that isn't close).........................pob 1433 is busy, hung or dead....................................20915-1433
Take a look at NetApp. My company (unfortunately) signed an NDA with NetApp, but they've posted that Yahoo uses NetApp for their e-mail: http://www.netapp.com/partners/catalog.cgi/company/28
Rumor has it (no, I'm not violating my NDA) that Hotmail also uses NetApp. -Plenty- of other large sites use 'em for e-mail. Just call a NetApp sales person and ask for the list. It's impressive. I do believe it contains some carrier-class implementations.
We use Netapps at COLT for email. And as I understand it, Demon Internet in the UK uses them also and they have over 200,000 users taking SMTP and pop3 feeds. Regards, Neil.
Their stuff is state of the art for large storage; I have worked for two ISPs that have used several of them.
Brian
On Wed, 31 Jan 2001, Neil J. McRae wrote:
Take a look at NetApp. My company (unfortunately) signed an NDA with NetApp, but they've posted that Yahoo uses NetApp for their e-mail: http://www.netapp.com/partners/catalog.cgi/company/28
Rumor has it (no, I'm not violating my NDA) that Hotmail also uses NetApp. -Plenty- of other large sites use 'em for e-mail. Just call a NetApp sales person and ask for the list. It's impressive. I do believe it contains some carrier-class implementations.
We use Netapps at COLT for email. And as I understand it, Demon Internet in the UK uses them also and they have over 200,000 users taking SMTP and pop3 feeds.
Regards, Neil.
You may want to check Stalker Software's Communigate Pro. www.stalker.com (I'm not sure what it can do over NFS but it can support 200k mailboxes on a single server). --vadim On Tue, 30 Jan 2001, Patrick Hollowell wrote:
My company is in the process of evaluating several mail solutions, scalable to 150k to 200k mailboxes. One thing we'd like to do is run the message store over Gig-E on network attached storage. Two of the vendors we've been looking at claim performance issues running this solution over NFS. Does anyone know of a carrier-class mail solution that will run well on NAS?
Patrick Hollowell Sr. Network Engineer CTC Internet Services phollowell@vnet.net 800-377-3282 x3527
Subject: Re: Scalable Mail solution with NAS
You may want to check Stalker Software's Communigate Pro.
www.stalker.com
(I'm not sure what it can do over NFS but it can support 200k mailboxes on a single server).
--vadim
Is the number of mailboxes the key metric? What breaks sendmail + "a very big disk"? Isn't it the traffic? Chris
On Wed, 31 Jan 2001 10:13:40 EST, chrisb@kippona.com said:
Is the number of mailboxes the key metric? What breaks sendmail + "a very big disk"? Isn't it the traffic?
The two biggest problems with very-high-volume servers and sendmail are:
1) You *really* need to use multiple queues and some sort of aging scheme, so mail backlogged for dead hosts gets out of your main queue. If a queue gets too full, Sendmail exhibits bad O(N**2) behavior in sorting/running the queue.
2) If you are serving mailboxes (as opposed to a Listserv-type machine where the mail *leaves*), what can kill you isn't the sendmail, but the local delivery program and POP/IMAP checks. You get enough bozo users who have set Eudora to check for new mail every 2 minutes, you'll get bogged down no matter HOW fast Sendmail itself is.
-- Valdis Kletnieks Operating Systems Analyst Virginia Tech
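The aging idea in point 1 can be sketched roughly as below. This is only an illustration in Python, not sendmail's own mechanism: the `QueuedMessage` type, the two queue names (main and slow), and the four-hour threshold are all assumptions invented for the example.

```python
from dataclasses import dataclass

# Hypothetical threshold: messages queued longer than this leave the main queue.
MAX_MAIN_QUEUE_AGE_SECONDS = 4 * 60 * 60


@dataclass(frozen=True)
class QueuedMessage:
    queue_id: str           # illustrative identifier, not a real sendmail queue ID
    first_queued_at: float  # seconds since the epoch


def split_by_age(messages, now, max_age=MAX_MAIN_QUEUE_AGE_SECONDS):
    """Partition messages into a main queue and a slow queue for old mail."""
    main, slow = [], []
    for msg in messages:
        age = now - msg.first_queued_at
        # Mail that has sat longer than max_age is probably bound for a dead
        # host, so it moves to the slow queue and stops inflating the main one.
        (slow if age > max_age else main).append(msg)
    return main, slow
```

The point of the split is that the main queue, which is run often, stays small, while the slow queue can be run at a much longer interval. How often each queue is actually run is a local tuning decision.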
On Wed, 31 Jan 2001 Valdis.Kletnieks@vt.edu wrote:
On Wed, 31 Jan 2001 10:13:40 EST, chrisb@kippona.com said:
Is the number of mailboxes the key metric? What breaks sendmail + "a very big disk"? Isn't it the traffic?
The two biggest problems with very-high-volume servers and sendmail are:
1) You *really* need to use multiple queues and some sort of aging scheme, so mail backlogged for dead hosts gets out of your main queue. If a queue gets too full, Sendmail exhibits bad O(N**2) behavior in sorting/running the queue.
2) If you are serving mailboxes (as opposed to a Listserv-type machine where the mail *leaves*), what can kill you isn't the sendmail, but the local delivery program and POP/IMAP checks. You get enough bozo users who have set Eudora to check for new mail every 2 minutes, you'll get bogged down no matter HOW fast Sendmail itself is.
Your second point should in fact be split in two.
1. You're going to have a hard time handling the number of incoming POP connections, yes. That's true, and there's nothing you can do about it except scale your server farm accordingly or deny consecutive connections within a 5- or 10-minute period.
2. The more mailboxes you have, the slower the entire popping process will be. The reason is very simple: each POP process that spawns has to read your mailbox directory. If you have delivered all your mail to mailboxes that all sit in the same directory, it takes more and more time to scan the directory to find your mailbox. One way to fix this is to use a hashing scheme that splits the mailboxes across a subdirectory structure. For example, johndoe@yourdomain.com would have his mailbox in /export/mailboxes/j/o/h/n/johndoe.mbox, so in /export/mailboxes, in order to find the j directory, you only have about 36 directory entries or so to scan. This example does not work well, though, if you accept usernames with three or fewer characters.
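A minimal sketch of that directory scheme in Python follows. The root path and the four-level layout come from the example above; the padding character for short usernames, the lowercasing, and the restriction to alphanumeric usernames are assumptions added here to make the sketch complete.

```python
from pathlib import PurePosixPath

MAILBOX_ROOT = PurePosixPath("/export/mailboxes")  # path from the example above
HASH_DEPTH = 4   # one directory level per leading character, as in j/o/h/n
PAD_CHAR = "_"   # assumption: filler for usernames shorter than HASH_DEPTH


def mailbox_path(address: str) -> PurePosixPath:
    """Map johndoe@yourdomain.com to /export/mailboxes/j/o/h/n/johndoe.mbox."""
    local_part = address.split("@", 1)[0].lower()
    # Assumption: usernames are alphanumeric only, which also guarantees that
    # PAD_CHAR can never collide with a real username character.
    if not local_part.isalnum():
        raise ValueError(f"unsupported username: {local_part!r}")
    # Pad short names so every mailbox sits at the same directory depth.
    levels = local_part[:HASH_DEPTH].ljust(HASH_DEPTH, PAD_CHAR)
    return MAILBOX_ROOT.joinpath(*levels, f"{local_part}.mbox")


print(mailbox_path("johndoe@yourdomain.com"))
print(mailbox_path("bob@yourdomain.com"))
```

Padding is one way around the short-username problem. Note that leading letters are not evenly distributed, so some directories will still fill faster than others; deriving the levels from a digest of the username instead would spread mailboxes more evenly at the cost of paths that are harder to read.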
-- Sebastien Berube sberube@zeroknowledge.com
chrisb@kippona.com [chrisb@kippona.com] wrote:
Is the number of mailboxes the key metric? What breaks sendmail + "a very big disk"? Isn't it the traffic?
Not that traffic isn't important, but you can get a lot more data from inferences based on the number of mailboxes. You can speculate on the traffic, for instance. You can also look at your disks.
In my opinion, delivery of messages to mailboxes is easy. Yes, you have to tune sendmail or switch to qmail or postfix to deal with the number of messages, but delivery is easy. Retrieving them is a bit more difficult and takes some planning. Using mbox format with a POP daemon that copies the entire spool? Allowing 25MB spools? Welcome to doubling your disk space and adding more RAM to your servers. Using maildir and a decent IMAP server? Then you can use NFS spool storage and multiple IMAP servers with some fun load balancing.
But, if you look at the total number of mailboxes, you can figure out how much disk space you'll need, approximate how many messages per second will be sent and received, and figure out how many users will be retrieving messages per second.
I am wondering if this belongs on a networks list, though...
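The kind of back-of-the-envelope estimate described here might look like the following Python sketch. Every input other than the mailbox count and the 25MB spool limit (messages per day, polling interval, fraction of users polling) is an assumed figure for illustration, and the function and parameter names are made up for this example.

```python
SECONDS_PER_DAY = 86400


def estimate_load(mailboxes, quota_mb, msgs_per_mailbox_per_day,
                  poll_interval_minutes, polling_fraction,
                  pop_copies_spool=False):
    """Return worst-case disk plus average delivery and mail-check rates."""
    disk_gb = mailboxes * quota_mb / 1024
    if pop_copies_spool:
        # An mbox POP daemon that copies the whole spool needs room for the copy.
        disk_gb *= 2
    deliveries_per_sec = mailboxes * msgs_per_mailbox_per_day / SECONDS_PER_DAY
    # Users who poll every poll_interval_minutes each generate one check per interval.
    checks_per_sec = mailboxes * polling_fraction / (poll_interval_minutes * 60)
    return {
        "disk_gb": disk_gb,
        "deliveries_per_sec": deliveries_per_sec,
        "checks_per_sec": checks_per_sec,
    }


# Hypothetical inputs: 200k mailboxes, 25MB spools, 20 messages per mailbox per
# day, and 10% of users polling every 2 minutes.
print(estimate_load(200_000, 25, 20, 2, 0.10, pop_copies_spool=True))
```

These are averages and a worst-case disk figure; real traffic peaks well above the daily average, so the rates are a floor for sizing rather than a target.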
Chris
Mike -- Mike Johnson Network Engineer / iSun Networks, Inc. Morrisville, NC All opinions are mine, not those of my employer
On 01/31/01, Mike Johnson <mike.johnson@isunnetworks.com> wrote:
I am wondering if this belongs on a networks list, though...
There's the isp-emailservers list (at isp-emailservers.com, even), but the clues are few and far between. It'd be nice if some actual content (as opposed to "please help me I've never used the Internet before and now I'm an ISP" questions) could show up there, though.
-- J.D. Falk, Product Manager, Mail Abuse Prevention System LLC
"The Internet isn't just a publishing medium or a medium for commerce, it's a social medium." -- Howard Rheingold
On Wed, 31 Jan 2001, J.D. Falk wrote:
On 01/31/01, Mike Johnson <mike.johnson@isunnetworks.com> wrote:
I am wondering if this belongs on a networks list, though...
There's the isp-emailservers list (at isp-emailservers.com, even), but the clues are few and far between. It'd be nice if some actual content (as opposed to "please help me I've never used the Internet before and now I'm an ISP" questions) could show up there, though.
Earthlink have a little paper here: http://www.earthlink.com/about/papers/mailarch.html and I half remember seeing a couple of other papers somewhere else (try LISA archives under usenix.org). The system we use is based on a few ideas from this paper. Once you start splitting between multiple servers it's pretty easy to get something that'll scale to over a million mailboxes. -- Simon Lyall. | Newsmaster | Work: simon.lyall@ihug.co.nz Senior Network/System Admin | | Home: simon@darkmere.gen.nz ihug, Auckland, NZ | Asst Doorman | Web: http://www.darkmere.gen.nz
On Thu, Feb 01, 2001 at 10:55:08AM +1300, Simon Lyall wrote:
On Wed, 31 Jan 2001, J.D. Falk wrote:
On 01/31/01, Mike Johnson <mike.johnson@isunnetworks.com> wrote:
I am wondering if this belongs on a networks list, though...
There's the isp-emailservers list (at isp-emailservers.com, even), but the clues are few and far between. It'd be nice if some actual content (as opposed to "please help me I've never used the Internet before and now I'm an ISP" questions) could show up there, though.
Earthlink have a little paper here:
http://www.earthlink.com/about/papers/mailarch.html
and I half remember seeing a couple of other papers somewhere else (try LISA archives under usenix.org).
The system we use is based on a few ideas from this paper. Once you start splitting between multiple servers it's pretty easy to get something that'll scale to over a million mailboxes.
What about using clustered servers with a SAN? I think this is also possible. For example, Legato has a cluster product which can also support SAN. Is there any security reason not to use NAS that is based on NFS? regards, -- Muljawan Hendrianto Internet Consultant Siemens Business Services Pte. Ltd. Siemens IT Services E-Business Operations 2 Kallang Sector, Singapore 349277 Tel : +65 740 7554 Fax : +65 740 7497 Mobile : +65 9824 4688
Muljawan Hendrianto [muljawan.hendrianto@siemens.com.sg] wrote:
What about using clustered servers with SAN, I think this is also possible. For example Legato has a cluster product which can also support SAN.
Because SANs become a pain when you want to implement shared storage (i.e., one central mailspool mounted by multiple systems). Certainly, it's doable, but you have to have software running on all the systems to deal with concurrent access. I've yet to find a version of this software that runs on Linux (or any other Open Sourceish OS), so it's not even a consideration for me.
Is there any security consideration not to use NAS which is based on NFS ?
Well, you just need to be careful. NFS security is reasonably well understood. Mike -- Mike Johnson Network Engineer / iSun Networks, Inc. Morrisville, NC All opinions are mine, not those of my employer
participants (14)
- Brian
- chrisb@kippona.com
- David Lesher
- J.D. Falk
- Jason Lewis
- Kris S. Amundson
- Mike Johnson
- Muljawan Hendrianto
- Neil J. McRae
- Patrick Hollowell
- Sebastien Berube
- Simon Lyall
- Vadim Antonov
- Valdis.Kletnieks@vt.edu