Hi Drew,

#Howdy, if this is off-topic I certainly apologize however I
#believe that running an NNTP server is usually part of a 'network
#operations' sphere of influence.

Dunno about that, but I'll chime in with a couple of ideas, just because
the volumes of NNTP traffic involved have gotten to the point where the
traffic alone is probably operationally interesting, everything else
aside.

#I have a few basic questions. Does anyone
#know off hand how much disk is needed for a fairly respectable NNTP server
#for a full feed?

Daily Usenet volumes are extremely sensitive to decisions about carrying
(or not carrying) even a single group. See, for example:

   http://www.newsadmin.com/top100bytes.htm

which shows that the top half dozen groups (by bytes posted) had daily
volume running:

      Binary Newsgroup                       Bytes      % Total
   ------------------------------------------------------------
   1  alt.binaries.dvdr                30,304,095,023    5.893
   2  alt.binaries.cd.image.xbox       25,796,723,944    5.017
   3  alt.binaries.dvd                 19,428,583,576    3.778
   4  alt.binaries.multimedia          17,783,671,185    3.459
   5  alt.binaries.cd.image.games      15,303,064,035    2.976
   6  alt.binaries.svcd                14,780,524,967    2.874

   [commas added to byte counts for improved legibility]

Yes, carrying or not carrying a single group can have a 30 GB/day
impact. Yes, daily traffic for a fullish feed *has* peaked in excess of
600 GB and 3 million articles/day.

If you want to carry "everything," you can multiply ~0.6 TB/day by the
number of days' retention you want to keep; note, however, that over
time your retention will drift downward as volumes continue to increase.
Also note that this is just raw article storage space, and does not
include space for article overview data, history files, etc. (There's a
quick back-of-the-envelope sizing sketch toward the end of this note.)

Daily Usenet volumes (in bytes) are also exceptionally sensitive to
maximum article size, with the 80/20 rule roughly holding for byte
traffic vs. article count (e.g., 80% of the articles by article count
will require just 20% of the transfer bandwidth). If your goal is to
live within a given bandwidth budget, or to efficiently utilize a
particular size disk array, you can readily adjust your total article
payload per day by dialing down the maximum article size you elect to
accept (again, see the sketches below).

In case you doubt these volume stats, a few sites with publicly
accessible daily traffic summaries include:

   http://nntp.abs.net/cyclone/stats/
   http://informatie.wirehub.net/news/bambam/diablo.html
   http://newsfeed.media.kyoto-u.ac.jp/innreport/

I would note that most "full" feeds today really *AREN'T* full, however.

#Also is IDE still too slow/unreliable for this type of
#operation? I know back when we got our current server IDE was very slow it
#has sped up a bit since then.

Choice of file system and storage methodology can be as critical as, or
more critical than, whether you're using IDE or SCSI. The days when
traditional article-per-file spools on UFS file systems would work are
definitely gone -- cyclical news file systems on top of ReiserFS are a
popular recipe today.

#The reason I am asking is because it has come
#time for the old NNTP server to be buried somewhere in the mountains and for
#me to procure a new one. Currently we are running a P3 600 /w about 200 GB
#of storage on Solaris and Typhoon, the reason we are replacing this server
#is for the poor performance and its abhorrent retention.

If you're planning to work with a full feed, you won't regret getting as
much CPU, memory, disk, and network connectivity as you can afford.
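As an aside, here's a minimal back-of-the-envelope sizing sketch in
Python. The 0.6 TB/day figure is the peak full-feed number cited above,
and the spool sizes plugged in are just illustrative, not measurements:

   # Rough spool sizing: raw article storage only, no overview/history.
   DAILY_VOLUME_TB = 0.6   # ~600 GB/day peak for a fullish feed (see above)

   def spool_size_tb(retention_days, daily_volume_tb=DAILY_VOLUME_TB):
       """TB of spool needed to hold a given number of days of articles."""
       return retention_days * daily_volume_tb

   def retention_days(spool_tb, daily_volume_tb=DAILY_VOLUME_TB):
       """Days of retention a fixed spool buys -- this drifts downward
       over time as daily volume keeps growing."""
       return spool_tb / daily_volume_tb

   print(spool_size_tb(10))     # 10 days of a full feed: ~6 TB
   print(retention_days(0.2))   # a 200 GB spool: ~0.33 days(!)

That last line is why a 200 GB box carrying anything close to a full
feed ends up with retention measured in hours, not days.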
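And a companion sketch for the maximum-article-size knob. It assumes the
rough 80/20 split mentioned above (the smaller 80% of articles carrying
roughly 20% of the bytes); the exact share is an assumption for
illustration, and the actual knob is whatever your server software calls
its maximum accepted article size:

   # Effect of rejecting large articles, under the rough 80/20 assumption.
   DAILY_VOLUME_TB = 0.6             # full-feed peak, as above
   SMALL_ARTICLE_BYTE_SHARE = 0.20   # assumed: the smaller 80% of articles
                                     # account for ~20% of total bytes

   def capped_daily_volume_tb(daily_volume_tb=DAILY_VOLUME_TB,
                              byte_share_kept=SMALL_ARTICLE_BYTE_SHARE):
       """Approximate daily payload if only small articles are accepted."""
       return daily_volume_tb * byte_share_kept

   print(capped_daily_volume_tb())         # ~0.12 TB/day
   print(0.2 / capped_daily_volume_tb())   # 200 GB spool: ~1.7 days retention

In other words, dropping the large-article minority keeps most of the
article count while shedding roughly 80% of the bytes, which buys a
small spool several times the retention.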
I don't want to get into hardware/OS/server religious wars, so I'll skip
any specifics here (although feel free to contact me off-list if you're
interested in talking about some starting points for hardware options
that seem to work okay).

Regards,

Joe St Sauver (joe@oregon.uoregon.edu)
University of Oregon Computing Center