Well, writing data at that speed is relatively easy (hint - get a box which does IP trunk bonding based on a SRC/DST hash to step an OC-192 or whatever down to, say, 64x OC-3s - which is within range of commercial RAIDs). The cost of such a solution (including disk storage, about 40 petabytes) would be about US$200 mil per OC-192 trunk per year (rough sketches of the splitting and the storage arithmetic appear at the end of this thread).

Now the question is how to extract any useful information out of it. I guess the only feasible option would be to analyze the data in real time, and record only the "interesting" bits. As a guesstimate, this would require about 1000 PC boxes per OC-192 trunk. Specialized hardware (pattern-recognition chips, etc.) could make it a lot easier. Not cheap, but doable, and it is well within the budget of the NSA to sift through all overseas Internet traffic.

Of course, encrypting the data makes all of that pretty irrelevant. That's why the FBI and NSA are so keen to stall public adoption of encryption. (When encrypted communications are rare, they can record them and break them at their leisure; when everybody is using encryption, they're helpless.)

Particle physicists routinely do very high volume real-time data analysis on a comparable scale, sifting through trillions of particle interactions to find a dozen or two interesting ones. So I wouldn't dismiss their ability to do that kind of surveillance as a technical or economic impossibility. It is certainly doable with today's technology and a bit of cleverness.

--vadim

On Mon, 29 Oct 2001, Wojtek Zlobicki wrote:
Unfortunately, just because we know how difficult it would be to actually pull this off does not mean that everyone subscribes to that view. One should not dismiss the claim based purely on its source, especially since a few very "interesting" articles have recently shown up in a number of publications, including the current issue of Forbes. The author, whose name escapes me at the moment, is under the mistaken belief that since Internet traffic flows through hubs, it would be possible to intercept it and store it on computers located at those hubs. A white paper describing the issues that arise from trying to intercept and store that much data would probably do more good than an argument about the unreliability of the source.
Alex
It's obvious that many people spreading this information (no matter how credible the source) have little knowledge of how much data flows through such hubs. If I remember correctly, AOL-TW, for example, does over 100 terabits of traffic every day. No storage system in the world (that I know of) can sustain writes at OC-192 line rate - roughly 10 Gbit/s, or about 1.25 GB/sec, which works out to some 4.5 terabytes of data per hour. And not even the most prestigious government agencies have the ability to sort through petabytes of data per day.
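To make the rates in the message above concrete, here is the unit conversion as a small Python snippet; the only input is the standard OC-192 line rate, everything else is arithmetic.

OC192_BITS_PER_SEC = 9.953e9   # standard OC-192 line rate, roughly 10 Gbit/s

gb_per_sec = OC192_BITS_PER_SEC / 8 / 1e9      # bytes written per second, in GB
tb_per_hour = gb_per_sec * 3600 / 1e3          # sustained volume per hour, in TB
tb_per_day = tb_per_hour * 24                  # and per day

print(f"{gb_per_sec:.2f} GB/s sustained")   # ~1.24 GB/s
print(f"{tb_per_hour:.1f} TB per hour")     # ~4.5 TB
print(f"{tb_per_day:.0f} TB per day")       # ~107 TB, per trunk, one direction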
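The first message in the thread proposes stepping an OC-192 down to 64 recorder links by hashing on the source/destination pair. A minimal sketch of that idea in Python follows; the sorted address pair (so both directions of a conversation land on the same link) and the use of MD5 are my illustrative choices, not something the post specifies.

import hashlib

NUM_LINKS = 64  # one OC-192 stepped down to 64x OC-3, per the post

def output_link(src_ip: str, dst_ip: str) -> int:
    """Choose a recorder link for a packet from its SRC/DST address pair.

    Sorting the pair first is an assumption on my part: it keeps both
    directions of a conversation on the same link, so each recorder sees
    complete flows it can analyze independently.
    """
    key = "|".join(sorted((src_ip, dst_ip))).encode()
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % NUM_LINKS

# Both directions of a conversation map to the same link (some value in 0..63):
print(output_link("10.0.0.1", "192.0.2.7"))
print(output_link("192.0.2.7", "10.0.0.1"))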
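And the back-of-the-envelope arithmetic behind the "about 40 petabytes" and "US$200 mil per trunk per year" figures, plus the per-box load implied by the 1000-PC guesstimate. The disk price of US$5/GB is my assumption (ballpark 2001 pricing), chosen only to show that the quoted cost is in the right range.

# Recording one OC-192 trunk, one direction, for a year.
OC192_BITS_PER_SEC = 9.953e9    # ~10 Gbit/s
SECONDS_PER_YEAR = 365 * 24 * 3600
DISK_COST_PER_GB = 5.0          # assumed US$/GB, circa-2001 pricing (not from the post)
PC_BOXES = 1000                 # the post's guesstimate for real-time analysis

bytes_per_year = OC192_BITS_PER_SEC / 8 * SECONDS_PER_YEAR
disk_cost = bytes_per_year / 1e9 * DISK_COST_PER_GB

print(f"storage per trunk-year: {bytes_per_year / 1e15:.0f} PB")   # ~39 PB
print(f"disk cost at ${DISK_COST_PER_GB:.0f}/GB: ${disk_cost / 1e6:.0f} million")  # ~$196 million

# If the trunk is fanned out over 1000 PCs for real-time filtering:
print(f"per-box load: {OC192_BITS_PER_SEC / PC_BOXES / 1e6:.0f} Mbit/s")  # ~10 Mbit/s each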