Why the US Government has so many data centers
If you've wondered why the U.S. Government has so many data centers (OK, I know no one has ever asked): the U.S. Government has an odd definition of what is a data center, which ends up with a lot of things no rational person would call a data center. If you call every room with even one server a "data center," you'll end up with tens of thousands of data centers. By this definition, I probably have two data centers in my home. It's important because Inspectors General auditors will go around and count things, because that's what they do, and write reports about insane numbers of data centers.

https://datacenters.cio.gov/optimization/

"For the purposes of this memorandum, rooms with at least one server, providing services (whether in a production, test, stage, development, or any other environment), are considered data centers. However, rooms containing only routing equipment, switches, security devices (such as firewalls), or other telecommunications components shall not be considered data centers."
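Taken literally, the memo's rule reduces to a one-line test. Here's a sketch of it as code; the room model and field names are my own invention, purely illustrative of how the definition counts:

```python
# Hypothetical illustration of the OMB memo's definition: a room counts as a
# "data center" if it contains at least one server in ANY environment
# (production, test, stage, development, ...); rooms holding only routers,
# switches, firewalls, or other telecom gear are excluded.

def is_data_center(servers: int, network_gear_only: bool) -> bool:
    """Return True if the memo would classify this room as a data center."""
    if network_gear_only:
        return False          # routing/switching/firewall-only rooms are exempt
    return servers >= 1       # a single server in any environment is enough

# Invented example rooms: (server count, network-gear-only?)
rooms = {
    "test lab with one old box": (1, False),
    "closet with a switch and a firewall": (0, True),
    "parking-garage gate controller": (1, False),
}
data_centers = [name for name, (n, gear) in rooms.items()
                if is_data_center(n, gear)]
print(data_centers)  # the test lab and the garage controller both "count"
```

Which is exactly the complaint: the test lab and the parking-garage gate controller both become "data centers" for audit purposes.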
On 12 Mar 2016, at 0:03, Sean Donelan wrote:
The U.S. Government has an odd definition of what is a data center, which ends up with a lot of things no rational person would call a data center.
There's also a case to be made that governmental organizations really oughtn't to have servers just lying around in random rooms, and that those rooms are de facto government data centers, whether those who're responsible for said rooms/servers know it or not . . .

-----------------------------------
Roland Dobbins <rdobbins@arbor.net>
On Fri, Mar 11, 2016 at 12:21 PM, Roland Dobbins <rdobbins@arbor.net> wrote:
On 12 Mar 2016, at 0:03, Sean Donelan wrote:
The U.S. Government has an odd definition of what is a data center, which ends up with a lot of things no rational person would call a data center.
There's also a case to be made that governmental organizations really oughtn't to have servers just lying around in random rooms, and that those rooms are de facto government data centers, whether those who're responsible for said rooms/servers know it or not . . .
Because ... at least:
o safe handling of media is important (did the janitor just walk off with backup tapes/disks/etc.?)
o 'a machine under your desk' is not a production operation (if you think it is, please stop, think again and move that service to conditioned power/cooling/ethernet)
I'm sure there are other reasons, but honestly those two are great starters...
Christopher Morrow wrote:
Because ... at least:
o safe handling of media is important (did the janitor just walk off with backup tapes/disks/etc.?)
o 'a machine under your desk' is not a production operation (if you think it is, please stop, think again and move that service to conditioned power/cooling/ethernet)
I'm sure there are other reasons, but honestly those 2 are great starters...
The alternative may be:
- issue an RFT for hosting / colocation facilities, plus high-speed resilient connectivity between the colo and the local network, plus the associated equipment to make this work, or
- building out an enterprise-grade comms room in the local office.
This can be hard for public sector bodies to do, and depending on the value of the data being hosted or the amount of kit that needs to be hosted, it may also not be easy to justify.
Nick
On Fri, 11 Mar 2016, Christopher Morrow wrote:
o 'a machine under your desk' is not a production operation. (if you think it is, please stop, think again and move that service to conditioned power/cooling/ethernet)
Even worse, the new OMB data center definition says "(whether in a production, test, stage, development, or any other environment)". In the non-government world, you want to keep test, staging and development separate from your "production." So your testing lab is now a "data center," and you must consolidate your "data centers" together.

If you are optimizing servers, not data centers, then you probably want to consolidate your production servers in a data center. But there will still be lots of servers not in data centers, like the server in the parking garage that controls the gates or the server in the building that controls HVAC. It's not smart to consolidate your HVAC servers and your credit card servers, as some companies have found out.

The U.S. government definition of a data center is a bit like defining a warehouse as any room containing a single ream of paper. Yes, warehouses are used to store reams of paper; but that doesn't make every place containing a ream of paper a warehouse.
This is a great way to create a mess of rules. Need a server for running an app locally at a site? You need XYZ standards that make no sense for your deploy and increase the cost by 10 times. Our server guys always try to set standards, then they run into a deploy where the needs are simple, but the standards make it significantly uneconomical.

-----Original Message-----
From: NANOG [mailto:nanog-bounces@nanog.org] On Behalf Of Sean Donelan
Sent: Friday, March 11, 2016 1:55 PM
To: Christopher Morrow <morrowc.lists@gmail.com>
Cc: nanog list <nanog@nanog.org>
Subject: Re: Why the US Government has so many data centers

On Fri, 11 Mar 2016, Christopher Morrow wrote:
o 'a machine under your desk' is not a production operation. (if you think it is, please stop, think again and move that service to conditioned power/cooling/ethernet)
Even worse, the new OMB data center definition says "(whether in a production, test, stage, development, or any other environment)". In the non-government world, you want to keep test, staging and development separate from your "production." So your testing lab is now a "data center," and you must consolidate your "data centers" together.

If you are optimizing servers, not data centers, then you probably want to consolidate your production servers in a data center. But there will still be lots of servers not in data centers, like the server in the parking garage that controls the gates or the server in the building that controls HVAC. It's not smart to consolidate your HVAC servers and your credit card servers, as some companies have found out.

The U.S. government definition of a data center is a bit like defining a warehouse as any room containing a single ream of paper. Yes, warehouses are used to store reams of paper; but that doesn't make every place containing a ream of paper a warehouse.
Note that I am not answering in any sort of "official" capacity... but I will instead ask this for your consideration: Do servers in "test, stage, development, or any other environment" really need to have the same environmental, power and connectivity requirements that "production" servers have? And should a dev lab containing a couple of servers and a few developers really be called a "datacenter"?

-Mark Ganzer
SSC-PAC San Diego Code 82700
Office/Voice mail: 619-553-1186
NOC: 619-553-5881

On 3/11/2016 9:21 AM, Roland Dobbins wrote:
On 12 Mar 2016, at 0:03, Sean Donelan wrote:
The U.S. Government has an odd definition of what is a data center, which ends up with a lot of things no rational person would call a data center.
There's also a case to be made that governmental organizations really oughtn't to have servers just lying around in random rooms, and that those rooms are de facto government data centers, whether those who're responsible for said rooms/servers know it or not . . .
----------------------------------- Roland Dobbins <rdobbins@arbor.net>
On Mar 11, 2016, at 11:57 AM, "Mark T. Ganzer" <ganzer@spawar.navy.mil> wrote:
but I will instead ask this for your consideration: Do servers in "test, stage, development, or any other environment" really need to have the same environmental, power and connectivity requirements that "production" servers have?
Why would you think otherwise? It's a symptom of trying to save a few cents at the risk of dollars.

George William Herbert
Sent from my iPhone
On Sun, 13 Mar 2016, Roland Dobbins wrote:
On 13 Mar 2016, at 3:03, George Herbert wrote:
It's a symptom of trying to save a few cents at the risk of dollars.
Concur 100%.
Not to mention the related security issues.
Just remember, no exceptions, no waivers. I understand why cloud vendors want 100% of government IT dollars. But requiring all test and development to be done solely in cloud data centers... there is your 100%
I really don't care about AWS sales (customer, but not investor or employee). But... If it's not highly loaded, cloud is cheaper. If it's not in a well-run datacenter / machine room, cloud is FAR more reliable.

The cost of blowing up hardware in less-than-well-run machine rooms / datacenters can be immense. At a now-defunct cell provider, we lost a badly maintained machine room to fire; only about 24 racks, $2.1 million damage. And nearly burned down the Fry's Palo Alto building. And that's just the worst catastrophe; we had more losses than that in smaller clusters / onesies.

George William Herbert
Sent from my iPhone
On Mar 13, 2016, at 2:15 PM, Sean Donelan <sean@donelan.com> wrote:
On Sun, 13 Mar 2016, Roland Dobbins wrote:
On 13 Mar 2016, at 3:03, George Herbert wrote:
It's a symptom of trying to save a few cents at the risk of dollars.
Concur 100%.
Not to mention the related security issues.
Just remember, no exceptions, no waivers.
I understand why cloud vendors want 100% of government IT dollars. But requiring all test and development to be done solely in cloud data centers... there is your 100%
On 3/13/16, Sean Donelan <sean@donelan.com> wrote:
On Sun, 13 Mar 2016, Roland Dobbins wrote:
On 13 Mar 2016, at 3:03, George Herbert wrote:
It's a symptom of trying to save a few cents at the risk of dollars.
Concur 100%.
Not to mention the related security issues.
Just remember, no exceptions, no waivers.
I understand why cloud vendors want 100% of government IT dollars. But requiring all test and development to be done solely in cloud data centers... there is your 100%
Where does it say test/dev has to be done solely in a cloud data center? This bit:

"For the purposes of this memorandum, rooms with at least one server, providing services (whether in a production, test, stage, development, or any other environment), are considered data centers."

seems to be more about trying to close the self-reporting loophole - ie 'these aren't the droids you're looking for.'

for example - https://github.com/WhiteHouse/datacenters/issues/9

Lee
On Sun, 13 Mar 2016, Lee wrote:
Where does it say test/dev has to be done solely in a cloud data center? This bit:

"For the purposes of this memorandum, rooms with at least one server, providing services (whether in a production, test, stage, development, or any other environment), are considered data centers."

seems to be more about trying to close the self-reporting loophole - ie 'these aren't the droids you're looking for.'

for example - https://github.com/WhiteHouse/datacenters/issues/9
Sigh, read any Inspector General report for how memorandums are implemented by auditors. If the memorandum says "or any other environment," the IGs will treat that as no exceptions. So IGs will "close the reporting loophole" by reporting that there are 100,000 "data centers" if a room contains even a single server. Auditors like counting things; they don't like interpretations. Inspectors General are uber-auditors.
On 3/13/16, Sean Donelan wrote:
On Sun, 13 Mar 2016, Lee wrote:
Where does it say test/dev has to be done solely in a cloud data center? This bit:

"For the purposes of this memorandum, rooms with at least one server, providing services (whether in a production, test, stage, development, or any other environment), are considered data centers."

seems to be more about trying to close the self-reporting loophole - ie 'these aren't the droids you're looking for.'

for example - https://github.com/WhiteHouse/datacenters/issues/9
Sigh, read any Inspector General report for how memorandums are implemented by auditors. If the memorandum says "or any other environment," the IGs will treat that as no exceptions.
So IGs will "close the reporting loophole" by reporting that there are 100,000 "data centers" if a room contains even a single server.
Auditors like counting things; they don't like interpretations. Inspectors General are uber-auditors.
uhmmm.. yes - that's my point. No more of the "Whut? That box over there?? Oh no, that's not a server, it's an _appliance_" foot-dragging / circumvention of the cloud-first policy.

I doubt anyone really believes that having a server in the room makes it a data center. But if you're the Federal CIO pushing the cloud-first policy, this seems like a great bureaucratic maneuver to get the decision making away from the techies that like redundant servers in multiple locations, their managers whose job rating depends on providing reliable services, and even the agency CIOs. Check the reporting section of the memo, where it says "each agency head shall annually publish a Data Center Consolidation and Optimization Strategic Plan". I dunno, but I'm guessing agency heads are political appointees that aren't going to spend much, if any, time listening to techies whine about how important their servers are & why they can't be consolidated, virtualized or outsourced.

Lee
On Mon, 14 Mar 2016, Lee wrote:
I doubt anyone really believes that having a server in the room makes it a data center. But if you're the Federal CIO pushing the cloud-first policy, this seems like a great bureaucratic maneuver to get the decision making away from the techies that like redundant servers in multiple locations, their managers whose job rating depends on providing reliable services, and even the agency CIOs. Check the reporting section of the memo, where it says "each agency head shall annually publish a Data Center Consolidation and Optimization Strategic Plan". I dunno, but I'm guessing agency heads are political appointees that aren't going to spend much, if any, time listening to techies whine about how important their servers are & why they can't be consolidated, virtualized or outsourced.
If your goal is to consolidate servers, call it a server consolidation initiative. You are correct that political appointees won't understand why techies are perplexed by calling everything a data center. Just remember that when you read the stories in the Washington Post about how many data centers the government has...

http://www.datacenterdynamics.com/design-build/us-government-finds-2000-more...

New count of government facilities, and it looks like consolidation is going backwards.
On 3/14/16, Sean Donelan wrote:
On Mon, 14 Mar 2016, Lee wrote:
I doubt anyone really believes that having a server in the room makes it a data center. But if you're the Federal CIO pushing the cloud-first policy, this seems like a great bureaucratic maneuver to get the decision making away from the techies that like redundant servers in multiple locations, their managers whose job rating depends on providing reliable services, and even the agency CIOs. Check the reporting section of the memo, where it says "each agency head shall annually publish a Data Center Consolidation and Optimization Strategic Plan". I dunno, but I'm guessing agency heads are political appointees that aren't going to spend much, if any, time listening to techies whine about how important their servers are & why they can't be consolidated, virtualized or outsourced.
If your goal is to consolidate servers, call it a server consolidation initiative.
He did, didn't he? "... consolidate inefficient infrastructure, optimize existing facilities, achieve cost savings, and transition to more efficient infrastructure". But other than the ability to embarrass people[1] - ie. make the reports public, how much actual ability to effect change does he really have?
You are correct that political appointees won't understand why techies are perplexed by calling everything a data center. Just remember that when you read the stories in the Washington Post about how many data centers the government has...
http://www.datacenterdynamics.com/design-build/us-government-finds-2000-more... New count of government facilities, and it looks like consolidation is going backwards
Yes, *sigh*, another "what kind of people _do_ we have running the govt" story. Altho, looking on the bright side, it could have been much worse than a final summing up of "With the current closing having been reported to have saved over $2.5 billion it is clear that inroads are being made, but ... one has to wonder exactly how effective the initiative will be at achieving a more effective and efficient use of government monies in providing technology services."

Best Regards,
Lee

[1] http://archive.fortune.com/2011/07/13/news/companies/vivek_kundra_leadership...
"For example, one of the first things I did was take the picture of every CIO in the federal government. We set up an IT dashboard online, and I put their pictures right next to the IT projects they were responsible for. You could see on this IT dashboard whether that project was on schedule or not. The President actually looked at the IT dashboard, so we took a picture of that and put it online. Moments later, I started getting many phone calls from CIOs who said, "For the first time, my cabinet secretary is asking me why this project is red or green or yellow." One agency ended up halting 45 IT projects immediately. It was just the act of shining light and making sure you focus on execution, not only policy."
On Mon, Mar 14, 2016 at 12:44 PM, Lee <ler762@gmail.com> wrote:
Yes, *sigh*, another what kind of people _do_ we have running the govt story. Altho, looking on the bright side, it could have been much worse than a final summing up of "With the current closing having been reported to have saved over $2.5 billion it is clear that inroads are being made, but ... one has to wonder exactly how effective the initiative will be at achieving a more effective and efficient use of government monies in providing technology services."
Best Regards, Lee
That's most likely an inaccurate cost-savings figure, though; it probably doesn't take into account the impacts of the consolidation on other items. As a personal example, we're in the middle of upgrading my site from an OC-3 to an OC-12, because we're routinely running at 95+% utilization on the OC-3 with 4,000+ seats at the site. The reason we're running that high is that several years ago, they "consolidated" our file storage, so instead of file storage (and, actually, dot1x authentication, though that's relatively minor) being local, everyone has to hit a datacenter some 500+ miles away over that OC-3 every time they have to access a file share. And since they're supposed to save everything to their personal share drive instead of the actual machine they're sitting at, the results are predictable.

So how much is it going to cost for the OC-12 over the OC-3 annually? Is that difference higher or lower than the cost to run a couple of storage servers on-site? I don't know the math personally, but I do know that if we had storage (and RADIUS auth and hell, even a shell server) on site, we wouldn't need to upgrade to an OC-12.
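For context, the bandwidth arithmetic in that scenario is straightforward. The OC-3/OC-12 line rates are the standard SONET figures; the utilization and seat count come from the post above. This is just back-of-the-envelope sizing, not a cost model:

```python
# Back-of-the-envelope numbers for the OC-3 -> OC-12 scenario described above.
# SONET line rates: OC-3 = 155.52 Mbps, OC-12 = 622.08 Mbps.

OC3_MBPS = 155.52
OC12_MBPS = 622.08
SEATS = 4000          # seats at the site, per the post
UTILIZATION = 0.95    # "routinely at 95+% utilization"

used_mbps = OC3_MBPS * UTILIZATION         # ~147.7 Mbps in steady state
per_seat_kbps = used_mbps * 1000 / SEATS   # ~37 kbps average per seat
headroom_factor = OC12_MBPS / used_mbps    # OC-12 is ~4.2x current demand

print(f"{used_mbps:.1f} Mbps used, ~{per_seat_kbps:.0f} kbps/seat; "
      f"OC-12 gives {headroom_factor:.1f}x headroom")
```

In other words, the whole site averages only about 37 kbps per seat, which is consistent with file-share traffic that used to stay on the LAN now crossing the WAN.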
At enterprise storage costs, that much storage will cost more than the OC-12, and then add datacenter and backups. Total could be 2-3x OC-12 annual costs.

If your org can afford to buy non-top-line storage then it would probably be cheaper to go local.

However, you should check how much of the bandwidth is actually storage. I see multimillion-dollar projects without basic demand / needs analysis or statistics more often than not.

George William Herbert
Sent from my iPhone
On Mar 14, 2016, at 10:01 AM, George Metz <george.metz@gmail.com> wrote:
On Mon, Mar 14, 2016 at 12:44 PM, Lee <ler762@gmail.com> wrote:
Yes, *sigh*, another what kind of people _do_ we have running the govt story. Altho, looking on the bright side, it could have been much worse than a final summing up of "With the current closing having been reported to have saved over $2.5 billion it is clear that inroads are being made, but ... one has to wonder exactly how effective the initiative will be at achieving a more effective and efficient use of government monies in providing technology services."
Best Regards, Lee
That's most likely an inaccurate cost-savings figure, though; it probably doesn't take into account the impacts of the consolidation on other items. As a personal example, we're in the middle of upgrading my site from an OC-3 to an OC-12, because we're routinely running at 95+% utilization on the OC-3 with 4,000+ seats at the site. The reason we're running that high is that several years ago, they "consolidated" our file storage, so instead of file storage (and, actually, dot1x authentication, though that's relatively minor) being local, everyone has to hit a datacenter some 500+ miles away over that OC-3 every time they have to access a file share. And since they're supposed to save everything to their personal share drive instead of the actual machine they're sitting at, the results are predictable.
So how much is it going to cost for the OC-12 over the OC-3 annually? Is that difference higher or lower than the cost to run a couple of storage servers on-site? I don't know the math personally, but I do know that if we had storage (and RADIUS auth and hell, even a shell server) on site, we wouldn't be needing to upgrade to an OC-12.
Datacenter isn't actually an issue, since there's room in the same racks (ironically, in the location the previous fileservers were) as the Domain Controllers and WAN Accelerators. Based on the "standard" (per the Windows admins) file storage space of 700 meg, that sounds like 3TB for user storage. Even if it were 30TB, I still can't see a proper setup costing more than the OC-12 after a period of two years.

Org is within the Federal Government, so they're not allowed to buy non-top-line anything.

I agree we should check how much bandwidth is storage, but since there's a snowball's chance in hell of them actually making a change, it's almost certainly not worth the paperwork.

On Mon, Mar 14, 2016 at 1:28 PM, George Herbert <george.herbert@gmail.com> wrote:
At enterprise storage costs, that much storage will cost more than the OC-12, and then add datacenter and backups. Total could be 2-3x OC-12 annual costs.
If your org can afford to buy non-top-line storage then it would probably be cheaper to go local.
However, you should check how much of the bandwidth is actually storage. I see multimillion dollar projects without basic demand / needs analysis or statistics more often than not.
George William Herbert
Sent from my iPhone
On Mar 14, 2016, at 10:01 AM, George Metz <george.metz@gmail.com> wrote:
On Mon, Mar 14, 2016 at 12:44 PM, Lee <ler762@gmail.com> wrote:
Yes, *sigh*, another what kind of people _do_ we have running the govt story. Altho, looking on the bright side, it could have been much worse than a final summing up of "With the current closing having been reported to have saved over $2.5 billion it is clear that inroads are being made, but ... one has to wonder exactly how effective the initiative will be at achieving a more effective and efficient use of government monies in providing technology services."
Best Regards, Lee
That's most likely an inaccurate cost-savings figure, though; it probably doesn't take into account the impacts of the consolidation on other items. As a personal example, we're in the middle of upgrading my site from an OC-3 to an OC-12, because we're routinely running at 95+% utilization on the OC-3 with 4,000+ seats at the site. The reason we're running that high is that several years ago, they "consolidated" our file storage, so instead of file storage (and, actually, dot1x authentication, though that's relatively minor) being local, everyone has to hit a datacenter some 500+ miles away over that OC-3 every time they have to access a file share. And since they're supposed to save everything to their personal share drive instead of the actual machine they're sitting at, the results are predictable.
So how much is it going to cost for the OC-12 over the OC-3 annually? Is that difference higher or lower than the cost to run a couple of storage servers on-site? I don't know the math personally, but I do know that if we had storage (and RADIUS auth and hell, even a shell server) on site, we wouldn't be needing to upgrade to an OC-12.
On Mar 14, 2016, at 12:19 PM, George Metz <george.metz@gmail.com> wrote:
Based on the "standard" (per the Windows admins) file storage space of 700 meg, that sounds like 3TB for user storage. Even if it were 30TB, I still can't see a proper setup costing more than the OC-12 after a period of two years.
Org is within the Federal Government, so they're not allowed to buy non-top-line anything.
Million-plus dollar NetApps or EMC units are not at all unusual. This is a terrible pity if a small NAS from Imation/Nexsan would work redundantly for $150k or less.
I agree we should check how much bandwidth is storage, but since there's a snowball's chance in hell of them actually making a change, it's almost certainly not worth the paperwork.
This is the kind of thing whoever runs it needs to know; it proves my point, and argues against local datacenters where nobody bothers to even collect performance metrics much of the time.

George William Herbert
Sent from my iPhone
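The storage sizing behind this exchange is simple arithmetic. The 700 MB per-user quota and the 4,000 seats are from the posts above; the 10x case mirrors the "even if it were 30TB" aside, and everything else here is purely illustrative:

```python
# Sanity-checking the user-storage sizing quoted in the thread.

MB_PER_USER = 700   # the Windows admins' "standard" per-user quota
SEATS = 4000        # seats at the site, per the earlier post

baseline_tb = MB_PER_USER * SEATS / 1_000_000   # 2.8 TB -> "sounds like 3TB"
padded_tb = baseline_tb * 10                    # 28 TB -> the "30TB" case

print(f"baseline ~{baseline_tb:.1f} TB, padded ~{padded_tb:.0f} TB")
```

Even the padded figure is small by enterprise-storage standards, which is why the thread keeps circling back to whether a modest local NAS beats an OC-12 upgrade on cost.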
On Mon, 14 Mar 2016, George Metz wrote:
That's an inaccurate cost savings though most likely; it probably doesn't
Politicians and sales people with inaccurate cost savings. Say it isn't so.

If you think these are $100 million "data centers," maybe a few billion dollars in cost savings is possible over 10 years. But if a majority of the "data centers" are a single server in a room, the cost savings of moving it to a different room may not save billions of dollars. But no one will remember.

Prediction: there will be a glowing report in a year or so about the huge cost savings, and then a couple of years later there will be an Inspector General report about problems counting things. If that's what taxpayers want, that's what they'll get.
I was trying to resist the urge to chime in on this one, but this discussion has continued for much longer than I had anticipated... So here it goes.

I spent 5 years in the Marines (out now), in which one of my MANY duties was to manage these "data centers" (a part of me just died as I used that word to describe these server rooms). I can't get into what exactly I did or with what systems on such a public forum, but I'm pretty sure that most of the servers I managed would be exempted from this paper/policy.

Anyways, I came across a lot of servers in my time, but I never came across one that I felt should've been located elsewhere. People have brought up the case of the personal share drive, but what about the combat camera (think public relations) that has to store large quantities (100s of 1000s) of high-resolution photos and retain them for years? Should I remove that COTS (commercial off the shelf) NAS underneath the Boss' desk and put it in a data center 4 miles down the road, and force all that traffic down a network that was designed for light to moderate web browsing and email traffic, just so I can check a box for some politician's reelection campaign ads on how they made the government "more efficient"?

Better yet, what about the backhoe operator who didn't call before he dug, and cut my line to the datacenter? Now we cannot respond effectively to a natural disaster in the Asian Pacific, or a bombing in the Middle East, or a platoon that has come under fire and will die if they can't get air support, all because my watch officer can't even log in to his machine, since I can no longer have a backup domain controller on-site.

These seem very far-fetched to most civilian network operators, but to anybody who has maintained military systems, this is a very real scenario. As mentioned, I'm pretty sure my systems would be exempted, but most would not. When these systems are vital to national security and life & death situations, it can become a very real problem.
I realize that this policy was intended for more run-of-the-mill scenarios, but the military is almost always grouped in with everyone else anyways. Furthermore, I don't think most people realize the scale of these networks. NMCI, the network that the Navy and Marine Corps used (when I was in), had over 500,000 active users in the AD forest. When you have a network that size, you have to be intentional about every decision, and you should not leave it up to a political appointee who has trouble even checking their email.

When you read about how much money the US military hemorrhages, just remember....

- The multimillion-dollar storage array, combined with a complete network overhaul and multiple redundant 100G+ DWDM links, was "more efficient" than a couple of NAS that we picked up off of Amazon for maybe $300, sitting under a desk connected to the local switch.
- Using an old machine that would otherwise be collecting dust to ensure that users can log in to their computers despite conditions outside of our control is apparently akin to treason and should be dealt with accordingly.

</rant>

--Todd
Sent from my iPad
On Mar 14, 2016, at 11:01 AM, George Metz <george.metz@gmail.com> wrote:
On Mon, Mar 14, 2016 at 12:44 PM, Lee <ler762@gmail.com> wrote:
Yes, *sigh*, another what kind of people _do_ we have running the govt story. Altho, looking on the bright side, it could have been much worse than a final summing up of "With the current closing having been reported to have saved over $2.5 billion it is clear that inroads are being made, but ... one has to wonder exactly how effective the initiative will be at achieving a more effective and efficient use of government monies in providing technology services."
Best Regards, Lee
That's an inaccurate cost savings though most likely; it probably doesn't take into account the impacts of the consolidation on other items. As a personal example, we're in the middle of upgrading my site from an OC-3 to an OC-12, because we're running routinely at 95+% utilization on the OC-3 with 4,000+ seats at the site. The reason we're running that high is because several years ago, they "consolidated" our file storage, so instead of file storage (and, actually, dot1x authentication though that's relatively minor) being local, everyone has to hit a datacenter some 500+ miles away over that OC-3 every time they have to access a file share. And since they're supposed to save everything to their personal share drive instead of the actual machine they're sitting at, the results are predictable.
So how much is it going to cost for the OC-12 over the OC-3 annually? Is that difference higher or lower than the cost to run a couple of storage servers on-site? I don't know the math personally, but I do know that if we had storage (and RADIUS auth and hell, even a shell server) on site, we wouldn't be needing to upgrade to an OC-12.
So... Before I go on: I have not been in Todd's shoes, either serving in or directly supporting an org like that. However, I have indirectly supported orgs like that, and consulted at or supported literally hundreds of commercial and a few educational and nonprofit orgs over the last 30 years.

There are corner cases where distributed resilience is paramount, including a lot of field operations (of all sorts), on ships (and aircraft and spacecraft), or places where the net really is unstable. Any generalizations that wrap those legitimate exceptions in are overreaching their valid descriptive range.

That said, in the vast bulk of normal-world environments, individuals make justifications like Todd's and argue for distributed services, private servers, etc. And then they do not run them reliably, with patches, backups, central security management, asset tracking, redundancy, DR plans, etc. And then they break, and in some cases are and will forever be lost. In other cases they will "merely" take 2, 5, 10, in one case more than 100 times longer to repair and more money to recover than they should have.

Statistically these are very, very poor operational practice. Not so much because of location (some), but because of lack of care and quality management when they get distributed and lost out of IT's view. Statistically, several hundred clients in and a hundred or so organizational assessments in: if I find servers that matter under desks, you have about a 2% chance that your IT org can handle supporting and managing them appropriately. If you think that 98% of servers in a particular category being at high risk of unrecoverable or very difficult recovery when problems crop up is acceptable, your successor may be hiring me or someone else who consults a lot for a very bad day's cleanup.

I have literally been at a billion-dollar IT disaster, and at tens of smaller multimillion-dollar ones trying to clean them up. This is a very sad type of work.
I am not nearly as cheap for recoveries as for preventive management and proactive fixes.

George William Herbert
Sent from my iPhone
On Mar 18, 2016, at 9:28 PM, Todd Crane <todd.crane@n5tech.com> wrote:
I was trying to resist the urge to chime in on this one, but this discussion has continued for much longer than I had anticipated... So here it goes
I spent 5 years in the Marines (out now) in which one of my MANY duties was to manage these "data centers" (a part of me just died as I used that word to describe these server rooms). I can't get into what exactly I did or with what systems on such a public forum, but I'm pretty sure that most of the servers I managed would be exempted from this paper/policy.
Anyways, I came across a lot of servers in my time, but I never came across one that I felt should've been located elsewhere. People have brought up the case of personal share drives, but what about the combat camera (think public relations) that has to store large quantities (100s of 1000s) of high-resolution photos and retain them for years? Should I remove that COTS (commercial off the shelf) NAS underneath the Boss' desk and put it in a data center 4 miles down the road, forcing all that traffic down a network that was designed for light to moderate web browsing and email traffic, just so I can check a box for some politician's reelection campaign ads on how they made the government "more efficient"?
Better yet, what about the backhoe operator who didn't call before he dug, and cut my line to the datacenter? Now we cannot respond effectively to a natural disaster in the Asian Pacific, or a bombing in the Middle East, or a platoon that has come under fire and will die if they can't get air support, all because my watch officer can't even log in to his machine, since I can no longer have a backup domain controller on-site.
These seem very far-fetched to most civilian network operators, but to anybody who has maintained military systems, this is a very real scenario. As mentioned, I'm pretty sure my systems would be exempted, but most would not. When these systems are vital to national security and life & death situations, it can become a very real problem. I realize that this policy was intended for more run-of-the-mill scenarios, but the military is almost always grouped in with everyone else anyways.
Furthermore, I don't think most people realize the scale of these networks. NMCI, the network that the Navy and Marine Corps used (when I was in), had over 500,000 active users in the AD forest. When you have a network that size, you have to be intentional about every decision, and you should not leave it up to a political appointee who has trouble even checking their email.
When you read about how much money the US military hemorrhages, just remember....

- The multi-million-dollar storage array, combined with a complete network overhaul and multiple redundant 100G+ DWDM links, was "more efficient" than a couple of NASes that we picked up off of Amazon for maybe $300, sitting under a desk connected to the local switch.
- Using an old machine that would otherwise be collecting dust to ensure that users can log in to their computers despite conditions outside of our control is apparently akin to treason and should be dealt with accordingly.

</rant>
--Todd
Sent from my iPad
On Mar 14, 2016, at 11:01 AM, George Metz <george.metz@gmail.com> wrote:
On Mon, Mar 14, 2016 at 12:44 PM, Lee <ler762@gmail.com> wrote:
Yes, *sigh*, another what kind of people _do_ we have running the govt story. Altho, looking on the bright side, it could have been much worse than a final summing up of "With the current closing having been reported to have saved over $2.5 billion it is clear that inroads are being made, but ... one has to wonder exactly how effective the initiative will be at achieving a more effective and efficient use of government monies in providing technology services."
Best Regards, Lee
That's most likely an inaccurate cost-savings figure, though; it probably doesn't take into account the impact of the consolidation on other items. As a personal example, we're in the middle of upgrading my site from an OC-3 to an OC-12, because we're running routinely at 95+% utilization on the OC-3 with 4,000+ seats at the site. The reason we're running that high is that several years ago, they "consolidated" our file storage, so instead of file storage (and, actually, dot1x authentication, though that's relatively minor) being local, everyone has to hit a datacenter some 500+ miles away over that OC-3 every time they have to access a file share. And since they're supposed to save everything to their personal share drive instead of the actual machine they're sitting at, the results are predictable.
So how much is it going to cost for the OC-12 over the OC-3 annually? Is that difference higher or lower than the cost to run a couple of storage servers on-site? I don't know the math personally, but I do know that if we had storage (and RADIUS auth and hell, even a shell server) on site, we wouldn't be needing to upgrade to an OC-12.
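A back-of-the-envelope sketch of the arithmetic behind that question. The line rates below are the standard SONET figures, and the 4,000 seats and 95% utilization come from the message above; circuit pricing is deliberately left out, since that is exactly the unknown being asked about.

```python
# Per-seat bandwidth for the OC-3 -> OC-12 upgrade discussed above.
# SONET line rates are standard; seat count and utilization are from the post.

OC3_MBPS = 155.52   # OC-3 line rate
OC12_MBPS = 622.08  # OC-12 line rate (4x OC-3)
SEATS = 4000

def per_seat_kbps(link_mbps, seats, utilization=1.0):
    """Average bandwidth per seat, in kbit/s, at a given link utilization."""
    return link_mbps * utilization * 1000 / seats

print(f"OC-3 at 95%: {per_seat_kbps(OC3_MBPS, SEATS, 0.95):.1f} kbit/s/seat")  # ~37
print(f"OC-12 full:  {per_seat_kbps(OC12_MBPS, SEATS):.1f} kbit/s/seat")
```

Roughly 37 kbit/s per seat is very little headroom once every file open traverses the WAN, which is the point being made.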
----- Original Message -----
From: "George Herbert" <george.herbert@gmail.com>
There are corner cases where distributed resilience is paramount, including a lot of field operations (of all sorts) on ships (and aircraft and spacecraft), or places where the net really is unstable. Any generalizations that wrap those legitimate exceptions in are overreaching their valid descriptive range.
This seems like a good time to mention my favorite example of such a thing.

In the Navy, originally, and it ended up in a few other places, there was invented the concept of a 'battleshort', or 'battleshunt', depending on whom you're talking to. This was something akin to a Big Frankenstein Knife Switch across the main circuit breaker in a power panel (and maybe a couple of branch circuit breakers), whose job was to make sure those didn't trip on you at an inconvenient time. Like when you were trying to lay a gun on a Bad Guy. The engineering decision that was made there was that the minor possibility of a circuit overheating and starting something on fire was less important than *the ability to shoot at the bad guys*...

Or, in my favorite example, something going wrong when launching Apollo rockets. If you examine the Firing Room recorder transcripts from the manned Apollo launches, you will find, somewhere in the terminal count, an instruction to "engage the battle short", or something like that. Men were, I have been told, stationed at strategic locations with extinguishers, in case something which would normally have tripped a breaker was forbidden from doing so by the shunt... so that the power wouldn't go out at T-4 seconds.

It's referenced in this article: http://www.honeysucklecreek.net/station/ops_areas.html and a number of other places Google will find you.

Unknown whether this protocol was still followed in the Shuttle era, or whether it will return in the New Manned Space Flight era. But, like the four-star saluting the Medal of Honor recipient, it's one of those outliers that's *so far* out that I love and collect them. And it's a good category of idea to have in the back of your head when planning.

Cheers,
-- jra

--
Jay R. Ashworth          Baylink                      jra@baylink.com
Designer                 The Things I Think           RFC 2100
Ashworth & Associates    http://www.bcp38.info        2000 Land Rover DII
St Petersburg FL USA     BCP38: Ask For It By Name!   +1 727 647 1274
FYI, similar to "battleshort", the term BATTLE OVERRIDE is described [1] on page 45 of G. Gordon Liddy's book _Will_, and apparently [2] "Battle Override" was to be the original title of Liddy's autobiography, but the publisher wanted a one-word title.

Quotes: "On the multidialed wall behind the radar technicians was a prominent switch with a red security cover. It was marked: BATTLE OVERRIDE", and "In the event of a battle emergency, however, the protective warm-up delay could be overridden and full power applied immediately by throwing the 'Battle Override' switch, as everything and everyone became expendable in war."

Tony Patti
CIO

[1] https://books.google.com/books?id=YRty_4HT_8kC&pg=PA45&lpg=PA45&dq=%22battle+override%22+M-33c&source=bl&ots=RYLdUECeHF&sig=hs9i6-W_CVwe5ZcjpxbEGSh9TNE&hl=en&sa=X&ved=0ahUKEwi474vOltjLAhUKcRQKHUHmCW4Q6AEIHjAA#v=onepage&q=%22battle%20override%22%20M-33c&f=false
[2] http://www.worldwizzy.com/library/G._Gordon_Liddy

-----Original Message-----
From: NANOG [mailto:nanog-bounces@nanog.org] On Behalf Of Jay R. Ashworth
Sent: Tuesday, March 22, 2016 3:59 PM
To: North American Network Operators' Group
Subject: Top-shelf resilience (Re: Why the US Government has so many data centers)
Subject: Top-shelf resilience (Re: Why the US Government has so many data centers)
Date: Tue, Mar 22, 2016 at 07:59:24PM +0000

Quoting Jay R. Ashworth (jra@baylink.com):
This seems like a good time to mention my favorite example of such a thing.
In the Navy, originally, and it ended up in a few other places, there was invented the concept of a 'battleshort', or 'battleshunt', depending on whom you're talking to.
I've built one, sort of. In an outdoor broadcasting vehicle.

See, in order to get a working grounding scheme, the PDU in the bus gets to serve as power source for a lot of things that might find themselves outside, in climate. 200VDC feeds in triaxial cables to cameras, for instance. (This was before cameras were connected with singlemode fiber, but after the era of the multicore "shower handle" connectors.) All this was of course built for some exposure to the elements, but not for drenching.

During setup, it was decided to protect people with a GFCI breaker on the main three-phase bus in the bus[0][1], but once set up, people were not really supposed to gefingerpoken the thingamaboobs, so in the interest of reliability a bypass was created for the GFCI breaker. This had to be built in-house, since no electrical contractor even wanted to contemplate it. So we did.

/Måns, ex-builder of analog broadcast facilities.

--
Måns Nilsson    primary/secondary/besserwisser/machina
MN-1334-RIPE    +46 705 989668
First, I'm going to give you all the ANSWERS to today's test ... So just plug in your SONY WALKMANS and relax!!

[0] Pun not intended but carefully kept once discovered.
[1] This is (continental) Europe, where we are not afraid of 405VAC three-phase mains. Tesla was European. Edison was born to American parents.
----- Original Message -----
From: "Lee" <ler762@gmail.com>
On 3/13/16, Sean Donelan wrote:
I doubt anyone really believes that having a server in the room makes it a data center. But if you're the Federal CIO pushing the cloud-first policy, this seems like a great bureaucratic maneuver to get the decision-making away from the techies who like redundant servers in multiple locations, their managers whose job ratings depend on providing reliable services, and even the agency CIOs. Check the reporting section of the memo, where it says "each agency head shall annually publish a Data Center Consolidation and Optimization Strategic Plan". I dunno, but I'm guessing agency heads are political appointees who aren't going to spend much, if any, time listening to techies whine about how important their servers are and why they can't be consolidated, virtualized or outsourced.
Fine. But when some Armenian script kiddie DDoSing Netflix takes down your TSA terrorist lookup service, and you come to me asking why the plane blew up, I'm going to tell you "because you fucking ignored my written advice on the matter", while I'm packing my desk. In writing.

Cheers,
-- jra

--
Jay R. Ashworth          Baylink                      jra@baylink.com
Designer                 The Things I Think           RFC 2100
Ashworth & Associates    http://www.bcp38.info        2000 Land Rover DII
St Petersburg FL USA     BCP38: Ask For It By Name!   +1 727 647 1274
On Tue, 22 Mar 2016, Jay R. Ashworth wrote:
But when some Armenian script kiddie DDoSing Netflix takes down your TSA terrorist lookup service, and you come to me asking why the plane blew up, I'm going to tell you "because you fucking ignored my written advice on the matter", while I'm packing my desk.
DCOI is about physical data center optimization, not about network or service availability.

DCOI metrics:
- Energy metering
- Power Usage Effectiveness (PUE)
- Virtualization
- Server Utilization & Automated Monitoring
- Facility Utilization

Why do you have two circuits with only 40% utilization? The auditor says that's waste, and you only need one circuit at 80% utilization for half the cost.
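The failover arithmetic that framing misses can be sketched in a few lines. Capacity is normalized so 1.0 is one circuit; the 40%/80% figures are the ones from the auditor example above.

```python
# Why two circuits at 40% is not "waste": N+1 failover arithmetic.

def utilization_after_failure(total_load, circuits, failed=1):
    """Per-circuit utilization after `failed` circuits go down.

    Returns None when no circuits survive (total outage)."""
    survivors = circuits - failed
    if survivors <= 0:
        return None
    return total_load / survivors

load = 0.8  # total offered load, in units of one circuit's capacity

# Two circuits normally run at load/2 = 40% each. Lose one and the survivor
# carries 80%: degraded, but the service stays up.
print(utilization_after_failure(load, circuits=2))  # 0.8

# One circuit at 80% is cheaper right up until it fails; then nothing is left.
print(utilization_after_failure(load, circuits=1))  # None
```

The "waste" is precisely the headroom that keeps the service up through a single failure.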
Circuit utilization, capacity and availability shouldn't be calculated separately in a data center environment. If you look at each separately, you risk making some expensive mistakes.

Rafael Possamai
Founder & CEO at E2W Solutions
office: (414) 269-6000
e-mail: rafael@e2wsolutions.com

On Tue, Mar 22, 2016 at 11:11 AM, Sean Donelan <sean@donelan.com> wrote:
On Tue, 22 Mar 2016, Jay R. Ashworth wrote:
But when some Armenian script kiddie DDoSing Netflix takes down your TSA terrorist lookup service, and you come to me asking why the plane blew up, I'm going to tell you "because you fucking ignored my written advice on the matter", while I'm packing my desk.
DCOI is about physical data center optimization, not about network or service availability.
DCOI metrics:
- Energy metering
- Power Usage Effectiveness (PUE)
- Virtualization
- Server Utilization & Automated Monitoring
- Facility Utilization
Why do you have two circuits with only 40% utilization? The auditor says that's waste, and you only need one circuit at 80% utilization for half the cost.
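For reference, PUE, one of the DCOI metrics quoted above, is simply total facility energy divided by IT equipment energy; a minimal sketch, with sample figures invented for illustration:

```python
# PUE (Power Usage Effectiveness): total facility power over IT equipment
# power. 1.0 is the theoretical ideal; lower is better. The kW figures
# below are hypothetical.

def pue(total_facility_kw, it_equipment_kw):
    """Power Usage Effectiveness; always >= 1.0 in practice."""
    if it_equipment_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_equipment_kw

# Hypothetical small server room: 120 kW at the meter, 75 kW of IT load.
print(pue(120, 75))  # 1.6
```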
On Tue, 22 Mar 2016 12:11:11 -0400, Sean Donelan said:
Why do you have two circuits with only 40% utilization? The auditor says that's waste, and you only need one circuit at 80% utilization for half the cost.
And of course, said auditor is probably near impervious to the very real and valid reasons you have 2 circuits. Because as Upton Sinclair wrote around a century ago: "You cannot make a man understand something when his paycheck depends on him not understanding it".
Come on, the audit requirements should have diversity/redundancy concerns in them. That's standard in all the audits I have done or participated in. If these don't, I have a marketing opportunity to teach an HA seminar and follow-on consulting to the IG.

George William Herbert
Sent from my iPhone
On Mar 22, 2016, at 10:59 AM, Valdis.Kletnieks@vt.edu wrote:
On Tue, 22 Mar 2016 12:11:11 -0400, Sean Donelan said:
Why do you have two circuits with only 40% utilization? The auditor says that's waste, and you only need one circuit at 80% utilization for half the cost.
And of course, said auditor is probably near impervious to the very real and valid reasons you have 2 circuits. Because as Upton Sinclair wrote around a century ago:
"You cannot make a man understand something when his paycheck depends on him not understanding it".
On Tue, 22 Mar 2016, George Herbert wrote:
Come on, the audit requirements should have diversity/redundancy concerns in them.
That's standard in all the audits I have done or participated in.
If these don't, I have a marketing opportunity to teach an HA seminar and follow-on consulting to the IG.
Turn on C-SPAN and watch any random congressional oversight hearing. Reasonable, rational or logical thoughts are rare. You may be making assumptions that aren't supported. Just ask Flint, Michigan, about saving money on cheaper water supplies.
The last time I checked, the US CIO office was understaffed and fighting the bureaucratic hydra and mostly losing, but competent and doing things like providing IGs with relevant ammo. If not true in this case then the audit should be redone with relevant criteria. George William Herbert Sent from my iPhone
On Mar 22, 2016, at 11:36 AM, Sean Donelan <sean@donelan.com> wrote:
On Tue, 22 Mar 2016, George Herbert wrote:
Come on, the audit requirements should have diversity/redundancy concerns in them.
That's standard in all the audits I have done or participated in.
If these don't, I have a marketing opportunity to teach an HA seminar and follow-on consulting to the IG.
Turn on C-SPAN and watch any random congressional oversight hearing.
Reasonable, rational or logical thoughts are rare. You may be making assumptions that aren't supported. Just ask Flint, Michigan, about saving money on cheaper water supplies.
* Mark T. Ganzer:
Note that I am not answering in any sort of "official" capacity... but I will instead ask this for your consideration: Do servers in "test, stage, development, or any other environment" really need to have the same environmental, power and connectivity requirements that "production" servers have?
Depends on the process. If you can push to production without pushing to stage first, then stage and production need the same service level.
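One way to read that rule: an environment must carry the service level of the highest environment it can promote into. A hypothetical sketch of that reading (the promotion graph and tier numbers here are invented for illustration, not from any policy in the thread):

```python
# An environment inherits the service level of every environment reachable
# from it by promotion. Graph and tiers are hypothetical.

PROMOTES_TO = {
    "development": ["stage"],
    "stage": ["production"],
    "production": [],
}
BASE_TIER = {"development": 1, "stage": 2, "production": 3}

def required_tier(env):
    """Highest tier reachable from `env` by promotion, including itself."""
    tier = BASE_TIER[env]
    for nxt in PROMOTES_TO[env]:
        tier = max(tier, required_tier(nxt))
    return tier

# If stage gates production, stage must run at production's service level.
print(required_tier("stage"))  # 3
```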
On Sat, 12 Mar 2016, Roland Dobbins wrote:
The U.S. Government has an odd definition of what is a data center, which ends up with a lot of things no rational person would call a data center.
There's also a case to be made that governmental organizations really oughtn't to have servers just lying around in random rooms, and that those rooms are de facto government data centers, whether those who're responsible for said rooms/servers know it or not . . .
If that is the goal, don't call it data center optimization. That is server optimization. When you say "data center" to an ordinary, average person or reporter, they think of big buildings filled with racks of computers. Not a lonely server sitting in a test lab or under someone's desk.
* Sean Donelan:
When you say "data center" to an ordinary, average person or reporter; they think of big buildings filled with racks of computers. Not a lonely server sitting in a test lab or under someone's desk.
I suspect part of the initiative is to get rid of that mindset, which leads to such gems as “we don't have any servers, so we only need to secure our clients”.
I can confirm this. I was working at NASA when the last "data call" was put out. We had a room with a flight simulator in it, powered by an SGI Onyx2. The conversation with the auditor went like this:

Auditor: *points at Onyx2* "Is that machine shared?"
Me: "Well yeah, the whole group uses it to..."
Auditor: *aside, to colleague* "OK, mark this room down too."

And our flight simulator lab became a data center.

On Fri, Mar 11, 2016 at 9:03 AM, Sean Donelan <sean@donelan.com> wrote:
If you've wondered why the U.S. Government has so many data centers... OK, I know, no one has ever asked.
The U.S. Government has an odd definition of what is a data center, which ends up with a lot of things no rational person would call a data center.
If you call every room with even one server a "data center," you'll end up with tens of thousands of rooms counted as data centers. With this definition, I probably have two data centers in my home. It's important because Inspectors General auditors will go around and count things, because that's what they do, and write reports about insane numbers of data centers.
https://datacenters.cio.gov/optimization/
"For the purposes of this memorandum, rooms with at least one server, providing services (whether in a production, test, stage, development, or any other environment), are considered data centers. However, rooms containing only routing equipment, switches, security devices (such as firewalls), or other telecommunications components shall not be considered data centers."
On 3/11/16 9:03 AM, Sean Donelan wrote:
https://datacenters.cio.gov/optimization/
"For the purposes of this memorandum, rooms with at least one server, providing services (whether in a production, test, stage, development, or any other environment), are considered data centers. However, rooms containing only routing equipment, switches, security devices (such as firewalls), or other telecommunications components shall not be considered data centers."
In other words, Hillary Clinton's bathroom closet is a data center.

--
Jay Hennigan - CCIE #7880 - Network Engineering - jay@impulse.net
Impulse Internet Service - http://www.impulse.net/
Your local telephone and internet company - 805 884-6323 - WB6RDV
Guess what: an IG decides to count "data centers" using OMB's definition of a data center. The CIO points out those "data centers" won't save money.

https://fcw.com/articles/2016/04/11/lyngaas-halvorsen-update.aspx

The IG report knocked Halvorsen for not adjusting his strategy to account for a revised definition of data centers from the Office of Management and Budget. But Halvorsen defended that decision, saying the revised definition focused on special-purpose processing nodes, which are data centers that have no direct connection to the DOD Information Network. "Those nodes aren't where the money [is], and in most cases, there's no value in consolidating them," Halvorsen said.
participants (18)
- amuse
- Christopher Morrow
- Florian Weimer
- George Herbert
- George Metz
- Jay Hennigan
- Jay R. Ashworth
- Lee
- Mark T. Ganzer
- Måns Nilsson
- Nick Hilliard
- Rafael Possamai
- Roland Dobbins
- Sean Donelan
- Steve Mikulasik
- Todd Crane
- Tony Patti
- Valdis.Kletnieks@vt.edu