Information Security: National Nuclear Security Administration Needs to Improve Contingency Planning for Its Classified Supercomputing Operations

GAO-11-67 Published: Dec 09, 2010. Publicly Released: Dec 09, 2010.

Highlights

In the absence of underground nuclear weapons testing, the National Nuclear Security Administration (NNSA) relies on its supercomputing operations at its three weapons laboratories to simulate the effects of changes to current weapons systems, calculate the confidence of future untested systems, and ensure military requirements are met. GAO was requested to assess the extent to which (1) NNSA has implemented contingency and disaster recovery planning and testing for its classified supercomputing systems, (2) the laboratories are able to share supercomputing capacity for recovery operations, and (3) NNSA tracks the costs for contingency and disaster recovery planning for supercomputing assets. To do this work, GAO examined contingency and disaster recovery planning policies and activities, and analyzed classified supercomputing capabilities at the weapons laboratories, and NNSA budgetary data.

All three NNSA weapons laboratories--Los Alamos, Sandia, and Lawrence Livermore--have implemented some components of a contingency planning and disaster recovery program. NNSA, however, has not provided effective oversight to ensure that the laboratories have comprehensive and effective contingency and disaster recovery planning and testing. Further, due to lack of planning and analysis by NNSA and the laboratories, the impact of a system outage is unclear. Only one of the three laboratories--Los Alamos--had conducted a business impact analysis to assess the criticality of resources and acceptable outage time frames; yet, NNSA and all three laboratories consider the consequence associated with the loss of system availability to be low impact and do not consider the classified supercomputers to be mission critical. Nonetheless, NNSA classified supercomputing capabilities serve as a computational surrogate to nuclear weapons testing and are used to address other areas of national security. Despite the absence of business impact analyses, all laboratories had key components of a contingency planning program in place. However, shortcomings existed. For example, all laboratories had backup processes in place and had developed contingency plans, but the plans were not comprehensive. Specifically, one plan did not address the supercomputing operations, and none of the plans had been tested at the time of GAO's review. In addition, the laboratories addressed disaster recovery to a limited extent, but not specifically for the supercomputers. These shortcomings existed, at least in part, because NNSA's component organizations, including the Office of the Chief Information Officer, were unclear about their roles and responsibilities for providing oversight in the laboratories' implementation of contingency and disaster recovery planning. 
Until the agency fully implements a contingency and disaster recovery planning program for its weapons laboratories, it has limited assurance that vital information can be recovered and made available to meet national security priorities and requirements. Although the laboratories have the technological capability to share supercomputing capacity across all three weapons laboratories, barriers exist that could impede recovery operations. For example, the laboratories do not know the minimum supercomputing capacity needed to meet program requirements, such as simulating the effects of changes to weapons systems, should a disruption occur. In addition, the laboratories have not tested the technological capability to share the capacity on an on-demand basis for recovery operations. Without having an understanding of capacity needs and subsequent testing, the laboratories have little assurance that they could effectively share capacity if needed. Although NNSA obligated approximately $1.7 billion to help implement its classified supercomputing program from fiscal years 2007 through 2009, the agency has not tracked costs for contingency and disaster recovery planning and is uncertain of actual funds that were spent toward these efforts. GAO recommends, among other things, that NNSA clearly define roles and responsibilities for its component organizations in providing oversight for contingency and disaster recovery planning for the classified supercomputing environment. NNSA agreed with most of GAO's recommendations, but did not concur with recommendations relating to capacity planning and cost tracking.

Recommendations

Recommendations for Executive Action

Agency Affected Recommendation Status
National Nuclear Security Administration To improve the effectiveness of contingency and disaster recovery planning for NNSA's classified supercomputing capabilities, the Administrator of NNSA should, where not already implemented, direct the weapons laboratories to develop business impact analyses that, among other things, (1) identify and prioritize critical systems, data, and supporting resources; (2) identify allowable outage times and impacts for classified supercomputing capabilities; and (3) identify recovery priorities and strategies.
Closed – Implemented
In 2015 we verified that NNSA, in response to our recommendation, developed a National-Level Business Impact Analysis for its three laboratories. The business impact analysis identified and prioritized critical systems, data, and supporting resources. It also identified allowable outage times and impacts for classified supercomputing capabilities and identified recovery priorities and strategies. By taking this action, NNSA reduces the risk that the potential impact of a system outage will be unknown.
National Nuclear Security Administration To improve the effectiveness of contingency and disaster recovery planning for NNSA's classified supercomputing capabilities, the Administrator of NNSA should, where not already implemented, direct the weapons laboratories to develop and implement comprehensive contingency and disaster recovery plans for all classified supercomputing systems that identify how each weapons laboratory's classified supercomputing capabilities will be recovered following service disruptions.
Closed – Implemented
In 2015 we verified that NNSA, in response to our recommendation, developed and implemented comprehensive contingency and disaster recovery plans for all classified supercomputing systems that identify how each weapons laboratory's classified supercomputing capabilities will be recovered following a service disruption. NNSA developed a business continuity plan for its software applications that identifies how those applications will be recovered following a service disruption. NNSA also developed a remote back-up disaster and recovery agreement for two of its laboratories. Furthermore, NNSA developed a Business Continuity Plan for Sandia National Laboratories and a Continuity Plan for Advanced Simulation & Computing (ASC) High Performance Computing. By taking these actions, NNSA reduces the risk of not being able to successfully recover its supercomputing assets and operations after a service disruption.
National Nuclear Security Administration To improve the effectiveness of contingency and disaster recovery planning for NNSA's classified supercomputing capabilities, the Administrator of NNSA should, where not already implemented, direct the weapons laboratories to conduct contingency and disaster recovery plan testing.
Closed – Implemented
In 2015 we verified that NNSA, in response to our recommendation, conducted contingency and disaster recovery plan testing. By taking this action, NNSA reduces the risk of not being able to successfully recover its supercomputing assets and operations after a service disruption.
National Nuclear Security Administration To improve the effectiveness of contingency and disaster recovery planning for NNSA's classified supercomputing capabilities, the Administrator of NNSA should, where not already implemented, direct the weapons laboratories to test the three weapons laboratories' ability to share "on-demand" classified supercomputing capacity to ensure this capability will work in the event of unexpected service disruptions.
Closed – Implemented
In 2015 we verified that NNSA, in response to our recommendation, tested its three weapons laboratories' ability to share "on-demand" classified supercomputing capacity to ensure this capability will work in the event of an unexpected service disruption. By taking this action, NNSA has increased assurance that its disaster recovery approach would work effectively should a service disruption occur.
National Nuclear Security Administration The Administrator of NNSA should document an agencywide means for reprioritizing the workload across NNSA's classified supercomputing systems should a disruption occur.
Closed – Implemented
In 2015 we verified that NNSA, in response to our recommendation, documented an agencywide means for reprioritizing the workload across NNSA's classified supercomputing systems. By taking this action, NNSA has increased assurance that its disaster recovery approach would work effectively should a service disruption occur.
National Nuclear Security Administration The Administrator of NNSA should clearly define the oversight responsibilities of the NNSA Advanced Simulation and Computing (ASC) program office and the NNSA Office of the Chief Information Officer, as they relate to contingency and disaster recovery planning for NNSA's classified supercomputing operations.
Closed – Implemented
In 2015 we verified that NNSA, in response to our recommendation, clearly defined the oversight responsibilities for contingency and disaster recovery planning for its classified supercomputing operations. By taking this action, NNSA reduces the risk that it will not be able to effectively manage and oversee the recovery of its supercomputing operations should service disruptions occur.
National Nuclear Security Administration The Administrator of NNSA should identify, assess, and communicate the minimum classified supercomputing capacity needed to meet Stockpile Stewardship requirements in the event of a service disruption.
Closed – Implemented
In 2015 we verified that NNSA, in response to our recommendation, identified, assessed, and communicated the minimum classified supercomputing capacity needed to meet Stockpile Stewardship requirements in the event of a service disruption. By taking these actions, NNSA has increased assurance that its disaster recovery approach would work effectively should a service disruption occur.
National Nuclear Security Administration The Administrator of NNSA should develop, document, and implement a process that identifies and tracks expenditures for contingency and disaster recovery planning for NNSA's classified supercomputing assets.
Closed – Not Implemented
NNSA did not concur with this recommendation; it believes that tracking expenses separately would not add significant value.
National Nuclear Security Administration The Administrator of NNSA should develop and document the total anticipated costs for contingency and disaster recovery planning of NNSA's classified supercomputing assets, which includes the replacement costs for these assets.
Closed – Not Implemented
NNSA did not concur with this recommendation.

Topics

Computer capacity, Computer security, Disaster recovery, Disaster recovery plans, Information security, Internal controls, IT contingency plans, Laboratories, National defense operations, Nuclear weapons testing, Service disruption, Strategic national stockpile, Supercomputers, Weapons systems, Technology assessment, Systems testing, Cost estimates, Program costs