Resources - System Safety

EXCERPTS - MIL-STD-882D DEPARTMENT OF DEFENSE - STANDARD PRACTICE - SYSTEM SAFETY

DEFINITIONS

  1. Acquisition program.  A directed, funded effort that is designed to provide a new, improved, or continuing system in response to a validated operational need.
  2. Developer.  The individual or organization assigned responsibility for a development effort.  Developers can be either internal to the government or contractors.
  3. Fail safe.  A design feature that ensures the system remains safe, or in the event of a failure, causes the system to revert to a state that will not cause a mishap.
  4. Hazard.  Any real or potential condition that can cause injury, illness, or death to personnel; damage to or loss of equipment or property; or damage to the environment.
  5. Hazardous material.  Any substance that, due to its chemical, physical, or biological nature, causes safety, public health, or environmental concerns that would require an elevated level of effort to manage.
  6. Health hazard assessment.  The application of biomedical knowledge and principles to identify and eliminate or control health hazards associated with systems in direct support of the life-cycle management of materiel items.
  7. Life cycle.  All phases of the system's life including research, development, test and evaluation, production, deployment (inventory), operations and support, and disposal.
  8. Mishap.  An unplanned event or series of events resulting in death, injury, occupational illness, damage to or loss of equipment or property, or damage to the environment.
  9. Mishap probability.  The aggregate probability of occurrence of the individual events that might be created by a specific hazard.
  10. Mishap probability levels.  An arbitrary categorization that provides a qualitative measure of the most reasonable likelihood of occurrence of a mishap resulting from personnel error, environmental conditions, design inadequacies, procedural deficiencies, or system, subsystem, or component failure or malfunction.
  11. Mishap risk.  An expression of the impact and possibility of a mishap in terms of potential mishap severity and probability of occurrence.
  12. Mishap risk assessment.  The process of characterizing hazards within risk areas and critical technical processes, analyzing them for their potential mishap severity and probabilities of occurrence, and prioritizing them for risk mitigation actions.
  13. Mishap risk categories.  An arbitrary categorization of mishap risk assessment values often used to generate specific action such as mandatory reporting of certain hazards to management for action, or formal acceptance of the associated mishap risk.
  14. Mishap severity.  An assessment of the consequences of the most reasonable credible mishap that could be caused by a specific hazard.
  15. Mishap severity category.  An arbitrary categorization that provides a qualitative measure of the most reasonable credible mishap resulting from personnel error, environmental conditions, design inadequacies, procedural deficiencies, or system, subsystem, or component failure or malfunction.
  16. Program manager.  A government official who is responsible for managing an acquisition program.  Also, a general term of reference to those organizations directed by individual managers, exercising authority over the planning, direction, and control of tasks and associated functions essential for support of designated systems.  This term will normally be used in lieu of system support manager, weapon program manager, system manager, and project manager when such organizations perform these functions.
  17. Residual mishap risk.  The remaining mishap risk that exists after all mitigation techniques have been implemented or exhausted, in accordance with the system safety design order of precedence (see 4.4).
  18. Safety.  Freedom from those conditions that can cause death, injury, occupational illness, damage to or loss of equipment or property, or damage to the environment.
  19. Safety critical.  A term applied to any condition, event, operation, process, or item whose proper recognition, control, performance, or tolerance is essential to safe system operation and support (e.g., safety critical function, safety critical path, or safety critical component).
  20. Subsystem.  A grouping of items satisfying a logical group of functions within a particular system.
  21. System.  An integrated composite of people, products, and processes that provide a capability to satisfy a stated need or objective.
  22. System safety.  The application of engineering and management principles, criteria, and techniques to achieve acceptable mishap risk, within the constraints of operational effectiveness, time, and cost, throughout all phases of the system life cycle.
  23. System safety engineering.  An engineering discipline that employs specialized professional knowledge and skills in applying scientific and engineering principles, criteria, and techniques to identify and eliminate hazards, in order to reduce the associated mishap risk.
  24. System safety management.  All plans and actions taken to identify, assess, mitigate, and continuously track, control, and document environmental, safety, and health mishap risks encountered in the development, test, acquisition, use, and disposal of DoD weapon systems, subsystems, equipment, and facilities.

GENERAL REQUIREMENTS OF SYSTEM SAFETY

Identification of hazards.  Identify hazards through a systematic hazard analysis process encompassing detailed analysis of system hardware and software, the environment (in which the system will exist), and the intended usage or application.  Historical hazard and mishap data, including lessons learned from other systems, are considered and used.  Identification of hazards is a responsibility of all members of the program.  During hazard identification, consideration is given to hazards over the system life cycle.

Assessment of mishap risk.  Assess the severity and probability of the mishap risk associated with each identified hazard, i.e., determine the potential impact of the hazard on personnel, facilities, equipment, operations, the public, and the environment, as well as on the system itself.

Identification of mishap risk mitigation measures.  Identify potential mishap risk mitigation alternatives and the expected effectiveness of each alternative or method.  Mishap risk mitigation is an iterative process that culminates when the residual mishap risk has been reduced to a level acceptable to the appropriate authority.  The system safety design order of precedence for mitigating identified hazards is:
  1. Eliminate hazards through design selection.  If an identified hazard cannot be eliminated, reduce the associated mishap risk to an acceptable level.
  2. Incorporate safety devices.  If the hazard cannot be eliminated, reduce the mishap risk to an acceptable level through the use of protective safety features or devices.
  3. Provide warning devices.  If safety devices do not adequately lower the mishap risk of the hazard, include a detection and warning system to alert personnel to the particular hazard.
  4. Develop procedures and training.  Where it is impractical to eliminate hazards through design selection or to reduce the associated risk to an acceptable level with safety and warning devices, incorporate special procedures and training.  Procedures may include the use of personal protective equipment.
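
The order of precedence above can be read as a ranking over mitigation types. The following is a minimal sketch of that reading, not part of the standard: the enum names and the `preferred_mitigation` helper are hypothetical, and the set of feasible options is assumed to come from the program's own engineering judgment.

```python
from enum import IntEnum


class Mitigation(IntEnum):
    """System safety design order of precedence (1 = most preferred)."""
    DESIGN_SELECTION = 1
    SAFETY_DEVICE = 2
    WARNING_DEVICE = 3
    PROCEDURES_AND_TRAINING = 4


def preferred_mitigation(feasible):
    """Return the highest-precedence option among those judged feasible.

    `feasible` is a hypothetical input: the set of Mitigation members
    the program considers practical for a single hazard.
    """
    if not feasible:
        raise ValueError("no feasible mitigation; hazard needs further review")
    return min(feasible)


# Example: a safety device outranks a warning device.
choice = preferred_mitigation({Mitigation.WARNING_DEVICE, Mitigation.SAFETY_DEVICE})
print(choice.name)
```

Selecting an option this way says nothing about whether the resulting residual mishap risk is acceptable; that decision remains with the appropriate authority.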

Reduction of mishap risk to an acceptable level.  Reduce the mishap risk through a mitigation approach mutually agreed to by both the developer and the program manager.  Residual mishap risk and hazards must be communicated to the associated test effort for verification.

Verification of mishap risk reduction.  Verify the mishap risk reduction and mitigation through appropriate analysis, testing, or inspection.  Document the determined residual mishap risk.  New hazards identified during testing must be reported to the program manager and the developer.

Review of hazards and acceptance of residual mishap risk by the appropriate authority.  Notify the program manager of identified hazards and residual mishap risk.  The program manager must ensure that remaining hazards and residual mishap risk are reviewed and accepted by the appropriate risk acceptance authority.  The appropriate risk acceptance authority must include the system user in the mishap risk review.  The appropriate risk acceptance authority must formally acknowledge and document acceptance of hazards and residual mishap risk.

Tracking of hazards and residual mishap risk.  Track hazards, their closure, and residual mishap risk.  A tracking system for hazards, their closure, and residual mishap risk must be maintained throughout the system life cycle.  The program manager must keep the system user apprised of the hazards and residual mishap risk.
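
One way to picture the tracking requirement is as a small hazard log. The sketch below is illustrative only: the standard does not prescribe a record layout, and the field names (`hazard_id`, `residual_risk`, `accepted_by`) and the `HazardLog` class are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class HazardRecord:
    hazard_id: str
    description: str
    closed: bool = False
    residual_risk: Optional[str] = None  # e.g. a risk category or free text
    accepted_by: Optional[str] = None    # the risk acceptance authority


@dataclass
class HazardLog:
    records: Dict[str, HazardRecord] = field(default_factory=dict)

    def add(self, record: HazardRecord) -> None:
        self.records[record.hazard_id] = record

    def close(self, hazard_id: str, residual_risk: str, accepted_by: str) -> None:
        """Close a hazard only together with its documented acceptance."""
        record = self.records[hazard_id]
        record.residual_risk = residual_risk
        record.accepted_by = accepted_by
        record.closed = True

    def open_hazards(self) -> List[HazardRecord]:
        return [r for r in self.records.values() if not r.closed]


log = HazardLog()
log.add(HazardRecord("H-001", "Hydraulic line routed near hot surface"))
log.add(HazardRecord("H-002", "Unlabeled high-voltage access panel"))
log.close("H-001", residual_risk="Marginal / Remote", accepted_by="Program manager")
print([r.hazard_id for r in log.open_hazards()])
```

Closed records stay in the log, which matches the requirement to track hazards, their closure, and residual mishap risk throughout the life cycle.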

GENERAL REQUIREMENTS

General. System safety applies engineering and management principles, criteria, and techniques to achieve acceptable mishap risk, within the constraints of operational effectiveness, time, and cost, throughout all phases of the system life cycle.  It draws upon professional knowledge and specialized skills in the mathematical, physical, and scientific disciplines, together with the principles and methods of engineering design and analysis, to specify and evaluate the environmental, safety, and health mishap risk associated with a system.  Experience indicates that the degree of safety achieved in a system is directly dependent upon the emphasis given.  The program manager and the developer must apply this emphasis during all phases of the life cycle.  A safe design is a prerequisite for safe operations, with the goal being to produce an inherently safe product that will have the minimum safety-imposed operational restrictions.

System safety in environmental and health hazard management. While environmental and health hazard management are normally associated with the application of statutory direction and requirements, the management of mishap risk associated with actual environmental and health hazards is directly addressed by the system safety approach.  Therefore, environmental and health hazards can be analyzed and managed with the same tools as any other hazard, whether they affect equipment, the environment, or personnel.

System safety planning

Prior to formally documenting the system safety approach, the program manager, in concert with systems engineering and associated system safety professionals, must determine what system safety effort is necessary to meet program and regulatory requirements.  This effort includes developing a planned approach for safety task accomplishment, providing qualified people to accomplish the tasks, establishing the authority for implementing the safety tasks through all levels of management, and allocating appropriate resources to ensure that the safety tasks are completed.

System safety planning subtasks.  System safety planning subtasks should:

  1. Establish specific safety performance requirements based on overall program requirements and system user inputs.
  2. Establish a system safety organization or function and the required lines of communication with associated organizations (government and contractor). 
  3. Establish interfaces between system safety and other functional elements of the program, as well as with other safety and engineering disciplines (such as nuclear, range, explosive, chemical, and biological).
  4. Designate the organizational unit responsible for executing each safety task.  Establish the authority for resolution of identified hazards.
  5. Establish system safety milestones and relate these to major program milestones, program element responsibility, and required inputs and outputs.
  6. Establish an incident alerting/notification, investigation, and reporting process, to include notification of the program manager.
  7. Establish an acceptable level of mishap risk, mishap probability and severity thresholds, and documentation requirements (including but not limited to hazards and residual mishap risk).
  8. Establish an approach and methodology for reporting to the program manager the following information:
    • Safety critical characteristics and features.
    • Operating, maintenance, and overhaul safety requirements.
    • Measures used to eliminate or mitigate hazards.
    • Acquisition management of hazardous materials.
  9. Establish the method for the formal acceptance and documenting of residual mishap risks and the associated hazards.
  10. Establish the method for communicating hazards, the associated risks, and residual mishap risk to the system user.
  11. Specify requirements for other specialized safety approvals as necessary.

Safety performance requirements

These are the general safety requirements needed to meet the core program objectives.  The more closely these requirements relate to a given program, the more easily the designers can incorporate them into the system.  In the appropriate system specifications, incorporate the safety performance requirements that are applicable, and the specific risk levels considered acceptable for the system.

Acceptable risk levels can be defined in terms of:

  • a hazard category developed through a mishap risk assessment matrix;
  • an overall system mishap rate;
  • demonstration of controls required to preclude unacceptable conditions;
  • satisfaction of specified standards and regulatory requirements; or
  • other suitable mishap risk assessment procedures. 

Listed below are some examples of how safety performance requirements could be stated.

  • Quantitative requirements.  Quantitative requirements are usually expressed as a failure or mishap rate, such as "The catastrophic system mishap rate shall not exceed x.xx × 10^-y per operational hour."
  • Mishap risk requirements.  Mishap risk requirements could be expressed as "No hazards assigned a Catastrophic mishap severity are acceptable."  Mishap risk requirements could also be expressed as a level defined by a mishap risk assessment, such as "No Category 3 or higher mishap risks are acceptable."
  • Standardization requirements.  Standardization requirements are expressed relative to a known standard that is relevant to the system being developed.  Examples include: "The system will comply with the laws of the State of XXXXX and be operable on the highways of the State of XXXXX" or "The system will be designed to meet ANSI Std XXX as a minimum."
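
As a rough illustration of a quantitative requirement, the sketch below compares an observed mishap rate against a stated limit. The function name, the example counts, and the 1e-5 per hour limit are all assumptions for illustration; the standard leaves the actual figure ("x.xx × 10^-y") to the program.

```python
def meets_rate_requirement(mishaps: int, operational_hours: float,
                           max_rate_per_hour: float) -> bool:
    """Return True if the observed rate does not exceed the stated limit."""
    if operational_hours <= 0:
        raise ValueError("operational_hours must be positive")
    observed_rate = mishaps / operational_hours
    return observed_rate <= max_rate_per_hour


# Hypothetical figures: 2 mishaps in 1,000,000 operational hours
# against an assumed limit of 1e-5 mishaps per operational hour.
print(meets_rate_requirement(2, 1_000_000, 1e-5))
```

A point estimate like this is only a first check. Demonstrating compliance with a rate requirement usually also calls for statistical confidence bounds, which this sketch does not attempt.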

Safety design requirements

The program manager, in concert with the chief engineer and utilizing systems engineering and associated system safety professionals, should establish specific safety design requirements for the overall system.  The objective of safety design requirements is to achieve acceptable mishap risk through a systematic application of design guidance from standards, specifications, regulations, design handbooks, safety design checklists, and other sources.  These are reviewed for safety design parameters and acceptance criteria applicable to the system.  Safety design requirements derived from the selected parameters, as well as any associated acceptance criteria, are included in the system specification.  These requirements and criteria are expanded for inclusion in the associated follow-on or lower level specifications.  Some general safety system design requirements are listed below.

1.  Hazardous material use is minimized, eliminated, or associated mishap risks are reduced through design, including material selection or substitution.  When potentially hazardous materials must be used, the materials that pose the least risk throughout the life cycle of the system are selected.

2.  Hazardous substances, components, and operations are isolated from other activities, areas, personnel, and incompatible materials.

3.  Equipment is located so that access during operations, servicing, repair, or adjustment minimizes personnel exposure to hazards (e.g., hazardous substances, high voltage, electromagnetic radiation, and cutting and puncturing surfaces).

4.  Power sources, controls, and critical components of redundant subsystems are protected by physical separation or shielding, or by other acceptable methods.

5.  Safety devices that will minimize mishap risk (e.g., interlocks, redundancy, fail safe design, system protection, fire suppression, and protective measures such as clothing, equipment, devices, and procedures) are considered for hazards that cannot be eliminated.  Provisions are made for periodic functional checks of safety devices when applicable.

6.  System disposal (including explosive ordnance disposal) and demilitarization are considered in the design.

7.  Warning signals are implemented so as to minimize the probability of incorrect personnel reaction to the signals, and should be standardized within like types of systems.

8.  Warning and cautionary notes are provided in assembly, operation, and maintenance instructions, and distinctive markings are provided on hazardous components, equipment, and facilities to ensure personnel and equipment protection when no alternate design approach can eliminate a hazard.  Use standard warning and cautionary notations where multiple applications occur.  Standardize notations in accordance with commonly accepted commercial practice or, if none exists, normal military procedures.  Do not use warning, caution, or other written advisory as the only risk reduction method for hazards assigned Catastrophic or Critical mishap severities. 

9.  Safety critical tasks may require personnel proficiency; if so, the developer should propose a proficiency certification process to be used.

10.  Severity of injury or damage to equipment or the environment as a result of a mishap is minimized.

11.  Inadequate or overly restrictive requirements regarding safety are not included in the system specification.

12.  Acceptable risk is achieved in implementing new technology, materials, or designs in an item’s production, test, and operation.  Changes to design, configuration, production, or mission requirements (including any resulting system modifications and upgrades, retrofits, insertions of new technologies or materials, or use of new production or test techniques) are accomplished in a manner that maintains an acceptable level of mishap risk.  Changes to the environment in which the system operates are analyzed to identify and mitigate any resulting hazards or changes in mishap risks.

Safety Critical Conditions

Some program managers may include the following conditions in their solicitation, system specification, or contract as requirements for the system design.  These condition statements are used optionally as supplemental requirements based on specific program needs.

Unacceptable conditions.  The following safety critical conditions are considered unacceptable for development efforts.  Positive action and verified implementation are required to reduce the mishap risk associated with these situations to a level acceptable to the program manager.

1.  Single component failure, common mode failure, human error, or a design feature that could cause a mishap of Catastrophic or Critical severity.

2.  Dual independent component failures, dual independent human errors, or a combination of a component failure and a human error involving safety critical command and control functions, which could cause a mishap of Catastrophic or Critical severity.

3.  Generation of hazardous radiation or energy, when no provisions have been made to protect personnel or sensitive subsystems from damage or adverse effects.

4.  Packaging or handling procedures and characteristics that could cause a mishap for which no controls have been provided to protect personnel or sensitive equipment.

5.  Hazard categories that are specified as unacceptable in the development agreement.

Acceptable conditions.  The following approaches are considered acceptable for correcting unacceptable conditions and will require no further analysis once mitigating actions are implemented and verified.

1.  For non-safety critical command and control functions: a system design that requires two or more independent human errors, or that requires two or more independent failures, or a combination of independent failure and human error.

2.  For safety critical command and control functions: a system design that requires at least three independent failures, or three independent human errors, or a combination of three independent failures and human errors.

3.  System designs that positively prevent errors in assembly, installation, or connections that could result in a mishap.

4.  System designs that positively prevent damage propagation from one component to another or prevent sufficient energy propagation to cause a mishap.

5.  System design limitations on operation, interaction, or sequencing that preclude occurrence of a mishap.

6.  System designs that provide an approved safety factor, or a fixed design allowance that limits, to an acceptable level, possibilities of structural failure or release of energy sufficient to cause a mishap.

7.  System designs that control energy build-up that could potentially cause a mishap (e.g., fuses, relief valves, or electrical explosion proofing).

8.  System designs where component failure can be temporarily tolerated because of residual strength or alternate operating paths, so that operations can continue with a reduced but acceptable safety margin.

9.  System designs that positively alert the controlling personnel to a hazardous situation where the capability for operator reaction has been provided.

10.  System designs that limit or control the use of hazardous materials.
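
The first two acceptable conditions amount to a minimum count of independent failures or human errors that must combine before a mishap can occur. The sketch below encodes only that count: the function name and its inputs are hypothetical, and establishing that the faults really are independent is an analysis task the code does not address.

```python
def meets_fault_tolerance(independent_faults_to_mishap: int,
                          safety_critical: bool) -> bool:
    """Check the minimum number of independent failures and/or human
    errors needed to cause a mishap, per acceptable conditions 1 and 2.

    Safety critical command and control functions need at least three;
    non-safety critical functions need at least two.
    """
    if independent_faults_to_mishap < 1:
        raise ValueError("at least one fault is needed to cause a mishap")
    required = 3 if safety_critical else 2
    return independent_faults_to_mishap >= required


# A design where two independent faults suffice is acceptable for a
# non-safety critical function but not for a safety critical one.
print(meets_fault_tolerance(2, safety_critical=False))
print(meets_fault_tolerance(2, safety_critical=True))
```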

Elements of an effective system safety effort.

Elements of an effective system safety effort include:

1.  Management is always aware, and formally documents this awareness, of the mishap risks associated with the system.  Hazards associated with the system are identified, assessed, tracked, monitored, and the associated risks are either eliminated or controlled to an acceptable level throughout the life cycle.  Actions taken to eliminate or reduce mishap risk to an acceptable level are identified and archived for tracking and lessons learned purposes.

2.  Historical hazard and mishap data, including lessons learned from other systems, are considered and used.

3.  Environmental protection, safety, and occupational health, consistent with mission requirements, are designed into the system in a timely, cost-effective manner.  Inclusion of the appropriate safety features is accomplished during the applicable phases of the system life cycle.

4.  Mishap risk resulting from harmful environmental conditions (e.g., temperature, pressure, noise, toxicity, acceleration, and vibration) and human error in system operation and support is minimized.

5.  System users are kept abreast of the safety of the system and included in the safety decision process.

A.4.4 

A system safety engineering effort consists of eight main requirements.  The following paragraphs further describe the efforts typically expected under each of the system safety requirements listed in paragraph 4.

Documentation of the system safety approach. 

The documentation of the system safety approach should describe the planned tasks and activities of system safety management and system engineering required to identify, evaluate, and eliminate or control hazards, or to reduce the residual mishap risk to a level acceptable throughout the system life cycle.  The documentation should describe, as a minimum, the four elements of an effective system safety effort:  a planned approach for task accomplishment, qualified people to accomplish tasks, the authority to implement tasks through all levels of management, and the appropriate commitment of resources (both manning and funding) to ensure that safety tasks are completed.  Specifically, the provided documentation should:

1. Describe the scope of the overall system program and the related system safety effort.  Define system safety program milestones.  Relate these to major program milestones, program element responsibility, and required inputs and outputs.

2.  Describe the safety tasks and activities of system safety management and engineering.  Describe the interrelationships between system safety and other functional elements of the program.  List the other program requirements and tasks applicable to system safety and reference where they are specified or described.  Include the organizational relationships between other functional elements having responsibility for tasks with system safety impacts and the system safety management and engineering organization including the review and approval authority of those tasks.

3.  Describe specific analysis techniques and formats to be used in qualitative or quantitative assessments of hazards, their causes, and effects.

4.  Describe the process through which management decisions will be made (for example, timely notification of unacceptable risks, necessary action, incidents or malfunctions, waivers to safety requirements, and program deviations).  Include a description of how residual mishap risk is formally accepted and how this acceptance is documented.

5.  Describe the mishap risk assessment procedures, including the mishap severity categories, mishap probability levels, and the system safety design order of precedence that should be followed to satisfy the safety requirements of the program.  State any qualitative or quantitative measures of safety to be used for mishap risk assessment including a description of the acceptable and unacceptable risk levels (if applicable).  Include system safety definitions that modify, deviate from, or are in addition to those in this standard or generally accepted by the system safety community.

6. Describe how resolution and action relative to system safety will be implemented at the program management level possessing resolution authority.

7. Describe the verification (e.g., test, analysis, demonstration, or inspection) requirements for ensuring that safety is adequately attained.  Identify any certification requirements for software, safety devices, or other special safety features (e.g., render safe and emergency disposal procedures).

8.  Describe the mishap or incident notification, investigation, and reporting process for the program, including notification of the program manager.

9.  Describe the approach for collecting and processing pertinent historical hazard, mishap, and safety lessons learned data.  Include a description of how a system hazard log is developed and kept current (see A.4.4.8.1).

10.  Describe how the user is kept abreast of residual mishap risk and the associated hazards.

Identification of hazards

Identify hazards through a systematic hazard analysis process encompassing detailed analysis of system hardware and software, the environment (in which the system will exist), and the intended usage or application.  Historical hazard and mishap data, including lessons learned from other systems, are considered and used.

Approaches for identifying hazards.

Numerous approaches have been developed and used to identify system hazards.  A key aspect of many of these approaches is empowering the design engineer with the authority to design safe systems and the responsibility to identify to program management the hazards associated with the design.  Hazard identification approaches often include using system users in the effort.  Commonly used approaches for identifying hazards can be found in the Defense Acquisition Deskbook and System Safety Society’s System Safety Analysis Handbook.

Assessment of Mishap Risk

Assess the severity and probability of the mishap risk associated with each identified hazard, i.e., determine the potential impact of the hazard on personnel, facilities, equipment, operations, the public, or environment, as well as on the system itself.

1. Mishap risk assessment models. To determine what actions to take to eliminate or control identified hazards, a system of determining the level of mishap risk involved must be developed.  A good mishap risk assessment model will enable decision makers to properly understand the level of mishap risk involved, relative to what it will cost in schedule and dollars to reduce that mishap risk to an acceptable level.

2. Model development.  Key to most mishap risk assessment models is the characterization of mishap risks as to mishap severity and mishap probability.  Since the highest system safety design order of precedence is to eliminate hazards by design, a mishap risk assessment procedure considering only mishap severity will generally suffice during the early design phase to minimize the system’s mishap risks (for example, just don’t use hazardous or toxic material in the design).  When all hazards cannot be eliminated during the early design phase, a mishap risk assessment procedure based upon the mishap probability as well as the mishap severity provides a resultant mishap risk assessment.  The assessment is used to establish priorities for corrective action, resolution of identified hazards, and notification to management of the mishap risks.  The information provided here is a suggested model and set of definitions that can be used.  Program managers are allowed to develop models and definitions appropriate to their individual programs.

Mishap severity.  Mishap severity categories are defined to provide a qualitative measure of the most reasonable credible mishap resulting from personnel error, environmental conditions, design inadequacies, procedural deficiencies, or system, subsystem, or component failure or malfunction.  Suggested mishap severity categories are shown in Table A-I.

TABLE A-I.  Suggested mishap severity categories.

Description  | Category | Environmental, Safety, and Health Result Criteria
Catastrophic | I        | Could result in death, permanent total disability, loss exceeding $1M, or irreversible severe environmental damage that violates law or regulation.
Critical     | II       | Could result in permanent partial disability, injuries or occupational illness that may result in hospitalization of at least three personnel, loss exceeding $200K but less than $1M, or reversible environmental damage causing a violation of law or regulation.
Marginal     | III      | Could result in injury or occupational illness resulting in one or more lost work day(s), loss exceeding $10K but less than $200K, or mitigatible environmental damage without violation of law or regulation where restoration activities can be accomplished.
Negligible   | IV       | Could result in injury or illness not resulting in a lost work day, loss exceeding $2K but less than $10K, or minimal environmental damage not violating law or regulation.

NOTE:  These mishap severity categories provide guidance to a wide variety of programs.  However, adaptation to a particular program is generally required to provide a mutual understanding between the program manager and the developer as to the meaning of the terms used in the category definitions.  Other risk assessment techniques may be used provided that the user approves them.
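
The dollar-loss criteria in Table A-I can be sketched as a simple threshold lookup. This is an illustration under stated assumptions, not the standard's method: it models only the loss criterion (not injury or environmental criteria), the function and constant names are hypothetical, and a loss exactly on a boundary, which the table leaves undefined, is assigned here to the less severe category.

```python
from typing import Optional

# Lower bounds in US dollars from Table A-I, most severe first.
LOSS_THRESHOLDS = (
    ("I", 1_000_000),    # Catastrophic: loss exceeding $1M
    ("II", 200_000),     # Critical: exceeding $200K but less than $1M
    ("III", 10_000),     # Marginal: exceeding $10K but less than $200K
    ("IV", 2_000),       # Negligible: exceeding $2K but less than $10K
)


def severity_from_loss(loss_usd: float) -> Optional[str]:
    """Return the suggested severity category for a dollar loss.

    Returns None for losses of $2K or less, which fall below the
    lowest category in the table.
    """
    if loss_usd < 0:
        raise ValueError("loss_usd must be non-negative")
    for category, lower_bound in LOSS_THRESHOLDS:
        if loss_usd > lower_bound:
            return category
    return None


print(severity_from_loss(500_000))
```

In practice the category assigned to a hazard would be the most severe one indicated by any of the criteria, so a loss-only lookup like this gives at most a lower bound on severity.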

Mishap probability.  Mishap probability is the probability that a mishap will occur during the planned life expectancy of the system.  It can be described in terms of potential occurrences per unit of time, events, population, items, or activity.  Assigning a quantitative mishap probability to a potential design or procedural hazard is generally not possible early in the design process.  At that stage, a qualitative mishap probability may be derived from research, analysis, and evaluation of historical safety data from similar systems.  Supporting rationale for assigning a mishap probability is documented in hazard analysis reports.  Suggested qualitative mishap probability levels are shown in Table A-II.

TABLE A-II.  Suggested mishap probability levels.

    Description*: Frequent.  Level: A.
        Specific Individual Item:  Likely to occur often in the life of an item, with a probability of occurrence greater than 10^-1 in that life.
        Fleet or Inventory**:  Continuously experienced.

    Description*: Probable.  Level: B.
        Specific Individual Item:  Will occur several times in the life of an item, with a probability of occurrence less than 10^-1 but greater than 10^-2 in that life.
        Fleet or Inventory**:  Will occur frequently.

    Description*: Occasional.  Level: C.
        Specific Individual Item:  Likely to occur some time in the life of an item, with a probability of occurrence less than 10^-2 but greater than 10^-3 in that life.
        Fleet or Inventory**:  Will occur several times.

    Description*: Remote.  Level: D.
        Specific Individual Item:  Unlikely but possible to occur in the life of an item, with a probability of occurrence less than 10^-3 but greater than 10^-6 in that life.
        Fleet or Inventory**:  Unlikely, but can reasonably be expected to occur.

    Description*: Improbable.  Level: E.
        Specific Individual Item:  So unlikely, it can be assumed occurrence may not be experienced, with a probability of occurrence less than 10^-6 in that life.
        Fleet or Inventory**:  Unlikely to occur, but possible.

 

   *Definitions of descriptive words may have to be modified based on quantity of items involved.

**The expected size of the fleet or inventory should be defined prior to accomplishing an assessment of the system.
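Where a quantitative estimate is available, the "Specific Individual Item" thresholds of Table A-II can be expressed as a simple lookup.  The following is a minimal sketch: the function name `probability_level` is hypothetical, and because the table does not say which level a probability exactly equal to a threshold belongs to, assigning it to the lower (less frequent) level is an assumption.

```python
def probability_level(p: float) -> str:
    """Map a per-item-life probability of occurrence to a level A-E.

    Thresholds follow the "Specific Individual Item" column of Table A-II.
    Values exactly on a threshold fall into the less frequent level
    (an assumption; the table leaves this open).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if p > 1e-1:
        return "A"  # Frequent
    if p > 1e-2:
        return "B"  # Probable
    if p > 1e-3:
        return "C"  # Occasional
    if p > 1e-6:
        return "D"  # Remote
    return "E"      # Improbable


print(probability_level(0.005))
```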

Mishap risk assessment.  Mishap risk characterization as to mishap severity and mishap probability can be performed through the use of the mishap risk assessment matrix.  This assessment allows one to assign a mishap risk assessment value to a hazard based on its mishap severity and its mishap probability.  This value is then often used to rank different hazards as to their associated mishap risks.  An example of a mishap risk assessment matrix is shown in Table A-III.

TABLE A-III.  Example mishap risk assessment values.

                                     SEVERITY
    PROBABILITY    Catastrophic    Critical    Marginal    Negligible
    Frequent             1             3           7           13
    Probable             2             5           9           16
    Occasional           4             6          11           18
    Remote               8            10          14           19
    Improbable          12            15          17           20
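The example matrix lends itself to a direct table lookup.  The sketch below encodes the values of Table A-III as a nested dictionary; the names `RISK_MATRIX` and `risk_value` are illustrative and not taken from the standard, and a program using a different matrix would substitute its own values.

```python
# Example values from Table A-III, keyed by probability then severity.
RISK_MATRIX = {
    "Frequent":   {"Catastrophic": 1,  "Critical": 3,  "Marginal": 7,  "Negligible": 13},
    "Probable":   {"Catastrophic": 2,  "Critical": 5,  "Marginal": 9,  "Negligible": 16},
    "Occasional": {"Catastrophic": 4,  "Critical": 6,  "Marginal": 11, "Negligible": 18},
    "Remote":     {"Catastrophic": 8,  "Critical": 10, "Marginal": 14, "Negligible": 19},
    "Improbable": {"Catastrophic": 12, "Critical": 15, "Marginal": 17, "Negligible": 20},
}


def risk_value(probability: str, severity: str) -> int:
    """Return the example mishap risk assessment value (1 = highest risk)."""
    try:
        return RISK_MATRIX[probability][severity]
    except KeyError:
        raise ValueError(f"unknown probability/severity: {probability!r}, {severity!r}")


print(risk_value("Remote", "Critical"))
```

Because each of the values 1 through 20 appears exactly once in the example matrix, the value alone gives a strict ranking of the probability and severity combinations.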

 

Mishap risk categories.  Mishap risk assessment values are often used in grouping individual hazards into mishap risk categories.  Mishap risk categories are then used to generate specific action such as mandatory reporting of certain hazards to management for action or formal acceptance of the associated mishap risk.  Table A-IV includes an example listing of mishap risk categories and the associated assessment values.  In the example, the system management has determined that mishap risk assessment values 1 through 5 constitute “High” risk while values 6 through 9 constitute “Serious” risk.

TABLE A-IV.  Example mishap risk categories and mishap risk acceptance levels.  

    Mishap Risk Assessment Value    Mishap Risk Category    Mishap Risk Acceptance Level
    1 – 5                           High                    Component Acquisition Executive
    6 – 9                           Serious                 Program Executive Officer
    10 – 17                         Medium                  Program Manager
    18 – 20                         Low                     As directed

 

*Representative mishap risk acceptance levels are shown in the above table.  Mishap risk acceptance is discussed in paragraph A.4.4.7.
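The grouping in Table A-IV can be sketched as a range lookup from a mishap risk assessment value to its category and representative acceptance level.  The names `RISK_CATEGORIES` and `risk_category` are hypothetical, and the ranges are the example ones from the table, which a given program may redefine.

```python
from typing import Tuple

# (value range, mishap risk category, representative acceptance level),
# taken from the example in Table A-IV.
RISK_CATEGORIES = [
    (range(1, 6),   "High",    "Component Acquisition Executive"),
    (range(6, 10),  "Serious", "Program Executive Officer"),
    (range(10, 18), "Medium",  "Program Manager"),
    (range(18, 21), "Low",     "As directed"),
]


def risk_category(value: int) -> Tuple[str, str]:
    """Return (category, acceptance level) for an assessment value 1-20."""
    for values, category, acceptance in RISK_CATEGORIES:
        if value in values:
            return category, acceptance
    raise ValueError(f"assessment value out of range: {value}")


print(risk_category(7))
```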

Mishap risk impact.  The mishap risk impact is assessed, as necessary, using other factors to discriminate between hazards having the same mishap risk value.  One might discriminate between hazards with the same mishap risk assessment value in terms of mission capabilities, or social, economic, and political factors.  This would be a program management decision used to prioritize resulting actions.
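One way to picture this kind of discrimination is a sort that ranks hazards by mishap risk assessment value first and uses a program-defined impact score only to break ties.  This is a sketch under stated assumptions: the `Hazard` class, the `impact_score` field, and its "higher means greater impact" convention are all hypothetical, since the standard leaves the choice of factors to program management.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hazard:
    name: str
    risk_value: int      # mishap risk assessment value; 1 is the highest risk
    impact_score: float  # hypothetical program-defined score; higher = greater impact


def prioritize(hazards: List[Hazard]) -> List[Hazard]:
    """Order hazards by risk value, breaking ties by descending impact."""
    return sorted(hazards, key=lambda h: (h.risk_value, -h.impact_score))


hazards = [
    Hazard("fuel leak", 6, 0.4),
    Hazard("sensor dropout", 6, 0.9),
    Hazard("battery fire", 2, 0.1),
]
for h in prioritize(hazards):
    print(h.name, h.risk_value, h.impact_score)
```

The impact score never overrides the assessment value in this sketch; it only orders hazards that share the same value.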

Mishap risk assessment approaches.  Commonly used approaches for assessing mishap risk can be found in the Defense Acquisition Deskbook and the System Safety Society’s System Safety Analysis Handbook (see A.6.1).

Identification of mishap risk mitigation measures.

Identify potential mishap risk mitigation alternatives and the expected effectiveness of each alternative or method.  Mishap risk mitigation is an iterative process that culminates when the residual mishap risk has been reduced to a level acceptable to the appropriate authority.

Prioritize hazards for corrective action.  To eliminate or otherwise control as many hazards as possible, prioritize hazards for corrective action.  A categorization of hazards may be conducted according to the mishap risk potential they present.

System safety design order of precedence. The ultimate goal of a system safety program is to design systems that contain no hazards.  However, since the nature of most complex systems makes it impossible or impractical to design them completely hazard-free, a successful system safety program often provides a system design where there exist no hazards resulting in an unacceptable level of mishap risk.  As hazard analyses are performed, hazards will be identified that will require resolution.  The system safety design order of precedence defines the order to be followed for satisfying system safety requirements and reducing risks.  The alternatives for eliminating the specific hazard or controlling its associated risk are evaluated so that an acceptable method for mishap risk reduction can be agreed to.

Reduction of mishap risk to an acceptable level.  Reduce the system mishap risk through a mitigation approach mutually agreed to by both the developer and the program manager.

Communication with associated test efforts.  Residual mishap risk and associated hazards must be communicated to the system test efforts for verification. 

Verification of mishap risk reduction.  Verify the mishap risk reduction and mitigation through appropriate analysis, testing, or inspection.  Document the determined residual mishap risk.  The program manager must ensure that the selected mitigation approaches will result in the expected residual mishap risk.  To provide this assurance, the system test effort should verify the performance of the mitigation actions.  New hazards identified during testing must be reported to the program manager and the developer.

Testing for a safe design.  Tests and demonstrations must be defined to validate selected safety features of the system.  Tests or demonstrations must be performed on safety critical equipment and procedures to determine the mishap severity or to establish the margin of safety of the design.  Induced or simulated failures will be considered to demonstrate the failure mode and acceptability of safety critical equipment.  Where hazards are identified during the development effort and it cannot be analytically determined whether the action taken will adequately control the hazard, safety tests must be conducted to evaluate the effectiveness of the controls.  Where costs for safety testing would be prohibitive, safety characteristics or procedures may be verified by engineering analyses, analogy, laboratory test, functional mockups, or subscale/model simulation.  Tests of safety systems should be integrated into appropriate system test and demonstration plans to the maximum extent possible.

Conducting safe testing.  The program manager must ensure that test teams are familiar with mishap risks of the system. Test plans, procedures, and test results for all tests including design verification, operational evaluation, production acceptance, and shelf-life validation should be reviewed to ensure that:

1.  Safety is adequately demonstrated.

2.  The testing will be conducted in a safe manner.

3.  All additional hazards introduced by testing procedures, instrumentation, test hardware, and test environment are properly identified and controlled.

Communication of new hazards identified during testing.  Testing organizations must ensure that hazards and safety discrepancies discovered during testing are communicated to the program manager and the developer.

Review and acceptance of residual mishap risk by the appropriate authority.  Notify the program manager of identified hazards and residual mishap risk.

Residual mishap risk.  The mishap risk that remains after all planned mishap risk management measures have been implemented is considered residual mishap risk.  Residual mishap risk is documented along with the reason(s) for incomplete mitigation.

Residual mishap risk management.  The program manager must know what residual mishap risk exists in the system being acquired.  For significant mishap risks, the program manager is required to elevate reporting of residual mishap risk to higher levels of appropriate authority (such as the Program Executive Officer or Component Acquisition Executive) for action or acceptance.  The program manager is encouraged to apply additional resources or other remedies to help the developer satisfactorily resolve hazards providing significant mishap risk.  Table A-IV includes an example of a mishap risk acceptance level matrix based on the mishap risk assessment value and mishap risk category.

Residual mishap risk acceptance.  The program manager is responsible for formally documenting the acceptance of the residual mishap risk of the system by the appropriate authority.  The program manager should update this residual mishap risk and the associated hazards to reflect changes in the system or its use.  The program manager should keep the system users apprised of the residual mishap risk of the system and the associated hazards.

Tracking hazards and residual mishap risk.  Track hazards, their closures, and residual mishap risk.  A tracking system for hazards, their closures, and residual mishap risk must be maintained throughout the system life cycle.  The program manager must keep the system user apprised of system hazards and residual mishap risk.

Process for tracking of hazards and residual mishap risk.  Each system must have a current log of identified hazards and residual mishap risk, including an assessment of the residual mishap risk (see A.4.4.7).  As changes are integrated into the system, this log is updated to incorporate added or changed hazards and the associated residual mishap risk.  The Government must formally acknowledge acceptance of system hazards and residual mishap risk.  Users will be kept informed of hazards and residual mishap risk associated with their systems.
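The log described above could be modeled, very roughly, as a small in-memory record store.  The sketch below is illustrative only: the class and field names (`HazardLog`, `HazardEntry`, `accepted_by`, and so on) are hypothetical, and the rule that a changed residual risk needs fresh acceptance is an assumption rather than a requirement stated in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class HazardEntry:
    hazard_id: str
    description: str
    residual_risk_value: int            # assessment value after mitigation
    closed: bool = False
    accepted_by: Optional[str] = None   # authority that accepted the residual risk
    history: List[str] = field(default_factory=list)


class HazardLog:
    """Minimal log of hazards, their closures, and residual mishap risk."""

    def __init__(self) -> None:
        self._entries: Dict[str, HazardEntry] = {}

    def add(self, hazard_id: str, description: str, residual_risk_value: int) -> None:
        if hazard_id in self._entries:
            raise ValueError(f"duplicate hazard id: {hazard_id}")
        entry = HazardEntry(hazard_id, description, residual_risk_value)
        entry.history.append("identified")
        self._entries[hazard_id] = entry

    def update_risk(self, hazard_id: str, new_value: int, reason: str) -> None:
        entry = self._entries[hazard_id]
        entry.history.append(f"risk {entry.residual_risk_value} -> {new_value}: {reason}")
        entry.residual_risk_value = new_value
        entry.accepted_by = None  # assumption: a changed risk needs fresh acceptance

    def accept(self, hazard_id: str, authority: str) -> None:
        entry = self._entries[hazard_id]
        entry.accepted_by = authority
        entry.history.append(f"accepted by {authority}")

    def close(self, hazard_id: str) -> None:
        entry = self._entries[hazard_id]
        entry.closed = True
        entry.history.append("closed")

    def awaiting_acceptance(self) -> List[str]:
        """Ids of open hazards whose residual risk has not been accepted."""
        return sorted(
            hid for hid, e in self._entries.items()
            if not e.closed and e.accepted_by is None
        )


log = HazardLog()
log.add("H-001", "Hydraulic line chafing", 6)
log.add("H-002", "Connector corrosion", 14)
log.accept("H-002", "Program Manager")
print(log.awaiting_acceptance())
```

Keeping a per-entry history gives a simple audit trail of changes as they are integrated into the system; a real tracking system would also persist the log and record dates.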

Developer responsibilities for communications, acceptance, and tracking of hazards and residual mishap risk.  The developer (see 3.2.2) is responsible for communicating information to the program manager on system hazards and residual mishap risk, including any unusual consequences and costs associated with hazard mitigation.  After attempting to eliminate or mitigate system hazards, the developer will formally document and notify the program manager of all hazards breaching thresholds set in the safety design criteria.  At the same time, the developer will also communicate the system residual mishap risk.

Program manager responsibilities for communications, acceptance, and tracking of hazards and residual mishap risk.  The program manager is responsible for maintaining a log of all identified hazards and residual mishap risk for the system.  The program manager will communicate known hazards and associated risks of the system to all system developers and users.  As changes are integrated into the system, the program manager shall update this log to incorporate added or changed hazards and the residual mishap risk identified by the developer.  The program manager is also responsible for informing system developers about the program manager’s expectations for handling of newly discovered hazards.  The program manager will evaluate new hazards and the resulting residual mishap risk, and either recommend further action to mitigate the hazards, or formally document the acceptance of these hazards and residual mishap risk.  The program manager will evaluate the hazards and associated residual mishap risk in the context of the user requirements, potential mission capability, and the operational environment.  Copies of the documentation of the hazard and risk acceptance will be provided to both the developer and the system user.  Hazards for which the program manager accepts responsibility for mitigation will also be included in the formal documentation.  For example, if the program manager decides to execute a special training program to mitigate a potentially hazardous situation, this approach will be documented in the formal response to the developer.  Residual mishap risk and hazards must be communicated to system test efforts for verification.

Program manager responsibilities.

The program manager should:

1.  Assure that all types of hazards are identified, evaluated, and mitigated to a level compliant with acquisition management policy, federal laws and regulations, Executive Orders, treaties, and agreements.

2.  Establish, plan, organize, implement, and maintain an effective system safety effort that is integrated into all life cycle phases.

3.  Ensure that system safety planning is documented to provide all program participants with visibility into how the system safety effort is to be conducted.

4.  Establish definitive safety requirements for the procurement, development, and sustainment of the system.  The requirements should be set forth clearly in the appropriate system specifications and contractual documents.

5.  Provide historical safety data to developers.

6.  Monitor the developer’s system safety activities and review and approve delivered data in a timely manner, if applicable, to ensure adequate performance and compliance with safety requirements.

7.  Ensure that the appropriate system specifications are updated to reflect results of analyses, tests, and evaluations.

8.  Evaluate new lessons learned for inclusion into appropriate databases and submit recommendations to the responsible organization.

9.  Establish system safety teams to assist the program manager in developing and implementing a system safety effort.

10.  Provide technical data on Government-furnished Equipment or Government-furnished Property to enable the developer to accomplish the defined tasks.

11.  Document acceptance of residual mishap risk and associated hazards.

12.  Keep the system users apprised of system hazards and residual mishap risk.

Source: DoD

Copyright ©2000-2016 Geigle Safety Group, Inc. All rights reserved. Federal copyright prohibits unauthorized reproduction by any means without permission. Students may reproduce materials for personal study. Disclaimer: This material is for training purposes only to inform the reader of occupational safety and health best practices and general compliance requirement and is not a substitute for provisions of the OSH Act of 1970 or any governmental regulatory agency. CertiSafety is a division of Geigle Safety Group, Inc., and is not connected or affiliated with the U.S. Department of Labor (DOL), or the Occupational Safety and Health Administration (OSHA).