Availability Management

Availability Management and IT Service Continuity Management collaborate closely, as both share the same aim: the highest possible service stability. While Availability Management handles normal service delivery, IT Service Continuity Management takes over in disaster situations.

Scope

Availability Management ensures that services are up and running without impairment. IT Service Continuity Management does the same in disaster situations and also plans for these cases.

The availability of services is one of the most important assets of any ITSM organization, because availability is key to customer satisfaction. Without the agreed warranty for a service, its utility cannot be realized. If a service is interrupted, customer satisfaction can still be achieved through sound plans for quick service recovery. Service availability can only be improved by understanding how services support the business.

While a service is in operation, it can be measured by the following metrics:

Service Availability
100 * (Agreed Service Time - Downtime) / Agreed Service Time, expressed as a percentage
Maintainability
Mean Time to Restore Service (MTRS)
Reliability
Mean Time Between Failures (MTBF)
Serviceability
Ability of an external supplier to meet the availability, reliability and maintainability levels agreed in its contract
- Measured by customers against Service Level Agreements (SLAs)
- Measured by the ITSM provider against Operational Level Agreements (OLAs)
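
The availability formula and the two mean-time figures can be illustrated with a small calculation. The following is only a minimal sketch: the function names, the 720-hour agreed service time and the three outage durations are invented for illustration. It assumes MTBF is computed as uptime divided by the number of outages and MTRS as total downtime divided by the number of outages.

```python
def availability_percent(agreed_service_time, downtime):
    """100 * (Agreed Service Time - Downtime) / Agreed Service Time."""
    return 100 * (agreed_service_time - downtime) / agreed_service_time


def mean_time_to_restore(outage_durations):
    """MTRS: average duration of an outage (assumed definition)."""
    return sum(outage_durations) / len(outage_durations)


def mean_time_between_failures(agreed_service_time, outage_durations):
    """MTBF: average uptime per outage (assumed definition)."""
    uptime = agreed_service_time - sum(outage_durations)
    return uptime / len(outage_durations)


# Hypothetical reporting period: 720 agreed service hours, three outages.
AGREED_SERVICE_TIME = 720.0
OUTAGES = [2.0, 4.0, 3.0]  # outage durations in hours (made-up values)

availability = availability_percent(AGREED_SERVICE_TIME, sum(OUTAGES))
mtrs = mean_time_to_restore(OUTAGES)
mtbf = mean_time_between_failures(AGREED_SERVICE_TIME, OUTAGES)

print(f"Availability: {availability:.2f} %")  # 98.75 %
print(f"MTRS: {mtrs:.1f} h")                  # 3.0 h
print(f"MTBF: {mtbf:.1f} h")                  # 237.0 h
```

With these example values, 9 hours of downtime in a 720-hour period yield an availability of 98.75 %. Which downtimes count (for example, whether planned maintenance is excluded) depends on the agreement in place and is not modelled here.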

Activities

Availability Management comprises reactive and proactive activities. Their data, reports and plans are stored in an Availability Management Information System (AMIS):

Reactive activities
- Monitor, measure, analyze, report and review availability
- Investigate any unavailability and derive corrective actions
Proactive activities
- Risk analysis and risk management
- Cost-effective countermeasures, e.g. clustering
- Plan, design and review new or changed services
- Test availability mechanisms
- Continual Service Improvement

To assess the risk of service downtime, the following methods can be used:

Component Failure Impact Analysis (CFIA)
Fault Tree Analysis (FTA)
Service Failure Analysis (SFA)
Expanded Incident Lifecycle (retrospective analysis)
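
As an illustration of the first method, a Component Failure Impact Analysis can be thought of as a matrix of components against services. The sketch below is heavily simplified and not a prescribed procedure: the service and component names are hypothetical, and it assumes that every listed component is a single point of failure, that is, no redundancy is modelled.

```python
# Hypothetical dependency data: service -> components it relies on.
SERVICE_COMPONENTS = {
    "webshop": {"load-balancer", "app-server", "database"},
    "email": {"mail-server", "database"},
    "reporting": {"report-server", "database"},
}


def failure_impact(service_components):
    """Invert the mapping: component -> services affected if it fails."""
    impact = {}
    for service, components in service_components.items():
        for component in components:
            impact.setdefault(component, set()).add(service)
    return impact


def rank_components(impact):
    """Order components by number of affected services, highest first."""
    return sorted(impact.items(), key=lambda item: (-len(item[1]), item[0]))


impact = failure_impact(SERVICE_COMPONENTS)
ranking = rank_components(impact)

for component, services in ranking:
    print(f"{component}: affects {len(services)} service(s): {sorted(services)}")
```

In this made-up example the database ranks first because all three services depend on it, which would make it the first candidate for a countermeasure such as clustering.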

Critical Success Factors

The following Critical Success Factors (CSFs) and their Key Performance Indicators (KPIs) are examples:

CSF: Management of the availability and reliability of IT services
- KPI: Reduction of unavailability and its impact
- KPI: Increase in reliability
CSF: Fulfillment of business needs
- KPI: Reduced overtime on the business side caused by service downtime
- KPI: Reduced service downtime during critical time frames, e.g. Christmas sales
- KPI: Increased business and customer satisfaction
CSF: Availability of infrastructure and applications, as agreed in Service Level Agreements, is achieved economically
- KPI: Reduced costs caused by service downtime