Very few organizations supporting network services have a comprehensive design process. More often the process is abbreviated, if it is accomplished at all, and systems are provisioned directly. Viewed from an ITIL perspective, most organizations have strong Service Operations processes and the big Service Transition process – Change Management – but Service Strategy and Service Design are usually lacking.
I’ve been designing and installing telecommunications systems for almost 30 years. I’ve held a variety of positions supporting small to large networks and seen a variety of approaches to engineering and provisioning. Although labels and pigeonholes don’t adequately explain the wide variety of approaches in use, we can use a few broad categories to generally describe them.
This approach was typical 15 years ago. It relies on a small team of highly qualified network engineers who solve problems on the back of a napkin and provision systems directly. If there is a problem with the network service or a new capability needs to be added, a network engineer will come up with a solution and implement it on the devices in question. This isn’t to imply that there is no planning – on the contrary, there is planning, but each implementation is planned and executed individually. Sure, there are standards, but they’re more often informal.
This isn’t necessarily a bad approach; it works well for small networks. If there is a highly skilled staff that communicates frequently, this can be managed on an informal, ad-hoc basis. The trouble is that as the network grows and management tries to save money by staffing the engineering department with less experienced engineers, mistakes start to appear from unexpected non-standard configurations and errors. At this point management steps in to try to rein in the boys.
This approach is similar to the previous approach with the addition of a change management program. In an attempt to reduce unexpected service disruptions caused by change, a formal change management process is established to control how changes are executed and manage the impact of change disruptions. Changes are well documented and scrutinized by a Change Advisory Board (CAB). Impact assessments are presented to the CAB and each change is categorized based on its risk and impact. Specific change window periods are established and the implementations are managed. This forces the engineering staff to develop a more thorough implementation plan, but it doesn’t address the fundamental problem.
In my opinion, this approach is a complete waste of time because it doesn’t address the problem – it addresses the symptoms. What causes unexpected service disruptions during a change implementation? Unless your installers are under-qualified, it’s not how the implementation is executed. It’s what is being done. All this approach does is impose a great deal of management oversight and increase the order-to-delivery time by adding control gates.
Change Management can’t control unexpected behavior because Change Management focuses on the execution of the change. If the impact of every change were known for certain, then the implementation could be managed without unexpected consequences. How can the impact be known with a high degree of certainty? By designing the network service as a system rather than designing each implementation. Designing each implementation individually actually skips the design process and jumps straight to provisioning, which is putting the cart before the horse. This is the most common practice in use and is why IT managers look to outsourcing network services. Herding cats is difficult if not impossible.
In addition to Change Management, standardized templates and compliance checking are often implemented in an attempt to standardize configuration across larger, more complex networks. Often an IT management framework such as ITIL is embraced; however, seldom is the network service subject to the ITIL Service Design and Release Management processes. In this model, standard IOS images and configuration templates are developed to describe each element of the network configuration. These templates may be broken down into smaller, more manageable sub-components such as base, network management, interface configuration, routing protocols, traffic engineering, tributary provisioning, performance, etc. These templates are then used as a standard against which network device configurations are checked through some compliance checking mechanism such as OPNET NetDoctor or HPNA.
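To make the idea of checking a device against a template concrete, here is a minimal sketch in Python. It is not how NetDoctor or HPNA work internally; it simply assumes a template is a list of required configuration lines and reports the ones a device is missing. The function name `missing_lines`, the template contents, and the device configuration are all hypothetical, and the flat line matching ignores the hierarchy of real configurations (for example, commands nested under an interface).

```python
def missing_lines(template: str, running_config: str) -> list[str]:
    """Return template lines that do not appear in the device configuration."""
    configured = {line.strip() for line in running_config.splitlines() if line.strip()}
    return [
        line.strip()
        for line in template.splitlines()
        # Skip blank lines and "!" comment lines in the template.
        if line.strip() and not line.strip().startswith("!") and line.strip() not in configured
    ]


# Hypothetical "base" sub-component template (assumed for illustration).
BASE_TEMPLATE = """\
! base logging and management standard
service timestamps log datetime msec
no ip http server
logging host 192.0.2.10
"""

# Hypothetical device configuration that has drifted from the standard.
DEVICE_CONFIG = """\
hostname edge-rtr-01
service timestamps log datetime msec
ip http server
logging host 192.0.2.10
"""

violations = missing_lines(BASE_TEMPLATE, DEVICE_CONFIG)
print(violations)
```

A check like this tells you that a device is out of compliance, but not whether the template itself is the right design for that device, which is the limitation discussed next.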
This is a large step in the right direction, but it still fails to address the fundamental problem. Configuration Management is important, but it too addresses a symptom rather than the problem. There will often be a large number of devices out of compliance, and bringing them into compliance is a burdensome process in a large network with a tight change management process. This is because the organization is still skipping the design process and the operations managers have little confidence in the design – because in the larger context of the entire network, the design is non-existent.
It’s interesting to note that most large managed service providers are at this stage. This is partially because they have little control over the customer’s service strategy and design processes. The service contract primarily addresses service transition and operation. The metrics used to evaluate provider performance are largely operations related – availability, packet loss, etc. Providers are able to productize operations and transition processes to fit any environment. This contributes to difficulty getting the provider to implement changes. It’s in their best interest to keep the network stable and reliable.
There is a paradigm associated with designing a network using COTS products that causes network engineering workcenters to disregard the conventional engineering process. Consider the design of an aircraft platform. Engineers don’t go out and build aircraft from scratch and create each one slightly different. Engineers design a blueprint that addresses all aspects of the design and system lifecycle. The production team takes that blueprint and fits the factory to build many aircraft using this design. Retrofits follow a similar process.
Consider a software engineering project. The developer develops the application for each platform it is to be released on and releases an installation package for each platform. That installation package takes into account any variation in the platform. One package may be released to install on various Windows OSs, another on various Linux OSs, another for the supported Mac OSs. Each package has been thoroughly tested prior to release, so it installs with a high degree of certainty. Enhancements and fixes are bundled into release packages. Patches that have a time constraint are released outside the regular release schedule.
Imagine if the developer released a collection of executables, libraries, and support files and expected the installer to configure it correctly based on the system it was being installed on. The results wouldn’t be very certain and there would be a large number of incidents reported for failed installations. Imagine if the aircraft designer released a set of guidelines and expected the factory to design each aircraft to order. I’d be taking the train.
If this seems logical, then why do most organizations skip the design process for IT/telecom systems and jump straight to provisioning? This is because the system is a collection of COTS products and the design consists primarily of topology and configuration. This doesn’t make the design process any less vital.
Under this model, the network is considered a service and the design process creates a blueprint that will be applied wherever the service is provisioned. Standards and templates are part of that design, but there is much more. The entire topology and system lifecycle are addressed in a systematic way that ensures that each device in the network is a reflection of that design. There is a system of logic that describes how these standards and templates are used to provision any component in the network. Enhancements and fixes are released on regular cycles across the entire network and the version of the network configuration is managed. This approach takes most of the guesswork out of the provisioning process.
The ITIL Service Design process treats a service much the way the aircraft and the software application are handled in the examples above. When the network is treated as a service that must be subject to this same rigorous engineering process, the result is improved efficiency and a high degree of predictability that reduces service disruptions caused by unexpected problems encountered during changes. This requires a great deal more engineering effort during the design and release processes, but the ROI is improved availability and reduced effort during implementation. Implementing the release package becomes a turn-key operation that should be performed by the operations or provisioning team rather than engineering. This paradigm shift often takes some time for an organization to grasp and function efficiently in, but it will improve performance and efficiency and pave the way toward automated provisioning.
This is the Zen of the network service design continuum. It can’t be achieved unless there is a fundamental shift in the way the network engineering staff approaches network design. Engineering produces a blueprint that can be implemented with a high degree of certainty. The network service is designed as a system with a well-developed design package that addresses all aspects of the network topology and system lifecycle. Network hardware is standardized and standard systems are defined. Standards are developed in great detail. Configurations are designed from a systemic perspective in a manner that can be applied to standard systems using the other standards as inputs. The CMDB or some other authoritative data source will contain all the network configuration items and the relationships between them. A logical system is developed that addresses how these standards and relationships will be applied to any given implementation. This is all tested on individual components and as a system to ensure the system meets the desired design requirements and to assess the impact of any changes that will have to be applied as a result of the release.
At this point the logic that has been developed to take the design and translate it to an implementation (provisioning) can be turned into an automated routine that can produce the required configurations to provision all devices necessary to make any given change.
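As a rough illustration of what such a routine might look like, the sketch below renders device configurations from standard templates and records from an authoritative data source. Everything in it is an assumption made for the example: the template contents, the role names, the `CMDB` list standing in for a real CMDB query, and the function names `render_device` and `render_release`. It uses only Python's standard `string.Template`.

```python
from string import Template

# Hypothetical standards: one template per configuration sub-component.
TEMPLATES = {
    "base": Template("hostname $hostname\nlogging host $syslog_server"),
    "uplink": Template(
        "interface $uplink_if\n description uplink to $uplink_peer\n no shutdown"
    ),
}

# Hypothetical standard systems: which sub-components each device role receives.
ROLE_COMPONENTS = {
    "access": ["base", "uplink"],
    "core": ["base"],
}

# Stand-in for configuration items pulled from a CMDB (assumed data).
CMDB = [
    {
        "hostname": "acc-sw-01",
        "role": "access",
        "syslog_server": "192.0.2.10",
        "uplink_if": "GigabitEthernet0/1",
        "uplink_peer": "core-rtr-01",
    },
    {"hostname": "core-rtr-01", "role": "core", "syslog_server": "192.0.2.10"},
]


def render_device(record: dict) -> str:
    """Apply the templates for the record's role, using the record as input."""
    parts = [
        # substitute() raises KeyError if the record lacks a required attribute.
        TEMPLATES[name].substitute(record)
        for name in ROLE_COMPONENTS[record["role"]]
    ]
    return "\n".join(parts)


def render_release(records: list[dict]) -> dict[str, str]:
    """Produce one configuration per device for the whole release."""
    return {record["hostname"]: render_device(record) for record in records}


configs = render_release(CMDB)
print(configs["acc-sw-01"])
```

The design choice worth noting is that `substitute` fails loudly when a record is missing an attribute the template needs, so an incomplete data source stops the release instead of producing a partial configuration.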
Some of the controls, such as compliance checking, become more of a spot check to verify that the automation is working effectively. Network engineers are no longer involved in provisioning, but in designing the service in a larger context. Provisioning becomes a repeatable process with a high degree of certainty. This greatly reduces the risk that Change Management is attempting to control and makes Change Management a workable process.
Most organizations with large or complex networks would benefit greatly from this approach.