There are very few organizations supporting network services that have a comprehensive design process. More often the process is abbreviated, if accomplished at all, and systems are provisioned directly. Viewed from an ITIL perspective, most organizations have strong Service Operation processes and the big Service Transition process, Change Management, but Service Strategy and Service Design are usually lacking.
I’ve been designing and installing telecommunications systems for almost 30 years. I’ve held a variety of positions supporting small to large networks and seen a variety of approaches to engineering and provisioning. Although labels and pigeonholes don’t adequately explain the wide variety of approaches in use, we can use a few broad categories to describe them generally.
This approach was typical 15 years ago. It relies on a small team of highly qualified network engineers who solve problems on the back of a napkin and provision systems directly. If there is a problem with the network service, or a new capability needs to be added, a network engineer will come up with a solution and implement it on the devices in question. This isn’t to imply that there is no planning; on the contrary, there is planning, but each implementation is planned and executed individually. Sure, there are standards, but they’re more often informal.
This isn’t necessarily a bad approach; it works well for small networks. If there is a highly skilled staff that communicates frequently, this can be managed on an informal, ad-hoc basis. The trouble is that as the network grows and management tries to save money by staffing the engineering department with less experienced engineers, mistakes start to appear from unexpected non-standard configurations and errors. At this point management steps in to rein in the boys.
This approach is similar to the previous approach with the addition of a change management program. In an attempt to reduce unexpected service disruptions caused by change, a formal change management process is established to control how changes are executed and to manage the impact of change disruptions. Changes are well documented and scrutinized by a Change Advisory Board (CAB). Impact assessments are presented to the CAB, and the change is categorized based on risk and impact. Specific change window periods are established and the implementations are managed. This forces the engineering staff to develop a more thorough implementation plan, but it doesn’t address the fundamental problem.
In my opinion, this approach is a complete waste of time because it doesn’t address the problem; it addresses the symptoms. What causes unexpected service disruptions during a change implementation? Unless your installers are under-qualified, it’s not how the implementation is executed. It’s what is being done. All this approach does is impose a great deal of management oversight and increase the order-to-delivery time by adding control gates.
Change Management can’t control unexpected behavior because Change Management focuses on the execution of the change. If the impact of every change were known for certain, then the implementation could be managed without unexpected consequences. How can the impact be known with a high degree of certainty? By designing the network service as a system rather than designing each implementation individually. Designing implementation by implementation actually skips the design process and jumps straight to provisioning; it’s putting the cart before the horse. Yet this is the most common practice in use, and it’s why IT managers look to outsourcing network services. Herding cats is difficult if not impossible.
In addition to Change Management, standardized templates and compliance checking are often implemented in an attempt to standardize configuration across larger, more complex networks. Often an IT management framework such as ITIL is embraced; however, seldom is the network service subject to the ITIL Service Design and Release Management processes. In this model, standard IOS images and configuration templates are developed to describe each element of the network configuration. These templates may be broken down into smaller, more manageable sub-components such as base, network management, interface configuration, routing protocols, traffic engineering, tributary provisioning, performance, etc. These templates are then used as a standard against which network device configurations are checked through some compliance-checking mechanism such as OPNET NetDoctor or HPNA.
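At its core, this kind of template-based compliance checking boils down to comparing a device's running configuration against the lines each template mandates. Here's a minimal sketch of the idea (the template contents and device configuration are hypothetical; real tools like NetDoctor or HPNA are far more sophisticated):

```python
# Minimal sketch of template-based compliance checking.
# Templates and device config below are hypothetical examples.

# Each sub-component template is a set of required configuration lines.
TEMPLATES = {
    "base": {"service timestamps log datetime msec", "no ip http server"},
    "mgmt": {"snmp-server community public RO", "logging host 192.0.2.10"},
}

def check_compliance(device_config: str, components=("base", "mgmt")):
    """Return, per component, the template lines missing from a config."""
    config_lines = {line.strip() for line in device_config.splitlines()}
    missing = {}
    for name in components:
        absent = TEMPLATES[name] - config_lines
        if absent:
            missing[name] = sorted(absent)
    return missing

running_config = """
service timestamps log datetime msec
no ip http server
snmp-server community public RO
"""

print(check_compliance(running_config))
# Flags the missing 'mgmt' logging line.
```

A real checker must also handle ordered and hierarchical configuration (interface sub-commands, for instance), but the set-difference idea is the essence of it.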
This is a large step in the right direction, but it still fails to address the fundamental problem. Configuration Management is important, but it, too, addresses a symptom rather than the problem. There will often be a large number of devices out of compliance, and bringing them into compliance is a burdensome process in a large network with a tight change management process. This is because the organization is still skipping the design process, and the operations managers have little confidence in the design, because in the larger context of the entire network, the design is non-existent.
It’s interesting to note that most large managed service providers are at this stage. This is partly because they have little control over the customer’s service strategy and design processes. The service contract primarily addresses service transition and operation, and the metrics used to evaluate provider performance are largely operations-related: availability, packet loss, etc. Providers are able to productize operations and transition processes to fit any environment. This contributes to the difficulty of getting a provider to implement changes; it’s in their best interest to keep the network stable and reliable.
There is a paradigm associated with designing a network using COTS products that causes network engineering workcenters to disregard the conventional engineering process. Consider the design of an aircraft platform. Engineers don’t build each aircraft from scratch, every one slightly different. They design a blueprint that addresses all aspects of the design and system lifecycle. The production team takes that blueprint and fits the factory to build many aircraft from that design. Retrofits follow a similar process.

Consider a software engineering project. The developer builds the application for each platform it is to be released on and produces an installation package for each. That installation package takes into account any variation in the platform: one package may install on various Windows OSs, another on various Linux OSs, another on the supported Mac OSs. All of this is thoroughly tested prior to release, so the package installs with a high degree of certainty. Enhancements and fixes are bundled into release packages, and patches with a time constraint are released outside the regular release cycle.

Imagine if the developer instead released a collection of executables, libraries, and support files and expected the installer to configure it correctly based on the system it was being installed on. The results wouldn’t be very certain, and there would be a large number of incidents reported for failed installations. Imagine if the aircraft designer released a set of guidelines and expected the factory to design each aircraft to order. I’d be taking the train. If this seems logical, then why do most organizations skip the design process for IT/telecom systems and jump straight to provisioning? Because the system is a collection of COTS products and the design consists primarily of topology and configuration. That doesn’t make the design process any less vital.
Under this model, the network is considered a service and the design process creates a blueprint that is applied wherever the service is provisioned. Standards and templates are part of that design, but there is much more. The entire topology and system lifecycle are addressed in a systematic way that ensures each device in the network is a reflection of that design. There is a system of logic that describes how these standards and templates are used to provision any component in the network. Enhancements and fixes are released on regular cycles across the entire network, and the version of the network configuration is managed. This approach takes most of the guesswork out of the provisioning process.
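The "system of logic" above amounts to this: the design owns the templates, and a given implementation supplies only its variables. A rough sketch (the template content and parameter names are hypothetical):

```python
# Sketch of blueprint-driven provisioning: the design defines standard
# templates; a site or device supplies only the variable inputs, so
# every rendered configuration is a reflection of one design.
# Template text and parameter names are hypothetical.

from string import Template

STANDARDS = {
    "interface": Template(
        "interface $ifname\n"
        " description $descr\n"
        " ip address $addr $mask\n"
    ),
}

def provision(component: str, **params) -> str:
    """Render a standard template with implementation-specific inputs."""
    return STANDARDS[component].substitute(**params)

print(provision("interface",
                ifname="GigabitEthernet0/1",
                descr="Uplink to core",
                addr="192.0.2.1",
                mask="255.255.255.252"))
```

The key property is that engineers change the blueprint (the template), never the individual rendering; provisioning becomes a mechanical substitution step.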
The ITIL Service Design process treats a service much the way the aircraft and the software application are treated in the examples above. When the network is treated as a service subject to the same rigorous engineering process, the result is improved efficiency and a high degree of predictability that reduces service disruptions caused by unexpected problems encountered during changes. This requires a great deal more engineering effort during the design and release processes, but the ROI is improved availability and reduced effort during implementation. Implementing the release package becomes a turn-key operation that can be performed by the operations or provisioning team rather than engineering. This paradigm shift often takes some time for an organization to grasp and function efficiently in, but it will improve performance and efficiency and paves the way toward automated provisioning.
This is the Zen of the network service design continuum. It can’t be achieved unless there is a fundamental shift in the way the network engineering staff approaches network design. Engineering produces a blueprint that can be implemented with a high degree of certainty. The network service is designed as a system with a well-developed design package that addresses all aspects of the network topology and system lifecycle. Network hardware is standardized and standard systems are defined. Standards are developed in great detail. Configurations are designed from a systemic perspective in a manner that can be applied to standard systems using the other standards as inputs. The CMDB or some other authoritative data source contains all the network configuration items and the relationships between them. A logical system is developed that addresses how these standards and relationships will be applied to any given implementation. This is all tested on individual components and as a system to ensure the system meets the design requirements and to assess the impact of any changes that will have to be applied as a result of the release.
At this point the logic that has been developed to translate the design into an implementation (provisioning) can be turned into an automated routine that produces the required configurations to provision all devices necessary to make any given change.
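As a toy illustration of that routine, the automation walks the configuration items and relationships in the authoritative data source and emits the configuration for every device a change touches (the dictionary below is a hypothetical stand-in for a CMDB, and the device names and addresses are made up):

```python
# Sketch of automated provisioning driven by an authoritative data
# source. The CMDB stand-in, device names, and addresses are all
# hypothetical; a real system would query an actual CMDB.

CMDB = {
    "edge-rtr-01": {
        "loopback": "198.51.100.1",
        "links": [{"ifname": "Gi0/0", "peer": "core-rtr-01"}],
    },
    "core-rtr-01": {
        "loopback": "198.51.100.2",
        "links": [{"ifname": "Gi0/1", "peer": "edge-rtr-01"}],
    },
}

def render_device(name: str) -> str:
    """Produce a device configuration from its CMDB record."""
    ci = CMDB[name]
    lines = [f"hostname {name}",
             "interface Loopback0",
             f" ip address {ci['loopback']} 255.255.255.255"]
    for link in ci["links"]:  # relationships drive interface config
        lines += [f"interface {link['ifname']}",
                  f" description link to {link['peer']}"]
    return "\n".join(lines)

def render_change(devices) -> dict:
    """Generate configurations for every device touched by a change."""
    return {name: render_device(name) for name in devices}

configs = render_change(["edge-rtr-01", "core-rtr-01"])
print(configs["edge-rtr-01"])
```

Because the output is fully determined by the design logic and the data source, implementing it really is the turn-key operation described above.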
Some of the controls, such as compliance checking, become more of a spot check to verify that the automation is working effectively. Network engineers are no longer involved in provisioning, but in designing the service in a larger context. Provisioning becomes a repeatable process with a high degree of certainty. This greatly reduces the risk that Change Management is attempting to control and makes the process workable.
Most organizations with large or complex networks would benefit greatly from this approach.
Reference post below by Shamus McGillicuddy.
It still boggles my mind why there is such a fascination with large bridged networks rather than relying on the proven ability of IP to manage path selection. Spanning Tree doesn’t have the features to ensure optimal path selection. Maybe it’s that the data center is often designed by people with a strong background in computers rather than by network engineers. I’ve seen many cases where data centers have traffic going over the wrong path, causing congestion, because they can’t get Spanning Tree to place it on a more optimal path. Then add the trend of running layer 2 over the WAN with VPLS. Sure, you don’t have to deal with IP addressing and route distribution, but the tradeoff is a large, geographically separated broadcast domain with little control over path selection and less ability to troubleshoot and monitor it. IP routing is a solution that shouldn’t be overlooked; it was designed specifically for this reason, and it’s easier to spell. SDN may prove to be a great solution, but it’s too young yet.
Excellent insight. New technologies and methods will provide more challenges for network security. That’s job security if you can keep pace.
While 802.11ac may be of interest to those looking to give laptop and mobile users high-speed access, that’s just the access tier of the LAN. SDN has more potential to change the architecture dramatically, though adequate means to measure performance and monitor security in that environment are still lacking.
Yes, visibility into the cloud has to take a more prominent role, and that will require innovative approaches. Are the three big NMS providers able to move fast enough to address this need? I’m looking to startups for the new approaches. And what of Open Source products, which have come a long way? Why invest three-quarters of a million dollars in COTS and then not develop the customizations and integration to make it do everything you need in your environment? A better approach is to use Open Source and invest the money saved in the human resources to configure and integrate the tools; the added benefit is a top-notch support team to keep it in pace with the network changes.
Added complexity has its costs. Measuring the performance of a dynamically changing topology, measuring the performance of the SDN system itself, and the added complexity in network security are just a few of the challenges. Software-Defined Networking certainly has potential, but I’m still waiting to see whether it can realize an ROI and a performance improvement given the additional complexity. I don’t think everyone is ready to jump on this bandwagon just yet.
Original is reposted below:
What does 2013 have in store for the networking industry? We asked five top industry analysts to predict networking trends for this year. Click on the links below to find out what will happen in data center networking, network security, campus LANs, network management and software-defined networking.
Data center networks will continue to wrestle with the limitations of spanning tree protocol in 2013, but enterprises that move to alternatives like network fabrics will find roadblocks to scalability. Meanwhile, enterprises will use Ethernet exchanges to build hybrid cloud environments, and cutting-edge micro-electromechanical systems (MEMS)-based photonic switches will start to make some noise in the data center. Eric Hanselman, research director at London-based 451 Research, shares his predictions for how the data center networking industry will shake out in 2013.
In 2013, network security vendors need to develop third-party ecosystems that help enterprises correlate data among the various components of their security architecture. Also, network security pros will need to sort through the software-defined networking (SDN) hype to figure out how secure these new technologies are. Meanwhile, enterprises will accelerate their adoption of next-generation firewalls and advanced threat protection systems. We asked Greg Young, research vice president at Stamford, Conn.-based Gartner Inc., to share his views on the changes we’ll see in network security this year.
Campus networking has lacked innovation for a few years, but 2013 may switch things up a bit. While wireless LAN vendors will be pushing faster 802.11ac networks this year, the industry may also see some architectural changes that could finally deliver true unified wireless and wired campus LANs. We asked Andre Kindness, senior analyst at Forrester Research, to share his views on the changes we’ll see in campus LANs this year.
Emerging virtual overlay network technology will force network management vendors to develop tools to monitor these new environments in 2013. Meanwhile, enterprises will demand better visibility into their public cloud resources and virtual desktop infrastructure deployments. Enterprise Management Associates Research Director Jim Frey shares these and other predictions for how the network management market will evolve this year.
What’s in store for software-defined networking? IDC analyst Brad Casemore predicts adoption will grow among service providers and cloud providers; vendors will battle each other in Layer 4-7 network services and SDN controllers; and OpenFlow may evolve, but very slowly. In the longer term, IDC projects that the SDN market will reach $3.7 billion by 2016. Here’s more of what Casemore had to say about the SDN market in 2013.
Change Management is an important function in most organizations. It carries more weight than many of the other ITIL functions because it’s the biggest pain point. It’s a well-established fact that upwards of 80% of all outages are self-inflicted. IT managers are constantly getting heat over deployments that didn’t go exactly as planned. When you boil that down to lost productivity or missed business opportunities, it amounts to a sizable sum of money. These are just some of the reasons Change Management gets so much well-deserved attention.
So, you establish a Change Advisory Board. There is a lot of preparation and documentation that has to go into any change before it’s presented to the board for approval. Each change is categorized, analyzed, and scrutinized until everyone involved is thoroughly mesmerized. The time required to get a change approved may also have increased five-fold. The process is controlled through some rather expensive management software, well documented, well planned, and hopefully well executed.
The question is: after expending all this effort on the Change Management process, expending the resources on additional planning and documentation, spending all the time in meetings, and prolonging the time required to get a task accomplished, did CM reduce service disruptions and save more money than was invested in the process? Let’s face it: if not, then throw the whole thing out and go back to shootin’ from the hip.
The stakeholders on the board probably didn’t review the detailed documentation that has been prepared. There are probably only a few people in the entire organization who will ever read it. The stakeholders only have a few important questions: why is this change necessary, what’s the impact, who or what will be affected, what are the risks, are they adequately mitigated, and is there a viable back-out process. There are probably a few key people in each business unit who could review the implementation details and provide their respective stakeholder with a recommendation and/or list of concerns and remediations.
Is the CAB keeping any metrics? Are you aware of how many changes of each category are being implemented? Were they on schedule? Were the impacts more or less than expected? Is there a way in your incident reporting system to relate an incident caused by a change back to that change? Is all this management making an improvement, or have you just spent more resources managing with no real gain? When you make a change to the process, does it streamline the process and/or improve the results?
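None of these metrics require expensive tooling. If change records capture category, schedule adherence, and linked incidents, the answers fall out of a few lines of code (the record fields and change IDs below are hypothetical):

```python
# Sketch of basic CAB metrics from change records. The record
# structure and change IDs are hypothetical examples.

from collections import Counter

changes = [
    {"id": "CHG-101", "category": "standard", "on_schedule": True,  "incidents": 0},
    {"id": "CHG-102", "category": "major",    "on_schedule": False, "incidents": 2},
    {"id": "CHG-103", "category": "standard", "on_schedule": True,  "incidents": 0},
]

def cab_metrics(records):
    """Summarize changes by category, schedule, and linked incidents."""
    by_category = Counter(c["category"] for c in records)
    on_schedule = sum(c["on_schedule"] for c in records) / len(records)
    caused_incident = [c["id"] for c in records if c["incidents"] > 0]
    return {
        "by_category": dict(by_category),
        "on_schedule_rate": round(on_schedule, 2),
        "changes_causing_incidents": caused_incident,
    }

print(cab_metrics(changes))
```

Tracking these numbers over time is what tells you whether a process change actually streamlined anything.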
Change Management is good. CM in the context of the ITIL framework is excellent … but we must always keep focused on the end objective: becoming more efficient and effective. CM for the sake of CM is a common ill and needs to be tempered with CS (common sense).