Choosing metrics that provide insight into the health and performance of systems and processes can be challenging. Metrics need to be aligned with the requirements of the systems and processes they support. While many performance management systems provide useful metrics out of the box, you will undoubtedly have to define others yourself and determine a means to collect and report them.
I break metrics down into two major categories: strategic and operational.
Strategic metrics provide a broad insight into a service’s overall performance. These are the type of metrics that are briefed at the manager’s weekly meeting. They usually aren’t directly actionable, but are very useful for trending.
Strategic metrics should be used to evaluate the overall effect of process or system improvements. Healthy organizations practice some form of Deming-style continuous process improvement (CPI), which applies to system and service design as well. As changes are implemented, metrics are monitored to determine whether the changes improved the system or process as expected.
Some examples of strategic metrics are system availability, homepage load time, and incidents identified through ITSM versus those identified by customers. These provide a high-level indicator of performance, more closely related to business objectives than to specific system or process operation and design criteria.
Operational metrics provide detail and are useful for identifying service disruptions and problems, for capacity planning, and for finding areas for improvement. These metrics are often directly actionable. Operations can use them to proactively identify potential service disruptions, isolate the cause of a problem, and evaluate the effectiveness of the team. Engineering uses them to determine whether the service design is meeting its requirements, identify areas for design improvement, and provide the data necessary for planning new services and upgrades.
Good metrics should be aligned with operational factors that indicate the health of the service and with the design requirements. Metrics, just like every other aspect of a system design, are driven by requirements. The specific design requirements and criteria should be used to define metrics that measure how each aspect of the service is meeting its specified design objective. Historical metrics are valuable for baselining performance and can be used to configure thresholds or serve as a historical reference for problem isolation and forecasting.
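As a minimal sketch of baselining, assuming you already have a series of historical samples for a metric (the numbers below are invented page-load times, not real data), a threshold can be derived from the baseline mean and standard deviation:

```python
import statistics

def baseline_threshold(history, k=3.0):
    """Derive an alerting threshold from historical samples:
    baseline mean plus k standard deviations."""
    return statistics.mean(history) + k * statistics.stdev(history)

def is_anomalous(sample, threshold):
    """Flag samples that fall outside the historical baseline."""
    return sample > threshold

# Hypothetical homepage load times (ms) sampled over recent days.
history = [210, 225, 198, 240, 215, 230, 205, 220, 212, 228]
threshold = baseline_threshold(history)
```

A real deployment would recompute the baseline over a rolling window so the threshold tracks gradual trends rather than ancient history.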
For example, if you have employed a differentiated services strategy, you should be monitoring the traffic volume and queue discards for each class of service you've defined. This will help you understand whether your traffic projections are accurate and whether the QoS design is meeting the system requirements. Historical data can help identify traffic trends and determine whether a change was due to growth, a new application or service, or a "Mother's Day" traffic anomaly.
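To make that concrete, here is a small sketch (the class names and daily byte counts are invented for illustration; actually collecting the counters from devices is out of scope) that compares recent per-class volume against the longer-term average, which helps separate steady growth from a one-off spike:

```python
def qos_trend(history_by_class, window=7):
    """Ratio of recent average volume to overall average, per class.
    A ratio well above 1.0 suggests sustained growth; a single
    spiked day in an otherwise flat series barely moves it."""
    report = {}
    for cls, samples in history_by_class.items():
        recent = sum(samples[-window:]) / min(window, len(samples))
        overall = sum(samples) / len(samples)
        report[cls] = round(recent / overall, 2)
    return report

# Invented daily volumes (GB) for two hypothetical traffic classes.
history = {
    "voice": [10, 10, 11, 10, 12, 11, 12, 13, 14, 15],  # creeping up
    "bulk":  [50, 52, 49, 51, 50, 48, 52, 50, 51, 49],  # flat
}
trend = qos_trend(history)
```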
Sometimes metrics are more valuable when correlated with other metrics. This is true for both strategic and operational metrics. In such cases it is often useful to create a composite metric.
Google, for example, has a health score composed from page load time and other metrics that is briefed to senior executives daily. In another example, perhaps the calls between the web front end and the SSO service are only of concern if they do not track the number of users connecting. In that case a composite metric may give operations a key piece of information to proactively identify a potential service disruption or reduce MTTR.
Few performance management systems have the capability to create composite metrics within the application. There are always ways around that, but they usually involve writing custom glueware.
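As a sketch of what that glueware might look like (the metric names, weights, and targets below are all assumptions for illustration, not anyone's actual scoring formula), a composite health score could normalize each raw metric to a 0-1 range and blend them:

```python
def composite_health(page_load_ms, error_rate, sso_calls, active_users,
                     load_target_ms=250.0):
    """Fold several raw metrics into one 0-100 health score.
    Weights and targets here are illustrative assumptions."""
    # Normalize each component so 1.0 means healthy.
    load_score = min(1.0, load_target_ms / max(page_load_ms, 1.0))
    error_score = max(0.0, 1.0 - error_rate)
    # SSO chatter per user: flag calls growing out of step with users.
    calls_per_user = sso_calls / max(active_users, 1)
    sso_score = min(1.0, 2.0 / max(calls_per_user, 0.1))
    # Weighted blend; weights reflect assumed business impact.
    return round(100 * (0.5 * load_score
                        + 0.3 * error_score
                        + 0.2 * sso_score), 1)
```

The value of the composite is the ratio logic: SSO call volume alone means nothing, but calls per connected user is directly actionable.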
Metrics should have a specific purpose. The consumers of the metrics should find value in the data – both the data itself and the way it is presented. Like every aspect of the service, metrics should be in a Demingesque continual improvement cycle. Metric definitions, the mechanism to collect them, and how they are communicated to their audience need to be constantly evaluated.
Metrics often become useless when the metric itself becomes the process objective. Take time to resolve an incident, for example. This metric can provide valuable insight into the effectiveness of the operations staff and processes; however, it seldom does. Most operations managers know they are measured on it and continually press their staff to close tickets as soon as possible to keep MTTR low. The objective of the operations process is not to close tickets quickly, but to support customer satisfaction by maintaining the service. Because the metric becomes the objective, it loses its value. This is difficult enough to address when the service is managed in-house; when operations are outsourced, it is even more troublesome. Operations SLAs often specifically address MTTR. If the service provider is contractually obligated to keep MTTR low, it will focus on closing tickets even if the issue remains unresolved.
Reference post below by Shamus McGillicuddy
It still boggles my mind why there is such a fascination with large bridged networks rather than relying on the proven ability of IP to manage path selection. Spanning Tree doesn't have the features to ensure optimal path selection. Maybe it's that the data center is often designed by people with a strong background in computers rather than by network engineers. I've seen many cases where data centers have traffic going over the wrong path and causing congestion because they can't get Spanning Tree to place it on a more optimal path. Then add the trend to run Layer 2 over the WAN with VPLS. Sure, you don't have to deal with IP addressing and route distribution, but the tradeoff is a large, geographically separated broadcast domain with little control over path selection and less ability to troubleshoot and monitor it. IP routing is a solution that shouldn't be overlooked. It was designed for exactly this purpose, and it's easier to spell. SDN may prove to be a great solution, but it's too young yet.
Excellent insight. New technologies and methods will provide more challenges for network security. That’s job security if you can keep pace.
While 802.11ac may be of interest to those looking to give laptop and mobile users high-speed access, that's just the access tier of the LAN. SDN has more potential to change the architecture dramatically, and that is before we even have adequate means to measure performance and monitor security in that environment.
Yes, visibility into the cloud has to take a more prominent role, and that will require innovative approaches. Are the three big NMS providers able to move fast enough to address this need? I'm looking to startups for the new approaches. And what of Open Source products, which have come a long way? Why invest three-quarters of a million dollars in a COTS product and then not develop the customizations and integration to make it do everything you need in your environment? A better approach is to use Open Source and invest the money saved in the human resources to configure and integrate the tools; the added benefit is a top-notch support team to keep them in pace with network changes.
Added complexity has its costs. Measuring the performance of a dynamically changing topology, measuring the performance of the SDN system itself, and the added complexity in network security are just a few of the challenges. Software-defined networking certainly has potential, but I'm still waiting to see whether it can realize an ROI and a performance improvement given the additional complexity. I don't think everyone is ready to jump on this bandwagon just yet.
Original is reposted below:
What does 2013 have in store for the networking industry? We asked five top industry analysts to predict networking trends for this year. Click on the links below to find out what will happen in data center networking, network security, campus LANs, network management and software-defined networking.
Data center networks will continue to wrestle with the limitations of spanning tree protocol in 2013, but enterprises that move to alternatives like network fabrics will find roadblocks to scalability. Meanwhile, enterprises will use Ethernet exchanges to build hybrid cloud environments and cutting-edge micro-electromechanical systems (MEMS)-based photonic switches will start to make some noise in the data center. Eric Hanselman, research director at London-based 451 Research, shares his predictions for how the data center networking industry will shake out in 2013.
In 2013, network security vendors need to develop third-party ecosystems that help enterprises correlate data among the various components of their security architecture. Also, network security pros will need to sort through the software-defined networking (SDN) hype to figure out how secure these new technologies are. Meanwhile, enterprises will accelerate their adoption of next-generation firewalls and advanced threat protection systems. We asked Greg Young, research vice president at Stamford, Conn.-based Gartner
Inc., to share his views on the changes we’ll see in network security this year.
Campus networking has lacked innovation for a few years, but 2013 may switch things up a bit. While wireless LAN vendors will be pushing faster 802.11ac networks this year, the industry may also see some architectural changes that could finally deliver true unified wireless and wired campus LANs. We asked Andre Kindness, senior analyst at Forrester Research, to share his views on
the changes we’ll see in campus LANs this year.
Emerging virtual overlay network technology will force network management vendors to develop tools to monitor these new environments in 2013. Meanwhile, enterprises will demand better visibility into their public cloud resources and virtual desktop infrastructure deployments. Enterprise Management Associates Research Director Jim Frey shares these and other predictions for
how the network management market will evolve this year.
What’s in store for software-defined networking? IDC analyst Brad Casemore predicts adoption will grow among service providers and cloud providers; vendors will battle each other in Layer 4-7 network services and SDN controllers; and OpenFlow may evolve, but very slowly. In the longer term, IDC projects that the SDN market will reach $3.7 billion by 2016. Here’s more of what Casemore had to say about the SDN market in 2013.