Recently in IT Operations Category

Sometimes ingenuity leads to alternative uses of new products in ways the original designer never imagined. The classic business school example is 3M’s Post-It note which was originally conceived as pressure sensitive adhesive – a novel idea with few practical uses until someone slapped the adhesive on a piece of paper and found utility in a reusable bookmark. 


Many customers use Managed Objects technology, at least initially, to consolidate IT management consoles. This is understandable since a key point of differentiation for us is our bi-directional API-level integration adapters, which simply means other management tools view our BSM product as if it were just another user. While the consolidation of existing IT management tools is a powerful value proposition for many IT organizations, one bank that I’ve written about in an earlier post found new benefits of integration in adding best-in-class management tools. 

The bank had already standardized on Managed Objects enterprise-wide as its presentation layer and had elected to migrate the underlying system management tools for its Windows environment from IBM Tivoli to Microsoft System Center Operations Manager (SCOM). The decision was based on cost reduction (the equivalent of 3 FTE), and from a technical standpoint the IT operations team felt SCOM was a better fit for their specific environment. 

The migration would include nearly 5,000 IT components including 1,200 servers running Windows, Linux and UNIX – in addition to iSeries LPARs managed by Bytware and several thousand network components managed by CiscoWorks and WhatsUp Gold. Normally a large project like this, involving the implementation of a new product and data migration, would be accompanied by the usual challenges of user adoption – and its associated learning curve. However, since IT operations maintained Managed Objects as the single enterprise management interface and its integration to underlying management systems is both thorough and easy to implement, the transition was smooth, uneventful, and transparent to the majority of IT staff. 

While integration was still a central value proposition for this customer, the motivation behind it was new: integration afforded IT operations the flexibility to select the best technology for their environment. Technical SMEs could select tools that best solve their problems while avoiding the risk, disruptions and the usual aversion to change in any organization. It’s a variation of the Post-It note example above, where customer ingenuity led to an alternative use of Managed Objects. 

- Abbas


As a follow up to my recent post titled “End User Experience Monitoring as lynchpin for BSM,” I spoke with Eden Shochat, CTO of Aternity, to learn more about their offerings in this space and discuss unique contributions they could make as part of a larger BSM strategy. For those who aren’t familiar with them, Aternity provides a new class of Application Performance Management software via its Frontline Performance Intelligence Platform.

An excerpt of our very interesting discussion is presented below:

Abbas: I’ve described the role that EUEM solutions like Aternity’s Frontline Performance Intelligence play in the context of BSM, but what is unique about your technology in this area?

Eden: Aternity’s Platform fuses application, desktop and user performance monitoring with real-time business intelligence. This approach generates the most accurate and comprehensive user experience information from multiple levels of the network and application stack on the end-user machines (physical or virtual). By combining this unique data with correlation, clustering and anomaly detection analysis algorithms, Aternity’s Platform is able to perform preemptive problem detection, usage & usability analysis, fact-based capacity planning and activity-oriented compliance. 


Abbas: What is the typical deployment architecture for one of your Global 2000 customers? Does the solution require agents, appliances, or a combination? 

Eden: There is basically a four-tiered architecture approach we follow during the course of our enterprise deployments: 

> The Microsoft Certified Agent(s) collects information and measures the performance of the desktop, applications and user productivity 
> Next, Aggregation services communicate with the Agent(s) to aggregate the measurements and further compress the traffic 
> Then, Analytics Services perform analysis on any incoming data, such as activity usage metrics, and clusters similar data together, performing anomaly detection and correlating endpoints with similar characteristics to locate probable cause 
> Finally, the architecture is supported by a Management Console and a historical database store for enabling the management, configuration and interactive drill-down into specific user experience business intelligence data. 

The aggregation, analytics and management tiers can all run on the same server or they can be split into multiple servers, scaling horizontally or vertically to support tens of thousands of monitored users. 

Unlike appliances which are located within the data center, the Microsoft Certified Agent(s) resides on the end-point where the service is consumed, providing for an in-depth level of accuracy that is impossible to achieve with sniffer based technologies. Additionally, Agent distribution is performed as a software update versus having to distribute multiple hardware appliances.
 
And, by having aggregation separated, the architecture can more easily support distributed models. These could for example include those found in the oil and gas industry, where some of the services consumed are behind very high cost VSAT links. Bank branches are another good example, where some of the network and application services are local and don’t go through a general corporate network.
 
Abbas: When monitoring performance of applications, do you treat them all the same (i.e. agnostic) or are there specialized analyses for common applications such as Exchange, SAP, and Web?
 
Eden: The Aternity Agent performs both protocol-agnostic monitoring, supporting virtually all applications, and technology-specific monitoring. This includes:
 
> Generic Network Cartridge: supports any request → response type protocol, e.g. Java RMI, CIFS, 3270 or other, unpublished protocols.
> HTTP/S Cartridge: supports any HTTP-based application, web or otherwise, without requiring the security compromises of appliance-type key management.
> Win32 Client/Server Cartridge: Passive monitoring of any win32 user interface application, be it .NET forms, Powerbuilder or plain vanilla win32 programming.
> Oracle EBusiness Cartridge: Generic monitoring of JInitiator and JDK based form applications, including customizations performed by customers.
> Java: Monitor any Swing-based (AWT is supported by the Win32 Cartridge) applications and applets.
> Server-based Computing ICA/RDP Support: We monitor both the latency of the screen refreshes as well as the actual applications on the Citrix/Terminal Server for published desktops and applications.
 
The support for technology-based instrumentation means that most (if not all) of the applications, shrink wrapped or custom, can be monitored by the Aternity Platform.
 
In addition, the Agent collects environmental information, e.g. network statistics, process information (including crashed and hung processes and user activity) and operating system, service packs, installed applications & patches. The agent is a Windows service that monitors the endpoint, providing insight into the network, the desktop and server-based computing protocols.
 
This monitoring can be applied to standard desktop/laptop clients, server-based computing environments like Citrix XenApp and virtual desktop infrastructure (VDI) deployments, all with a minimal footprint of less than 10MB of physical RAM and under 0.1 percent CPU on average.
 
Abbas: Analytics seems to play a prominent role in how Aternity positions itself. Can you go into more detail about how it works and the value that provides to your customers?
 
Eden: Attempting to understand or derive business intelligence from volumes of end user performance metrics is like looking for the proverbial “needle in the haystack.” Sophisticated, real-time analytics are therefore necessary to truly bring about what we call Frontline Performance Intelligence.
 
The issue that plagued early attempts at monitoring user experience was the difficulty of transforming huge volumes of data into actionable intelligence. Previously, organizations would try to lessen the flow of data from end users’ desktops by only supporting partial deployments of Application Performance Management (APM) technologies. These deployments would be applied to PCs that were exhibiting performance issues. This mode of operation precludes ongoing introspection of user productivity and experience.
 
By collecting a comprehensive set of end user performance and productivity metrics at the Frontline, and processing this data with analytics, the Aternity Platform generates Frontline Performance Intelligence from real frontline performance metrics. The analytical components in the Aternity algorithm engine include:
 
Autonomic Performance Profiling: An Autonomic Performance Profile™ is the mathematical model used for automatic, real-time identification of groups of homogeneous users sharing the same behavior at a particular time, and is used to quantify, detect and distinguish between normal and abnormal behavior.
 
Deviation detection: Autonomic Performance Profiles were designed to provide the earliest possible detection of performance problems that impact multiple users while simultaneously eliminating the need for manual alert configuration and tuning, which many other products in the market require. The Analytic Engine performs continuous correlation between the real-time performance measurements captured by the Agents, and the Baselines of the Autonomic Performance Profiles. In this way, performance deviations of any magnitude can be automatically detected, for groups of users of any size, with no manual configuration and/or intervention.
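
As a rough illustration of the baseline-versus-measurement comparison described above, the following minimal sketch flags a group of users whose current response times stray from that group's own history. The interview does not describe Aternity's actual profiling model; the z-score test, the threshold of 3.0, and the names `Baseline`, `build_baseline` and `is_deviation` are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Baseline:
    """Learned 'normal' behavior for one group of similar users (hypothetical)."""
    mean_ms: float
    stdev_ms: float


def build_baseline(history_ms):
    """Derive a baseline from historical response times, in milliseconds."""
    return Baseline(mean_ms=mean(history_ms), stdev_ms=pstdev(history_ms))


def is_deviation(baseline, current_ms, threshold=3.0):
    """Return True if the group's current average is far from its baseline.

    A plain z-score stands in for whatever model a real product would use.
    """
    current_mean = mean(current_ms)
    if baseline.stdev_ms == 0:
        # A perfectly flat history: any change at all counts as a deviation.
        return current_mean != baseline.mean_ms
    z = abs(current_mean - baseline.mean_ms) / baseline.stdev_ms
    return z > threshold


# Illustrative numbers only: response times for one group of users.
history = [200, 210, 190, 205, 195]
baseline = build_baseline(history)
normal = is_deviation(baseline, [198, 207, 202])   # close to the baseline
slow = is_deviation(baseline, [450, 480, 500])     # far above the baseline
```

Because the baseline is derived from the group's own history, no per-application alert threshold has to be configured by hand, which is the property the answer above emphasizes.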
 
Problem Minimization: Each of the detected symptoms is analyzed for commonalities to tie multiple symptoms together into a problem. This has been shown to greatly reduce the number of alerts going to IT operations.
 
Problem Isolation through Endpoint Classification: End users with like symptoms are first grouped together into an “Effect group”, and an alert is raised. The analytic engine then automatically identifies the end users’ unique commonalities, with two levels of correlation, across the effect group:
 
1. Positive Correlation: the attributes that the affected group have in common
2. Negative Correlation: the attributes common to the effect group that are also common to the non-affected user groups
 
The intersection of these two correlations, i.e. the Query Group and the Effect Group is shown as the “Match Group” above. The attributes that produce the strongest Match Group are “surfaced” as a Probable Cause. Any attributes collected by the Aternity Agent (e.g.: the amount of memory, installed application or the subnet where the endpoint resides) may be used for Dynamic Problem Isolation, i.e. Probable Cause Analysis.
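
The isolation step described above can be sketched in a few lines. This is a simplified illustration and not Aternity's algorithm: the scoring rule (coverage of the effect group minus spill into unaffected endpoints), the function name `probable_causes` and the sample endpoints are all assumptions.

```python
from collections import defaultdict


def probable_causes(endpoints, effect_group):
    """Rank (attribute, value) pairs by how well they explain the effect group.

    endpoints: dict mapping endpoint id -> dict of collected attributes.
    effect_group: set of endpoint ids that show the symptom.
    The score (coverage minus spill) is an illustrative assumption.
    """
    # Query group: every endpoint that shares a given attribute value.
    query_groups = defaultdict(set)
    for endpoint_id, attributes in endpoints.items():
        for name, value in attributes.items():
            query_groups[(name, value)].add(endpoint_id)

    unaffected = set(endpoints) - effect_group
    ranked = []
    for pair, query_group in query_groups.items():
        match_group = query_group & effect_group
        if not match_group:
            continue
        # Positive correlation: share of affected endpoints with this value.
        coverage = len(match_group) / len(effect_group)
        # Negative correlation: share of unaffected endpoints that also have it.
        spill = len(query_group & unaffected) / len(unaffected) if unaffected else 0.0
        ranked.append((coverage - spill, pair))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked


# Hypothetical endpoints; pc1 and pc2 are the ones reporting slow responses.
endpoints = {
    "pc1": {"subnet": "10.1.0", "ram_gb": 2},
    "pc2": {"subnet": "10.1.0", "ram_gb": 4},
    "pc3": {"subnet": "10.2.0", "ram_gb": 2},
    "pc4": {"subnet": "10.2.0", "ram_gb": 4},
}
effect_group = {"pc1", "pc2"}
ranking = probable_causes(endpoints, effect_group)
top_score, top_cause = ranking[0]
```

In this toy data set the subnet is shared by every affected endpoint and by no unaffected one, so it ranks first, while the memory sizes appear equally often inside and outside the effect group and score zero.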
 
Abbas: Given that FPI may well be the early warning system that companies would rely on to get ahead of end user performance issues, what mechanisms do you provide by which another management platform can gain access to the results of the analytics so that they can be presented inside of a BSM view?
 
Eden: When we designed the Aternity Platform, it was clear that we were generating a new type of data stream - user experience combined with activity data. As such, we architected the system to be totally open. The system components communicate among themselves over a message bus. And, the complete database schema is open, documented and simple to use for custom-built reports. The problem detection analytics are exposed through our object-oriented Problem Life Cycle Manager and CLI layers.
 
Some of the existing integrations at customers include Ticketing Systems (CA, BMC), Portals (IBM WebSphere, Microsoft Sharepoint), SNMP alert systems (HPOV) and other proprietary systems.
 
Contact information for Aternity is available here.


Bits and bytes from itSMF Fusion 2008

[Photo: Cindy wins the Wii]
We had a great week at the itSMF Fusion 2008 show in San Francisco – certainly time well spent. We had some really insightful conversations with current and prospective customers, engaged a handful of analysts, scoped out the competition and sat in on several very interesting sessions which unveiled some rather unique data points.

For example, in one session on CMDBs, in excess of 50% of the audience (by show of hands) said they were currently implementing a CMDB. Two percent admitted they were on their second try, having failed the first time around. No surprise: integration was routinely cited as the main culprit, and that’s an area Managed Objects has certainly mastered.

StackSafe has posted some notes from the show here – and we’d like to offer some bits and bytes – mostly paraphrases – as well:

>> Congratulations to Cindy from Hallmark (photo nearby) who won our Wii raffle.

>> IT is good at measuring performance, but poor at measuring quality. A help desk that aims to solve 60% of incidents on the first call is really just encouraging staff to close a ticket with a poor answer and reopen a new one with another call. – Malcolm Fry, “CIO and the 366 Degree Circle”

>> Roughly 10% of the audience raised their hand when asked “do you know what BSM is?” – Lisa Erickson Harris, “BSM and Best Practices, Elevating the Role of the Service Desk”

>> IT investments will continue to grow, but they must either produce cost savings in the supply chain or improve the customer experience – Charlie Feld, “Enabling 21st Century Business Model with IT”

>> A well run IT department is like air – it’s taken for granted. – Dennis Ravenelle, “IT Service Continuity Management, Where do I start?”

>> “An inaccurate CMDB is worse than no CMDB.” – Richard Peasley, “Building Decision Support Systems that Work”

- Abbas


The compelling need for modeling services

Some time ago I was working with a customer who was new to his organization. His sole responsibility was to administer Managed Objects software – and he was tasked with modeling his business’ most important services.

It might defy expectations, but this new hire was made responsible for modeling the service dependencies for an organization he knew little about. We worked together for about a week and I like to believe I gave him a lot of good advice and training. He wasn’t exactly thrilled by the quantity of work that lay before him, but he had gotten some good ideas on how to begin.

About a year later, I spoke with him again and to paraphrase, this is what he told me:

He was viewing his Managed Objects BSM operations console and saw a critical problem -- one of their Oracle servers had thrown a critical error. He walked over to the network operations center to see what they were doing about it.

In response to telling the ops guys that Oracle was down on one of their critical apps, he received blank stares.

“Huh? No, Oracle isn’t down, it all looks okay.”

And then just as the ops people were saying as much…they saw he was right. There was an error. Their question to my customer was, how did he know that, and more importantly how did he know that before they knew?

The answer is simple: he asked the hard questions about what the service dependencies were and then was able to model them. He didn’t model every service, but he modeled the important ones.

He asked questions of managers, operators and developers. He read standards. He pieced together information to develop a picture of how the systems work, and then went back and asked more questions. In context, and against some fairly tremendous odds, my customer pieced together how the key systems they monitored actually worked.

This story is a cliché among my co-workers. They have all had customers relate similar experiences, and have seen it for themselves.

- Tom


Martin Atherton’s recent blog post “Are your systems falling down on resiliency?” calls out some alarming survey results that all corporate IT and business executives should pay attention to. Chief among these is the finding that 20% of today’s companies have significant, business impacting IT disruptions each and every quarter. It’s even more alarming when these results are juxtaposed with a similar survey conducted by Managed Objects in June of 2007 which found that up to 50% of large enterprise companies have significant business impacting outages up to 5 times a year. Among the Retail Banking sector this percentage goes up to 67%.

It’s tough to look at these results and say that we’re getting any better – besides as a business colleague used to tell me: “you need at least three data points to make a trend.” And yet, when the layers of the onion are peeled away in the Managed Objects survey, respondents felt that a good percentage of these outages were caused by changes to the application configuration. In fact, about a third of Retail Banks responded that between 25% and 50% of IT outages were caused by application configuration changes.

And while this further affirms Martin’s call for better focus on application development processes, it also brings into play the need for better process and technology focus applied to IT Operations change and configuration management. Even with the best designs and most careful plans, well-built production business applications can deteriorate in quality over time as fixes and enhancements are applied. So, while IT service quality improvements can be obtained through better development processes and technologies, process and technology improvements need to carry through to today’s IT operational environment as well in order to achieve long-term IT service quality improvements.


Living with IT pain

Sometimes people learn to live with pain. The pain may be real, enduring and chronic but people choose to ignore the pain -- which isn't the same thing as making it go away.

I was speaking with a prospect recently about our BSM solution, and after listening carefully to our presentation, this person’s first reaction was "we just don’t have the pain" to drive this sort of initiative. However, as the conversation evolved, we began to uncover that not only was this organization living with pain -- the pain was in fact excruciating and had a direct impact on the highest levels of IT leadership. They had, in effect, chosen to ignore the pain. Here’s the story as conveyed:

We have a heterogeneous IT management environment. Our tools report silos of information that must be manually correlated to understand the overall health of the IT infrastructure. As such our users usually provide IT with the first indication that there is a slowdown or outage. For example, a user in California will call the helpdesk in New York to report that an application is running slow.

Upon receiving the call, IT looks at their tools, but the screens are all reflecting green: IT can’t see the problem. Frustrated that their issue isn’t being resolved, the users begin to assume IT isn’t worth its salt, and consequently stop calling to report incidents or problems.

Later our CIO goes on a company-wide townhall tour, to visit users and better understand how IT can support the business. He’s taken aback when he gets blasted by users during a meeting (ambush?) at the California office because the PeopleSoft application has been running slow for weeks.

Determined to get to the bottom of this issue the CIO returns to the office with this anecdote and thoroughly questions his staff. "Why wasn’t this problem addressed?" he demands, noting how critical the application is to the business. "No one reported the incident," is the response.

This is a quintessential business case for BSM if I’ve ever heard one. By integrating those existing IT management tools, BSM can consolidate those silos of information and link the underlying infrastructure components to the service being provided (in this case a PeopleSoft application). The benefit of doing this has been proven time and time again: IT will be able to rapidly determine the root cause of incidents and problems, often before their customers are even aware, and dramatically reduce the mean-time-to-repair (MTTR). This supports the organization’s maturation from a reactive to a proactive IT organization. Why not eliminate chronic pain instead of learning to live with it?
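
To make the idea of linking infrastructure components to a service concrete, here is a minimal sketch of a service model that rolls the worst component state up to the service. It is not how Managed Objects implements this; the dependency map, the component names, the three-level severity scale and the function `service_state` are all hypothetical.

```python
SEVERITY = {"ok": 0, "warning": 1, "critical": 2}

# Hypothetical service model: each node lists what it depends on.
DEPENDS_ON = {
    "PeopleSoft": ["web-server", "app-server", "oracle-db"],
    "web-server": [],
    "app-server": ["oracle-db"],
    "oracle-db": [],
}


def service_state(node, reported, seen=None):
    """Return the worst state among a node and everything it depends on."""
    seen = set() if seen is None else seen
    if node in seen:
        # Already counted on this walk; this also guards against cycles.
        return "ok"
    seen.add(node)
    worst = reported.get(node, "ok")
    for child in DEPENDS_ON.get(node, []):
        child_state = service_state(child, reported, seen)
        if SEVERITY[child_state] > SEVERITY[worst]:
            worst = child_state
    return worst


# States as they might arrive from separate, siloed monitoring tools.
reported = {"web-server": "ok", "app-server": "warning", "oracle-db": "critical"}
peoplesoft = service_state("PeopleSoft", reported)
```

With a model like this, a database error shows up as a problem with the PeopleSoft service itself, rather than as one more alert on a tool that only the database team watches.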

As for my prospect, well, we have a second meeting soon and the CIO will be involved.

- Randy



Everyone wants a single source of truth

In many ways technologists are fighting for the same ideals – just on different levels. The single source of truth is one of those ideals and I find it fascinating to know that other members of the IT community are tackling a similar problem in parallel. Consider the following:

ERP: CEOs look to enterprise resource planning software to provide a single version of the truth with regard to a company’s finances. According to CIO.com, “Finance has its own set of revenue numbers, sales has another version, and the different business units may each have their own version of how much they contributed to revenues. ERP creates a single version of the truth that cannot be questioned because everyone is using the same system.”

MDM: A March 2008 report from Gartner describes how master data management “can help enterprises achieve a single view of the product.” For example, a consumer goods manufacturer that produces widgets in different countries tends to maintain different versions of the same product specification data for governmental or regulatory reporting requirements. Over time the quality of these different versions changes or erodes, leaving the business with multiple versions of the truth.

CMDB: Many IT departments have embarked on configuration management system (CMS) projects to synchronize and reconcile multiple federated data sources from performance, asset and system management tools, in order to provide a single source of truth about the health, availability and quality of the service provided to the business.

As a BSM vendor, my experience with CMDB customers tells me that the most suitable approach for this type of project is to leverage data directly from the trusted sources through integration, rather than attempting to copy, duplicate or otherwise centralize the data with methods such as ETL. In other words, build a centralized view, not a centralized repository, which of course is a key reason why calling it a configuration management system is more appropriate than calling it a configuration management database.
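
The difference between a centralized view and a centralized repository can be shown with a small sketch. This is an illustration of the general pattern only, not a description of any vendor's product; the class `FederatedView`, its methods and the two stand-in data sources are hypothetical.

```python
class FederatedView:
    """Centralized view that reads from trusted sources on demand.

    Nothing is copied into a central store; each lookup asks the owning source.
    """

    def __init__(self):
        self._sources = {}  # attribute name -> callable taking a CI name

    def register(self, attribute, fetch):
        """Declare which source is authoritative for an attribute."""
        self._sources[attribute] = fetch

    def describe(self, ci_name):
        """Assemble a live record for one configuration item (CI)."""
        return {attr: fetch(ci_name) for attr, fetch in self._sources.items()}


# Stand-ins for a real asset tool and a real monitoring tool (made-up data).
asset_db = {"srv01": "Rack 12, NYC"}
monitor = {"srv01": "ok"}

view = FederatedView()
view.register("location", lambda ci: asset_db.get(ci))
view.register("health", lambda ci: monitor.get(ci))

before = view.describe("srv01")
monitor["srv01"] = "critical"  # the source changes; no sync job has to run
after = view.describe("srv01")
```

An ETL copy would keep reporting the old health value until the next load, whereas the view reflects the change on the very next lookup because the trusted source remains the only place the data lives.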

Still, the fact that other areas of IT are seeking solutions to similar challenges is interesting – and we’ll be on the lookout for best practices. It’s also reminiscent of the idea that those who don’t read history are bound to repeat it.

- Jim


Does process matter?

Does process matter? In the case of BSM, technology and process are inextricably linked. On one hand, BSM technology brings together integration, modeling, automation and analytics – so IT operations have the necessary tools to quickly associate IT component failures with the related business services. Without technology, root cause and impact analysis must be manually discerned, which for at least one of our customers takes as many as 35 people on a conference call.

On the other hand, without an IT process regimen to ensure IT infrastructure change repeatability and continuous improvement, BSM technology will help to find problems faster – but it won’t help to reduce the risks introduced when changes are made to the IT infrastructure.

In both regards, BSM vendors have come a long way over the last five years. For example, we’ve learned how to couple the capabilities of Service Catalog and Discovery with a detailed top-down implementation process that results in successful BSM projects in as few as 90 days.

The aspects that vary from project to project are customer-specific – organizational processes or infrastructure – and they make each BSM implementation slightly different. This is where a vendor’s implementation experience matters most. To that end, we like to believe we’ve learned a lot in conducting more than 300 BSM implementations over the last decade.

Managed Objects fully agrees with the assertion that process matters…but we also believe that there are a number of generally accepted and well-defined processes and best practices associated with successful BSM projects.

- Dustin


Few would dispute that rolling out a business service management approach to an organization is highly valuable and beneficial. One of the biggest challenges, however, continues to be what this BSM solution should look like to its consumers. Is it a dashboard with a stoplight-like layout? Is it a speedometer tracking how quickly things are moving along on a production line? Is it information laid out on a world map highlighting contributions from various regions to a global marketplace?

Consider this analogy to highlight just how important this look and feel really is – I like eggs and toast for breakfast, as do a lot of other people. The raw contents of this breakfast are the same for everyone but preparation and layout will absolutely determine how successful breakfast is on any given day. It will determine how likely I am to recommend the meal to other people, how much satisfaction I gain from eating it, and how much value I associate with it as an everyday exercise. For the record, I like mine prepared as two eggs over hard with wheat toast cut in half on either side of the eggs, almost like a face. How many possible permutations are there? A lot.

Building dashboards at the presentation layer of a BSM solution is arguably the most important and most difficult task during any deployment, akin to making breakfast for a group of people all of whom agree that they want eggs for breakfast but can’t always articulate how they want them prepared or presented. This is exactly why in selecting a BSM vendor, it is absolutely crucial to evaluate flexibility in presenting information and also being able to change it very quickly to look like something completely different. In addition, the ability to quickly add/modify/delete the data feeds allows for new ingredients to be added to the mix so that the presentation layer can truly take on the shape that the BSM user community needs.

All this flexibility comes with its own challenges, the most significant of which is soliciting user feedback and building in mechanisms for comments, suggestions and approval from the user base. It’s one that is certainly reflected in the feature development cycle of a lot of companies, in particular ones with varying types of users. MySpace’s SVP of Product Strategy described their approach of soliciting feedback using blogs along Tom Anderson’s profile. A similar approach was described on CIO.com's blog.

These approaches can be creatively applied to building and maintaining a relevant presentation layer for a BSM solution by allowing communities of interest to provide feedback and suggestions on what they would like to see. It is then up to the administrators of the system to extract themes and concrete ideas from the feedback stream and construct new views. These should then be promoted in Beta form to further distill them, and then brought to the full user community. Sound like a BSM group inside of a larger enterprise social network? You bet.


Analysts in both the US and UK have been anticipating Microsoft’s move to extend its IT management capability into the Linux and UNIX platforms. For example last fall at Gartner’s ITxpo, one analyst theorized that if application vendors moved into the IT management space, it would be game-changing.

There is little doubt Microsoft’s move will make ripples in the market. The company has incredible influence in so many aspects of IT, that if this proves a serious commitment to IT management, there is a high probability for success. That success will likely come at the expense of incumbent vendors – mainly by way of taking market share from the Big 4.

By expanding beyond the realm of Windows it is conceivable that customers might find it attractive to extend their existing MOM implementations to other platforms. However, this does not guarantee gentle westerly winds nor smooth sailing since there are several market dynamics and competitive factors that will influence how – and how quickly – Microsoft’s initiative evolves.

First, cost reduction and cost containment are perhaps the most substantive pressures on IT decisions today. As such, it’s reasonable to expect the Big 4 will respond to this event with more aggressive pricing. In this approach, Microsoft will essentially be trading space for time, and slowly chipping away at Big 4 revenue streams. This will weaken the Big 4 over time.

Secondly, there will remain some doubts in the market as to Microsoft’s credibility. For example it will have to prove it can manage mission critical environments as well as the incumbent vendors. This means IT decision makers will see tremendous risk in migrating to a Microsoft management platform – which can prove to be a difficult and time consuming sales objection to overcome.

At the same time, the Big 4 are investing in two key product capabilities that will widen an already vast gap in product innovation, one that Microsoft, despite its prowess, will find challenging to close. Mainly these investments are in behavioural logic – the detection of unusual activity that provides predictive capabilities – and data centre automation.

Perhaps then, Microsoft has long range plans to move into the BSM space given its newly found operating system independence. BSM is still very much a level playing field, with the Big 4 attempting to buy (rather than innovate) their way into a space with more agile pioneers, like Managed Objects, where our vendor neutral approach and pervasive integration is proving a difficult capability for them to match.

- Jim