About this Archive

This page is an archive of entries from August 2008 listed from newest to oldest.

August 2008 Archives

links for 2008-08-28


  • "Kenny Van Zant, chief product strategist at SolarWinds, a network management software maker, told me that most network outages are not caused by corrupted files...'If you look at the root causes of most network outages, north of 70 percent of them are caused by configuration errors by humans,' Van Zant told me. 'Computers fail a whole lot less often than the humans punching things into computers fail. Network engineers, as smart as they are, are not immune from that.' " >>> We've seen some evidence to support this hypothesis as well. In fact, one of our customers - an ITIL centric shop - found that 60% of outages were being caused by planned changes. Solution? CMDB360 enables them to understand the impact of change before making the change.
    (tags: CMDB BSM ITIL)
  • "Other factors are also contributing to spiraling network costs. Aberdeen Group, for example, found that companies expect to increase their bandwidth by 108%, on average, over the next 12 months. Companies also expect to increase the number of business-critical applications running on their networks by 67%."
  • "Challenges of a consolidation project include...Moving the right systems. It is often difficult for IT operations teams to map server resources to existing applications and IT services, ensuring that all the appropriate systems are moved in a consolidation project." >>> That's where BSM comes in: map technology...to applications...to the business
  • "The U.S.-led economic downturn shows no sign of causing a recession in IT spending," said Jim Tully, vice president and distinguished analyst at Gartner, in a statement. "In subsequent years we will see reduced growth, but the fundamentals remain strong. Emerging regions, replacement of obsolete systems and some technology shifts are driving growth."


links for 2008-08-25



I get to have some very interesting and varied discussions with people interested in BSM. They generally start off with a high level strategic goal for their IT organization and go from there. A common theme that’s emerged in a large percentage of them lately is the topic of Application Performance Management (APM) and its relationship with BSM. More specifically, I’ve been spending a lot of time talking about End-User Experience Monitoring (as a subset of APM) as a key element of IT management strategy.

The most important factor when organizations look towards End-User Experience Monitoring (EUEM) technologies is of course that user experience is the ultimate success criterion by which IT services should be evaluated. If I could provide highly responsive application services that were always up and running, what else could be expected of me? The end users would be happy and the lines of business would be realizing the value of their application investments. The underlying technology at that point becomes largely irrelevant. Whether the applications run in corporate datacenters, are hosted by a 3rd party, run in a cloud, or are 100% virtualized, it just doesn’t matter. I am of course oversimplifying the matter by focusing on performance and availability of applications, while leaving out other elements such as security, but, for the sake of discussion, let’s stick to performance and availability for now.

The benefits of having an EUEM solution in place are obvious:

-- Instead of waiting for the phone to ring at the IT Ops Help Desk, you are proactively notified about application failures and brownouts (“this application is so slow!”)
-- You can tell whether a reported issue is real: is the application really slow, or is it performing the same as it always does?
-- Quickly get a handle on the scope of an issue: how many users are impacted, which locations, is it all applications or just a few, etc.
-- Get initial diagnostic information to start the triage process: who gets called into the war room?

Visibility always sounds great, but I always like to ask: “Have you considered the downside of total visibility into end user performance?”

-- Issues that previously went unreported will be obvious in the chosen EUEM product’s reports. If an application performs slowly and no one calls it in, was it a real problem? The answer is now Yes, whereas before the EUEM tool, it never happened.
-- Don’t be surprised when you hear talk about establishing internal service level objectives for performance, availability, and MTTR, not for IT infrastructure elements, but for End User Experience.
-- How well equipped are you to handle an increase in the number of events that need troubleshooting? Now that you know every time an app fails or is slow, how quickly can you figure out WHY it happened?
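
To make the "is it really slow, or the same as always?" question concrete, here is a minimal sketch of a baseline comparison plus a per-location scope count. The function names, the three-sigma threshold, and the sample data are all assumptions for illustration; no EUEM product mentioned in this post is known to work exactly this way.

```python
from statistics import mean, stdev


def is_brownout(baseline_ms, recent_ms, threshold_sigmas=3.0):
    """Return True when recent response times sit well above the baseline.

    baseline_ms: historical response times (at least two samples).
    recent_ms:   response times from the window under investigation.
    The three-sigma default is an illustrative assumption.
    """
    limit = mean(baseline_ms) + threshold_sigmas * stdev(baseline_ms)
    return mean(recent_ms) > limit


def scope_by_location(samples, limit_ms):
    """Count slow samples per location; samples are (location, response_ms) pairs."""
    slow = {}
    for location, response_ms in samples:
        if response_ms > limit_ms:
            slow[location] = slow.get(location, 0) + 1
    return slow


if __name__ == "__main__":
    baseline = [200, 210, 190, 205, 195]      # hypothetical "normal" timings
    print(is_brownout(baseline, [450, 500, 480]))
    print(scope_by_location([("London", 450), ("Boston", 500)], limit_ms=300))
```

The first function answers whether the complaint is real; the second gives a rough first cut at scope by location.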

This is where the BSM discussion starts to come up organically (unless it was the starting point, of course). Having a defined service model in place for the key applications & services that IT provides to the business is key to the WHY side of the EUEM equation. We talk about having a BSM implementation that ties together the best of breed End-User Experience Monitoring solutions, Service Desks, Fault and Event Management solutions, Element Management platforms, IT infrastructure performance monitoring tools, and other types of products to build a complete strategy for IT and Business Operations. Integration discussions quickly follow but that is a topic that has been covered extensively in other blog posts (but if comments suggest it’s time to revisit the topic, we certainly can).
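
As a rough illustration of how a service model helps with the WHY, the sketch below walks a dependency map upward from a failed component to find everything that relies on it. The map layout, the component names, and the `impacted_by` function are hypothetical and are not taken from any Managed Objects product.

```python
from collections import deque


def impacted_by(depends_on, failed):
    """Return everything that depends, directly or indirectly, on `failed`.

    depends_on maps each service or application to the components it needs.
    """
    # Invert the edges so we can walk from a component up to its dependents.
    dependents = {}
    for parent, children in depends_on.items():
        for child in children:
            dependents.setdefault(child, set()).add(parent)

    impacted = set()
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for parent in dependents.get(node, ()):
            if parent not in impacted:
                impacted.add(parent)
                queue.append(parent)
    return impacted


if __name__ == "__main__":
    # Hypothetical model: business service -> applications -> infrastructure.
    model = {
        "Online Banking": ["web-app", "auth-service"],
        "web-app": ["web-server-01", "db-01"],
        "auth-service": ["db-01"],
        "Payroll": ["payroll-app"],
        "payroll-app": ["db-02"],
    }
    print(sorted(impacted_by(model, "db-01")))
```

The same walk run in the other direction (from a slow business service down to its components) gives the list of suspects to check first.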

I often wind up talking about specific types of EUEM technologies and vendors because of my background in this space (see full disclosure below), and I wanted to capture some of those discussions in a series of posts. First off, I generally group EUEM tools into one of three categories:

1. Passive monitoring systems

    -- Generally appliance offerings which capture and analyze IP packets to provide insight into End-User Experience of a wide array of applications
    -- May have specialized analyses for standards-based apps such as Web, Voice, and Video
    -- Deployment is usually at datacenters or wherever the applications are hosted
    -- Vendors providing solutions in this space (in alphabetical order):
            -- CA’s Wily Customer Experience Manager (via Wily)
            -- Compuware’s ClientVantage (via Adlex)
            -- Coradiant
            -- HP’s Real User Monitor software (via Mercury Interactive)
            -- IBM Tivoli
            -- NetQoS
            -- Nimsoft
            -- OPNET Technologies
            -- Quest’s Foglight End User Management

2. Active monitoring systems w/ synthetic transactions

    -- In most cases this involves recording sample user activities (e.g. login, search for info, run report, etc.) and using deployed robot agents/appliances to replay them at varied intervals.
    -- Most frequently these solutions tend to be oriented exclusively towards Web apps, but there are specialized vendors that focus on large enterprise application suites such as SAP and Oracle.
    -- Generally deployed at user population centers such as campuses or representative locations such as selected offices in Europe or Asia.
    -- Market was pretty much created by Mercury Interactive (now part of HP) so they are the dominant player here.
    -- Some of the vendors in the passive monitoring space also have some active monitoring elements in their portfolio to “poke” the applications when users might not be utilizing their applications (after business hours, for example); other smaller vendors offer lower cost solutions when compared to the HP suite including Managed Objects’ own Business Experience Manager.
    -- For internet facing applications, services from companies like Gomez and Keynote Systems are great sources of performance data without having to deploy your own monitoring “robots”

3. End-user behavior monitoring & analysis

    -- A few of the passive monitoring vendors offer aggregate or high level data, but I tend to group solutions into this category that offer detailed analysis of not just performance, but also business analytics about user behavior.
    -- Some of the defining elements in this category include capture and playback of complete user sessions, tracking time spent on each page, click-through rates, and error or missing content identification.
    -- Solutions in this space are specialized by applications (e.g. Web, SAP, Oracle, etc.)
    -- Deployment can be a combination of appliances and software agents. 
    -- A few of the vendors in this space (in alphabetical order):
            -- Aternity
            -- Knoa
            -- Tealeaf

The list of players in the EUEM space is not comprehensive, just a short list of those with which I have some level of familiarity. Some quick Google searches or a call to your favorite analyst should turn up more.
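
For the second category above (active monitoring with synthetic transactions), the record-and-replay idea can be sketched in a few lines. This is only an illustration under stated assumptions: the steps are plain Python callables standing in for recorded user actions, and `run_synthetic_transaction` is a hypothetical name, not the API of any vendor listed here.

```python
import time


def run_synthetic_transaction(steps):
    """Replay named steps in order, timing each one; stop at the first failure.

    steps is a list of (name, action) pairs, where action is a callable that
    raises an exception when the step fails.
    """
    results = []
    for name, action in steps:
        start = time.perf_counter()
        try:
            action()
            ok = True
        except Exception:
            ok = False
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        results.append({"step": name, "ok": ok, "elapsed_ms": elapsed_ms})
        if not ok:
            break  # later steps depend on earlier ones (e.g. login first)
    return results


if __name__ == "__main__":
    # Placeholder actions; a real robot would drive the application instead.
    script = [
        ("login", lambda: time.sleep(0.01)),
        ("search for info", lambda: time.sleep(0.02)),
        ("run report", lambda: time.sleep(0.01)),
    ]
    for result in run_synthetic_transaction(script):
        print(result)
```

A scheduler would call this at varied intervals from each monitored location and feed the timings into whatever does the alerting.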

I would like to explore this area in more detail and plan a series of follow up posts with topics such as: details about what information from EUEM solutions should be incorporated into the BSM fabric (Hint: it’s not just events); sharing field experience on deployment strategies that have worked & pitfalls to avoid; detailed coverage on specific vendors including strengths, weaknesses, integration details, etc.

If you have experience with any of the products mentioned, know of other vendors that you feel should be mentioned in any of the categories, have general feedback, or would like to have a one-on-one discussion on this (or any other) topic I’ve covered, leave me a comment below.

***Disclosure: The author of this post, Abbas Haider Ali, held roles at OPNET Technologies and IBM prior to Managed Objects.


Martin Atherton’s recent blog post “Are your systems falling down on resiliency?” calls out some alarming survey results that all corporate IT and business executives should pay attention to. Chief among these is the finding that 20% of today’s companies have significant, business impacting IT disruptions each and every quarter. It’s even more alarming when these results are juxtaposed with a similar survey conducted by Managed Objects in June of 2007 which found that up to 50% of large enterprise companies have significant business impacting outages up to 5 times a year. Among the Retail Banking sector this percentage goes up to 67%.

It’s tough to look at these results and say that we’re getting any better – besides as a business colleague used to tell me: “you need at least three data points to make a trend.” And yet, when the layers of the onion are peeled away in the Managed Objects survey, respondents felt that a good percentage of these outages were caused by changes to the application configuration. In fact, about a third of Retail Banks responded that between 25% and 50% of IT outages were caused by application configuration changes.

And while this further affirms Martin’s call for better focus on application development processes, it also brings into play the need for better process and technology focus applied to IT Operations change and configuration management. Even with the best designs and most careful plans, well-built production business applications can deteriorate in quality over time as fixes and enhancements are applied. So, while IT service quality improvements can be obtained through better development processes and technologies, those improvements need to carry through to today’s IT operational environment as well if IT service quality is to improve over the long term.


CMDB and governance


IT governance is loosely defined as "the performance, risk assessment and compliance of an information technology system." It is a subset of corporate governance, which applies the same principle at a corporate level.

IT governance is a main motivator for corporate entities in their decisions to create compliant IT systems that include CMDBs. The CMDB is supposed to model (as closely as possible) a "vision of the truth" about those corporate entities’ IT systems. To that end I would like to discuss the governance of how a CMDB is constructed. By CMDB governance I mean: the performance, risk assessment and compliance of a CMDB to the actual true state of the IT system being modeled; I might also add to this the confidence in that CMDB as it relates to its accuracy in modeling the IT system.

How then are CMDBs constructed? The best way is to tie together existing data and complement that data with discovery tools and old fashioned data-entry. The latter may be done through the use of various scripts or the mining of databases that hold certain aspects of the system, but in large part these would be grouped under ‘formal methodologies.’

At this point the CMDB construction is already fraught with danger: for example, holes in the data, or data that is inaccurate or stale. The confidence factor of such a CMDB may be quite low, which makes it rather useless. This problem is exacerbated by the fact that building it can be a very time-consuming and difficult procedure. There is a very well known decision matrix called the ‘Impact/Effort’ matrix. The idea is that if an activity has a high effort, but a low impact, then that activity is not worth doing.

This is a major problem in creating a useful CMDB. What to do?

Well, before I attempt to answer that question, let me introduce a different way that CMDBs can be created: social networking tools.

The idea is that the IT systems knowledge (at least on a high level) is contained in the heads of many human beings. The raw data can be managed by the discovery tools and formal tools, but the structural knowledge and ability to correct or fill in knowledge requires constant, dynamic human intervention.

That’s why tools like Managed Objects myCMDB that allow ‘Wiki’ like intervention into the construction of a CMDB are so useful. This works by taking the view that the CMDB is a dynamic living entity that never really attains a static ‘state of truth,’ but instead has major portions that are basically static, and ‘outliers’ that are being dynamically changed as network needs change, as virtual machines are created and destroyed, and as complex applications go into and out of existence. The truth in a complex IT system is a living, breathing, and most importantly, changing beast. This is why the traditional methods of creating a CMDB always seem to end up with something that is out of date with the real thing.

To make sure that the data is maintained in the CMDB correctly, a ‘governance’ policy needs to be instituted between the ‘gathering’ of the data, and the ‘structuring’ of the data within the CMDB itself. Obviously there will be many things that can be instituted as boiler-plate within this policy, but there will be many aspects of this policy that will be unique to the corporate entity that is instituting them. The policy is itself stored within the CMDB which introduces an aspect of feed-back that will help to tailor this policy to better capture and structure the dynamically collected data that ultimately forms the CMDB model.

The feedback loop is essentially a human interaction activity and could be compared to the idea of domain experts who edit an encyclopedia (or a Wiki).

These experts compare the current structures of the collected knowledge within the CMDB to the real world, and modify the policy so that the collection and structuring of data more closely fits with, and changes with, the real world. The ultimate hope is to get to a policy that produces the closest possible CMDB model to the real world.

As the policy improves within the enterprise, the confidence factor rises and, of course, the usefulness of the CMDB within the enterprise also rises.
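
One way to put a number on the ‘confidence factor’ discussed above is to compare CMDB records against what is actually observed and count holes, inaccurate entries, and stale entries against the score. The sketch below is only an illustration: the record layout, the 30-day freshness window, and the `cmdb_confidence` function are assumptions, not a description of how myCMDB or any other product measures confidence.

```python
from datetime import date


def cmdb_confidence(cmdb, discovered, today, max_age_days=30):
    """Fraction of observed items whose CMDB record exists, matches, and is fresh.

    cmdb:       {ci_id: {"attrs": {...}, "verified": date}}
    discovered: {ci_id: {...}} as seen by discovery tools or people.
    """
    if not discovered:
        return 0.0
    good = 0
    for ci_id, actual_attrs in discovered.items():
        record = cmdb.get(ci_id)
        if record is None:
            continue  # hole: the item exists but the CMDB does not know it
        fresh = (today - record["verified"]).days <= max_age_days
        accurate = record["attrs"] == actual_attrs
        if fresh and accurate:
            good += 1
    return good / len(discovered)


if __name__ == "__main__":
    today = date(2008, 8, 28)
    cmdb = {
        "db-01": {"attrs": {"os": "linux"}, "verified": date(2008, 8, 20)},
        "web-01": {"attrs": {"os": "windows"}, "verified": date(2008, 1, 1)},
        "app-01": {"attrs": {"os": "linux"}, "verified": date(2008, 8, 25)},
    }
    discovered = {
        "db-01": {"os": "linux"},     # matches and fresh
        "web-01": {"os": "windows"},  # matches but stale
        "app-01": {"os": "solaris"},  # inaccurate
        "vm-07": {"os": "linux"},     # missing from the CMDB
    }
    print(cmdb_confidence(cmdb, discovered, today))
```

Tracking a score like this over time is one way to see whether changes to the governance policy are actually closing the gap with the real world.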

- David