Manageability

Summary

Overview
Designing for Manageability
Testing for Manageability
Best Practices for Manageability

Overview

With a distributed application, you should always know how components are deployed and configured and where they sit on the network. You should be able to add hundreds or even thousands of users and manage their accounts, tell whether the application is working as expected, and apply upgrades to remote sites. These and many other issues all relate to how well you can manage your application.

Managing an enterprise application is a major part of the total cost of ownership. Manageability addresses the question: How do I efficiently deploy, configure, upgrade, and monitor all local and remote components and services of my distributed application? Beyond these typical operational concerns (deployment, configuration, upgrades, and monitoring), the business benefits from continuous measurement of application health, including factors such as performance, resource consumption, workloads, and problem occurrences.

The sections that follow attempt to answer this question.

Designing for Manageability

Designing for manageability is all about providing infrastructure information so that the application and its important components can be monitored for corrective and preventive action. In general, designing for manageability requires three design features:

Management agents (spies)
Monitor a specific resource and report on the resource's state and performance. Management agents also provide local configuration services to enable remote management.
Collection Process
This is a process that collects, filters, correlates, and stores information from all management agents.
Management console(s)
Typically a GUI application that graphically manages information from the collection process(es). From this central console, an administrator can monitor all devices, analyze data, automate certain recurring activities, and initiate remote configuration changes.
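
The relationship between the three features can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class names (Agent, Collector), the console_summary function, the probe callables, and the resource names are hypothetical and do not come from any particular management product. A real agent would read actual system counters rather than return fixed values.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Reading:
    """One report from an agent about the resource it watches."""
    resource: str
    value: float


class Agent:
    """Management agent: monitors one resource and reports its state."""

    def __init__(self, resource: str, probe: Callable[[], float]):
        self.resource = resource
        self.probe = probe  # hypothetical callable returning the current value

    def report(self) -> Reading:
        return Reading(self.resource, self.probe())


class Collector:
    """Collection process: gathers, filters, and stores agent readings."""

    def __init__(self, agents: List[Agent], minimum: float = 0.0):
        self.agents = agents
        self.minimum = minimum  # readings below this value are filtered out
        self.history: Dict[str, List[float]] = {}

    def collect(self) -> None:
        for agent in self.agents:
            reading = agent.report()
            if reading.value >= self.minimum:
                self.history.setdefault(reading.resource, []).append(reading.value)


def console_summary(collector: Collector) -> Dict[str, float]:
    """Console view: the latest stored value for each resource."""
    return {name: values[-1] for name, values in collector.history.items()}


# Fixed probe values stand in for real measurements.
agents = [
    Agent("cpu_percent", lambda: 42.0),
    Agent("free_memory_mb", lambda: 512.0),
]
collector = Collector(agents)
collector.collect()
summary = console_summary(collector)
print(summary)
```

Keeping the agents unaware of the console means new resources can be monitored by adding an agent, without changing the collection process or the console.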

The following design recommendations describe how to provide manageability for a distributed application.

Use Windows Management Instrumentation (WMI)

WMI is an object-oriented interface to system management information. It is used to build tools that organize and manage system information so that administrators (or system managers) can monitor system activities more closely. For example, with WMI, you can develop an application that sends you (or your group) an email if available memory drops below a specific threshold. With WMI, you can create simple event thresholds, create complex correlated thresholds, and define actions to be taken when thresholds are exceeded.
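
The difference between a simple threshold and a correlated threshold can be sketched in a few lines. This sketch does not call WMI; it only models the idea on a plain dictionary of metric values. The names ThresholdRule, below, above, all_of, and record_alert, the metric names, and the limits are all hypothetical, and the action simply records the rule name where a real system might send an email.

```python
from typing import Callable, Dict, List

Metrics = Dict[str, float]
Condition = Callable[[Metrics], bool]


def below(metric: str, limit: float) -> Condition:
    """Simple threshold: true when the metric drops below the limit."""
    return lambda m: m[metric] < limit


def above(metric: str, limit: float) -> Condition:
    """Simple threshold: true when the metric rises above the limit."""
    return lambda m: m[metric] > limit


def all_of(*conditions: Condition) -> Condition:
    """Correlated threshold: true only when every condition holds at once."""
    return lambda m: all(c(m) for c in conditions)


class ThresholdRule:
    """Pairs a condition with the action to take when it is met."""

    def __init__(self, name: str, condition: Condition,
                 action: Callable[[str, Metrics], None]):
        self.name = name
        self.condition = condition
        self.action = action

    def evaluate(self, metrics: Metrics) -> bool:
        if self.condition(metrics):
            self.action(self.name, metrics)
            return True
        return False


alerts: List[str] = []


def record_alert(name: str, metrics: Metrics) -> None:
    # Stand-in for a real action such as sending an email.
    alerts.append(name)


rules = [
    ThresholdRule("low-memory", below("available_memory_mb", 256), record_alert),
    ThresholdRule(
        "memory-pressure-under-load",
        all_of(below("available_memory_mb", 512), above("cpu_percent", 90)),
        record_alert,
    ),
]

sample = {"available_memory_mb": 200.0, "cpu_percent": 95.0}
for rule in rules:
    rule.evaluate(sample)
print(alerts)
```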

The following shows some of the most common uses of WMI:

Provide real-time information about the health of your servers.
Monitor the historical performance profile of your servers.
Collect application failures and all other downtimes to determine the actual availability percentages.
Set threshold events for application resources such as CPU consumption and available memory, as mentioned previously.
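
As an illustration of how collected downtimes translate into an availability percentage, the following sketch applies the usual formula: availability = (period - downtime) / period x 100. The function name availability_percent and the outage figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Iterable


def availability_percent(period_minutes: float,
                         outage_minutes: Iterable[float]) -> float:
    """Return the percentage of the period during which the service was up."""
    if period_minutes <= 0:
        raise ValueError("period must be positive")
    downtime = sum(outage_minutes)
    if downtime > period_minutes:
        raise ValueError("downtime cannot exceed the measurement period")
    return 100.0 * (period_minutes - downtime) / period_minutes


# A 30-day month has 30 * 24 * 60 = 43,200 minutes.
period = 30 * 24 * 60
# Three recorded outages totalling 43.2 minutes (illustrative values).
outages = [12.0, 30.0, 1.2]

result = availability_percent(period, outages)
print(f"{result:.2f}%")  # 43.2 minutes of downtime in 43,200 is 0.1%, so 99.90%
```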

Note that the .NET Framework provides the System.Management.Instrumentation namespace, which contains the classes necessary to instrument an application for management and to expose management information and events through WMI to potential consumers. These consumers can include Microsoft Application Center or Microsoft Operations Manager.

Use Windows Servers

Windows Servers contain functionality that can greatly enhance application manageability. These features often extend beyond the application management to network infrastructure management as well.

Testing for Manageability

Testing for manageability is about ensuring that the deployment, maintenance, and monitoring capabilities designed into the application actually work as expected. The following recommendations can be used to verify that you have created a manageable application:

Test WMI
During the design phase, you make certain assumptions about the types of WMI information that should be available. These might include server and network configuration, application settings and many other application messages. Make sure you test for every source of information and be sure to monitor each one.
Test Cluster Management
Test adding servers to, and removing servers from, the live cluster. Adding or removing servers should not cause any interruption of normal service. Make sure that the workload is automatically shifted across all available servers.
Test Network Load Balancing
If you have a cluster with multiple machines, demonstrate that all servers are sharing the workload. Also demonstrate that if a server is placed offline, then the workload is shared across all remaining servers. Likewise, if a new server has been added, then verify that it gets its share of workload.
Test Application Synchronization
Files across servers should always be synchronized. If one set of files is deleted from a server, then files should be restored automatically.
Test Change Control Procedures
An important part of application management is being able to handle scheduled and emergency maintenance changes. Test and validate all change control procedures.
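
The load-balancing checks described above can be partly automated. The sketch below assumes the test harness can produce a list naming which server handled each request; that log format, the function names workload_shares and is_balanced, the server names, and the 10 percent tolerance are all hypothetical choices, not features of any specific load balancer.

```python
from collections import Counter
from typing import Dict, List


def workload_shares(request_log: List[str]) -> Dict[str, float]:
    """Fraction of requests handled by each server named in the log."""
    counts = Counter(request_log)
    total = len(request_log)
    return {server: count / total for server, count in counts.items()}


def is_balanced(request_log: List[str], expected_servers: List[str],
                tolerance: float = 0.10) -> bool:
    """True if every expected server is within tolerance of its fair share."""
    if not request_log or not expected_servers:
        return False
    shares = workload_shares(request_log)
    # A server outside the expected set should not be handling requests.
    if set(shares) - set(expected_servers):
        return False
    fair_share = 1.0 / len(expected_servers)
    return all(abs(shares.get(server, 0.0) - fair_share) <= tolerance
               for server in expected_servers)


# All three servers online: each handles a third of the requests.
full_cluster_log = ["web1", "web2", "web3"] * 100
print(is_balanced(full_cluster_log, ["web1", "web2", "web3"]))

# web2 taken offline: the remaining two servers should split the load.
degraded_log = ["web1", "web3"] * 150
print(is_balanced(degraded_log, ["web1", "web3"]))
```

Running the same check with the new list of expected servers after adding a machine verifies that the newcomer receives its share of the workload.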

Best Practices for Manageability

The following best practices are recommended for creating manageable applications:

Use the Equivalent of Application Center 2000
Application Center eliminates much of the complexity of deploying, configuring, and maintaining distributed applications.
Use Windows Management Instrumentation
With WMI, you can query, monitor, and manipulate hardware and software through a distributed application.
Build Health Checks into the Application
How do you know if your application is running? What about five minutes from now? A heartbeat is one way to determine whether your application is still running. You could also issue some typical requests and monitor the time they take to complete.
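
Both kinds of health check can be sketched briefly. The names HeartbeatMonitor and timed_request, the component name, and the five-second timeout are hypothetical. The clock is injectable so that the example can simulate the passage of time instead of sleeping; in production the default monotonic clock would be used.

```python
import time
from typing import Callable, Dict, Tuple, TypeVar

T = TypeVar("T")


class HeartbeatMonitor:
    """Tracks the last heartbeat per component and flags missing ones."""

    def __init__(self, timeout_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        self.last_beat: Dict[str, float] = {}

    def beat(self, component: str) -> None:
        """Called by a component to signal that it is still running."""
        self.last_beat[component] = self.clock()

    def is_alive(self, component: str) -> bool:
        """True if the component sent a heartbeat within the timeout."""
        last = self.last_beat.get(component)
        return last is not None and (self.clock() - last) <= self.timeout_seconds


def timed_request(request: Callable[[], T],
                  clock: Callable[[], float] = time.perf_counter) -> Tuple[T, float]:
    """Run a typical request and return its result and elapsed seconds."""
    start = clock()
    result = request()
    return result, clock() - start


# A fake clock makes the heartbeat example deterministic.
fake_now = [0.0]
monitor = HeartbeatMonitor(timeout_seconds=5.0, clock=lambda: fake_now[0])

monitor.beat("order-service")
fake_now[0] = 3.0
alive_after_3s = monitor.is_alive("order-service")  # within the 5 s timeout

fake_now[0] = 9.0
alive_after_9s = monitor.is_alive("order-service")  # heartbeat is overdue

# A cheap computation stands in for a typical application request.
result, elapsed = timed_request(lambda: sum(range(1000)))
print(alive_after_3s, alive_after_9s, result)
```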