Manageability
Summary
With a distributed application, you should always know how components are
deployed and configured and where they reside on the network. You should also
be able to add hundreds or even thousands of users and manage their accounts,
determine whether the application is working as expected, and apply upgrades
to remote sites. These and many related issues determine how well you can
manage your application.
Managing an enterprise application is a major part of the total cost of
ownership. Manageability addresses the question: how do I efficiently deploy,
configure, upgrade, and monitor all local and remote components and services
of my distributed application? Beyond these typical operational concerns
(deploy, configure, upgrade, and monitor), the business will benefit from the
continued measurement of application health, including factors such as
performance, resource consumption, workloads, and problem occurrences.
The sections that follow attempt to answer this question.
Designing for manageability is all about providing infrastructure information
so that the application and its important components can be monitored for
corrective and preventive action. In general, designing for manageability
requires three design features:
- Management agents (spies)
Monitor a specific resource and report on the resource's state and
performance. Management agents also provide local configuration services to
enable remote management.
- Collection process
This is a process that collects, filters, correlates, and stores information
from all management agents.
- Management console(s)
Typically a GUI application that graphically manages information from the
collection process(es). From this central console, an administrator
can monitor all devices, analyze data, automate certain recurring
activities, and initiate remote configuration changes.
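The relationship between the three design features can be sketched in a few lines. This is a minimal illustration rather than a real management framework: the class names (ManagementAgent, Collector), the console_summary function, and the metric names are all hypothetical, and the probes are plain callables standing in for whatever actually measures the resource.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Reading:
    """One state report from a management agent."""
    resource: str
    metric: str
    value: float


class ManagementAgent:
    """Monitors one resource and reports its state on request."""

    def __init__(self, resource: str, probes: Dict[str, Callable[[], float]]):
        self.resource = resource
        self.probes = probes  # metric name -> function returning the current value

    def report(self) -> List[Reading]:
        return [Reading(self.resource, name, probe())
                for name, probe in self.probes.items()]


class Collector:
    """Collects, filters, and stores readings from all agents."""

    def __init__(self, keep: Callable[[Reading], bool] = lambda r: True):
        self.keep = keep  # filter: which readings are worth storing
        self.store: List[Reading] = []

    def poll(self, agents: List[ManagementAgent]) -> None:
        for agent in agents:
            self.store.extend(r for r in agent.report() if self.keep(r))


def console_summary(collector: Collector) -> Dict[str, Dict[str, float]]:
    """What a console might display: the latest value per resource and metric."""
    summary: Dict[str, Dict[str, float]] = {}
    for r in collector.store:
        summary.setdefault(r.resource, {})[r.metric] = r.value
    return summary
```

A collector built with keep=lambda r: r.value > 50 would, for example, store only readings above 50, so the console shows just the resources that need attention.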
The following design recommendations describe how to provide manageability for
a distributed application.
Use WMI
Windows Management Instrumentation (WMI) is an object-oriented interface to
system management information. It is
used to build tools that organize and manage system information so that
administrators (or system managers) can monitor system activities more
closely. For example, with WMI, you can develop an application that sends you
(or your group) an email if available memory drops below a specific threshold.
With WMI, you can create simple event thresholds, create complex correlated
thresholds, and define actions to be taken when thresholds are exceeded.
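The idea of simple thresholds, correlated thresholds, and actions can be illustrated without any WMI dependency. The sketch below is not the WMI API: ThresholdRule, evaluate, notify, the metric names, and the limits are all assumptions made for illustration, and the metric values are passed in as a plain dictionary instead of being read from the system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Metrics = Dict[str, float]


@dataclass
class ThresholdRule:
    name: str
    condition: Callable[[Metrics], bool]    # True when the threshold is exceeded
    action: Callable[[str, Metrics], None]  # what to do when it is


def evaluate(rules: List[ThresholdRule], metrics: Metrics) -> List[str]:
    """Run the action of every rule whose condition holds; return their names."""
    fired = []
    for rule in rules:
        if rule.condition(metrics):
            rule.action(rule.name, metrics)
            fired.append(rule.name)
    return fired


alerts: List[str] = []


def notify(rule_name: str, metrics: Metrics) -> None:
    # Stand-in for sending an email to the administrator or group.
    alerts.append(f"{rule_name}: {metrics}")


rules = [
    # Simple threshold: available memory below 256 MB (assumed limit).
    ThresholdRule("low-memory", lambda m: m["available_mb"] < 256, notify),
    # Correlated threshold: high CPU and a long request queue at the same time.
    ThresholdRule(
        "overload",
        lambda m: m["cpu_percent"] > 90 and m["queue_length"] > 100,
        notify,
    ),
]

# Only the memory rule fires here: CPU is high, but the queue is short.
fired = evaluate(rules, {"available_mb": 120, "cpu_percent": 95, "queue_length": 20})
```

Keeping the condition and the action as separate callables means the same rule list can be reused with a different notification channel.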
The following are some of the most common uses of WMI:
- Provide real-time information about the health of your servers.
- Monitor the historical performance profile of your servers.
- Record application failures and all other downtime to determine the
actual availability percentages.
- Set threshold events for application resources such as CPU consumption,
available memory, and so on, as mentioned previously.
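As a worked example of the availability item above, the percentage can be derived from recorded outage intervals. The function name availability_percent and the assumption that outages do not overlap one another are illustrative choices, not part of WMI.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Outage = Tuple[datetime, datetime]  # (start, end) of one downtime


def availability_percent(period_start: datetime,
                         period_end: datetime,
                         outages: List[Outage]) -> float:
    """Percentage of the period during which the application was up.

    Assumes outages do not overlap each other; each one is clipped to the period.
    """
    total = (period_end - period_start).total_seconds()
    if total <= 0:
        raise ValueError("period_end must be after period_start")
    down = 0.0
    for start, end in outages:
        clipped_start = max(start, period_start)
        clipped_end = min(end, period_end)
        if clipped_end > clipped_start:
            down += (clipped_end - clipped_start).total_seconds()
    return 100.0 * (total - down) / total


# Hypothetical 30-day period with two outages totalling 7 hours 12 minutes.
start = datetime(2024, 1, 1)
end = start + timedelta(days=30)
outages = [
    (start + timedelta(days=3), start + timedelta(days=3, hours=3)),
    (start + timedelta(days=10), start + timedelta(days=10, hours=4, minutes=12)),
]
availability = availability_percent(start, end, outages)
```

In this example the application was down for 7.2 of 720 hours, so the measured availability is 99 percent.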
Note that the .NET Framework provides the System.Management.Instrumentation
namespace, which contains the classes needed to instrument an application for
management and to expose management information and events through WMI to
potential consumers, such as Microsoft Application Center or Microsoft
Operations Manager.
Use Windows Servers
Windows Servers contain functionality that can greatly enhance application
manageability. These features often extend beyond application management to
network infrastructure management.
Testing for manageability is about ensuring that the deployment, maintenance,
and monitoring capabilities designed into the application actually work as
expected. The following recommendations can be used to verify that you have
created a manageable application:
- Test WMI
During the design phase, you make certain assumptions about the types of WMI
information that should be available. These might include server and network
configuration, application settings and many other application messages.
Test every source of information and confirm that each one can be
monitored.
- Test Cluster Management
Test adding servers to, and removing servers from, the live online cluster.
Adding/removing servers should not cause any interruption of normal service.
Make sure that the workload is automatically shifted across all available
servers.
- Test Network Load Balancing
If you have a cluster with multiple machines, demonstrate that all servers
are sharing the workload. Also demonstrate that if a server is taken
offline, the workload is shared across all remaining servers. Likewise,
if a new server is added, verify that it receives its share of the
workload.
- Test Application Synchronization
Files across servers should always be synchronized. If one set of files is
deleted from a server, then files should be restored automatically.
- Test Change Control Procedures
An important part of application management is being able to handle
scheduled and emergency maintenance changes. Test and validate all change
control procedures.
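The load-balancing checks above can be expressed as simple assertions on per-server request counts. The sketch below uses a toy round-robin dispatcher (dispatch) so that it is self-contained. In a real test the counts would come from server logs or counters, and the server names, request volume, and 10 percent tolerance are assumptions.

```python
from collections import Counter
from itertools import cycle
from typing import List


def dispatch(servers: List[str], request_count: int) -> Counter:
    """Toy round-robin balancer: returns how many requests each server handled."""
    handled: Counter = Counter()
    rotation = cycle(servers)
    for _ in range(request_count):
        handled[next(rotation)] += 1
    return handled


def shares_are_even(handled: Counter, servers: List[str],
                    tolerance: float = 0.10) -> bool:
    """True if every listed server handled its fair share, within the tolerance."""
    total = sum(handled.values())
    fair = total / len(servers)
    return all(abs(handled[s] - fair) <= tolerance * fair for s in servers)


cluster = ["web01", "web02", "web03"]

# All servers share the workload.
assert shares_are_even(dispatch(cluster, 3000), cluster)

# With one server offline, the remaining servers absorb the workload.
remaining = [s for s in cluster if s != "web02"]
after_removal = dispatch(remaining, 3000)
assert after_removal["web02"] == 0
assert shares_are_even(after_removal, remaining)

# A newly added server receives its share.
grown = cluster + ["web04"]
assert shares_are_even(dispatch(grown, 3000), grown)
```

The tolerance matters in practice because real balancers rarely split traffic exactly evenly over a short measurement window.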
The following best practices are recommended for creating manageable applications:
- Use the Equivalent of Application Center 2000
Application Center reduces the complexity of deploying, configuring, and
maintaining distributed applications.
- Use Windows Management Instrumentation
With WMI, you can query, monitor, and manipulate hardware and software
through a distributed application.
- Build Health Checks into the Application
How do you know if your application is running? What about five minutes
from now? A heartbeat is one way to determine whether your application is
still running. You could also issue some typical requests and monitor the
time they take to complete.
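Both health-check ideas, heartbeats and timed requests, fit in a short sketch. The names HeartbeatMonitor and timed_probe, the component names, and the 30-second timeout are hypothetical. The clock is injectable so that the monitor can be exercised in tests without waiting.

```python
import time
from typing import Callable, Dict, List


class HeartbeatMonitor:
    """Tracks the last heartbeat per component and flags the ones that went quiet."""

    def __init__(self, timeout_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock
        self.last_seen: Dict[str, float] = {}

    def beat(self, component: str) -> None:
        """Called by a component to signal that it is still running."""
        self.last_seen[component] = self.clock()

    def silent_components(self) -> List[str]:
        """Components whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(c for c, seen in self.last_seen.items()
                      if now - seen > self.timeout)


def timed_probe(request: Callable[[], object]) -> float:
    """Run one typical request and return how long it took, in seconds."""
    started = time.perf_counter()
    request()
    return time.perf_counter() - started


# Usage with a fake clock: "billing" stops sending heartbeats.
fake_now = [0.0]
monitor = HeartbeatMonitor(timeout_seconds=30, clock=lambda: fake_now[0])
monitor.beat("orders")
monitor.beat("billing")
fake_now[0] = 20.0
monitor.beat("orders")
fake_now[0] = 45.0
quiet = monitor.silent_components()  # only "billing" has been silent for over 30 s
```

The value returned by timed_probe could be fed into a threshold check so that a slow but still running application is flagged as well.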