Cache Collector

Summary

Purpose

Purges cache entries that are no longer required.

Scenario

A Cache Collector defines a set of rules for identifying and removing expired cache entries from the cache. Some common strategies for identifying and removing expired cache entries are:

Cache Collector allows to encapsulate a cache garbage collection strategy with a Cache Accessor implementation where you can decouple cache garbage collection form other orthogonal aspects of caching. In addition, cache collector can encapsulate multiple cache garbage collection strategies by defining its operations in terms of pluggable operations. A Cache Accessor implementation can then choose which strategy to use.

Structure & UML

The following figure illustrates the static structure for the Cache Collector :

ICacheCollector, ICacheEntry, and ICacheEntryFactory are interfaces that define various abstractions for cache collection strategies (obviously with interfaces, there are no specific algorithms or rules as these are just interfaces and not implemenation-details). These interfaces allow you to plug multiple collection strategies into the same CacheAccessor implementation, and more importantly, they allow you to isolate collection logic within its own component. Note the following:

The following figure illustrates the sequence diagram when a CacheAccessor writes a new cache entry:

The CacheAccessor uses its ConcreteCacheEntryFactory implementation to wrap the original data in a ConcreteCacheEntry object. This new ConcreteCacheEntry object is then stored in the Cache rather than the data itself. This allows the CacheAccessor object to maintain any related metadata that its ConcreteCacheCollector requires.

The following figure illustrates the sequence diagram when a CacheAccessor reads a cache entry:

CacheAccessor gets the corresponding ConcreteCacheEntry object from the Cache object. And instead of returning this ConcreteCacheEntry object to the client, CacheAccessor retrieves and returns the original data that the ConcreteCacheEntry object wraps. This has three positive effects:

  1. The concept of ConcreteCacheEntries remain hidden from clients and this eliminates client dependency on a specific cache collection strategy.
  2. ConcreteCacheEntries have an opportunity to update metadata accordingly (updating timestamps, etc.)
  3. CacheAccessor's read and write data are symmetrical so that clients read and write in the same form.

The following figure illustrates the sequence diagram when a CacheAccessor initiates a purge operation:

When a CacheAccessor initializes, it starts a background worker thread that is dedicated to collecting expired cache entries. Whenever, cache collection is required, be it at pre-determined times or on demand, CacheAccessor calls Collect on its assigned ConcreteCacheCollector object. ConcreteCacheCollector object responds by executing the rules for its cache collection strategy and removing expired entries from the Cache object.

Example

CacheAccessor depends on three interfaces whose implementation define a complete cache collection strategy:

internal interface ICacheEntry
{
    object GetData { get; }
}

internal interface ICacheEntryFactory
{
    ICacheEntry NewCacheEntry( object key, object data );
}

internal interface ICacheCollector
{
    void Collect();
}

Fixed Expiration

The following code shows how to implement a fixed expiration cache-collection strategy. Each FixedExpirationCacheEntry maintains its creation timestamp which the FixedExpirationCacheCollector uses to decide when it expires. These classes optimized garbage collection by implementing a queue that holds keys in their creation order; instead of iterating through the entire cache contents to identify expired entries, FixedExpirationCacheCollector checks only the keys on the front of the queue. Once it finds a key that has not expired, it safely assumes the rest of the keys in the queue follow suit:

internal class FixedExpirationCacheEntry : ICacheEntry
{
    // Data members
   
private object m_Data;
    private System.DateTime m_dtCreationTime;

    // Constructors
   
public FixedExpirationCacheEntry( object data )
    {
        m_Data = data;
        m_dtCreationTime = System.DateTime.Now;
    }

    // ICacheEntry implementation
   
public object GetData
    {
        get { return m_Data; } 
    }

    public DateTime CreationTime
    {
        get { return m_dtCreationTime; }
        }
}

internal class FixedExpirationCacheEntryFactory : ICacheEntryFactory
{
    // Data members
    private ArrayList m_alCreationQueue = null;

    // Constructors
   
public FixedExpirationCacheEntryFactory( ArrayList alQueue )
    {
        m_alCreationQueue = alQueue;
    }

    // ICacheEntryFactory implementation
   
public ICacheEntry NewCacheEntry( object key, object data )
    {
        // Add this key to the front of the list. If it already exists, remove the old entry since the corresponding
        // cache entry is removed as well
       
lock( this)
        {
            if (m_alCreationQueue.Contains( key ))
            {
                m_alCreationQueue.Remove( key );
                m_alCreationQueue.Add( key );
            }
            else
            {
                m_alCreationQueue.Add (key );
            }
        }

        // Now create a new instance
       
return new FixedExpirationCacheEntry( data );
    }
}

internal class FixedExpirationCacheCollector : ICacheCollector
{
    // Data members
   
private Hashtable m_Cache = null;
    private ArrayList m_alCreationQueue = null;
    private TimeSpan m_tsInterval;

    // Constructor
   
public FixedExpirationCacheCollector( Hashtable cache, ArrayList q, double dInterval)
    {
        m_Cache = cache;
        m_alCreationQueue = q;
        m_tsInterval = System.TimeSpan.FromMinutes( dInterval );
    }

    // ICacheCollector interface
   
public void Collect()
    {
        // Get a date/time value to represent expiration time. Entries whose creation time is less
        // than dtExpirationTime are expired
        DateTime dtExpirationTime = DateTime.Now.Subtract( m_tsInterval );

        lock( this )
        {
            ArrayList alKeysToRemove = new ArrayList();
            foreach (object key in this.m_alCreationQueue )
            {
                // Get current entry
                FixedExpirationCacheEntry entry = (FixedExpirationCacheEntry)m_Cache[key];

                // Remove entry if it has expired
               
if (entry.CreationTime < dtExpirationTime )
                    alKeysToRemove.Add ( key );
                else
                    break;
            }

            // Now remove the expired cache entries
    
        foreach (object key in alKeysToRemove)
            {
                m_Cache.Remove( key );
                m_alCreationQueue.Remove( key );
            }
        }
    }
}

Inactive Expiration

The following code shows how to implement an inactive cache-collection strategy. Each InactiveExpirationCacheEntry maintains its last access timestamp which it updates every time a client refers to its data. InactiveExpirationCacheCollector uses this timestamp to identify expired entries. The class maintains an access queue that lists the keys in their access order. This means InactiveExpirationCacheCollector can find inactive entries quickly by iterating from the beginning of the queue and finding the first expired entry. Once it finds an expired entry, it can safely assume that all subsequent entries have expired:

internal class InactiveExpirationCacheEntry : ICacheEntry
{
    // Data members
   
private object m_Key;
    private object m_Data;
    private System.DateTime m_dtLastAccessTime;
    private ArrayList m_alAccessQueue;

    // Constructors
    public InactiveExpirationCacheEntry( object key, object data, ArrayList alAccessQueue )
    {
        m_Data = data;
        m_Key = key;
        m_alAccessQueue = alAccessQueue;
    }

    // ICacheEntry implementation
    public object GetData
    {
        get
        {
            // Move the underlying data object to the beginning of the access queue
            // to indicate that this data item has been accessed
 
           lock( this )
            {
                m_alAccessQueue.Remove( m_Key );
                m_alAccessQueue.Add ( m_Key );
                m_dtLastAccessTime = DateTime.Now;
            }

            // Now return the underlying data item
            return m_Key;
        }
    }

    public DateTime LastAccessTime
    {
        get { return m_dtLastAccessTime; }
    }
}

internal class InactiveExpirationCacheEntryFactory : ICacheEntryFactory
{
    // Data members
    private ArrayList m_alAccessQueue = null;

    // Constructors
    public InactiveExpirationCacheEntryFactory( ArrayList alQueue )
    {
        m_alAccessQueue = alQueue;
    }

    // ICacheEntryFactory implementation
    public ICacheEntry NewCacheEntry( object key, object data )
    {
        // Create and return a new instance
        return new InactiveExpirationCacheEntry( key, data, m_alAccessQueue );
    }
}

internal class InactiveExpirationCacheCollector : ICacheCollector
{
    // Data members
    private Hashtable m_Cache = null;
    private ArrayList m_alAccessQueue = null;
    private TimeSpan m_tsInterval;

    // Constructor
    public InactiveExpirationCacheCollector( Hashtable cache, ArrayList q, double dInterval)
    {
        m_Cache = cache;
        m_alAccessQueue = q;
        m_tsInterval = System.TimeSpan.FromMinutes( dInterval );
    }

    // ICacheCollector interface
    public void Collect()
    {
        // Get a date/time value to represent expiration time. Entries whose creation time is less
        // than dtExpirationTime are expired
 
       DateTime dtExpirationTime = DateTime.Now.Subtract( m_tsInterval );

        lock( this )
        {
            ArrayList alKeysToRemove = new ArrayList();
            foreach (object key in m_alAccessQueue )
            {
                // Get curent entry
                InactiveExpirationCacheEntry entry = (InactiveExpirationCacheEntry)m_Cache[key];

                // Remove entry if it has expired
                if (entry.LastAccessTime < dtExpirationTime )
                    alKeysToRemove.Add ( key );
                else
                break;
            }

            // Now remove the expired cache entries
            foreach (object key in alKeysToRemove)
            {
                m_Cache.Remove( key );
                m_alAccessQueue.Remove( key );
            }
        }
    }
}

CacheAccessor class ties all the pieces together:

internal class CacheAccessor : IDisposable
{
    // Data members
    private Hashtable m_Cache;
    private long m_lInterval;
    private ICacheEntryFactory m_CacheEntryFactory = null;
    private ICacheCollector m_CacheCollector = null;
    private Thread m_threadCollector = null;
    private bool m_bStopThread = false;

    // Constructors
    public CacheAccessor(Hashtable cache, long interval, ICacheEntryFactory factory, ICacheCollector collector )
    {
        // Save variables
        this.m_Cache = cache;
        this.m_lInterval = interval;
        this.m_CacheCollector = collector;
        this.m_CacheEntryFactory = factory;

        // Start the collector thread
        m_threadCollector = new Thread( new ThreadStart( GarbageCollector ) );
        m_threadCollector.Start();
    }

    // Public interface
    public object Read( object key )
    {
        object oData = null;
        lock (m_Cache)
        {
            if (m_Cache.Contains( key ))
            {
                ICacheEntry entry = (ICacheEntry)m_Cache[ key ];
                oData = entry.GetData;
            } 
        }

        return oData;
    }

    public void Write( object key, object data )
    {
        ICacheEntry entry = m_CacheEntryFactory.NewCacheEntry( key, data );
        lock (m_Cache)
        {
            m_Cache.Add( key, data );
        }
                    }

    // IDispose
    public void Dispose()
    {
        // Wait until the thread stops
        m_bStopThread = true;
        m_threadCollector.Join();
    }

    // Helpers
    private void GarbageCollector()
    {
        // Loop until requested to stop
        while (!m_bStopThread)
        {
            // Collect expired cache entries
            this.m_CacheCollector.Collect();

            // Then sleep for the duration of the requested interval
            Thread.Sleep( (int)m_lInterval );
        }
    }
}

Applicability

Use this pattern when:

Strategies / Variants

Collection operations usually run periodically (or on demand) in a low priority background thread. However, this introduces concurrent cache access from multiple threads and requires additional synchronization involving the cache structure and its entries. Without synchronization, attempting to clean the cache using the background thread will ultimately corrupt data!

The most straightforward approach for synchronization is to place all code that accesses the cache within critical sections. This means that only one thread at a time can access code within critical sections. Consequently, when a cache collection operation is active in the background, foreground applications may block while waiting to access cache data. If the collection strategy takes a long time to run, or runs frequently, it significantly degrade performance for foreground cache operations. Therefore, when implementing a cache collection strategy, try to minimize cache collection frequency and well as using critical sections sparingly and only where necessary.

Another alternative is to integrate cache garbage collection with the cache implementation so that it gets invoked as part of some or all cache operations. This scheme works particularly well with Least Recently Used cache expiration strategy. Synchronization becomes less of an issue since the cache operation implementations normally account for it. 

Benefits

Liabilities

Related Patterns