Saturday, August 16, 2008

Distributed Caching with Microsoft "Velocity" - An Introduction

Something I think a lot of software developers have run into when building high-availability web applications is how to manage a distributed cache.  It's one thing to get your data and objects into cache, but it's another to come up with a cache synchronization architecture that's reliable and scalable.  To put the common synchronization problem simply: in a load-balanced environment, how do changes made on one server force cache purging on the other servers in the farm?

This problem has led to the creation of distributed caching frameworks and the emergence of a caching tier.  A couple of frameworks have been available for some time now.  Some are free (memcached) and some aren't (NCache).  Now Microsoft is entering the game with a distributed caching framework code-named "Velocity".

Velocity solves this problem by providing the infrastructure required to keep caches synchronized across application boundaries.  It essentially fuses memory across application and network boundaries, providing a unified view of a cache from a distributed application.
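To make the synchronization problem concrete, here is a minimal Python sketch. It is not Velocity code and uses none of its API: `Database`, `WebServer`, and the plain dictionaries standing in for caches are all hypothetical names invented for illustration. It shows a stale read when each server keeps its own in-process cache, and how the stale read goes away when both servers look at one shared cache, which is roughly the role a caching tier plays.

```python
class Database:
    """Hypothetical stand-in for the system of record."""

    def __init__(self):
        self.rows = {"price": 10}


class WebServer:
    """Hypothetical web server that reads through a cache before the database."""

    def __init__(self, db, cache):
        self.db = db
        self.cache = cache  # a dict: private to this server, or shared by the farm

    def read(self, key):
        if key not in self.cache:
            self.cache[key] = self.db.rows[key]
        return self.cache[key]

    def write(self, key, value):
        self.db.rows[key] = value
        self.cache[key] = value  # only touches the cache this server can see


# Case 1: every server has its own in-process cache.
db = Database()
server_a = WebServer(db, cache={})
server_b = WebServer(db, cache={})
server_b.read("price")              # server B caches the old value
server_a.write("price", 12)         # server A changes it
stale = server_b.read("price")      # server B still serves its cached copy

# Case 2: both servers share one cache (a dict standing in for a cache tier).
db = Database()
shared_cache = {}
server_a = WebServer(db, cache=shared_cache)
server_b = WebServer(db, cache=shared_cache)
server_b.read("price")
server_a.write("price", 12)
fresh = server_b.read("price")      # server B sees the update

print(stale, fresh)
```

In the first case nothing tells server B that its copy is out of date. In the second there is only one copy to update, so the question of purging other servers never comes up.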

The key features of Velocity are:

  • It can cache any CLR object that has been declared as Serializable (either through the SerializableAttribute, or by implementing the ISerializable interface).
  • Through configuration, caches can be embedded into your application, or accessed over the network.
  • Provides another option for session storage by allowing you to configure session state to be persisted to the Velocity cache.
  • Highly flexible, allowing "regions" of data within the same application to use different caching strategies.
  • Highly modular in structure, allowing you to hook in transactions, or even replace the network layer.
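As a rough illustration of two of these ideas (values must be serializable, and regions can each follow their own strategy), here is a small Python sketch. It is not the Velocity API. `Region`, `RegionedCache`, and `FakeClock` are hypothetical names, `pickle` stands in for CLR serialization, and a per-region time-to-live is just one assumed example of a "caching strategy".

```python
import pickle
import time


class Region:
    """Hypothetical cache region with its own time-to-live."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._items = {}  # key -> (expires_at, serialized value)

    def put(self, key, value):
        payload = pickle.dumps(value)  # fails if the value cannot be serialized
        self._items[key] = (self._clock() + self.ttl_seconds, payload)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, payload = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired under this region's policy
            return None
        return pickle.loads(payload)


class RegionedCache:
    """Hypothetical cache that groups entries into named regions."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._regions = {}

    def create_region(self, name, ttl_seconds):
        self._regions[name] = Region(ttl_seconds, self._clock)
        return self._regions[name]

    def region(self, name):
        return self._regions[name]


class FakeClock:
    """Manually advanced clock so the example is deterministic."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
cache = RegionedCache(clock)
cache.create_region("sessions", ttl_seconds=1200)  # long-lived data
cache.create_region("prices", ttl_seconds=5)       # short-lived data

cache.region("sessions").put("user:42", {"cart": [1, 2]})
cache.region("prices").put("sku:7", 9.99)

clock.now = 10  # ten seconds later
session = cache.region("sessions").get("user:42")  # still cached
price = cache.region("prices").get("sku:7")        # expired, so None
```

Serializing on `put` is what lets an entry cross a process or network boundary, which is why a cache like this can only hold objects that know how to serialize themselves.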

The Velocity team is part of the Data Platform group at Microsoft, and in fact Velocity shares its clustering technology with the team building SQL Server Data Services (SSDS), a cloud computing initiative from the SQL Server team.  The result of this collaboration and sharing should be the ability to scale this technology to thousands of nodes within a cache.

Additionally, they are working with the MSN.com and Live.com teams to see if those sites could benefit from Velocity.  To think, you could have the same scaling facility as two of the most heavily trafficked sites.  For free!

This is an extremely useful technology coming out of Microsoft, and long overdue, and I'm a fan of the way they are putting it together.  I'll be posting more about the different configuration options and appropriate uses of

5 comments:

Anonymous said...

Hi Jim,
It’s good to read your blog about Distributed Caching. You are right! Developers do have issues with managing distributed cache while building high end web applications. I have seen the solutions you mentioned here NCache, Memcached, and Velocity. With Microsoft entering this arena, I find two major benefits:

1. Awareness about distributed caching
2. Competition in market

You mentioned that NCache is not free. I am currently using NCache’s free version. It’s called NCache Express and it’s performing just fine.

Jim Fiorato said...

Hey Tom

Thanks for setting me straight. I didn't realize they had a free version. Looks like there's an edition comparison here:

http://www.alachisoft.com/ncache/index.html#et

Anonymous said...

Hi Tom,

Thanks for mentioning NCache. NCache Express is free for 2-server clusters, and NCache Enterprise covers your needs as you scale up. We also offer attractive prices for small configurations.

Check it out at http://www.alachisoft.com.

Anonymous said...

Hi Jim,

Do you know if anybody's been experimenting with accessing Velocity through the SQL Server 2005/2008 CLR?

We're interested in adding a result key set that is created by a particular SQL Server box to the distributed cache so it can be retrieved by any other SQL Server box. Rather than have the application tier do this, we're thinking about having SQL Server directly access the cache through the CLR.

Kind of figure it's possible and probably not too hard.

Anonymous said...

The last comment should have said "Hi Tom" not "Hi Jim" :)