Caching Strategies for Improved Web Performance

Caching is the technique that most improves response time in web applications (as Steve Souders shows in Cache is King), but to benefit from it, every layer of your application must be configured with caching in mind.

Most applications are initially developed with little or no use of caching and must later be refactored to meet performance goals. This approach incurs extra development costs that could be avoided if response time were considered in the early stages of the development process.

The methodology that can save your life while you are still developing your application is pretty straightforward: keep caching in mind whenever you handle data in your system. Whether you are designing a web API or an internal backend data flow, ask one simple question:

Can I survive if the data seen by the user is not the latest?

Sometimes the answer to this question is ‘no.’ For example, I would be fired very quickly if I built a banking system that showed more money than a customer’s account actually holds. On the other hand, if the system works with general data services such as social networks, news, weather, or car traffic, there is less need to ensure that the very latest piece of information is shown to the user immediately.

Of course, the latest data eventually needs to get to the user. Data cannot be too old or you risk confusing the user, but configuring a short expiration time (let’s say 5-10 minutes or less) for dynamic data that can tolerate it can significantly improve perceived response time. That is called temporal consistency, and it is crucial for having a successful caching strategy in place.
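To make the idea of a short expiration time concrete, here is a minimal sketch of an in-memory cache whose entries expire after a fixed number of seconds. The class name TTLCache, the get_or_load method, and the injectable clock are illustrative assumptions and do not come from any particular library.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after a fixed number of seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so the expiry logic can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        """Return the cached value if it is still fresh, otherwise reload it."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None:
            stored_at, value = entry
            if now - stored_at < self.ttl_seconds:
                return value  # fresh enough: skip the slow call
        # Missing or expired: fetch from the slow source and remember it.
        value = loader()
        self._entries[key] = (now, value)
        return value
```

With a 300-second lifetime, something like cache.get_or_load("weather:madrid", fetch_weather) would call the (hypothetical) fetch_weather function at most once every five minutes per key. A production system would more likely delegate this to a dedicated cache service, but the expiry rule is the same.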

Nowadays, web applications are built by mashing up several web services that come from different sources. The best way to cope with their differing response times and data designs is to temporarily cache those elements across all system layers. The same applies to data coming from your own system if the information needs to travel from one part of the world to another in several hops. If the information is not critical, consider caching it at an intermediate stage and reusing it when it is needed. Caching in the backend can avoid half of a trip. Even better is to cache at the target device or in a CDN, which can eliminate the full data trip or reduce it to only the last mile, an easy way to enhance performance.

After you have carefully chosen which data to cache, the next step is to determine where and for how long to cache it. When considering where to cache the data, you have many choices. Most web application architectures include caching tools such as Couchbase or Memcached on the backend side, “pull from origin” CDN capability, and/or browser caching. The layer where data can be temporarily stored depends on what is being cached (public or not) and where it is processed. The way to control this is with the Cache-Control, ETag, and Expires HTTP headers in the HTTP response from your application server (more information can be found in sections 14.9, 14.19, and 14.21 of RFC 2616, respectively).
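As a rough illustration of those three headers, the sketch below builds them for a response body. The function names caching_headers and is_not_modified are hypothetical, and deriving the ETag from a truncated SHA-256 hash of the body is just one possible choice, not something the headers require.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from typing import Dict, Optional


def caching_headers(body: bytes, max_age_seconds: int,
                    now: Optional[datetime] = None) -> Dict[str, str]:
    """Build Cache-Control, Expires and ETag headers for a response body."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    expires_at = now + timedelta(seconds=max_age_seconds)
    # Assumption: the ETag is a content hash, so it changes whenever the body does.
    etag = hashlib.sha256(body).hexdigest()[:16]
    return {
        "Cache-Control": f"public, max-age={max_age_seconds}",
        "Expires": format_datetime(expires_at, usegmt=True),  # HTTP date format
        "ETag": f'"{etag}"',
    }


def is_not_modified(if_none_match: Optional[str], current_etag: str) -> bool:
    """True when the client's If-None-Match value matches the current ETag."""
    return if_none_match is not None and if_none_match == current_etag
```

When is_not_modified returns True, the server can answer with 304 Not Modified and an empty body, so the client reuses its cached copy without downloading the data again. The comparison here is deliberately simplified: it ignores weak validators and lists of several ETags.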

Backend and browser caching are the most widely used approaches, but you should also consider CDN capabilities for applications that are globally distributed. Although CDNs were created for static content, you can also use them for dynamic content by caching data with a short expiration time. The CDN approach is very helpful when more than one user in the same location accesses the application. The initial impression will be very good in the consumer’s eyes (critical for a successful application), since the response time will be fast even if some of the data is replaced later by a fresher dataset.
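One way to express a short-lived CDN policy is the s-maxage directive of Cache-Control, which applies only to shared caches such as a CDN and lets them use a different lifetime than browsers. The helper below is a minimal sketch with a hypothetical name; whether and how a given CDN honors the directive depends on the provider and its configuration.

```python
def cdn_cache_control(browser_seconds: int, cdn_seconds: int) -> str:
    """Give browsers and shared caches (the CDN) separate lifetimes."""
    if browser_seconds < 0 or cdn_seconds < 0:
        raise ValueError("lifetimes must be non-negative")
    # max-age is used by browsers; s-maxage overrides it for shared caches.
    return f"public, max-age={browser_seconds}, s-maxage={cdn_seconds}"
```

For example, cdn_cache_control(60, 300) lets a CDN edge serve the same copy to nearby users for five minutes while each browser revalidates after one minute.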

So, how hard could it be to add caching to your application? If your application’s datasets are a mix of per-user data and data seen by more than one user and/or device, refactoring it is not a task that can be done overnight. General-purpose data (marked “public” in the Cache-Control HTTP header) is easier to cache than users’ data (marked “private”), and separating the two after the fact costs development hours and can introduce bugs and undesired output. So, make sure to always ask yourself the simple question: Can I cache this to improve performance? Doing so can save tons of coding and re-coding hours and help you reach your performance goals.
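The public/private split can be sketched as a single decision made for each response. The function name and its parameters below are illustrative assumptions, not part of any framework.

```python
def cache_control_for(is_user_specific: bool, max_age_seconds: int) -> str:
    """Pick the Cache-Control scope based on who may see the data."""
    # "private" restricts storage to the user's own cache (the browser);
    # "public" also allows shared caches such as a CDN to store the response.
    scope = "private" if is_user_specific else "public"
    return f"{scope}, max-age={max_age_seconds}"
```

Making this choice explicit for every endpoint from the start is what keeps user data and shared data from getting tangled together: a news feed might use cache_control_for(False, 300), while an account page would use cache_control_for(True, 60).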

On your next project, remember: a well-thought-out caching strategy is essential for meeting the performance requirements of your application.

NOTE: If you are interested in attending OSCON to check out David’s talk or the many other cool sessions, click over to the OSCON website where you can use the discount code OS13PROG to get 20% off your registration fee.


OSCON 2013 Speaker Series
