Servers and applications fail, sooner rather than later.
That is why part of our work is to design, and continuously refine, a robust and highly reliable architecture that keeps our application available as much of the time as possible.
The challenges and strategies differ depending on the application, the software used, the architecture and the environment, so we depend on multiple factors. In this post I’ll focus on one particular piece: architectures that put Varnish in front of the web servers, where a couple of tweaks can improve availability a bit more.
Varnish is an HTTP accelerator used to cache dynamic content from web servers, acting as a proxy between the client and the origin web server. The objective of this post is not to cover the functionality and configuration of Varnish in general; if you need that, you can find good documentation on this website.
One of the features of Varnish is the support of grace mode and saint mode.
Both features allow us to handle trouble with our web servers and keep the service online, even if our backend servers go down. While this is only partly true (of course we cannot guarantee that the entire service will keep working just with Varnish), we can keep part of our application working.
So imagine a website or API service with thousands of requests per second. Some of them may be PUT requests submitting changes to the application; those kinds of requests cannot be handled when the backend servers are down. But GET requests, where the client wants to obtain information from the service, can be handled perfectly well if Varnish has that particular content in its memory cache: Varnish returns the content to the client even if the backend is not working.
Of course, there are two things to bear in mind: the request must have been previously cached by Varnish, and some responses to the clients will be outdated. But that’s still better than replying with an error page!
As always, how useful this behavior is depends on the requirements and the type of application. But most of the time it can absorb requests and keep part of the service working in case of failure, so in my opinion it is highly recommended to enable it.
So let’s get started with the configuration of Varnish:
Edit the VCL definition file, usually located at /etc/varnish/default.vcl, and add the following directives:
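A minimal sketch of the two subroutines, using Varnish 3.x syntax (where saint mode and vcl_fetch are available); it assumes a backend with a health probe is already defined:

```vcl
sub vcl_recv {
    if (req.backend.healthy) {
        # Backend is up: serve objects up to 30s past their TTL.
        set req.grace = 30s;
    } else {
        # Backend is down: serve stale objects for up to 6 hours.
        set req.grace = 6h;
    }
}

sub vcl_fetch {
    if (beresp.status == 500 || beresp.status == 502 || beresp.status == 503) {
        # Don't ask this backend for this object again for 10 seconds...
        set beresp.saintmode = 10s;
        # ...and retry the request on another backend (not for POST,
        # to avoid duplicate form submissions).
        if (req.request != "POST") {
            return(restart);
        }
    }
    # Keep objects in cache 6h beyond their TTL so they can still be
    # served in grace mode if all backends go down.
    set beresp.grace = 6h;
}
```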
Let’s look in a bit more depth at what this configuration does. “vcl_recv” is called when a request comes in from the client, and the purpose of this subroutine is to decide what to do with that request. Here we are saying that if the backend servers are alive, we’ll keep serving content for 30 seconds beyond its TTL (“set req.grace = 30s;”). If the backend becomes unavailable, we’ll keep serving the cached content to the clients for 6 hours (“set req.grace = 6h;”).
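Note that “req.backend.healthy” only works if the backend has a health probe defined. A minimal sketch, where the probe URL and intervals are assumptions to adapt to your setup:

```vcl
backend default {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = {
        .url = "/";          # endpoint polled for health checks
        .interval = 5s;      # poll every 5 seconds
        .timeout = 1s;       # consider a poll failed after 1 second
        .window = 5;         # look at the last 5 polls...
        .threshold = 3;      # ...and require 3 successes to mark it healthy
    }
}
```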
“vcl_fetch” is called when a document has been successfully retrieved from the backend. If the backend server returns an HTTP error code of 500, 502 or 503, Varnish will not ask that backend again for this object for 10 seconds (“set beresp.saintmode = 10s;”) and will restart the HTTP request (“return(restart);”). The restarted request will automatically go to the next available server, except for POST requests, to avoid duplicate form submits, etc. The max_restarts parameter defines the maximum number of restarts that can be issued in VCL before an error is triggered, thus avoiding an infinite loop.
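max_restarts is a varnishd runtime parameter, so it is not set in the VCL file. As a sketch, it could be raised at startup or inspected on a running instance through the management CLI:

```sh
# At startup, pass it as a runtime parameter to varnishd:
#   varnishd ... -p max_restarts=4

# On a running instance, via the management CLI:
varnishadm param.show max_restarts
varnishadm param.set max_restarts 4
```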
“set beresp.grace = 6h;” keeps all objects in the cache for 6 hours longer than their TTL specifies, so even if HTTP objects have expired (their TTL has passed), we can still use them in case all the backend servers go down.
Original Post by Iván Mora (SysOps Engineer @ CAPSiDE) at opentodo.net.