Conditional gzipping with Apache

This article was originally published on my blog (affectionately referred to as blargh). The original blog no longer exists, as I've migrated everything to this wiki.

The original URL of this post was at https://tmont.com/blargh/2009/8/conditional-gzipping-with-apache. Hopefully that link redirects back to this page.

This has been slightly modified from the original to be more informative and less idiosyncratic. Also note that the two utilities I built are no longer around so these links are broken forever.

I've never really experimented with some of the more low-level website optimizations (server-side stuff like gzipping, caching and the like), mostly because I've never really had to. I'm not a server admin, and my sites don't generate enough traffic for me to really care about such minor issues. But then I released two little utilities wherein users could just include a piece of JavaScript from my own server. You know, they would do something like this:

html
<script type="text/javascript" src="http://linkurious.com/js"></script>

where the src attribute was pointing to an external server; namely, my server. If enough people start using these utilities (extremely unlikely, although they are pretty awesome) it could be a strain on my server.

So I decided to offer a plain text version and a gzipped version. But I didn't want to store two versions of the scripts in separate places on my server just to satisfy that need. The reasons had nothing to do with disk space or anything tangible; it just didn't "feel right" to do something so hackish.

Anyway, once I figured out how to use mod_deflate:

apache
SetOutputFilter DEFLATE

I realized that that would make everything gzipped! But then I noticed you could filter the... er... filter by mimetype, like so:

apache
# only text/html will be gzipped
AddOutputFilterByType DEFLATE text/html

But that still didn't help me. I needed conditional gzipping, like if the query string said ?gz it would know to serve the document gzipped; otherwise, it would just serve it with no compression. But how to accomplish this? Apache's docs were no help. Luckily I'm fairly proficient at mod_rewrite, and I'm decently intelligent. Let's see if I can figure it out.

Here were my requirements:

  1. A request to http://acronymulator.com/1 would serve the JavaScript document with no compression
  2. A request to http://acronymulator.com/1/gz would serve the JavaScript document with gzip compression
  3. Only one physical copy of the document exists on the server

Let's get to work!

First, let's do the rewrite rules:

apache
RewriteEngine On
RewriteRule ^/([1-9]\d*)(?:/gz)?$ /acronymulates/$1.js

The first line turns on the rewrite engine. The second line matches any path from the root that is made up of digits, at least one digit long where the first digit is not a zero, with an optional /gz on the end. The rule finishes by internally rewriting those requests to the acronymulates directory, and serves up the JavaScript file whose name is the number with .js at the end. Simple enough, right? Sure.

Hopefully, you've deduced that /gz indicates that we want gzip compression; if it's omitted, we don't want gzip compression. Notice that it's redirecting to the real JavaScript file whether the /gz is tacked on the end or not.
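
To see which request paths the rule accepts, here is a minimal Python sketch that mimics the pattern with the `re` module. It is only an illustration, not how Apache evaluates the rule: the `rewrite` function is a hypothetical name, and the trailing `$` is an assumption that takes the "on the end" wording literally.

```python
import re
from typing import Optional

# Mirrors the RewriteRule pattern. The trailing "$" is an assumption of this
# sketch; re.ASCII keeps \d limited to 0-9.
RULE = re.compile(r"^/([1-9]\d*)(?:/gz)?$", re.ASCII)


def rewrite(path: str) -> Optional[str]:
    """Return the internal target for a request path, or None if the rule does not apply."""
    match = RULE.match(path)
    if match is None:
        return None
    # $1 in the RewriteRule corresponds to group 1 here.
    return f"/acronymulates/{match.group(1)}.js"


for path in ("/1", "/1/gz", "/42/gz", "/0", "/01", "/1/zip"):
    print(path, "->", rewrite(path))
```

Both /1 and /1/gz map to /acronymulates/1.js, while /0, /01 and /1/zip are left alone.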

Now we need to do the deflate stuff. How do we do it? Well, the most obvious solution is something like this:

apache
<Directory /path/to/acronymulates>
    SetOutputFilter DEFLATE
</Directory>


This makes sense because the only things in the acronymulates directory are the JavaScript files that we want to compress. See?

[Image: gzipping]

However, this will force gzip compression no matter what, and we want it to be conditional: only when the /gz is part of the URL. What do we do?

Location to the rescue!

The Location directive accomplishes this for us. More specifically, the LocationMatch directive accomplishes this for us. The difference between Directory and Location is that Directory matches a physical directory on the filesystem, whereas Location just matches a virtual directory (i.e. the path of the URL: /1/gz, in our case). Now it's fairly obvious what the solution should be:

apache
<LocationMatch /[1-9]\d*/gz>
    SetOutputFilter DEFLATE
</LocationMatch>
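
The net effect can be sketched in Python: one stored body, compressed only when the request path matches the same pattern as the LocationMatch block. This is a simplified model rather than what mod_deflate does internally (for one thing, mod_deflate also checks the client's Accept-Encoding header); `respond` and the sample body are hypothetical.

```python
import gzip
import re

# Same pattern as the LocationMatch block. The regex is not anchored,
# so search() is used rather than a full match.
GZ_LOCATION = re.compile(r"/[1-9]\d*/gz", re.ASCII)


def respond(path: str, body: bytes):
    """Return (headers, payload) for a single stored body (hypothetical helper)."""
    headers = {}
    if GZ_LOCATION.search(path):
        # Only URLs under /N/gz get the compressed representation.
        headers["Content-Encoding"] = "gzip"
        return headers, gzip.compress(body)
    return headers, body


# Made-up stand-in for the one physical copy of the script.
script = b"alert('acronymulated');" * 20

for path in ("/1", "/1/gz"):
    headers, payload = respond(path, script)
    print(path, headers, len(payload), "bytes")
```

The point of the design is that the decision hangs off the URL path alone, so the file on disk never needs a compressed twin.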

And there's your dynamic gzipping using Apache. Note that these are real life examples. Let's prove it with Firebug's help.

Without compression (http://acronymulator.com/1):

[Image: dynamic gzipping, without compression]

With compression (http://acronymulator.com/1/gz):

[Image: dynamic gzipping, with compression]

And here's proof that both URLs serve the same content:

[Image: MD5 sum comparison]
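
The same kind of check can be reproduced without a browser. The Python sketch below assumes two hypothetical payloads (a plain body and its gzipped form, standing in for the two responses) and compares MD5 sums after decompression.

```python
import gzip
import hashlib


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


# Made-up response bodies; in practice these would be the bytes
# returned for /1 and /1/gz respectively.
plain = b"var acronym = 'example';\n" * 10
gzipped = gzip.compress(plain)

print("plain :", md5_hex(plain))
print("gunzip:", md5_hex(gzip.decompress(gzipped)))
print("same content:", md5_hex(plain) == md5_hex(gzip.decompress(gzipped)))
```

The compressed bytes differ from the plain ones on the wire, so the comparison only makes sense after the gzipped response has been decompressed.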