REDIS as Padrino cache
Using REDIS as and application cache is very handy. You can easily use it in, say, Padrino like this:
Note that we have a specific cache entry for each user
(current_account.email
). For instance, a user with email foo@bar.com
will have this entry cached at foo@bar.com:test:index
Cache invalidation
Now, sometimes you need to expire the cache forcibly. For instance, let’s say you know you’ve changed something in the database and that you don’t want stale data to be served, you can invalidate the cache manually. Or may be you want to invalidate a complete user cache at login time.
However, this is not easy in our case, since we want to remove all
entries matching *:test:index
(or foo@bar.com:*
if we want to
completely wipe out the user cache).
The first idea that comes to mind is to use the Redis KEYS
command that can
accept globs to match key names, like KEYS foo@bar.com:*
.
But in the documentation1, you’ll find a big fat warning about KEYS:
Warning: consider KEYS as a command that should only be used in production environments with extreme care. It may ruin performance when it is executed against large databases. This command is intended for debugging and special operations, such as changing your keyspace layout. Don’t use KEYS in your regular application code. If you’re looking for a way to find keys in a subset of your keyspace, consider using SCAN or sets.
Scary as it sounds.
Cursors to the rescue
REDIS comes with a nice, but not very known feature since v2.8: SCAN. The SCAN command is a cursor based iterator. You give him a key pattern, and every time you call it, it will return the next set of matching keys, and an index for the next call.
Here is a piece of code that can invalidate key wildcards from padrino :
You can not easily invalidate a cache wildcard calling
invalidate_cache_like
.
For instance, at user login, you could call :
and the user cache is now cleared.
Benchmarking
Let’s play with Benchmark
a bit to compare SCAN
and KEYS
performance on a
moderately sized database. While we’re at it, we’ll also check these commands using redis
and hiredis
drivers, to see if it makes any difference.
I used the following piece of code for that:
After a few minutes running, I got those surprising results:
$ ./redis-expire-wildcard.rb
user system total real
looping (ruby): 0.040000 0.010000 0.050000 ( 1.059056)
scanning (ruby): 49.000000 11.490000 60.490000 ( 61.113561)
looping (hiredis): 0.020000 0.010000 0.030000 ( 1.073681)
scanning (hiredis): 19.680000 12.880000 32.560000 ( 44.972220)
First, there is no much improvements using hiredis
over redis
when looping
in our case. This sounds legit, since we loop only 26 times here and the
hiredis
performance benefit doesn’t rise with so few commands (hiredis
does
a much more better job if you change the tested range so more commands are
issued).
Second, using SCAN
here is much slower than using KEYS
!
So why use SCAN
instead of KEYS
? The problem with KEYS
is that it will block your server while retrieving all the keys. The cursor based approach will return small chunks of keys and won’t block the server for the time of a whole key scan.
However, handling cursor based expiration can be tricky in a web application. Since it takes so much longer (but is friendlier to Redis), you might have to handle it in a separate task from your application process (in Sidekiq for instance).
It all depends on your app. You can start using simply KEYS
, but will have to
keep in mind that cursors will be needed if usage or concurrent trafic rises and
monitor your Redis statistics for this.