Inside the Beaker Cache

Caching

First, let's start with a slow function that we would like to cache. This function is not actually slow, but its return value includes the time it was generated, so we can see whether caching is working as we expect:

import time
def slooow(myarg):
  # some slow database or template stuff here
  return "%s at %s" % (myarg,time.asctime())

Once the function is cached, repeated calls will tell us whether we are seeing a cached value or a fresh one.

DBMCache

The DBMCache stores (actually pickles) the response in a dbm style database.

What may not be obvious is that there are two levels of keys: one for the function or template name (called the namespace) and one for the entries within it (called the key). So for Some_Function_name, a cache is created as one dbm file/database. As that function is called with different arguments, those arguments become keys within the dbm file.

First, let's create and populate a cache. This might be the cache for the function Some_Function_name, called three times with three different arguments: x, yy, and zzz:

from beaker.cache import CacheManager
cm = CacheManager(type='dbm', data_dir='beaker.cache')
cache = cm.get_cache('Some_Function_name')
# the cache is set up, but the dbm file is not created until needed,
# so let's populate it with three values:
cache.get_value('x',createfunc=lambda:slooow('x'),expiretime=15)
cache.get_value('yy',createfunc=lambda:slooow('yy'),expiretime=15)
cache.get_value('zzz',createfunc=lambda:slooow('zzz'),expiretime=15)
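The two levels of keys described above can be pictured as a dict of dicts. The following is only a minimal in-memory sketch of that idea, not Beaker's implementation; the names namespaces and get_value here are hypothetical stand-ins for illustration.

```python
import time

def slooow(myarg):
    # stand-in for a slow database or template call
    return "%s at %s" % (myarg, time.asctime())

# Hypothetical in-memory model (an assumption for illustration):
# the outer dict is keyed by namespace (the function name),
# each inner dict is keyed by the cache key (the function argument).
namespaces = {}

def get_value(namespace, key, createfunc):
    bucket = namespaces.setdefault(namespace, {})
    if key not in bucket:
        # only call the slow function when the key is missing
        bucket[key] = createfunc()
    return bucket[key]

for arg in ('x', 'yy', 'zzz'):
    get_value('Some_Function_name', arg, lambda arg=arg: slooow(arg))

print(sorted(namespaces))
print(sorted(namespaces['Some_Function_name']))
```

In the dbm backend the outer level corresponds to one file per namespace and the inner level to the keys inside that file.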

Nothing much new yet. After getting the cache we can use it as per the Beaker documentation. To see where the data is stored, we can ask the namespace manager for the file it uses:

import beaker.container as container
nam = container.DBMNamespaceManager('Some_Function_name', data_dir='beaker.cache')
filename=nam.file

Now we have the file name. It is a SHA hash of a string formed by joining the container class name and the function name (the name passed to get_cache). It looks something like:

'beaker.cache/container_dbm/a/a7/a768f120e39d0248d3d2f23d15ee0a20be5226de.dbm'
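To make the shape of that path concrete, here is a rough sketch of how such a name could be derived. It is an illustration under stated assumptions, not Beaker's actual code: the helper sketch_cache_path, the "_" separator, the identifier strings, and the use of SHA-1 are all assumptions, so do not expect it to reproduce the exact hash shown above.

```python
import hashlib

def sketch_cache_path(root, identifiers, extension='.dbm', depth=3):
    # ASSUMPTION: the identifiers are joined with "_" and hashed with SHA-1.
    ident = "_".join(identifiers)
    digest = hashlib.sha1(ident.encode('utf-8')).hexdigest()
    # The directory levels are growing prefixes of the digest ("a", "a7", ...),
    # which keeps any single directory from collecting too many files.
    dirs = [digest[:i] for i in range(1, depth)]
    return "/".join([root] + dirs + [digest + extension])

# Hypothetical identifiers: a container class name and the namespace.
path = sketch_cache_path('beaker.cache/container_dbm',
                         ['DBMNamespaceManager', 'Some_Function_name'])
print(path)
```

The practical point is that the file name cannot be guessed from the function name by eye; ask the namespace manager for it instead.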

With that file name you can look directly inside the cache database (for education and debugging only, not for your normal cache interactions):

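# this file name can be used directly (for debugging ONLY)
import anydbm  # Python 2 module; it was renamed to dbm in Python 3
import pickle
db = anydbm.open(filename)
# each entry is a pickled tuple of (stored time, expire time, value)
old_t, expire, old_v = pickle.loads(db['zzz'])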

The database contains only the stored time, the expire time, and the stored value. Where did the function that creates or updates the value go? It never makes it to the database; it resides in the cache object returned by the get_cache call above.

Note that the createfunc is stored during the first call to get_value. If you run into difficulties with these values, remember that one call to cache.clear() resets everything.
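The split between what is persisted (the time, expire time, and value tuple) and what lives only in the cache object (the createfunc) can be sketched as a toy class. This is a simplified model under assumptions, not Beaker's implementation: SketchCache, its attributes, and the now parameter (used to make the example deterministic) are all hypothetical.

```python
import time

class SketchCache(object):
    """Toy model of the behaviour described above; not Beaker's code."""

    def __init__(self):
        self.store = {}        # key -> (stored_time, expire_time, value): the persisted part
        self.createfuncs = {}  # key -> callable: lives only in this object

    def get_value(self, key, createfunc=None, expiretime=None, now=None):
        now = time.time() if now is None else now
        if createfunc is not None:
            # remember the createfunc from the first call only
            self.createfuncs.setdefault(key, createfunc)
        entry = self.store.get(key)
        if entry is not None:
            stored_time, expire, value = entry
            if expire is None or now < stored_time + expire:
                return value  # still fresh: return the cached value
        # missing or expired: regenerate and store a new tuple
        value = self.createfuncs[key]()
        self.store[key] = (now, expiretime, value)
        return value

    def clear(self):
        # ASSUMPTION: clearing drops both the values and the remembered functions
        self.store.clear()
        self.createfuncs.clear()

calls = []

def make():
    calls.append(1)
    return "value #%d" % len(calls)

sketch = SketchCache()
first = sketch.get_value('zzz', createfunc=make, expiretime=15, now=100)
second = sketch.get_value('zzz', expiretime=15, now=110)  # within 15 seconds
third = sketch.get_value('zzz', expiretime=15, now=116)   # past the expire time
print(first, second, third)
```

Note that the second and third calls pass no createfunc at all; the value can still be regenerated because the function was remembered by the cache object, never by the store.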

Database Cache

The same example, this time using the ext:database cache type:

from beaker.cache import CacheManager
#cm = CacheManager(type='dbm', data_dir='beaker.cache')
cm = CacheManager(type='ext:database', url="sqlite:///beaker.cache/beaker.sqlite",data_dir='beaker.cache')
cache = cm.get_cache('Some_Function_name')
# the cache is set up; now let's populate it with three values:
cache.get_value('x',createfunc=lambda:slooow('x'),expiretime=15)
cache.get_value('yy',createfunc=lambda:slooow('yy'),expiretime=15)
cache.get_value('zzz',createfunc=lambda:slooow('zzz'),expiretime=15)

This is identical to the cache usage above; the only difference is how the CacheManager is created. With this backend it is much easier to view the cache contents from outside the Beaker code (again, for edification and debugging, not for API usage).

SQLite was used here, so you can open the SQLite file directly:

# from command line:
sqlite3 beaker.cache/beaker.sqlite
# from inside sqlite:
sqlite> .schema
CREATE TABLE beaker_cache (
        id INTEGER NOT NULL, 
        namespace VARCHAR(255) NOT NULL, 
        key VARCHAR(255) NOT NULL, 
        value BLOB NOT NULL, 
        PRIMARY KEY (id), 
        UNIQUE (namespace, key)
);
sqlite> select * from beaker_cache;

Note: in version 0.8 of Beaker the data structure is different. It will include the access time, but stores only one row per namespace (holding a pickled dict) rather than one row per namespace/key combination. Not sure if I like this. It is best suited to a large number of namespaces with few keys each (like sessions).
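The difference between the two row layouts can be sketched with an in-memory SQLite database. This is only an illustration of the trade-off: the table names per_key and per_namespace, the column names, and the sample values are hypothetical and do not match Beaker's real schema.

```python
import pickle
import sqlite3

conn = sqlite3.connect(':memory:')
# Hypothetical tables, for illustration only.
conn.execute("CREATE TABLE per_key (namespace TEXT, key TEXT, value BLOB, "
             "UNIQUE (namespace, key))")
conn.execute("CREATE TABLE per_namespace (namespace TEXT UNIQUE, data BLOB)")

values = {'x': 'x at t0', 'yy': 'yy at t0', 'zzz': 'zzz at t0'}

# Layout A: one row per namespace/key combination.
for key, value in values.items():
    conn.execute("INSERT INTO per_key VALUES (?, ?, ?)",
                 ('Some_Function_name', key, pickle.dumps(value)))

# Layout B: one row per namespace, holding a pickled dict of all keys.
conn.execute("INSERT INTO per_namespace VALUES (?, ?)",
             ('Some_Function_name', pickle.dumps(values)))

rows_a = conn.execute("SELECT COUNT(*) FROM per_key").fetchone()[0]
rows_b = conn.execute("SELECT COUNT(*) FROM per_namespace").fetchone()[0]

# Reading a single key in layout B means loading and unpickling the whole dict.
blob = conn.execute("SELECT data FROM per_namespace WHERE namespace = ?",
                    ('Some_Function_name',)).fetchone()[0]
one_value = pickle.loads(blob)['zzz']
print(rows_a, rows_b, one_value)
```

With many keys in one namespace, layout B rewrites the whole blob on every update, which is why it fits small session-like namespaces better than large function caches.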

For large numbers of keys with expensive per-key lookups, memcached is the way to go.

Memcached Cache

If memcached is running on the default port of 11211:

from beaker.cache import CacheManager
cm = CacheManager(type='ext:memcached',url='127.0.0.1:11211',lock_dir='beaker.cache')
cache = cm.get_cache('Some_Function_name')
# the cache is set up; now let's populate it with three values:
cache.get_value('x',createfunc=lambda:slooow('x'),expiretime=15)
cache.get_value('yy',createfunc=lambda:slooow('yy'),expiretime=15)
cache.get_value('zzz',createfunc=lambda:slooow('zzz'),expiretime=15)